NoiseMiner: An Algorithm for Scalable Automatic Computational Noise and Software Interference Detection
Authors:
Isaac Dooley, Chao Mei, Laxmikant V. Kale
Parallel Programming Laboratory, Department of Computer Science, University of Illinois at Urbana-Champaign
To appear in Proceedings of HIPS Workshop at IEEE International Parallel and Distributed Processing Symposium 2008
This paper describes NoiseMiner, a new scalable stream mining algorithm that analyzes parallel application traces to detect computational noise, operating system interference, software interference, and other irregularities in a parallel application's performance. The algorithm detects these occurrences of noise during real application runs, whereas standard noise-detection techniques rely on carefully crafted test programs. The paper concludes by showing the output of NoiseMiner for a real-world case in which 6 ms delays, caused by a bug in an MPI implementation, significantly limited the performance of a molecular dynamics code on a new supercomputer.
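To make the idea of mining a trace stream for noise concrete, the following is a minimal illustrative sketch and not the NoiseMiner algorithm itself, whose details are not given in this abstract. It assumes a hypothetical trace record (`TraceEvent`) with a processor id, an event kind, and a duration in milliseconds, and it assumes that noise only ever lengthens an event, so the shortest duration seen so far for each kind can serve as a baseline. The names `TraceEvent`, `find_stretched_events`, and `threshold_ms` are invented for illustration.

```python
from collections import namedtuple

# Hypothetical trace record; real trace formats carry more fields.
TraceEvent = namedtuple("TraceEvent", ["processor", "kind", "duration_ms"])


def find_stretched_events(events, threshold_ms=1.0):
    """Yield (event, extra_ms) for events that ran noticeably longer
    than the shortest event of the same kind seen so far.

    Assumption: interference only adds time, so the running minimum
    per event kind approximates the undisturbed duration.
    """
    baseline = {}  # event kind -> shortest duration observed so far
    for ev in events:
        best = baseline.get(ev.kind)
        if best is None or ev.duration_ms < best:
            baseline[ev.kind] = ev.duration_ms
            best = ev.duration_ms
        extra = ev.duration_ms - best
        if extra > threshold_ms:
            yield ev, extra


# A tiny made-up trace: one "integrate" event is stretched by 6 ms.
trace = [
    TraceEvent(0, "integrate", 2.0),
    TraceEvent(1, "integrate", 2.1),
    TraceEvent(0, "integrate", 8.0),
    TraceEvent(1, "integrate", 2.0),
]
flagged = list(find_stretched_events(trace))
for ev, extra in flagged:
    print(f"processor {ev.processor}: {ev.kind} stretched by {extra} ms")
```

Because the sketch makes a single pass and keeps only one number per event kind, it processes the trace as a stream rather than loading it whole, which is the property the abstract highlights. A practical detector would also need to group the flagged delays and separate genuine interference from ordinary variation in the work performed.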