Understanding Application Performance on Three Predominant Supercomputer Architectures: Intrepid, Ranger and Jaguar, using Micro-benchmarks

PPL Paper Number: 09-04
PPL CVS: 200804_PerfCompSC08

Authors:
Abhinav Bhatele, Lukasz Wesolowski, Eric Bohm, Edgar Solomonik and Laxmikant V. Kale
Parallel Programming Laboratory, Department of Computer Science, University of Illinois at Urbana-Champaign

PPL Technical Report, 2009


Abstract

The emergence of new parallel architectures presents new challenges for application developers. Supercomputers vary in processor speed, network topology, interconnect communication characteristics and memory subsystems. This paper presents a performance comparison of three of the fastest machines in the world: IBM's Blue Gene/P installation at ANL (Intrepid), the Sun-InfiniBand cluster at TACC (Ranger) and Cray's XT4 installation at ORNL (Jaguar). Comparisons are based on three applications selected by NSF for the Track 1 proposal to benchmark the Blue Waters system: NAMD, MILC and a turbulence code, DNS. We present a comprehensive overview of the architectural details of each machine and compare the machines against one another. Application performance is explained through micro-benchmarking results obtained on the same machines. We hope that insights from this work will be useful to application developers who port and tune applications on these and future machines to obtain maximum performance.

