Automating Topology Aware Mapping for Supercomputers




 Abstract:

Petascale machines with hundreds of thousands of cores are being built. These machines have varying interconnect topologies and large network diameters. Computation is cheap and communication on the network is becoming the bottleneck for scaling of parallel applications. Network contention, specifically, is becoming an increasingly important factor affecting overall performance. Most parallel applications have a certain communication topology. Mapping of tasks in a parallel application based on their communication graph, to the physical processors on a machine can potentially lead to performance improvements.

Mapping of the communication graph for an application on to the interconnect topology of a machine while trying to localize communication is the research problem under consideration. The farther different messages travel on the network, greater is the chance of resource sharing between messages. This can create contention on the network for common networks used today. Evaluative studies in this dissertation show that on IBM Blue Gene and Cray XT machines, message latencies can be severely affected under contention. Realizing this fact, application developers have started paying attention to the mapping of tasks to physical processors to minimize contention. Placement of communicating tasks on nearby physical processors can minimize the distance traveled by messages and reduce the chances of contention.

Performance improvements through topology aware placement for applications such as NAMD and OpenAtom are used to motivate this work. Building on these ideas, the dissertation proposes algorithms and techniques for automatic mapping of parallel applications to relieve the application developers of this burden. The effect of contention on message latencies is studied in depth to guide the design of mapping algorithms. The hop-bytes metric is proposed for the evaluation of mapping algorithms as a better metric than the previously used maximum dilation metric. The main focus of this dissertation is on developing topology aware mapping algorithms for parallel applications with regular and irregular communication patterns. The automatic mapping framework is a suite of such algorithms with capabilities to choose the best mapping for a problem with a given communication graph. The dissertation also briefly discusses completely distributed mapping techniques which will be imperative for machines of the future.


 PhD Committee:

Prof. Laxmikant V. Kale (Chair and Director of Dissertation Research), Professor of Computer Science and Director, Parallel Programming Laboratory, Illinois
Prof. William D. Gropp, Paul and Cynthia Saylor Professor of Computer Science, University of Illinois
Prof. David A. Padua, Donald Biggar Willett Professor of Computer Science, University of Illinois
Dr. Matthew H. Reilly, Co-founder, Vice President of Semiconductor Development and Chief Engineer, SiCortex, Inc.

 Final Defense:

Date: June 02nd, 2010 (2:30 pm)
Place: SC 1304
Thesis: [pdf]
Talk: [key] [pdf]
Live Webcast: [link]
List of Publications [pdf]

 Preliminary Examination:

Date: October 30th, 2008 (3:30 pm)
Place: SC 4407
Proposal: [pdf]
Talk: [pptx] [pdf]


 In the news:

Projects in Scientific Computing, 2009 (PSC)
George Michael Memorial High Performance Computing Fellow, 2009


 Related Work:

Papers on Topology-aware Mapping
Survey of Interconnect Topologies

 Progress So Far:

Topology Manager (obtains topology information on different machines)
Latency and Contention Benchmark Suite (in MPI) and results on different supercomputers

 Software Downloads:

The TopoManager API
MPI Contention Suite


 Publications:

  1. Abhinav Bhatele, Eric Bohm and Laxmikant V. Kale, Optimizing communication for Charm++ applications by reducing network contention, accepted for publication in Concurrency and Computation: Practice and Experience (EuroPar special issue), 2010 [pdf]
  2. Abhinav Bhatele and Laxmikant V. Kale, Quantifying Network Contention on Large Parallel Machines, Parallel Processing Letters (Special Issue on Large-Scale Parallel Processing), Vol. 19 Issue 4, Pages 553-572, 2009 [pdf]
  3. Abhinav Bhatele, Eric Bohm, Laxmikant V. Kale, A Case Study of Communication Optimizations on 3D Mesh Interconnects, Proceedings of Euro-Par (Topic 13 - High Performance Networks), 2009 [pdf]
  4. Abhinav Bhatele, Laxmikant V. Kale, Nicholas Chen and Ralph E. Johnson, A Pattern Language for Topology Aware Mapping, Workshop on Parallel Programming Patterns (ParaPLOP 2009) [pdf]
  5. Abhinav Bhatele, Laxmikant V. Kale, Sameer Kumar, Dynamic Topology Aware Load Balancing Algorithms for MD Applications, Proceedings of International Conference on Supercomputing, 2009 [pdf]
  6. Abhinav Bhatele, Laxmikant V. Kale, An Evaluative study on the Effect of Contention on Message Latencies in Large Supercomputers, Proceedings of Workshop on Large-Scale Parallel Processing (IPDPS), 2009 [pdf]
  7. Abhinav Bhatele, Laxmikant V. Kale, Benefits of Topology-aware Mapping for Mesh Topologies, Parallel Processing Letters (Special issue on Large Scale Parallel Processing), Vol. 18, Issue 4, Pages 549-566, 2008 [pdf]
  8. Abhinav Bhatele, Laxmikant V. Kale, Application-specific Topology-aware Mapping for Three Dimensional Topologies, Proceedings of Workshop on Large-Scale Parallel Processing (IPDPS), 2008 [pdf]
  9. Abhinav Bhatele, Application-specific Topology-aware Mapping and Load Balancing for three-dimensional Torus Topologies, Master's Thesis, Department of Computer Science, University of Illinois, 2007 [pdf]

 Presentations and Posters:

  1. Abhinav Bhatele, Automating Topology Aware Task Mapping for Large Supercomputers, Doctoral Showcase, SC '09, Portland, OR [pptx] [pdf]
  2. Abhinav Bhatele, Load Balancing and Topology Aware Mapping for Petascale Machines, Scaling to Petascale Summer School, UIUC, Urbana, IL [pptx] [pdf]
  3. Abhinav Bhatele, A Case Study of Communication Optimizations on 3D Interconnects, Euro-Par 2009, Delft, The Netherlands [pptx] [pdf]
  4. Abhinav Bhatele, Dynamic Topology Aware Load Balancing Algorithms for MD Applications, International Conference on Supercomputing (ICS) 2009, New York, NY [pptx] [pdf]
  5. Abhinav Bhatele, An Evaluative Study on the Effects of Contention on Message Latencies in Large Supercomputers, Workshop on Large-Scale Parallel Processing (IPDPS 2009), Rome, Italy [pptx] [pdf]
  6. Abhinav Bhatele, IS TOPOLOGY IMPORTANT AGAIN? - Effects of Contention on Message Latencies in Large Supercomputers, ACM Student Research Competition, SC '08, Austin, TX [pptx] [pdf]
  7. Abhinav Bhatele, Topology Aware Mapping for Performance Optimization of Science Applications, IACAT Seminar, UIUC, Urbana, IL [pptx]
  8. Abhinav Bhatele, Dynamic Topology Aware Load Balancing Algorithms for MD Applications, UK e-Science All Hands Meeting 2008, Edinburgh, UK [ppt]
  9. Abhinav Bhatele, Application-specific Topology-aware Mapping for Three Dimensional Topologies, LSPP (IPDPS '08), Miami, FL, 2008 [ppt]
  10. Abhinav Bhatele, Eric Bohm, Laxmikant V. Kale, Topology Aware Task Mapping Techniques: An API and Case Study (poster), PPoPP 2009 [pdf]
  11. Abhinav Bhatele, Laxmikant V. Kale, Effects of Contention on Message Latencies in Large Supercomputers (poster), ACM Student Research Competition, SC, 2008 [pdf]
  12. Abhinav Bhatele, Laxmikant V. Kale, Automatic Topology-Aware Task Mapping for Parallel Applications Running on Large Parallel Machines (poster), TCPP PhD Forum, IPDPS, 2008 [pdf]

 Resources:

My Official HomePage
Research Page on Topology-aware Mapping

Cray XT3 at PSC (Bigben)
Cray XT4 at ORNL (Jaguar)
Blue Gene/L at IBM T J Watson
Blue Gene/L at LLNL
Blue Gene/P at ANL

Big N Computing

 Other Papers:

Papers on Matrix Mulitplication
Papers on CPMD


 My Work:
  • Considers both topological and cardinality variation
  • Hence considers load imbalance
  • Considers the amount of communication on each link
  • Objective function: Hop-bytes
  • Can do both initial mapping and dynamic movement depending on changes to the communication graph during runtime
  • Currently does not consider the routing rules of the network


Back to Home