Center for Petascale Computing

Center for Petascale Computing

A collaboration led by Laxmikant Kalé (Computer Science) and Duane Johnson (Materials Science and Engineering) on a research theme within IACAT

Dynamic Load Balancing

A unique capability in Charm++/AMPI, and one that makes a compelling case for its adoption, is measurement-based dynamic load balancing, relying on our empirical observation called the principle of persistence: once an application has been expressed in terms of its natural objects, the object loads and communication patterns persist over time, even in a dynamic application whose load evolves over time. The RTS mediates communication and scheduling of objects. Therefore, it can easily keep track of the CPU time consumed by each object, and the number of bytes and messages exchanged between objects. Since the principle of persistence implies that the recent past is a good predictor of the near future, the RTS can then rebalance the computation and communication by reassigning objects to processors. The RTS includes a suite of load balancing strategies that are useful in varying circumstances. However, some open research issues still remain in the face of large petascale machines. One major challenge is the scalability and efficiency of the load balancer itself on petascale machines. We have proposed and developed a prototype hierarchical scheme that combines the advantages of those centralized and distributed schemes that potentially can scale to petascale parallel machines. We plan to explore topology-aware load balancing strategies in the scalable hierarchical load balancer which improves the work-to-processor mapping by taking the machine topology into account in a distributed fashion.

Investigator: L.V. Kale

AMPI Information

Dynamic Load Balancing Information