Hierarchical Load Balancing for Charm++ Applications on Large Supercomputers
International Workshop on Parallel Programming Models and Systems Software for High-End Computing at ICPP (P2S2) 2010
Publication Type: Paper
Repository URL: 201002_HierLdb
Abstract
Large parallel machines with hundreds of thousands of processors
are being built. Recent studies have shown that ensuring good load
balance is critical for scaling certain classes of parallel
applications on even thousands of processors. Centralized load
balancing algorithms suffer from scalability problems, especially
on machines with a relatively small amount of memory. Fully
distributed load balancing algorithms, on the other hand, tend to
yield poor load balance on very large machines. In this paper, we
present an automatic dynamic hierarchical load balancing method
that overcomes the scalability challenges of centralized schemes
and the poor solution quality of traditional distributed schemes. This is done
by creating multiple levels of aggressive load balancing domains
which form a tree. This hierarchical method is demonstrated within
a measurement-based load balancing framework in Charm++. We present
techniques to deal with scalability challenges of load balancing at
very large scale. We show performance data of the hierarchical load
balancing method on up to 16,384 cores of Ranger (at TACC) for a
synthetic benchmark. We also demonstrate the successful deployment
of the method in a scientific application, NAMD, with results on the
Blue Gene/P machine at ANL.
TextRef
Gengbin Zheng, Esteban Meneses, Abhinav Bhatele and Laxmikant V. Kale, "Hierarchical Load Balancing for Charm++ Applications on Large Supercomputers", Proceedings of the Third International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2010