Our research in load balancing focuses on two primary areas: Object
Migration and Seed Balancing.
Object Migration
- Periodic Load Balancing for bipartite object networks
- Adaptive use of Workstation Clusters
- Optimal Object Migration to handle background load variation
A major reason slowing deployment of parallel programs is that
efficient parallel programs are difficult to write. Parallel
programming adds a second dimension to programming: not just when will
a particular operation be executed, but where, i.e. what processor
will perform it. A vast number of parallelizable applications do not
have a regular structure for efficient parallelization. Such
applications require load balancing to perform efficiently in
parallel. The load in these applications may also change over time,
requiring rebalancing. The programmer is left with the choice of
either distributing computation haphazardly, producing
poorly-performing programs, or spending more development time
including load-balancing code in the application.
|
A typical object distribution on two processors in a Charm++ program.
|
The components and interactions in the load balancing framework.
|
In recent years, new types of "parallel" computers have
appeared. Networks of commodity workstations are making parallel
computation available to an expanding group of
researchers. Workstation networks present new issues for the
application programmer. Now, in addition to application imbalance, a
parallel program must be concerned with background load from other
simultaneous users. Parallel programs may run on clusters of
workstations on an interactive user's desks, where the primary user
only permits parallel computation when the computer is not being used
interactively. Finally, computational clusters may expand over time,
but with the rapid increase in computational power, new processors are
likely to be faster than the older machines that they are supplementing. To
maximize throughput, load balancers in parallel applications must
account for all these factors.
|
Work migration is a unified scheme for handling both
application-specific and externally-arising load imbalance. The
difficulty with migrating work is that either work is repartitioned in
an application-specific way, placing the burden on the application
programmer, or that automatic migration is supported, but with poor
accuracy, due to the lack of application-specific knowledge.
Object migration provides a way of performing accurate,
fine-grained automatic load balancing. Objects usually have small,
well-defined regions of memory on which they operate, reducing the
cost of migration. Using the Charm++ object model, the run-time system
measures the work represented by particular objects, rather than
deriving execution time from application-specific
heuristics. Furthermore, the run-time system records object-to-object
communication patterns, so the load balancer can asses the
communication impact of migrating particular objects.
With the advent of massively parallel machines like Bluegene/L and Cray XT3, our recent work has focussed on topology-aware migration of objects. We have developed strategies which take into account both the message sizes and the network hop length to minimize the total amount of communication.
Communication latencies form a significant factor in the performance of parallel applications on these large machines. The latencies are primarily due to network contention in the grid and torus networks, which are usually used in these large parallel machines. Our load balancing strategies minimize the impact of topology by heuristically minimizing the number of hops traveled by each communicated byte. They are not network specific and work for all classes of interconnection networks.
Seed Load Balancing
Seed load balancing involves the movement of object creation
messages, or "seeds", to create a balance of work across a set of
processors. Several variations of strategies are being analyzed. In
particular, we distinguish between global strategies, which may result
in communication amongst all processors to exchange load information,
and neighborhood strategies, which typically impose a dense graph
organization on the processors, and restrict communication to
neighbors only. Some strategies use averaging of loads to determine
how seeds should be distributed, while others use receiver-initiated
strategies, where a processor requests work from elsewhere when it is
about to go idle. A strategy that places seeds randomly when they are
created and does no movement of seeds thereafter is used as a baseline
for comparison on numerous benchmarks.
|
- 11-07
Aaron Becker, Gengbin Zheng and Laxmikant Kale, Distributed Memory Load Balancing, Encyclopedia of Parallel Computing, David Padua, Ed., 2011 (to appear)
- 10-26
Eduardo R. Rodrigues, Philippe O. A. Navaux, Jairo Panetta, Alvaro Fazenda, Celso L. Mendes and Laxmikant V. Kale , A Comparative Analysis of Load Balancing Algorithms Applied to a Weather Forecast Model, Proceedings of 22nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'2010)
- 10-20
Gengbin Zheng, Abhinav Bhatele, Esteban Meneses and Laxmikant V. Kale, Periodic Hierarchical Load Balancing for Large Supercomputers, accepted for publication in International Journal for High Performance Computing Applications (IJHPCA), 2010
- 10-17
Abhinav Bhatele, Automating Topology Aware Mapping for Supercomputers, PhD Thesis, Department of Computer Science, University of Illinois, 2010
http://hdl.handle.net/2142/16578
- 10-08
Gengbin Zheng, Esteban Meneses, Abhinav Bhatele and Laxmikant V. Kale, Hierarchical Load Balancing for Charm++ Applications on Large Supercomputers, Proceedings of the International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2010
- 09-02
Abhinav Bhatele, Laxmikant V. Kale and Sameer Kumar, Dynamic Topology Aware Load Balancing Algorithms for Molecular Dynamics Applications, Proceedings of 23rd ACM International Conference on Supercomputing (ICS), 2009.
- 07-12
Abhinav Bhatele, Application-specific Topology-aware Mapping and Load Balancing for three-dimensional Torus Topologies, Master's Thesis, Department of Computer Science, University of Illinois, 2007
- 07-01
Gregory A. Koenig and Laxmikant V. Kale, Optimizing Distributed Application Performance Using Dynamic Grid Topology-Aware Load Balancing, Proceedings of 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007), Long Beach California USA, March 2007.
- 06-05
Gengbin Zheng, Orion Sky Lawlor, Laxmikant V. Kale, Multiple Flows of Control in Migratable Parallel Programs, the 8th Workshop on High Performance Scientific and Engineering Computing (HPSEC-06)
- 05-18
Tarun Agarwal, Amit Sharma, Laxmikant V. Kale, Topology-aware task mapping for reducing communication contention on large parallel machines, Proceedings of IEEE International Parallel and Distributed Processing Symposium 2006
- 05-06
Gengbin Zheng, Achieving High Performance on Extremely Large Parallel Machines: Performance Prediction and Load Balancing, Ph.D. Thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, 2005
- 99-06
Shobana Radhakrishnan, Robert Brunner, and Laxmikant Kale, Branch and Bound Based Load Balancing for Parallel Applications , Lecture Notes in Computer Science, Volume 1732, 1999.
- 99-03
Robert K. Brunner and Laxmikant V. Kale, Handling Application-Induced Load Imbalance using Parallel Objects, Parallel and Distributed Computing for Symbolic and Irregular Applications, pp. 167-181
- 99-02
Robert K. Brunner and Laxmikant V. Kale, Adapting to Load on Workstation Clusters, The Seventh Symposium on the Frontiers of Massively Parallel Computation
- 98-02
L. V. Kale, Milind Bhandarkar and Robert Brunner , Load Balancing in Parallel Molecular Dynamics, Fifth International Symposium on Solving Irregularly Structured Problems in Parallel
- 96-07
Sanjeev Krishnan and L. V. Kale, Automating Runtime Optimizations for Load Balancing in Irregular Problems, Proceedings of the Conference on Parallel and Distributed Processing Technology and Applications, San Jose, August 1996.
- 93-13
Amitabh B. Sinha and Laxmikant V. Kale, A Load Balancing Strategy For Prioritized Execution of Tasks, International Symposium on Parallel Processing, Newport Beach, CA, April 1993.
- 92-05
Amitabh B. Sinha and Laxmikant V. Kale, A Load Balancing Strategy For Prioritized Execution of Tasks, Workshop on Dynamic Object Placement and Load Balancing, in co-operation with ECOOP's 92, April 1992. Utrecht, The Netherlands.
- 89-08
W. W. Shu and L. V. Kale, A Dynamic Load Balancing Strategy for the Chare Kernel System, Supercomputing89, November, 1989, pp. 389--398
|