2.3 Migration-Based Load Balancing

The CHARM++ runtime framework includes an automatic run-time load balancer, which monitors the performance of your parallel program. When needed, the load balancer can ``migrate'' threads from heavily-loaded processors to more lightly-loaded processors, improving the load balance and speeding up the program. To take advantage of this, you need to pass the link-time argument -balancer B to set the load balancing algorithm, and the run-time argument +vp N (use N virtual processors) to set the number of threads. Migration only helps when there are more threads than processors, since otherwise the balancer has nothing to move. The ideal number of threads per processor depends on the problem, but we've found five to a hundred threads per processor to be a useful range.
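To see why several threads per processor give the balancer room to work, the following is a minimal sketch in plain C. It is not the algorithm used by any CHARM++ balancer; the function names greedy_assign and block_max_load are hypothetical and exist only for this illustration. It compares a fixed block placement of threads with a simple greedy placement that always gives the next thread to the least-loaded processor.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative sketch only: NOT a CHARM++ balancer.
 * Places each thread, in the order given, on the processor that currently
 * has the smallest total load. Writes the chosen processor for thread t
 * into owner[t] and returns the load of the busiest processor. */
double greedy_assign(const double *load, int nthreads, int nprocs, int *owner)
{
    assert(nthreads > 0 && nprocs > 0);
    double *proc_load = malloc((size_t)nprocs * sizeof(double));
    assert(proc_load != NULL);
    for (int p = 0; p < nprocs; p++)
        proc_load[p] = 0.0;

    for (int t = 0; t < nthreads; t++) {
        int best = 0;                       /* lowest index wins ties */
        for (int p = 1; p < nprocs; p++)
            if (proc_load[p] < proc_load[best])
                best = p;
        owner[t] = best;
        proc_load[best] += load[t];
    }

    double max = proc_load[0];
    for (int p = 1; p < nprocs; p++)
        if (proc_load[p] > max)
            max = proc_load[p];
    free(proc_load);
    return max;
}

/* Baseline: thread t stays on processor t * nprocs / nthreads
 * (contiguous blocks, no balancing). Returns the busiest processor's load. */
double block_max_load(const double *load, int nthreads, int nprocs)
{
    assert(nthreads > 0 && nprocs > 0);
    double max = 0.0;
    for (int p = 0; p < nprocs; p++) {
        double sum = 0.0;
        for (int t = 0; t < nthreads; t++)
            if (t * nprocs / nthreads == p)
                sum += load[t];
        if (sum > max)
            max = sum;
    }
    return max;
}
```

With eight threads of load 8, 7, 6, 5, 4, 3, 2, 1 on two processors, the block placement leaves one processor with 26 units of work, while the greedy placement brings the busiest processor down to 18, which is half of the total. With only one thread per processor, neither placement could improve anything.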

When a thread migrates, all its data must be brought with it. ``Stack data'', such as variables declared locally in a subroutine, will be brought along with the thread automatically. Global data, as described in Section 2.1, is never brought with the thread and should generally be avoided.

``Heap data'' in C and C++ consists of structures and arrays allocated using malloc or new; in Fortran, heap data consists of TYPEs or arrays allocated using ALLOCATE. To bring heap data along with a migrating thread, you have two choices: write a pup routine or use isomalloc. Pup routines are described in Section 3.1.
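The idea behind the first choice can be sketched in plain C: heap data is flattened into a buffer on the old processor and rebuilt from that buffer on the new one. This is only an illustration of the concept, not the pup interface itself (see Section 3.1 for that); the Grid type and the grid_packed_size, grid_pack, and grid_unpack functions are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example type: a struct that owns a heap array. */
typedef struct {
    int n;          /* number of cells */
    double *cells;  /* heap array of n doubles */
} Grid;

/* Number of bytes needed to hold a flattened copy of g. */
size_t grid_packed_size(const Grid *g)
{
    return sizeof(int) + (size_t)g->n * sizeof(double);
}

/* Flatten g into buf: the count first, then the array contents.
 * The pointer itself is NOT stored, since it is meaningless elsewhere. */
void grid_pack(const Grid *g, unsigned char *buf)
{
    assert(g->n > 0 && g->cells != NULL);
    memcpy(buf, &g->n, sizeof(int));
    memcpy(buf + sizeof(int), g->cells, (size_t)g->n * sizeof(double));
}

/* Rebuild a Grid from buf, allocating a fresh heap array for the cells.
 * The caller owns the returned cells array and must free it. */
Grid grid_unpack(const unsigned char *buf)
{
    Grid g;
    memcpy(&g.n, buf, sizeof(int));
    assert(g.n > 0);
    g.cells = malloc((size_t)g.n * sizeof(double));
    assert(g.cells != NULL);
    memcpy(g.cells, buf + sizeof(int), (size_t)g.n * sizeof(double));
    return g;
}
```

Note that the rebuilt array generally lives at a different address than the original, so any other pointers into the old array would have to be fixed up by hand. That bookkeeping is the cost of this approach.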

Isomalloc is a special mode which controls the allocation of heap data. You enable isomalloc allocation using the link-time flag ``-memory isomalloc''. With isomalloc, migration is completely transparent: all your allocated data is automatically brought to the new processor. The data is unpacked at the same location (the same virtual addresses) where it was originally stored, so even cross-linked data structures that contain pointers still work properly.
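The importance of keeping the same virtual addresses can be seen in a small plain C sketch that uses no AMPI or isomalloc calls. The Node type and the build_chain, copy_bytes, and chain_is_self_contained helpers are hypothetical names for this illustration. A pointer-linked chain is copied byte-for-byte to a different address, and the pointers in the copy still refer to the original storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cross-linked structure: each node points to the next. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Link nodes[0..n-1] into a chain stored inside the array itself. */
void build_chain(Node *nodes, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        nodes[i].value = i;
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
    }
}

/* Raw byte copy, standing in for moving data to a NEW address. */
void copy_bytes(Node *dst, const Node *src, int n)
{
    memcpy(dst, src, (size_t)n * sizeof(Node));
}

/* Returns 1 if every next pointer refers to a node of this same array,
 * 0 if any pointer leads somewhere else. */
int chain_is_self_contained(const Node *nodes, int n)
{
    for (int i = 0; i < n; i++) {
        const Node *expected = (i + 1 < n) ? &nodes[i + 1] : NULL;
        if (nodes[i].next != expected)
            return 0;
    }
    return 1;
}
```

After copy_bytes, the copy fails the chain_is_self_contained check because its first node still points into the original array. If the bytes had instead landed at the original addresses, as the text above describes for isomalloc, every pointer would remain valid with no fix-up step.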

The limitations of isomalloc are:

February 12, 2012