A CHARM++ program consists of a number of CHARM++ objects distributed across the available processors. The basic unit of parallel computation in CHARM++ programs is the chare, a CHARM++ object that can be created on any available processor and can be accessed from remote processors. A chare is similar to a process, an actor, an Ada task, etc. Chares are created dynamically, and many chares may be active simultaneously. Chares send messages to one another to invoke methods asynchronously. Conceptually, the system maintains a "work-pool" consisting of seeds for new chares and messages for existing chares. The runtime system (called the Charm Kernel) may pick multiple items, non-deterministically, from this pool and execute them.
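The work-pool idea can be illustrated with a toy model in plain C++. This is only a sketch and not the CHARM++ API: `WorkPool`, `Counter`, and `demoWorkPool` are hypothetical names, the pool runs on a single processor, and items are taken in FIFO order, whereas the real runtime may pick items non-deterministically.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <initializer_list>
#include <memory>
#include <utility>

// Toy model of a message-driven work-pool; NOT the Charm++ API.
struct WorkPool {
    std::deque<std::function<void()>> items;  // seeds and messages alike

    void enqueue(std::function<void()> item) { items.push_back(std::move(item)); }

    // Runs items one at a time until the pool is empty. Each item runs to
    // completion before the next one starts. FIFO order is a simplification.
    int run() {
        int executed = 0;
        while (!items.empty()) {
            std::function<void()> item = std::move(items.front());
            items.pop_front();
            item();  // an item may enqueue further items
            ++executed;
        }
        return executed;
    }
};

// Hypothetical chare-like object: state plus a method invoked by messages.
struct Counter {
    int total = 0;
    void add(int x) { total += x; }
};

// Hypothetical demo: a seed creates a Counter, then messages invoke add().
inline int demoWorkPool() {
    WorkPool pool;
    std::unique_ptr<Counter> chare;  // does not exist until its seed runs

    // Seed: creates the chare, then sends it three messages.
    pool.enqueue([&] {
        chare = std::make_unique<Counter>();
        for (int x : {1, 2, 3}) {
            pool.enqueue([&chare, x] { chare->add(x); });
        }
    });

    int executed = pool.run();
    assert(executed == 4);  // one seed plus three messages
    return chare->total;
}
```

Calling `demoWorkPool()` executes the seed first and then the three messages it generated, so the chare ends with a total of 6.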
Methods of a chare that can be remotely invoked are called entry methods. Entry methods may take marshalled parameters or a pointer to a message object. Since chares can be created on remote processors, at least one constructor of a chare must be an entry method. Ordinary entry methods are completely non-preemptive: CHARM++ will never interrupt an executing method to start any other work, and all calls made are asynchronous.
CHARM++ provides dynamic seed-based load balancing, so the location (processor number) need not be specified when creating a remote chare. The Charm Kernel will then place the remote chare on the least loaded processor. One can therefore imagine chare creation as generating only a seed for the new chare, which may take root on the most fertile processor. The Charm Kernel identifies a chare by a ChareID. Since user code does not need to name a chare's processor, chares can potentially migrate from one processor to another. (This behavior is used by the dynamic load-balancing framework for chare containers, such as arrays.)
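The placement rule described above can be sketched as a greedy choice of the least loaded processor. The code below is a toy model under stated assumptions, not the CHARM++ load balancer: `leastLoaded` and `placeSeeds` are hypothetical names, loads are plain integers, each seed carries a known cost, and ties go to the lowest processor number.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of seed placement; NOT the Charm++ load balancer.
// Returns the index of the processor with the smallest load
// (the lowest index wins ties).
inline std::size_t leastLoaded(const std::vector<int>& load) {
    assert(!load.empty());
    return static_cast<std::size_t>(
        std::min_element(load.begin(), load.end()) - load.begin());
}

// Places each seed on the currently least loaded processor and returns
// the processor chosen for each seed, in order. The load vector is taken
// by value, so the caller's copy is left unchanged.
inline std::vector<std::size_t> placeSeeds(std::vector<int> load,
                                           const std::vector<int>& seedCosts) {
    std::vector<std::size_t> placement;
    for (int cost : seedCosts) {
        std::size_t pe = leastLoaded(load);
        load[pe] += cost;  // the new chare adds work to that processor
        placement.push_back(pe);
    }
    return placement;
}
```

With initial loads `{5, 2, 4}` and seed costs `{3, 1, 1}`, the seeds land on processors 1, 2, and 0 in that order. The caller never names a processor, which is what leaves the runtime free to choose or later change the location.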
Other CHARM++ objects are collections of chares. They are chare-arrays, chare-groups, and chare-nodegroups, referred to as arrays, groups, and nodegroups throughout this manual. An array is a collection of an arbitrary number of migratable chares, indexed by some index type, and mapped to processors according to a user-defined map group. A group (nodegroup) is a collection of chares, one per processor (SMP node), that is addressed using a unique system-wide name.
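The difference between an array and a group can be made concrete with a small sketch. It is a toy model, not the CHARM++ API: the function names are hypothetical, and the round-robin mapping is just one assumed map policy chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical map policy for this sketch: element i lives on processor
// i mod P. A real map group may use any user-defined rule.
inline int roundRobinMap(int index, int numProcessors) {
    assert(numProcessors > 0 && index >= 0);
    return index % numProcessors;
}

// Array-like collection: any number of elements, placed by a map function.
// Returns how many elements land on each processor.
inline std::vector<int> arrayElementsPerProcessor(int numElements,
                                                  int numProcessors) {
    std::vector<int> counts(static_cast<std::size_t>(numProcessors), 0);
    for (int i = 0; i < numElements; ++i) {
        ++counts[static_cast<std::size_t>(roundRobinMap(i, numProcessors))];
    }
    return counts;
}

// Group-like collection: exactly one member per processor, regardless of
// the problem size.
inline std::vector<int> groupMembersPerProcessor(int numProcessors) {
    return std::vector<int>(static_cast<std::size_t>(numProcessors), 1);
}
```

For 10 array elements on 4 processors this map gives 3, 3, 2, and 2 elements per processor, while a group always has one member on each of the 4 processors.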
Every CHARM++ program must have at least one mainchare. Each mainchare is created by the system on processor 0 when the CHARM++ program starts up. Execution of a CHARM++ program begins with the Charm Kernel constructing all the designated mainchares. For a mainchare named X, execution starts at constructor X() or X(CkArgMsg *), which are equivalent. Typically, the mainchare constructor starts the computation by creating arrays, other chares, and groups. It can also be used to initialize shared readonly objects.
The only method of communication between processors in CHARM++ is asynchronous entry method invocation on remote chares. For this purpose, the Charm Kernel needs to know the types of chares in the user program, the methods that can be invoked on these chares from remote processors, the arguments these methods take as input, etc. Therefore, when the program starts up, these user-defined entities need to be registered with the Charm Kernel, which assigns a unique identifier to each of them. When invoking a method on a remote object, these identifiers need to be specified to the Charm Kernel. Registering user-defined entities and maintaining these identifiers can be cumbersome. Fortunately, this is done automatically by the CHARM++ interface translator. The translator also generates definitions for proxy objects. A proxy object acts as a handle to a remote chare. One invokes methods on a proxy object, which in turn carries out remote method invocation on the chare.
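The division of labor between registration, identifiers, and proxies can be sketched by hand in plain C++. This is a toy model and not what the CHARM++ interface translator generates: `Registry`, `AccountEntryIds`, `registerAccount`, and `AccountProxy` are hypothetical names, every entry method takes a single `int`, and dispatch here is a direct local call rather than an asynchronous message to a remote processor.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy model of registration and proxies; NOT translator-generated code.
struct Account {
    int balance = 0;
};

// Registry: every registered entry method receives a unique integer id.
class Registry {
public:
    using Entry = std::function<void(Account&, int)>;

    int registerEntry(Entry entry) {
        entries_.push_back(std::move(entry));
        return static_cast<int>(entries_.size()) - 1;
    }

    // Dispatch by identifier, as a runtime would after receiving a message.
    void invoke(int id, Account& target, int arg) const {
        assert(id >= 0 && id < static_cast<int>(entries_.size()));
        entries_[static_cast<std::size_t>(id)](target, arg);
    }

private:
    std::vector<Entry> entries_;
};

// Identifiers handed out at startup for the two entry methods of Account.
struct AccountEntryIds {
    int deposit;
    int withdraw;
};

inline AccountEntryIds registerAccount(Registry& registry) {
    AccountEntryIds ids;
    ids.deposit = registry.registerEntry([](Account& a, int n) { a.balance += n; });
    ids.withdraw = registry.registerEntry([](Account& a, int n) { a.balance -= n; });
    return ids;
}

// Hand-written stand-in for a generated proxy: the caller uses ordinary
// method syntax, and the proxy supplies the identifiers.
class AccountProxy {
public:
    AccountProxy(const Registry& registry, AccountEntryIds ids, Account& target)
        : registry_(registry), ids_(ids), target_(target) {}

    void deposit(int n) const { registry_.invoke(ids_.deposit, target_, n); }
    void withdraw(int n) const { registry_.invoke(ids_.withdraw, target_, n); }

private:
    const Registry& registry_;
    AccountEntryIds ids_;
    Account& target_;
};
```

User code written against `AccountProxy` never touches an identifier: after `proxy.deposit(10)` and `proxy.withdraw(3)` the target account holds a balance of 7. Hiding that bookkeeping is the convenience the generated proxies provide.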
In addition, the CHARM++ interface translator provides ways to enhance the basic functionality of the Charm Kernel using user-level threads and futures. These allow entry methods to be executed in separate user-level threads. Such threaded entry methods may block while waiting for data, by making synchronous calls to remote object methods that return results in messages.
CHARM++ program execution is terminated by the CkExit call. Like the exit system call, CkExit never returns. The Charm Kernel ensures that no more messages are processed and no entry methods are called after CkExit has been called. CkExit need not be called on all processors; it is enough to call it from just one processor at the end of the computation.
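The termination guarantee can be illustrated with a toy scheduler that stops delivering messages once exit has been requested. This is a sketch under stated assumptions, not the CHARM++ runtime: `ExitingScheduler`, `requestExit`, and `demoExit` are hypothetical names, and unlike CkExit, `requestExit` returns to its caller.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Toy scheduler with an exit flag; NOT the Charm++ runtime.
class ExitingScheduler {
public:
    void send(std::function<void()> msg) { queue_.push_back(std::move(msg)); }

    // Stand-in for CkExit. Unlike CkExit, this function returns.
    void requestExit() { exitRequested_ = true; }

    // Delivers messages until the queue is empty or exit was requested.
    // Returns the number of messages actually delivered.
    int run() {
        int delivered = 0;
        while (!exitRequested_ && !queue_.empty()) {
            std::function<void()> msg = std::move(queue_.front());
            queue_.pop_front();
            msg();
            ++delivered;
        }
        return delivered;
    }

    std::size_t undelivered() const { return queue_.size(); }

private:
    std::deque<std::function<void()>> queue_;
    bool exitRequested_ = false;
};

// Hypothetical demo: the second message requests exit, so the third
// message is never delivered.
inline int demoExit() {
    ExitingScheduler scheduler;
    int work = 0;
    scheduler.send([&] { ++work; });
    scheduler.send([&] { ++work; scheduler.requestExit(); });
    scheduler.send([&] { ++work; });  // stays in the queue
    int delivered = scheduler.run();
    assert(delivered == 2);
    assert(scheduler.undelivered() == 1);
    return work;
}
```

`demoExit()` returns 2: the message queued after the exit request is left unprocessed, mirroring the rule that no entry methods run once exit has been called.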
November 23, 2009