Thus a Charisma component is a flat collection of computations attached to a set of access points as shown in figure 4.1. However, real world components may have complex control-flow structure as shown in figure 4.2. Individual computations not only depend on the availability of data for them but also upon the completion of previous computations. Further, components that implement iterative parallel computations may have loops in their control-flow structures, which cannot be clearly expressed in components written in the style of figure 4.1.
|
In addition, The order of execution of computations is determined by the order of data (bundled in Converse messages) available at the input ports and by priorities associated with these ports. Due to unpredictable delays in remote response times, the messages may arrive in any order and the programmer must deal with all possible message orderings. This increases the programming complexity significantly because the component code has to handle all possible message orderings, by buffering unexpected messages for later delivery, and by writing bookkeeping code to check and fetch from these buffers upon completion of sub-tasks. We call this code ``component coordination''.
The intra-component control flow can be expressed by issuing blocking receives for tagged messages in multithreaded components. This method is appropriate for explicit message-passing components that block for receiving specific tagged messages. In thread-based coordination, a blocking receive blocks only the calling thread, and not entire processor. An example of such orchestrator is the sequencer subroutine in NAMD that coordinates the actions of a patch object. For message-passing components that use single threaded messaging libraries such as MPI, a user-level thread is an alien concept. Thus exposing explicit thread yielding to the component developer will need major changes to made to the existing MPI-based components. However, implementation of blocking calls such as MPI_Recv or collective communication operations can internally suspend a thread and yield to other running threads on a processor. In the next chapter, we describe Adaptive MPI, our implementation of MPI over Converse that simplifies component coordination.
While threads simplify coordination of message-driven components, there are several disadvantages of threads. Threads have to pay a performance penalty in terms of the context-switching costs. Implicit references to threads' stacks make thread migration complicated. Also, it is difficult to precisely estimate the stacksize requirements for threads, leading to wastage of memory. At thread creation time, one has to reserve the maximum stack space that may be used by the thread in its lifetime. For example, in NAMD, the estimated stack size for sequencer threads was 128 KB each. For a molecular system of 37000 atoms divided into 343 patches, NAMD would consume about 65 MB of memory, causing it to exhaust all available memory on some machines.
Our method for simplifying component coordination avoids use of threads. Instead, our approach provides a language for expressing control-dependence graphs for components, and generating the coordination glue code from these graphs. Dagger [28] allows easy expression of such dependence graphs by introducing when-blocks and condition variables in a message-driven component. A when-block specifies dependences as a list of messages and condition variables with their associated reference numbers. A Dagger program tells the run-time system that it is at a stage to process a message by issuing an expect statement. A condition variable is used to signal the end of a when-block with a ready statement. Thus control-dependences among when-blocks belonging to the same message-driven object can be expressed using condition variables. However, the structure of Dagger programs does not clearly reflect the control flow within an object because a Dagger program is a flat collection of when-blocks.
We have developed a coordination language called Structured Dagger (Structured Dagger [39]) for this purpose. This language provides structured constructs that adequately express most commonly occurring control flow graphs viz. the series-parallel graphs.
The next section motivates simplification of component coordination with an example.