We have a working prototype of our interface model implemented over Converse and Charm++. Currently, components can be written in Charm++ and Adaptive MPI, and can execute on all the machines that Converse supports. We describe this implementation with example components written using Charm++.
There are three important parts of our implementation: component registration, component connection, and component creation. Before we describe each of these in detail, we discuss what a component is in terms of Charm++.
A Charisma component written using Charm++ is a Charm++ chare array. An array element in Charm++ is the most general abstraction for a message-driven object, since it can belong to arbitrary collections (arrays), can migrate in order to balance load, and has a globally unique name and index given to it at runtime. The index type used for this array element can be any user-defined type up to a maximum size (configured at compile time). This is critical for writing reusable components, since a component may be reused as a subcomponent of a parent component that organizes its elements in any kind of collection. For example, a 3-D Jacobi component that performs neighborhood averaging may be used in a component that uses 3-D block decomposition or 2-D pencil-like decomposition, or as the leaves (indexed with a bit-vector) of an oct-tree in adaptive mesh-refinement applications.
The port abstraction of Charisma does not impose any restrictions on the syntax for using ports. Instead, the syntax for the ports depends on the language bindings. This makes it possible for language-runtime developers to provide abstractions that are natural for the component developer in that language. For example, ports implemented as templated classes may be natural to a Charm++ component developer, but alien to an MPI-FORTRAN90 programmer. One choice for the port abstraction for an AMPI programmer would be to use cross-communicators (section 5.4), where publishing data on an output port would be equivalent to sending a message using that communicator.
For the Charm++ implementation of Charisma, we have implemented output ports as subclasses of the CkCallback class from Charm++. The CkCallback mechanism allows runtime binding of various types of method invocations. An input port in Charisma corresponds to an entry method of a Charm++ array element. Binding an output port OutP of a component CompA to an input port InP of a component CompB involves getting the unique ID of CompB and binding the callback part of OutP to invoke method InP on CompB. The emit method of OutP is a wrapper that asynchronously fires the callback.
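The binding described above can be illustrated with a minimal, self-contained C++ sketch. It does not use Charm++ or CkCallback; the class OutPort, the helper connect, and the function exampleBinding are hypothetical names introduced only for illustration, and delivery here is synchronous, whereas the actual emit fires the callback asynchronously.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an output port: it holds a callback that is
// bound at runtime, in the spirit of (but not using) Charm++'s CkCallback.
template <typename T>
class OutPort {
public:
    // Bind this port to any callable that accepts a T.
    void bind(std::function<void(const T&)> target) { target_ = std::move(target); }

    bool isBound() const { return static_cast<bool>(target_); }

    // Publish a value; in this sketch an unbound port drops the data.
    void emit(const T& value) const {
        if (target_) target_(value);
    }

private:
    std::function<void(const T&)> target_;
};

// Consumer whose input port InP is an ordinary method, mirroring
// "an input port corresponds to an entry method".
struct CompB {
    std::vector<double> received;
    void InP(const double& v) { received.push_back(v); }
};

// Producer that knows only its own output port, not who consumes it.
struct CompA {
    OutPort<double> OutP;
    void step(double v) { OutP.emit(v); }
};

// Connect a.OutP to b.InP; neither component names the other.
inline void connect(CompA& a, CompB& b) {
    a.OutP.bind([&b](const double& v) { b.InP(v); });
}

// Usage sketch: returns what CompB saw after one unbound and one bound emit.
inline std::vector<double> exampleBinding() {
    CompA a;
    CompB b;
    a.step(1.0);               // dropped: OutP is not bound yet
    connect(a, b);
    assert(a.OutP.isBound());
    a.step(2.5);               // delivered to b.InP
    return b.received;
}
```

The point of the sketch is that CompA is compiled without any reference to CompB; the connection is established entirely from outside both components.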
Runtime systems of different languages on top of Converse may provide different kinds of callbacks. For example, a multi-threaded language will associate a callback with its input ports that awakens a suspended thread. In message-passing languages, a callback object for an input port would make it appear as if the data to be accepted on that port comes as a message with a specific tag or from a specific processor. Even in Charm++, a component may use the Charisma API instead of the translator-generated code to specify a different kind of callback for its input port. This gives component developers considerable flexibility in building components.
The interface language translator reads the definition of the component from an appropriately named file. For example, a component Jacobi would be defined in file Jacobi.co. It then generates a definition of class CoJacobi in file Jacobi.co.h. The CoJacobi class is used as a base class of the Jacobi component. The CoJacobi class contains output port definitions (with names and types specified in the .co file), and two constructors. One of the constructors is invoked by the initialization code of the application to ask the component to register itself and its ports with the runtime system without actually creating the component. The other constructor is invoked when the component is actually created. This constructor reads the component-specific part of connection information from the system and binds its ports.
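To make the role of the two constructors concrete, the following is a rough sketch of what a generated base class could look like. It is not the translator's actual output: the Runtime struct, the RegisterOnly tag type, the port name boundaryOut, the peer name Solver.boundaryIn, and the function exampleLifecycle are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, greatly simplified stand-in for the Charisma runtime.
struct Runtime {
    std::vector<std::string> registeredPorts;        // filled during registration
    std::map<std::string, std::string> connections;  // output port -> input port
};

// Tag type selecting the registration-only constructor (an assumption).
struct RegisterOnly {};

// Sketch of a base class a translator might generate from Jacobi.co.
class CoJacobi {
public:
    // Constructor 1: register the component's ports; nothing is created.
    CoJacobi(RegisterOnly, Runtime& rt) {
        rt.registeredPorts.push_back("Jacobi.boundaryOut");
    }

    // Constructor 2: actual creation; look up and bind the output port.
    explicit CoJacobi(const Runtime& rt) {
        auto it = rt.connections.find("Jacobi.boundaryOut");
        if (it != rt.connections.end()) boundaryOutTarget = it->second;
    }

    std::string boundaryOutTarget;  // stays empty while the port is unbound
};

// The user-written component derives from the generated base class.
class Jacobi : public CoJacobi {
public:
    using CoJacobi::CoJacobi;  // inherit both constructors
};

// Usage sketch covering the three phases in order.
inline std::string exampleLifecycle() {
    Runtime rt;
    Jacobi probe(RegisterOnly{}, rt);                            // registration
    assert(rt.registeredPorts.size() == 1);
    rt.connections["Jacobi.boundaryOut"] = "Solver.boundaryIn";  // connection
    Jacobi live(rt);                                             // creation
    return live.boundaryOutTarget;
}
```

Separating the two constructors lets the initialization code learn about a component's ports without paying the cost of instantiating the component itself.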
The application script specifies the top-level components to be created, and specifies connections between these top-level components using the same scripting language. However, an application script is translated using a special mode of the translator, which treats it as an application rather than as a component. This involves producing a main program that actually starts the execution.
This generated main program starts by registering the top-level components with the Charisma runtime. These top-level components may themselves consist of subcomponents, and the registration call for each such component calls the registration methods of its subcomponents, and so on recursively. Thus the components themselves form a hierarchy. For example, if a simulation application consists of components Rocflo, Rocsol, and Rocface, calls to Rocflo::Register, Rocsol::Register, and Rocface::Register are generated. These components in turn register their subcomponents by calling FloChunk::Register, SolChunk::Register, and FaceChunk::Register for all the chunks they contain. Components that do not contain any further subcomponents register their ports. This hierarchical registration also aids in assigning globally unique names to components and their ports. For example, the main script of the producer-consumer application registers the producer component with name p and the consumer component with name c, whereas the producer registers its ports, such as PutData, without knowing the name it has been assigned by the main script. The Charisma runtime maintains the hierarchy of components and ports. For example, the PutData port of p is referred to as p.PutData outside of p, but simply as PutData inside p. Thus, in the main application script outside p, a connection such as p.PutData to c.GetData can be made.
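The naming scheme can be sketched with a small scoped registry. This is an illustrative assumption about how qualified names might be built, not the Charisma runtime's actual data structure; Registry, enter, leave, registerPort, and registerApplication are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry that builds globally unique, dot-separated names
// from the nesting of registration calls.
class Registry {
public:
    // Enter a component scope, e.g. "p" or "Rocflo".
    void enter(const std::string& name) { scope_.push_back(name); }

    // Leave the innermost component scope.
    void leave() {
        assert(!scope_.empty());
        scope_.pop_back();
    }

    // Register a port by its local name; the registry qualifies it.
    void registerPort(const std::string& localName) {
        ports_.push_back(qualify(localName));
    }

    const std::vector<std::string>& ports() const { return ports_; }

private:
    // Prefix the local name with every enclosing scope.
    std::string qualify(const std::string& local) const {
        std::string full;
        for (const std::string& s : scope_) full += s + ".";
        return full + local;
    }

    std::vector<std::string> scope_;
    std::vector<std::string> ports_;
};

// Leaf components register ports without knowing their assigned name.
struct Producer {
    static void Register(Registry& r) { r.registerPort("PutData"); }
};
struct Consumer {
    static void Register(Registry& r) { r.registerPort("GetData"); }
};

// What a generated main program might do for the producer-consumer script.
inline void registerApplication(Registry& r) {
    r.enter("p");
    Producer::Register(r);
    r.leave();
    r.enter("c");
    Consumer::Register(r);
    r.leave();
}
```

After registerApplication runs, the registry holds the names p.PutData and c.GetData, even though Producer and Consumer only ever mention PutData and GetData.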
At the end of the registration phase, the Charisma runtime knows how many components exist in an application. The next phase is the connection phase. In the connection phase, the top-level connections are specified to the runtime, and the generated connect methods on all the components are called, which in turn specify their own connections to the runtime. The runtime makes use of the hierarchical structure of the registered components to store connections. At the end of the connection phase, the Charisma runtime knows the connectivity of the components in an application. This connectivity is stored in the form of a graph, which the runtime then passes to a graph partitioner (we use the freely available METIS [51] partitioning tool). The graph partitioner produces an assignment of components to processors, in the form of a table that specifies a processor number for each component.
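The shape of the partitioner's input and output can be sketched as follows. The partitioning routine below is a toy greedy heuristic written for this illustration; it is not METIS and does not reproduce its algorithm or its API. ConnectionGraph and partition are hypothetical names, and all components and connections carry unit weight.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Component connectivity as an undirected graph with unit weights.
struct ConnectionGraph {
    int numComponents;
    std::vector<std::pair<int, int>> edges;  // pairs of component indices
};

// Toy stand-in for a graph partitioner (NOT METIS): visit components in
// index order and place each on the processor that already holds most of
// its neighbours, never exceeding ceil(n / p) components per processor.
// Returns a table mapping component index -> processor number.
inline std::vector<int> partition(const ConnectionGraph& g, int numProcs) {
    assert(numProcs > 0 && g.numComponents >= 0);
    const int cap = (g.numComponents + numProcs - 1) / numProcs;
    std::vector<int> table(g.numComponents, -1);
    std::vector<int> load(numProcs, 0);

    for (int c = 0; c < g.numComponents; ++c) {
        // Count already-placed neighbours of c on each processor.
        std::vector<int> score(numProcs, 0);
        for (const auto& e : g.edges) {
            int other;
            if (e.first == c) other = e.second;
            else if (e.second == c) other = e.first;
            else continue;
            if (table[other] >= 0) ++score[table[other]];
        }

        // Pick the best processor that still has room; break ties by load.
        int best = -1;
        for (int p = 0; p < numProcs; ++p) {
            if (load[p] >= cap) continue;
            if (best < 0 || score[p] > score[best] ||
                (score[p] == score[best] && load[p] < load[best])) {
                best = p;
            }
        }
        table[c] = best;
        ++load[best];
    }
    return table;
}
```

For a graph of four components with connections 0-1 and 2-3 on two processors, this heuristic keeps each connected pair together and places the two pairs on different processors.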
For the purpose of partitioning, all components are currently assumed to have equal computational load, and all connections equal weight. In the future, we plan to associate measures of load and communication volume with each component and connection, so that they can be taken into account during graph partitioning. The processor assignment produced by Charisma is used only as a guideline for creating components. The dynamic load balancing strategies of Charisma take into account the connectivity of components, in addition to the measured computational loads of individual components, to balance the load periodically by migrating components at runtime.
Components are then created as Charm++ arrays. Even singleton components, such as the producer in the producer-consumer application, are created as an array (of one element). This lets us address all components uniformly and simplifies the implementation. Representing components as arrays is a choice of our implementation, not a restriction imposed by Charisma. Indeed, Charisma promotes cross-paradigm components by leaving it to the component developers to use whatever abstraction they find suitable.
The components (array elements) themselves are created with a special parameter that enables them, upon creation, to look up their connections in the connection database maintained by the runtime system. Thus the first task each component performs upon creation is to query the runtime system for the other endpoints of its output ports. Between issuing the creation commands and the actual creation of the components, the Charisma runtime makes sure that each processor has the connection graph, and thus can satisfy the connection-information request of a newly created component locally. This is currently achieved by storing the connection database in a readonly message that is distributed to all the processors before any objects are actually created. In the future, we plan to use the port ownership information provided to the runtime system by the components, and to distribute the connection graph as a distributed table with local caching instead of replicating the entire graph on all processors.
The output ports of each component contain a callback structure that gets initialized with the array identifier, index of a component within that array, and the entry method index corresponding to the input port of that component. The emit method on the output port then becomes a wrapper around the send method of the callback object.
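The three fields of the callback structure can be illustrated with a small stand-alone sketch. It does not use Charm++; PortTarget, Element, MiniRuntime, OutputPort, and exampleEmit are hypothetical names, the integer payload is an arbitrary choice, and delivery is synchronous and local here, whereas the actual runtime delivers messages asynchronously and possibly across processors.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <vector>

// Hypothetical callback target: which array, which element of that array,
// and which entry method (input port) of that element.
struct PortTarget {
    int arrayId;
    int elementIndex;
    int entryMethod;
};

// Hypothetical array element whose entry methods are addressed by index.
struct Element {
    std::vector<std::function<void(int)>> entryMethods;
};

// Toy stand-in for the runtime's delivery path.
struct MiniRuntime {
    std::map<int, std::vector<Element>> arrays;  // arrayId -> elements

    // Resolve the target and invoke the entry method with the payload.
    // The at() calls throw std::out_of_range for an invalid target.
    void send(const PortTarget& t, int payload) {
        Element& e = arrays.at(t.arrayId).at(t.elementIndex);
        e.entryMethods.at(t.entryMethod)(payload);
    }
};

// Output port whose emit is a thin wrapper over the callback's send.
struct OutputPort {
    MiniRuntime* rt;
    PortTarget target;
    void emit(int payload) const { rt->send(target, payload); }
};

// Usage sketch: a singleton consumer stored as an array of one element.
inline int exampleEmit() {
    MiniRuntime rt;
    int lastSeen = 0;
    rt.arrays[7].resize(1);  // array 7 holds a single element
    rt.arrays[7][0].entryMethods.push_back(
        [&lastSeen](int v) { lastSeen = v; });  // entry method 0
    OutputPort putData{&rt, PortTarget{7, 0, 0}};
    putData.emit(42);
    assert(lastSeen == 42);
    return lastSeen;
}
```

Because the target is plain data (three integers in this sketch), it can be filled in from the connection database at creation time and copied along with the component when it migrates.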