By allowing the components to provide specialized implementations of input ports, Charisma provides efficient ways to integrate legacy codes, as well as to deal with advanced irregular applications, even those using the multi-partitioning approach (see chapter 5). In this section, we describe how a language runtime may enable its applications to be ``componentized'' by providing efficient port implementations.
Consider a Finite Element Method framework [10] that uses a sequential mesh partitioner (such as METIS [51]) to re-partition a finite element mesh distributed across a parallel machine. Computations on the mesh are carried out by a component, FEMComp, implemented as a chare array in Charm++, where each chare holds a chunk of the entire mesh. Each element of this chare array has one output port, on which it emits its portion of the mesh connectivity information. These portions must be combined, and once the connectivity of the entire mesh is available, it is supplied to the sequential METIS library routine, which re-partitions the mesh. The new partitioning is then conveyed back to FEMComp via its input ports.
As the simplest implementation of the re-partitioner component, one may
write a sequential object wrapper, SeqPartitioner, around the METIS
partitioner (Figure 3.22). SeqPartitioner can be
implemented as a chare in Charm++. For an FEM application running with p
partitions, SeqPartitioner has p input ports and p output ports.
Each of SeqPartitioner's input ports is connected to the
corresponding output port of FEMComp, and each of its output ports is
connected to the corresponding input port of FEMComp. When the mesh
connectivity of a partition is published on the output port of any element of
the FEMComp chare array, the connected input port simply forwards that
information, in the form of a Charm++ message, to the SeqPartitioner
chare. The SeqPartitioner object buffers incoming messages until it
has received all p of them. It then combines the mesh connectivity
information into a single array and supplies
it to the METIS library function, which partitions the mesh. After partitioning,
SeqPartitioner splits the returned partitioning into FEM mesh
chunks and publishes each chunk on the corresponding output port, from which it reaches
the appropriate constituent chare of the FEMComp component through the
connected input port.
{CodeOne}
class SeqPartitioner : public Chare {
    Port *inputs;
    Port *outputs;
    int np, nrcvd;
    Partition** parts;
  public:
    SeqPartitioner(char *name, int p) {
      np = p;
      inputs = new Port[p];
      outputs = new Port[p];
      parts = new Partition*[p];
      // initialize input and output ports
      for(int i=0;i<p;i++) {
        inputs[i].init(i, thishandle,
                       GetMethodID("RecvPartition", "SeqPartitioner"));
        outputs[i].init(name, "outputs", i);
      }
      nrcvd = 0;
    }
    void RecvPartition(int p, Partition *part) {
      // buffer incoming message from input port p
      parts[p] = part;
      nrcvd++;
      // have all messages been received ?
      if(nrcvd==np) {
        Partition *comb = CombinePartitions(np, parts);
        // call METIS partitioner
        METIS_PartGraphRecursive(...);
        parts = SplitPartitions(comb, np, ..);
        // publish each new chunk on its output port
        for(int i = 0; i<np; i++)
          outputs[i].emit(parts[i]);
        // reset the counter for the next re-partitioning round
        nrcvd = 0;
      }
    }
};
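The buffer-until-complete pattern used by SeqPartitioner can be tried outside Charm++ with a small standalone sketch. Everything in it is an assumption made for illustration: the names GatherScatter and Chunk are hypothetical, the emit callback stands in for outputs[i].emit(...), and an even block split stands in for the METIS call, whose arguments are elided in the figure.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a mesh chunk: a flat list of element ids.
using Chunk = std::vector<int>;

// Buffers one chunk per input port and acts once all np chunks have arrived.
class GatherScatter {
public:
    // emit(i, chunk) plays the role of outputs[i].emit(...) in the figure.
    GatherScatter(int np, std::function<void(int, const Chunk&)> emit)
        : np_(np), nrcvd_(0), parts_(np), emit_(std::move(emit)) {
        assert(np > 0);
    }

    // Plays the role of RecvPartition: input port p delivers its chunk.
    void recv(int p, Chunk part) {
        parts_[p] = std::move(part);      // buffer the incoming chunk
        if (++nrcvd_ < np_) return;       // still waiting for other ports

        // Combine all buffered chunks into one array, in port order.
        Chunk comb;
        for (const Chunk& c : parts_)
            comb.insert(comb.end(), c.begin(), c.end());

        // Stand-in for the partitioner: split into np contiguous blocks.
        const std::size_t n = comb.size();
        for (int i = 0; i < np_; ++i) {
            auto lo = static_cast<std::ptrdiff_t>(n * i / np_);
            auto hi = static_cast<std::ptrdiff_t>(n * (i + 1) / np_);
            emit_(i, Chunk(comb.begin() + lo, comb.begin() + hi));
        }
        nrcvd_ = 0;                       // ready for the next round
    }

private:
    int np_, nrcvd_;
    std::vector<Chunk> parts_;
    std::function<void(int, const Chunk&)> emit_;
};
```

In this sketch nothing is emitted until the last of the np chunks arrives, regardless of arrival order, which mirrors the nrcvd==np test in RecvPartition.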
Note that the number of messages received by the SeqPartitioner component is equal to the number of chares in the FEMComp chare array. Irregular applications such as FEM computations typically use the multi-partitioning approach, where the number of chare array elements is much larger than the number of available processors. In such cases, the number of messages that SeqPartitioner has to process is much larger than the number of processors, and each processor may send several messages to the processor where SeqPartitioner resides. An obvious optimization is to combine all the messages originating from the same processor and send the combined message to SeqPartitioner. One can implement this optimization in the input port abstraction provided by SeqPartitioner, as shown in Figure 3.23. Instead of simply forwarding the mesh connectivity it receives to the SeqPartitioner chare, as was done previously, this implementation buffers it in the object group ParPartitioner until all the output ports on that processor have emitted their connectivity. It then concatenates all the data into a single message and sends it to SeqPartitioner, which splits it into its constituent parts and carries on as before. As a further optimization, instead of simply concatenating the connectivity information of all mesh partitions on the same processor, the input port implementation may be modified to eliminate duplicate mesh connectivity information that results from boundary nodes shared by adjoining regions. Note that these optimizations are carried out by the input port implementation provided by the SeqPartitioner component, without the connected FEMComp component being aware of them.
{CodeOne}
class ParPartitioner : public Group {
    int nregistered, nrcvd;
    vector<Partition*> parts;
    CkChareID cid;
  public:
    // SeqPartitioner creates the ports as before
    // supplying them with the ID of this group
    ParPartitioner(CkChareID id) {
      cid = id; // id of the SeqPartitioner
      nregistered = 0;
      nrcvd = 0;
    }
    // input ports register with the local group representative
    void Register(void) {
      nregistered++;
    }
    // called only from local input ports
    void RecvPartition(int p, Partition *part) {
      // buffer incoming message; append rather than index by p,
      // since the vector starts empty and p need not be a local index
      parts.push_back(part);
      nrcvd++;
      // have all local messages been received ?
      if(nrcvd==nregistered) {
        Partition *comb = CombinePartitions(nregistered, parts);
        // Send the combined partition to SeqPartitioner
        CProxy_SeqPartitioner psp(cid);
        psp.RecvPartition(CkMyPe(), comb);
        // reset the buffer and counter for the next round
        parts.clear();
        nrcvd = 0;
      }
    }
    // ...
};
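The duplicate-eliminating variant of the combining step is described in the text but not shown in the figure. A minimal sketch follows, under the assumption (not taken from the source) that connectivity is represented as undirected edges between global node ids; the names Edge, Connectivity, and combineUnique are hypothetical, and the real Partition type may be organized quite differently.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical representation: undirected edges between global node ids.
using Edge = std::pair<int, int>;
using Connectivity = std::vector<Edge>;

// Concatenate the chunks buffered on one processor and drop edges that
// appear more than once, e.g. along boundaries shared by adjoining chunks.
Connectivity combineUnique(const std::vector<Connectivity>& local) {
    Connectivity comb;
    for (const Connectivity& c : local) {
        for (Edge e : c) {
            // Normalize orientation so (a,b) and (b,a) compare equal.
            if (e.first > e.second) std::swap(e.first, e.second);
            comb.push_back(e);
        }
    }
    // Sort, then remove adjacent duplicates.
    std::sort(comb.begin(), comb.end());
    comb.erase(std::unique(comb.begin(), comb.end()), comb.end());
    return comb;
}
```

A routine of this kind would take the place of the plain concatenation in the per-processor RecvPartition, so the message sent to SeqPartitioner carries each shared boundary edge only once. The sender-side component is unaffected either way.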
Indeed, one can use a parallel mesh partitioner such as ParMETIS [52] in place of the sequential partitioner. ParPartitioner, the component wrapper around ParMETIS, could be implemented as a Charm++ object group, with one representative object per processor. The input ports provided by ParPartitioner would be similar to the connectivity-combining ports described above. Once again, this substitution could be made completely independently of the FEMComp component.
To summarize, the Charisma interface model, which is based on the specification of the data each component consumes and publishes, enables independent development of reusable components. An important difference between traditional interface models and Charisma is that Charisma interfaces are contracts between components and the runtime system, rather than between the components themselves. Charisma provides control points for the runtime system using asynchronous method invocations, allowing the runtime system to utilize system resources more effectively. Since Charisma uses a message-driven interoperable runtime system, Converse, at its core, one can use a variety of parallel programming paradigms for building reusable components. Message-driven object-based languages such as Charm++ provide the right building blocks for efficient components because of their close match to the Charisma runtime. Charm++ also facilitates building reusable components because it supports encapsulation and object virtualization. However, a significant limitation of Charm++ is that it is difficult to express intra-component control flow in its message-driven style. We have developed a notation, Structured Dagger, that simplifies the expression of control flow for message-driven objects; it is described in the next chapter.