The orchestration code in the .or file can be divided into two parts. The header part contains information about the program, included external files, defines, and declarations of the parallel constructs used in the code. The orchestration section is made up of statements that form the global control flow of the parallel program. In the orchestration code, Charisma employs a macro dataflow approach: the statements produce and consume values, from which the control flows can be organized and messages and method invocations generated.
The very first line should give the name of the Charisma program with the program keyword.
{foodecl}
program jacobi
The program keyword can be replaced with module, which means that the output program is going to be a library module instead of a stand-alone program. Please refer to Section 4 for more details.
Next, the programmer can include external code files in the generated code with the keyword include, followed by the filename without its extension. For example, the following statement tells the Charisma compiler to look for the header file ``particles.h'' to be included in the generated header file ``jacobi.h'' and to look for the C/C++ code file ``particles.[C|cc|cpp|cxx|c]'' to be included in the generated C++ code file ``jacobi.C''.
{foodecl}
include particles;
This is useful when there is source code that must precede the generated parallel code, such as basic data structure declarations.
After the include section is the define section, where environmental variables can be defined for Charisma. For example, to tell Charisma to generate additional code to enable the load balancing module, the programmer needs to define ``ldb'' in the orchestration code. Please refer to Section 7 for details.
Next comes the declaration section, where classes, objects and parameters are declared. A Charisma program is composed of multiple sets of parallel objects which are organized by the orchestration code. Different sets of objects can be instantiated from different class types. Therefore, we have to specify the class types and object instantiation. Also we need to specify the parameters (See Section 2.1.3) to use in the orchestration statements.
A Charisma program or module has one ``MainChare'' class, and it does not require explicit instantiation since it is a singleton. The statement to declare MainChare looks like this:
{foodecl}
class JacobiMain : MainChare;
For object arrays, we first need to declare the class types inherited from 1D object array, 2D object array, etc, and then instantiate from the class types. The dimensionality information of the object array is given in a pair of brackets with each dimension size separated by a comma.
{foodecl}
class JacobiWorker : ChareArray1D;
obj workers : JacobiWorker[N];
class Cell : ChareArray3D;
obj cells : Cell[M,M,M];
Note that the keyword ``class'' is for class type derivation, and ``obj'' is for parallel object or object array instantiation. The above code segment declares a new class type JacobiWorker, which is a 1D object array; the programmer is supposed to supply sequential code for it in the files ``JacobiWorker.h'' and ``JacobiWorker.C'' (see Section 2.2 for more details on sequential code). The object array ``workers'' is instantiated from ``JacobiWorker'' and has 16 elements.
The last part is the orchestration parameter declaration. These parameters are used only in the orchestration code to connect the inputs and outputs of orchestration statements, and their data types and sizes are declared here. More explanation of these parameters can be found in Section 2.1.3.
{foodecl}
param lb : double[N];
param rb : double[N];
With this, ``lb'' and ``rb'' are declared as parameters that can be ``connected'' with local variables of type double array with a size of 512.
In the main body of orchestration code, the programmer describes the behavior and interaction of the elements of the object arrays using orchestration statements.
Foreach Statement
The most common kind of parallelism is the invocation of a method across all elements in an object array. Charisma provides a foreach statement for specifying such parallelism. The keywords foreach and end-foreach form an enclosure within which the parallel invocation is performed. The following code segment invokes the entry method compute on all the elements of the array workers.
{foodecl}
foreach i in workers
  workers[i].compute();
end-foreach
Publish Statement and Produced/Consumed Parameters
In the orchestration code, an object method invocation can have input and output (consumed and produced) parameters. Here is an orchestration code segment that exemplifies the inputs and outputs of the object methods workers.produceBorders and workers.compute.
{foodecl}
foreach i in workers
  (lb[i], rb[i]) <- workers[i].produceBorders();
  workers[i].compute(lb[i+1], rb[i-1]);
  (+error) <- workers[i].reduceData();
end-foreach
Here, the entry method workers[i].produceBorders produces (called published in Charisma) values of lb[i], rb[i], enclosed in a pair of parentheses before the publishing sign ``<-''. In the second statement, function workers[i].compute consumes values of lb[i+1], rb[i-1], just like normal function parameters. If a reduction operation is needed, the reduced parameter is marked with a ``+'' before it, like the error in the third statement.
An entry method can have an arbitrary number of published (produced and reduced) values and consumed values. In addition to basic data types, each of these values can also be an object of arbitrary type. The values published by A[i] must have the index i, whereas values consumed can have the index e(i), which is an index expression of the form i±c, where c is a constant. Although we have used different symbols (p and q) for the input and the output variables, they are allowed to overlap.
The parameters are produced and consumed in program order. Namely, a parameter produced in an earlier statement will be consumed by the next consuming statement, but will no longer be visible to any consuming statement after a subsequent statement producing the same parameter in program order. Special rules involving loops are discussed later with the loop statement.
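The index expression can be read as a rule for choosing the producer of each consumed value. The following is a minimal sketch in plain C++, not Charisma-generated code, of how an expression such as lb[i+1] or rb[i-1] selects that producer. The helper names producerIndex and consumedValues are hypothetical, and the wrap-around at the ends of the array is an assumption made only for illustration, since the text above does not say how boundary elements are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the element whose published value is
// consumed by element i under the index expression e(i) = i + c.
// Wrap-around at the array ends is an assumption of this sketch.
std::size_t producerIndex(std::size_t i, long c, std::size_t n) {
    assert(n > 0);
    const long m = static_cast<long>(n);
    long j = (static_cast<long>(i) + c) % m;
    if (j < 0) j += m;  // keep the result inside [0, n)
    return static_cast<std::size_t>(j);
}

// Values consumed by every element for an expression such as lb[i+c],
// given the values that each element published.
std::vector<double> consumedValues(const std::vector<double>& published, long c) {
    std::vector<double> consumed(published.size());
    for (std::size_t i = 0; i < published.size(); ++i)
        consumed[i] = published[producerIndex(i, c, published.size())];
    return consumed;
}
```

With four elements publishing 10, 11, 12 and 13, consumedValues(published, 1) hands element 0 the value 11 published by element 1, which mirrors workers[i].compute(lb[i+1], ...) in the example above.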
Overlap Statement
Complicated parallel programs usually have concurrent flows of control. To express this explicitly, Charisma provides an overlap keyword, whereby the programmer can fire multiple overlapping control flows. These flows may contain different numbers of steps or statements, and their execution should be independent of one another so that their progress can interleave in arbitrary order and always return correct results.
{foodecl}
overlap
{
  foreach i in workers1
    (lb[i], rb[i]) <- workers1[i].produceBorders();
  end-foreach
  foreach i in workers1
    workers1[i].compute(lb[i+1], rb[i-1]);
  end-foreach
}
{
  foreach i in workers2
    (lb[i], rb[i]) <- workers2[i].compute(lb[i+1], rb[i-1]);
  end-foreach
}
end-overlap
This example shows an overlap statement where the two blocks in curly brackets are executed in parallel. Their execution joins back into one flow at the end-overlap mark.
Loop Statement
Loops are supported with the for statement and the while statement. Here are two examples.
{foodecl}
for iter = 0 to MAX_ITER
  workers.doWork();
end-for
{foodecl}
while (err > epsilon)
  (+err) <- workers.doWork();
  MainChare.updateError(err);
end-while
The loop condition in the for statement is independent of the main program; it simply tells the program to repeat the block that many times. The loop condition in the while statement is actually updated in the MainChare. In the above example, err and epsilon are both member variables of class MainChare, and can be updated as the example shows. The programmer can activate the ``autoScalar'' feature by including a ``define autoScalar;'' statement in the orchestration code. When autoScalar is enabled, Charisma will find all the scalars in the .or file and create a local copy of each in the MainChare. Then, every time a scalar is published by a statement, an update statement will automatically be inserted after that statement. The only thing that the programmer needs to do is to initialize the local scalar with a proper value.
The rules for connecting produced and consumed parameters across loops are natural. For the first iteration, the first consuming statement looks for values produced by the last producing statement before the loop. For the following iterations, it looks for values produced by the last producing statement within the loop body. At the last iteration, the last produced values are disseminated to the code segment following the loop body. Within the loop body, program order holds.
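The control flow of the while example can be pictured with a serial stand-in. The sketch below is plain C++ rather than generated Charisma code. The Worker type, its halving doWork step, and the function iterateUntilConverged are all hypothetical, chosen only so the loop terminates. It shows the reduced value replacing the main object's copy of err after every pass, and why that copy needs a sensible initial value.

```cpp
#include <cassert>
#include <vector>

// Hypothetical worker: each doWork call halves a local residual and
// returns it as this worker's contribution to the reduction.
struct Worker {
    double residual;
    double doWork() {
        residual *= 0.5;
        return residual;
    }
};

// Serial stand-in for:
//   while (err > epsilon)
//     (+err) <- workers.doWork();
//     MainChare.updateError(err);
//   end-while
// Returns the number of loop iterations that were executed.
int iterateUntilConverged(std::vector<Worker>& workers, double epsilon, double err) {
    assert(epsilon > 0.0);
    int iterations = 0;
    while (err > epsilon) {
        double sum = 0.0;  // the "+" reduction over all workers
        for (Worker& w : workers) sum += w.doWork();
        err = sum;         // the main object's copy is updated
        ++iterations;
    }
    return iterations;
}
```

If err starts at a value no greater than epsilon, the body never runs, which is why the initial value of the scalar matters.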
{foodecl}
for iter = 1 to MAX_ITER
  foreach i in workers
    (lb[i], rb[i]) <- workers[i].compute(lb[i+1], rb[i-1]);
  end-foreach
end-for
One special case is when a statement's produced and consumed parameters overlap. It must be noted that there is no dependency within the same foreach statement. In the above code segment, the values lb[i+1] and rb[i-1] consumed by workers[i] will not come from its neighbors in this iteration. The rule is that the consumed values always originate from previous foreach statements or from foreach statements in a previous loop iteration, and the published values are visible only to following foreach statements or to foreach statements in following loop iterations.
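One way to picture this rule is as a double-buffered update. The following is a minimal sketch in plain C++, not Charisma output, in which a single parameter v stands in for lb and rb. The function name foreachPass, the averaging formula, and the wrap-around at the array ends are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One pass of a foreach whose statement both consumes and publishes v:
//   (v[i]) <- workers[i].compute(v[i+1], v[i-1]);
// Every element reads only values published before this pass, so the
// results do not depend on the order in which elements run.
std::vector<double> foreachPass(const std::vector<double>& previous) {
    const std::size_t n = previous.size();
    assert(n > 0);
    std::vector<double> published(n);  // becomes visible only after the pass
    for (std::size_t i = 0; i < n; ++i) {
        const double right = previous[(i + 1) % n];      // wrap-around assumed
        const double left = previous[(i + n - 1) % n];   // wrap-around assumed
        published[i] = 0.5 * (left + right);
    }
    return published;
}
```

Had the loop written its results back into the array it reads from, element 1 would see the new value of element 0, which is exactly the within-foreach dependency that the rule excludes.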
Scatter and Gather Operation
A collection of values produced by one object may be split and consumed by multiple object array elements for a scatter operation. Conversely, a collection of values from different objects can be gathered to be consumed by one object.
{foodecl}
foreach i in A
  (points[i,*]) <- A[i].f(...);
end-foreach
foreach k,j in B
  (...) <- B[k,j].g(points[k,j]);
end-foreach
A wildcard dimension ``*'' in A[i].f()'s output points specifies that it will publish multiple data items. At the consuming side, each B[k,j] consumes only one point in the data, and therefore a scatter communication will be generated from A to B. For instance, A[1] will publish data points[1,0..N-1] to be consumed by multiple array objects B[1,0..N-1].
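The data movement in the scatter can be sketched serially. The code below is plain C++ for illustration, not what Charisma generates. The functions publishRow and consumedBy are hypothetical, and the published values (10*i + j) are made up so that each item is easy to trace.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for A[i].f(...): publishes n items,
// points[i,0..n-1]. The values are invented for traceability.
std::vector<double> publishRow(std::size_t i, std::size_t n) {
    std::vector<double> row(n);
    for (std::size_t j = 0; j < n; ++j)
        row[j] = 10.0 * static_cast<double>(i) + static_cast<double>(j);
    return row;
}

// Scatter: the single item consumed by B[k,j] is element j of the
// collection published by A[k].
double consumedBy(std::size_t k, std::size_t j, std::size_t n) {
    assert(j < n);
    return publishRow(k, n)[j];
}
```

Here consumedBy(1, 3, 4) yields the fourth item of the row published by A[1], matching the description of A[1] feeding B[1,0..N-1].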
{foodecl}
foreach i,j in A
  (points[i,j]) <- A[i,j].f(...);
end-foreach
foreach k in B
  (...) <- B[k].g(points[*,k]);
end-foreach
Similar to the scatter example, if a wildcard dimension ``*'' is in the consumed parameter and the corresponding published parameter does not have a wildcard dimension, a gather operation is generated from the publishing statement to the consuming statement. In the code segment above, each A[i,j] publishes a data point, and then the data points from A[0..N-1,j] are combined to form the data to be consumed by B[j].
Many communication patterns can be expressed with combinations of orchestration statements. For more details, please refer to PPL technical report 06-18, ``Charisma: Orchestrating Migratable Parallel Objects''.
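The gather can be sketched the same way. The function below is a hypothetical serial stand-in in plain C++, not generated code: it collects the items points[0..N-1,k] published by the elements A[0..N-1,k] into the one collection consumed by B[k].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gather for B[k].g(points[*,k]): collect points[i,k] from every A[i,k].
// points[i][j] holds the single item published by A[i,j].
std::vector<double> gatherColumn(const std::vector<std::vector<double>>& points,
                                 std::size_t k) {
    std::vector<double> column;
    column.reserve(points.size());
    for (const std::vector<double>& row : points) {
        assert(k < row.size());
        column.push_back(row[k]);
    }
    return column;
}
```

The result has one more dimension than a single published item, which is the extra dimension the consuming method has to accept in a gather.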
Last but not least, all the orchestration statements in the .or file together form the dependency graph. According to this dependency graph, the messages are created and the parallel program progresses. Therefore, the user is advised to put only parallel constructs that are driven by the data dependency into the orchestration code. Other elements such as local dependency should be coded in the sequential code.
{foodecl}
class MDMain : MainChare;
class Cell : ChareArray3D;
class CellPair : ChareArray6D;
The user is supposed to prepare the following sequential files for the classes: MDMain.h, MDMain.C, Cell.h, Cell.C, CellPair.h and CellPair.C, unless a class does not need sequential declaration and/or definition code. Please refer to the example in the Appendix.
For each class, a member function ``void initialize(void)'' can be defined, and the generated constructor will automatically call it. This saves the trouble of explicitly calling initialization code for each array object.
{foodecl}
produce(produced_parameter, local_variable[, size_of_array]);
When the parameter represents a data array, we need the additional size_of_array to specify the size of the data array.
The dimensionality of an orchestration parameter is divided into two parts: its dimension in the orchestration code, which is implied by the dimensionality of the object arrays the parameter is associated with, and the local dimensionality, which is declared in the declaration section. The orchestration dimension is not explicitly declared anywhere, but it is derived from the object arrays. For instance, in the 1D Jacobi worker example, ``lb'' and ``rb'' have the same orchestration dimensionality as workers, namely 1D of size [16]. The local dimensionality is used when the parameter is associated with local variables in sequential code. Since ``lb'' and ``rb'' are declared to have the local type and dimension of ``double [512]'', the producing statement should connect each of them with a local variable of type ``double [512]''.
{foodecl}
void JacobiWorker::produceBorders(outport lb, outport rb){
  . . .
  produce(lb,localLB,512);
  produce(rb,localRB,512);
}
Special cases of the produced/consumed parameters involve scatter/gather operations. In a scatter operation, since an additional dimension is implied in the produced parameter, the local_variable should have an additional dimension equal to the dimension over which the scatter is performed. Similarly, the input parameter in a gather operation will have an additional dimension of the same size as the dimension of the gather operation.
For reduction, one additional parameter of type char[] is added to specify the reduction operation. Built-in reduction operations are ``+'' (sum), ``*'' (product), ``<'' (minimum), and ``>'' (maximum) for basic data types. For instance, the following statement takes the sum of all local values of result and outputs it in sum.
{foodecl}
reduce(sum, result, "+");
If the data type is a user-defined class, then you might use the function or operator defined to do the reduction. For example, assume we have a class called ``Force'', and we have an ``add'' function (or a ``+'' operator) defined.
{foodecl}
Force& Force::add(const Force& f);
In the reduction to sum all the local forces, we can use
{foodecl}
reduce(sumForces, localForce, "add");
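To make the role of the ``add'' function concrete, here is a minimal serial sketch in plain C++. The three-component layout of Force and the function reduceWithAdd are assumptions for illustration. How Charisma actually applies the named function across objects and processors is not described here, so the left-to-right fold below only shows the combined result that such a reduction is expected to deliver.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout of the user-defined class; only the signature of add
// is taken from the text above.
struct Force {
    double x, y, z;
    // Accumulates f into this object and returns it.
    Force& add(const Force& f) {
        x += f.x;
        y += f.y;
        z += f.z;
        return *this;
    }
};

// Hypothetical serial stand-in for reduce(sumForces, localForce, "add"):
// folds every contribution into one Force using the user's add function.
Force reduceWithAdd(const std::vector<Force>& localForces) {
    assert(!localForces.empty());      // at least one contributor is needed
    Force sum = localForces.front();   // start from the first contribution
    for (std::size_t i = 1; i < localForces.size(); ++i)
        sum.add(localForces[i]);
    return sum;
}
```

Starting from the first contribution rather than from a zero value avoids assuming that the user-defined class has a neutral element.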
{foodecl}
1D: thisIndex
2D: thisIndex.{x,y}
3D: thisIndex.{x,y,z}
4D: thisIndex.{w,x,y,z}
5D: thisIndex.{v,w,x,y,z}
6D: thisIndex.{x1,y1,z1,x2,y2,z2}
November 23, 2009