Task Graph in Charm++

This document attempts to explain the taskGraph library for the Charm++ programming system. The library is at a first-release stage.

Problems Solved

Task Graph is designed to make it easier to solve "directed information flow" problems. As a trivial example, consider calculating the Fibonacci numbers. To calculate a number fib(n), it needs to know the values of fib(n-1) and fib(n-2).

The dependency representation for Fibonacci may look something like this:

X'th Fibonacci Depends On

0 none

1 none

2 0, 1

3 1, 2

4 2, 3

X'th Fibonacci	Depends On
0	none
1	none
2	0, 1
3	1, 2
4	2, 3

And so on until the number you desire. It is not designed to solve problems such as Jacobi relaxation; it is not an iterative framework. You could contort the Jacobi program to using it by saying iteration N depends on iteration N-1, but that would defeat much of the purpose of the framework.

Library Structure

There are three basic components to an application using the library:

Graph Generator
Data/Solver Class
taskGraph Library

The graph generator is the controller of the whole process. It will generate the base-case data/solver class instances, and pass them into the library with no dependencies. Then, the graph generator will generate new data/solver classes (and give them any static data they need to know; like their location in the computation, etc) and pass it to the library with a list of dependencies.

Once the library has received data/solver classes, it will gather up the other data that each needs to solve itself (what the graph generator passed to it). Once it has all the data gathered, it will fire off the data/solver class and tell it to solve itself.

Once the class has solved itself, the library will leave it waiting there for any other data/solver tasks which may depend on it. Your graph generator can tell the library that a data/solver task will not be needed anymore and to delete it. The library will do so. (NOTE! You can delete a solver that has other solvers waiting on it for data! If that happens your computation will hang. Only delete when you're sure nobody will depend on that data/solver again.)

If your graph generator needs the results of a data/solver computation back in order to generate new data/solvers (perhaps a deformation has occurred and shifted the area occupied by new data/solvers), then your data/solver class must send the data your graph generator needs to it. You may also have the library send the whole data/solver class instance back to your graph generator, but it is messier than sending just the data you need.

How-to

This section gives a short how-to write a taskGraph application, using the Fibonacci example. I am assuming you are already familiar with Charm++ programming, if you are not you'll need to be at least familiar enough with it to write a simple program, like Jacobi, before this how-to will be of use to you.

The following steps are the minimal steps needed for a taskGraph program;

A graph generator. This can be your main chare, or you could spawn any sort of other Charm++ construct to generate the graph.
A data/solver class, inherited off the taskGraphSolver class.

The generator has three basic phases; initialization, generation, and results collection. For instance, a generator may look like:

class main : public Chare {
  public:
    main(CkArgMsg *m) {
      CkArrayID firstTaskGraph = yourSolver::newTaskGraph();
      ...
      yourSolver base(firstTaskGraph, 0);
      base.startTask();
      yourSolver next(firstTaskGraph, 1);
      next.dependsOn(0);
      next.startTask();
    }
    void results(int value) {
      ckout << "A task completed and sent in the result " << value << endl;
    }
};

The taskGraphSolver class is an abstract class that allows the library to pass around your data/solver class with ease. taskGraphSolver contains the following methods;

taskGraphSolver(CkArrayID, int x)
taskGraphSolver(CkArrayID, int x, int y)
taskGraphSolver(CkArrayID, int x, int y, int z)
The constructor of a taskGraphSolver needs to know which running taskGraph it is related to, thus you must pass it the CkArrayID of that instance.

void pup(PUP::er &p) The pack and unpack routine of the class. You must override this routine and pack all the data your class contains. This allows the library to move your class across processors, so work can be distributed from wherever it is generated to where it should be run.

void solve(int count, taskGraphSolver *data[]) The solve function is an abstract function. You must override this function and make it run the computation you need run. Note that the data you get passed in is in no particular order.
If you need to send information back to your graph generator, this is the function to do it in.

void setup() The setup function is an abstract function. It is called instead of solve when a task has no dependences. You can use this function to setup information, and to send information back to your graph generator.

void dependsOn(int x)
void dependsOn(int x, int y)
void dependsOn(int x, int y, int z)
This trio of functions are defined in taskGraphSolver. They are used to indicate that the current task object depends on these other task objects defined by index x,y,z.

CkArrayID newTaskGraph() This function is defined in taskGraphSolver, it is used to create a new taskGraph. For each taskGraph you need to call this function, and will then need to pass the CkArrayID you get to all tasks that are part of this taskGraph.

void startTask() This function is defined in taskGraphSolver. It will take the inerhited object you have created and fire it off as a task. If this task had any dependences, you need to have called dependsOn() before calling this function.

void removeTask() This function is defined in taskGraphSolver. It will remove the task from the library's management. Be careful, if you remove a task that others depend on, your computation will hang!

For a simple example, check out a current CVS charm build, and look at the charm/pgms/charm++/taskGraph/fib/ example. Fib does what is described here with one addition; it keeps track (in the main chare) of how many tasks are running (that's what the 'int pending' is used for). When the count of tasks running hits 0, it knows that it's computation is done so it exits the program with CkExit().

taskGraphSolver(CkArrayID, int x) taskGraphSolver(CkArrayID, int x, int y) taskGraphSolver(CkArrayID, int x, int y, int z)	The constructor of a taskGraphSolver needs to know which running taskGraph it is related to, thus you must pass it the CkArrayID of that instance.
void pup(PUP::er &p)	The pack and unpack routine of the class. You must override this routine and pack all the data your class contains. This allows the library to move your class across processors, so work can be distributed from wherever it is generated to where it should be run.
void solve(int count, taskGraphSolver *data[])	The solve function is an abstract function. You must override this function and make it run the computation you need run. Note that the data you get passed in is in no particular order. If you need to send information back to your graph generator, this is the function to do it in.
void setup()	The setup function is an abstract function. It is called instead of solve when a task has no dependences. You can use this function to setup information, and to send information back to your graph generator.
void dependsOn(int x) void dependsOn(int x, int y) void dependsOn(int x, int y, int z)	This trio of functions are defined in taskGraphSolver. They are used to indicate that the current task object depends on these other task objects defined by index x,y,z.
CkArrayID newTaskGraph()	This function is defined in taskGraphSolver, it is used to create a new taskGraph. For each taskGraph you need to call this function, and will then need to pass the CkArrayID you get to all tasks that are part of this taskGraph.
void startTask()	This function is defined in taskGraphSolver. It will take the inerhited object you have created and fire it off as a task. If this task had any dependences, you need to have called dependsOn() before calling this function.
void removeTask()	This function is defined in taskGraphSolver. It will remove the task from the library's management. Be careful, if you remove a task that others depend on, your computation will hang!