Subsections

3.8 Advanced Arrays

The basic array features described above (creation, messaging, broadcasts, and reductions) are needed in almost every CHARM++ program. The more advanced techniques that follow are not universally needed; but are still often useful.

3.8.1 Declaring Multidimensional, or User-defined Index Arrays

CHARM++ contains direct support for multidimensional and even user-defined index arrays. These arrays can be declared as:

//In the .ci file:
message MyMsg;
array [1D] A1 { entry A1(); entry void e(parameters);}
array [2D] A2 { entry A2(); entry void e(parameters);}
array [3D] A3 { entry A3(); entry void e(parameters);}
array [4D] A4 { entry A4(); entry void e(parameters);}
array [5D] A5 { entry A5(); entry void e(parameters);}
array [6D] A6 { entry A6(); entry void e(parameters);}
array [Foo] AF { entry AF(); entry void e(parameters);}

The last declaration expects an array index of type CkArrayIndexFoo, which must be defined before including the .decl.h file (see ``User-defined array index type'' below).

//In the .h file:
class A1 : public CBase_A1 { public: A1(){} ...};
class A2 : public CBase_A2 { public: A2(){} ...};
class A3 : public CBase_A3 { public: A3(){} ...};
class A4 : public CBase_A4 { public: A4(){} ...};
class A5 : public CBase_A5 { public: A5(){} ...};
class A6 : public CBase_A6 { public: A6(){} ...};
class AF : public CBase_AF { public: AF(){} ...};

A 1D array element can access its index via its inherited ``thisIndex'' field; a 2D via ``thisIndex.x'' and ``thisIndex.y'', and a 3D via ``thisIndex.x'', ``thisIndex.y'', and ``thisIndex.z''. The subfields of 4D, 5D, and 6D are respectively {w,x,y,z}, {v,w,x,y,z}, and {x1,y1,z1,x2,y2,z2}. A user-defined index array can access its index as ``thisIndex''.

Likewise, you can create a dense multidimensional array by passing the extents at creation time to ckNew.

CProxy_A1 a1 = CProxy_A1::ckNew(parameters, num_elements);
CProxy_A2 a2 = CProxy_A2::ckNew(parameters, num_rows, num_colums);
CProxy_A3 a3 = CProxy_A3::ckNew(parameters, num_rows, num_columns, num_depth);

For 4D, 5D, 6D and user-defined arrays, this functionality cannot be used. You need to insert the array elements individually (Section 3.8.7).

3.8.2 Advanced Array Creation

There are several ways to control the array creation process. You can adjust the map and bindings before creation, change the way the initial array elements are created, create elements explicitly during the computation, and create elements implicitly, ``on demand''.

You can create all your elements using any one of these methods, or create different elements using different methods. An array element has the same syntax and semantics no matter how it was created.

3.8.3 Advanced Array Creation: CkArrayOptions

The array creation method ckNew actually takes a parameter of type CkArrayOptions. This object describes several optional attributes of the new array.

The most common form of CkArrayOptions is to set the number of initial array elements. A CkArrayOptions object will be constructed automatically in this special common case. Thus the following code segments all do exactly the same thing:

//Implicit CkArrayOptions
  a1=CProxy_A1::ckNew(parameters,nElements);

//Explicit CkArrayOptions
  a1=CProxy_A1::ckNew(parameters,CkArrayOptions(nElements));

//Separate CkArrayOptions
  CkArrayOptions opts(nElements);
  a1=CProxy_A1::ckNew(parameters,opts);

Note that the ``numElements'' in an array element is simply the numElements passed in when the array was created. The true number of array elements may grow or shrink during the course of the computation, so numElements can become out of date. This ``bulk'' constructor approach should be preferred where possible, especially for large arrays. Bulk construction is handled via a broadcast which will be significantly more efficient in the number of messages required than inserting each element individually which will require one message send per element.

3.8.4 Advanced Array Creation: Map Object

You can use CkArrayOptions to specify a ``map object'' for an array. The map object is used by the array manager to determine the ``home'' processor of each element. The home processor is the processor responsible for maintaining the location of the element.

There is a default map object, which maps 1D array indices in a round-robin fashion to processors, and maps other array indices based on a hash function.

A custom map object is implemented as a group which inherits from CkArrayMap and defines these virtual methods:

class CkArrayMap : public Group
{
public:
  //...
  
  //Return an ``arrayHdl'', given some information about the array
  virtual int registerArray(int numInitialElements,CkArrayID aid);
  //Return the home processor number for this element of this array
  virtual int procNum(int arrayHdl,const CkArrayIndex &element);
}

For example, a simple 1D blockmapping scheme. Actual mapping is handled in the procNum function.

class BlockMap : public CkArrayMap
{
 public:
  BlockMap(void) {}
  BlockMap(CkMigrateMessage *m){}
  int registerArray(CkArrayMapRegisterMessage *m) {
    delete m;
    return 0;
  }
  int procNum(int /*arrayHdl*/,const CkArrayIndex &idx) {
    int elem=*(int *)idx.data();
    int penum =  (elem/(32/CkNumPes()));
    return penum;
  }
};

Once you've instantiated a custom map object, you can use it to control the location of a new array's elements using the setMap method of the CkArrayOptions object described above. For example, if you've declared a map object named ``blockMap'':

//Create the map group
  CProxy_blockMap myMap=CProxy_blockMap::ckNew();
//Make a new array using that map
  CkArrayOptions opts(nElements);
  opts.setMap(myMap);
  a1=CProxy_A1::ckNew(parameters,opts);

3.8.5 Advanced Array Creation: Initial Elements

The map object described above can also be used to create the initial set of array elements in a distributed fashion. An array's initial elements are created by its map object, by making a call to populateInitial on each processor.

You can create your own set of elements by creating your own map object and overriding this virtual function of CkArrayMap:

  virtual void populateInitial(int arrayHdl,int numInitial,
 void *msg,CkArrMgr *mgr)

In this call, arrayHdl is the value returned by registerArray, numInitial is the number of elements passed to CkArrayOptions, msg is the constructor message to pass, and mgr is the array to create.

populateInitial creates new array elements using the method void CkArrMgr::insertInitial(CkArrayIndex idx,void *ctorMsg). For example, to create one row of 2D array elements on each processor, you would write:

void xyElementMap::populateInitial(int arrayHdl,int numInitial,
 void *msg,CkArrMgr *mgr)
{
  if (numInitial==0) return; //No initial elements requested
 
  //Create each local element
  int y=CkMyPe();
  for (int x=0;x<numInitial;x++) {
    mgr->insertInitial(CkArrayIndex2D(x,y),CkCopyMsg(&msg));
  }
  mgr->doneInserting();
  CkFreeMsg(msg);
}

Thus calling ckNew(10) on a 3-processor machine would result in 30 elements being created.

3.8.6 Advanced Array Creation: Bound Arrays

You can ``bind'' a new array to an existing array using the bindTo method of CkArrayOptions. Bound arrays act like separate arrays in all ways except for migration- corresponding elements of bound arrays always migrate together. For example, this code creates two arrays A and B which are bound together- A[i] and B[i] will always be on the same processor.

//Create the first array normally
  aProxy=CProxy_A::ckNew(parameters,nElements);
//Create the second array bound to the first
  CkArrayOptions opts(nElements);
  opts.bindTo(aProxy);
  bProxy=CProxy_B::ckNew(parameters,opts);

An arbitrary number of arrays can be bound together- in the example above, we could create yet another array C and bind it to A or B. The result would be the same in either case- A[i], B[i], and C[i] will always be on the same processor.

There is no relationship between the types of bound arrays- it is permissible to bind arrays of different types or of the same type. It is also permissible to have different numbers of elements in the arrays, although elements of A which have no corresponding element in B obey no special semantics. Any method may be used to create the elements of any bound array.

Bound arrays are often useful if A[i] and B[i] perform different aspects of the same computation, and thus will run most efficiently if they lie on the same processor. Bound array elements are guaranteed to always be able to interact using ckLocal (see section 3.8.12), although the local pointer must be refreshed after any migration. This should be done during the pup routine. When migrated, all elements that are bound together will be created at the new processor before pup is called on any of them, ensuring that a valid local pointer to any of the bound objects can be obtained during the pup routine of any of the others.

For example, an array Alibrary is implemented as a library module. It implements a certain functionality by operating on a data array dest which is just a pointer to some user provided data. A user defined array UserArray is created and bound to the array Alibrary to take advanatage of the functionality provided by the library. When bound array element migrated, the data pointer in UserArray is re-allocated in pup(), thus UserArray is responsible to refresh the pointer dest stored in Alibrary.

class Alibrary: public CProxy_Alibrary {
public:
  ...
  void set_ptr(double *ptr) { dest = ptr; }
  virtual void pup(PUP::er &p);
private:
  double *dest;           // point to user data in user defined bound array
};

class UserArray: public CProxy_UserArray {
public:
  virtual void pup(PUP::er &p) {
                p|len;
                if(p.isUnpacking()) {
                  data = new double[len];
                  Alibrary *myfellow = AlibraryProxy(thisIndex).ckLocal();
                  myfellow->set_ptr(data);    // refresh data in bound array
                }
                p(data, len);
  }
private:
  CProxy_Alibrary  AlibraryProxy;   // proxy to my bound array
  double *data;          // user allocated data pointer
  int len;
};

3.8.7 Advanced Array Creation: Dynamic Insertion

In addition to creating initial array elements using ckNew, you can also create array elements during the computation.

You insert elements into the array by indexing the proxy and calling insert. The insert call optionally takes parameters, which are passed to the constructor; and a processor number, where the element will be created. Array elements can be inserted in any order from any processor at any time. Array elements need not be contiguous.

If using insert to create all the elements of the array, you must call CProxy_Array::doneInserting before using the array.

//In the .C file:
int x,y,z;
CProxy_A1 a1=CProxy_A1::ckNew();  //Creates a new, empty 1D array
for (x=...) {
   a1[x  ].insert(parameters);  //Bracket syntax
   a1(x+1).insert(parameters);  // or equivalent parenthesis syntax
}
a1.doneInserting();

CProxy_A2 a2=CProxy_A2::ckNew();   //Creates 2D array
for (x=...) for (y=...)
   a2(x,y).insert(parameters);  //Can't use brackets!
a2.doneInserting();

CProxy_A3 a3=CProxy_A3::ckNew();   //Creates 3D array
for (x=...) for (y=...) for (z=...)
   a3(x,y,z).insert(parameters);
a3.doneInserting();

CProxy_AF aF=CProxy_AF::ckNew();   //Creates user-defined index array
for (...) {
   aF[CkArrayIndexFoo(...)].insert(parameters); //Use brackets...
   aF(CkArrayIndexFoo(...)).insert(parameters); //  ...or parenthesis
}
aF.doneInserting();

The doneInserting call starts the reduction manager (see ``Array Reductions'') and load balancer (see 3.11.1)- since these objects need to know about all the array's elements, they must be started after the initial elements are inserted. You may call doneInserting multiple times, but only the first call actually does anything. You may even insert or destroy elements after a call to doneInserting, with different semantics- see the reduction manager and load balancer sections for details.

If you do not specify one, the system will choose a procesor to create an array element on based on the current map object.

3.8.8 Advanced Array Creation: Demand Creation

Normally, invoking an entry method on a nonexistant array element is an error. But if you add the attribute [createhere] or [createhome] to an entry method, the array manager will ``demand create'' a new element to handle the message.

With [createhome], the new element will be created on the home processor, which is most efficient when messages for the element may arrive from anywhere in the machine. With [createhere], the new element is created on the sending processor, which is most efficient if when messages will often be sent from that same processor.

The new element is created by calling its default (taking no paramters) constructor, which must exist and be listed in the .ci file. A single array can have a mix of demand-creation and classic entry methods; and demand-created and normally created elements.

3.8.9 User-defined array index type

CHARM++ array indices are arbitrary collections of integers. To define a new array index, you create an ordinary C++ class which inherits from CkArrayIndex and sets the ``nInts'' member to the length, in integers, of the array index.

For example, if you have a structure or class named ``Foo'', you can use a Foo object as an array index by defining the class:

#include <charm++.h>
class CkArrayIndexFoo:public CkArrayIndex {
    Foo f;
public:
    CkArrayIndexFoo(const Foo &in)
    {
        f=in;
        nInts=sizeof(f)/sizeof(int);
    }
    //Not required, but convenient: cast-to-foo operators
    operator Foo &() {return f;}
    operator const Foo &() const {return f;}
};

Note that Foo's size must be an integral number of integers- you must pad it with zero bytes if this is not the case. Also, Foo must be a simple class- it cannot contain pointers, have virtual functions, or require a destructor. Finally, there is a CHARM++ configuration-time option called CK_ARRAYINDEX_MAXLEN which is the largest allowable number of integers in an array index. The default is 3; but you may override this to any value by passing ``-DCK_ARRAYINDEX_MAXLEN=n'' to the CHARM++ build script as well as all user code. Larger values will increase the size of each message.

You can then declare an array indexed by Foo objects with

//in the .ci file:
array [Foo] AF { entry AF(); ... }

//in the .h file:
class AF : public CBase_AF
{ public: AF() {} ... }

//in the .C file:
    Foo f;
    CProxy_AF a=CProxy_AF::ckNew();
    a[CkArrayIndexFoo(f)].insert();
    ...

Note that since our CkArrayIndexFoo constructor is not declared with the explicit keyword, we can equivalently write the last line as:

    a[f].insert();

When you implement your array element class, as shown above you can inherit from CBase_ClassName, a class templated by the index type Foo. In the old syntax, you could also inherit directly from ArrayElementT. The array index (an object of type Foo) is then accessible as ``thisIndex''. For example:


//in the .C file:
AF::AF()
{
    Foo myF=thisIndex;
    functionTakingFoo(myF);
}

3.8.10 Migratable Array Elements

Array objects can migrate from one PE to another. For example, the load balancer (see section 3.11.1) might migrate array elements to better balance the load between processors. For an array element to migrate, it must implement a pack/unpack or ``pup'' method:

//In the .h file:
class A2 : public CBase_A2 {
private: //My data members:
    int nt;
    unsigned char chr;
    float flt[7];
    int numDbl;
    double *dbl;
public:
    //...other declarations

    virtual void pup(PUP::er &p);
};

//In the .C file:
void A2::pup(PUP::er &p)
{
    CBase_A2::pup(p); //<- MUST call superclass's pup routine
    p|nt;
    p|chr;
    p(flt,7);
    p|numDbl;
    if (p.isUnpacking()) dbl=new double[numDbl];
    p(dbl,numDbl);
}

Please note that if your object contains Structured Dagger code (see section ``Structured Dagger'') you must use the following syntax to correctly pup the object:

class bar: public CBase_bar {
 private:
    int a,b;
 public:
    bar_SDAG_CODE
    ...other methods...

    virtual void pup(PUP::er& p) {
      __sdag_pup(p);
      ...pup other data here...
    }
};

See the section ``PUP'' for more details on pup routines and the PUP::er type.

The system uses one pup routine to do both packing and unpacking by passing different types of PUP::ers to it. You can determine what type of PUP::er has been passed to you with the isPacking(), isUnpacking(), and isSizing() calls.

An array element can migrate by calling the migrateMe(destination processor) member function- this call must be the last action in an element entry point. The system can also migrate array elements for load balancing (see the section 3.11.3).

To migrate your array element to another processor, the CHARM++ runtime will:

Migration constructors, then, are normally empty- all the unpacking and allocation of the data items is done in the element's pup routine. Deallocation is done in the element destructor as usual.

3.8.11 Load Balancing Chare Arrays

see section 3.11.1

3.8.12 Local Access

You can get direct access to a local array element using the proxy's ckLocal method, which returns an ordinary C++ pointer to the element if it exists on the local processor; and NULL if the element does not exist or is on another processor.

A1 *a=a1[i].ckLocal();
if (a==NULL) //...is remote- send message
else //...is local- directly use members and methods of a

Note that if the element migrates or is deleted, any pointers obtained with ckLocal are no longer valid. It is best, then, to either avoid ckLocal or else call ckLocal each time the element may have migrated; e.g., at the start of each entry method.

3.8.13 Array Section

CHARM++ supports array section which is a subset of array elements in a chare array. A special proxy for an array section can be created given a list of array indexes of elements. Multicast operations are directly supported in array section proxy with an unoptimized direct-sending implementation. Section reduction is not directly supported by the section proxy. However, an optimized section multicast/reduction library called ''CkMulticast'' is provided as a separate library module, which can be plugged in as a delegation of a section proxy for performing section-based multicasts and reductions.

For each chare array "A" declared in a ci file, a section proxy of type "CProxySection_A" is automatically generated in the decl and def header files. In order to create an array section, a user needs to provide array indexes of all the array section members. You can create an array section proxy in your application by invoking ckNew() function of the CProxySection. For example, for a 3D array:

  CkVec<CkArrayIndex3D> elems;    // add array indices
  for (int i=0; i<10; i++)
    for (int j=0; j<20; j+=2)
      for (int k=0; k<30; k+=2)
         elems.push_back(CkArrayIndex3D(i, j, k));
  CProxySection_Hello proxy = CProxySection_Hello::ckNew(helloArrayID, elems.getVec(), elems.size());

Alternatively, one can do the same thing by providing [lbound:ubound:stride] for each dimension:

  CProxySection_Hello proxy = CProxySection_Hello::ckNew(helloArrayID, 0, 9, 1, 0, 19, 2, 0, 29, 2);

The above codes create a section proxy that contains array elements of [0:9, 0:19:2, 0:29:2].

For user-defined array index other than CkArrayIndex1D to CkArrayIndex6D, one needs to use the generic array index type: CkArrayIndexMax.

  CkArrayIndexMax *elems;    // add array indices
  int numElems;
  CProxySection_Hello proxy = CProxySection_Hello::ckNew(helloArrayID, elems, numElems);

Once you have the array section proxy, you can do multicast to all the section members, or send messages to one member using its index that is local to the section, like these:

  CProxySection_Hello proxy;
  proxy.someEntry(...)          // multicast
  proxy[0].someEntry(...)       // send to the first element in the section.

You can move the section proxy in a message to another processor, and still safely invoke the entry functions to the section proxy.

In the multicast example above, for a section with k members, total number of k messages will be sent to all the memebers, which is considered inefficient when several members are on a same processor, in which case only one message needs to be sent to that processor and delivered to all section members on that processor locally. To support this optimization, a separate library called CkMulticast is provided. This library also supports section based reduction.

Note: Use of the bulk array constructor (dimensions given in the CkNew or CkArrayOptions rather than individual insertion) will allow construction to race ahead of several other startup procedures, this creates some limitation on the construction delegation and use of array section proxies. For safety, array sections should be created in a post constructor entry method.

To use the library, you need to compile and install CkMulticast library and link your applications against the library using -module:

  # compile and install the CkMulticast library, do this only once
  cd charm/net-linux/tmp
  make multicast

  # link CkMulticast library using -module when compiling application
  charmc  -o hello hello.o -module CkMulticast -language charm++

CkMulticast library is implemented using delegation(Sec. 3.20). A special ''CkMulticastMgr'' Chare Group is created as a deletegation for section multicast/reduction - all the messages sent by the section proxy will be passed to the local delegation branch.

To use the CkMulticast delegation, one needs to create the CkMulticastMgr Group first, and then setup the delegation relationship between the section proxy and CkMulticastMgr Group. One only needs to create one CkMulticastMgr Group globally. CkMulticastMgr group can serve all multicast/reduction delegations for different array sections in an application:

  CProxySection_Hello sectProxy = CProxySection_Hello::ckNew(...);
  CkGroupID mCastGrpId = CProxy_CkMulticastMgr::ckNew();
  CkMulticastMgr *mCastGrp = CProxy_CkMulticastMgr(mCastGrpId).ckLocalBranch();

  sectProxy.ckSectionDelegate(mCastGrp);  // initialize section proxy

  sectProxy.someEntry(...)           //multicast via delegation library as before

By default, CkMulticastMgr group builds a spanning tree for multicast/reduction with a factor of 2 (binary tree). One can specify a different factor when creating a CkMulticastMgr group. For example,

  CkGroupID mCastGrpId = CProxy_CkMulticastMgr::ckNew(3);   // factor is 3

Note, to use CkMulticast library, all multicast messages must inherit from CkMcastBaseMsg, as the following. Note that CkMcastBaseMsg must come first, this is IMPORTANT for CkMulticast library to retrieve section information out of the message.

class HiMsg : public CkMcastBaseMsg, public CMessage_HiMsg
{
public:
  int *data;
};

Due to this restriction, you need to define message explicitly for multicast entry functions and no parameter marshalling can be used for multicast with CkMulticast library.

3.8.13.1 Array Section Reduction

Since an array element can be members for multiple array sections, there has to be a way for each array element to tell for which array section it wants to contribute. For this purpose, a data structure called ''CkSectionInfo'' is created by CkMulticastMgr for each array section that the array element belongs to. When doing section reduction, the array element needs to pass the CkSectionInfo as a parameter in the contribute(). The CkSectionInfo can be retrieved from a message in a multicast entry function using function call CkGetSectionInfo:

  CkSectionInfo cookie;

  void SayHi(HiMsg *msg)
  {
    CkGetSectionInfo(cookie, msg);     // update section cookie every time
    int data = thisIndex;
    mcastGrp->contribute(sizeof(int), &data, CkReduction::sum_int, cookie);
  }

Note that the cookie cannot be used as a one-time local variable in the function, the same cookie is needed for the next contribute. This is because cookie includes some context sensive information for example the reduction counter. Function CkGetSectionInfo() only update some part of the data in cookie, not creating a brand new one.

Similar to array reduction, to use section based reduction, a reduction client CkCallback object need to be created. You may pass the client callback as an additional parameter to contribute. If different contribute calls pass different callbacks, some (unspecified, unreliable) callback will be chosen for use. See the followin example:

    CkCallback cb(CkIndex_myArrayType::myReductionEntry(NULL),thisProxy);
    mcastGrp->contribute(sizeof(int), &data, CkReduction::sum_int, cookie, cb);

If no member passes a callback to contribute, the reduction will use the default callback. You set the default callback for an array section using the setReductionClient call by the section root member. A CkReductionMsg message will be passed to this callback, which must delete the message when done.

  CProxySection_Hello sectProxy;
  CkMulticastMgr *mcastGrp = CProxy_CkMulticastMgr(mCastGrpId).ckLocalBranch();
  mcastGrp->setReductionClient(sectProxy, new CkCallback(...));

Same as in array reduction, users can use built-in reduction types(Section 3.14.2) or define his/her own reducer functions (Section 3.14.3).

3.8.13.2 Array section multicast/reduction when migration happens

Using multicast/reduction, you don't need to worry about array migrations. When migration happens, array element in the array section can still use the CkSectionInfo it stored previously for doing reduction. Reduction messages will be correctly delivered but may not be as efficient until a new multicast spanning tree is rebuilt internally in CkMulticastMgr library. When a new spanning tree is rebuilt, a updated CkSectionInfo is passed along with a multicast message, so it is recommended that CkGetSectionInfo() function is always called when a multicast message arrives (as shown in the above SayHi example).

In case when a multicast root migrates, one needs to reconstruct the spanning tree to get optimal performance. One will get the following warning message if not doing so: "Warning: Multicast not optimized after multicast root migrated." In current implementation, user needs to initiate the rebuilding process like:

void Foo::pup(PUP::er & p)
    // if I am multicast root and it is unpacking
   if (ismcastroot && p.isUnpacking())
      CProxySection_Foo   fooProxy;    // proxy for the section
      CkMulticastMgr *mg = CProxy_CkMulticastMgr(mCastGrpId).ckLocalBranch();
      mg->resetSection(fooProxy);
        // you may want to reset reduction client to root
      CkCallback *cb = new CkCallback(...);
      mg->setReductionClient(mcp, cb);
   

August 04, 2008
Charm Homepage