Charm++
We believe that the Charm++ programming model is a good fit for the Cell
processor for several reasons, including data encapsulation, virtualization,
and the ability to peek ahead in the message queue.
Notes for building Charm++ on the Cell architecture can be found here.
Offload API
We have developed an interface called the Offload API, which the Charm++ runtime system will use
to offload entry method execution onto the SPEs. The Offload API is independent of Charm++. That is,
one can write an application using the Offload API directly, without using Charm++. However, the design of the
Offload API has been specifically geared towards the needs of the Charm++ runtime system.
In the Offload API model, the computation-heavy portions of the application are broken down into chunks called work requests. Each work request can have multiple input and output buffers. On each SPE, a small SPE Runtime executes continuously. When the application code on the PPE creates a work request via the Offload API, the Offload API decides which SPE should execute the work request and then passes the work request to that SPE. The SPE Runtime then takes care of moving the input data, allocating memory in the local store, executing the work request, and eventually moving the results of the work request back into system memory. The life of a work request is depicted in Figure 1.
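To make the work request model described above more concrete, here is a minimal sketch in C++. Every name in it (`Buffer`, `WorkRequest`, `Dispatcher`, `issue`, `complete`) is hypothetical and chosen for illustration; none is taken from the actual Offload API. The text does not say how the Offload API chooses an SPE, so the sketch assumes a simple least-loaded policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; NOT the Offload API's real names.
struct Buffer {
    void*       data;   // address of the buffer in system memory
    std::size_t bytes;  // size of the buffer in bytes
};

struct WorkRequest {
    int                 funcIndex;  // identifies the function the SPE should run
    std::vector<Buffer> inputs;     // moved into the local store before execution
    std::vector<Buffer> outputs;    // moved back to system memory afterwards
};

// Chooses an SPE for each new work request.
// Assumption: least-loaded selection; the real policy is not described in the text.
class Dispatcher {
public:
    explicit Dispatcher(int numSPEs) : outstanding_(numSPEs, 0) {
        assert(numSPEs > 0);
    }

    // Returns the index of the SPE chosen to execute the request.
    int issue(const WorkRequest& /*wr*/) {
        int best = 0;
        for (int i = 1; i < static_cast<int>(outstanding_.size()); ++i) {
            if (outstanding_[i] < outstanding_[best]) best = i;
        }
        ++outstanding_[best];
        return best;
    }

    // Called once the SPE reports that one of its work requests has finished.
    void complete(int spe) { --outstanding_[spe]; }

    // Number of work requests currently outstanding on the given SPE.
    int load(int spe) const { return outstanding_[spe]; }

private:
    std::vector<int> outstanding_;  // outstanding work requests per SPE
};
```

In this sketch the application only fills in a `WorkRequest` and calls `issue`; data movement and local store allocation are left to the SPE side, which matches the division of labor described above.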
[1] : The application code on the PPE issues a work request to the Offload API. The Offload API decides which
SPE should execute the work request and sends the work request to that SPE.
[2] : The SPE Runtime notices that it has a new work request and issues a DMA-Get to bring the input data from system memory into the SPE's local store.
[3] : The DMA controller for the SPE moves the data. During this time, the SPE is free to do other work, including executing another work request.
[4] : Once the input data for the work request has arrived, the SPE is free to execute the work request. Once the work request has been executed, the SPE Runtime issues a DMA-Put to place the results into system memory.
[5] : The DMA controller for the SPE moves the data. During this time, the SPE is free to do other work, including executing another work request.
[6] : Once the DMA-Put has finished moving the data, the SPE Runtime notifies the PPE that the work request has been completed.
While the Offload API is independent of Charm++, it is distributed as part of the Charm++ distribution. Currently, only the nightly build of Charm++ includes the Offload API. For more information on Charm++ on Cell and the Offload API, please refer to the papers and posters listed below.
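The six steps in the life of a work request can also be read as a small state machine. The following is a toy simulation of that idea, not the SPE Runtime's actual code: the state names, the `SpeRuntimeSim` class, and the fixed DMA latency measured in loop passes are all assumptions made for illustration. It shows how one work request can execute while another request's DMA transfer is still in flight.

```cpp
#include <cassert>
#include <vector>

// Hypothetical states mirroring steps [1]-[6]; not the SPE Runtime's real names.
enum class State { Issued, GetInFlight, InputReady, PutInFlight, Completed };

struct Slot {
    State state = State::Issued;
    int   dmaTicksLeft = 0;  // simulated passes until the current DMA finishes
};

class SpeRuntimeSim {
public:
    // dmaLatency: assumed number of loop passes a DMA transfer takes.
    explicit SpeRuntimeSim(int dmaLatency) : dmaLatency_(dmaLatency) {
        assert(dmaLatency > 0);
    }

    // [1] The PPE side hands a new work request to this SPE.
    void accept() { slots_.push_back(Slot{}); }

    // One pass of the runtime loop; returns how many requests completed in it.
    int step() {
        int  completed = 0;
        bool executed  = false;  // the SPE core runs at most one request per pass
        for (Slot& s : slots_) {
            switch (s.state) {
            case State::Issued:       // [2] issue the DMA-Get for the input data
                s.state = State::GetInFlight;
                s.dmaTicksLeft = dmaLatency_;
                break;
            case State::GetInFlight:  // [3] DMA controller moves the input data
                if (--s.dmaTicksLeft == 0) s.state = State::InputReady;
                break;
            case State::InputReady:   // [4] execute, then issue the DMA-Put
                if (!executed) {
                    executed = true;
                    s.state = State::PutInFlight;
                    s.dmaTicksLeft = dmaLatency_;
                }
                break;
            case State::PutInFlight:  // [5] DMA moves results; [6] notify the PPE
                if (--s.dmaTicksLeft == 0) {
                    s.state = State::Completed;
                    ++completed;
                }
                break;
            case State::Completed:
                break;
            }
        }
        return completed;
    }

    State stateOf(int i) const { return slots_[i].state; }

private:
    int               dmaLatency_;
    std::vector<Slot> slots_;
};
```

With a latency of two passes and two requests accepted up front, the second request executes during the pass in which the first request's DMA-Put is still in flight, which is the overlap of data movement and computation that steps [3] and [5] describe.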
People
Papers
Posters