Towards a Framework for Abstracting Accelerators in Parallel Applications: Experience with Cell
International Conference for High Performance Computing, Networking, Storage and Analysis (SC) 2009
Publication Type: Paper
Repository URL: 200911_AccelSC09
While accelerators have become more prevalent in recent years, they are still considered hard to program. In this work, we extend a framework for parallel programming so that programmers can easily take advantage of the Cell processor's Synergistic Processing Elements (SPEs) as seamlessly as possible. Using this framework, the same application code can be compiled and executed on multiple platforms, including x86-based and Cell-based clusters. Furthermore, our model allows independently developed libraries to efficiently time-share one or more SPEs by interleaving work from multiple libraries. To demonstrate the framework, we present performance data for an example molecular dynamics (MD) application. When compared to a single Xeon core utilizing streaming SIMD extensions (SSE), the MD program achieves a speedup of 5.74 on a single Cell chip (with 8 SPEs). In comparison, a similar speedup of 5.89 is achieved using six Xeon (x86) cores.
Research Areas