Towards a Framework for Abstracting Accelerators in Parallel Applications: Experience with Cell
International Conference for High Performance Computing, Networking, Storage and Analysis (SC) 2009
Publication Type: Paper
Repository URL: 200911_AccelSC09
Abstract
While accelerators have become more prevalent in recent years, they
are still considered hard to program. In this work, we extend a
framework for parallel programming so that programmers can take
advantage of the Cell processor's Synergistic Processing Elements
(SPEs) as seamlessly as possible. Using this framework,
the same application code can be compiled and executed on multiple
platforms, including x86-based and Cell-based clusters.
Furthermore, our model allows independently developed libraries to
time-share one or more SPEs efficiently by interleaving their
work. To demonstrate the framework, we present
performance data for an example molecular dynamics (MD)
application. Compared to a single Xeon core using Streaming SIMD
Extensions (SSE), the MD program achieves a speedup of 5.74 on a
single Cell chip (with 8 SPEs). For comparison, a similar speedup
of 5.89 is achieved using six Xeon (x86) cores.