ACM SRC: Runtime Support for Concurrent Execution of Overdecomposed Heterogeneous Tasks
International Conference for High Performance Computing, Networking, Storage and Analysis (SC) 2017
Publication Type: Poster
Repository URL:
With the rise of heterogeneous systems in high performance computing, how we utilize accelerators has become a critical factor in achieving optimal performance. We explore several issues that arise when using accelerators in Charm++, a parallel programming model that employs overdecomposition. We propose a runtime support scheme that enables concurrent execution of heterogeneous tasks and evaluate its performance. Using a synthetic benchmark that busy-waits to simulate workload, we observe that the effectiveness of the runtime support varies with application characteristics, with a maximum speedup of 4.79x. With a two-dimensional five-point stencil benchmark designed to represent a realistic workload, we obtain a speedup of up to 2.75x.
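For readers unfamiliar with the stencil workload, the following is a minimal sketch of one two-dimensional five-point stencil update. It is not the poster's benchmark code: the function name `jacobi_step`, the averaging update rule, and the fixed-boundary handling are assumptions made for illustration, and the sketch runs serially on a single grid rather than on overdecomposed blocks.

```python
def jacobi_step(grid):
    """Return a new grid after one five-point stencil update.

    Assumption: each interior cell becomes the average of itself and its
    four axis-aligned neighbors; boundary cells are held fixed.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]  # copy so reads use the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new_grid[i][j] = (
                grid[i][j]
                + grid[i - 1][j]  # north
                + grid[i + 1][j]  # south
                + grid[i][j - 1]  # west
                + grid[i][j + 1]  # east
            ) / 5.0
    return new_grid


if __name__ == "__main__":
    # Hot boundary (1.0) around a single cold interior cell (0.0).
    grid = [
        [1.0, 1.0, 1.0],
        [1.0, 0.0, 1.0],
        [1.0, 1.0, 1.0],
    ]
    print(jacobi_step(grid)[1][1])
```

In an overdecomposed setting, the grid would be split into many more blocks than there are processing elements, with each block exchanging its boundary rows and columns with neighbors before applying an update like this one. That decomposition is omitted here.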
Research Areas