Improving the memory access locality of hybrid MPI applications
European MPI Users' Group Meeting (EuroMPI) 2017
Publication Type: Paper
Repository URL: https://charm.cs.illinois.edu/gerrit/#/admin/projects/papers/eurompi2017
Abstract
Maintaining memory access locality continues to be a challenge
for parallel applications and their runtime environments. By
exploiting locality, application performance, resource usage, and
performance portability can be improved. The main challenge is
to detect and fix memory locality issues for applications that use
shared-memory programming models for intra-node parallelization.
In this paper, we investigate improving memory access locality of
a hybrid MPI+OpenMP application in two different ways: by manually
fixing locality issues in its source code and by employing the
Adaptive MPI (AMPI) runtime environment. Results show that
AMPI can achieve locality improvements similar to those of manual source
code changes, leading to substantial performance and scalability
gains compared to the unoptimized version and to a pure MPI
runtime. Compared to the hybrid MPI+OpenMP baseline, our
optimizations improved performance by 1.8x on a single cluster node,
and by 1.4x on 32 nodes, with a speedup of 2.4x compared to a pure
MPI execution on 32 nodes. In addition to performance, we also
evaluate the impact of memory locality on the load balance within
a node.
People
- Matthias Diener
- Sam White
- Laxmikant Kale
- Michael Campbell
- Dan Bodony
- Jon Freund
Research Areas