Parallel Programming Laboratory

Optimizing Point-to-Point Communication between Adaptive MPI Endpoints in Shared Memory

Sam White | Laxmikant Kale

Workshop on Exascale MPI (ExaMPI) 2017

Publication Type: Paper

Abstract

Adaptive MPI is an implementation of the MPI standard that supports the virtualization of ranks as user-level threads, rather than OS processes. In this work, we optimize the communication performance of AMPI based on the locality of the endpoints communicating within a cluster of SMP nodes. We differentiate between point-to-point messages with both endpoints co-located on the same execution unit and point-to-point messages with both endpoints residing in the same process but not on the same execution unit. We demonstrate how the messaging semantics of Charm++ enable and hinder AMPI’s implementation in different ways, and motivate extensions to Charm++ to address the limitations. Using the OSU microbenchmark suite, we show that our locality-aware design offers lower latency, higher bandwidth, and reduced memory footprint for applications.
