Improving Parallel System Performance with a NUMA-aware Load Balancer
Illinois Research and Technical Reports - Computer Science (CS Res. & Tech. Report) 2011
Publication Type: Paper
Abstract
Multi-core nodes with Non-Uniform Memory Access
(NUMA) are now a common architecture for high performance
computing. On such NUMA nodes, the shared memory
is physically distributed into memory banks connected by a
network. Owing to this, memory access costs may vary depending
on the distance between the processing unit and the memory
bank. Therefore, a key element in improving the performance
on these machines is dealing with memory affinity. We propose a
NUMA-aware load balancer that combines the information about
the NUMA topology with the statistics captured by the Charm++
runtime system. We present speedups of up to 1.8 for synthetic
benchmarks running on different NUMA platforms. We also
show improvements over existing load balancing strategies both
in benchmark performance and in the time for load balancing.
In addition, by avoiding unnecessary migrations, our algorithm
incurs migration overheads up to seven times smaller than
those of the other strategies.
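To make the idea in the abstract concrete, the following is a minimal illustrative sketch of how a balancer might combine per-task load statistics with a NUMA distance table. It is not the algorithm from the report and does not use the Charm++ API. The function name, the parameters (task_load, current_core, comm, distance, alpha), and the cost formula are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def numa_aware_greedy(
    task_load: Dict[str, float],
    current_core: Dict[str, int],
    comm: Dict[Tuple[str, str], float],
    distance: List[List[float]],
    alpha: float = 1.0,
) -> Dict[str, int]:
    """Hypothetical greedy placement (not the report's algorithm).

    Assumed cost model:
        cost(task, core) = load already assigned to core
                         + alpha * sum(volume * distance[core][partner_core])
    where distance[i][j] is an assumed relative access cost between cores.
    """
    n_cores = len(distance)
    core_load = [0.0] * n_cores
    placement: Dict[str, int] = {}

    # Visit the heaviest tasks first, as a classic greedy balancer would.
    for task in sorted(task_load, key=task_load.get, reverse=True):
        best_core, best_cost = current_core[task], float("inf")
        for core in range(n_cores):
            comm_cost = 0.0
            for (a, b), volume in comm.items():
                if a == b or task not in (a, b):
                    continue
                other = b if a == task else a
                # Use the partner's new core if already decided,
                # otherwise its current location.
                other_core = placement.get(other, current_core[other])
                comm_cost += volume * distance[core][other_core]
            cost = core_load[core] + alpha * comm_cost
            # On a tie, prefer the current core to avoid a needless migration.
            stays = core == current_core[task]
            if cost < best_cost or (cost == best_cost and stays):
                best_core, best_cost = core, cost
        placement[task] = best_core
        core_load[best_core] += task_load[task]
    return placement
```

In this sketch, alpha controls how strongly the NUMA distance term weighs against load balance. Setting it to 0 reduces the procedure to a plain greedy balancer that ignores topology.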
TextRef
Laercio L. Pilla, Christiane Pousa Ribeiro, Daniel Cordeiro, Abhinav Bhatele, Philippe O. A. Navaux, Jean-Francois Mehaut, Laxmikant V. Kale, Improving Parallel System Performance with a NUMA-aware Load Balancer, Computer Science Research and Tech Reports, http://hdl.handle.net/2142/25911, August 2011
People
- Laercio Pilla
- Christiane Ribeiro
- Daniel Cordeiro
- Abhinav Bhatele
- Philippe Navaux
- Jean-Francois Mehaut
- Laxmikant Kale
Research Areas