ACM SRC: Optimizing All-to-All Algorithm for PERCS Network Using Simulation
International Conference for High Performance Computing, Networking, Storage and Analysis (SC) 2011
Publication Type: Paper
Communication algorithms play a crucial role in the performance of large-scale parallel systems. They are implemented in runtime systems and used in most parallel applications as a critical component. As vendors are willing to design new custom networks with significantly different performance properties for their new supercomputers, designing new efficient communication algorithms is an inevitable challenge. This task is desirable to be done before the machine comes online since inefficient use of the system before the new algorithm’s availability is a huge waste of a possibly hundreds of millions of dollars resource. Here, we demonstrate the usability of our simulation framework, BigSim, in meeting this challenge. Using BigSim, we observe that the commonly used Pairwise-Exchange algorithm for all-to-all communication pattern is suboptimal for a supernode of the PERCS network (two-level directly connected similar to Dragonfly topology). We designed a new all-to-all algorithm for it and predict a five-fold performance improvement for large message sizes using this algorithm.