Quantifying Network Contention on Large Parallel Machines
Authors:
Abhinav Bhatele and Laxmikant V. Kale
Parallel Programming Laboratory, Department of Computer Science, University
of Illinois at Urbana-Champaign
Submitted to Parallel Processing Letters (Special Issue on Large-Scale Parallel Processing), 2009.
The impact of network topology on application performance has become important
again with the emergence of very large supercomputers, typically connected as a
3D torus or mesh. This article presents a quantitative study of the effect of
contention on message latencies in torus and mesh networks. Several MPI
benchmarks are used to evaluate how the number of hops (links) traversed by
messages affects their latencies. The benchmarks demonstrate that when multiple
messages compete for network resources, link occupancy or contention can
increase message latencies by up to a factor of 8 on some architectures.
Results are shown for two parallel machines -- ANL's IBM Blue Gene/P (Surveyor)
and PSC's Cray XT3 (BigBen).
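
To make the notion of hops concrete, the following is a minimal sketch of how the
number of links between two nodes could be counted on a 3D mesh and on a 3D torus,
assuming messages follow a shortest path along each dimension. The function names
(torus_hops, mesh_hops) and the 8 x 8 x 16 dimensions are illustrative assumptions
and are not taken from the article or from either machine's configuration.

```python
def mesh_hops(src, dst):
    """Hops between two nodes on a mesh (no wraparound links).

    Assumes shortest-path routing, so the count is the Manhattan distance.
    """
    return sum(abs(s - d) for s, d in zip(src, dst))


def torus_hops(src, dst, dims):
    """Hops between two nodes on a torus (wraparound links in every dimension).

    Assumes shortest-path routing: in each dimension the message takes
    whichever direction around the ring is shorter.
    """
    hops = 0
    for s, d, size in zip(src, dst, dims):
        delta = abs(s - d)
        hops += min(delta, size - delta)
    return hops


if __name__ == "__main__":
    dims = (8, 8, 16)            # hypothetical partition shape
    src, dst = (0, 0, 0), (7, 4, 15)
    print(mesh_hops(src, dst))         # 7 + 4 + 15 = 26
    print(torus_hops(src, dst, dims))  # 1 + 4 + 1 = 6
```

The same pair of nodes is 26 hops apart on the mesh but only 6 on the torus, which
is why the wraparound links matter when relating hop counts to latency.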
Significant theoretical research on interconnect topologies and topology-aware
mapping for parallel computers was done in the 1980s. With the deployment of
virtual cut-through, wormhole routing, and faster interconnects, message
latencies decreased and research in the area died down. Findings in this article
suggest that application developers should now consider interconnect topologies
when mapping tasks to processors in order to obtain the best performance on
large parallel machines.