Architecture for supporting Hardware Collectives in Output-Queued High-Radix Routers
IEEE International Conference on High Performance Computing (HiPC) 2005
Publication Type: Paper
Abstract
Collective communication performance is critical for many
applications. In this paper, we present an architecture to
efficiently support collective operations (like multicasts and
reductions) in the switches of parallel computer interconnects. We
present an output-queued switch architecture with cross-point
buffering. Output-queued architectures have been less popular in
the past because they require more internal speedup and buffering.
However, with current technology it is straightforward to build
output-queued switches. We demonstrate in this paper that
output-queued architectures make multicasts and reductions fairly
easy to implement efficiently. We show the scalability of our
schemes to a large number of switch ports. We present the performance
of multicasts and reductions on individual switches and on networks of
switches, assuming a fat-tree topology for the networks. We also
present simulation results based on synthetic
workloads that emulate a molecular dynamics application.
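As a rough illustration of why cross-point buffering suits collectives, the toy model below keeps one queue per (input, output) pair. It is only a sketch under stated assumptions and not the design evaluated in the paper: the class name CrosspointSwitch, its methods, and the draining order are hypothetical, and timing, flow control, and buffer capacity are ignored.

```python
from collections import deque


class CrosspointSwitch:
    """Toy output-queued switch: one buffer per (input, output) crosspoint.

    Hypothetical model for illustration only; not the paper's architecture.
    """

    def __init__(self, num_ports):
        self.num_ports = num_ports
        # buffers[i][o] holds payloads from input i waiting for output o.
        self.buffers = [
            [deque() for _ in range(num_ports)] for _ in range(num_ports)
        ]

    def multicast(self, in_port, out_ports, payload):
        # The input writes the same payload into the crosspoint buffer of
        # every destination output; the outputs never contend for a copy.
        for out_port in out_ports:
            self.buffers[in_port][out_port].append(payload)

    def drain(self, out_port):
        # The output empties its column of crosspoint buffers, visiting
        # inputs in index order (an arbitrary choice for this sketch).
        delivered = []
        for in_port in range(self.num_ports):
            queue = self.buffers[in_port][out_port]
            while queue:
                delivered.append(queue.popleft())
        return delivered

    def reduce(self, out_port, in_ports, op):
        # The output combines one contribution from each participating
        # input. Assumes every contribution has already arrived; an empty
        # buffer raises IndexError.
        result = None
        for in_port in in_ports:
            value = self.buffers[in_port][out_port].popleft()
            result = value if result is None else op(result, value)
        return result


if __name__ == "__main__":
    switch = CrosspointSwitch(4)
    switch.multicast(0, [1, 2, 3], "hello")
    print(switch.drain(2))
    for port in range(4):
        switch.multicast(port, [0], port + 1)
    print(switch.reduce(0, range(4), lambda a, b: a + b))
```

In this model a multicast is a set of independent writes and a reduction is a fold over one output's column of buffers, which is the intuition behind the claim that output queuing makes both operations easy to implement.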
TextRef
Sameer Kumar, Laxmikant V. Kale, and Craig Stunkel, "Architecture for
supporting Hardware Collectives in Output-Queued High-Radix Routers",
Parallel Programming Laboratory, Department of Computer Science,
University of Illinois at Urbana-Champaign, March 2005.
People
- Sameer Kumar
- Laxmikant Kale
- Craig Stunkel