Welcome! We are the Parallel Programming Laboratory.

Our goal is to develop technology that improves the performance of parallel applications while also improving programmer productivity. We aim to reach a point where, with our freely distributed software base, complex irregular and dynamic applications can (a) be developed quickly and (b) scale to machines with thousands of processors.

Processor virtualization is one of our core techniques: the programmer divides the computation into a large number of entities, which are mapped to the available processors by an intelligent runtime system. This separation of concerns between the programmer and the system is key to attaining both of our goals together.
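To make the idea concrete, here is a minimal sketch in Python of how a runtime might map many entities onto a few processors. It is only an illustration under stated assumptions: the function names `map_entities` and `proc_totals`, the synthetic load values, and the greedy heaviest-first heuristic are all hypothetical choices made for this example, not a description of our runtime system or its load balancers.

```python
import heapq


def map_entities(loads, num_procs):
    """Assign each entity to a processor (hypothetical greedy heuristic).

    Entities are visited from heaviest to lightest, and each one is placed
    on whichever processor currently has the smallest total load.
    """
    # Min-heap of (accumulated load, processor id).
    heap = [(0.0, p) for p in range(num_procs)]
    heapq.heapify(heap)

    assignment = {}
    for entity in sorted(range(len(loads)), key=lambda i: -loads[i]):
        total, proc = heapq.heappop(heap)
        assignment[entity] = proc
        heapq.heappush(heap, (total + loads[entity], proc))
    return assignment


def proc_totals(loads, assignment, num_procs):
    """Sum the load that ended up on each processor."""
    totals = [0.0] * num_procs
    for entity, proc in assignment.items():
        totals[proc] += loads[entity]
    return totals


if __name__ == "__main__":
    # 64 entities with uneven (synthetic) loads, mapped onto 4 processors.
    loads = [1 + (i % 7) for i in range(64)]
    assignment = map_entities(loads, 4)
    print(proc_totals(loads, assignment, 4))
```

The programmer-facing part is only the list of entities and their work; the placement decision lives entirely in `map_entities`. Because there are many more entities than processors, the mapping has room to even out irregular loads, and it could be recomputed as loads change without touching the application code.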

Maximizing Network Throughput on the Dragonfly Interconnect

This work, accepted to SC'14, analyzes the behavior of the dragonfly network for various routing strategies, job placement policies, and application communication patterns. The study is based on a novel model that predicts traffic on individual links for direct, indirect, and adaptive routing strategies. In the paper, we analyze results for single jobs and some common parallel job workloads. The predictions presented in this paper are for a 100+ Petaflop/s prototype machine with 92,160 high-radix routers...