Welcome! We are the Parallel Programming Laboratory.

Our goal is to develop technology that improves performance of parallel applications while also improving programmer productivity. We aim to reach a point where, with our freely distributed software base, complex irregular and dynamic applications can (a) be developed quickly and (b) perform scalably on machines with thousands of processors.

Processor virtualization is one of our core techniques: the programmer divides the computation into a large number of entities, which are mapped to the available processors by an intelligent runtime system. This separation of concerns between programmers and the system is key to attaining both our goals together.

Scalable Fault Tolerance with Partial-Order Dependencies
Fault tolerance has become increasingly important as system sizes become larger and the probability of hard or soft errors increases. In this work, we focus on methodologies for tolerating hard errors: those that stem from hardware failure and cause a detectable fault in the system. Many approaches have been applied in this area, ranging from checkpoint/restart, where the application's state is saved to disk, to in-memory checkpoint, where the application incrementally saves its state to partner...