Towards Realizing the Potential of Malleable Jobs
IEEE International Conference on High Performance Computing (HiPC) 2014
Publication Type: Paper
Malleable jobs are those which can dynamically shrink or expand the number of processors on which they are executing at runtime in response to an external command. Malleable jobs can significantly improve system utilization and reduce average response time, compared to traditional jobs. To realize these benefits, three components are critical – an adaptive job scheduler, an adaptive resource manager, and an adaptive parallel runtime system. In this paper, we present a novel mechanism for enabling shrink/expand capability in the parallel runtime system using task migration and dynamic load balancing, checkpoint-restart, and Linux shared memory. Our technique performs true shrink/expand eliminating the need of any residual processes, requires little application programmer effort, and is fast. Further, we establish a bidirectional communication channel between the resource manager and the parallel runtime, and present an asynchronous split-phase mechanism for executing adaptive scheduling decisions. Performance results using Charm++ on Stampede supercomputer show the efficacy, scalability, and benefits of our approach. Shrinking from 2k to 1k cores takes 16s while expand from 1k to 2k takes 40s. Also, we demonstrate the utility of our runtime in traditional as well as emerging scenarios, e.g., proactive fault tolerance and clouds.