Evaluating and Improving the Performance and Scheduling of HPC Applications in Cloud
Abhishek Gupta | Paolo Faraboschi | Filippo Gioachin | Laxmikant Kale | Richard Kaufmann | Bu-Sung Lee | Verdi March | Dejan Milojicic | Chun Hui Suen
IEEE Transactions on Cloud Computing (IEEE TCC) 2014
Publication Type: Paper
Repository URL: http://dx.doi.org/10.1109/TCC.2014.2339858
Cloud computing is emerging as a promising alternative to supercomputers for some high-performance computing (HPC) applications. With cloud as an additional deployment option, HPC users and providers face the challenge of dealing with highly heterogeneous resources, whose variability spans a wide range of processor configurations, interconnects, virtualization environments, and pricing models. In this paper, we take a holistic viewpoint to answer the questions: who should choose cloud for HPC and why, for which applications, and how should cloud be used for HPC? To this end, we perform a comprehensive performance and cost evaluation and analysis of a set of HPC applications running on a range of platforms, from supercomputers to clouds. Further, we improve the performance of HPC applications in cloud by optimizing HPC applications’ characteristics for cloud and cloud virtualization mechanisms for HPC. Finally, we present novel heuristics for online application-aware job scheduling in multi-platform environments. Experimental results and simulations using CloudSim show that current clouds cannot substitute for supercomputers but can effectively complement them. Our application-aware dynamic scheduling heuristics improve average turnaround time by up to 2X and throughput by up to 6X compared to single-platform or application-agnostic scheduling.
Research Areas