Position Paper: Power-aware and Temperature Restrain Modeling for Maximizing Performance and Reliability
DOE Workshop on Modeling and Simulation of Exascale Systems and Applications (MODSIM) 2014
Publication Type: Paper
The ability to constrain power consumption in recent hardware architectures is a powerful capability that can be leveraged to use available power efficiently. We propose to develop power-aware performance models that predict job performance for a given resource configuration, such as the CPU and memory power caps and the number of nodes. In addition to optimizing performance under a fixed power budget, our proposed model reduces the differences in thermal profiles among processors to balance the overall temperature distribution of the data center. A lower operating temperature improves system reliability and saves data center cooling energy, and the model pursues this while minimizing the overall execution time of the jobs. The power-aware performance model can be used to determine the optimal resource configuration for a single job or for a set of jobs, with the aim of using power efficiently.
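As a rough illustration of how such a model could drive configuration selection, the following Python sketch enumerates candidate configurations and keeps the fastest one that fits a power budget. It is only a sketch under stated assumptions: the runtime formula in predicted_runtime, its constants, and all function and parameter names are hypothetical placeholders and are not taken from the paper. A real model would be fitted to measured job behavior, and the sketch does not cover the thermal balancing part of the proposal.

```python
from itertools import product


def predicted_runtime(nodes, cpu_cap_w, mem_cap_w):
    """Toy stand-in for a power-aware performance model (hypothetical).

    Assumes a fixed serial part plus a parallel part that speeds up with
    node count and, sublinearly, with the CPU and memory power caps.
    """
    serial_s = 10.0        # assumed serial time in seconds
    parallel_s = 1000.0    # assumed parallel work in seconds
    cpu_speed = (cpu_cap_w / 100.0) ** 0.5   # assumed diminishing returns
    mem_speed = (mem_cap_w / 30.0) ** 0.25   # assumed diminishing returns
    return serial_s + parallel_s / (nodes * cpu_speed * mem_speed)


def best_configuration(power_budget_w, node_counts, cpu_caps_w, mem_caps_w):
    """Return (runtime, nodes, cpu_cap, mem_cap, total_power) for the fastest
    configuration whose capped power fits the budget, or None if none fits."""
    best = None
    for nodes, cpu, mem in product(node_counts, cpu_caps_w, mem_caps_w):
        total_power = nodes * (cpu + mem)  # worst-case draw under the caps
        if total_power > power_budget_w:
            continue  # configuration exceeds the power budget
        runtime = predicted_runtime(nodes, cpu, mem)
        if best is None or runtime < best[0]:
            best = (runtime, nodes, cpu, mem, total_power)
    return best


if __name__ == "__main__":
    choice = best_configuration(
        power_budget_w=1000,
        node_counts=[4, 8, 16],
        cpu_caps_w=[50, 80, 110],
        mem_caps_w=[10, 20, 30],
    )
    print(choice)
```

Exhaustive enumeration is used here only because the example search space is tiny. For a set of jobs sharing one budget, the same model could instead feed an optimizer that divides power among the jobs.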
