BSLD Threshold Driven Parallel Job Scheduling for Energy Efficient HPC centers
Authors
Abstract
Power awareness in the high performance computing (HPC) community has increased significantly in recent years. While CPU power reduction for HPC applications using Dynamic Voltage Frequency Scaling (DVFS) has been explored thoroughly, CPU power management at the system level for large-scale parallel systems has been left unexplored. In this paper we propose a power-aware parallel job scheduler for DVFS-enabled clusters. Whereas traditional parallel job schedulers determine only when a job will run, power-aware ones must also assign the CPU frequency at which it will run. We introduce two adjustable thresholds to enable fine-grained control of the energy-performance trade-off. Since our power reduction approach is policy independent, it can be added to any parallel job scheduling policy. Furthermore, we analyze HPC system dimensioning. Running an application at a lower frequency on more processors can be more energy efficient than running it at the highest CPU frequency on fewer processors. This paper investigates whether having more DVFS-enabled processors under the same load can lead to better energy efficiency and performance. Five workload logs from systems in production use, with up to 9,216 processors, are simulated to evaluate the proposed algorithm and the dimensioning problem. Our approach decreases CPU energy by 7% to 18% on average, depending on the allowed job performance penalty. Applying the same frequency scaling algorithm to a 20% larger system, the CPU energy needed to execute the same load can be decreased by almost 30% while achieving the same or better job performance.
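As a rough illustration of what threshold-driven frequency assignment could look like, the following Python sketch picks the lowest frequency whose predicted bounded slowdown (BSLD) stays under a limit. The abstract does not define the two thresholds, so everything here is an assumption: the sketch treats one threshold as a cap on predicted BSLD and the other as a cap on wait-queue length, uses a hypothetical frequency set, and assumes runtime scales inversely with frequency. It is not the authors' algorithm.

```python
from dataclasses import dataclass

# Hypothetical DVFS frequency set in GHz (not taken from the paper).
FREQUENCIES_GHZ = (0.8, 1.1, 1.4, 1.7, 2.0, 2.3)


@dataclass
class Job:
    wait_time: float       # seconds the job has waited in the queue
    runtime_at_top: float  # estimated runtime at the highest frequency, in seconds


def predicted_bsld(job, freq, top_freq, bound=600.0):
    """Bounded slowdown if the job runs at `freq`.

    Assumption: runtime is inversely proportional to frequency (a pessimistic
    model). The denominator uses the top-frequency runtime, so the frequency
    penalty shows up as extra slowdown.
    """
    scaled_runtime = job.runtime_at_top * top_freq / freq
    return max((job.wait_time + scaled_runtime) / max(job.runtime_at_top, bound), 1.0)


def pick_frequency(job, queue_length, bsld_threshold, queue_threshold,
                   freqs=FREQUENCIES_GHZ):
    """Return the lowest frequency that keeps predicted BSLD under the threshold.

    Assumption: the second threshold limits the wait-queue length; when the
    queue is longer than that, the job simply runs at the top frequency.
    """
    top = max(freqs)
    if queue_length > queue_threshold:
        return top
    for freq in sorted(freqs):
        if predicted_bsld(job, freq, top) <= bsld_threshold:
            return freq
    return top  # no reduced frequency satisfies the threshold


if __name__ == "__main__":
    job = Job(wait_time=0.0, runtime_at_top=1000.0)
    print(pick_frequency(job, queue_length=0, bsld_threshold=2.0, queue_threshold=4))
```

Because the function only selects a frequency for a job that the scheduler has already chosen to start, a check of this kind could in principle sit on top of any scheduling policy, which matches the policy independence the abstract claims. Raising either threshold in this sketch allows more jobs to run at reduced frequency, at the cost of a larger performance penalty.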
Similar resources
Integrating Cooling Awareness with Thermal Aware Workload Placement for HPC Data Centers
High Performance Computing (HPC) data centers are becoming increasingly dense; the associated power-density and energy consumption of their operation is increasing. Up to half of the total energy is attributed to cooling the data center; greening the data center operations to reduce both computing and cooling energy is imperative. To this effect, this paper integrates awareness of the dynamic b...
Environment-conscious scheduling of HPC applications on distributed Cloud-oriented data centers
The use of High Performance Computing (HPC) in commercial and consumer IT applications is becoming popular. HPC users need the ability to gain rapid and scalable access to high-end computing capabilities. Cloud computing promises to deliver such a computing infrastructure using data centers so that HPC users can access applications and data from a Cloud anywhere in the world on demand and pay b...
Towards understanding HPC users and systems: A NERSC case study
The high performance computing (HPC) scheduling landscape is changing. Previously dominated by tightly coupled MPI jobs, HPC workloads are increasingly including high-throughput, data-intensive, and stream-processing applications. As a consequence, workloads are becoming more diverse at both application and job level, posing new challenges to classical HPC schedulers. There is a need to underst...
Power-Aware Parallel Job Scheduling
The recent increase in performance of High Performance Computing (HPC) centers has been followed by an even higher increase in power consumption. The power draw of modern supercomputers is not only an economic problem; it also has negative consequences for the environment. Roughly speaking, CPU power represents 50% of total system power. Dynamic Voltage Frequency Scaling (DVFS) is a technique widely used to manage ...
Spatio-temporal thermal-aware job scheduling to minimize energy consumption in virtualized heterogeneous data centers
Job scheduling in data centers can be considered from a cyber-physical point of view, as it affects the data center’s computing performance (i.e. the cyber aspect) and energy efficiency (the physical aspect). Driven by the growing needs to green contemporary data centers, this paper uses recent technological advances in data center virtualization and proposes cyber-physical, spatio-temporal (i....