A Comparison of Job Management Systems in Supporting HPC ClusterTools
Abstract
This paper compares the three most common job management systems and how they work with Sun HPC ClusterTools 3.1. It discusses installation, customization, scheduling, and resource control. The three systems are Load Sharing Facility (LSF), Portable Batch System (PBS), and COmputing in DIstributed Networked Environment (CODINE)/Global Resource Director (GRD). We give a brief overview of each product but focus mainly on integrating it with Sun HPC ClusterTools. We provide guidelines for ClusterTools users who run their jobs under these systems, and we demonstrate how to use the job management systems to support commercial MPI applications with HPC ClusterTools.
Similar Resources
A New Open Resource Management Architecture in the Sun HPC ClusterTools™ Environment
Sun Microsystems, Inc. has intellectual property rights relating to technology embodied in the product that is described in this document. In particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed at http://www.sun.com/patents and one or more additional patents or pending patent applications in the U.S. and in other countries. ...
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
Traditional High Performance Computing (HPC) resources, such as those available on the TeraGrid, support batch job submissions using Distributed Resource Management Systems (DRMS) like TORQUE or the Sun Grid Engine (SGE). For large-scale data intensive computing, programming paradigms such as MapReduce are becoming popular. A growing number of codes in scientific domains such as Bioinformatics ...
Characterization and Comparison of Google Cloud Load versus Grids
A new era of Cloud Computing has emerged, but the characteristics of Cloud load in data centers are not perfectly clear. Yet this characterization is critical for the design of novel Cloud job and resource management systems. In this paper, we comprehensively characterize the job/task load and host load in a real-world production data center at Google Inc. We use a detailed trace of over 25 mill...
BSLD Threshold Driven Parallel Job Scheduling for Energy Efficient HPC centers
Recently, power awareness in the high performance computing (HPC) community has increased significantly. While CPU power reduction of HPC applications using Dynamic Voltage Frequency Scaling (DVFS) has been explored thoroughly, CPU power management for large scale parallel systems at system level has been left unexplored. In this paper we propose a power-aware parallel job scheduler assuming DVFS enable...
Self-tuning job scheduling strategies for the resource management of HPC systems and computational grids
In this thesis we develop and study self-tuning job schedulers for resource management systems. Such schedulers search for the best solution among the available scheduling alternatives in order to improve the performance of static schedulers. In two domains of real world job scheduling this concept is implemented. First of all, we study the scheduling in resource management software for high pe...