Benchmarking and tuning the MILC code on clusters and supercomputers
Abstract
We have recently benchmarked and tuned the MILC code on a number of architectures, including Intel Itanium and Pentium IV (PIV), dual-CPU Athlon, and the latest Compaq Alpha nodes. Results will be presented for many of these, and we shall discuss some simple code changes that can yield a dramatic speedup of the KS conjugate gradient on processors with more advanced memory systems, such as the PIV, IBM SP and Alpha.
Similar resources
Comparing Clusters and Supercomputers for Lattice QCD
Since the development of the Beowulf project to build a parallel computer from commodity PC components, many such clusters have been built. The MILC QCD code has been run on a variety of clusters and supercomputers. Key design features are identified, and the cost effectiveness of clusters and supercomputers is compared.
Lattice QCD Production on Commodity Clusters at Fermilab
Large scale QCD Monte Carlo calculations have typically been performed on either commercial supercomputers or specially built massively parallel computers. Commodity clusters equipped with high performance networking equipment present an attractive alternative, achieving superior performance to price ratios and offering clear upgrade paths. The U.S. Department of Energy, through the SciDAC (Sci...
Cost-Effective Clustering
Small Beowulf clusters can effectively serve as personal or group supercomputers. In such an environment, a cluster can be optimally designed for a specific problem (or a small set of codes). We discuss how theoretical analysis of the code and benchmarking on similar hardware lead to optimal systems.
Understanding Application Performance via Micro-benchmarks on Three Large Supercomputers: Intrepid, Ranger and Jaguar
Emergence of new parallel architectures presents new challenges for application developers. Supercomputers vary in processor speed, network topology, interconnect communication characteristics and memory subsystems. This paper presents a performance comparison of three of the fastest machines in the world: IBM’s Blue Gene/P installation at ANL (Intrepid), the SUN-Infiniband cluster at TACC (Ran...
Tuning HipGISAXS on Multi and Many Core Supercomputers
With the continual development of multi- and many-core architectures, there is a constant need for architecture-specific tuning of application codes in order to realize high computational performance and energy efficiency, closer to the theoretical peaks of these architectures. In this paper, we present optimization and tuning of HipGISAXS, a parallel X-ray scattering simulation code [1], on vario...