Coping at the User-Level with Resource Limitations in the Cray Message Passing Toolkit MPI at Scale: How Not to Spend Your Summer Vacation
Abstract
As the number of processor cores available in Cray XT series computers has rapidly grown, users have increasingly encountered instances where an MPI code that has previously worked for years unexpectedly fails at high core counts (“at scale”) due to resource limitations being exceeded within the MPI implementation. Here, we examine several examples drawn from user experiences and discuss strategies for working around these difficulties at the user level.
Similar resources
Resource-Efficient, Hierarchical Auto-Tuning of a Hybrid Lattice Boltzmann Computation on the Cray XT4
We apply auto-tuning to a hybrid MPI-pthreads lattice Boltzmann computation running on the Cray XT4 at the National Energy Research Scientific Computing Center (NERSC). Previous work showed that multicore-specific auto-tuning can improve the performance of lattice Boltzmann magnetohydrodynamics (LBMHD) by a factor of 4× when running on dual- and quad-core Opteron dual-socket SMPs. We extend these stud...
MPICH-G2: A Grid-Enabled Implementation of the Message Passing Interface
Application development for distributed-computing “Grids” can benefit from tools that variously hide or enable application-level management of critical aspects of the heterogeneous environment. As part of an investigation of these issues, we have developed MPICH-G2, a Grid-enabled implementation of the Message Passing Interface (MPI) that allows a user to run MPI programs across multiple ...
Wide-Area Implementation of the Message Passing Interface
The Message Passing Interface (MPI) can be used as a portable, high-performance programming model for wide-area computing systems. The wide-area environment introduces challenging problems for the MPI implementor, due to the heterogeneity of both the underlying physical infrastructure and the software environment at different sites. In this article, we describe an MPI implementation that incorpo...
Architecture Characterization of DoD MSRC HPC Platforms
This paper outlines the results for a set of low-level, architecture characterization benchmarks that measure the performance of dense numerical computations, access to the memory hierarchy and MPI message passing of three high performance architectures. The machines evaluated are the Cray T3E, IBM SP, and SGI Origin 2000 platforms at the CEWES Major Shared Resource Center. Verification of the ...
Clustering T3Es for Metacomputing Applications
This paper presents an environment which enables the coupling of different supercomputers to overcome the limitations of a single computing system. This requires an extension to MPI, since MPI provides no interoperability features. A library called PACX-MPI is presented which provides the user with a distributed MPI environment with most of the important functionality of standard MPI. First res...