HiCOO: Hierarchical cooperation for scalable communication in Global Address Space programming models on Cray XT systems
Abstract
Global Address Space (GAS) programming models provide a convenient, shared-memory style of addressing. Typically this is realized through one-sided operations that enable asynchronous communication and data movement. With petascale systems reaching tens of thousands of nodes and hundreds of thousands of cores, the underlying runtime systems face critical challenges in (1) scalably managing resources (such as memory for communication buffers), and (2) gracefully handling unpredictable communication patterns and any associated contention. Any solution that addresses these resource scalability challenges must also maintain the performance of GAS programming models. In this paper, we describe a Hierarchical COOperation (HiCOO) architecture for scalable communication in GAS programming models. HiCOO formulates a cooperative communication architecture with inter-node cooperation amongst multiple nodes (a.k.a. a multinode) and hierarchical cooperation among multinodes that are arranged in various virtual topologies. We have implemented HiCOO for a popular GAS runtime library, Aggregate Remote Memory Copy Interface (ARMCI). By extensively evaluating different virtual topologies in HiCOO in terms of their impact on memory scalability, network contention, and application performance, we identify MFCG as the most suitable virtual topology. The resulting HiCOO architecture is able to realize scalable resource management and achieve resilience to network contention, while at the same time maintaining or enhancing the performance of scientific applications. In one case, it reduces the total execution time of an NWChem application by 52%. © 2012 Elsevier Inc. All rights reserved.
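To make the memory-scalability argument concrete, the following is a minimal sketch of why a virtual topology can reduce per-node buffer state. It is not the HiCOO or ARMCI implementation, and it does not reproduce the MFCG topology from the paper. It assumes a hypothetical layout in which nodes sit on a near-square 2D grid, each node keeps communication buffers only for peers in its own row and column, and a request to any other node is forwarded through at most one intermediate node. The function names (`grid_shape`, `mesh_peers`, `route`) are invented for this illustration.

```python
import math


def grid_shape(n: int) -> tuple[int, int]:
    """Hypothetical near-square grid (rows, cols) large enough to hold n nodes."""
    rows = math.isqrt(n)
    cols = math.ceil(n / rows)
    return rows, cols


def fully_connected_peers(n: int) -> int:
    """Peers each node must keep buffers for when every node talks to every other."""
    return n - 1


def mesh_peers(n: int) -> int:
    """Upper bound on peers per node when buffers exist only for the same row and column."""
    rows, cols = grid_shape(n)
    return (rows - 1) + (cols - 1)


def route(src: int, dst: int, n: int) -> list[int]:
    """Path from src to dst using only row or column hops (at most one forwarder)."""
    _, cols = grid_shape(n)
    src_row, src_col = divmod(src, cols)
    dst_row, dst_col = divmod(dst, cols)
    if src == dst:
        return [src]
    if src_row == dst_row or src_col == dst_col:
        return [src, dst]  # direct neighbor in the virtual topology
    via = src_row * cols + dst_col  # forwarder in src's row and dst's column
    if via >= n:
        # The last row may be partially filled; fall back to the other corner,
        # which lies in dst's (necessarily full) row and src's column.
        via = dst_row * cols + src_col
    return [src, via, dst]


if __name__ == "__main__":
    n = 10_000
    print("fully connected peers per node:", fully_connected_peers(n))
    print("2D mesh peers per node (upper bound):", mesh_peers(n))
    print("example route:", route(0, 15, 16))
```

Under these assumptions, per-node buffer state grows roughly with the square root of the node count rather than linearly, at the cost of an extra forwarding hop for some requests. The abstract reports that the authors weighed a trade-off of this kind across several topologies; the actual topologies and their costs are described in the paper itself.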
Similar resources
Porting GASNet to Portals: Partitioned Global Address Space (PGAS) Language Support for the Cray XT
Partitioned Global Address Space (PGAS) Languages are an emerging alternative to MPI for HPC applications development. The GASNet library from Lawrence Berkeley National Lab and the University of California at Berkeley provides the network runtime for multiple implementations of four PGAS Languages: Unified Parallel C (UPC), Co-Array Fortran (CAF), Titanium and Chapel. GASNet provides a low ove...
PGAS Models using an MPI Runtime: Design Alternatives and Performance Evaluation
Programming models play a critical role in designing scalable applications. In the past few decades, MPI [3] has become the de facto programming model for writing parallel applications. At the same time, alternative programming models such as Partitioned Global Address Space (PGAS) programming models are gaining traction due to the asynchrony, ability to read/write distributed data structures a...
Scalable performance analysis of large-scale parallel applications on Cray XT systems with Scalasca
The open-source Scalasca toolset (available from www.scalasca.org) supports integrated runtime summarization and automated trace analysis on a diverse range of HPC computer systems. An HPC-Europa2 visit to EPCC in 2009 resulted in significantly enhanced support for Cray XT systems, particularly the auxiliary programming environments and hybrid OpenMP/MPI. Combined with its previously demonstra...
Parallel Finite Element Earthquake Rupture Simulations on Quad- and Hex-core Cray XT Systems
In this paper, we integrate a 3D mesh generator into the simulation, and use MPI to parallelize the 3D mesh generator, illustrate an element-based partitioning scheme for explicit finite element methods, and based on the partitioning scheme and what we learned from our previous work, we implement our hybrid MPI/OpenMP finite element earthquake simulation code in order to not only achieve multip...
Effective use of the PGAS Paradigm: Driving Transformations and Self-Adaptive Behavior in DASH-Applications
DASH is a library of distributed data structures and algorithms designed for running the applications on modern HPC architectures, composed of hierarchical network interconnections and stratified memory. DASH implements a PGAS (partitioned global address space) model in the form of C++ templates, built on top of DART – a run-time system with an abstracted tier above existing one-sided communica...
Journal title:
- J. Parallel Distrib. Comput.
Volume: 72
Pages: -
Publication date: 2012