Resource Scaling Effects on MPP Performance: The STAP Benchmark Implications

Authors

  • Kai Hwang
  • Choming Wang
  • Cho-Li Wang
  • Zhiwei Xu
Abstract

Presently, massively parallel processors (MPPs) are available only in a few commercial models. A sequence of three ASCI Teraflops MPPs has appeared before the new millennium. This paper evaluates six MPP systems through STAP benchmark experiments. STAP is a radar signal processing benchmark that exploits regularly structured SPMD data parallelism. We reveal the resource scaling effects on MPP performance along the orthogonal dimensions of machine size, processor speed, memory capacity, messaging latency, and network bandwidth. We show how to achieve balanced resource scaling against an enlarged workload (problem size). Among the three commercial MPPs, the IBM SP2 shows the highest speed and efficiency, attributed to its well-designed network with middleware support for a single system image. The Cray T3D demonstrates high network bandwidth with a good NUMA memory hierarchy. The Intel Paragon trails far behind because of its slow processors and excessive message-passing latency. Our analysis projects the lowest STAP speed on the ASCI Red, compared with the projected speeds of the two ASCI Blue machines. This is attributed to the slow processors used in ASCI Red and the mismatch between its hardware and software. The Blue Pacific shows the highest potential to deliver scalable performance up to thousands of nodes. The Blue Mountain is designed to have the highest network bandwidth. Our results suggest a limit on the scalability of the distributed shared-memory (DSM) architecture adopted in Blue Mountain. The scaling model offers a quantitative method to match resource scaling with problem scaling to yield truly scalable performance. The model helps MPP designers optimize the processors, memory, network, and I/O subsystems of an MPP. For MPP users, the scaling results can be applied to partition a large workload for SPMD execution or to minimize the software overhead in collective communication or remote memory update operations. Finally, we assess the scaling model for evaluating MPPs with benchmarks other than STAP.

Index Terms: Massively parallel processors, SPMD parallelism, ASCI program, STAP benchmark, phase-parallel model, latency and bandwidth, scalability analysis, supercomputer performance.

Similar resources

ESX Server Performance and Resource Management for CPU-Intensive Workloads

VMware® ESX Server® 2 provides a robust, scalable virtualization framework for consolidating multiple systems onto a single hardware platform. By default, machine resources are shared equally among the multiple virtual systems. In addition, customers can tailor virtual machine configurations to allocate CPU and system resources based on various environmental, application, and workload factors. T...

Evaluating MPI Collective Communication on the SP2, T3D, and Paragon Multicomputers

We evaluate the architectural support of collective communication operations on the IBM SP2, Cray T3D, and Intel Paragon. The MPI performance data are obtained from the STAP benchmark experiments jointly performed at USC and HKU. The T3D demonstrated clearly the best timing performance in almost all collective operations. This is attributed to the special hardware built in the T3D for fast...

Effects of Far- and Near-Field Multiple Earthquakes on the RC SDOF Fragility Curves Using Different First Shock Scaling Methods

Typically, to study the effects of consecutive earthquakes, it is necessary to consider definite intensity levels of the first shock. Methods commonly used to define intensity involve scaling the first shock to a specified maximum interstorey drift. In this study the structure’s predefined elastic spectral acceleration caused by the first shock is also considered for scaling. This study aims to...

Space-Time Adaptive Processing (STAP) for Low Sample Support Applications

Airborne radar Space-Time Adaptive Processing (STAP) in a heterogeneous, target-rich environment is addressed. An efficient Kalman Filter implementation of the normalized form of the Parametric Adaptive Matched Filter (NPAMF) is introduced and shown to perform well against a detailed simulation of a site-specific, dense-target environment, Ground Moving Target Indication (GMTI) scenario. The nu...

A Combined Frequency Scaling and Application Elasticity Approach for Energy-Efficient Virtualized Data Centers

At present, large-scale data centers are typically over-provisioned in order to handle peak load requirements. The resulting low utilization of resources contributes to huge amounts of power consumption in data centers. The effects of high power consumption manifest in high operational costs in data centers and a carbon footprint on the environment. Therefore, the management solutions for larg...

Journal:
  • IEEE Trans. Parallel Distrib. Syst.

Volume: 10  Issue:

Pages: -

Publication date: 1999