Benchmarking Hypercube Hardware and Software by Dirk C. Grunwald and Daniel A. Reed

Authors

  • Daniel A. Reed
Abstract

It has long been a truism in computer systems design that balanced systems achieve the best performance. Message passing parallel processors are no different. To quantify the balance of a hypercube design, we have developed an experimental methodology and applied the associated suite of benchmarks to several existing hypercubes. The benchmark suite includes tests of both processor speed in the absence of internode communication and message transmission speed as a function of communication patterns.

Introduction

The appearance of a new computer system always raises many questions about its performance, both in absolute terms and in comparison to other machines of its class. In addition, repeated studies have shown that a system’s performance is maximized when the components are balanced (i.e., there is no single system bottleneck) [DeBu78]. Message passing parallel processors are no different; optimizing performance requires a judicious combination of node computation speed, message transmission latency, and operating system software. For example, high speed processors connected by high latency communication links restrict the classes of algorithms that can be efficiently supported. Although the interaction of communication and computation can be examined analytically [ReSc83], time varying behavior and the idiosyncrasies of system software can only be captured by observation and measurement.

Consequently, we began a benchmark study of hypercubes, with three primary goals. Because the performance of any system depends on a combination of hardware and software, our first and primary goal was determining both the performance of the underlying hardware and the fraction of that performance lost due to poor compilers and operating system overhead. Second, we wished to characterize the balance of processing power and communication speed. With these parameters, algorithms can be developed that are best suited to the machine [SaNN86].
Finally, we are developing a high-performance, portable operating system for hypercubes, called Picasso, that provides dynamic task migration to balance workloads and adaptive routing of data to avoid congested portions of the network. To meet these goals, it must be possible to rapidly transmit small status messages. Thus, we sought performance data to tune Picasso’s algorithms for each hypercube.

Overview

Because any hypercube computation combines both communication and computation [Heat86, ReFuSb], a single number (e.g., MIPS, MFLOPS, or bits/sec) will not accurately reflect

* Department of Computer Science, University of Illinois, Urbana, Illinois 61801. This work was supported in part by NSF Grant Number DCR 84-17948, NASA Contract Number NAG-1-613 and by the Jet Propulsion Laboratory.

Similar resources

Reconfigurable Logic for Low-Power Space Systems by Jeffrey

The final copy of this thesis has been examined by the signatories, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above mentioned discipline. Thesis directed by Prof. Dirk Grunwald. This research investigates a reconfigurable processing approach to a signal processing application that runs on an onboard space system. The recon...


An Evolving Curriculum to Match the Evolution of Reconfigurable Computing Platforms

Reconfigurable platforms have evolved from “sea of gates” architectures into diverse System on a Chip (SoC) platforms with embedded processor cores and dedicated hardware components. This evolution has greatly increased the performance of this technology, but creates challenges when teaching the new technology to Computer Science and Electrical Engineering graduate students. Previously, knowled...


An Integrated Performance Data Collection, Analysis, and Visualization System

The lack of tools to observe the operation and performance of message-based parallel architectures limits the user's ability to effectively optimize application and system performance. Performance data collection, analysis, and visualization tools are needed to manage the complexity and quantity of performance data. Furthermore, these tools must be integrated with the machine hardware, the syste...


Publication date: 1987