Computation-Communication Overlap on Network-of-Workstation Multiprocessors

Authors

Gary Liu

Tarek S. Abdelrahman

Abstract

This paper describes and evaluates a compiler transformation that improves the performance of parallel programs on Network-of-Workstation (NOW) shared-memory multiprocessors. The transformation overlaps the communication time resulting from non-local memory accesses with the computation time in parallel loops, effectively hiding the latency of the remote accesses. It peels from a parallel loop the iterations that access remote data and reschedules them after the iterations that access only local data (local-only iterations). Asynchronous prefetching of remote data overlaps the non-local access latency with the execution of the local-only iterations. Experimental evaluation on a NOW multiprocessor indicates that the transformation is generally effective, improving parallel execution time by a factor of up to 1.9. The extent of the benefit is determined by three factors: the size of the local-only computations, the significance of remote memory access latency, and the position in the parallel loop of the iterations that access remote data.
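The peeling and rescheduling described in the abstract can be illustrated with a small sketch. This is not the paper's compiler output; it is a hand-written Python approximation under stated assumptions: a three-point stencil over a block `[lo, hi)` of a shared array, where only the first and last iterations touch data outside the block. The names `remote_read` and `stencil_block` are hypothetical, and a thread pool stands in for asynchronous prefetching.

```python
from concurrent.futures import ThreadPoolExecutor


def remote_read(b, i):
    # Hypothetical stand-in for a high-latency non-local memory access.
    return b[i]


def stencil_block(b, lo, hi):
    """Compute a[i] = b[i-1] + b[i] + b[i+1] for i in [lo, hi).

    Assumes b[lo:hi] is local, b[lo-1] and b[hi] are remote,
    and the block holds at least two iterations (hi - lo >= 2).
    """
    a = {}
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Issue the prefetches for remote data before any computation.
        left = pool.submit(remote_read, b, lo - 1)
        right = pool.submit(remote_read, b, hi)

        # Local-only iterations run while the prefetches are in flight.
        for i in range(lo + 1, hi - 1):
            a[i] = b[i - 1] + b[i] + b[i + 1]

        # Peeled iterations run last, once the remote values have arrived.
        a[lo] = left.result() + b[lo] + b[lo + 1]
        a[hi - 1] = b[hi - 2] + b[hi - 1] + right.result()
    return [a[i] for i in range(lo, hi)]
```

In this sketch the peeled iterations sit at the two ends of the block, so rescheduling them is trivial; the abstract notes that the position of such iterations in the loop is one of the factors that determines how much the transformation helps.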

Similar resources

Overlap of Computation and Communication on Shared-Memory

This paper describes and evaluates a compiler transformation that improves the performance of parallel programs on Network-of-Workstation (NOW) shared-memory multiprocessors. The transformation overlaps the communication time resulting from non-local memory accesses with the computation time in parallel loops to effectively hide the latency of the remote accesses. The transformation peels from a ...

HyFi: Architecture-Independent Parallelism on Networks of Multiprocessors

A network of parallel workstations promises cost-effective parallel computing. This paper presents the HyFi (Hybrid Filaments) package, which can be used to create architecture-independent parallel programs, that is, programs that are portable and efficient across different parallel machines. HyFi integrates Shared Filaments (SF), which provides parallelism on shared-memory multiprocessors, and Di...

Non-Uniform Partitioning of Finite Difference Methods Running on SMP Clusters

A multicomputer or workstation cluster with multiprocessor nodes introduces significant need and opportunity for overlapping communication with computation. We evaluate partitioning strategies for an important application class, finite difference methods, running on clusters of symmetric multiprocessors. Our results show that even for a regular, uniform finite difference method, a non-uniform partitio...

Compression-Based Ray Casting of Very Large Volume Data in Distributed Environments

This paper proposes a new parallel/distributed raycasting scheme for very large volume data that can be effectively used in distributed environments. Our method, based on data compression, attempts to enhance the rendering speedups by quickly reconstructing voxel data from local memory rather than expensively fetching them from remote memory spaces. Our compression-based volume rendering scheme...

Demand-based coscheduling of parallel jobs on multiprogrammed multiprocessors

This thesis describes demand-based coscheduling, a new approach to scheduling parallel computations on multiprogrammed multiprocessors. In demand-based coscheduling, rather than making the pessimistic assumption that all the processes constituting a parallel job must be simultaneously scheduled in order to achieve good performance, information about which processes are communicating is used in ...

Publication date: 2001
