Efficient Support for Pipelining in Distributed Shared Memory Systems∗

نویسندگان

Karthik Balasubramanian

David K. Lowenthal

چکیده

Though more difficult to program, distributed-memory parallel machines provide greater scalability than their shared-memory counterparts. Distributed Shared Memory (DSM) systems provide the abstraction of shared memory on a distributed machine. While DSMs provide an attractive programming model, they currently can not efficiently support all classes of scientific applications. One such class are those with recurrences that cause dependencies across processors or nodes. A popular solution to such problems is to use pipelining, which breaks the computation into blocks; each processor performs the computation of a block, which enables the next processor in the pipeline to compute its corresponding block. Once the pipeline is filled, the computation of blocks proceeds in parallel. While pipelining is useful, it is not efficiently supported by current DSM systems. This paper presents an approach to integrating pipelining into DSM systems. We describe our design and implementation of one-way pipelining in a DSM. The key idea is to retain the shared-memory model, but design the extensions such that the execution will mimic what would be done in an explicit message-passing program. We show that one-way pipelining is superior to the two most common ways to program pipelined applications, which are distributed locks and explicit matrix transposition. Finally, we show that one-way pipelining is competitive with a hand-coded, explicit message-passing program.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Data Dependence Boundary Row Boundary Row Node

Though more diicult to program, distributed-memory parallel machines provide greater scalability than their shared-memory counterparts. Distributed Shared Memory (DSM) systems provide the abstraction of shared memory on a distributed machine. While DSMs provide an attractive programming model, they currently can not eeciently support all classes of scientiic applications. One such class are tho...

متن کامل

Radish: Compiling Efficient Query Plans for Distributed Shared Memory

We present Radish, a query compiler that generates distributed programs. Recent efforts have shown that compiling queries to machine code for a single-core can remove iterator and control overhead for significant performance gains. So far, systems that generate distributed programs only compile plans for single processors and stitch them together with messaging. In this paper, we describe an ap...

متن کامل

System Software Support for Reducing Memory Latency on Distributed Shared Memory Multiprocessors

This paper overviews results from our recent work on building customized system software support for Distributed Shared Memory Multiprocessors. The mechanisms and policies outlined in this paper are connected with a single conceptual thread: they all attempt to reduce the memory latency of parallel programs by optimizing critical system services, while hiding the complex architectural details o...

متن کامل

Effiziente Implementierung eingebetteter Runge-Kutta-Verfahren durch Ausnutzung der Speicherzugriffslokalität

Embedded Runge-Kutta methods are among themost popular numerical solutionmethods for non-stiff initial value problems of ordinary differential equations. While possessing a simple computational structure, they provide desirable numerical properties and can adapt the step size efficiently. Therefore, embedded Runge-Kutta methods can often compute the solution function faster than other solution ...

متن کامل

Determining Asynchronous Pipeline Execution

Asynchronous pipelining is a form of parallelism in which processors execute diierent loop tasks (loop statements) as opposed to diierent loop iterations. An asynchronous pipeline schedule for a loop is an assignment of loop tasks to processors, plus an order on instances of tasks assigned to the same processor. This variant of pipelining is particularly relevant in distributed memory systems (...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 1999

Efficient Support for Pipelining in Distributed Shared Memory Systems∗

نویسندگان

چکیده

منابع مشابه

Data Dependence Boundary Row Boundary Row Node

Radish: Compiling Efficient Query Plans for Distributed Shared Memory

System Software Support for Reducing Memory Latency on Distributed Shared Memory Multiprocessors

Effiziente Implementierung eingebetteter Runge-Kutta-Verfahren durch Ausnutzung der Speicherzugriffslokalität

Determining Asynchronous Pipeline Execution

عنوان ژورنال:

اشتراک گذاری