On Automatic Loop Data-Mapping for Distributed-Memory Multiprocessors

Authors

Jordi Torres

Eduard Ayguadé

Jesús Labarta

José María Llabería

Mateo Valero

Abstract

In this paper we present a unified approach for compiling programs for Distributed-Memory Multiprocessors (DMM). Parallelizing sequential programs for a DMM is much harder than for shared-memory systems because each Virtual Processor (VP) has its own exclusive local memory. The approach distributes computations among the VPs of the system and maps data onto their private memories, aiming to extract maximum parallelism from DO loops while minimizing interprocessor communication. The method, named Graph Traverse Scheduling (GTS), is considered in this paper for single-nested loops that include one or several recurrences. In the generated parallel code, dependences included in a Hamiltonian recurrence that involves all the statements of the loop are enforced by the sequential execution of the computation assigned to each VP. Other dependences that are not included in the Hamiltonian recurrence and that involve data mapped onto different VPs need explicit communication and synchronization.
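To make the idea in the abstract concrete, the following is a minimal sketch of mapping a loop with a single constant-distance recurrence onto VPs. It is not the paper's GTS algorithm: the loop `a[i] = a[i - d] + b[i]`, the cyclic mapping `owner`, and the function names are all assumptions chosen for illustration. Under this assumed mapping, every dependence of the recurrence stays inside one VP and is satisfied simply by that VP running its iterations in order, so no communication is needed.

```python
def sequential(a, b, d):
    """Reference loop: a[i] = a[i - d] + b[i], one recurrence of distance d."""
    a = list(a)
    for i in range(d, len(a)):
        a[i] = a[i - d] + b[i]
    return a


def owner(i, d):
    """Assumed mapping: iteration i and element a[i] live on VP (i mod d)."""
    return i % d


def distributed(a, b, d):
    """Each VP executes its own chain of iterations, using only local data."""
    n = len(a)
    # Private memory of each VP: index -> value for the elements it owns.
    local = [{i: a[i] for i in range(n) if owner(i, d) == vp} for vp in range(d)]
    for vp in range(d):  # the VPs are independent; this loop could run in parallel
        for i in range(vp + d, n, d):
            # a[i - d] has the same owner as a[i], so the read is local.
            # b is assumed to be aligned with a, so b[i] is local as well.
            local[vp][i] = local[vp][i - d] + b[i]
    # Gather the private memories back into one array, only to check the result.
    return [local[owner(i, d)][i] for i in range(n)]
```

For example, calling `distributed([1, 2, 3, 0, 0, 0, 0, 0], list(range(8)), 3)` gives the same array as `sequential` with the same arguments. A dependence that is not part of such a recurrence, for instance a read of `a[i - 1]` in the same loop, would cross VPs under this mapping and would require the explicit communication and synchronization the abstract mentions.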

Related resources

Locality Analysis for Distributed Shared-Memory Multiprocessors

This paper studies the locality analysis problem for shared-memory multiprocessors, a class of parallel machines that has experienced steady and rapid growth in the past few years. The focus of this work is on estimation of the memory performance of a loop nest for a given set of computation and data distributions. We assume a distributed shared-memory multiprocessor model. We discuss how to es...

A Framework for Integrating Data Alignment, Distribution, and Redistribution in Distributed Memory Multiprocessors

Parallel architectures with physically distributed memory provide a cost-effective scalability to solve many large scale scientific problems. However, these systems are very difficult to program and tune. In these systems, the choice of a good data mapping and parallelization strategy can dramatically improve the efficiency of the resulting program. In this paper, we present a framework for au...

Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-Memory Multiprocessors

This paper presents a theoretical framework for automatically partitioning parallel loops to minimize cache coherency traffic on shared-memory multiprocessors. While several previous papers have looked at hyperplane partitioning of iteration spaces to reduce communication traffic, the problem of deriving the optimal tiling parameters for minimal communication in loops with general affine index expres...

Enhancing the Performance of Autoscheduling in Distributed Shared Memory Multiprocessors

Abstract. Autoscheduling is a parallel program compilation and execution model that combines uniquely three features: Automatic extraction of loop and functional parallelism at any level of granularity, dynamic scheduling of parallel tasks, and dynamic program adaptability on multiprogrammed shared memory multiprocessors. This paper presents a technique that enhances the performance of autosche...

Enhancing the Performance of Autoscheduling with Locality-Based Partitioning in Distributed Shared Memory Multiprocessors

Publication date: 1991
