Parallelizing nested loops on multicomputers - the grouping approach
Authors
Abstract
This paper presents the design of a tool for partitioning and parallelizing nested loops for execution on distributed-memory multicomputers. The core of the tool is a technique called grouping, which identifies appropriate loop partition patterns based on data dependencies across iterations. Combined with analytic results from performance modeling tools, grouping allows certain nested loops to be partitioned systematically and automatically, without users specifying the data partitions. Grouping is based on the concept of pipelined data parallel computation, which promises to balance computation and communication on multicomputers. The paper presents the basic structure of the parallelizing tool, describes the grouping and performance analysis techniques for pipelined data parallel computations, and introduces a prototype of the tool to illustrate the feasibility of the approach.
Similar resources
Chain-Based Scheduling: Part I - Loop Transformations and Code Generation
Chain-based scheduling [1] is an efficient partitioning and scheduling scheme for nested loops on distributed-memory multicomputers. The idea is to take advantage of the regular data dependence structure of a nested loop to overlap and pipeline the communication and computation. Most partitioning and scheduling algorithms proposed for nested loops on multicomputers [1,2,3] are graph algorithms on...
Statement-Level Communication-Free Hyperplane Partitioning Techniques for Parallelizing Compilers on Multicomputers
This paper addresses the problem of communication-free partitioning of the statement-iterations of nested loops and of the data accessed by these statement-iterations. Communication-free hyperplane partitions of disjoint subsets of data and statement-iterations are considered. This approach is more likely than existing methods to find the data and program distribution patterns that can cause the process...
Chain-Based Scheduling: Part I - Loop Transformations and Code Generation
Chain-based scheduling [1] is an efficient partitioning and scheduling scheme for nested loops on distributed-memory multicomputers. The idea is to take advantage of the regular data dependence structure of a nested loop to overlap and pipeline the communication and computation. Most partitioning and scheduling algorithms proposed for nested loops on multicomputers [1,2,3] are graph algorithms on t...
Tiling of Iteration Spaces for Multicomputers
We deal with compiler support for parallelizing perfectly nested loops for coarse-grain distributed-memory machines. The relatively high communication start-up costs in these machines render frequent communication very expensive. We study the effect of clustering communication and the ensuing loss of parallelism on performance and propose a method for aggregating a number of loop iterations int...
Processor Tagged Descriptors: A Data Structure for Compiling for Distributed-Memory Multicomputers
The computation partitioning, communication analysis, and optimization phases performed during compilation for distributed-memory multicomputers require an efficient way of describing distributed sets of iterations and regions of data. Processor Tagged Descriptors (PTDs) provide these capabilities through a single set representation parameterized by the processor location for each dimension of a ...