Analyses for the Translation of OpenMP Codes into SPMD Style with Array Privatization
Authors
Abstract
A so-called SPMD style OpenMP program can achieve scalability on ccNUMA systems by means of array privatization, and earlier research has shown good performance under this approach. Since it is hard to write SPMD OpenMP code, we showed a strategy for the automatic translation of many OpenMP constructs into SPMD style in our previous work. In this paper, we first explain how to ensure that the OpenMP program consistently schedules the parallel loops. Then we describe the analyses required for the translation of an OpenMP program into an equivalent SPMD-style OpenMP code with array privatization. Interprocedural analysis is required to help determine the shape of the privatized array and the quality of the translation.
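As a rough illustration of what "SPMD style with array privatization" means, here is a minimal C sketch. It is not the translation scheme from the paper: the function name `scale_spmd`, the block distribution, and the copy-in/copy-out steps are assumptions chosen for illustration. Each thread computes its own index range from its thread id instead of relying on a work-sharing loop, and works on a private copy of just the part of the array it owns.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: multiply a[0..n-1] by factor in SPMD style.
 * Returns 0 on success, -1 if any thread failed to allocate its buffer. */
int scale_spmd(double *a, int n, double factor)
{
    int failed = 0;

    #pragma omp parallel reduction(|:failed)
    {
        int tid = omp_get_thread_num();
        int nthreads = omp_get_num_threads();

        /* Block distribution computed by hand rather than by "omp for". */
        int chunk = (n + nthreads - 1) / nthreads;
        int lo = tid * chunk;
        int hi = (lo + chunk < n) ? lo + chunk : n;
        int len = (hi > lo) ? hi - lo : 0;

        /* Privatized array: holds only the block this thread owns. */
        double *priv = malloc((size_t)(len > 0 ? len : 1) * sizeof *priv);
        if (priv == NULL) {
            failed = 1;
        } else {
            for (int i = 0; i < len; i++)   /* copy-in from the shared array */
                priv[i] = a[lo + i];
            for (int i = 0; i < len; i++)   /* compute on thread-local data */
                priv[i] *= factor;
            for (int i = 0; i < len; i++)   /* copy-out to the shared array */
                a[lo + i] = priv[i];
            free(priv);
        }
    }
    return failed ? -1 : 0;
}
```

In this toy case the private block has exactly the shape of the thread's own index range. In real codes a thread may also read elements owned by its neighbours, which is where the abstract's point about analysis determining the shape of the privatized array comes in.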
Similar resources
Improving the Performance of OpenMP by Array Privatization
OpenMP has emerged as a popular parallel programming interface for medium-scale high-performance applications. Its strong points are its support for incremental parallelization, its portability, and its ease of use. However, the obstacles to scaling an OpenMP code to hundreds or thousands of processors, as they are beginning to be configured in ccNUMA systems, are remote memory access latency and poor cache m...
A Tool to Display Array Access Patterns in OpenMP Programs
A program analysis tool can play an important role in helping users understand and improve OpenMP codes. Array privatization is one of the most effective ways to improve the performance and scalability of OpenMP programs. In this paper we present an extension to the Open64 compiler and the Dragon tool, a program analysis tool built on top of this compiler, to enable them to collect and represen...
OpenMP-oriented Applications for Distributed Shared Memory Architectures
The fast emergence of OpenMP as the preferred parallel programming paradigm for small-to-medium scale parallelism could decline unless OpenMP shows that it can be the model of choice for large-scale high-performance parallel computing in the next decade. The main stumbling block to adapting OpenMP for distributed shared memory (DSM) machines, which are based on architecture like cc-N...
Performance of Parallel Bit-Reversal with Cilk and UPC for Fast Fourier Transform
Bit-reversal is widely known as an important program and an essential part of the Fast Fourier Transform. If it is not carefully designed, it can easily take a large portion of an FFT application's total execution time. In this paper, we present a parallel implementation of Bit-reversal for FFT using Cilk and UPC. Based on our previous work of creating parallel Bit-reversal using OpenMP in SPMD style...
Porting and performance evaluation of irregular codes using OpenMP
In the last two years, OpenMP has been gaining popularity as a standard for developing portable shared memory parallel programs. With the improvements in centralized shared memory technologies and the emergence of distributed shared memory (DSM) architectures, several medium-to-large physical and logical shared memory configurations are now available. Thus, OpenMP stands to be a promising medium...