Implementing O(N) N-body Algorithms Efficiently in Data Parallel Languages (High Performance Fortran)
Abstract
O(N) algorithms for N-body simulations enable the simulation of particle systems with up to 100 million particles on current Massively Parallel Processors (MPPs). Our optimization techniques mainly focus on minimizing data movement through careful management of the data distribution and the data references, both between the memories of different nodes and within the memory hierarchy of each node. We show how the techniques can be expressed in languages with an array syntax, such as Connection Machine Fortran (CMF). All CMF constructs used, with one exception, are included in High Performance Fortran. The effectiveness of our techniques is demonstrated on an implementation of Anderson's hierarchical O(N) N-body method for the Connection Machine system CM-5/5E. Communication accounts for about 10-20% of the total execution time, with the average efficiency for arithmetic operations being about 40% and the total efficiency (including communication) being about 35%. For the CM-5E, a performance in excess of 60 Mflop/s per node (peak 160 Mflop/s per node) has been measured.
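As a rough illustration of why the choice of data distribution matters for data movement, the sketch below counts how many nearest-neighbour references cross a node boundary under two distributions that HPF provides, BLOCK and CYCLIC. It is a one-dimensional toy model and not the method of the paper: the function names, the problem size, and the node count are all assumptions made for this example.

```python
# Toy model (an assumption for illustration, not the paper's code):
# n boxes in a 1-D array are spread over p nodes, and each box i
# references its right-hand neighbour i + 1.

def owner_block(i, n, p):
    """Node owning element i under a BLOCK distribution of n elements over p nodes."""
    block = -(-n // p)  # ceiling division: elements per node
    return i // block

def owner_cyclic(i, n, p):
    """Node owning element i under a CYCLIC distribution over p nodes."""
    return i % p

def off_node_fraction(owner, n, p):
    """Fraction of references i -> i + 1 whose two ends live on different nodes."""
    crossings = sum(owner(i, n, p) != owner(i + 1, n, p) for i in range(n - 1))
    return crossings / (n - 1)

# Hypothetical sizes chosen only for the example.
n_boxes, n_nodes = 1024, 32
block_frac = off_node_fraction(owner_block, n_boxes, n_nodes)
cyclic_frac = off_node_fraction(owner_cyclic, n_boxes, n_nodes)

print(f"BLOCK : {block_frac:.3f} of neighbour references leave the node")
print(f"CYCLIC: {cyclic_frac:.3f} of neighbour references leave the node")
```

In this toy setting the BLOCK distribution keeps all but the block-boundary references on the same node, whereas CYCLIC sends every neighbour reference off-node. A hierarchical N-body code has a far richer reference pattern, so this only hints at the kind of trade-off the abstract describes.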
Similar resources
Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages
The optimization techniques for hierarchical O(N) N-body algorithms described here focus on managing the data distribution and the data references, both between the memories of different nodes and within the memory hierarchy of each node. We show how the techniques can be expressed in data-parallel languages, such as High Performance Fortran (HPF) and Connection Machine Fortran (CMF). The effect...
Implementing O(N) N-body Algorithms Efficiently in Data Parallel Languages (High Performance Fortran)
O(N) algorithms for N-body simulations enable the simulation of particle systems with up to 100 million particles on current Massively Parallel Processors (MPPs). Our optimization techniques mainly focus on minimizing data movement through careful management of the data distribution and the data references, both between the memories of different nodes and within the memory hierarchy of eac...
Efficient Data Parallel Implementations of Highly Irregular Problems
This dissertation presents optimization techniques for efficient data parallel formulation/implementation of highly irregular problems, and applies the techniques to O(N) hierarchical N-body methods for large-scale N-body simulations. It demonstrates that highly irregular scientific and engineering problems such as nonadaptive and adaptive O(N) hierarchical N-body methods can be efficiently imp...
New data-parallel language features for sparse matrix computations
High-level data-parallel languages such as Vienna Fortran and High Performance Fortran (HPF) have been introduced to allow the programming of massively parallel distributed-memory machines at a relatively high level of abstraction, based on the Single-Program-Multiple-Data (SPMD) paradigm. Their main features include mechanisms for expressing the distribution of data across the processors of a ...
Data Distribution Concepts for Parallel Image Processing
Data distributions have gained considerable interest in the field of data parallel programming. In most cases they are key factors for the efficiency of the implementation. In this paper we analyze data distributions suited for parallel image processing and those defined by some of today's more popular parallel languages (HPF, Vienna Fortran, pC++) and libraries (ScaLAPACK). The majority of them belon...