Optimized Communication Patterns on Workstation Clusters
Abstract
The limited communication bandwidth and high startup latencies of clustered workstations restrict their use to problems with sparse communication patterns or good concurrency between calculation and communication. First, we describe our modifications to the popular PVM [5] message passing library and report on performance improvements using the PVM package on an FDDI ring. Applications developed with a parallel communications architecture in mind perform poorly when ported to a message passing library running on workstations with sequential communication. In the second part, we present a dynamic loop scheduling algorithm for the data parallel programming model which optimizes the network usage on such clusters. As a proof of concept, we have implemented a basic matrix multiplication and find a significant increase in parallel efficiency.
Similar resources
Mapping of coarse-grained applications onto workstation clusters
We present an environment for configuring and coordinating coarse grained parallel applications on workstation clusters. The environment named CoPA is based on PVM and allows an automatic distribution of functional modules as they occur in typical CAE-applications. By implementing link-based communication on top of PVM, CoPA is able to perform a “post-game” analysis of the communication load be...
MPI Communication in SMP Clusters
Recent years have seen a considerable increase in the number of cluster systems. These systems provide a very good performance-cost ratio. However, in order to meet the requirements for ever-increasing computing power of present-day applications, several SMP cluster systems have emerged. Deploying two or more processors per workstation can increase the performance of a cluster significan...
The Parallel Implementation of a Full Configuration Interaction Program
Both the replicated and distributed data parallel full configuration interaction (FCI) implementations are described. The implementation of the FCI algorithm is organized in a hybrid strings-integral driven approach. Redundant communication is avoided, and the network performance is further optimized by an improved distributed data interface library. Examples show linear scalability of the dist...
Efficient Parallel Computing on Workstation Clusters
We present novel hard- and software that efficiently implements communication primitives for parallel execution on workstation clusters. We provide low communication latencies, minimal protocol, zero operating system overhead, and high throughput. With this technology, it is possible to build effective parallel systems using off-the-shelf workstations. Our goal is to develop a standard interfaceb...