New Data Distribution for Solving Triangular Systems on Distributed Memory Machines
Abstract
This paper presents a new data distribution for triangular matrices that spreads blocks evenly among processes and wastes less memory than the standard block-cyclic data layout used in the ScaLAPACK library for dense matrix computations. A new algorithm for solving triangular systems of linear equations is also introduced. Experiments performed on a cluster of Itanium 2 processors and on a Cray X1 show that, in some cases, the new method is faster than the corresponding PBLAS routines PSTRSV and PSTRSM.
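To make the load-balance concern concrete, the following minimal sketch counts how many blocks of a lower triangular block matrix each process would own under a plain 2D block-cyclic mapping. It illustrates only the standard layout the abstract compares against, not the new distribution proposed in the paper. The function names, the grid shape, and the block count are hypothetical and chosen for illustration only.

```python
from collections import Counter


def block_cyclic_owner(i, j, p_rows, p_cols):
    """Grid coordinates of the process owning block (i, j) in a 2D block-cyclic layout."""
    return (i % p_rows, j % p_cols)


def lower_triangular_block_counts(n_blocks, p_rows, p_cols):
    """Count the lower-triangle blocks (i >= j) assigned to each process."""
    # Start every process at zero so that idle processes still show up.
    counts = Counter({(r, c): 0 for r in range(p_rows) for c in range(p_cols)})
    for i in range(n_blocks):
        for j in range(i + 1):  # only blocks on or below the diagonal
            counts[block_cyclic_owner(i, j, p_rows, p_cols)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical example: 4 x 4 blocks on a 2 x 2 process grid.
    counts = lower_triangular_block_counts(4, 2, 2)
    for proc in sorted(counts):
        print(proc, counts[proc])
```

In this small case three processes hold three triangular blocks each while process (0, 1) holds only one. If every process nevertheless reserves room for its share of the full square matrix (four block slots here), part of that storage is never referenced, which is the kind of waste the abstract refers to.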
Similar resources
On Designing Optimal Parallel Triangular Solvers
This paper explores the problem of solving triangular linear systems on parallel distributed-memory machines. Working within the LogP model, tight asymptotic bounds for solving these systems using forward/backward substitution are presented. Specifically, lower bounds on execution time independent of the data layout, lower bounds for data layouts in which the number of data items per processor ...
Efficient Parallel Solutions of Large Sparse SPD Systems on Distributed-Memory Multiprocessors
We consider several issues involved in the solution of sparse symmetric positive definite systems by the multifrontal method on distributed-memory multiprocessors. First, we present a new algorithm for computing the partial factorization of a frontal matrix on a subset of processors, which significantly improves the performance of a previously designed distributed multifrontal algorithm. Second, new p...
The Shifted Hessenberg System Solve Computation
We present methods for improving data reuse in solving sequences of linear systems that are Hessenberg matrices shifted by a sequence of scalars times the identity or a triangular matrix. The methods take into consideration the robust handling of overflow and include new condition estimation strategies. We provide timings on both scalar and vector machines to demonstrate both the diversity and i...
The Parallel Solution of Triangular Systems of Linear Equations
We present a parallel algorithm for solving triangular systems of linear equations on distributed-memory multiprocessor machines. The parallelism is achieved by partitioning the rows of the coefficient matrix in segments of a fixed size and distributing these sets in a wrap-around fashion among the processors, connected in a ring. The granularity of the algorithm is controlled by varying the se...
Compiling for Distributed-Memory Systems
Distributed-memory systems are potentially scalable to a very large number of processors and promise to be powerful tools for solving large-scale scientific and engineering problems. However, these machines are currently difficult to program, since the user has to distribute the data across the processors and explicitly formulate the communication required by the program under the selected di...