On Designing Optimal Parallel Triangular Solvers
Abstract
This paper explores the problem of solving triangular linear systems on parallel distributed-memory machines. Working within the LogP model, it presents tight asymptotic bounds for solving these systems by forward/backward substitution. Specifically, it gives lower bounds on execution time that are independent of the data layout, lower bounds for data layouts in which the number of data items per processor is bounded, and lower bounds for specific data layouts commonly used in parallel algorithms for this problem. It also provides algorithms whose running times are within a constant factor of these lower bounds. One interesting result is that the popular two-dimensional block matrix layout necessarily results in significantly longer running times than simpler one-dimensional schemes. Finally, the lower bounds are generalized to banded triangular linear systems. © 2000 Academic Press
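For readers unfamiliar with the kernel the abstract refers to, the following is a minimal sequential sketch of forward substitution for a lower triangular system L x = b. It is an illustration only, not the paper's parallel algorithm; the function name `forward_substitution` and the list-of-lists matrix representation are assumptions made for this example.

```python
def forward_substitution(L, b):
    """Solve L x = b for a lower triangular matrix L with nonzero diagonal.

    L is a list of rows (hypothetical representation chosen for this sketch);
    only entries on or below the diagonal are read.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # x[i] depends on every previously computed x[0..i-1].
        partial = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - partial) / L[i][i]
    return x


if __name__ == "__main__":
    L = [[2.0, 0.0, 0.0],
         [1.0, 3.0, 0.0],
         [4.0, 5.0, 6.0]]
    b = [2.0, 7.0, 32.0]
    print(forward_substitution(L, b))
```

Each unknown depends on all earlier ones, so the computation has a chain of data dependences along the diagonal. How the entries of L and the partial sums are distributed across processors determines how much communication that chain requires, which is the kind of layout question the bounds in the abstract address.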
Similar resources
Parallel Triangular Solvers on GPU
In this paper, we systematically investigate GPU-based parallel triangular solvers. Parallel triangular solvers are fundamental to preconditioners in the incomplete LU factorization family and to algebraic multigrid solvers. We develop a new matrix format suitable for GPU devices. Parallel lower triangular solvers and upper triangular solvers are developed for this new data structure. With these solv...
Generalizing the Implementation of an Optimal Parallel Recursive Algorithm for Triangular Matrix Inversion
This paper describes a generalization of a study on an implementation of a parallel divide-and-conquer algorithm for triangular matrix inversion [3]. Given an original (lower) triangular matrix of size n = m2^k (m, k ≥ 1) and an available number of processors p that is a power of 2 (p < n), we designed a strong cost-optimal parallel algorithm, i.e. one whose efficiency (resp. speedup) is equal to 1 (resp. p)...
Development of Krylov and AMG Linear Solvers for Large-Scale Sparse Matrices on GPUs
This research introduces our work on developing Krylov subspace and AMG solvers on NVIDIA GPUs. As SpMV is a crucial part of these iterative methods, SpMV algorithms for a single GPU and for multiple GPUs are implemented. A HEC matrix format and a communication mechanism are established. In addition, a set of specific algorithms for solving preconditioned systems in parallel environments are designed, i...
Dense Triangular Solvers on Multicore Clusters using UPC
The popularity of Partitioned Global Address Space (PGAS) languages has increased in recent years thanks to their high programmability and to the performance they achieve through efficient exploitation of data locality. This paper describes the implementation of efficient parallel dense triangular solvers in the PGAS language Unified Parallel C (UPC). The solvers are built on top of sequential BLAS functi...
Parallel Algorithms and Condition Estimators for Standard and Generalized Triangular Sylvester-Type Matrix Equations
We discuss parallel algorithms for solving eight common standard and generalized triangular Sylvester-type matrix equations. Our parallel algorithms are based on explicit blocking, 2D block-cyclic data distribution of the matrices, and wavefront-like traversal of the right-hand side matrices while solving small-sized matrix equations at different nodes and updating the rest of the right hand side...