Parallel Tridiagonalization through Two-Step Band Reduction
Abstract
We present a two-step variant of the "successive band reduction" paradigm for the tridiagonalization of symmetric matrices. Here we reduce a full matrix first to narrow-banded form and then to tridiagonal form. The first step allows easy exploitation of block orthogonal transformations. In the second step, we employ a new blocked version of a banded matrix tridiagonalization algorithm by Lang. In particular, we are able to express the update of the orthogonal transformation matrix in terms of block transformations. This expression leads to an algorithm that is almost entirely based on BLAS-3 kernels and has greatly improved data movement and communication characteristics. We also present some performance results on the Intel Touchstone DELTA and the IBM SP1.
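The two-step structure described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The names `reduce_to_band` and `semibandwidth` are hypothetical, the orthogonal factors are formed explicitly rather than in a blocked (e.g. WY) representation, and the second step simply reuses the same routine with bandwidth 1, which amounts to plain Householder tridiagonalization and does not exploit the band structure the way Lang's algorithm does. It only shows that full-to-band followed by band-to-tridiagonal is an orthogonal similarity transformation whose factors can be accumulated with matrix-matrix products.

```python
import numpy as np

def reduce_to_band(a, b):
    """Reduce symmetric `a` to semibandwidth `b` by orthogonal similarity.

    Returns (band, q) with a == q @ band @ q.T up to rounding.
    Hypothetical helper for illustration only.
    """
    band = np.array(a, dtype=float)
    n = band.shape[0]
    q_total = np.eye(n)
    for k in range(0, n - b - 1, b):
        r0 = k + b  # first row of the panel that lies below the band
        # QR of the panel; mode="complete" returns a square orthogonal Q.
        q, _ = np.linalg.qr(band[r0:, k:k + b], mode="complete")
        # Two-sided update keeps the matrix symmetric and similar to `a`.
        band[r0:, :] = q.T @ band[r0:, :]
        band[:, r0:] = band[:, r0:] @ q
        # Accumulate the transformation with a matrix-matrix product.
        q_total[:, r0:] = q_total[:, r0:] @ q
    return band, q_total

def semibandwidth(m, tol=1e-10):
    """Largest |i - j| over entries whose magnitude exceeds `tol`."""
    rows, cols = np.nonzero(np.abs(m) > tol)
    return int(np.max(np.abs(rows - cols))) if rows.size else 0

rng = np.random.default_rng(0)
x = rng.standard_normal((12, 12))
a = x + x.T                          # symmetric test matrix

band, q1 = reduce_to_band(a, 3)      # step 1: full -> narrow band
tri, q2 = reduce_to_band(band, 1)    # step 2: band -> tridiagonal (stand-in)
q = q1 @ q2                          # combined orthogonal transformation
```

Every update in the loop is a matrix-matrix product, which is the property that makes the first step a natural fit for BLAS-3 kernels. A production code would keep each factor in compact blocked form and would use a band-aware second step.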
Similar resources
An Algorithm for Simultaneous Band Reduction of Two Dense Symmetric Matrices
In this paper, we propose an algorithm for simultaneously reducing two dense symmetric matrices to band form with the same bandwidth by congruent transformations. The simultaneous band reduction can be considered as an extension of the simultaneous tridiagonalization of two dense symmetric matrices. In contrast to algorithms of simultaneous tridiagonalization that are based on Level-2 BLAS (Bas...
Communication Avoiding Symmetric Band Reduction
The running time of an algorithm depends on both arithmetic and communication (i.e., data movement) costs, and the relative costs of communication are growing over time. In this work, we present both theoretical and practical results for tridiagonalizing a symmetric band matrix: we present an algorithm that asymptotically reduces communication, and we show that it indeed performs well in practi...
Parallel Bandreduction and Tridiagonalization
This paper presents a parallel implementation of a blocked band reduction algorithm for symmetric matrices suggested by Bischof and Sun. The reduction to tridiagonal or block tridiagonal form is a special case of this algorithm. A blocked double torus wrap mapping is used as the underlying data distribution and the so-called WY representation is employed to represent block orthogonal transforma...
High Performance Computing in Material Sciences Higher Level Blas in Symmetric Eigensolvers
In this report a way to apply high level Blas to the tridiagonalization process of a symmetric matrix A is investigated. Tridiagonalization is a very important and work-intensive preprocessing step in eigenvalue computations. It also arises as a very central part of the material sciences code Wien 97 (Blaha et al. [12]). After illustrating the drawbacks and limitations of the tridiagonalization...
Parallel block tridiagonalization of real symmetric matrices
Two parallel block tridiagonalization algorithms and implementations for dense real symmetric matrices are presented. Block tridiagonalization is a critical pre-processing step for the block-tridiagonal divide-and-conquer algorithm for computing eigensystems and is useful for many algorithms desiring the efficiencies of block structure in matrices. For an “effectively” sparse matrix, which freq...