Packed Storage Extension for ScaLAPACK
Abstract
We describe an extension to ScaLAPACK for computing with symmetric (and Hermitian) matrices stored in a packed form. This is similar to the compact storage for symmetric (and Hermitian) matrices available in LAPACK [2]. It enables more efficient use of memory by storing only the lower or upper triangular part of a symmetric matrix. The capabilities include Cholesky factorization (PxSPTRF) and solution (PxSPTRS) of symmetric (Hermitian) linear systems, the computation of eigenvalues and eigenvectors (PxSPEV), expert drivers (PxSPEVX), and the generalized eigenvalue problem (PxSPGVX) for symmetric (Hermitian) matrices in packed storage. This work differs from an earlier work [5] on packed storage by considering wider block column panels of width NB * NPCOL instead of a single block column of width NB in performance-sensitive routines. The goal is to reduce the overhead in index calculation by increasing the granularity and operating on wider column panels. The packed storage scheme (described in §2) resembles the ScaLAPACK data distribution but physically stores only the lower (or upper) blocks. Each block column panel of width NB * NPCOL can be considered as a trapezoidal submatrix in a fully dense ScaLAPACK matrix. Section 3 contains two concrete examples of how such an arrangement can be used with conventional ScaLAPACK routines for dense storage. For some performance-critical PBLAS-like routines, such as PxTPSM (triangular solve) and PxTPMM (multiplication by triangular matrix), block diagonal submatrices are copied into fully dense ScaLAPACK matrices to reuse standard high-performance parallel BLAS routines. The right-looking Cholesky factorization re-
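To make the memory saving concrete, the following is a minimal sketch of the sequential, LAPACK-style packed layout that the abstract compares against: the lower triangle is stored column by column in a flat array of n(n+1)/2 entries. It does not reproduce the distributed block scheme of the extension itself, and the function names (`packed_index_lower`, `pack_lower`, `get_symmetric`) are hypothetical, chosen only for illustration.

```python
def packed_index_lower(i: int, j: int, n: int) -> int:
    """Offset of A[i, j] (0-based, i >= j) in column-major lower packed storage."""
    if not (0 <= j <= i < n):
        raise IndexError("need 0 <= j <= i < n for the lower triangle")
    # Columns 0..j-1 hold n, n-1, ..., n-j+1 entries, i.e. j*(2n-j+1)/2 in total;
    # within column j the entry sits (i - j) places below the diagonal.
    return i + j * (2 * n - j - 1) // 2


def pack_lower(a):
    """Copy the lower triangle of a square nested list into a flat packed list."""
    n = len(a)
    ap = [0.0] * (n * (n + 1) // 2)  # roughly half of the n*n dense storage
    for j in range(n):
        for i in range(j, n):
            ap[packed_index_lower(i, j, n)] = a[i][j]
    return ap


def get_symmetric(ap, i, j, n):
    """Read A[i, j] of the symmetric matrix, using symmetry for the upper part."""
    if i < j:
        i, j = j, i
    return ap[packed_index_lower(i, j, n)]


if __name__ == "__main__":
    a = [[4.0, 1.0, 2.0],
         [1.0, 5.0, 3.0],
         [2.0, 3.0, 6.0]]
    ap = pack_lower(a)
    print(ap)
    print(get_symmetric(ap, 0, 2, 3))
```

The index arithmetic runs once per element here. The wider NB * NPCOL panels described above aim to pay a comparable indexing cost once per panel rather than once per narrow block column.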
Similar resources
Department of Computer Science Technical Report CS-98-385: Packed storage extension for ScaLAPACK
We describe a new extension to ScaLAPACK [2] for computing with symmetric (Hermitian) matrices stored in a packed form. The new code is built upon the ScaLAPACK routines for full dense storage for a high degree of software reuse. The original ScaLAPACK stores a symmetric matrix as a full matrix but accesses only the lower or upper triangular part. The new code enables more efficient use of mem...
Department of Computer Science Technical Report CS-97-347: Packed storage extension for ScaLAPACK
We describe a new extension to ScaLAPACK [2] for computing with symmetric (Hermitian) matrices stored in a packed form. The new code is built upon the ScaLAPACK routines for full dense storage for a high degree of software reuse. The original ScaLAPACK stores a symmetric matrix as a full matrix but accesses only the lower or upper triangular part. The new code enables more efficient use of mem...
A distributed packed storage for large dense parallel in-core calculations
We propose in this paper a distributed packed storage format that exploits the symmetry or the triangular structure of a dense matrix. This format stores only half of the matrix while maintaining most of the efficiency compared to a full storage for a wide range of operations. This work has been motivated by the fact that, contrary to sequential linear algebra libraries (e.g. LAPACK [4]), there...
Three Algorithms for Cholesky Factorization on Distributed Memory Using Packed Storage
We present three algorithms for Cholesky factorization using minimum block storage for a distributed memory (DM) environment. One of the distributed square blocked packed (SBP) format algorithms performs similarly to ScaLAPACK PDPOTRF, and with iteration overlapping outperforms it by as much as 67%. By storing the blocks in a standard contiguous way, we get better performing BLAS operations. Our ...
Quality improvement and shelf life extension of fresh apricot fruit (Prunus Armeniaca cv. Shahroudi) using postharvest chemical treatments and packaging during cold storage
The main objective of this work was to assess the effectiveness of salicylic acid (SA), calcium chloride (CaCl2) or sodium bicarbonate (NaHCO3), and packaging on some qualitative properties of apricot fruit during cold storage. The experiments were conducted using a completely randomized design as factorial, with three replicates. Fruits were dipped in SA (0.1 or 0.5 mM), CaCl2 (1 or 2%) or NaH...