Optimization of a Statically Partitioned Hypermatrix Sparse Cholesky Factorization
Authors
Abstract
The sparse Cholesky factorization of some large matrices can require a two-dimensional partitioning of the matrix. The sparse hypermatrix storage scheme produces a recursive 2D partitioning of a sparse matrix. The subblocks are stored as dense matrices so that BLAS3 routines can be used. However, since we are dealing with sparse matrices, some zeros may be stored in those dense blocks. The overhead introduced by the operations on zeros can become large and considerably degrade performance. In this paper we present an improvement to our sequential in-core implementation of a sparse Cholesky factorization based on a hypermatrix storage structure. We compare its performance with that of several other codes and analyze the results.
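As a rough illustration of the storage scheme described above, the sketch below shows one possible C representation of a hypermatrix node: upper levels hold matrices of pointers, with null pointers standing for all-zero subblocks, and the lowest level holds dense data submatrices on which BLAS3 routines can operate. The type and field names are hypothetical and are not taken from the paper's implementation.

/* Hypothetical sketch of a hypermatrix node; the actual layout used in the
 * paper may differ.  Upper levels are matrices of pointers, where a NULL
 * entry marks an all-zero subblock; the lowest level stores a dense data
 * submatrix so that BLAS3 kernels can be applied to it directly. */
typedef struct hm_node {
    int level;                   /* 0 = leaf holding a dense data submatrix */
    int rows, cols;              /* dimensions of the pointer grid or block */
    union {
        struct hm_node **child;  /* rows*cols pointers; NULL => zero block  */
        double *data;            /* column-major dense block at the leaves  */
    } u;
} hm_node;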
Similar papers
Advances in Sparse Hypermatrix Cholesky Factorization
We present our work on the sparse Cholesky factorization using a hypermatrix data structure. First, we provide some background on the sparse Cholesky factorization and explain the hypermatrix data structure. Next, we present the matrix test suite used. Afterwards, we present the techniques we have developed in pursuit of performance improvements for the sparse hypermatrix Cholesky factorization...
Intra-Block Amalgamation in Sparse Hypermatrix Cholesky Factorization
In this paper we present an improvement to our sequential in-core implementation of a sparse Cholesky factorization based on a hypermatrix storage structure. We allow the inclusion of additional zeros in data submatrices to create larger blocks and in this way use more efficient routines for matrix multiplication. Since matrix multiplication takes about 90% of the total factorizati...
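Since the block-by-block updates dominate the factorization time, each update of a dense target submatrix by two dense data submatrices can be handed to a single BLAS3 call. The sketch below, assuming a CBLAS interface and column-major blocks, shows the C := C - A * B^T form such an update takes; the helper name and block dimensions are illustrative, not the paper's code.

#include <cblas.h>

/* Illustrative only: update of a dense target block, C := C - A * B^T,
 * as it arises in a blocked Cholesky factorization.  After amalgamation
 * the blocks may contain explicit zeros, but the whole block is still
 * processed by one efficient dgemm call. */
static void block_update(int m, int n, int k,
                         const double *A, int lda,
                         const double *B, int ldb,
                         double *C, int ldc)
{
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasTrans,
                m, n, k, -1.0, A, lda, B, ldb, 1.0, C, ldc);
}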
Reducing Overhead in Sparse Hypermatrix Cholesky Factorization
The sparse hypermatrix storage scheme produces a recursive 2D partitioning of a sparse matrix. Data subblocks are stored as dense matrices. Since we are dealing with sparse matrices some zeros can be stored in those dense blocks. The overhead introduced by the operations on zeros can become really large and considerably degrade performance. In this paper, we present several techniques for reduc...
Improving Performance of Hypermatrix Cholesky Factorization
This paper shows how a sparse hypermatrix Cholesky factorization can be improved. This is accomplished by means of efficient codes which operate on very small dense matrices. Different matrix sizes or target platforms may require different codes to obtain good performance. We write a set of codes for each matrix operation using different loop orders and unroll factors. Then, for each matrix siz...
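As a hedged illustration of such a specialized kernel (not the authors' generated code), the variant below computes C := C - A * B^T for fixed 4x4 column-major blocks, with the innermost row loop fully unrolled. In the approach described above, many variants like this, differing in loop order and unroll factor, would be written and timed, keeping the fastest one for each block size and target platform.

#define NB 4
/* One hand-unrolled variant of C := C - A * B^T for NB x NB column-major
 * blocks.  The loop order and unroll factor shown are just one point in
 * the search space; the routine name and block size are illustrative. */
static void small_mxmts(const double *A, const double *B, double *C)
{
    for (int j = 0; j < NB; j++)
        for (int k = 0; k < NB; k++) {
            double b = B[j + k * NB];           /* B^T(k, j) = B(j, k) */
            C[0 + j * NB] -= A[0 + k * NB] * b;
            C[1 + j * NB] -= A[1 + k * NB] * b;
            C[2 + j * NB] -= A[2 + k * NB] * b;
            C[3 + j * NB] -= A[3 + k * NB] * b;
        }
}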
Sparse Hypermatrix Cholesky: Customization for High Performance
Efficient execution of numerical algorithms requires adapting the code to the underlying execution platform. In this paper we show the process of fine tuning our sparse Hypermatrix Cholesky factorization in order to exploit efficiently two important machine resources: processor and memory. Using the techniques we presented in previous papers we tune our code on a different platform. Then, we ex...