SpMV Profiling and Optimization Analysis
Abstract
Sparse matrix-vector multiplication (SpMV) is a key operation in sparse matrix computations. Very large, sparse matrices arise in many engineering and scientific applications, so the matrix must be partitioned properly. Even when the matrix is partitioned and stored appropriately, the achieved performance may still fall short. To address this, the system proposes an integrated analytical and profile-based performance model that accurately measures the kernel execution time of various SpMV CUDA kernels for a given target sparse matrix. Based on this model, an auto-selection algorithm automatically reports the optimal SpMV solution for the target sparse matrix. The system is evaluated on NVIDIA GeForce GTX 680 and NVIDIA Quadro 8000 GPUs, and is further extended to one more matrix storage format.
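The profile-based half of this idea can be illustrated with a minimal sketch. It is not the authors' model or algorithm: it uses pure-Python stand-ins for CUDA kernels, covers only two storage formats (CSR and COO), and picks the fastest by direct timing with no analytical component. All function names (`to_csr`, `spmv_csr`, `to_coo`, `spmv_coo`, `select_fastest`) are hypothetical and chosen for illustration only.

```python
import time


def to_csr(dense):
    """Convert a dense list-of-lists matrix to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_csr(csr, x):
    """y = A @ x with A stored in CSR form."""
    values, col_idx, row_ptr = csr
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def to_coo(dense):
    """Convert a dense matrix to a list of (row, col, value) triples."""
    entries = [(i, j, v)
               for i, row in enumerate(dense)
               for j, v in enumerate(row) if v != 0]
    return entries, len(dense)


def spmv_coo(coo, x):
    """y = A @ x with A stored in COO form."""
    entries, n_rows = coo
    y = [0.0] * n_rows
    for i, j, v in entries:
        y[i] += v * x[j]
    return y


def select_fastest(dense, x, repeats=5):
    """Time each candidate kernel on the target matrix; return the fastest.

    Assumption: the best-of-`repeats` wall-clock time stands in for a
    measured kernel execution time.
    """
    candidates = {"csr": (to_csr, spmv_csr), "coo": (to_coo, spmv_coo)}
    timings = {}
    for name, (convert, kernel) in candidates.items():
        stored = convert(dense)  # format conversion is not timed
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(stored, x)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return min(timings, key=timings.get), timings
```

Calling `select_fastest(A, x)` returns the name of the fastest format together with the per-format timings. A real system would replace the Python kernels with GPU kernels and could combine the measurements with an analytical cost model, as the abstract describes.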
Similar resources
A lightweight optimization selection method for Sparse Matrix-Vector Multiplication
In this paper, we propose an optimization selection methodology for the ubiquitous sparse matrix-vector multiplication (SpMV) kernel. We propose two models that attempt to identify the major performance bottleneck of the kernel for every instance of the problem and then select an appropriate optimization to tackle it. Our first model requires online profiling of the input matrix in order to det...
A Survey on Performance Modelling and Optimization Techniques for SpMV on GPUs
A sparse matrix is a matrix with very few non-zero entries. Large sparse matrices are often used in engineering and scientific operations. In particular, sparse matrix-vector multiplication is an important operation for solving linear systems and partial differential equations. However, there is a possibility that even though the matrix is partitioned and stored appropriately, the performance...
A hybrid computing method of SpMV on CPU-GPU heterogeneous computing systems
Sparse matrix–vector multiplication (SpMV) is an important issue in scientific computing and engineering applications. The performance of SpMV can be improved using parallel computing. The implementation and optimization of SpMV on GPU are research hotspots. Due to some irregularities of sparse matrices, the use of a single compression format is not satisfactory. The hybrid storage format can exp...
The sparse matrix vector product on GPUs
The sparse matrix vector product (SpMV) is a paramount operation in engineering and scientific computing and, hence, has been a subject of intense research for long. The irregular computations involved in SpMV make its optimization challenging. Therefore, enormous effort has been devoted to devise data formats to store the sparse matrix with the ultimate aim of maximizing the performance. The G...
Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors
The low utilization of SIMD units and memory bandwidth is the main performance bottleneck on SIMD processors for sparse matrix-vector multiplication (SpMV), which is one of the most important kernels in many scientific and engineering applications. This paper proposes a hybrid optimization method to break the performance bottleneck of SpMV on SIMD processors. The method includes a new sparse ma...