GPU accelerated sparse matrix-vector multiplication and sparse matrix-transpose vector multiplication
Authors
Abstract
Many high performance computing applications require computing both the sparse matrix-vector product (SMVP) and the sparse matrix-transpose vector product (SMTVP) for better overall performance. Under such a circumstance, it is critical to sustain a similarly high throughput for these two computing patterns with the underlying sparse matrix encoded in a single storage format. The compressed sparse block (CSB) format proposed by Buluç et al. allows both products to be computed on multi-core CPUs with nearly identical throughputs. On the other hand, a direct porting of CSB to graphics processing units (GPUs), which have recently been recognized as a powerful general purpose computing platform, turns out to be inefficient. In this work, we propose a new data structure, designated as expanded CSB (eCSB), that minimizes the throughput gap between SMVP and SMTVP computations on GPUs while at the same time enabling a high computing throughput. We also use a hybrid storage format for the elements within each block, which can be selected dynamically at runtime. Experimental results show that the proposed techniques implemented on a Kepler GPU deliver similar throughput for both SMVP and SMTVP, up to 13 times higher than that of the CPU-based CSB implementation. In addition, our eCSB procedure outperforms previous GPU results by up to 188% and 914% in computing SMVP and SMTVP, respectively. We further validate the effectiveness of eCSB by measuring the wall-clock time of the bi-conjugate gradient algorithm, in which eCSB is 25% faster than compressed sparse row (CSR) and 6% faster than HYB. Copyright © 2014 John Wiley & Sons, Ltd.
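As background to the abstract's claim, the gap between SMVP and SMTVP arises because a row-oriented format serves one direction well and the other poorly. The following CUDA sketch is a minimal, hypothetical illustration of that asymmetry using plain CSR arrays; it is not the paper's eCSB or CSB kernel, and all names in it (csr_smvp, csr_smtvp, the 3x3 example matrix) are illustrative assumptions. With the same CSR data, y = Ax needs no synchronization, while y = A^T x must scatter into the output with atomics, which is the gap block-based formats such as CSB/eCSB aim to close.

#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// One thread per row: y = A*x with A in CSR. Each thread writes a private
// y[row], so no synchronization is needed.
__global__ void csr_smvp(int nrows, const int *rowPtr, const int *colIdx,
                         const float *val, const float *x, float *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= nrows) return;
    float sum = 0.0f;
    for (int j = rowPtr[row]; j < rowPtr[row + 1]; ++j)
        sum += val[j] * x[colIdx[j]];
    y[row] = sum;
}

// y = A^T * x from the SAME CSR arrays: each nonzero scatters into
// y[colIdx[j]], so concurrent threads may target the same output element
// and atomics become necessary -- the source of the SMVP/SMTVP gap.
__global__ void csr_smtvp(int nrows, const int *rowPtr, const int *colIdx,
                          const float *val, const float *x, float *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= nrows) return;
    for (int j = rowPtr[row]; j < rowPtr[row + 1]; ++j)
        atomicAdd(&y[colIdx[j]], val[j] * x[row]);
}

int main() {
    // 3x3 example matrix: [10 0 0; 0 20 30; 0 0 40], x = [1 2 3].
    std::vector<int>   rowPtr = {0, 1, 3, 4};
    std::vector<int>   colIdx = {0, 1, 2, 2};
    std::vector<float> val    = {10, 20, 30, 40};
    std::vector<float> x      = {1, 2, 3};
    int n = 3, nnz = 4;

    int *dRow, *dCol; float *dVal, *dX, *dY, *dYt;
    cudaMalloc(&dRow, (n + 1) * sizeof(int));
    cudaMalloc(&dCol, nnz * sizeof(int));
    cudaMalloc(&dVal, nnz * sizeof(float));
    cudaMalloc(&dX, n * sizeof(float));
    cudaMalloc(&dY, n * sizeof(float));
    cudaMalloc(&dYt, n * sizeof(float));
    cudaMemcpy(dRow, rowPtr.data(), (n + 1) * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dCol, colIdx.data(), nnz * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dVal, val.data(), nnz * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dX, x.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dY, 0, n * sizeof(float));
    cudaMemset(dYt, 0, n * sizeof(float));

    csr_smvp<<<1, 32>>>(n, dRow, dCol, dVal, dX, dY);
    csr_smtvp<<<1, 32>>>(n, dRow, dCol, dVal, dX, dYt);

    std::vector<float> y(n), yt(n);
    cudaMemcpy(y.data(), dY, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(yt.data(), dYt, n * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i)
        printf("y[%d] = %g   yT[%d] = %g\n", i, y[i], i, yt[i]);   // expect y = 10 130 120, yT = 10 40 180
    return 0;
}

For this example the two kernels should print y = (10, 130, 120) and yT = (10, 40, 180). A CSB-style layout instead partitions the matrix into square blocks indexed by both row and column, so either product can be parallelized over blocks without the atomic scatter shown above.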
Similar Resources
Sparse Matrix-Vector Multiplication on NVIDIA GPU
In this paper, we present our work on developing a new matrix format and a new sparse matrix-vector multiplication algorithm. The matrix format is HEC, which is a hybrid format. This matrix format is efficient for sparse matrix-vector multiplication and is friendly to preconditioners. Numerical experiments show that our sparse matrix-vector multiplication algorithm is efficient on...
Techniques for Parallel Manipulation of Sparse Matrices
New techniques are presented for the manipulation of sparse matrices on parallel MIMD computers. We consider the following problems: matrix addition, matrix multiplication, row and column permutation, matrix transpose, matrix vector multiplication, and Gaussian elimination.
Transposition Mechanism for Sparse Matrices on Vector Processors
Many scientific applications involve operations on sparse matrices. However, due to irregularities induced by the sparsity patterns, many operations on sparse matrices execute inefficiently on traditional scalar and vector architectures. To tackle this problem a scheme has been proposed consisting of two parts: (a) An extension to a vector architecture to support sparse matrix-vector multiplica...
Fast Sparse Matrix-Vector Multiplication on GPUs: Implications for Graph Mining
Scaling up the sparse matrix-vector multiplication kernel on modern Graphics Processing Units (GPUs) has been at the heart of numerous studies in both academia and industry. In this article we present a novel non-parametric, self-tunable approach to data representation for computing this kernel, particularly targeting sparse matrices representing power-law graphs. Using real web graph data, we s...
Journal: Concurrency and Computation: Practice and Experience
Volume: 27, Issue: -
Pages: -
Publication date: 2015