Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
Similar resources
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdős-Rényi matrices,...
Coded Sparse Matrix Multiplication
In a large-scale and distributed matrix multiplication problem C = AB, where C ∈ R^{r×t}, the coded computation plays an important role to effectively deal with “stragglers” (distributed computations that may get delayed due to few slow or faulty processors). However, existing coded schemes could destroy the significant sparsity that exists in large-scale machine learning problems, and could resul...
Fast Sparse Matrix-Vector Multiplication by Exploiting Variable Block Structure
We improve the performance of sparse matrix-vector multiply (SpMV) on modern cache-based superscalar machines when the matrix structure consists of multiple, irregularly aligned rectangular blocks. Matrices from finite element modeling applications often have this kind of structure. Our technique splits the matrix, A, into a sum, A_1 + A_2 + ... + A_s, where each term is stored in a new data str...
Highly Parallel Sparse Matrix-Matrix Multiplication
Generalized sparse matrix-matrix multiplication is a key primitive for many high performance graph algorithms as well as some linear solvers such as multigrid. We present the first parallel algorithms that achieve increasing speedups for an unbounded number of processors. Our algorithms are based on two-dimensional block distribution of sparse matrices where serial sections use a novel hyperspa...
Opportunities for Parallelism in Matrix Multiplication
BLIS is a new framework for rapid instantiation of the BLAS. We describe how BLIS extends the “GotoBLAS approach” to implementing matrix multiplication (gemm). While gemm was previously implemented as three loops around an inner kernel, BLIS exposes two additional loops within that inner kernel, casting the computation in terms of the BLIS micro-kernel so that porting gemm becomes a matter of c...
Journal
Journal title: SIAM Journal on Scientific Computing
Year: 2016
ISSN: 1064-8275,1095-7197
DOI: 10.1137/15m104253x