An Auto-tuning Method for Run-time Data Transformation for Sparse Matrix-Vector Multiplication

Authors

TAKAHIRO KATAGIRI

MASAHIKO SATO

Abstract

In this paper, we study run-time transformation of sparse matrix data from Compressed Row Storage (CRS) to Coordinate (COO) storage and to the ELL (ELLPACK/ITPACK) format, with OpenMP parallelization, for sparse matrix-vector multiplication (SpMV). We propose an auto-tuning (AT) method based on the D_mat-R_ell graph, which plots the deviation/average of the number of non-zero elements per row (D_mat) against the ratio of SpMV speedup to the transformation time from CRS to ELL (R_ell). The experimental results show that the ELL format is very effective on the Earth Simulator 2: a speedup factor of 151 is obtained with the ELL-Row inner-parallelized format. The transformation overhead is also very small, ranging from 0.01 to 1.0 times the SpMV time with the CRS format. In addition, the D_mat-R_ell graph can be used to model the effectiveness of the transformation according to the D_mat value.
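To make the formats named in the abstract concrete, the following is a minimal sequential sketch of a CRS-to-ELL transformation, SpMV in both formats, and a D_mat-style statistic. It is not the authors' implementation: the function names (`crs_to_ell`, `spmv_crs`, `spmv_ell`, `d_mat`) are hypothetical, OpenMP parallelization is omitted, and D_mat is assumed here to mean the population standard deviation of the non-zeros per row divided by their mean, which is one reading of the abstract's "deviation/average".

```python
from statistics import mean, pstdev


def crs_to_ell(row_ptr, col_idx, values):
    """Convert CRS arrays to ELL: every row is padded to the longest row."""
    n_rows = len(row_ptr) - 1
    width = max((row_ptr[i + 1] - row_ptr[i] for i in range(n_rows)), default=0)
    # Padding entries hold value 0.0 and column 0, so they add nothing to y.
    ell_val = [[0.0] * width for _ in range(n_rows)]
    ell_col = [[0] * width for _ in range(n_rows)]
    for i in range(n_rows):
        for k, p in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            ell_val[i][k] = values[p]
            ell_col[i][k] = col_idx[p]
    return ell_val, ell_col


def spmv_crs(row_ptr, col_idx, values, x):
    """y = A x with A stored in CRS."""
    return [
        sum(values[p] * x[col_idx[p]] for p in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


def spmv_ell(ell_val, ell_col, x):
    """y = A x with A stored in ELL (fixed number of entries per row)."""
    return [
        sum(v * x[c] for v, c in zip(row_val, row_col))
        for row_val, row_col in zip(ell_val, ell_col)
    ]


def d_mat(row_ptr):
    """Assumed definition: std. deviation / mean of non-zeros per row."""
    counts = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    return pstdev(counts) / mean(counts)


# Example matrix (3 x 3):
#   [4 0 0]
#   [1 5 0]
#   [0 2 6]
row_ptr = [0, 1, 3, 5]
col_idx = [0, 0, 1, 1, 2]
values = [4.0, 1.0, 5.0, 2.0, 6.0]
x = [1.0, 2.0, 3.0]

ell_val, ell_col = crs_to_ell(row_ptr, col_idx, values)
y_crs = spmv_crs(row_ptr, col_idx, values, x)
y_ell = spmv_ell(ell_val, ell_col, x)
```

Under this reading, a small D_mat means rows have similar lengths, so ELL padding wastes little storage; a large D_mat means a few long rows force heavy padding. An auto-tuner in the spirit of the abstract would compare such a statistic against a threshold before paying the transformation cost, but the threshold itself would have to come from measurements on the target machine.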

Similar resources

Energy-efficient Sparse Matrix Auto-tuning with CSX

This whitepaper describes the programming techniques used to develop an auto-tuning compression scheme for sparse matrices with respect to accelerating matrix-vector multiplication and minimizing its energy footprint, as well as a method for extracting a power profile from a corresponding implementation of the conjugate gradient method. Using two example systems, we show how these techniques ca...

Autotuning Sparse Matrix-Vector Multiplication for Multicore

Sparse matrix-vector multiplication (SpMV) is an important kernel in scientific and engineering computing. Straightforward parallel implementations of SpMV often perform poorly, and with the increasing variety of architectural features in multicore processors, it is getting more difficult to determine the sparse matrix data structure and corresponding SpMV implementation that optimize performan...

SMAT: An Input Adaptive Sparse Matrix-Vector Multiplication Auto-Tuner

Sparse matrix-vector multiplication (SpMV) is an important kernel in scientific and engineering applications. Previous optimizations are specific to a sparse matrix format and leave the choice of the best format to application programmers. In this work we develop an auto-tuning framework to bridge the gap between the specific optimized kernels and their general-purpose use. We propose an SpMV autot...

Yet another Hybrid Strategy for Auto-tuning SpMV on GPUs

Sparse matrix-vector multiplication (SpMV) is a key linear algebra algorithm and is widely used in many application domains. Besides multi-core architecture, there is also extensive research focusing on accelerating SpMV on many-core Graphics Processing Units (GPUs). SpMV computations have many indirect and irregular memory accesses, and load imbalance could occur while mapping computations ont...

A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure

The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that can run on a Fibonacci Hypercube structure. Most popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure; therefore, a method that can run on all structures, especially the Fibonacci Hypercube structure, is necessary for parallel matr...

Publication date: 2011
