Sparse matrix–vector multiplication (SpMV) is one of the most important kernels in high-performance computing (HPC), yet it typically performs poorly on many devices. To achieve good performance, SpMV requires special care in storage and tuning for a given device. Moreover, HPC is facing heterogeneous hardware containing multiple different compute units, e.g., many-core CPUs and GPUs. Therefore, an emerging ...