Performance Evaluation of Multithreaded Sparse Matrix-Vector Multiplication Using OpenMP

Bibliographic Details
Published in: 2009 11th IEEE International Conference on High Performance Computing and Communications, pp. 659-665
Main Authors: Shengfei Liu, Yunquan Zhang, Xiangzheng Sun, RongRong Qiu
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2009
ISBN: 1424446007, 9781424446001
DOI: 10.1109/HPCC.2009.75

Summary: Sparse matrix-vector multiplication is an important computational kernel in scientific applications. However, it performs poorly on modern processors because of its low compute-to-memory ratio and irregular memory access patterns. This paper discusses implementations of the sparse matrix-vector multiplication algorithm using OpenMP to execute iterative methods on the Dawning S4800A1. Two storage formats for sparse matrices (CSR and BCSR) and the three scheduling schemes provided by standard OpenMP (static, dynamic and guided) are evaluated. We also compare these three schemes with non-zero scheduling, in which each thread is assigned approximately the same number of non-zero elements. Experimental data show that non-zero scheduling provides the best performance in most cases. The current implementation scales satisfactorily for most matrices. However, we obtain only a limited speedup for some large matrices that contain millions of non-zero elements.
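
As an illustration of the non-zero scheduling idea in the summary, the following is a minimal Python sketch, not the paper's OpenMP implementation. It assumes the usual CSR layout (a row-pointer array, a column-index array and a value array) and splits the rows into contiguous chunks whose non-zero counts are roughly equal. The function names and the small example matrix are hypothetical, and the chunks are processed sequentially here, whereas in an OpenMP code each chunk would be handled by one thread.

```python
from bisect import bisect_left


def csr_spmv_rows(row_ptr, col_idx, values, x, y, row_begin, row_end):
    """Compute y[i] = sum_j A[i, j] * x[j] for rows in [row_begin, row_end)."""
    for i in range(row_begin, row_end):
        acc = 0.0
        # Non-zeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc


def partition_by_nonzeros(row_ptr, num_threads):
    """Return num_threads + 1 row boundaries with roughly equal non-zeros per chunk."""
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    bounds = [0]
    for t in range(1, num_threads):
        target = (nnz * t) // num_threads
        # First row boundary at which at least `target` non-zeros have been passed.
        r = bisect_left(row_ptr, target)
        bounds.append(max(bounds[-1], min(r, n_rows)))
    bounds.append(n_rows)
    return bounds


def spmv_nonzero_scheduled(row_ptr, col_idx, values, x, num_threads):
    """Multiply a CSR matrix by x, one non-zero-balanced row chunk at a time."""
    bounds = partition_by_nonzeros(row_ptr, num_threads)
    y = [0.0] * (len(row_ptr) - 1)
    # Each iteration stands in for the work of one thread.
    for t in range(num_threads):
        csr_spmv_rows(row_ptr, col_idx, values, x, y, bounds[t], bounds[t + 1])
    return y


# Hypothetical 5x5 example: row 0 holds four non-zeros, rows 1-4 hold one each.
row_ptr = [0, 4, 5, 6, 7, 8]
col_idx = [0, 1, 2, 3, 1, 2, 3, 4]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x = [1.0, 1.0, 1.0, 1.0, 1.0]

bounds = partition_by_nonzeros(row_ptr, 2)
y = spmv_nonzero_scheduled(row_ptr, col_idx, values, x, 2)
```

In this example, two chunks give the boundaries [0, 1, 5]: row 0 alone carries four non-zeros and rows 1 to 4 carry the other four. An even split by row count would instead give one chunk six non-zeros and the other two. Balance is limited by row granularity, so a single very dense row can still dominate one chunk.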