Performance Evaluation of Multithreaded Sparse Matrix-Vector Multiplication Using OpenMP
| Published in | 2009 11th IEEE International Conference on High Performance Computing and Communications, pp. 659-665 |
|---|---|
| Main Authors | , , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE 01.06.2009 |
| Subjects | |
| ISBN | 1424446007 9781424446001 |
| DOI | 10.1109/HPCC.2009.75 |
| Summary: | Sparse matrix-vector multiplication is an important computational kernel in scientific applications. However, it performs poorly on modern processors because of its low compute-to-memory ratio and irregular memory access patterns. This paper discusses implementations of the sparse matrix-vector multiplication algorithm using OpenMP to execute iterative methods on the Dawning S4800A1. Two storage formats for sparse matrices (CSR and BCSR) and the three scheduling schemes provided by standard OpenMP (static, dynamic and guided) are evaluated. These three schemes are also compared with non-zero scheduling, in which each thread is assigned approximately the same number of non-zero elements. Experimental data show that non-zero scheduling provides the best performance in most cases. The current implementation provides satisfactory scalability for most matrices. However, only a limited speedup is obtained for some large matrices that contain millions of non-zero elements. |
|---|---|
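
The non-zero scheduling mentioned in the summary can be illustrated with a small sketch. This is not the paper's OpenMP code: it is a plain Python illustration of one plausible way to split the rows of a CSR matrix into contiguous chunks with roughly equal non-zero counts. The function names `partition_by_nonzeros` and `csr_spmv_rows`, and the choice to cut at row boundaries with a binary search over the row-pointer array, are assumptions made for this example.

```python
from bisect import bisect_left


def partition_by_nonzeros(row_ptr, num_threads):
    """Return num_threads + 1 row boundaries so that each contiguous
    chunk of rows holds roughly the same number of non-zeros.

    row_ptr is the CSR row-pointer array: row i owns the entries
    row_ptr[i] .. row_ptr[i + 1] - 1. (Hypothetical helper.)
    """
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    bounds = [0]
    for t in range(1, num_threads):
        target = (nnz * t) // num_threads
        # First row boundary whose cumulative non-zero count reaches the target.
        row = bisect_left(row_ptr, target)
        # Keep boundaries non-decreasing and inside the matrix.
        bounds.append(max(bounds[-1], min(row, n_rows)))
    bounds.append(n_rows)
    return bounds


def csr_spmv_rows(row_ptr, col_idx, values, x, row_begin, row_end):
    """Compute the entries y[row_begin:row_end] of y = A * x for a CSR
    matrix A. In a threaded setting, each thread would run this on its
    own row range. (Hypothetical helper.)
    """
    y = []
    for i in range(row_begin, row_end):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y
```

For a matrix whose five rows hold 6, 1, 1, 2 and 2 non-zeros (`row_ptr = [0, 6, 7, 8, 10, 12]`), splitting for two threads puts row 0 alone in the first chunk and rows 1 to 4 in the second, so each chunk carries 6 non-zeros. An even split by row count would instead leave one thread with noticeably more non-zeros than the other.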