SPMSD: A Partitioning Strategy for Parallel General Sparse Matrix-Matrix Multiplication on GPU


Bibliographic Details
Published in: Parallel Processing Letters, Vol. 34, No. 2
Main Authors: Cui, Huanyu; Wang, Nianbin; Han, Qilong; Wang, Ye
Format: Journal Article
Language: English
Published: Singapore: World Scientific Publishing Company, 01.06.2024
Publisher: World Scientific Publishing Co. Pte., Ltd
ISSN: 0129-6264, 1793-642X
DOI: 10.1142/S012962642450004X

More Information
Summary: SpGEMM (General Sparse Matrix-Matrix Multiplication) is a key kernel in algebraic multigrid methods, graph algorithms, and the solution of linear equations. Due to the non-uniformity of some sparse matrices, existing parallel SpGEMM algorithms suffer from load imbalance, leading to a decrease in computational efficiency. This paper proposes a new algorithm, SPMSD (SpGEMM Based on Minimum Standard Deviation), built on a hash table and a partition strategy. First, the intermediate results of the matrix product are divided into multiple blocks by a new partition strategy that minimizes the standard deviation among blocks. Second, the input matrix is transformed according to the result of the partition strategy. Finally, SPMSD performs the parallel computation of SpGEMM, exploiting the fast insertion and fast access of the hash table; the calculation process controls the insertion and merging of intermediate results according to offsets, avoiding the overhead of atomic operations. Experiments indicate that SPMSD executes up to 7.4x faster than the existing cuSPARSE library. Compared with the Out-of-Core method, SPMSD improves computational performance by 1.2x while its memory utilization is decreased by 0.19x.
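
The following is a minimal host-side sketch of the load-balancing idea summarized above, assuming CSR inputs: estimate the number of intermediate products per row of C = A*B, then split the rows into contiguous blocks whose totals are as even as possible. The per-row estimate and the greedy split shown here are illustrative assumptions, not the authors' exact SPMSD partition procedure or its GPU implementation.

// Hypothetical illustration of balancing intermediate-result counts across blocks.
#include <cstdio>
#include <vector>

struct Csr {                       // minimal CSR container for the sketch
    int rows;
    std::vector<int> rowPtr;       // size rows + 1
    std::vector<int> colIdx;       // size nnz
};

// Intermediate products for row i of A*B: sum over nonzeros A(i,k) of nnz(B row k).
std::vector<long long> rowFlops(const Csr& A, const Csr& B) {
    std::vector<long long> flops(A.rows, 0);
    for (int i = 0; i < A.rows; ++i)
        for (int p = A.rowPtr[i]; p < A.rowPtr[i + 1]; ++p) {
            int k = A.colIdx[p];
            flops[i] += B.rowPtr[k + 1] - B.rowPtr[k];
        }
    return flops;
}

// Greedy contiguous partition of rows into `blocks` groups: a block is closed once
// the running total reaches the per-block average, keeping block sums close to each
// other (and hence their standard deviation small).
std::vector<int> partitionRows(const std::vector<long long>& flops, int blocks) {
    long long total = 0;
    for (long long f : flops) total += f;
    double target = static_cast<double>(total) / blocks;

    std::vector<int> bounds{0};    // row index where each block starts
    long long acc = 0;
    for (int i = 0; i < static_cast<int>(flops.size()); ++i) {
        acc += flops[i];
        if (acc >= target * bounds.size() && static_cast<int>(bounds.size()) < blocks)
            bounds.push_back(i + 1);
    }
    bounds.push_back(static_cast<int>(flops.size()));
    return bounds;                 // block b covers rows [bounds[b], bounds[b+1])
}

int main() {
    // Toy 4x4 matrices; in practice A and B would be the SpGEMM inputs.
    Csr A{4, {0, 2, 3, 6, 7}, {0, 2, 1, 0, 1, 3, 2}};
    Csr B{4, {0, 1, 3, 4, 6}, {1, 0, 2, 3, 1, 2}};

    std::vector<long long> flops = rowFlops(A, B);
    std::vector<int> bounds = partitionRows(flops, 2);

    for (size_t b = 0; b + 1 < bounds.size(); ++b) {
        long long sum = 0;
        for (int i = bounds[b]; i < bounds[b + 1]; ++i) sum += flops[i];
        std::printf("block %zu: rows [%d, %d), %lld intermediate products\n",
                    b, bounds[b], bounds[b + 1], sum);
    }
    return 0;
}

In an actual GPU implementation, each block of rows would then be assigned to a thread block that accumulates its intermediate results in a per-block hash table, as the summary describes.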