Optimization of Sparse Matrix-Vector Multiplication with Variant CSR on GPUs

Bibliographic Details
Published in	2011 IEEE 17th International Conference on Parallel and Distributed Systems, pp. 165-172
Main Authors	Xiaowen Feng, Hai Jin, Ran Zheng, Kan Hu, Jingxiang Zeng, Zhiyuan Shao
Format	Conference Proceeding
Language	English
Published	IEEE 01.12.2011
Subjects	Compress Sparse Row; Computer architecture; GPU; Graphics processing unit; Instruction sets; Interleaved Row Combination; Kernel; Segmented Processing; Silicon carbide; Sparse matrices; Sparse Matrix-Vector Multiplication; Vectors
ISBN	1457718758 9781457718755
ISSN	1521-9097
DOI	10.1109/ICPADS.2011.91

Abstract	Sparse matrix-vector multiplication (SpMV) is one of the most significant yet challenging problems in computational science. It is a memory-bound application whose performance depends mostly on the input matrix and the underlying architecture. Many researchers have explored a variety of optimization techniques for SpMV. One of the most promising directions is adapting the storage format to suit the underlying architecture. Alternative storage formats can greatly reduce memory pressure; however, computational resources are often underutilized. Therefore, a new storage format, Compressed Sparse Row with Segmented Interleave Combination (SIC), is proposed. Derived from the Compressed Sparse Row (CSR) format, SIC employs an interleave combination pattern that combines a certain number of CSR rows to form a new SIC row. To further improve performance, segmented processing is also introduced. Based on empirical data, we also develop an automatic SIC-based SpMV suitable for all matrices. Experimental results show that our approach outperforms the NVIDIA CSR vector kernel, achieving up to 12.6× speedup. It also demonstrates performance comparable to the Hybrid format, with a speedup of up to 2.89×.
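
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of SpMV over the standard CSR layout, written sequentially in Python for clarity. It is not the SIC format or the authors' GPU kernel, whose details are not given in this record. The function name `csr_spmv` and the array names `row_ptr`, `col_idx`, and `values` are illustrative assumptions.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx and values hold their column indices and values.
    (Names are illustrative, not taken from the paper.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Each x[col_idx[k]] access is an indirect, irregular read,
        # which is one reason SpMV tends to be memory-bound.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example 3x3 matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]
y = csr_spmv(row_ptr, col_idx, values, x)
```

Rows with very different nonzero counts give the inner loop uneven trip lengths. On a GPU, that imbalance is one common source of the underutilization the abstract mentions.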
Author affiliations	All six authors: Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China. Contact: Hai Jin (hjin@hust.edu.cn)
Discipline	Computer Science
EISBN	9780769545769 0769545769
IsPeerReviewed	false
IsScholarly	true
PageCount	8
URI	https://ieeexplore.ieee.org/document/6121274