Optimization of Sparse Matrix-Vector Multiplication with Variant CSR on GPUs

Bibliographic Details
Published in: 2011 IEEE 17th International Conference on Parallel and Distributed Systems, pp. 165-172
Main Authors: Xiaowen Feng, Hai Jin, Ran Zheng, Kan Hu, Jingxiang Zeng, Zhiyuan Shao
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2011
ISBN: 1457718758, 9781457718755
ISSN: 1521-9097
DOI: 10.1109/ICPADS.2011.91

Abstract: Sparse matrix-vector multiplication (SpMV) is one of the most significant yet challenging kernels in computational science. It is a memory-bound operation whose performance depends mostly on the input matrix and the underlying architecture. Many researchers have explored a variety of optimization techniques for SpMV. One of the most promising directions is adapting the storage format to suit the underlying architecture. Alternative storage formats can greatly reduce memory pressure; however, computational resources are often left underutilized. Therefore, a new storage format, called Compressed Sparse Row with Segmented Interleave Combination (SIC), is proposed. Derived from the Compressed Sparse Row (CSR) format, SIC employs an interleave combination pattern that combines a certain number of CSR rows to form a new SIC row. To further improve performance, segmented processing is also introduced. Based on empirical data, we also develop an automatic SIC-based SpMV suitable for all matrices. Experimental results show that our approach outperforms the NVIDIA CSR vector kernel, achieving up to 12.6× speedup. It also delivers performance comparable to the Hybrid format, with a speedup of up to 2.89×.
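For readers unfamiliar with the CSR baseline named in the abstract, the following is a minimal CPU sketch in Python of CSR-based SpMV, followed by one possible reading of "interleaving" several CSR rows into a combined row. The names `csr_spmv` and `interleave_rows`, the round-robin ordering, and the example matrix are assumptions made for illustration; the abstract does not specify the actual SIC layout or its segmented processing, so this should not be taken as the authors' format or kernel.

```python
def csr_spmv(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        # Nonzeros of `row` occupy positions row_ptr[row] .. row_ptr[row + 1] - 1.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += vals[k] * x[col_idx[k]]
    return y


def interleave_rows(row_ptr, col_idx, vals, first_row, group):
    """Hypothetical illustration only: merge `group` consecutive CSR rows
    round-robin into one combined row of (local_row, col, val) entries.
    This is an assumed layout, not the SIC format from the paper."""
    last_row = min(first_row + group, len(row_ptr) - 1)
    starts = [row_ptr[r] for r in range(first_row, last_row)]
    ends = [row_ptr[r + 1] for r in range(first_row, last_row)]
    longest = max((e - s for s, e in zip(starts, ends)), default=0)
    combined = []
    for j in range(longest):
        # Take the j-th nonzero of each row in turn, skipping exhausted rows.
        for local, (s, e) in enumerate(zip(starts, ends)):
            if s + j < e:
                combined.append((local, col_idx[s + j], vals[s + j]))
    return combined


# Example matrix (an assumption for illustration):
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 5, 6]]
row_ptr = [0, 2, 3, 6]
col_idx = [0, 2, 1, 0, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 1.0, 1.0]

y = csr_spmv(row_ptr, col_idx, vals, x)
combined = interleave_rows(row_ptr, col_idx, vals, first_row=0, group=2)
```

In this sketch, interleaving places the j-th nonzeros of neighboring rows next to each other in memory, which is the general kind of access pattern that favors coalesced loads on a GPU; how the paper chooses the group size and handles rows of unequal length is not stated in the abstract.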
Author Affiliations: Xiaowen Feng, Hai Jin (hjin@hust.edu.cn), Ran Zheng, Kan Hu, Jingxiang Zeng, and Zhiyuan Shao are all with the Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China.
EISBN: 9780769545769, 0769545769
Subject Terms: Compress Sparse Row; Computer architecture; GPU; Graphics processing unit; Instruction sets; Interleaved Row Combination; Kernel; Segmented Processing; Silicon carbide; Sparse matrices; Sparse Matrix-Vector Multiplication; Vectors
URI https://ieeexplore.ieee.org/document/6121274