Optimization of Sparse Matrix-Vector Multiplication with Variant CSR on GPUs
| Published in | 2011 IEEE 17th International Conference on Parallel and Distributed Systems, pp. 165-172 |
|---|---|
| Main Authors | Xiaowen Feng, Hai Jin, Ran Zheng, Kan Hu, Jingxiang Zeng, Zhiyuan Shao |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.12.2011 |
| ISBN | 1457718758, 9781457718755 |
| ISSN | 1521-9097 |
| DOI | 10.1109/ICPADS.2011.91 |
| Abstract | Sparse matrix-vector multiplication (SpMV) is one of the most important yet challenging kernels in computational science. It is a memory-bound operation whose performance depends mostly on the input matrix and the underlying architecture. Many researchers have explored a variety of optimization techniques for SpMV. One of the most promising directions is adapting the storage format to suit the underlying architecture. Alternative storage formats can greatly reduce memory pressure; however, computational resources are often underutilized. Therefore, a new storage format, called Compressed Sparse Row with Segmented Interleave Combination (SIC), is proposed. Derived from the Compressed Sparse Row (CSR) format, SIC employs an interleave combination pattern that combines a certain number of CSR rows to form a new SIC row. To further improve performance, segmented processing is also introduced. Based on empirical data, we also develop an automatic SIC-based SpMV suitable for all the matrices. Experimental results show that our approach outperforms the NVIDIA CSR vector kernel, achieving up to 12.6× speedup. It also demonstrates performance comparable to the Hybrid format, with a speedup of up to 2.89×. | 
    
|---|---|
| Author | Hai Jin Kan Hu Jingxiang Zeng Ran Zheng Zhiyuan Shao Xiaowen Feng  | 
    
| Author_xml | – sequence: 1 surname: Xiaowen Feng fullname: Xiaowen Feng organization: Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China – sequence: 2 surname: Hai Jin fullname: Hai Jin email: hjin@hust.edu.cn organization: Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China – sequence: 3 surname: Ran Zheng fullname: Ran Zheng organization: Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China – sequence: 4 surname: Kan Hu fullname: Kan Hu organization: Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China – sequence: 5 surname: Jingxiang Zeng fullname: Jingxiang Zeng organization: Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China – sequence: 6 surname: Zhiyuan Shao fullname: Zhiyuan Shao organization: Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China  | 
    
| ContentType | Conference Proceeding | 
    
| DBID | 6IE 6IL CBEJK RIE RIL  | 
    
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings Accès Toulouse INP et ENVT - IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present  | 
    
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher  | 
    
| DeliveryMethod | fulltext_linktorsrc | 
    
| Discipline | Computer Science | 
    
| EISBN | 9780769545769 0769545769  | 
    
| EndPage | 172 | 
    
| ExternalDocumentID | 6121274 | 
    
| Genre | orig-research | 
    
| GroupedDBID | 23M 29O 6IE 6IF 6IH 6IK 6IL 6IM 6IN AAJGR AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI OCL RIE RIL RNS  | 
    
| IEDL.DBID | RIE | 
    
| IngestDate | Wed Aug 27 03:46:13 EDT 2025 | 
    
| IsPeerReviewed | false | 
    
| IsScholarly | true | 
    
| LinkModel | DirectLink | 
    
| PageCount | 8 | 
    
| ParticipantIDs | ieee_primary_6121274 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2011-Dec. | 
    
| PublicationDateYYYYMMDD | 2011-12-01 | 
    
| PublicationDate_xml | – month: 12 year: 2011 text: 2011-Dec.  | 
    
| PublicationDecade | 2010 | 
    
| PublicationTitle | 2011 IEEE 17th International Conference on Parallel and Distributed Systems | 
    
| PublicationTitleAbbrev | icpads | 
    
| PublicationYear | 2011 | 
    
| Publisher | IEEE | 
    
| Publisher_xml | – name: IEEE | 
    
| SSID | ssj0020350 ssib026767514 ssj0000669466  | 
    
| Score | 2.0303154 | 
    
| SourceID | ieee | 
    
| SourceType | Publisher | 
    
| StartPage | 165 | 
    
| SubjectTerms | Compress Sparse Row; Computer architecture; GPU; Graphics processing unit; Instruction sets; Interleaved Row Combination; Kernel; Segmented Processing; Silicon carbide; Sparse matrices; Sparse Matrix-Vector Multiplication; Vectors | 
    
| Title | Optimization of Sparse Matrix-Vector Multiplication with Variant CSR on GPUs | 
    
| URI | https://ieeexplore.ieee.org/document/6121274 | 
    
| hasFullText | 1 | 
    
| inHoldings | 1 | 
    
| isFullTextHit | |
| isPrint | |
| linkProvider | IEEE | 
    