RV-SCNN: A RISC-V Processor With Customized Instruction Set for SNN and CNN Inference Acceleration on Edge Platforms
| Published in | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 44, no. 4, pp. 1567-1580 |
|---|---|
| Main Authors | Wang, Xingbo; Feng, Chenxi; Kang, Xinyu; Wang, Qi; Huang, Yucong; Ye, Terry Tao |
| Format | Journal Article |
| Language | English |
| Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2025-04-01 |
| ISSN | 0278-0070 (print), 1937-4151 (electronic) |
| DOI | 10.1109/TCAD.2024.3472293 |
| Abstract | The rapid advancement of artificial intelligence (AI) applications has driven an increasing demand for conducting inference tasks on edge devices. However, implementing computation-intensive neural networks on resource-constrained edge systems remains a significant challenge. In this article, we propose a novel processor architecture called RV-SCNN to address this challenge. The architecture is based on the RISC-V generic instruction set and incorporates various single instruction multiple data (SIMD) custom instruction extensions to accelerate the computation of spike neural networks (SNNs) and convolutional neural networks (CNNs), enabling efficient execution of complex neural network models. The core operators of the processor are shared by both SNN and CNN operations, thus supporting both computation modes. Other acceleration implementations include an internal hardware loop control unit that reduces the instruction overhead, an address calculation unit and an interlayer fusion unit that minimize the memory access overhead, as well as an image to column (IM2COL) unit that improves the computational efficiency of the $3 \times 3$ convolutions in SNNs and CNNs. The custom instructions are called through inline assembly in the C program, providing higher flexibility compared to traditional ASICs and supporting custom complex SNN/CNN network structures. Compared to traditional instruction sets, the RV-SCNN processor reduces the execution time of CNNs and SNNs by over 90%. We validate the processor on an FPGA platform and evaluate its performance in a 55-nm CMOS process. The processor achieves an operational efficiency of 9.88 pJ/SOP in SNN inference tasks, while the peak energy efficiency reaches 679 GOPS/W in CNN inference. |
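
The abstract mentions an image-to-column (IM2COL) unit for 3 × 3 convolutions. As a rough software illustration of that transformation only, the following C sketch unrolls every 3 × 3 window of a single-channel image into one row of nine values and then computes the convolution as one dot product per row. The function names `im2col_3x3` and `conv3x3`, the int8 data type, and the stride-1, no-padding layout are assumptions made for this example; they are not taken from the article and do not describe the RV-SCNN hardware unit or its custom instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the article): copy every 3x3 window of an
 * h-by-w row-major image (stride 1, no padding) into one 9-element row of
 * cols. cols must hold (h-2)*(w-2)*9 elements. Returns the window count. */
size_t im2col_3x3(const int8_t *img, size_t h, size_t w, int8_t *cols)
{
    assert(img != NULL && cols != NULL);
    if (h < 3 || w < 3)
        return 0;

    size_t n = 0; /* index of the current window / output row */
    for (size_t r = 0; r + 3 <= h; ++r) {
        for (size_t c = 0; c + 3 <= w; ++c, ++n) {
            for (size_t kr = 0; kr < 3; ++kr)
                for (size_t kc = 0; kc < 3; ++kc)
                    cols[n * 9 + kr * 3 + kc] = img[(r + kr) * w + (c + kc)];
        }
    }
    return n;
}

/* Hypothetical helper: 3x3 convolution expressed as one 9-element dot
 * product per im2col row. out must hold (h-2)*(w-2) elements. */
void conv3x3(const int8_t *img, size_t h, size_t w,
             const int8_t kernel[9], int8_t *cols, int32_t *out)
{
    assert(kernel != NULL && out != NULL);
    size_t n = im2col_3x3(img, h, w, cols);
    for (size_t i = 0; i < n; ++i) {
        int32_t acc = 0;
        for (size_t k = 0; k < 9; ++k)
            acc += (int32_t)cols[i * 9 + k] * (int32_t)kernel[k];
        out[i] = acc;
    }
}
```

For a 4 × 4 image holding the values 1 to 16 in row-major order and an all-ones kernel, the four outputs are 54, 63, 90, and 99. Once the windows are laid out contiguously, the inner loop is a plain dot product, which is the kind of operation SIMD multiply-accumulate hardware handles well.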
    
|---|---|
    
| Author | Wang, Xingbo; Wang, Qi; Huang, Yucong; Ye, Terry Tao; Kang, Xinyu; Feng, Chenxi |
    
| Author_xml | – sequence: 1 givenname: Xingbo orcidid: 0000-0001-9349-2718 surname: Wang fullname: Wang, Xingbo organization: Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China – sequence: 2 givenname: Chenxi surname: Feng fullname: Feng, Chenxi organization: Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China – sequence: 3 givenname: Xinyu orcidid: 0000-0003-1640-6198 surname: Kang fullname: Kang, Xinyu organization: Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China – sequence: 4 givenname: Qi surname: Wang fullname: Wang, Qi organization: Department of Electrical and Computer Engineering, Southern University of Science and Technology, Shenzhen, China – sequence: 5 givenname: Yucong orcidid: 0000-0002-8499-4254 surname: Huang fullname: Huang, Yucong organization: Department of Electronic and Computer Engineering, Southern University of Science and Technology, Shenzhen, China – sequence: 6 givenname: Terry Tao orcidid: 0000-0002-4359-3550 surname: Ye fullname: Ye, Terry Tao email: yet@sustech.edu.cn organization: Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China  | 
    
    
| CODEN | ITCSDI | 
    
    
| ContentType | Journal Article | 
    
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025 | 
    
| Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025 | 
    
| DBID | 97E RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D  | 
    
| DOI | 10.1109/TCAD.2024.3472293 | 
    
| DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present; IEEE All-Society Periodicals Package (ASPP) 1998–Present; IEEE Electronic Library (IEL); CrossRef; Computer and Information Systems Abstracts; Electronics & Communications Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional |
    
| DatabaseTitle | CrossRef; Technology Research Database; Computer and Information Systems Abstracts – Academic; Electronics & Communications Abstracts; ProQuest Computer Science Collection; Computer and Information Systems Abstracts; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Professional |
    
| DatabaseTitleList | Technology Research Database | 
    
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher  | 
    
| DeliveryMethod | fulltext_linktorsrc | 
    
| Discipline | Engineering | 
    
| EISSN | 1937-4151 | 
    
| EndPage | 1580 | 
    
| ExternalDocumentID | 10_1109_TCAD_2024_3472293 10702560  | 
    
| Genre | orig-research | 
    
    
| ID | FETCH-LOGICAL-c318t-fb077d85322fac2dd4c69de6758cc5848e0fc0782b70595324065fc6df0f626f3 | 
    
| IEDL.DBID | RIE | 
    
| ISSN | 0278-0070 | 
    
| IngestDate | Thu Aug 14 02:12:08 EDT 2025; Wed Oct 01 06:36:40 EDT 2025; Wed Aug 27 01:38:21 EDT 2025 |
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | 4 | 
    
| Language | English | 
    
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037  | 
    
| LinkModel | DirectLink | 
    
| MergedId | FETCHMERGED-LOGICAL-c318t-fb077d85322fac2dd4c69de6758cc5848e0fc0782b70595324065fc6df0f626f3 | 
    
| Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14  | 
    
| ORCID | 0000-0003-1640-6198 0000-0002-4359-3550 0000-0002-8499-4254 0000-0001-9349-2718  | 
    
| PQID | 3179915459 | 
    
| PQPubID | 85470 | 
    
| PageCount | 14 | 
    
| ParticipantIDs | crossref_primary_10_1109_TCAD_2024_3472293 ieee_primary_10702560 proquest_journals_3179915459  | 
    
| ProviderPackageCode | CITATION AAYXX  | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2025-04-01 | 
    
| PublicationDateYYYYMMDD | 2025-04-01 | 
    
| PublicationDate_xml | – month: 04 year: 2025 text: 2025-04-01 day: 01  | 
    
| PublicationDecade | 2020 | 
    
| PublicationPlace | New York | 
    
| PublicationPlace_xml | – name: New York | 
    
| PublicationTitle | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems |
    
| PublicationTitleAbbrev | TCAD | 
    
| PublicationYear | 2025 | 
    
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
    
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)  | 
    
    
| SSID | ssj0014529 | 
    
| Score | 2.45656 | 
    
    
| SourceID | proquest crossref ieee  | 
    
| SourceType | Aggregation Database Index Database Publisher  | 
    
| StartPage | 1567 | 
    
| SubjectTerms | Acceleration; Artificial intelligence; Artificial neural networks; Computer architecture; Control systems; Convolution; convolutional neural network (CNN); Convolutional neural networks; edge computing; Efficiency; Hardware; Inference; Instruction sets; Interlayers; Logic; Memory management; Microprocessors; Neural networks; Neurons; Process control; Registers; RISC; RISC-V; Single instruction multiple data; single instruction multiple data (SIMD); spike neural network (SNN) |
    
| Title | RV-SCNN: A RISC-V Processor With Customized Instruction Set for SNN and CNN Inference Acceleration on Edge Platforms | 
    
| URI | https://ieeexplore.ieee.org/document/10702560 https://www.proquest.com/docview/3179915459  | 
    
| Volume | 44 | 
    
| hasFullText | 1 | 
    
| inHoldings | 1 | 
    
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVIEE databaseName: IEEE Electronic Library (IEL) customDbUrl: eissn: 1937-4151 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0014529 issn: 0278-0070 databaseCode: RIE dateStart: 19820101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE  | 
    