RV-SCNN: A RISC-V Processor With Customized Instruction Set for SNN and CNN Inference Acceleration on Edge Platforms

Bibliographic Details
Published in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 44, no. 4, pp. 1567-1580
Main Authors Wang, Xingbo; Feng, Chenxi; Kang, Xinyu; Wang, Qi; Huang, Yucong; Ye, Terry Tao
Format Journal Article
Language English
Published New York IEEE 01.04.2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 0278-0070
EISSN 1937-4151
DOI 10.1109/TCAD.2024.3472293


Abstract The rapid advancement of artificial intelligence (AI) applications has driven an increasing demand for running inference tasks on edge devices. However, implementing computation-intensive neural networks on resource-constrained edge systems remains a significant challenge. In this article, we propose a novel processor architecture called RV-SCNN to address this challenge. The architecture is based on the RISC-V generic instruction set and incorporates several single instruction multiple data (SIMD) custom instruction extensions that accelerate the computation of spike neural networks (SNNs) and convolutional neural networks (CNNs), enabling efficient execution of complex neural network models. The core operators of the processor are shared by SNN and CNN operations, so both computation modes are supported. Further acceleration features include an internal hardware loop control unit that reduces instruction overhead, an address calculation unit and an interlayer fusion unit that minimize memory access overhead, and an image to column (IM2COL) unit that improves the computational efficiency of 3 × 3 convolutions in SNNs and CNNs. The custom instructions are called through inline assembly in the C program, which provides more flexibility than traditional ASICs and supports custom, complex SNN/CNN network structures. Compared to traditional instruction sets, the RV-SCNN processor reduces the execution time of CNNs and SNNs by over 90%. We validate the processor on an FPGA platform and evaluate its performance in a 55-nm CMOS process. The processor achieves an operational efficiency of 9.88 pJ/SOP in SNN inference tasks, while the peak energy efficiency reaches 679 GOPS/W in CNN inference.
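The image to column (IM2COL) idea named in the abstract can be illustrated in software. The following is a minimal C sketch of the general im2col technique for a 3 × 3 convolution, not the RV-SCNN hardware unit or its custom instructions. The function names `im2col_3x3` and `conv3x3_from_cols` are hypothetical, and the sketch assumes a single channel, stride 1, no padding, and 8-bit integer data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper names; they do not come from the RV-SCNN article. */

/* Copy every 3x3 window of a single-channel h x w image into one row of
 * nine values (stride 1, no padding). cols must hold (h-2)*(w-2)*9 bytes. */
void im2col_3x3(const int8_t *img, size_t h, size_t w, int8_t *cols)
{
    assert(h >= 3 && w >= 3);
    size_t out = 0;
    for (size_t y = 0; y + 3 <= h; ++y)
        for (size_t x = 0; x + 3 <= w; ++x)
            for (size_t ky = 0; ky < 3; ++ky)
                for (size_t kx = 0; kx < 3; ++kx)
                    cols[out++] = img[(y + ky) * w + (x + kx)];
}

/* After im2col, the convolution is a dot product of each nine-value row
 * with the flattened kernel, which maps naturally onto SIMD-style MACs. */
void conv3x3_from_cols(const int8_t *cols, size_t n_patches,
                       const int8_t kernel[9], int32_t *out)
{
    for (size_t p = 0; p < n_patches; ++p) {
        int32_t acc = 0;
        for (size_t k = 0; k < 9; ++k)
            acc += (int32_t)cols[p * 9 + k] * (int32_t)kernel[k];
        out[p] = acc;
    }
}
```

For a 4 × 4 image holding the values 0 to 15 in row-major order and an all-ones kernel, the first output element is 0+1+2+4+5+6+8+9+10 = 45. The point of the rearrangement is that the inner loop becomes a contiguous dot product, which is the kind of regular access pattern a dedicated unit or SIMD instruction can exploit.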
Authors and affiliations
Wang, Xingbo (ORCID 0000-0001-9349-2718), Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
Feng, Chenxi, Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
Kang, Xinyu (ORCID 0000-0003-1640-6198), Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
Wang, Qi, Department of Electrical and Computer Engineering, Southern University of Science and Technology, Shenzhen, China
Huang, Yucong (ORCID 0000-0002-8499-4254), Department of Electronic and Computer Engineering, Southern University of Science and Technology, Shenzhen, China
Ye, Terry Tao (ORCID 0000-0002-4359-3550, email yet@sustech.edu.cn), Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
CODEN ITCSDI
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025
Discipline Engineering
Genre orig-research
IsPeerReviewed true
IsScholarly true
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
SubjectTerms Acceleration
Artificial intelligence
Artificial neural networks
Computer architecture
Control systems
Convolution
convolutional neural network (CNN)
Convolutional neural networks
edge computing
Efficiency
Hardware
Inference
Instruction sets
Interlayers
Logic
Memory management
Microprocessors
Neural networks
Neurons
Process control
Registers
RISC
RISC-V
Single instruction multiple data
single instruction multiple data (SIMD)
spike neural network (SNN)
URI https://ieeexplore.ieee.org/document/10702560
https://www.proquest.com/docview/3179915459