RV-SCNN: A RISC-V Processor With Customized Instruction Set for SNN and CNN Inference Acceleration on Edge Platforms

Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 4, pp. 1567-1580
Main Authors: Wang, Xingbo; Feng, Chenxi; Kang, Xinyu; Wang, Qi; Huang, Yucong; Ye, Terry Tao
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.04.2025
ISSN: 0278-0070, 1937-4151
DOI: 10.1109/TCAD.2024.3472293

Summary: The rapid advancement of artificial intelligence (AI) applications has driven an increasing demand for conducting inference tasks on edge devices. However, implementing computation-intensive neural networks on resource-constrained edge systems remains a significant challenge. In this article, we propose a novel processor architecture called RV-SCNN to address this challenge. The architecture is based on the generic RISC-V instruction set and incorporates single-instruction multiple-data (SIMD) custom instruction extensions to accelerate the computation of spiking neural networks (SNNs) and convolutional neural networks (CNNs), enabling efficient execution of complex neural network models. The core operators of the processor are shared by SNN and CNN operations, thus supporting both computation modes. Other acceleration features include an internal hardware loop control unit that reduces instruction overhead, an address calculation unit and an interlayer fusion unit that minimize memory access overhead, and an image-to-column (IM2COL) unit that improves the computational efficiency of 3×3 convolutions in SNNs and CNNs. The custom instructions are invoked through inline assembly in C programs, providing higher flexibility than traditional ASICs and supporting custom, complex SNN/CNN network structures. Compared to a conventional instruction set, the RV-SCNN processor reduces the execution time of CNNs and SNNs by over 90%. We validate the processor on an FPGA platform and evaluate its performance under a 55-nm CMOS process. The processor achieves an energy efficiency of 9.88 pJ/SOP in SNN inference tasks, while the peak energy efficiency reaches 679 GOPS/W in CNN inference.