RV-SCNN: A RISC-V Processor With Customized Instruction Set for SNN and CNN Inference Acceleration on Edge Platforms

Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 4, pp. 1567-1580
Main Authors: Wang, Xingbo; Feng, Chenxi; Kang, Xinyu; Wang, Qi; Huang, Yucong; Ye, Terry Tao
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.04.2025
ISSN: 0278-0070, 1937-4151
DOI: 10.1109/TCAD.2024.3472293

Summary: The rapid advancement of artificial intelligence (AI) applications has driven an increasing demand for conducting inference tasks on edge devices. However, implementing computation-intensive neural networks on resource-constrained edge systems remains a significant challenge. In this article, we propose a novel processor architecture called RV-SCNN to address this challenge. The architecture is based on the generic RISC-V instruction set and incorporates single-instruction multiple-data (SIMD) custom instruction extensions to accelerate the computation of spiking neural networks (SNNs) and convolutional neural networks (CNNs), enabling efficient execution of complex neural network models. The core operators of the processor are shared by SNN and CNN operations, thus supporting both computation modes. Other acceleration features include an internal hardware loop control unit that reduces instruction overhead, an address calculation unit and an interlayer fusion unit that minimize memory access overhead, and an image-to-column (IM2COL) unit that improves the computational efficiency of 3×3 convolutions in SNNs and CNNs. The custom instructions are invoked through inline assembly in C programs, providing higher flexibility than traditional ASICs and supporting custom, complex SNN/CNN network structures. Compared to a conventional instruction set, the RV-SCNN processor reduces the execution time of CNNs and SNNs by over 90%. We validate the processor on an FPGA platform and evaluate its performance under a 55-nm CMOS process. The processor achieves an energy efficiency of 9.88 pJ/SOP in SNN inference tasks, while the peak energy efficiency reaches 679 GOPS/W in CNN inference.