Efficient SNN multi-cores MAC array acceleration on SpiNNaker 2
| Published in | Frontiers in Neuroscience Vol. 17; p. 1223262 |
|---|---|
| Main Authors | , , , , , , , , |
| Format | Journal Article |
| Language | English |
| Published | Lausanne: Frontiers Research Foundation, 07.08.2023 (Frontiers Media S.A.) |
| Subjects | |
| ISSN | 1662-4548, 1662-453X |
| DOI | 10.3389/fnins.2023.1223262 |
| Summary: | The potentially low energy consumption of spiking neural networks (SNNs) has attracted the attention of the AI community. Processing SNNs on CPUs alone, however, inevitably leads to long execution times for large models and massive datasets. This study introduces the MAC array, a parallel architecture on each processing element (PE) of SpiNNaker 2, into the computational process of SNN inference. Building on earlier single-core optimization algorithms, we investigate parallel acceleration algorithms that cooperate with multi-core MAC arrays. The proposed Echelon Reorder model-information densification algorithm, combined with adapted multi-core two-stage-splitting and authorization deployment strategies, achieves efficient spatio-temporal load balancing and optimization. We evaluate performance by benchmarking a wide range of constructed SNN models to study how strongly different factors influence it. We also benchmark two realistic SNN models (a gesture-recognition model from a real-world application and a balanced random cortex-like network from neuroscience) on the neuromorphic multi-core hardware SpiNNaker 2. The echelon optimization algorithm with mixed processors reduces the memory footprint to 74.28% and 85.78% of the original MAC calculation on these two models, respectively. The execution time of the echelon algorithms, using only MAC or mixed processors, is ≤ 24.56% of the serial ARM baseline. Accelerating SNN inference with the algorithms in this study is essentially a general sparse matrix-matrix multiplication (SpGEMM) problem (see the sketch below). This article explicitly extends the application field of SpGEMM to SNNs, developing novel SpGEMM optimization algorithms that fit SNN characteristics and the MAC array. |
|---|---|
| Bibliography: | Edited by: Lei Deng, Tsinghua University, China. Reviewed by: Zhuo Zou, Fudan University, China; Steffen Albrecht, The University of Auckland, New Zealand; Arindam Basu, City University of Hong Kong, Hong Kong SAR, China |
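To make the SpGEMM framing in the summary concrete, the following minimal Python sketch (an illustrative assumption, not the paper's implementation; the neuron counts, densities, and LIF parameters are invented for the example) treats a block of spike events as a sparse binary matrix and obtains the synaptic input for all post-neurons with a single sparse matrix-matrix product, followed by a simple leaky integrate-and-fire update.

```python
# Illustrative sketch only: SNN inference viewed as SpGEMM.
# Spike events over a block of timesteps form a sparse binary matrix; the
# synaptic input is its product with the (also sparse) weight matrix.
import numpy as np
from scipy.sparse import random as sparse_random

T, N_PRE, N_POST = 16, 256, 128   # timesteps per block, pre/post neuron counts (arbitrary)

# Sparse spike raster: spikes[t, i] = 1 if pre-neuron i fired at timestep t (~5% activity).
spikes = sparse_random(T, N_PRE, density=0.05, format="csr",
                       data_rvs=lambda n: np.ones(n), random_state=0)

# Sparse synaptic weight matrix weights[i, j] (~10% connectivity, random strengths).
weights = sparse_random(N_PRE, N_POST, density=0.10, format="csr", random_state=1)

# SpGEMM: synaptic input currents for every timestep and post-neuron at once.
syn_input = spikes @ weights          # sparse result of shape (T, N_POST)

# Simple LIF-style accumulation over the block (dense membrane state).
v = np.zeros(N_POST)
v_thresh, v_decay = 1.0, 0.9
out_spikes = np.zeros((T, N_POST), dtype=bool)
dense_input = syn_input.toarray()
for t in range(T):
    v = v_decay * v + dense_input[t]
    out_spikes[t] = v >= v_thresh
    v[out_spikes[t]] = 0.0            # reset neurons that fired

print("input spikes:", spikes.nnz, "output spikes:", int(out_spikes.sum()))
```

Because both the spike raster and the connectivity are sparse, the product touches only non-zero entries; this is the kind of irregular sparse workload that the paper's densification and multi-core splitting strategies aim to balance across MAC arrays.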