Design and Optimization of Efficient Digital Machine Learning Accelerators: An overview of architecture choices, efficient quantization, sparsity exploration, and system integration techniques

Bibliographic Details
Published in	IEEE Solid-State Circuits Magazine, Vol. 17, no. 2, pp. 30-38
Main Authors	Tang, Wei; Cho, Sung-Gun; Zhang, Jie-Fang; Zhang, Zhengya
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
Subjects	Accelerators; Computer architecture; Design optimization; Hardware; Machine learning; Optimization; Quantization (signal); Reviews; Scalability; Single instruction multiple data; System integration; Systolic arrays
ISSN	1943-0582, 1943-0590
DOI	10.1109/MSSC.2025.3549361

Summary:	Digital machine learning (ML) accelerators are popular and widely used. We provide an overview of the SIMD and systolic array architectures that form the foundation of many accelerator designs. The demand for higher compute density, energy efficiency, and scalability has been increasing. To address these needs, new ML accelerator designs have adopted a range of techniques, including advanced architectural design, more efficient quantization, exploiting data-level sparsity, and leveraging new integration technologies. For each of these techniques, we review the common approaches, identify the design tradeoffs, and discuss their implications.