Design and Optimization of Efficient Digital Machine Learning Accelerators: An overview of architecture choices, efficient quantization, sparsity exploration, and system integration techniques

Bibliographic Details
Published in: IEEE Solid-State Circuits Magazine, Vol. 17, no. 2, pp. 30-38
Main Authors: Tang, Wei; Cho, Sung-Gun; Zhang, Jie-Fang; Zhang, Zhengya
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
ISSN: 1943-0582, 1943-0590
DOI: 10.1109/MSSC.2025.3549361

Summary: Digital machine learning (ML) accelerators are popular and widely used. We provide an overview of the SIMD and systolic array architectures that form the foundation of many accelerator designs. The demand for higher compute density, energy efficiency, and scalability has been increasing. To address these needs, new ML accelerator designs have adopted a range of techniques, including advanced architectural design, more efficient quantization, exploiting data-level sparsity, and leveraging new integration technologies. For each of these techniques, we review the common approaches, identify the design tradeoffs, and discuss their implications.