A Design Framework of Heterogeneous Approximate DCIM-Based Accelerator for Energy-Efficient NN Processing
| Published in | IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 72, no. 8, pp. 3997-4006 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | New York, IEEE, 01.08.2025. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN | 1549-8328 1558-0806 |
| DOI | 10.1109/TCSI.2025.3530637 |
| Summary | Static random-access memory (SRAM) based digital compute-in-memory (DCIM) provides error-resilient computation at the expense of considerable adder-tree power overhead. Recent DCIM macros based on approximate computing mitigate the adder-tree overhead, but they face a trade-off between power and neural network (NN) accuracy. The trade-off becomes more complicated in array-level CIM architectures, since the output channels of an NN model have different sensitivities to approximation errors. In this paper, we propose a heterogeneous approximate DCIM-based accelerator design framework that achieves a good energy-accuracy trade-off for a specific NN model. The framework includes three key features: 1) An evolutionary algorithm-based search finds cost-efficient approximation points by pruning the design space. 2) A genetic algorithm-based channel-wise mapping creates heterogeneous approximation methods that effectively reduce DCIM energy consumption while maintaining high accuracy. 3) A hardware generation strategy decides the number of DCIM macros and their sizes, resulting in an energy-efficient DCIM-based accelerator tailored to the given NN model. Experimental results show that the proposed heterogeneous channel-wise mapping significantly improves energy efficiency compared to a homogeneous mapping. Moreover, the proposed framework can produce heterogeneous DCIM-based accelerators that consume less energy than state-of-the-art approximate DCIM approaches. |
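
The channel-wise mapping idea in the summary can be illustrated with a toy genetic algorithm. This is only a minimal sketch, not the paper's method: the `ENERGY` and `ERROR` tables, the per-channel `sensitivity` values, the `error_budget` constraint, and the function names `cost` and `evolve` are all hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical per-level costs: level 0 is exact, higher levels are more approximate.
ENERGY = [1.00, 0.70, 0.50]  # relative energy per channel (made-up values)
ERROR = [0.00, 0.02, 0.08]   # relative approximation error per level (made-up values)


def cost(mapping, sensitivity, error_budget, penalty=100.0):
    """Total energy plus a penalty for exceeding the sensitivity-weighted error budget."""
    energy = sum(ENERGY[lvl] for lvl in mapping)
    error = sum(s * ERROR[lvl] for s, lvl in zip(sensitivity, mapping))
    return energy + penalty * max(0.0, error - error_budget)


def evolve(sensitivity, error_budget, pop_size=40, generations=60,
           mutation_rate=0.1, seed=0):
    """Return the lowest-cost channel-to-level mapping found by a simple GA."""
    rng = random.Random(seed)
    n, levels = len(sensitivity), len(ENERGY)

    def fitness(mapping):
        return cost(mapping, sensitivity, error_budget)

    # Include the all-exact mapping so a feasible candidate always exists.
    population = [[0] * n]
    population += [[rng.randrange(levels) for _ in range(n)]
                   for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutate each gene independently with probability mutation_rate.
            child = [rng.randrange(levels) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children

    return min(population, key=fitness)
```

Calling `evolve([5.0, 0.1, 0.1, 4.0, 0.2, 0.1, 3.0, 0.1], error_budget=0.05)` returns one approximation level per output channel. Because the all-exact mapping is in the initial population and the best candidates survive each generation, the result never costs more than running every channel exactly under this toy cost model.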