A Graph-Based Accelerator of Retinex Model With Bit-Serial Computing for Image Enhancements
This work proposes the Poisson equation formulation of the Retinex model for image enhancements using a low-power graph hardware accelerator performing finite difference updates on a lattice graph processing element (PE) array. By encapsulating the underlying algorithm in a graph hardware structure,...
Saved in:
| Published in | IEEE transactions on circuits and systems. I, Regular papers Vol. 72; no. 3; pp. 1346 - 1357 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published |
New York
IEEE
01.03.2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects | |
| Online Access | Get full text |
| ISSN | 1549-8328 1558-0806 |
| DOI | 10.1109/TCSI.2025.3525653 |
Cover
| Summary: | This work proposes the Poisson equation formulation of the Retinex model for image enhancements using a low-power graph hardware accelerator performing finite difference updates on a lattice graph processing element (PE) array. By encapsulating the underlying algorithm in a graph hardware structure, a highly localized dataflow that takes advantage of the physical placement of the PEs is enabled to minimize data movement and maximize data reuse. The on-chip dataflow that achieves data sharing, and reuse among neighboring PEs during massively parallel updates is generated in each PE driven by two external control signals. Using a custom accumulator design intended for bit-serial computing, this work enables precision on demand and extensive on-chip data reuse with minimal area overhead, accommodating a non-overlap image mapping scheme in which a <inline-formula> <tex-math notation="LaTeX">20\times 20 </tex-math></inline-formula> image tile can be processed without external memory access at a time. With increasing user-configurable update count, image noise and shadow can be progressively removed with the inevitable loss of image details. Fabricated using a 65nm technology, the test chip occupies 0.2955mm2 core area and consumes 2.191mW operating at 1V, 25.6MHz, and a reconfigurable 10- or 14-bit precision. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 1549-8328 1558-0806 |
| DOI: | 10.1109/TCSI.2025.3525653 |