A Lightweight and Efficient GPU for NDP Utilizing Data Access Pattern of Image Processing

Bibliographic Details
Published in: IEEE Transactions on Computers, Vol. 71, no. 1, pp. 13-26
Main Authors: Choi, Jungwoo; Kim, Boyeal; Jeon, Ji-Ye; Lee, Hyuk-Jae; Lim, Euicheol; Rhee, Chae Eun
Format: Journal Article
Language: English
Published: New York, IEEE, 01.01.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0018-9340, 1557-9956
DOI: 10.1109/TC.2020.3035826

Summary: As the demand for high-resolution image applications increases, systems for image processing are becoming more important. Graphics processing units (GPUs) can increase computational capacity through massive parallelism, but they are still subject to limited memory bandwidth. Near-data-processing (NDP) is expected to mitigate the performance and energy overhead caused by data transfer by performing computations on the logic die of 3D-stacked memory. Although prior studies have demonstrated the advantages of NDP, an NDP solution focused on image processing has not yet been developed. This article proposes a GPU-based NDP architecture and well-matched optimization strategies that consider both the characteristics of image applications and NDP constraints. First, data allocation to the processing units is addressed to maintain data locality and the data access pattern. Second, a lightweight yet efficient NDP GPU architecture is proposed. By applying a prefetcher that leverages the pattern-aware data allocation, the number of active warps and the on-chip SRAM size of the NDP are significantly reduced. This enables the NDP constraints to be satisfied and a greater number of processing units to be integrated on a logic die. The evaluation results show that the proposed NDP GPU improves performance by 1.85× and consumes 82.7 percent of the energy of the baseline NDP GPU.