GCIM: Towards Efficient Processing of Graph Convolutional Networks in 3D-Stacked Memory

Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, p. 1
Main Authors: Chen, Jiaxian; Lin, Yiquan; Sun, Kaoyi; Chen, Jiexin; Ma, Chenlin; Mao, Rui; Wang, Yi
Format: Journal Article
Language: English
Published: IEEE, 2022
ISSN: 0278-0070
DOI: 10.1109/TCAD.2022.3198320

Summary: Graph convolutional networks (GCNs) have become a powerful deep learning approach for graph-structured data. Unlike traditional neural networks such as convolutional neural networks, GCNs operate on irregular input graph data and are both computation-bound and memory-bound. Efficiently utilizing the underlying computation and memory resources therefore becomes a critical issue. The emerging 3D-stacked computation-in-memory (CIM) architecture can reduce data movement between computing logic and memory, presenting a promising solution for GCN processing. A key unsolved challenge is how to map GCN workloads onto the 3D-stacked CIM architecture so that its fast near-data processing is fully exploited. This paper presents GCIM, a software-hardware co-design approach for efficient processing of Graph Convolutional networks on a Computation-In-Memory architecture. On the hardware side, GCIM integrates lightweight computing units near memory banks to fully exploit bank-level bandwidth and parallelism. On the software side, a locality-aware data mapping algorithm partitions the input graph and balances the workload across banks. GCIM is evaluated on a set of representative GCN models and standard graph datasets. The experimental results show that GCIM significantly reduces processing latency and data movement overhead compared with representative schemes.
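The abstract does not spell out the locality-aware data mapping algorithm, so the following Python sketch is only an illustration of the general idea: assign BFS-ordered vertices to memory banks while balancing per-bank edge counts (a rough proxy for aggregation workload). All identifiers and the partitioning heuristic are assumptions for illustration, not GCIM's actual method.

```python
# Illustrative sketch only: a toy locality-aware partitioner that keeps
# BFS neighborhoods together and balances edge counts across banks.
# Names (locality_aware_partition, num_banks, adjacency) are hypothetical.
from collections import deque

def locality_aware_partition(adjacency, num_banks):
    """Assign each vertex to a bank, keeping neighborhoods co-located
    while balancing the number of edges (aggregation work) per bank."""
    num_vertices = len(adjacency)
    total_edges = sum(len(nbrs) for nbrs in adjacency)
    target = total_edges / num_banks          # ideal edge load per bank
    assignment = [-1] * num_vertices
    bank_load = [0] * num_banks
    visited = [False] * num_vertices
    bank = 0

    for start in range(num_vertices):         # cover disconnected components
        if visited[start]:
            continue
        queue = deque([start])
        visited[start] = True
        while queue:
            v = queue.popleft()
            # Advance to the next bank once the current one reaches its
            # target, so aggregation work stays roughly balanced.
            if bank_load[bank] >= target and bank < num_banks - 1:
                bank += 1
            assignment[v] = bank
            bank_load[bank] += len(adjacency[v])
            for u in adjacency[v]:
                if not visited[u]:
                    visited[u] = True
                    queue.append(u)
    return assignment, bank_load

if __name__ == "__main__":
    # Tiny example graph: two loosely connected clusters of four vertices.
    adjacency = [
        [1, 2, 3], [0, 2], [0, 1, 3], [0, 2, 4],
        [3, 5, 6], [4, 6, 7], [4, 5, 7], [5, 6],
    ]
    assignment, load = locality_aware_partition(adjacency, num_banks=2)
    print("vertex -> bank:", assignment)
    print("edges per bank:", load)
```

On this toy graph the sketch places each cluster on its own bank with equal edge loads, which is the kind of locality-plus-balance trade-off the paper's mapping algorithm targets; the real algorithm and its cost model would differ.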