GCIM: Towards Efficient Processing of Graph Convolutional Networks in 3D-Stacked Memory

Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, p. 1
Main Authors: Chen, Jiaxian; Lin, Yiquan; Sun, Kaoyi; Chen, Jiexin; Ma, Chenlin; Mao, Rui; Wang, Yi
Format: Journal Article
Language: English
Published: IEEE, 2022
ISSN: 0278-0070
DOI: 10.1109/TCAD.2022.3198320

Summary: Graph convolutional networks (GCNs) have become a powerful deep learning approach for graph-structured data. Unlike traditional neural networks such as convolutional neural networks, GCNs operate on irregular input graph data and are both computation-bound and memory-bound. Efficiently utilizing the underlying computation and memory resources therefore becomes a critical issue. The emerging 3D-stacked computation-in-memory (CIM) architecture can reduce data movement between computing logic and memory, presenting a promising solution for GCN processing. A key unsolved challenge is how to map GCN workloads onto the 3D-stacked CIM architecture so that its fast near-data processing is fully exploited. This paper presents GCIM, a software-hardware co-design approach for efficient processing of Graph Convolutional networks on a Computation-In-Memory architecture. On the hardware side, GCIM integrates lightweight computing units near memory banks to fully exploit bank-level bandwidth and parallelism. On the software side, a locality-aware data mapping algorithm partitions the input graph and balances the workload across banks. GCIM is evaluated on a set of representative GCN models and standard graph datasets. The experimental results show that GCIM significantly reduces processing latency and data movement overhead compared with representative schemes.
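The abstract does not spell out the locality-aware data mapping algorithm, so the following Python sketch is only an illustration of the general idea: assign BFS-ordered vertices to memory banks while balancing per-bank edge counts (a rough proxy for aggregation workload). All identifiers and the partitioning heuristic are assumptions for illustration, not GCIM's actual method.

```python
# Illustrative sketch only: a toy locality-aware partitioner that keeps
# BFS neighborhoods together and balances edge counts across banks.
# Names (locality_aware_partition, num_banks, adjacency) are hypothetical.
from collections import deque

def locality_aware_partition(adjacency, num_banks):
    """Assign each vertex to a bank, keeping neighborhoods co-located
    while balancing the number of edges (aggregation work) per bank."""
    num_vertices = len(adjacency)
    total_edges = sum(len(nbrs) for nbrs in adjacency)
    target = total_edges / num_banks          # ideal edge load per bank
    assignment = [-1] * num_vertices
    bank_load = [0] * num_banks
    visited = [False] * num_vertices
    bank = 0

    for start in range(num_vertices):         # cover disconnected components
        if visited[start]:
            continue
        queue = deque([start])
        visited[start] = True
        while queue:
            v = queue.popleft()
            # Advance to the next bank once the current one reaches its
            # target, so aggregation work stays roughly balanced.
            if bank_load[bank] >= target and bank < num_banks - 1:
                bank += 1
            assignment[v] = bank
            bank_load[bank] += len(adjacency[v])
            for u in adjacency[v]:
                if not visited[u]:
                    visited[u] = True
                    queue.append(u)
    return assignment, bank_load

if __name__ == "__main__":
    # Tiny example graph: two loosely connected clusters of four vertices.
    adjacency = [
        [1, 2, 3], [0, 2], [0, 1, 3], [0, 2, 4],
        [3, 5, 6], [4, 6, 7], [4, 5, 7], [5, 6],
    ]
    assignment, load = locality_aware_partition(adjacency, num_banks=2)
    print("vertex -> bank:", assignment)
    print("edges per bank:", load)
```

On this toy graph the sketch places each cluster on its own bank with equal edge loads, which is the kind of locality-plus-balance trade-off the paper's mapping algorithm targets; the real algorithm and its cost model would differ.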