A High-Performance and Energy-Efficient Photonic Architecture for Multi-DNN Acceleration

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, Vol. 35, no. 1, pp. 1-13
Main Authors: Li, Yuan; Louri, Ahmed; Karanth, Avinash
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2024
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2023.3327535

More Information
Summary: Large-scale deep neural network (DNN) accelerators are poised to facilitate the concurrent processing of diverse DNNs, imposing demanding challenges on the interconnection fabric. These challenges encompass overcoming performance degradation and energy increase associated with system scaling while also necessitating flexibility to support dynamic partitioning and adaptable organization of compute resources. Nevertheless, conventional metallic-based interconnects frequently confront inherent limitations in scalability and flexibility. In this paper, we leverage silicon photonic interconnects and adopt an algorithm-architecture co-design approach to develop MDA, a DNN accelerator meticulously crafted to empower high-performance and energy-efficient concurrent processing of diverse DNNs. Specifically, MDA consists of three novel components: (1) a resource allocation algorithm that assigns compute resources to concurrent DNNs based on their computational demands and priorities; (2) a dataflow selection algorithm that determines off-chip and on-chip dataflows for each DNN, with the objectives of minimizing off-chip and on-chip memory accesses, respectively; (3) a flexible silicon photonic network that can be dynamically segmented into sub-networks, each interconnecting the assigned compute resources of a certain DNN while adapting to the communication patterns dictated by the selected on-chip dataflow. Simulation results show that the proposed MDA accelerator outperforms other state-of-the-art multi-DNN accelerators, including PREMA, AI-MT, Planaria, and HDA. The MDA accelerator achieves a speedup of 3.6×, along with improvements of 7.3×, 12.7×, and 9.2× in energy efficiency, service-level agreement (SLA) satisfaction rate, and fairness, respectively.
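
As a rough illustration of the resource allocation idea described in the summary, the Python sketch below splits a fixed pool of compute tiles among concurrent DNNs in proportion to each network's computational demand weighted by its priority. It is a minimal sketch under assumed interfaces, not the paper's actual algorithm; the names DNNRequest, allocate_tiles, demand, and priority, as well as the example workloads and numbers, are hypothetical.

# Hypothetical sketch (not from the paper): proportional allocation of compute
# tiles to concurrent DNNs, weighted by per-DNN computational demand and priority.
from dataclasses import dataclass

@dataclass
class DNNRequest:
    name: str
    demand: float    # e.g., estimated MAC operations for the network
    priority: float  # larger value = more latency-critical

def allocate_tiles(requests: list[DNNRequest], total_tiles: int) -> dict[str, int]:
    # Assumes total_tiles >= len(requests) so every DNN can get at least one tile.
    weights = {r.name: r.demand * r.priority for r in requests}
    total_weight = sum(weights.values())
    # Floor of each proportional share, with a one-tile minimum per DNN.
    alloc = {name: max(1, int(total_tiles * w / total_weight))
             for name, w in weights.items()}
    # Hand any leftover tiles to the highest-weight DNNs first.
    leftover = total_tiles - sum(alloc.values())
    for name in sorted(weights, key=weights.get, reverse=True):
        if leftover <= 0:
            break
        alloc[name] += 1
        leftover -= 1
    return alloc

if __name__ == "__main__":
    reqs = [DNNRequest("resnet50", demand=4.1e9, priority=2.0),
            DNNRequest("bert-base", demand=2.2e10, priority=1.0),
            DNNRequest("mobilenetv2", demand=5.7e8, priority=3.0)]
    print(allocate_tiles(reqs, total_tiles=64))
    # e.g. {'resnet50': 16, 'bert-base': 45, 'mobilenetv2': 3}

A weighted proportional split of this kind keeps high-priority or compute-heavy networks from starving smaller ones, which reflects the demand-and-priority criterion mentioned in the summary; the paper's actual algorithm may differ substantially.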