Optimized co-scheduling of mixed-precision neural network accelerator for real-time multitasking applications

Bibliographic Details
Published in: Journal of Systems Architecture, Vol. 110, p. 101775
Main Authors: Jiang, Wei; Song, Ziwei; Zhan, Jinyu; He, Zhiyuan; Wen, Xiangyu; Jiang, Ke
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.11.2020
ISSN: 1383-7621, 1873-6165
DOI: 10.1016/j.sysarc.2020.101775

Summary: Neural networks are increasingly applied in real-time and embedded Artificial Intelligence (AI) systems such as autonomous driving. Such resource-constrained systems cannot afford the execution of neural-network-based tasks on general-purpose processors because of their high execution overheads. Hence, we design real-time AI applications on embedded systems with CPU and FPGA (Field Programmable Gate Array) coprocessors: a dedicated FPGA accelerates the neural network job, while the CPU processes the remaining jobs of the real-time multitasking applications. We devise an Idle-Aware Earliest Deadline First policy to co-schedule the AI applications on the hybrid CPU and FPGA coprocessors. Since implementing the neural network job on the FPGA accelerator with different precision configurations results in different execution times and accuracies, we also study the design optimization of real-time AI applications running on a mixed-precision neural network accelerator, with the goal of maximizing the accuracy-related rewards of all applications subject to real-time constraints. We formulate the problem as a multi-stage decision procedure and propose an efficient dynamic programming approach with two pruning policies to reduce the number of intermediate search states. Extensive experiments and real-life case evaluations demonstrate the efficiency of the proposed approaches.
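The abstract's core optimization, selecting a precision configuration per neural-network job to maximize accuracy reward under timing constraints, can be illustrated with a simplified knapsack-style dynamic program. This is a hypothetical sketch, not the paper's implementation: it collapses the real-time constraints into a single shared time budget, and the function name, job representation, and dominance-based pruning rule are all illustrative assumptions (the paper's multi-stage procedure with two pruning policies is only analogous in spirit).

```python
# Hypothetical sketch: pick one precision level per job so the total
# execution time fits a shared time budget while the summed accuracy
# reward is maximized. States are (used_time -> best reward so far);
# a simple pruning step keeps only non-dominated states.

def select_precisions(jobs, time_budget):
    """jobs: list of option lists; each option is (exec_time, accuracy_reward).
    Returns (best_reward, chosen_option_indices), or (None, None) if no
    assignment fits the budget."""
    dp = {0: (0.0, [])}  # used_time -> (total reward, chosen option per job)
    for options in jobs:
        nxt = {}
        for used, (reward, picks) in dp.items():
            for idx, (t, acc) in enumerate(options):
                u = used + t
                if u > time_budget:
                    continue  # violates the (simplified) real-time constraint
                cand = (reward + acc, picks + [idx])
                if u not in nxt or cand[0] > nxt[u][0]:
                    nxt[u] = cand
        # Pruning: scanning states by increasing time, drop any state whose
        # reward does not beat a cheaper state's reward (it is dominated).
        dp, best = {}, -1.0
        for u in sorted(nxt):
            if nxt[u][0] > best:
                dp[u] = nxt[u]
                best = nxt[u][0]
        if not dp:
            return None, None
    return max(dp.values(), key=lambda s: s[0])

# Two jobs, each with (time, reward) per precision (e.g. int8 vs fp16):
jobs = [[(2, 0.90), (4, 0.95)], [(3, 0.88), (6, 0.97)]]
print(select_precisions(jobs, time_budget=8))
```

With a budget of 8, the sketch picks the fast low-precision option for the first job so the slower, more accurate option for the second job still fits, which is exactly the kind of trade-off the multi-stage decision procedure explores.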