Optimized co-scheduling of mixed-precision neural network accelerator for real-time multitasking applications

Bibliographic Details
Published in: Journal of Systems Architecture, Vol. 110, p. 101775
Main Authors: Jiang, Wei; Song, Ziwei; Zhan, Jinyu; He, Zhiyuan; Wen, Xiangyu; Jiang, Ke
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.11.2020
ISSN: 1383-7621, 1873-6165
DOI: 10.1016/j.sysarc.2020.101775

Summary: Neural networks are increasingly applied in real-time and embedded Artificial Intelligence (AI) systems such as autonomous driving. Such resource-constrained systems cannot afford the execution of neural-network-based tasks on general-purpose processors because of their high execution overheads. Hence, we design real-time AI applications on embedded systems with CPU and FPGA (Field Programmable Gate Array) coprocessors: a dedicated FPGA accelerates the neural network job, while the CPU processes the remaining jobs of the real-time multitasking applications. We devise an Idle-Aware Earliest Deadline First policy to co-schedule the AI applications on the hybrid CPU and FPGA coprocessors. Since implementing the neural network job on the FPGA accelerator with different precision configurations results in different execution times and accuracies, we also study the design optimization of real-time AI applications running on a mixed-precision neural network accelerator, with the goal of maximizing the accuracy-related rewards of all applications subject to real-time constraints. We formulate the problem as a multi-stage decision procedure and propose an efficient dynamic programming approach with two pruning policies to reduce the number of intermediate search states. Extensive experiments and real-life case evaluations demonstrate the efficiency of the proposed approaches.
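The abstract's core optimization, selecting a precision configuration per neural-network job to maximize accuracy reward under timing constraints, can be illustrated with a simplified knapsack-style dynamic program. This is a hypothetical sketch, not the paper's implementation: it collapses the real-time constraints into a single shared time budget, and the function name, job representation, and dominance-based pruning rule are all illustrative assumptions (the paper's multi-stage procedure with two pruning policies is only analogous in spirit).

```python
# Hypothetical sketch: pick one precision level per job so the total
# execution time fits a shared time budget while the summed accuracy
# reward is maximized. States are (used_time -> best reward so far);
# a simple pruning step keeps only non-dominated states.

def select_precisions(jobs, time_budget):
    """jobs: list of option lists; each option is (exec_time, accuracy_reward).
    Returns (best_reward, chosen_option_indices), or (None, None) if no
    assignment fits the budget."""
    dp = {0: (0.0, [])}  # used_time -> (total reward, chosen option per job)
    for options in jobs:
        nxt = {}
        for used, (reward, picks) in dp.items():
            for idx, (t, acc) in enumerate(options):
                u = used + t
                if u > time_budget:
                    continue  # violates the (simplified) real-time constraint
                cand = (reward + acc, picks + [idx])
                if u not in nxt or cand[0] > nxt[u][0]:
                    nxt[u] = cand
        # Pruning: scanning states by increasing time, drop any state whose
        # reward does not beat a cheaper state's reward (it is dominated).
        dp, best = {}, -1.0
        for u in sorted(nxt):
            if nxt[u][0] > best:
                dp[u] = nxt[u]
                best = nxt[u][0]
        if not dp:
            return None, None
    return max(dp.values(), key=lambda s: s[0])

# Two jobs, each with (time, reward) per precision (e.g. int8 vs fp16):
jobs = [[(2, 0.90), (4, 0.95)], [(3, 0.88), (6, 0.97)]]
print(select_precisions(jobs, time_budget=8))
```

With a budget of 8, the sketch picks the fast low-precision option for the first job so the slower, more accurate option for the second job still fits, which is exactly the kind of trade-off the multi-stage decision procedure explores.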