A lightweight neural network search algorithm based on in-place distillation and performance prediction for hardware-aware optimization

Bibliographic Details
Published in: Engineering Applications of Artificial Intelligence, Vol. 151, p. 110775
Main Authors: Kang, Siyuan; Sun, Yinghao; Li, Shuguang; Xu, Yaozong; Li, Yuke; Chen, Guangjie; Xue, Fei
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.07.2025
ISSN: 0952-1976
DOI: 10.1016/j.engappai.2025.110775

Summary: Due to the limited computing resources of edge devices, traditional object detection algorithms struggle to meet the efficiency and accuracy requirements of autonomous driving. Consequently, designing a neural network model that balances hardware resource requirements, operating speed, and accuracy is crucial. To address this, by integrating the algorithm design with hardware characteristics, we propose a lightweight neural architecture search algorithm based on in-place distillation and a performance predictor (LNIP). Initially, we optimize the operators of You Only Look Once version 8 nano (YOLOv8n) and dynamically adjust its network structure. Then, we train a super-network using a progressive shrinking strategy, the sandwich rule, and in-place distillation. Subsequently, we employ a Gaussian process to model the relationship between network architecture and accuracy, using architecture encoding and a custom kernel function to build a high-performance predictor. Finally, during the search process, we introduce a reward function based on Pareto optimality to balance model performance against hardware constraints. Building on this foundation, we design an efficient search algorithm based on the performance predictor that progressively explores the optimal network structure tailored to hardware characteristics. We compare our lightweight network with state-of-the-art methods on the BDD100K, COCO, and PASCAL VOC datasets and deploy it on the Black Sesame A1000 and NVIDIA Xavier platforms for comprehensive evaluation. On the NVIDIA Xavier, the lightweight network achieves a latency of 11.81 ms and an edge precision of 46.1%. These experimental results demonstrate that our method outperforms existing methods in balancing hardware constraints and model performance.
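
The super-network training described in the abstract (sandwich rule plus in-place distillation) can be illustrated with a minimal sketch. The elastic network below is a toy MLP with a switchable hidden width, not the paper's elastic YOLOv8n; the class, its API, and the use of a classification loss are illustrative assumptions, since the authors' actual detection losses and supernet code are not given in the record.

```python
# Minimal sketch: one sandwich-rule training step with in-place distillation.
# The largest subnet is supervised by ground truth; the smallest and a few
# randomly sampled subnets are distilled from the largest subnet's outputs.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyElasticNet(nn.Module):
    """Weight-shared net: each subnet uses the first `width` hidden units."""
    WIDTHS = [16, 32, 64]  # smallest ... largest subnet choices

    def __init__(self, in_dim=8, out_dim=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, max(self.WIDTHS))
        self.fc2 = nn.Linear(max(self.WIDTHS), out_dim)
        self.width = max(self.WIDTHS)

    def forward(self, x):
        h = F.relu(F.linear(x, self.fc1.weight[: self.width],
                            self.fc1.bias[: self.width]))
        return F.linear(h, self.fc2.weight[:, : self.width], self.fc2.bias)

def sandwich_step(net, x, y, optimizer, num_random=2):
    optimizer.zero_grad()
    # 1) Largest subnet: supervised loss on ground truth; its detached
    #    logits then serve as the in-place distillation teacher.
    net.width = max(net.WIDTHS)
    teacher_logits = net(x)
    F.cross_entropy(teacher_logits, y).backward()
    soft_targets = F.softmax(teacher_logits.detach(), dim=-1)
    # 2) Smallest + random subnets: distill toward the teacher's outputs.
    widths = [min(net.WIDTHS)] + random.choices(net.WIDTHS, k=num_random)
    for w in widths:
        net.width = w
        loss = F.kl_div(F.log_softmax(net(x), dim=-1), soft_targets,
                        reduction="batchmean")
        loss.backward()  # gradients from all subnets accumulate in-place
    optimizer.step()

net = ToyElasticNet()
opt = torch.optim.SGD(net.parameters(), lr=0.1)
sandwich_step(net, torch.randn(4, 8), torch.randint(0, 10, (4,)), opt)
```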
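The Gaussian-process accuracy predictor can likewise be sketched. The fixed-length encoding of per-stage depths, widths, and kernel sizes and the RBF-plus-noise kernel below are assumptions standing in for the paper's unspecified encoding and custom kernel; the training pairs are illustrative values, not measured results.

```python
# Minimal sketch: a GP that maps encoded architectures to predicted accuracy.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def encode(arch):
    # Hypothetical encoding: concatenate per-stage depth, width, kernel size.
    return np.array(arch["depths"] + arch["widths"] + arch["kernels"], float)

# A handful of (architecture, measured accuracy) pairs sampled from the
# supernet; the numbers here are placeholders for illustration only.
train_archs = [
    {"depths": [2, 2, 3], "widths": [32, 64, 128], "kernels": [3, 3, 5]},
    {"depths": [1, 2, 2], "widths": [16, 32, 64],  "kernels": [3, 5, 3]},
    {"depths": [3, 3, 4], "widths": [64, 96, 160], "kernels": [5, 5, 5]},
]
train_acc = np.array([0.41, 0.33, 0.46])

X = np.stack([encode(a) for a in train_archs])
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, train_acc)

# Predict mean accuracy and uncertainty for an unseen candidate architecture.
candidate = {"depths": [2, 3, 3], "widths": [32, 64, 96], "kernels": [3, 5, 5]}
mean, std = gp.predict(encode(candidate)[None, :], return_std=True)
print(f"predicted accuracy: {mean[0]:.3f} +/- {std[0]:.3f}")
```

The predictor's uncertainty estimate is what makes a Gaussian process attractive here: the search can favor candidates whose predicted accuracy is both high and reliable, instead of evaluating every candidate on real hardware.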
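Finally, the Pareto-style reward that trades predicted accuracy against a hardware latency budget can be sketched as below. The exponential form follows the well-known MnasNet multi-objective reward and is an assumption; the paper's exact reward function is not given in this record, and the budget and exponent values are hypothetical.

```python
# Minimal sketch: a multi-objective reward balancing accuracy and latency.
def reward(pred_accuracy: float, latency_ms: float,
           target_ms: float = 12.0, beta: float = -0.07) -> float:
    # Accuracy is scaled by (latency / target)^beta: candidates near or under
    # the budget keep most of their accuracy score, while candidates far over
    # the budget are penalized, pushing the search toward the Pareto front.
    return pred_accuracy * (latency_ms / target_ms) ** beta

# Each search candidate is scored with the GP's predicted accuracy and a
# latency measured (or looked up) on the target device.
print(reward(0.46, 11.8))   # near the budget: reward tracks accuracy
print(reward(0.48, 20.0))   # over budget: penalized despite higher accuracy
```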