Optimal AI Model Splitting and Resource Allocation for Device-Edge Co-Inference in Multi-User Wireless Sensing Systems

Bibliographic Details
Published in	IEEE Transactions on Wireless Communications, Vol. 23, no. 9, pp. 11094-11108
Main Authors	Li, Xian; Bi, Suzhi
Format	Journal Article
Language	English
Published	New York IEEE 01.09.2024 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithms; Artificial intelligence; Collaboration; collaborative inference; Combinatorial analysis; Computational modeling; Convexity; Couplings; Data models; Deep learning; Design optimization; Edge computing; Energy consumption; Inference; Linear programming; Machine learning; Mixed integer; mobile edge computing; model splitting; Nonlinear programming; Radio signals; Resource allocation; Sensors; Splitting; Task analysis; Wireless communication; Wireless sensing; Wireless sensor networks
ISSN	1536-1276 1558-2248
DOI	10.1109/TWC.2024.3378418

Summary:	With recent advances in artificial intelligence (AI), wireless sensing has become an attractive way to detect human activities accurately by analyzing the radio signal variations of sensor devices (SDs) with a well-trained AI model. However, because SDs have limited communication and computation resources, it is impractical to support energy- and delay-sensitive sensing services either by processing the entire heavy computation workload locally or by offloading all of it to the edge server (ES) for edge inference. To address this problem, we consider device-edge co-inference in a wireless sensing system where multiple users collaboratively perform a common inference task. In particular, the AI model deployed at each SD can be split into two sequential parts: each SD executes the first part locally and leaves the remaining part to be computed at the ES. We aim to minimize the energy consumption of the SDs subject to a prescribed inference latency requirement. To this end, we formulate a mixed-integer nonlinear programming (MINLP) problem that jointly optimizes the model splitting point and the system resource allocation, where the major difficulty lies in the tight coupling among the splitting decisions of collaborating SDs. To solve the problem, we propose an integrated learning and optimization algorithm named LOP, which tackles the combinatorial model splitting decision with a deep reinforcement learning (DRL)-based method and handles the remaining resource allocation problem via convex optimization. To gain engineering insight, we study the optimal model splitting design in a practical wireless indoor crowd counting system, where the optimal splitting point exhibits a threshold-based structure related to the user channel gain. Simulation results demonstrate that the proposed LOP algorithm achieves near-optimal energy performance, with an average optimality gap of 0.8%, while reducing computation delay a hundredfold.
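
The trade-off described in the summary can be illustrated with a minimal single-device sketch. This is not the LOP algorithm and not the paper's system model: it ignores the multi-user coupling that the summary names as the main difficulty, and simply enumerates the split points of one device to find the one with the lowest device energy that still meets a latency deadline. The function name `best_split`, the energy model (a constant energy per CPU cycle, one cycle per FLOP), the fixed transmit power, the Shannon-rate link and every numeric value are assumptions made for illustration only.

```python
import math


def best_split(layer_flops, out_bits, in_bits, gain, deadline,
               f_dev=1e9, f_edge=1e10, kappa=1e-27,
               p_tx=0.1, bandwidth=1e6, noise=1e-9):
    """Return (split, device_energy_J) or None if no split meets the deadline.

    split = k means the device runs layers [0, k) and the edge server runs
    the rest. All parameters are hypothetical; gain must be positive.
    """
    # Assumed link model: Shannon rate at fixed transmit power (bit/s).
    rate = bandwidth * math.log2(1.0 + p_tx * gain / noise)
    best = None
    for k in range(len(layer_flops) + 1):
        local_cycles = sum(layer_flops[:k])   # assumed 1 cycle per FLOP
        edge_cycles = sum(layer_flops[k:])
        # k == 0 uploads the raw input; otherwise the output of layer k-1.
        tx_bits = in_bits if k == 0 else out_bits[k - 1]
        t_tx = tx_bits / rate
        latency = local_cycles / f_dev + t_tx + edge_cycles / f_edge
        # Device energy: local computation plus transmission.
        energy = kappa * f_dev ** 2 * local_cycles + p_tx * t_tx
        if latency <= deadline and (best is None or energy < best[1]):
            best = (k, energy)
    return best


# Hypothetical three-layer model: per-layer workload and output sizes.
layers = [2e7, 2e7, 2e7]
outputs = [4e5, 1e5, 1e3]
for channel_gain in (1e-5, 1e-8):
    print(channel_gain, best_split(layers, outputs, 1e6, channel_gain, 1.0))
```

With these made-up numbers, the device with the strong channel offloads the raw input (split 0), while the device with the weak channel computes two layers locally before transmitting. This only loosely mirrors the dependence on channel gain mentioned in the summary and does not reproduce the paper's result.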