Edge Video Analytics with Adaptive Information Gathering: A Deep Reinforcement Learning Approach


Bibliographic Details
Published in	IEEE transactions on wireless communications Vol. 22; no. 9; p. 1
Main Authors	Wang, Shuoyao, Bi, Suzhi, Zhang, Ying-Jun Angela
Format	Journal Article
Language	English
Published	New York IEEE 01.09.2023 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Accuracy; Algorithms; Bandwidths; Channel allocation; Decision analysis; Decomposition; Deep learning; Deep Reinforcement Learning; Degradation; Edge computing; Markov processes; Markov Decision Process; Mathematical analysis; Mobile computing; Multi-access Edge Computing; Network latency; Optimization; Public safety; Real-time systems; Resource Allocation; Resource management; Servers; Streaming media; Tradeoffs; Video Analytics; Visual analytics
ISSN	1536-1276 1558-2248
DOI	10.1109/TWC.2023.3237202

Summary:	With the growing deployment of public safety and transportation infrastructure cameras, demand for automatic mobile video analytics is increasing. Multi-access edge computing (MEC) has recently been applied to improve the accuracy-latency tradeoff of mobile video analytics. In this paper, we study an MEC-enabled multi-device video analytics system and formulate the problem as a Markov decision process (MDP) to address two practical challenges: i) the absence of ground truth in real time and ii) the content-varying degradation-accuracy relation. In particular, we aim to design an online joint frame degradation and bandwidth allocation algorithm that copes with this time-varying relation and with limited feedback from each device. Thanks to the MDP formulation and the n-step return technique, the long-term objective encourages adaptive information gathering and thus improves the average accuracy and latency. For sample efficiency, we decompose the MDP problem into discrete degradation adaptation subproblems and continuous bandwidth allocation subproblems. Based on this decomposition, we propose a deep reinforcement learning (DRL) based framework, referred to as DBAG, to solve the subproblems. DBAG integrates model-based optimization and model-free DRL to solve the MDP problem with a discrete-continuous hybrid action space. Under various network setups and public datasets, DBAG greatly improves the accuracy-latency tradeoff.
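The summary names the n-step return technique without defining it. As a rough illustration only, the sketch below computes the textbook n-step return, i.e. the discounted sum of n observed rewards plus a discounted bootstrap value estimate for the state reached after n steps. The function name, argument names, and the choice of reward are assumptions made for this example; the record does not say how DBAG defines its reward or value estimate.

```python
from typing import Sequence


def n_step_return(rewards: Sequence[float], bootstrap_value: float, gamma: float) -> float:
    """Textbook n-step return (hypothetical helper, not taken from the paper).

    Computes r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_value,
    where n = len(rewards) and bootstrap_value estimates the value of the
    state reached after the n-th step.
    """
    g = bootstrap_value
    # Fold the rewards in from the last step back to the first.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three rewards of 1.0, a bootstrap estimate of 10.0, and discount 0.5.
example = n_step_return([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5)
```

With a larger n, more of the target comes from observed rewards and less from the bootstrap estimate, which is the usual reason for preferring multi-step targets when feedback is sparse or delayed.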