Joint Trajectory and Handover Management for UAVs Co-existing with Terrestrial Users: A Multi-Agent DRL Approach

Bibliographic Details
Published in	IEEE transactions on cognitive communications and networking p. 1
Main Authors	Deng, Yuhang; Zhang, Shuai; Meer, Irshad A.; Ozger, Mustafa; Cavdar, Cicek
Format	Journal Article
Language	English
Published	IEEE 2025
Subjects	Autonomous aerial vehicles; Cellular networks; Cellular-connected UAVs; Delays; Handover; Handover management; Interference; Multi-agent deep reinforcement learning; Multi-objective optimization; Optimization; Resource management; Throughput; Training; Trajectory; Trajectory design
ISSN	2332-7731 2372-2045
DOI	10.1109/TCCN.2025.3578506

More Information
Summary:	Despite increasing interest in cellular-connected unmanned aerial vehicles (UAVs), their integration into existing cellular networks poses substantial challenges, including intense interference from UAVs to terrestrial user equipments (UEs) and numerous redundant handovers. To jointly reduce the interference generated by cellular-connected UAVs and their redundant handovers while keeping their transmission delay low, we define an optimization problem subject to constraints on total available bandwidth and quality of service (QoS). We then formulate the optimization problem as a decentralized partially observable Markov decision process (Dec-POMDP) in the context of a cooperative game. We further develop a collaborative trajectory and handover management scheme using a multi-agent deep reinforcement learning algorithm, specifically Q-learning with a MIXer network (QMIX), to jointly optimize these three metrics: delay, interference, and number of handovers. Simulation results demonstrate that QMIX significantly outperforms two benchmark schemes: the conventional handover management (CHM) scheme and the independent dueling double deep recurrent Q-network (ID3RQN) scheme. Compared with the CHM scheme, QMIX reduces the delay, interference, and number of handovers for UAVs by an average of 46.9%, 70.0%, and 90.5%, respectively. Compared with the ID3RQN scheme, QMIX reduces the three metrics by an average of 90.0%, 43.0%, and 41.7%, respectively.
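The summary names QMIX but does not describe its mixing step. As a rough illustration only, the sketch below shows the general idea behind QMIX-style value mixing: per-agent Q-values are combined into a joint value using state-dependent weights that are forced to be non-negative, so raising any one agent's Q-value can never lower the joint value. This is not the authors' implementation. The single linear mixing layer, the linear "hypernetwork", and all names (`mix_q_values`, `hyper_w`, `hyper_b`) are assumptions made for brevity; the published QMIX algorithm uses a deeper mixing network.

```python
import random


def mix_q_values(agent_qs, state, hyper_w, hyper_b):
    """Simplified monotonic mixer: Q_tot = sum_i |w_i(s)| * Q_i + b(s).

    Hypothetical single-layer stand-in for a QMIX-style mixing network.
    """
    # A linear map of the global state yields one weight per agent;
    # abs() keeps each weight non-negative, which gives monotonicity.
    weights = [abs(sum(h * s for h, s in zip(row, state))) for row in hyper_w]
    # The state-dependent bias is unconstrained in sign.
    bias = sum(h * s for h, s in zip(hyper_b, state))
    return sum(w * q for w, q in zip(weights, agent_qs)) + bias


rng = random.Random(0)
n_agents, state_dim = 3, 4  # illustrative sizes, not taken from the paper
hyper_w = [[rng.uniform(-1, 1) for _ in range(state_dim)] for _ in range(n_agents)]
hyper_b = [rng.uniform(-1, 1) for _ in range(state_dim)]
state = [rng.uniform(-1, 1) for _ in range(state_dim)]

# Raising only the second agent's Q-value must not decrease the joint value.
base = mix_q_values([0.2, -0.5, 1.0], state, hyper_w, hyper_b)
raised = mix_q_values([0.2, 0.5, 1.0], state, hyper_w, hyper_b)
```

Because the joint value is non-decreasing in each agent's Q-value, each UAV agent can pick its own greedy action locally while remaining consistent with the greedy joint action, which is what permits decentralized execution after centralized training.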