Joint Trajectory and Handover Management for UAVs Co-existing with Terrestrial Users: A Multi-Agent DRL Approach
| Published in | IEEE Transactions on Cognitive Communications and Networking, p. 1 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | IEEE, 2025 |
| ISSN | 2332-7731, 2372-2045 |
| DOI | 10.1109/TCCN.2025.3578506 |
| Summary | Despite increasing interest in cellular-connected unmanned aerial vehicles (UAVs), their integration into existing cellular networks poses substantial challenges, including intense interference from UAVs to terrestrial user equipments (UEs) and numerous redundant handovers. To jointly reduce the generated interference and redundant handovers of cellular-connected UAVs while keeping their transmission delay low, we define an optimization problem subject to constraints on total available bandwidth and quality of service (QoS). Then, we formulate the optimization problem as a decentralized partially observable Markov decision process (Dec-POMDP) in the context of a cooperative game. We further develop a collaborative trajectory and handover management scheme using a multi-agent deep reinforcement learning algorithm, specifically the Q-learning with a MIXer network (QMIX) algorithm, to jointly optimize the three metrics: delay, interference, and number of handovers. Simulation results demonstrate that QMIX significantly outperforms two benchmark schemes: the conventional handover management (CHM) scheme and the independent dueling double deep recurrent Q-network (ID3RQN) scheme. Compared with the CHM scheme, QMIX reduces the delay, interference, and number of handovers for UAVs by an average of 46.9%, 70.0%, and 90.5%, respectively. Compared with the ID3RQN scheme, QMIX reduces the three metrics by an average of 90.0%, 43.0%, and 41.7%, respectively. |