Deep Reinforcement Learning for UAV-Assisted Spectrum Sharing Under Partial Observability
Published in | IEEE Vehicular Technology Conference, pp. 1-6 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 10.10.2023 |
ISSN | 2577-2465 |
DOI | 10.1109/VTC2023-Fall60731.2023.10333853 |
Summary: | This paper proposes a dynamic spectrum sharing scheme for an unmanned aerial vehicle (UAV) assisted cognitive radio network. The UAV serves as a secondary base station that provides communication services to multiple secondary users (SUs) by adaptively exploiting the spatio-temporal spectrum opportunities of multiple device-to-device primary users (PUs), where each PU's spectrum occupancy follows a two-state Markov process. We jointly optimize the UAV's trajectory and user association to maximize its expected cumulative energy efficiency subject to the PUs' interference constraint. We formulate this problem as a partially observable Markov decision process (POMDP), in which the UAV can observe the spectrum occupancy status of adjacent PUs only. Because the PUs' spectrum occupancy statistics are unknown, we propose a model-free reinforcement learning algorithm, named partially observable double deep Q network (PO-DDQN), to obtain a near-optimal spectrum sharing policy. Simulation results show that the proposed algorithm outperforms the baseline policy gradient (PG) algorithm in both convergence speed and the UAV's energy efficiency. Additionally, spectrum utilization efficiency improves further when the UAV has a wider observation radius or when the PUs' spectrum occupancy exhibits stronger temporal correlation. |
---|---|
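
The two-state Markov occupancy model and the limited observation radius mentioned in the summary can be illustrated with a short sketch. This is not the paper's implementation: the function names, the transition probabilities, the PU coordinates, and the use of -1 to mark an unobserved PU are all assumptions made for illustration.

```python
import numpy as np

def step_occupancy(state, p_busy_to_idle, p_idle_to_busy, rng):
    """Advance each PU's occupancy (0 = idle, 1 = busy) by one time slot."""
    u = rng.random(state.shape)  # one uniform draw in [0, 1) per PU
    stays_busy = u >= p_busy_to_idle   # a busy PU stays busy unless it flips
    turns_busy = u < p_idle_to_busy    # an idle PU becomes busy with this probability
    return np.where(state == 1, stays_busy, turns_busy).astype(int)

def temporal_correlation(p_busy_to_idle, p_idle_to_busy):
    """Lag-one autocorrelation of a stationary two-state Markov chain."""
    return 1.0 - p_busy_to_idle - p_idle_to_busy

def observe(state, pu_xy, uav_xy, radius):
    """Return occupancy for PUs within `radius` of the UAV, and -1 (unknown) otherwise."""
    dist = np.linalg.norm(pu_xy - uav_xy, axis=1)
    return np.where(dist <= radius, state, -1)

# Hypothetical scenario: three PUs, with the UAV hovering above the first one.
rng = np.random.default_rng(0)
pu_xy = np.array([[0.0, 0.0], [30.0, 40.0], [200.0, 0.0]])
state = np.array([1, 0, 1])
obs = observe(state, pu_xy, np.array([0.0, 0.0]), radius=60.0)  # third PU is out of range
next_state = step_occupancy(state, p_busy_to_idle=0.1, p_idle_to_busy=0.2, rng=rng)
```

In this sketch, smaller transition probabilities give a `temporal_correlation` value closer to 1, so a past observation stays informative for longer. That is one way to read the summary's remark that stronger temporal correlation helps the UAV.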