TD3 Agent-Based Nonlinear Dynamic Inverse Control for Fixed-Wing UAV Attitudes

Bibliographic Details
Published in: IEEE Transactions on Intelligent Transportation Systems, pp. 1-12
Main Authors: Hu, Wenjun; Wang, Yujie; Chen, Qingyang; Wang, Peng; Wu, Erdong; Guo, Zheng; Hou, Zhongxi
Format: Journal Article
Language: English
Published: IEEE, 01.05.2025
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2025.3561517

Summary: To enhance the robustness of the nonlinear dynamic inverse (NDI) technique in the presence of model uncertainties, this study introduces a control scheme that integrates a reinforcement learning (RL) agent with the NDI approach. Initially, a fixed-wing unmanned aerial vehicle (UAV) is selected as the primary subject of investigation, and its attitude dynamics model is established. A control system employing a nonlinear disturbance observer within a dynamic inverse framework has been developed based on this model. Subsequently, the stability of the resulting control system is verified through Lyapunov analysis. Following this, a twin delayed deep deterministic policy gradient (TD3) agent is introduced, with the closed-loop system serving as the training environment. Through continuous interaction with its surroundings, the agent learns to dynamically adjust control parameters in response to control errors. Ultimately, the trained RL agent is utilized to optimize the control parameters for the dynamic system, and a flight simulation of the fixed-wing UAV's attitude control is conducted. The simulation results demonstrate that the control parameters can be adaptively adjusted using the TD3-NDI method, which mitigates overshoot and suppresses oscillations during the control process. These findings confirm the effectiveness and robustness of the proposed control strategy.
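Illustration: The idea in the summary, a dynamic-inverse control law whose gain is adjusted in response to the control error, can be sketched on a single-axis toy system. This is not the paper's UAV model, observer, or trained agent. The plant coefficients, the nominal model, and every function name below are hypothetical, and `stand_in_policy` is a hand-written placeholder where the paper would use a trained TD3 actor.

```python
import math

def true_dynamics(x, u):
    # Hypothetical "real" plant: x_dot = f(x) + g*u.
    # Its coefficients differ from the controller's nominal model (model uncertainty).
    return -1.2 * x * abs(x) + 0.9 * u

def ndi_control(x, x_ref, gain):
    # Nominal model assumed by the controller: f_hat(x) = -x*|x|, g_hat = 1.
    f_hat = -1.0 * x * abs(x)
    g_hat = 1.0
    v = gain * (x_ref - x)        # desired first-order error dynamics
    return (v - f_hat) / g_hat    # invert the nominal model to get the input

def stand_in_policy(error):
    # Placeholder for a trained actor (assumption): maps the control error
    # to a gain in (2, 6], lower for large errors and higher near the target.
    return 2.0 + 4.0 * math.exp(-5.0 * abs(error))

def simulate(x0=0.0, x_ref=0.5, dt=0.01, steps=1000):
    # Forward-Euler simulation of the closed loop; returns (state, gain) pairs.
    x = x0
    history = []
    for _ in range(steps):
        gain = stand_in_policy(x_ref - x)
        u = ndi_control(x, x_ref, gain)
        x += dt * true_dynamics(x, u)
        history.append((x, gain))
    return history

if __name__ == "__main__":
    final_x, final_gain = simulate()[-1]
    print(f"final state {final_x:.4f}, final gain {final_gain:.3f}")
```

Because the nominal model is deliberately mismatched, the inversion is imperfect and the toy loop settles slightly short of the reference. That residual is the kind of effect a disturbance observer is meant to compensate.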