Q-Learning-Based Robust Control for Nonlinear Systems With Mismatched Perturbations

This brief presents a novel optimal control (OC) approach based on $\mathcal{Q}$-learning to address robust control challenges for uncertain nonlinear systems subject to mismatched perturbations...

Bibliographic Details
Published in IEEE Transactions on Neural Networks and Learning Systems Vol. 36; no. 8; pp. 15547-15552
Main Authors Cui, Qian; Feng, Gang; Xu, Xuesong
Format Journal Article
Language English
Published United States IEEE 01.08.2025
ISSN 2162-237X, 2162-2388
DOI 10.1109/TNNLS.2025.3543336

Summary: This brief presents a novel optimal control (OC) approach based on $\mathcal{Q}$-learning to address robust control challenges for uncertain nonlinear systems subject to mismatched perturbations. Unlike conventional methodologies that solve the robust control problem directly, the approach reformulates the problem as the minimization of a value function that incorporates perturbation information. The $\mathcal{Q}$-function is then constructed by coupling the optimal value function with the Hamiltonian function. To estimate the parameters of the $\mathcal{Q}$-function, an integral reinforcement learning (IRL) technique is employed to develop a critic neural network (NN). Leveraging this parameterized $\mathcal{Q}$-function, the authors derive a model-free OC solution that generalizes the model-based formulation. Furthermore, using Lyapunov's direct method, the resulting closed-loop system is shown to be uniformly ultimately bounded (UUB). A case study is presented to demonstrate the effectiveness and applicability of the proposed approach.
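
Note: The summary's step of coupling the optimal value function with the Hamiltonian can be illustrated with the construction commonly used in continuous-time $\mathcal{Q}$-learning. The following is only a sketch under assumptions, not the paper's own formulation, which this record does not reproduce. The input-affine nominal dynamics $\dot{x} = f(x) + g(x)u$, the utility $r(x,u) = \rho(x) + u^{\top} R u$ (with $\rho(x)$ standing in for a state cost that absorbs the perturbation bound), the weight vector $W$, and the basis $\phi(x,u)$ are all hypothetical notation introduced here.

\begin{align}
V^{*}(x(t)) &= \min_{u} \int_{t}^{\infty} r\big(x(\tau), u(\tau)\big)\, d\tau, \\
H\big(x, u, \nabla V^{*}\big) &= r(x,u) + \nabla V^{*}(x)^{\top} \big( f(x) + g(x)u \big), \\
\mathcal{Q}(x,u) &= V^{*}(x) + H\big(x, u, \nabla V^{*}\big), \\
\min_{u} \mathcal{Q}(x,u) &= V^{*}(x) \quad \text{since } \min_{u} H\big(x, u, \nabla V^{*}\big) = 0, \\
\frac{\partial \mathcal{Q}}{\partial u} = 0 \;&\Longrightarrow\; u^{*}(x) = -\tfrac{1}{2} R^{-1} g(x)^{\top} \nabla V^{*}(x).
\end{align}

Under these assumptions, if the critic NN approximates $\mathcal{Q}(x,u) \approx W^{\top} \phi(x,u)$, the stationarity condition $\partial \mathcal{Q} / \partial u = 0$ can be solved for the control directly from the learned weights, without explicit knowledge of $f$ or $g$. This is one way a model-free solution can recover the model-based expression in the last line.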