A novel actor–critic–identifier architecture for nonlinear multiagent systems with gradient descent method

Bibliographic Details
Published in Automatica (Oxford) Vol. 155; p. 111128
Main Authors Ming, Zhongyang; Zhang, Huaguang; Zhang, Juan; Xie, Xiangpeng
Format Journal Article
Language English
Published Elsevier Ltd 01.09.2023
ISSN 0005-1098, 1873-2836
DOI 10.1016/j.automatica.2023.111128

Summary: The infinite-horizon optimal consensus control problem for continuous-time unknown nonlinear multiagent systems (MASs) is solved via an online adaptive reinforcement learning approach. Three different neural network (NN) structures are proposed as part of a novel actor–critic–identifier (ACI) architecture to approximate the Hamilton–Jacobi–Bellman (HJB) equations. The actor and critic NNs approximate the optimal controls and optimal value functions, respectively, and the identifier NN approximates the unknown system model. The ACI architecture is simpler than most known architectures, which allows actor, critic, and identifier learning to occur concurrently and continuously without knowledge of the system model. The tuning laws for the three NNs are designed using the gradient descent method. Adaptive control techniques based on the Lyapunov method are used to examine the algorithm’s convergence. A simulation example verifies the viability of the proposed approach.
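As a rough illustration of what a gradient-descent tuning law for an identifier can look like, the sketch below fits a linear-in-weights approximator to an unknown function by descending the squared identification error. It is a toy, discrete-time example and not the tuning laws from the article: the basis functions in `basis`, the stand-in dynamics in `unknown_dynamics`, the step size, and the sampling range are all assumptions chosen for illustration.

```python
import numpy as np

def basis(x):
    # Hypothetical feature vector; the article's NN activations are not given in this record.
    return np.array([np.sin(x), np.cos(x)])

def unknown_dynamics(x):
    # Stand-in for the unknown system model; used only to generate training samples.
    return 1.5 * np.sin(x) - 0.7 * np.cos(x)

def train_identifier(steps=2000, lr=0.1, seed=0):
    """Tune identifier weights W by gradient descent on 0.5 * e**2."""
    rng = np.random.default_rng(seed)
    W = np.zeros(2)
    for _ in range(steps):
        x = rng.uniform(-np.pi, np.pi)       # sampled state
        phi = basis(x)
        e = W @ phi - unknown_dynamics(x)    # identification error
        W -= lr * e * phi                    # gradient of 0.5 * e**2 with respect to W
    return W

W_hat = train_identifier()
print(W_hat)
```

Because the stand-in dynamics lie exactly in the span of the chosen basis, the learned weights approach the coefficients used in `unknown_dynamics`. In the article's setting the actor and critic weights would be tuned alongside the identifier, which this sketch does not attempt.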