A novel actor–critic–identifier architecture for nonlinear multiagent systems with gradient descent method

Bibliographic Details
Published in	Automatica (Oxford) Vol. 155; p. 111128
Main Authors	Ming, Zhongyang; Zhang, Huaguang; Zhang, Juan; Xie, Xiangpeng
Format	Journal Article
Language	English
Published	Elsevier Ltd 01.09.2023
Subjects	Consensus problem; Model free; Nonlinear multiagent systems; Reinforcement learning (RL)
ISSN	0005-1098 1873-2836
DOI	10.1016/j.automatica.2023.111128

Summary:	The infinite-horizon optimal consensus control problem for continuous-time unknown nonlinear multiagent systems (MASs) is solved via an online adaptive reinforcement learning approach. Three different neural network (NN) structures are proposed as part of a novel actor–critic–identifier (ACI) architecture to approximate the solutions of the Hamilton–Jacobi–Bellman (HJB) equations. The actor and critic NNs approximate the optimal controls and optimal value functions, respectively, and the identifier NN approximates the unknown system model. The ACI architecture is simpler than most known architectures, which allows actor, critic, and identifier learning to occur concurrently and continuously without the system model. Notably, the tuning laws for the three NNs are designed using the gradient descent method. Adaptive control techniques based on the Lyapunov method are used to examine the algorithm’s convergence. A simulation example verifies the viability of the proposed approach.
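
The summary states that the NN tuning laws are designed by gradient descent but does not give them. As a rough illustration of that idea only, the sketch below tunes a linear-in-parameters identifier with a normalized gradient-descent step on the squared approximation error. It is not the paper's tuning law: the polynomial basis `features`, the toy drift term `true_dynamics`, the learning rate, the discrete-time update, and the sampling sweep are all assumptions made for this example.

```python
def features(x):
    """Hypothetical polynomial basis phi(x); the paper's NN basis is not stated here."""
    return [x, x ** 3]


def predict(weights, phi):
    """Linear-in-parameters approximator: W^T phi."""
    return sum(w * p for w, p in zip(weights, phi))


def gradient_step(weights, phi, target, rate=0.5):
    """One normalized gradient-descent step on E = 0.5 * e**2, with e = W^T phi - target.

    Returns the updated weights and the error measured before the update.
    """
    error = predict(weights, phi) - target
    norm = 1.0 + sum(p * p for p in phi)  # normalization keeps the step bounded
    new_weights = [w - rate * error * p / norm for w, p in zip(weights, phi)]
    return new_weights, error


def true_dynamics(x):
    """Assumed toy drift term, used only to generate training targets."""
    return -1.0 * x + 0.5 * x ** 3


def train_identifier(steps=2000):
    """Tune the identifier weights online over a deterministic sweep of x in [-1, 1]."""
    weights = [0.0, 0.0]
    for k in range(steps):
        x = -1.0 + 2.0 * ((k * 37) % 101) / 100.0
        weights, _ = gradient_step(weights, features(x), true_dynamics(x))
    return weights


if __name__ == "__main__":
    # The generating weights in this toy setup are [-1.0, 0.5].
    print(train_identifier())
```

In an actor–critic–identifier scheme, analogous updates for the actor and critic weights would run in the same loop, each descending its own error signal; those error definitions depend on the paper's HJB formulation and are not reproduced here.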