A novel actor–critic–identifier architecture for nonlinear multiagent systems with gradient descent method

The infinite-horizon optimal consensus control problem for continuous-time unknown nonlinear multiagent systems (MASs) is solved via an online adaptive reinforcement learning approach. Three different neural network (NN) structures are proposed as part of a novel actor–critic–identifier (ACI) to app...

Bibliographic Details
Published in Automatica (Oxford) Vol. 155; p. 111128
Main Authors Ming, Zhongyang; Zhang, Huaguang; Zhang, Juan; Xie, Xiangpeng
Format Journal Article
Language English
Published Elsevier Ltd 01.09.2023
ISSN 0005-1098
EISSN 1873-2836
DOI 10.1016/j.automatica.2023.111128


Abstract The infinite-horizon optimal consensus control problem for continuous-time unknown nonlinear multiagent systems (MASs) is solved via an online adaptive reinforcement learning approach. Three different neural network (NN) structures are proposed as part of a novel actor–critic–identifier (ACI) architecture to approximate the Hamilton–Jacobi–Bellman (HJB) equations. The actor and critic NNs approximate the optimal controls and optimal value functions, respectively, and the identifier NN approximates the unknown system model. The ACI architecture is simpler than most known architectures, which allows actor, critic, and identifier learning to occur concurrently and continuously without the system model. The tuning laws for the three NNs are designed using the gradient descent method. Adaptive control techniques based on the Lyapunov method are used to examine the algorithm’s convergence. A simulation example is offered to verify the viability of the proposed approach.
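Illustrative note (not part of the published record): the abstract states that the NN tuning laws are designed by gradient descent. The sketch below shows only the general idea for a critic, under strong simplifying assumptions that are not taken from the paper: a single agent, a known scalar system xdot = -x + u, a fixed feedback u = -k*x, running cost x^2 + u^2, and a one-weight critic V(x) = w*x^2. The function name, the sample grid, and the step size are hypothetical choices. It performs gradient descent on the squared Bellman residual and should not be read as the authors' ACI algorithm.

```python
def fit_critic_weight(k=0.5, alpha=0.05, steps=2000):
    """Fit w in V(x) = w * x**2 by gradient descent on the squared Bellman residual.

    Toy setting (an assumption for illustration): xdot = -x + u, u = -k * x,
    running cost x**2 + u**2.
    """
    # Fixed grid of sample states; sampling stands in for the excitation
    # that an online, trajectory-driven tuning law would require.
    xs = [i / 10.0 for i in range(-10, 11)]
    w = 0.0
    for _ in range(steps):
        grad = 0.0
        for x in xs:
            u = -k * x
            xdot = -x + u
            sigma = 2.0 * x * xdot             # derivative of the residual w.r.t. w
            delta = w * sigma + x * x + u * u  # Bellman residual for the current w
            grad += delta * sigma              # gradient of 0.5 * delta**2
        w -= alpha * grad / len(xs)            # averaged gradient-descent step
    return w
```

In this toy setting the residual is x^2 * ((1 + k^2) - 2*w*(1 + k)), so it vanishes at w = (1 + k^2) / (2*(1 + k)); the iteration can be checked against that closed form.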
ArticleNumber 111128
Author Zhang, Juan
Ming, Zhongyang
Zhang, Huaguang
Xie, Xiangpeng
Author_xml – sequence: 1
  givenname: Zhongyang
  surname: Ming
  fullname: Ming, Zhongyang
  email: zhongyangming427@163.com
  organization: The School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, 110004, PR China
– sequence: 2
  givenname: Huaguang
  surname: Zhang
  fullname: Zhang, Huaguang
  email: hgzhang@ieee.org
  organization: State Key Laboratory of Synthetical Automation for Process Industries. The School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, 110004, PR China
– sequence: 3
  givenname: Juan
  surname: Zhang
  fullname: Zhang, Juan
  email: zjneu11@163.com
  organization: The School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, 110004, PR China
– sequence: 4
  givenname: Xiangpeng
  surname: Xie
  fullname: Xie, Xiangpeng
  email: xiexp@njupt.edu.cn
  organization: Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, Nanjing, 210023, PR China
CitedBy_id crossref_primary_10_1109_TTE_2024_3402316
crossref_primary_10_1007_s11432_023_3843_7
crossref_primary_10_1002_oca_3174
crossref_primary_10_1109_TCYB_2024_3403680
crossref_primary_10_1109_TCYB_2024_3404010
crossref_primary_10_3390_math12243916
crossref_primary_10_1007_s11071_024_10799_1
crossref_primary_10_1016_j_ins_2025_122072
crossref_primary_10_1109_TCYB_2024_3443867
crossref_primary_10_1109_TCSII_2023_3335343
crossref_primary_10_1002_rnc_6983
crossref_primary_10_1038_s41598_024_65463_w
crossref_primary_10_1002_rnc_7256
crossref_primary_10_1049_cth2_12610
Cites_doi 10.1109/TNN.2005.863458
10.1016/j.automatica.2012.06.096
10.1016/j.automatica.2012.05.049
10.1109/TNN.2011.2168538
10.1016/j.automatica.2020.109451
10.1109/TCYB.2019.2903117
10.1109/TNN.2009.2027233
10.1016/j.automatica.2014.10.056
10.1016/j.automatica.2017.03.022
10.1109/TCYB.2020.3027344
10.1016/j.automatica.2012.09.019
10.1109/TSMC.2020.3003224
10.1016/j.automatica.2012.05.074
10.1109/TCYB.2018.2859801
10.1109/TCYB.2016.2523878
10.1109/JAS.2021.1004359
10.1109/TNNLS.2016.2642128
10.1109/TNNLS.2021.3051030
10.1109/TNNLS.2016.2586303
10.1109/TSMCB.2012.2203336
10.1016/j.automatica.2010.10.033
10.1109/TPEL.2021.3132028
10.1109/TFUZZ.2014.2310238
10.1016/j.neunet.2012.02.005
10.1109/TSMC.2020.3011184
10.1109/JAS.2021.1003838
10.1109/TFUZZ.2022.3145809
ContentType Journal Article
Copyright 2023
Copyright_xml – notice: 2023
DBID AAYXX
CITATION
DOI 10.1016/j.automatica.2023.111128
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 1873-2836
ExternalDocumentID 10_1016_j_automatica_2023_111128
S0005109823002881
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61627809; 62022044
  funderid: http://dx.doi.org/10.13039/501100001809
– fundername: Liaoning Revitalization Talents Program
  grantid: XLYC1801005
– fundername: National Key R&D Program of China
  grantid: 2018YFA0702200
  funderid: http://dx.doi.org/10.13039/501100012166
– fundername: Natural Science Foundation of Liaoning Province of China
  grantid: 2022JH25/10100008
ID FETCH-LOGICAL-c318t-dd3e8991dd65e53865620e5033e962e0c52f794d04c24b02a0c3359b033678a13
IEDL.DBID .~1
ISSN 0005-1098
IngestDate Wed Oct 01 05:25:40 EDT 2025
Thu Apr 24 22:59:40 EDT 2025
Fri Feb 23 02:35:57 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Reinforcement learning (RL)
Nonlinear multiagent systems
Consensus problem
Model free
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c318t-dd3e8991dd65e53865620e5033e962e0c52f794d04c24b02a0c3359b033678a13
ParticipantIDs crossref_primary_10_1016_j_automatica_2023_111128
crossref_citationtrail_10_1016_j_automatica_2023_111128
elsevier_sciencedirect_doi_10_1016_j_automatica_2023_111128
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate September 2023
2023-09-00
PublicationDateYYYYMMDD 2023-09-01
PublicationDate_xml – month: 09
  year: 2023
  text: September 2023
PublicationDecade 2020
PublicationTitle Automatica (Oxford)
PublicationYear 2023
Publisher Elsevier Ltd
Publisher_xml – name: Elsevier Ltd
SSID ssj0004182
Score 2.5539823
SourceID crossref
elsevier
SourceType Enrichment Source
Index Database
Publisher
StartPage 111128
SubjectTerms Consensus problem
Model free
Nonlinear multiagent systems
Reinforcement learning (RL)
Title A novel actor–critic–identifier architecture for nonlinear multiagent systems with gradient descent method
URI https://dx.doi.org/10.1016/j.automatica.2023.111128
Volume 155