Neural-network-based learning algorithms for cooperative games of discrete-time multi-player systems with control constraints via adaptive dynamic programming

Bibliographic Details
Published in Neurocomputing (Amsterdam) Vol. 344; pp. 13-19
Main Authors Jiang, He; Zhang, Huaguang; Xie, Xiangpeng; Han, Ji
Format Journal Article
Language English
Published Elsevier B.V. 07.06.2019
Subjects Adaptive dynamic programming; Approximate dynamic programming; Neural networks; Reinforcement learning
ISSN 0925-2312
EISSN 1872-8286
DOI 10.1016/j.neucom.2018.02.107

Abstract Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool for solving various optimal control problems. However, cooperative games of discrete-time multi-player systems with control constraints have rarely been investigated in this field. To address this gap, a novel policy iteration (PI) algorithm based on the ADP technique is proposed, and its convergence is analyzed in this brief paper. For the proposed PI algorithm, an online neural network (NN) implementation scheme with a multiple-network structure is presented. In the online NN-based learning algorithm, a critic network, constrained actor networks, and unconstrained actor networks are employed to approximate the value function, the constrained control policies, and the unconstrained control policies, respectively, and the NN weight updating laws are designed based on the gradient descent method. Finally, a numerical simulation example is presented to show the effectiveness of the proposed algorithm.
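The critic update mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the scalar two-player system, the tanh saturation used to enforce the control bound, the quadratic shared cost, the single-weight critic V(x) ~ w*x^2, and every name below (A, B1, B2, U_MAX, constrained_policy, train_critic, rollout_cost) are assumptions made only for illustration. The sketch covers a single policy-evaluation step, fitting the critic by gradient descent for fixed constrained policies, and leaves out policy improvement and the actor networks.

```python
import numpy as np

# Hypothetical scalar two-player system (not taken from the paper):
#   x_{k+1} = A*x_k + B1*u1_k + B2*u2_k
A, B1, B2 = 0.9, 0.1, 0.1
U_MAX = 0.5  # assumed symmetric control bound |u_i| <= U_MAX


def constrained_policy(x, gain):
    """Bounded control: tanh squashes the feedback gain*x into [-U_MAX, U_MAX]."""
    return U_MAX * np.tanh(gain * x)


def step(x, gain1, gain2):
    """Apply both players' controls; return the next state and the shared stage cost."""
    u1 = constrained_policy(x, gain1)
    u2 = constrained_policy(x, gain2)
    cost = x**2 + u1**2 + u2**2  # assumed quadratic cost shared by both players
    return A * x + B1 * u1 + B2 * u2, cost


def train_critic(gain1, gain2, lr=0.5, epochs=500):
    """Gradient descent on the squared Bellman residual of V(x) ~ w * x**2."""
    w = 0.0
    for _ in range(epochs):
        for x in np.linspace(-1.0, 1.0, 21):
            x_next, cost = step(x, gain1, gain2)
            residual = w * x**2 - cost - w * x_next**2
            w -= lr * residual * (x**2 - x_next**2)  # gradient of 0.5*residual**2 in w
    return w


def rollout_cost(x0, gain1, gain2, steps=500):
    """Simulated cost-to-go, used only to sanity-check the trained critic."""
    total, x = 0.0, x0
    for _ in range(steps):
        x, cost = step(x, gain1, gain2)
        total += cost
    return total


w = train_critic(-1.0, -1.0)
print(w, rollout_cost(1.0, -1.0, -1.0))
```

With both hypothetical gains set to -1, the learned weight should land close to the simulated cost-to-go from x = 1, which is the consistency a critic is meant to provide before any policy improvement step is taken.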
Author_xml – sequence: 1
  givenname: He
  orcidid: 0000-0001-9841-3580
  surname: Jiang
  fullname: Jiang, He
  email: jianghescholar@163.com
  organization: College of Information Science and Engineering, Northeastern University, Box 134, Shenyang, 110819, PR China
– sequence: 2
  givenname: Huaguang
  orcidid: 0000-0002-8022-907X
  surname: Zhang
  fullname: Zhang, Huaguang
  email: hgzhang@ieee.org
  organization: College of Information Science and Engineering, Northeastern University, Box 134, Shenyang, 110819, PR China
– sequence: 3
  givenname: Xiangpeng
  surname: Xie
  fullname: Xie, Xiangpeng
  email: xiexiangpeng1953@163.com
  organization: Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, Nanjing, 210003, PR China
– sequence: 4
  givenname: Ji
  surname: Han
  fullname: Han, Ji
  email: hanji0912@163.com
  organization: College of Information Science and Engineering, Northeastern University, Box 134, Shenyang, 110819, PR China
Copyright 2019 Elsevier B.V.
Discipline Computer Science
IsPeerReviewed true
IsScholarly true
Keywords Approximate dynamic programming
Adaptive dynamic programming
Neural networks
Reinforcement learning
PageCount 7