Neural-network-based learning algorithms for cooperative games of discrete-time multi-player systems with control constraints via adaptive dynamic programming
Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool in solving various optimal control problems. However, the cooperative game issues of discrete-time multi-player systems with control constraints have rarely been investigated in this field. In order...
| Published in | Neurocomputing (Amsterdam) Vol. 344; pp. 13 - 19 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Elsevier B.V, 07.06.2019 |
| ISSN | 0925-2312 1872-8286 |
| DOI | 10.1016/j.neucom.2018.02.107 |
| Abstract | Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool in solving various optimal control problems. However, the cooperative game issues of discrete-time multi-player systems with control constraints have rarely been investigated in this field. In order to address this issue, a novel policy iteration (PI) algorithm is proposed based on ADP technique, and its associated convergence analysis is also studied in this brief paper. For the proposed PI algorithm, an online neural network (NN) implementation scheme with multiple-network structure is presented. In the online NN-based learning algorithm, critic network, constrained actor networks and unconstrained actor networks are employed to approximate the value function, constrained and unconstrained control policies, respectively, and the NN weight updating laws are designed based on the gradient descent method. Finally, a numerical simulation example is illustrated to show the effectiveness. |
| Authors | Jiang, He; Zhang, Huaguang; Xie, Xiangpeng; Han, Ji |
| Author details | 1. He Jiang (ORCID 0000-0001-9841-3580; jianghescholar@163.com), College of Information Science and Engineering, Northeastern University, Box 134, Shenyang, 110819, PR China. 2. Huaguang Zhang (ORCID 0000-0002-8022-907X; hgzhang@ieee.org), College of Information Science and Engineering, Northeastern University, Box 134, Shenyang, 110819, PR China. 3. Xiangpeng Xie (xiexiangpeng1953@163.com), Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, Nanjing, 210003, PR China. 4. Ji Han (hanji0912@163.com), College of Information Science and Engineering, Northeastern University, Box 134, Shenyang, 110819, PR China. |
| Copyright | 2019 Elsevier B.V. |
| Discipline | Computer Science |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Approximate dynamic programming; Adaptive dynamic programming; Neural networks; Reinforcement learning |
| PageCount | 7 |
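The abstract mentions constrained actor networks and weight updating laws based on gradient descent. The following is only a minimal sketch of those two ideas, not the authors' algorithm: it assumes a linear-in-features approximator, a `tanh` saturation to keep the control within a bound `u_max`, and a squared-error loss for the critic. The names `dot`, `constrained_control`, `critic_gradient_step`, `u_max`, and `lr` are all hypothetical and chosen for illustration.

```python
import math


def dot(weights, features):
    """Inner product w . phi(x) of a weight vector and a feature vector."""
    return sum(w * f for w, f in zip(weights, features))


def constrained_control(weights, features, u_max):
    """Bounded actor output u = u_max * tanh(w . phi(x)).

    The tanh saturation is an assumption made for this sketch; it
    guarantees |u| <= u_max for any weights.
    """
    return u_max * math.tanh(dot(weights, features))


def critic_gradient_step(weights, features, target, lr):
    """One gradient-descent step on E = 0.5 * (w . phi(x) - target)^2."""
    error = dot(weights, features) - target
    return [w - lr * error * f for w, f in zip(weights, features)]


# Illustrative values only (not taken from the paper).
u = constrained_control([100.0, -3.0], [1.0, 0.5], u_max=2.0)
w_new = critic_gradient_step([0.0, 0.0], [1.0, 0.5], target=1.0, lr=0.5)
```

Even with a very large pre-activation the control stays inside the bound, and a single critic step moves the prediction closer to its target. A full policy iteration scheme for a multi-player cooperative game would additionally need the system dynamics, the shared cost function, and one actor per player, none of which are given in this record.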