Neural-network-based learning algorithms for cooperative games of discrete-time multi-player systems with control constraints via adaptive dynamic programming

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 344, pp. 13-19
Main Authors: Jiang, He; Zhang, Huaguang; Xie, Xiangpeng; Han, Ji
Format: Journal Article
Language: English
Published: Elsevier B.V., 07.06.2019
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2018.02.107

Summary: Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool for solving various optimal control problems. However, cooperative games of discrete-time multi-player systems with control constraints have rarely been investigated in this field. To address this gap, a novel policy iteration (PI) algorithm based on the ADP technique is proposed, and its convergence is analyzed in this brief paper. For the proposed PI algorithm, an online neural network (NN) implementation scheme with a multiple-network structure is presented. In the online NN-based learning algorithm, a critic network, constrained actor networks, and unconstrained actor networks are employed to approximate the value function, the constrained control policies, and the unconstrained control policies, respectively, and the NN weight updating laws are designed based on the gradient descent method. Finally, a numerical simulation example illustrates the effectiveness of the proposed algorithm.
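
The policy iteration loop that the summary describes can be illustrated with a toy sketch. This is not the authors' algorithm or network structure. It assumes a hypothetical scalar two-player system, a shared quadratic cost, a polynomial critic fitted by least squares in place of a critic NN, and tanh-saturated linear actors updated by finite-difference gradient descent. The unconstrained actor networks mentioned in the summary are omitted. All names and constants are illustrative assumptions.

```python
import numpy as np

# Hypothetical scalar dynamics x' = A*x + B1*u1 + B2*u2 (not from the paper).
A, B1, B2 = 0.9, 0.1, 0.1
U_MAX = 0.5                        # assumed control bound |u_i| <= U_MAX
XS = np.linspace(-1.0, 1.0, 41)    # training states

def policy(theta, x):
    """Constrained actors: tanh keeps each control inside [-U_MAX, U_MAX]."""
    return U_MAX * np.tanh(theta[0] * x), U_MAX * np.tanh(theta[1] * x)

def step(theta, x):
    """Apply both players' controls; return next state and shared stage cost."""
    u1, u2 = policy(theta, x)
    cost = x**2 + u1**2 + u2**2    # one cost shared by both players (cooperative)
    return A * x + B1 * u1 + B2 * u2, cost

def features(x):
    """Critic basis: V(x) is approximated by w[0]*x^2 + w[1]*x^4."""
    return np.stack([x**2, x**4], axis=-1)

def evaluate(theta):
    """Policy evaluation: least-squares fit of V(x) - V(x') = cost on XS."""
    x_next, cost = step(theta, XS)
    w, *_ = np.linalg.lstsq(features(XS) - features(x_next), cost, rcond=None)
    return w

def improve(theta, w, lr=1.0, iters=200, eps=1e-5):
    """Policy improvement: gradient descent on mean[cost + V(x')] over XS."""
    def objective(th):
        x_next, cost = step(th, XS)
        return np.mean(cost + features(x_next) @ w)
    theta = theta.copy()
    for _ in range(iters):
        grad = np.zeros_like(theta)
        for i in range(theta.size):          # central finite differences
            d = np.zeros_like(theta)
            d[i] = eps
            grad[i] = (objective(theta + d) - objective(theta - d)) / (2 * eps)
        theta -= lr * grad
    return theta

def rollout_cost(theta, x0=1.0, horizon=300):
    """Accumulated shared cost along a closed-loop trajectory from x0."""
    x, total = x0, 0.0
    for _ in range(horizon):
        x, c = step(theta, x)
        total += c
    return total

def policy_iteration(n_iters=5):
    theta = np.zeros(2)    # zero control is admissible here because |A| < 1
    for _ in range(n_iters):
        w = evaluate(theta)          # critic update for the current policies
        theta = improve(theta, w)    # actor update against the new critic
    return theta, w

if __name__ == "__main__":
    theta, w = policy_iteration()
    print("actor gains:", theta, "critic weights:", w)
    print("cost with zero control:", rollout_cost(np.zeros(2)))
    print("cost with learned policies:", rollout_cost(theta))
```

The tanh saturation is one common way to enforce a symmetric input bound by construction; whether the paper uses this particular form is not stated in the summary.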