Neural-network-based learning algorithms for cooperative games of discrete-time multi-player systems with control constraints via adaptive dynamic programming

Bibliographic Details
Published in	Neurocomputing (Amsterdam) Vol. 344; pp. 13 - 19
Main Authors	Jiang, He; Zhang, Huaguang; Xie, Xiangpeng; Han, Ji
Format	Journal Article
Language	English
Published	Elsevier B.V. 07.06.2019
Subjects	Adaptive dynamic programming; Approximate dynamic programming; Neural networks; Reinforcement learning
ISSN	0925-2312 1872-8286
DOI	10.1016/j.neucom.2018.02.107

Summary:	Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool for solving various optimal control problems. However, cooperative games of discrete-time multi-player systems with control constraints have rarely been investigated in this field. To address this problem, a novel policy iteration (PI) algorithm is proposed based on the ADP technique, and its convergence is analyzed in this brief paper. For the proposed PI algorithm, an online neural network (NN) implementation scheme with a multiple-network structure is presented. In the online NN-based learning algorithm, a critic network, constrained actor networks, and unconstrained actor networks are employed to approximate the value function, the constrained control policies, and the unconstrained control policies, respectively, and the NN weight updating laws are designed based on the gradient descent method. Finally, a numerical simulation example demonstrates the effectiveness of the proposed approach.
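
The critic update described in the summary can be illustrated with a small sketch. This is not the algorithm from the paper: the scalar dynamics, the control bound `U_MAX`, the feedback gains, the quadratic shared cost, and the polynomial critic basis (standing in for a neural network) are all hypothetical choices made for illustration. It covers only one policy evaluation step, in which two players with tanh-bounded controls share a single value function and the critic weights are fitted by gradient descent on the squared Bellman residual. Policy improvement and the actor updates are omitted.

```python
import numpy as np

U_MAX = 1.0  # hypothetical control bound shared by both players


def constrained_policy(x, gain):
    # Squash the unconstrained action -gain * x into [-U_MAX, U_MAX] with tanh.
    return U_MAX * np.tanh(-gain * x / U_MAX)


def step(x, u1, u2):
    # Hypothetical scalar dynamics: x_{k+1} = 0.9 x_k + 0.1 u1 + 0.2 u2.
    return 0.9 * x + 0.1 * u1 + 0.2 * u2


def stage_cost(x, u1, u2):
    # Shared (cooperative) quadratic cost; an assumption of this sketch.
    return x**2 + u1**2 + u2**2


def features(x):
    # Linear-in-weights critic: V(x) ~ w[0] * x^2 + w[1] * x^4.
    return np.stack([x**2, x**4], axis=-1)


def bellman_residual(w, x, gains):
    # Residual e = V(x) - cost - V(x_next) under the fixed joint policy.
    u1 = constrained_policy(x, gains[0])
    u2 = constrained_policy(x, gains[1])
    x_next = step(x, u1, u2)
    d_phi = features(x) - features(x_next)
    return d_phi @ w - stage_cost(x, u1, u2), d_phi


def train_critic(gains, x, lr=0.5, iters=2000):
    # Gradient descent on the mean of 0.5 * e^2 over the sampled states.
    w = np.zeros(2)
    for _ in range(iters):
        e, d_phi = bellman_residual(w, x, gains)
        w -= lr * (d_phi.T @ e) / len(x)
    return w


states = np.linspace(-1.0, 1.0, 41)
gains = (0.5, 0.5)
w = train_critic(gains, states)
mse_before = np.mean(bellman_residual(np.zeros(2), states, gains)[0] ** 2)
mse_after = np.mean(bellman_residual(w, states, gains)[0] ** 2)
print("critic weights:", w)
print("Bellman residual MSE before/after:", mse_before, mse_after)
```

Because the critic here is linear in its weights, the loss is quadratic and a fixed step size is enough for the residual to shrink; a neural-network critic would generally need more careful tuning.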