Robust ADP-based control for uncertain nonlinear Stackelberg games

Bibliographic Details
Published in	Neurocomputing (Amsterdam) Vol. 561; p. 126834
Main Authors	Yu, Lin, Lai, Jing, Xiong, Junlin, Xie, Min
Format	Journal Article
Language	English
Published	Elsevier B.V. 07.12.2023
Subjects	Actor–critic structure; Adaptive dynamic programming; Identifier; Neural network; Stackelberg game
ISSN	0925-2312
DOI	10.1016/j.neucom.2023.126834

Summary:	Stackelberg games allow players to access system information differently and take actions asynchronously. This paper introduces a robust adaptive dynamic programming (ADP)-based method to solve the nonlinear two-player Stackelberg game subject to external disturbances. Combined with a neural network identifier, our method is implemented on the actor–critic-disturbance structure to approximate the optimal value function, i.e., the corresponding Stackelberg equilibrium. With the aid of a costate, we transform this leader–follower optimization problem into the problem of solving two parametric equations and a costate equation. The coefficients of the critic approximators and the costate are updated simultaneously to reach the Stackelberg equilibrium. The proposed control method finds real-time approximations of the Stackelberg-Saddle equilibrium while ensuring the closed-loop system’s stability. Finally, a simulation example demonstrates the effectiveness and advantages of our method.
• In our approach, a class of Stackelberg differential games with nonlinear dynamics is solved using the ADP method without knowledge of the system drift dynamics. Notably, we take external disturbances into account, which is the main difference between Mu et al. (2020) and this paper. The algorithm exhibits good robustness, expanding the scope of practical applications.
• Our work extends that of Kebriaei and Iannelli (2018) to continuous-time nonlinear systems and avoids solving recursive difference Riccati equations. Two actor–critic-disturbance architectures, which have been frequently used in single-input optimal control problems, are modified to approximate the Stackelberg-Saddle solution under this novel and more complicated task setting. In addition, this paper investigates the impact of hierarchical input on control performance in the context of multi-player games.
• Furthermore, in contrast to Kebriaei and Razminia (2019), where the follower’s action was regarded as a function of the leader’s action, we transform the differential game under consideration into the problem of solving two parametric Hamilton–Jacobi–Isaacs (HJI) equations and a costate equation. The approximator weights and the costate are then updated simultaneously and continuously according to these equations. A comparison study between Kebriaei and Razminia (2019) and our approach is conducted in the simulation part to validate the superiority and effectiveness of our method.