Output-Feedback Global Consensus of Discrete-Time Multiagent Systems Subject to Input Saturation via Q-Learning Method

Bibliographic Details
Published in	IEEE Transactions on Cybernetics, Vol. 52, no. 3, pp. 1661-1670
Main Authors	Long, Mingkang; Su, Housheng; Zeng, Zhigang
Format	Journal Article
Language	English
Published	United States: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.03.2022
Subjects	Q-learning (QL) algorithm; Algorithms; Directed graphs; Discrete time; Discrete time systems; Eigenvalues; Eigenvalues and eigenfunctions; Formulas (mathematics); global consensus; Heuristic algorithms; input saturation; Lower bounds; Machine learning; Multi-agent systems; Multiagent systems; Network topologies; Network topology; Output feedback; Protocols; Riccati equation; Saturation; System dynamics
ISSN	2168-2267, 2168-2275
DOI	10.1109/TCYB.2020.2987385

Summary:	This article proposes a $Q$-learning (QL)-based algorithm for global consensus of saturated discrete-time multiagent systems (DTMASs) via output feedback. According to low-gain feedback (LGF) theory, the control inputs of saturated DTMASs can avoid saturation by using control policies with LGF matrices, which most previous works computed from the modified algebraic Riccati equation (MARE) and which therefore required knowledge of the system dynamics. In this article, we first find the lower bound on the real parts of the nonzero eigenvalues of the Laplacian matrices of directed network topologies. Then, we define a test control input and propose a $Q$-function to derive a QL Bellman equation, which plays an essential part in the QL algorithm. Subsequently, unlike in previous works, the output-feedback gain (OFG) matrix can be obtained by a limited number of iterations of the QL algorithm without requiring information about the agent dynamics or network topologies of the saturated DTMASs. Furthermore, the saturated DTMASs can achieve global consensus rather than the semiglobal consensus of previous results. Finally, the effectiveness of the QL algorithm is confirmed via two simulations.
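
To make the terms "input saturation" and "consensus" in the summary concrete, the following is a toy sketch only. It is not the article's QL algorithm and learns no output-feedback gain. It assumes scalar single-integrator agents, a hand-picked fixed gain, a saturation limit of 1, and a three-agent directed cycle; the names `sat`, `simulate`, `in_neighbors`, `gain`, and `limit` are all hypothetical and chosen for illustration.

```python
def sat(u, limit=1.0):
    """Clip a scalar control input to the interval [-limit, limit]."""
    return max(-limit, min(limit, u))


def simulate(x0, in_neighbors, gain=0.3, limit=1.0, steps=200):
    """Simulate x_i(k+1) = x_i(k) + sat(u_i(k)), where
    u_i(k) = -gain * sum_j (x_i(k) - x_j(k)) over the in-neighbors j of agent i.

    Assumption: scalar single-integrator agents with a fixed small gain,
    used here only to illustrate saturated consensus dynamics.
    """
    x = list(x0)
    for _ in range(steps):
        # Each agent uses only the states of its in-neighbors.
        u = [-gain * sum(x[i] - x[j] for j in in_neighbors[i])
             for i in range(len(x))]
        # The actuator clips every input before it is applied.
        x = [xi + sat(ui, limit) for xi, ui in zip(x, u)]
    return x


# Directed cycle 0 -> 1 -> 2 -> 0: agent i listens to agent (i - 1) mod 3.
in_neighbors = {0: [2], 1: [0], 2: [1]}
x0 = [10.0, -5.0, 2.0]  # far enough apart that the inputs saturate at first
final = simulate(x0, in_neighbors)
spread = max(final) - min(final)
print("final states:", final)
print("spread:", spread)
```

In this toy setting the inputs are clipped during the first steps because the initial disagreement is large, yet the states still move toward a common value inside the range of the initial states. The article's contribution, by contrast, is to obtain the gain by QL without knowledge of the agent dynamics or the network topology.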