Model-free algorithm for consensus of discrete-time multi-agent systems using reinforcement learning method

Bibliographic Details
Published in: Journal of the Franklin Institute, Vol. 360, no. 14, pp. 10564–10581
Main Authors: Long, Mingkang; An, Qing; Su, Housheng; Luo, Hui; Zhao, Jin
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.09.2023
ISSN: 0016-0032, 1879-2693
DOI: 10.1016/j.jfranklin.2023.08.010

Summary: In this work, we investigate consensus of discrete-time (DT) multi-agent systems (MASs) with completely unknown dynamics using a reinforcement learning (RL) technique. Unlike policy iteration (PI)-based algorithms, which require admissible initial control policies, this work proposes a value iteration (VI)-based model-free algorithm that achieves consensus of DTMASs with optimal performance and no requirement of an admissible initial control policy. First, in order to apply the RL method, the consensus problem is modeled as an optimal control problem of the tracking error system for each agent. Then, we introduce a VI algorithm for consensus of DTMASs and give a novel convergence analysis for this algorithm, which does not require an admissible initial control input. To implement the proposed VI algorithm and achieve consensus of DTMASs without knowledge of the dynamics, we construct actor-critic networks that estimate the value functions and optimal control inputs online, in real time. Finally, simulation results demonstrate the validity of the proposed algorithm.
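The key distinction the abstract draws — VI needs no admissible (stabilizing) initial policy, whereas PI does — can be illustrated on the model-based analogue of the per-agent optimal tracking problem: discrete-time LQR value iteration started from a zero value function. The matrices `A`, `B`, `Q`, `R` below are illustrative assumptions, not the paper's multi-agent model, and this sketch uses known dynamics (the paper's algorithm is model-free via actor-critic estimation).

```python
import numpy as np

# Illustrative double-integrator dynamics and quadratic costs (assumed,
# not taken from the paper).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)          # state (tracking-error) cost
R = np.array([[1.0]])  # control cost

# Value iteration: start from P = 0, i.e. no stabilizing initial policy
# is needed -- this is the property the abstract highlights versus PI.
P = np.zeros((2, 2))
for _ in range(1000):
    # Greedy feedback gain for the current value function P.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    # One Riccati (Bellman optimality) update of the value function.
    P_next = Q + A.T @ P @ (A - B @ K)
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next

# P now approximates the solution of the discrete algebraic Riccati
# equation; u = -K x is the corresponding optimal feedback.
```

In the paper's model-free setting, the same fixed-point iteration is carried out without `A` and `B` by having critic and actor networks estimate the value function and control input from measured data along the agents' trajectories.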