GAMA: Graph Attention Multi-agent reinforcement learning algorithm for cooperation


Bibliographic Details
Published in: Applied Intelligence (Dordrecht, Netherlands), Vol. 50, No. 12, pp. 4195-4205
Main Authors: Chen, Haoqiang; Liu, Yadong; Zhou, Zongtan; Hu, Dewen; Zhang, Ming
Format: Journal Article
Language: English
Published: New York: Springer US (Springer Nature B.V.), 01.12.2020
ISSN: 0924-669X, 1573-7497
DOI: 10.1007/s10489-020-01755-8


More Information
Summary: Multi-agent reinforcement learning (MARL) is an important way to realize multi-agent cooperation, but challenges such as scalability and the uncertainty of the environment still limit its application. In this paper, we explore solving these problems with a graph network and an attention mechanism, extending an existing algorithm to obtain a new algorithm called GAMA. Specifically, the graph network allows environment information to be shared among agents, while the attention mechanism filters out unimportant information, which improves communication efficiency. As a result, GAMA obtained the highest mean episode rewards compared to the baselines, as well as excellent scalability. We chose the graph network because understanding the relationships among agents plays a key role in solving multi-agent problems, and the graph network is well suited to relational inductive bias. Through its integration with the attention mechanism, our experiments show that agents can figure out their relationships and focus on the influential environmental factors.
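
Below is a minimal, hypothetical sketch (in PyTorch, which the record does not specify) of the idea outlined in the summary: each agent encodes its observation, the encodings are exchanged over a communication graph, and an attention step down-weights unimportant messages before they are aggregated. The class name GraphAttentionComm, all dimensions, and the graph construction are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of the mechanism described in the summary: agents encode
    # their observations, exchange them over a communication graph, and attention
    # weights the incoming messages so that unimportant ones are effectively
    # filtered out. Names and dimensions are illustrative only.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GraphAttentionComm(nn.Module):
        """One round of attention-weighted message passing among N agents."""

        def __init__(self, obs_dim: int, hidden_dim: int):
            super().__init__()
            self.encode = nn.Linear(obs_dim, hidden_dim)  # per-agent observation encoder
            self.query = nn.Linear(hidden_dim, hidden_dim)
            self.key = nn.Linear(hidden_dim, hidden_dim)
            self.value = nn.Linear(hidden_dim, hidden_dim)

        def forward(self, obs: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
            # obs: (N, obs_dim) agent observations; adj: (N, N) 0/1 communication graph
            h = torch.relu(self.encode(obs))                      # (N, hidden_dim)
            q, k, v = self.query(h), self.key(h), self.value(h)
            scores = q @ k.t() / h.shape[-1] ** 0.5               # pairwise relevance (N, N)
            scores = scores.masked_fill(adj == 0, float("-inf"))  # attend only along graph edges
            attn = F.softmax(scores, dim=-1)                      # low-relevance messages get ~0 weight
            return h + attn @ v                                   # per-agent aggregated messages

    # Usage: 4 agents on a fully connected graph without self-loops; the output
    # would feed each agent's policy/value head in an actor-critic MARL loop.
    n_agents, obs_dim = 4, 16
    comm = GraphAttentionComm(obs_dim, hidden_dim=32)
    obs = torch.randn(n_agents, obs_dim)
    adj = torch.ones(n_agents, n_agents) - torch.eye(n_agents)
    print(comm(obs, adj).shape)  # torch.Size([4, 32])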