GAMA: Graph Attention Multi-agent reinforcement learning algorithm for cooperation
| Published in | Applied intelligence (Dordrecht, Netherlands), Vol. 50, no. 12, pp. 4195-4205 |
|---|---|
| Format | Journal Article | 
| Language | English | 
| Published | New York: Springer US, 01.12.2020; Springer Nature B.V |
| ISSN | 0924-669X 1573-7497  | 
| DOI | 10.1007/s10489-020-01755-8 | 
| Summary | Multi-agent reinforcement learning (MARL) is an important way to realize multi-agent cooperation. However, many challenges still limit its application, including scalability and the uncertainty of the environment. In this paper, we addressed these problems with a graph network and the attention mechanism, extending an existing algorithm to obtain a new algorithm called GAMA. Specifically, the graph network shares environment information among agents, while the attention mechanism filters out unimportant information, which improves communication efficiency. As a result, GAMA obtained the highest mean episode rewards among the compared baselines and showed excellent scalability. We chose the graph network because understanding the relationships among agents plays a key role in solving multi-agent problems, and graph networks are well suited to expressing relational inductive bias. Our experiments showed that, combined with the attention mechanism, agents could work out their relationships and focus on the influential environmental factors. |
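
The summary describes agents sharing information over a graph while attention down-weights unimportant inputs. The sketch below illustrates that general idea only. It is not the GAMA architecture, which this record does not specify. It assumes scaled dot-product attention with identity query/key/value projections, and the names `attention_aggregate`, `features`, and `neighbors` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(features, neighbors):
    """Return, per agent, its attention weights over its neighbors and
    the attention-weighted sum of those neighbors' feature vectors.

    Assumption: identity query/key/value projections. A trained model
    would learn these; this is not the paper's exact architecture.
    """
    dim = len(features[0])
    weights, messages = [], []
    for i, nbrs in enumerate(neighbors):
        # Scaled dot-product score between agent i and each neighbor j.
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(dim)
            for j in nbrs
        ]
        w = softmax(scores)
        # Weighted sum of neighbor features: neighbors with low weight
        # contribute little to the message agent i receives.
        msg = [
            sum(wk * features[j][d] for wk, j in zip(w, nbrs))
            for d in range(dim)
        ]
        weights.append(w)
        messages.append(msg)
    return weights, messages


# Hypothetical toy setup: three agents on a fully connected graph.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
neighbors = [[1, 2], [0, 2], [0, 1]]
weights, messages = attention_aggregate(features, neighbors)
```

In this toy setup, agent 0 places more weight on agent 1, whose features resemble its own, than on agent 2. A learned model would replace the identity projections with trained ones so that the weighting reflects task relevance rather than raw similarity.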