An Off-COMA Algorithm for Multi-UCAV Intelligent Combat Decision-Making
| Published in | 2022 4th International Conference on Data-driven Optimization of Complex Systems (DOCS), pp. 1 - 6 |
|---|---|
| Main Authors | , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 28.10.2022 |
| DOI | 10.1109/DOCS55193.2022.9967776 |
| Summary: | Unmanned Combat Aerial Vehicles (UCAVs) play an important role in modern military warfare, and their level of intelligent decision-making urgently needs improvement. In this paper, a simplified multi-UCAV combat environment is established and modeled as a multi-agent Markov game. The multi-UCAV combat problem presents many difficulties, including strong randomness and complexity, sparse rewards, and the absence of strong opponents for training. To address these problems, an algorithm called Off Counterfactual Multi-Agent (Off-COMA) is proposed. This algorithm extends the COMA algorithm to an off-policy version and can reuse historical data for training, which improves data utilization. In addition, the proposed Off-COMA algorithm exploits an improved prioritized experience replay method to deal with sparse rewards. This paper also presents an asymmetric policy replay self-play method, which helps the algorithm generate a powerful policy. Finally, comparison with several classical multi-agent reinforcement learning algorithms verifies the superiority of the Off-COMA algorithm in solving the multi-UCAV combat problem. |
|---|---|
| DOI: | 10.1109/DOCS55193.2022.9967776 |
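The summary mentions an improved prioritized experience replay method for handling sparse rewards, but this record does not describe the paper's specific variant. As background, the standard proportional scheme can be sketched as follows — the class and all names here are illustrative, not taken from the paper:

```python
import random

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (illustrative sketch)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha        # how strongly priority skews sampling (0 = uniform)
        self.buffer = []          # stored transitions
        self.priorities = []      # one priority per transition

    def add(self, transition, td_error=1.0):
        # A transition enters with priority (|TD error| + eps)^alpha; the small
        # constant keeps zero-error transitions from never being revisited.
        priority = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.buffer) >= self.capacity:
            # Evict the oldest transition once the buffer is full.
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Sample indices in proportion to priority, so rare high-TD-error
        # transitions (e.g. the few rewarded steps under sparse rewards)
        # are replayed more often than ordinary ones.
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.buffer)), weights=probs, k=batch_size)
        return [self.buffer[i] for i in idxs]
```

A full implementation would also apply importance-sampling weights to correct the bias that non-uniform sampling introduces; that correction is omitted here for brevity.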