An Off-COMA Algorithm for Multi-UCAV Intelligent Combat Decision-Making
Unmanned Combat Aerial Vehicles (UCAVs) play an important role in modern military warfare, and their level of intelligent decision-making urgently needs improvement. In this paper, a simplified multi-UCAV combat environment is established and modeled as a multi-agent Markov game. The multi-UCAV combat problem poses several difficulties, including strong randomness and complexity, sparse rewards, and the lack of strong opponents for training. To address these problems, an algorithm called Off Counterfactual Multi-Agent (Off-COMA) is proposed. It extends the COMA algorithm to an off-policy version that can reuse historical data for training, which improves data utilization. In addition, Off-COMA exploits an improved prioritized experience replay method to cope with sparse rewards. The paper also presents an asymmetric policy replay self-play method, which helps the algorithm converge to a strong policy. Finally, comparisons with several classical multi-agent reinforcement learning algorithms verify the superiority of Off-COMA in solving the multi-UCAV combat problem.
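The record does not detail the paper's specific replay scheme, but the "improved prioritized experience replay" it mentions builds on standard proportional prioritized replay, in which transitions with larger priority (e.g. TD error) are sampled more often so that rare rewarded experiences surface under sparse rewards. A minimal sketch of that standard mechanism (class name and parameters are illustrative, not from the paper):

```python
import random


class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay.

    Transitions with larger priority are sampled more often, which
    helps rare rewarded episodes reappear in training batches when
    the environment's rewards are sparse.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling
        self.eps = eps          # keeps every priority strictly positive
        self.buffer = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once full

    def add(self, transition, priority):
        p = (abs(priority) + self.eps) ** self.alpha
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
            self.priorities.append(p)
        else:
            # Overwrite the oldest entry in ring-buffer fashion.
            self.buffer[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Draw indices with probability proportional to priority.
        idx = random.choices(range(len(self.buffer)),
                             weights=self.priorities, k=batch_size)
        return [self.buffer[i] for i in idx], idx

    def update_priorities(self, indices, new_priorities):
        # Refresh priorities (e.g. with new TD errors) after a learning step.
        for i, p in zip(indices, new_priorities):
            self.priorities[i] = (abs(p) + self.eps) ** self.alpha
```

In an off-policy setting such as Off-COMA, a buffer like this is what allows historical trajectories to be reused across many gradient updates instead of being discarded after a single on-policy step.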
| Published in | 2022 4th International Conference on Data-driven Optimization of Complex Systems (DOCS), pp. 1 - 6 |
|---|---|
| Main Authors | , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 28.10.2022 |
| DOI | 10.1109/DOCS55193.2022.9967776 |