Enhanced LSTM‐DQN algorithm for a two‐player zero‐sum game in three‐dimensional space

Bibliographic Details
Published in	IET Control Theory & Applications, Vol. 18, no. 18, pp. 2798‐2812
Main Authors	Lu, Bo; Ru, Le; Lv, Maolong; Hu, Shiguang; Zhang, Hongguo; Zhao, Zilong
Format	Journal Article
Language	English
Published	01.12.2024
Subjects	deep reinforcement learning; hindsight experience replay; long short term memory‐deep Q‐learning; manoeuvre decision‐making; three‐dimensional space; two‐player zero‐sum game
ISSN	1751-8644, 1751-8652
DOI	10.1049/cth2.12677

Summary:	To tackle the challenges presented by the two‐player zero‐sum game (TZSG) in three‐dimensional space, this study introduces an enhanced deep Q‐learning (DQN) algorithm that uses a long short‐term memory (LSTM) network. The primary objective of the algorithm is to enhance the temporal correlation of the TZSG in three‐dimensional space. Additionally, it incorporates the hindsight experience replay (HER) mechanism to improve the learning efficiency of the network and to mitigate the “sparse reward” problem that arises during prolonged training of the agent on the TZSG in three‐dimensional space. Furthermore, this method enhances the convergence and stability of the overall solution. An intelligent training environment centred on an airborne agent and its mutual pursuit interaction scenario was designed to verify the proposed approach's effectiveness. The algorithm training and comparison results show that the LSTM‐DQN‐HER algorithm outperforms similar algorithms in solving the TZSG in three‐dimensional space. In conclusion, this paper presents an improved DQN algorithm that is based on LSTM and incorporates the HER mechanism to address the challenges posed by the TZSG in three‐dimensional space. The proposed algorithm enhances the solution's temporal correlation, learning efficiency, convergence, and stability. The simulation results confirm its superior performance in solving the TZSG in three‐dimensional space. The LSTM‐DQN‐HER algorithm is proposed by modelling the two‐player zero‐sum game problem in three‐dimensional space as an MDP and a POMDP, and the effectiveness of the proposed algorithm in solving this problem is verified by training and adversarial simulation of the agent.