State Super Sampling Soft Actor–Critic Algorithm for Multi-AUV Hunting in 3D Underwater Environment

Bibliographic Details
Published in	Journal of marine science and engineering Vol. 11; no. 7; p. 1257
Main Authors	Wang, Zhuo; Sui, Yancheng; Qin, Hongde; Lu, Hao
Format	Journal Article
Language	English
Published	Basel MDPI AG 01.07.2023
Subjects	Acoustics; Adaptability; Algorithms; Analysis; Autonomous underwater vehicles; Communication; Dynamical systems; Efficiency; Generative adversarial networks; Hunting; Learning; Machine learning; multi-agent reinforcement learning; Multiagent systems; multiple autonomous underwater vehicle hunting; Reinforcement; Remote submersibles; Sampling; Soft Actor–Critic; Underwater vehicles
ISSN	2077-1312
DOI	10.3390/jmse11071257

More Information
Summary:	Reinforcement learning (RL) is known for its efficiency and practicality in single-agent planning, but it faces numerous challenges when applied to multi-agent scenarios. In this paper, a Super Sampling Info-GAN (SSIG) algorithm based on Generative Adversarial Networks (GANs) is proposed to address the problem of state instability in Multi-Agent Reinforcement Learning (MARL). The SSIG model allows a pair of GAN networks to analyze the previous state of a dynamic system and predict the future state of consecutive state pairs. Through SSIG, a multi-agent system (MAS) can deduce the complete state of all collaborating agents. Combined with the Soft Actor–Critic (SAC) algorithm, the proposed model has the potential to be employed in multi-autonomous underwater vehicle (multi-AUV) planning scenarios. Hence, this paper presents State Super Sampling Soft Actor–Critic (S4AC), a new algorithm that combines the advantages of SSIG and SAC and can be applied to multi-AUV hunting tasks. The simulation results demonstrate that the proposed algorithm has strong learning ability and adaptability and achieves a considerable success rate in hunting the evading target in multiple testing scenarios.