Adaptive Distributed Control for Leader–Follower Formation Based on a Recurrent SAC Algorithm
This study proposes a novel adaptive distributed recurrent SAC (Soft Actor–Critic) control method to address the leader–follower formation control problem of omnidirectional mobile robots. Our method successfully eliminates the reliance on the complete state of the leader and achieves the task of fo...
Saved in:
| Published in | Electronics (Basel) Vol. 13; no. 17; p. 3513 |
|---|---|
| Main Authors | , , , |
| Format | Journal Article |
| Language | English |
| Published |
Basel
MDPI AG
01.09.2024
|
| Subjects | |
| Online Access | Get full text |
| ISSN | 2079-9292 2079-9292 |
| DOI | 10.3390/electronics13173513 |
Cover
| Summary: | This study proposes a novel adaptive distributed recurrent SAC (Soft Actor–Critic) control method to address the leader–follower formation control problem of omnidirectional mobile robots. Our method successfully eliminates the reliance on the complete state of the leader and achieves the task of formation solely using the pose between robots. Moreover, we develop a novel recurrent SAC reinforcement learning framework that ensures that the controller exhibits good transient and steady-state characteristics to achieve outstanding control performance. We also present an episode-based memory replay buffer and sampling approaches, along with a unique normalized reward function, which expedites the recurrent SAC reinforcement learning formation framework to converge rapidly and receive consistent incentives across various leader–follower tasks. This facilitates better learning and adaptation to the formation task requirements in different scenarios. Furthermore, to bolster the generalization capability of our method, we normalized the state space, effectively eliminating differences between formation tasks of different shapes. Different shapes of leader–follower formation experiments in the Gazebo simulator achieve excellent results, validating the efficacy of our method. Comparative experiments with traditional PID and common network controllers demonstrate that our method achieves faster convergence and greater robustness. These simulation results provide strong support for our study and demonstrate the potential and reliability of our method in solving real-world problems. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 2079-9292 2079-9292 |
| DOI: | 10.3390/electronics13173513 |