Hybrid of representation learning and reinforcement learning for dynamic and complex robotic motion planning


Bibliographic Details
Published in: Robotics and Autonomous Systems, Vol. 194, p. 105167
Main Authors: Zhou, Chengmin; Lu, Xin; Dai, Jiapeng; Liu, Xiaoxu; Huang, Bingding; Fränti, Pasi
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.12.2025
ISSN: 0921-8890, 1872-793X
DOI: 10.1016/j.robot.2025.105167

Summary:
• Implementation of RG-DSAC and AW-DSAC. These two algorithms are important baselines for LSA-DSAC, because LSA-DSAC is derived from AW-DSAC by optimizing the attention network to alleviate unstable training and slow convergence.
• LSA-DSAC is proposed for robotic motion planning in dense and dynamic environments, with better interpretability, stability, and convergence. LSA-DSAC is the optimized version of AW-DSAC, obtained by integrating the skip connection method and LSTM into the architecture of the attention network of AW-DSAC.
• Extensive evaluations of LSA-DSAC against the state of the art in simulation.
• Physical implementation, testing of the robot in the real world, and analytical discussion of the vanishing gradient problem in deep networks, the computational challenges of increasing obstacle numbers, and inaccurate attention in the attention network.

Motion planning is the soul of robot decision making. Classical planning algorithms, such as graph search and reaction-based algorithms, face challenges in cases of dense and dynamic obstacles. Deep learning algorithms generate suboptimal one-step predictions that cause many collisions. Reinforcement learning algorithms generate optimal or near-optimal time-sequential predictions. However, they suffer from slow convergence, suboptimal converged results, and unstable training. This paper introduces a hybrid algorithm for robotic motion planning: long short-term memory (LSTM) and skip connection for attention-based discrete soft actor critic (LSA-DSAC). First, a graph network (relational graph) and an attention network (attention weight) interpret the environmental state for the learning of the discrete soft actor critic algorithm. A difference analysis of these two representation methods shows that the expressive power of the attention network exceeds that of the graph network in this task. However, attention-based DSAC faces the problem of unstable training (vanishing gradient). Second, the skip connection method is integrated into attention-based DSAC to mitigate unstable training and improve convergence speed. Third, an LSTM replaces the sum operator of the attention weight, eliminating unstable training at the cost of slightly slower convergence in early-stage training. Experiments show that LSA-DSAC outperforms the state of the art in training and in most evaluations. Physical robots are also implemented and tested in the real world.
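
The three ideas named in the summary (attention weights over obstacles, a skip connection, and an LSTM in place of the sum operator) can be illustrated with a small standard-library Python sketch. This is not the authors' implementation: the distance-based scoring rule, the residual-add form of the skip connection, the randomly initialised `TinyLSTM` cell, and all function names are assumptions chosen only to make the data flow concrete.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(robot, obstacles):
    # Assumed scoring rule: negative squared distance, so nearer obstacles weigh more.
    scores = [-sum((r - o) ** 2 for r, o in zip(robot, obs)) for obs in obstacles]
    return softmax(scores)

def weighted_sum(weights, features):
    # Baseline aggregation: the sum operator over attention-weighted features.
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyLSTM:
    """Hypothetical single-layer LSTM cell with random, untrained weights."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        n = input_dim + hidden_dim
        self.hidden_dim = hidden_dim
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.W = {g: [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                      for _ in range(hidden_dim)] for g in "ifgo"}
        self.b = {g: [0.0] * hidden_dim for g in "ifgo"}

    def _gate(self, g, xh, act):
        return [act(sum(w * v for w, v in zip(row, xh)) + b)
                for row, b in zip(self.W[g], self.b[g])]

    def step(self, x, h, c):
        xh = list(x) + list(h)
        i = self._gate("i", xh, sigmoid)
        f = self._gate("f", xh, sigmoid)
        g = self._gate("g", xh, math.tanh)
        o = self._gate("o", xh, sigmoid)
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c

    def encode(self, sequence):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            h, c = self.step(x, h, c)
        return h  # final hidden state serves as the environment embedding

def lstm_pool(weights, features, lstm):
    # Feed the attention-weighted features to the LSTM one by one instead of summing.
    weighted = [[w * v for v in f] for w, f in zip(weights, features)]
    return lstm.encode(weighted)

robot = [0.0, 0.0]
obstacles = [[1.0, 0.0], [3.0, 4.0], [0.5, 0.5]]
w = attention_weights(robot, obstacles)
pooled = weighted_sum(w, obstacles)
# Assumed skip connection: add the raw robot state back onto the pooled feature.
residual = [p + r for p, r in zip(pooled, robot)]
embedding = lstm_pool(w, obstacles, TinyLSTM(input_dim=2, hidden_dim=4))
```

In this sketch the sum operator collapses all obstacles into one vector in a single step, while the LSTM variant consumes them sequentially and returns its final hidden state. A trained model would learn the scoring function and the LSTM weights rather than fix them as done here.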