Hybrid of representation learning and reinforcement learning for dynamic and complex robotic motion planning

Bibliographic Details
Published in	Robotics and autonomous systems Vol. 194; p. 105167
Main Authors	Zhou, Chengmin, Lu, Xin, Dai, Jiapeng, Liu, Xiaoxu, Huang, Bingding, Fränti, Pasi
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.12.2025
Subjects	Intelligent robot; Motion planning; Navigation; Reinforcement learning; Representation learning; Deep learning algorithms; Long short-term memory; LSTM and skip connection for attention-based discrete soft actor critic; skip connection for attention-based DSAC; attention weight based DSAC; relational graph based DSAC; relational graph; soft actor critic; advantage actor critic; asynchronous advantage actor critic; deep deterministic policy gradient; twin delayed deep deterministic policy gradient; proximal policy optimization; deep Q network; Markov decision process; Monte-Carlo tree search; optimal reciprocal collision avoidance; dynamic window approach; probabilistic roadmap method; rapidly exploring random tree; multi-layer perceptron; convolutional neural network; pulse-width modulation; local area network; robot operation system
ISSN	0921-8890 1872-793X
DOI	10.1016/j.robot.2025.105167

More Information
Summary:	• Implementation of RG-DSAC and AW-DSAC. Both are important baselines for LSA-DSAC, because LSA-DSAC is derived from AW-DSAC by optimizing its attention network to alleviate unstable training and slow convergence.
• Proposal of LSA-DSAC for robotic motion planning in dense and dynamic environments, with better interpretability, stability, and convergence. LSA-DSAC is an optimized version of AW-DSAC that integrates the skip connection method and LSTM into the architecture of AW-DSAC's attention network.
• Extensive evaluation of LSA-DSAC against state-of-the-art methods in simulation.
• Physical implementation and real-world testing of the robot, with analytical discussion of vanishing gradients in deep networks, the computational challenges of increasing obstacle numbers, and inaccurate attention in the attention network.

Motion planning is the soul of robot decision making. Classical planning algorithms such as graph search and reaction-based algorithms face challenges with dense and dynamic obstacles. Deep learning algorithms generate suboptimal one-step predictions that cause many collisions. Reinforcement learning algorithms generate optimal or near-optimal time-sequential predictions, but they suffer from slow convergence, suboptimal converged results, and unstable training. This paper introduces a hybrid algorithm for robotic motion planning: long short-term memory (LSTM) and skip connection for attention-based discrete soft actor critic (LSA-DSAC). First, a graph network (relational graph) and an attention network (attention weight) interpret the environmental state for the learning of the discrete soft actor critic (DSAC) algorithm. A difference analysis of these two representation methods shows that the expressive power of the attention network exceeds that of the graph network in our task. However, attention-based DSAC suffers from unstable training (vanishing gradient). Second, the skip connection method is integrated into attention-based DSAC to mitigate unstable training and improve convergence speed. Third, LSTM replaces the sum operator of the attention weight, eliminating unstable training at the cost of slightly slower convergence in early-stage training. Experiments show that LSA-DSAC outperforms the state-of-the-art in training and in most evaluations. Physical robots are also implemented and tested in the real world.
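
The summary mentions a "sum operator" over attention weights and a skip connection, but this record gives no architectural details. The following is only a minimal illustrative sketch of attention-weighted sum pooling over per-obstacle features with an optional residual path. It is not the authors' implementation. The function names, the feature layout, and the choice of adding the unweighted feature mean as the skip path are all assumptions made for illustration. The LSTM that, according to the summary, replaces the sum operator is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, scores, skip=False):
    """Attention-weighted sum of per-obstacle feature vectors.

    features: list of equal-length feature vectors, one per obstacle
              (hypothetical layout, not taken from the paper).
    scores:   one raw attention score per obstacle (hypothetical).
    skip:     if True, add the unweighted mean of the features to the
              pooled vector as an illustrative residual path (assumption).
    Returns (weights, pooled_vector).
    """
    weights = softmax(scores)
    dim = len(features[0])
    # Sum operator: each obstacle feature is scaled by its attention weight.
    pooled = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    if skip:
        n = len(features)
        mean = [sum(f[i] for f in features) / n for i in range(dim)]
        # Residual path: the input bypasses the attention weighting.
        pooled = [p + m for p, m in zip(pooled, mean)]
    return weights, pooled


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0]]
    print(attention_pool(feats, [0.0, 0.0]))
    print(attention_pool(feats, [0.0, 0.0], skip=True))
```

In this sketch the residual path keeps a direct route from the input features to the output, which is the general motivation for skip connections when gradients through the weighted branch become small.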