Hybrid of representation learning and reinforcement learning for dynamic and complex robotic motion planning

Bibliographic Details
Published in: Robotics and Autonomous Systems, Vol. 194, p. 105167
Main Authors: Zhou, Chengmin, Lu, Xin, Dai, Jiapeng, Liu, Xiaoxu, Huang, Bingding, Fränti, Pasi
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.12.2025
ISSN: 0921-8890, 1872-793X
DOI: 10.1016/j.robot.2025.105167


Abstract
•Implementation of RG-DSAC and AW-DSAC. These two algorithms are important baselines for LSA-DSAC, which is derived from AW-DSAC by optimizing its attention network to alleviate unstable training and slow convergence.
•Proposal of LSA-DSAC for robotic motion planning in dense and dynamic environments, with better interpretability, stability, and convergence. LSA-DSAC is the optimized version of AW-DSAC, obtained by integrating the skip connection method and LSTM into the architecture of its attention network.
•Extensive evaluation of LSA-DSAC against the state of the art in simulation.
•Physical implementation and testing of the robot in the real world, with analytical discussions of vanishing gradients in deep networks, the computational cost of increasing obstacle numbers, and inaccurate attention of the attention network.
Motion planning is the soul of robot decision making. Classical planning algorithms, such as graph search and reaction-based algorithms, face challenges in cases of dense and dynamic obstacles. Deep learning algorithms generate suboptimal one-step predictions that cause many collisions. Reinforcement learning algorithms generate optimal or near-optimal time-sequential predictions; however, they suffer from slow convergence, suboptimal converged results, and unstable training. This paper introduces a hybrid algorithm for robotic motion planning: long short-term memory (LSTM) and skip connection for attention-based discrete soft actor critic (LSA-DSAC). First, a graph network (relational graph) and an attention network (attention weights) interpret the environmental state for the learning of the discrete soft actor critic (DSAC) algorithm. Difference analysis of these two representation methods shows that the expressive power of the attention network exceeds that of the graph in our task. However, attention-based DSAC suffers from unstable training (vanishing gradients). Second, the skip connection method is integrated into attention-based DSAC to mitigate unstable training and improve convergence speed. Third, LSTM replaces the sum operator of the attention weights, eliminating unstable training at the cost of slightly slower convergence in early-stage training. Experiments show that LSA-DSAC outperforms the state of the art in training and in most evaluations. Physical robots are also implemented and tested in the real world.
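The representation scheme described in the abstract — attention weights over per-obstacle features, with an LSTM replacing the weighted-sum aggregation and a skip connection stabilizing gradients — can be sketched in NumPy. This is a minimal illustration under assumed shapes and randomly initialized parameters; the names `lsa_aggregate` and `lstm_step`, and all dimensions, are hypothetical, not the paper's implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def lstm_step(x, h, c, W, U, b):
    # One LSTM cell step; gate pre-activations stacked as [i, f, o, g].
    z = W @ x + U @ h + b
    H = h.size
    i, f, o = (1.0 / (1.0 + np.exp(-z[k * H:(k + 1) * H])) for k in range(3))
    g = np.tanh(z[3 * H:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def lsa_aggregate(obstacle_embeddings, scores, rng):
    """Apply attention weights, aggregate with an LSTM instead of a
    weighted sum, and add a skip connection from the raw embeddings."""
    n, d = obstacle_embeddings.shape
    w = softmax(scores)                          # attention weights over obstacles
    weighted = obstacle_embeddings * w[:, None]  # attention-weighted features
    # Hypothetical, randomly initialized LSTM parameters (illustration only).
    W = 0.1 * rng.standard_normal((4 * d, d))
    U = 0.1 * rng.standard_normal((4 * d, d))
    b = np.zeros(4 * d)
    h, c = np.zeros(d), np.zeros(d)
    for t in range(n):                           # LSTM replaces the sum operator
        h, c = lstm_step(weighted[t], h, c, W, U, b)
    return h + obstacle_embeddings.mean(axis=0)  # skip connection

rng = np.random.default_rng(0)
emb = rng.standard_normal((5, 8))   # 5 obstacles, 8-dim embeddings
scores = rng.standard_normal(5)     # one attention score per obstacle
state = lsa_aggregate(emb, scores, rng)
print(state.shape)  # (8,)
```

Sequential LSTM aggregation preserves information about each weighted embedding rather than collapsing them in one sum, while the additive skip term keeps a direct gradient path to the raw inputs — the two modifications the abstract credits with stabilizing training.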
ArticleNumber 105167
Author Fränti, Pasi
Lu, Xin
Dai, Jiapeng
Huang, Bingding
Zhou, Chengmin
Liu, Xiaoxu
Author_xml – sequence: 1
  givenname: Chengmin
  orcidid: 0000-0002-8297-5949
  surname: Zhou
  fullname: Zhou, Chengmin
  email: zhou@cs.uef.fi
  organization: University of Eastern Finland, FI-80100 Joensuu, Finland
– sequence: 2
  givenname: Xin
  surname: Lu
  fullname: Lu, Xin
  email: luwenkai67109@outlook.com
  organization: Sino-German College of Intelligent Manufacturing, Shenzhen Technology University, 518118 Shenzhen, China
– sequence: 3
  givenname: Jiapeng
  surname: Dai
  fullname: Dai, Jiapeng
  email: jalaxy1996@outlook.com
  organization: College of Big Data and Internet, Shenzhen Technology University, 518118 Shenzhen, China
– sequence: 4
  givenname: Xiaoxu
  surname: Liu
  fullname: Liu, Xiaoxu
  email: liuxiaoxu@sztu.edu.cn
  organization: Sino-German College of Intelligent Manufacturing, Shenzhen Technology University, 518118 Shenzhen, China
– sequence: 5
  givenname: Bingding
  orcidid: 0000-0002-4748-2882
  surname: Huang
  fullname: Huang, Bingding
  email: huangbingding@sztu.edu.cn
  organization: College of Big Data and Internet, Shenzhen Technology University, 518118 Shenzhen, China
– sequence: 6
  givenname: Pasi
  orcidid: 0000-0002-9554-2827
  surname: Fränti
  fullname: Fränti, Pasi
  email: franti@cs.uef.fi
  organization: Machine Learning Group, School of Computing, University of Eastern Finland, FI-80100 Joensuu, Finland
ContentType Journal Article
Copyright 2025 The Author(s)
Discipline Engineering
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords deep deterministic policy gradient
skip connection for attention-based DSAC
advantage actor critic
Long short-term memory
relational graph based DSAC
Representation learning
twin delayed deep deterministic policy gradient
Monte-Carlo tree search
optimal reciprocal collision avoidance
proximal policy optimization
dynamic window approach
relational graph
Navigation
probabilistic roadmap method
multi-layer perceptron
convolutional neural network
pulse-width modulation
Reinforcement learning
Intelligent robot
Motion planning
LSTM and skip connection for attention-based discrete soft actor critic
rapidly exploring random tree
Deep learning algorithms
attention weight based DSAC
soft actor critic
local area network
Markov decision process
asynchronous advantage actor critic
robot operation system
deep Q network
License This is an open access article under the CC BY license.
cc-by
ORCID 0000-0002-9554-2827
0000-0002-4748-2882
0000-0002-8297-5949
OpenAccessLink https://doi.org/10.1016/j.robot.2025.105167
PublicationDate December 2025
PublicationTitle Robotics and autonomous systems
PublicationYear 2025
Publisher Elsevier B.V
References_xml – reference: , pp. 1–7, 2019.
– volume: 12
  start-page: 566
  year: 1996
  end-page: 580
  ident: bib0009
  article-title: Probabilistic roadmaps for path planning in high-dimensional configuration spaces
  publication-title: IEEE Trans. Robot. Autom.
– volume: 4
  start-page: 23
  year: 1997
  end-page: 33
  ident: bib0010
  article-title: The dynamic window approach to collision avoidance
  publication-title: IEEE Robot. Autom. Mag.
– volume: 5
  start-page: 2976
  year: 2018
  end-page: 2989
  ident: bib0028
  article-title: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor
  publication-title: 35th Int. Conf. Mach. Learn. ICML 2018
– year: 2004
  ident: bib0040
  article-title: Convex Optimization
– volume: 1
  start-page: 605
  year: 2014
  end-page: 619
  ident: bib0024
  article-title: Deterministic policy gradient algorithms
  publication-title: 31st Int. Conf. Mach. Learn. ICML 2014,
– start-page: 188
  year: 2013
  end-page: 193
  ident: bib0008
  article-title: Optimal motion planning with the half-car dynamical model for autonomous high-speed driving
  publication-title: 2013 American Control Conference
– volume: 165
  year: 2022
  ident: bib0003
  article-title: Autonomous robot-driven deliveries : a review of recent developments and future directions
  publication-title: Transp. Res. Part E
– start-page: 1465
  year: 2019
  end-page: 1470
  ident: bib0048
  article-title: A comparative evaluation of SteamVR tracking and the OptiTrack system for medical device tracking
  publication-title: Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS
– start-page: 961
  year: 2016
  end-page: 971
  ident: bib0015
  article-title: Social LSTM: human trajectory prediction in crowded spaces
  publication-title: n (CVPR)
– start-page: 285
  year: 2017
  end-page: 292
  ident: bib0033
  article-title: Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning
  publication-title: Proc. - IEEE Int. Conf. RobotAutom
– volume: 518
  start-page: 529
  year: 2015
  end-page: 533
  ident: bib0038
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
– start-page: 4700
  year: 2017
  end-page: 4708
  ident: bib0043
  article-title: Densely connected convolutional networks
  publication-title: Proc. IEEE Conf. Comput. Vis. Pattern Recognit
– volume: 3
  start-page: 1
  year: 2009
  end-page: 6
  ident: bib0046
  article-title: ROS: an open-source robot operating system
  publication-title: ICRA Work. open source Softw
– volume: 1
  start-page: 269
  year: 1959
  end-page: 271
  ident: bib0007
  article-title: A note on two problems in connexion with graphs
  publication-title: Numer. Math.
– volume: 3
  start-page: 2171
  year: 2017
  end-page: 2186
  ident: bib0023
  article-title: Reinforcement learning with deep energy-based policies
  publication-title: 34th Int. Conf. Mach. Learn. ICML 2017
– volume: 4
  start-page: 2587
  year: 2018
  end-page: 2601
  ident: bib0031
  article-title: Addressing function approximation error in actor-critic methods
  publication-title: 35th Int. Conf. Mach. Learn. ICML 2018
– start-page: 1
  year: 2013
  end-page: 9
  ident: bib0013
  article-title: Playing Atari with Deep Reinforcement Learning
  publication-title: arXiv
– reference: , pp. 1–12, 2017.
– volume: 14
  start-page: 1569
  year: 2020
  end-page: 1575
  ident: bib0004
  article-title: Industry 4.0: examples of the use of the robotic arm for digital manufacturing processes
  publication-title: Int. J. Interact. Des. Manuf.
– volume: 2019-May
  start-page: 6015
  year: 2019
  end-page: 6022
  ident: bib0018
  article-title: Crowd-robot interaction: crowd-aware robot navigation with attention-based deep reinforcement learning
  publication-title: Proc. - IEEE Int. Conf. Robot. Autom
– reference: T. Dam, G. Chalvatzaki, J. Peters, and J. Pajarinen, “Monte-Carlo robot path planning,”
– volume: 402
  start-page: 346
  year: 2020
  end-page: 358
  ident: bib0012
  article-title: Automatic obstacle avoidance of quadrotor UAV via CNN-based learning
  publication-title: Neurocomputing
– reference: J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,”
– volume: 4
  start-page: 100
  year: 1968
  end-page: 107
  ident: bib0006
  article-title: A formal basis for the heuristic determination of minimum cost paths
  publication-title: IEEE Trans. Syst. Sci. Cybern.
– start-page: 1
  year: 2019
  end-page: 17
  ident: bib0044
  article-title: How powerful are graph neural networks?
  publication-title: 7th Int. Conf. Learn. Represent. ICLR 2019
– reference: , 2019, pp. 1–7.
– volume: 36
  year: 2020
  ident: bib0001
  article-title: Interaction between hotel service robots and humans: a hotel-specific service robot acceptance model (sRAM)
  publication-title: Tour. Manag. Perspect
– start-page: 1
  year: 2021
  end-page: 12
  ident: bib0002
  article-title: Parcel delivery for smart cities: a synchronization approach for combined truck-drone-street robot deliveries
  publication-title: 2021 Winter Simul
– reference: Y. Xu, D. Hu, L. Liang, S. McAleer, P. Abbeel, and R. Fox, “Target entropy annealing for discrete soft actor-critic,”
– volume: 1995
  start-page: 30
  year: 1995
  end-page: 37
  ident: bib0037
  article-title: Residual algorithms: reinforcement learning with function approximation
  publication-title: Mach. Learn. Proc.
– volume: 14
  start-page: 1
  year: 2020
  end-page: 159
  ident: bib0017
  article-title: Graph representation learning
  publication-title: Synth. Lect. Artif. Intell. Mach. Learn.
– start-page: 770
  year: 2016
  end-page: 778
  ident: bib0041
  article-title: Deep residual learning for image recognition
  publication-title: Proc. IEEE Conf. Comput. Vis. Pattern Recognit
– start-page: 6252
  year: 2018
  end-page: 6259
  ident: bib0032
  article-title: Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning
  publication-title: Proc. - IEEE Int. Conf. Robot. Autom
– volume: 34
  start-page: 151
  year: 2023
  end-page: 180
  ident: bib0034
  article-title: Attention-based advantage actor-critic algorithm with prioritized experience replay for complex 2-D robotic motion planning
  publication-title: J. Intell. Manuf.
– volume: 55
  start-page: 837
  year: 2021
  end-page: 844
  ident: bib0005
  article-title: An overview of robot applications in automotive industry
  publication-title: Transp. Res. Procedia
– volume: 58
  start-page: 8769
  year: 2013
  end-page: 8782
  ident: bib0049
  article-title: Feasibility of integrating a multi-camera optical tracking system in intra-operative electron radiation therapy scenarios
  publication-title: Phys. Med. Biol.
– reference: , pp. 1–17, 2018.
– start-page: 1008
  year: 2000
  end-page: 1014
  ident: bib0039
  article-title: Actor-critic algorithms
  publication-title: Adv. Neural Inf. Process. Syst
– start-page: 179
  year: 2019
  end-page: 198
  ident: bib0036
  article-title: An introduction to Markov chains
  publication-title: Basics Probab. Stoch. Process.
– volume: 48
  start-page: 1928
  year: 2016
  end-page: 1937
  ident: bib0014
  article-title: Asynchronous methods for deep reinforcement learning
  publication-title: Proceedings of Machine Learning Research
– reference: , pp. 1–8, 2022.
– reference: T. Haarnoja et al., “Soft actor-critic algorithms and applications,”
– reference: P. Christodoulou, “Soft actor-critic for discrete action settings,”
– start-page: 1
  year: 2021
  end-page: 15
  ident: bib0030
  article-title: Distributional soft actor-critic: off-policy reinforcement learning for addressing value estimation errors
  publication-title: IEEE Trans. Neural Networks Learn. Syst.
– volume: 2018
  year: 2018
  ident: bib0047
  article-title: Comparative analysis of OptiTrack motion capture systems
  publication-title: Proc. Can. Soc. Mech. Eng. Int. Congr
– reference: , pp. 1–13, 2021.
– volume: 225
  start-page: 188
  year: 2017
  end-page: 197
  ident: bib0042
  article-title: G-MS2F: googLeNet based multi-stage feature fusion of deep CNN for scene recognition
  publication-title: Neurocomputing
– volume: 2
  start-page: 100
  year: 2008
  end-page: 107
  ident: bib0011
  article-title: Reciprocal velocity obstacles for real-time multi-agent navigation
  publication-title: Proc. - IEEE Int. Conf. Robot. Autom
– reference: C. Chen, S. Hu, P. Nikdel, G. Mori, and M. Savva, “Relational graph learning for crowd navigation,” in
– start-page: 3052
  year: 2018
  end-page: 3059
  ident: bib0035
  article-title: Motion planning among dynamic, decision-making agents with deep reinforcement learning
  publication-title: 2018 IEEE/RSJ Int. Conf. Intell. Robot. Syst
– start-page: 2094
  year: 2016
  end-page: 2100
  ident: bib0021
  article-title: Deep reinforcement learning with double Q-learning
  publication-title: 30th AAAI Conf. Artif. Intell. AAAI 2016
– start-page: 4601
  year: 2018
  end-page: 4607
  ident: bib0019
  article-title: Social attention: modeling attention in Human crowds
  publication-title: Proc. - IEEE Int. Conf. Robot. Autom
– start-page: 1054
  year: 2016
  end-page: 1062
  ident: bib0025
  article-title: Safe and efficient off-policy reinforcement learning
  publication-title: 30th Conf. Neural Inf. Process. Syst. (NIPS2016)
– reference: C. Zhou, C. Wang, H. Hassan, H. Shah, B. Huang, and P. Fränti, “Bayesian inference for data-efficient, explainable, and safe robotic motion planning : a review,” arXiv:2307.08024, pp. 1–33.
– volume: 4
  start-page: 2939
  year: 2016
  end-page: 2947
  ident: bib0022
  article-title: Dueling network architectures for deep reinforcement learning
  publication-title: 33rd Int. Conf. Mach. Learn. ICML 2016
– start-page: 1465
  year: 2019
  ident: 10.1016/j.robot.2025.105167_bib0048
  article-title: A comparative evaluation of SteamVR tracking and the OptiTrack system for medical device tracking
– year: 2004
  ident: 10.1016/j.robot.2025.105167_bib0040
– volume: 4
  start-page: 2587
  year: 2018
  ident: 10.1016/j.robot.2025.105167_bib0031
  article-title: Addressing function approximation error in actor-critic methods
– start-page: 1
  year: 2019
  ident: 10.1016/j.robot.2025.105167_bib0044
  article-title: How powerful are graph neural networks?
– start-page: 6252
  year: 2018
  ident: 10.1016/j.robot.2025.105167_bib0032
  article-title: Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning
– start-page: 1
  year: 2021
  ident: 10.1016/j.robot.2025.105167_bib0030
  article-title: Distributional soft actor-critic: off-policy reinforcement learning for addressing value estimation errors
  publication-title: IEEE Trans. Neural Networks Learn. Syst.
– volume: 3
  start-page: 2171
  year: 2017
  ident: 10.1016/j.robot.2025.105167_bib0023
  article-title: Reinforcement learning with deep energy-based policies
– volume: 36
  issue: October 2020
  year: 2020
  ident: 10.1016/j.robot.2025.105167_bib0001
  article-title: Interaction between hotel service robots and humans: a hotel-specific service robot acceptance model (sRAM)
  publication-title: Tour. Manag. Perspect
– start-page: 188
  year: 2013
  ident: 10.1016/j.robot.2025.105167_bib0008
  article-title: Optimal motion planning with the half-car dynamical model for autonomous high-speed driving
– start-page: 1
  year: 2013
  ident: 10.1016/j.robot.2025.105167_bib0013
  article-title: Playing Atari with Deep Reinforcement Learning
  publication-title: arXiv
– volume: 2019-May
  start-page: 6015
  year: 2019
  ident: 10.1016/j.robot.2025.105167_bib0018
  article-title: Crowd-robot interaction: crowd-aware robot navigation with attention-based deep reinforcement learning
– ident: 10.1016/j.robot.2025.105167_bib0029
– ident: 10.1016/j.robot.2025.105167_bib0020
  doi: 10.1109/LRA.2022.3199674
– volume: 402
  start-page: 346
  year: 2020
  ident: 10.1016/j.robot.2025.105167_bib0012
  article-title: Automatic obstacle avoidance of quadrotor UAV via CNN-based learning
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2020.04.020
– volume: 14
  start-page: 1
  issue: 3
  year: 2020
  ident: 10.1016/j.robot.2025.105167_bib0017
  article-title: Graph representation learning
  publication-title: Synth. Lect. Artif. Intell. Mach. Learn.
– start-page: 1054
  year: 2016
  ident: 10.1016/j.robot.2025.105167_bib0025
  article-title: Safe and efficient off-policy reinforcement learning
– volume: 4
  start-page: 2939
  year: 2016
  ident: 10.1016/j.robot.2025.105167_bib0022
  article-title: Dueling network architectures for deep reinforcement learning
– volume: 58
  start-page: 8769
  issue: 24
  year: 2013
  ident: 10.1016/j.robot.2025.105167_bib0049
  article-title: Feasibility of integrating a multi-camera optical tracking system in intra-operative electron radiation therapy scenarios
  publication-title: Phys. Med. Biol.
  doi: 10.1088/0031-9155/58/24/8769
– volume: 1995
  start-page: 30
  year: 1995
  ident: 10.1016/j.robot.2025.105167_bib0037
  article-title: Residual algorithms: reinforcement learning with function approximation
  publication-title: Mach. Learn. Proc.
– volume: 2018
  year: 2018
  ident: 10.1016/j.robot.2025.105167_bib0047
  article-title: Comparative analysis of OptiTrack motion capture systems
– start-page: 961
  year: 2016
  ident: 10.1016/j.robot.2025.105167_bib0015
  article-title: Social LSTM: human trajectory prediction in crowded spaces
– volume: 1
  start-page: 605
  year: 2014
  ident: 10.1016/j.robot.2025.105167_bib0024
  article-title: Deterministic policy gradient algorithms
– volume: 12
  start-page: 566
  issue: 4
  year: 1996
  ident: 10.1016/j.robot.2025.105167_bib0009
  article-title: Probabilistic roadmaps for path planning in high-dimensional configuration spaces
  publication-title: IEEE Trans. Robot. Autom.
  doi: 10.1109/70.508439
– start-page: 770
  year: 2016
  ident: 10.1016/j.robot.2025.105167_bib0041
  article-title: Deep residual learning for image recognition
– volume: 34
  start-page: 151
  year: 2023
  ident: 10.1016/j.robot.2025.105167_bib0034
  article-title: Attention-based advantage actor-critic algorithm with prioritized experience replay for complex 2-D robotic motion planning
  publication-title: J. Intell. Manuf.
  doi: 10.1007/s10845-022-01988-z
– volume: 4
  start-page: 23
  issue: 1
  year: 1997
  ident: 10.1016/j.robot.2025.105167_bib0010
  article-title: The dynamic window approach to collision avoidance
  publication-title: IEEE Robot. Autom. Mag.
  doi: 10.1109/100.580977
– start-page: 4601
  year: 2018
  ident: 10.1016/j.robot.2025.105167_bib0019
  article-title: Social attention: modeling attention in Human crowds
– start-page: 179
  year: 2019
  ident: 10.1016/j.robot.2025.105167_bib0036
  article-title: An introduction to Markov chains
  publication-title: Basics Probab. Stoch. Process.
  doi: 10.1007/978-3-030-32323-3_12
– volume: 518
  start-page: 529
  issue: 7540
  year: 2015
  ident: 10.1016/j.robot.2025.105167_bib0038
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
  doi: 10.1038/nature14236
– volume: 225
  start-page: 188
  year: 2017
  ident: 10.1016/j.robot.2025.105167_bib0042
  article-title: G-MS2F: GoogLeNet based multi-stage feature fusion of deep CNN for scene recognition
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2016.11.023
– volume: 55
  start-page: 837
  year: 2021
  ident: 10.1016/j.robot.2025.105167_bib0005
  article-title: An overview of robot applications in automotive industry
  publication-title: Transp. Res. Procedia
  doi: 10.1016/j.trpro.2021.07.052
– start-page: 1008
  year: 2000
  ident: 10.1016/j.robot.2025.105167_bib0039
  article-title: Actor-critic algorithms
  publication-title: Adv. Neural Inf. Process. Syst.
– ident: 10.1016/j.robot.2025.105167_bib0045
– ident: 10.1016/j.robot.2025.105167_bib0026
– volume: 3
  start-page: 1
  year: 2009
  ident: 10.1016/j.robot.2025.105167_bib0046
  article-title: ROS: an open-source robot operating system
  publication-title: ICRA Workshop on Open Source Software
– volume: 5
  start-page: 2976
  year: 2018
  ident: 10.1016/j.robot.2025.105167_bib0028
  article-title: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor
– start-page: 4700
  year: 2017
  ident: 10.1016/j.robot.2025.105167_bib0043
  article-title: Densely connected convolutional networks
– ident: 10.1016/j.robot.2025.105167_bib0027
– volume: 165
  year: 2022
  ident: 10.1016/j.robot.2025.105167_bib0003
  article-title: Autonomous robot-driven deliveries: a review of recent developments and future directions
  publication-title: Transp. Res. Part E
  doi: 10.1016/j.tre.2022.102834
– ident: 10.1016/j.robot.2025.105167_bib0050
– volume: 48
  start-page: 1928
  year: 2016
  ident: 10.1016/j.robot.2025.105167_bib0014
  article-title: Asynchronous methods for deep reinforcement learning
– start-page: 285
  year: 2017
  ident: 10.1016/j.robot.2025.105167_bib0033
  article-title: Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning
– ident: 10.1016/j.robot.2025.105167_bib0016
– start-page: 3052
  year: 2018
  ident: 10.1016/j.robot.2025.105167_bib0035
  article-title: Motion planning among dynamic, decision-making agents with deep reinforcement learning
– volume: 1
  start-page: 269
  issue: 1
  year: 1959
  ident: 10.1016/j.robot.2025.105167_bib0007
  article-title: A note on two problems in connexion with graphs
  publication-title: Numer. Math.
  doi: 10.1007/BF01386390
– start-page: 1
  year: 2021
  ident: 10.1016/j.robot.2025.105167_bib0002
  article-title: Parcel delivery for smart cities: a synchronization approach for combined truck-drone-street robot deliveries
– volume: 4
  start-page: 100
  issue: 2
  year: 1968
  ident: 10.1016/j.robot.2025.105167_bib0006
  article-title: A formal basis for the heuristic determination of minimum cost paths
  publication-title: IEEE Trans. Syst. Sci. Cybern.
  doi: 10.1109/TSSC.1968.300136
– volume: 14
  start-page: 1569
  issue: 4
  year: 2020
  ident: 10.1016/j.robot.2025.105167_bib0004
  article-title: Industry 4.0: examples of the use of the robotic arm for digital manufacturing processes
  publication-title: Int. J. Interact. Des. Manuf.
  doi: 10.1007/s12008-020-00714-4
– start-page: 2094
  year: 2016
  ident: 10.1016/j.robot.2025.105167_bib0021
  article-title: Deep reinforcement learning with double Q-learning
– volume: 2
  start-page: 100
  year: 2008
  ident: 10.1016/j.robot.2025.105167_bib0011
  article-title: Reciprocal velocity obstacles for real-time multi-agent navigation
StartPage 105167
SubjectTerms Intelligent robot
Motion planning
Navigation
Reinforcement learning
Representation learning
Title Hybrid of representation learning and reinforcement learning for dynamic and complex robotic motion planning
URI https://dx.doi.org/10.1016/j.robot.2025.105167
https://doi.org/10.1016/j.robot.2025.105167
Volume 194