Exploring PPO in G2RL: A Reinforcement Learning-Based Path Planning Approach to Dynamic Environments
| Published in | 2025 3rd International Conference on Control and Robot Technology (ICCRT), pp. 58-64 |
|---|---|
| Main Authors | Abraham Kojo Yalley (School of Artificial Intelligence and Automation, Wuhan University of Science and Technology, Wuhan, China); Yang Chen (School of Artificial Intelligence and Automation, Wuhan University of Science and Technology, Wuhan, China); Hao Fu (School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China) |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 16.04.2025 |
| EISBN | 9798331533755 |
| Funding | National Natural Science Foundation of China (grants 62173262, 62303357) |
| Subjects | Path planning; Reinforcement learning; Hierarchical reinforcement learning; Proximal policy optimization; Dynamic environments; Autonomous robots; Decision making; Optimization; Adaptation models; Reproducibility of results; Stability analysis; Training; Tuning |
| Online Access | https://ieeexplore.ieee.org/document/11072787 |
| DOI | 10.1109/ICCRT63554.2025.11072787 |
| Abstract | Autonomous navigation in dynamic environments presents significant challenges for reinforcement learning (RL)-based robot navigation, including adapting to real-time obstacle dynamics and ensuring reproducibility of results across frameworks. The Globally Guided Reinforcement Learning (G2RL) framework offers a promising hierarchical approach, combining A*-based global path planning with local decision-making via Double Deep Q-Learning (DDQN). However, value-based methods like DDQN can suffer from instability and suboptimal performance in highly dynamic environments. This paper investigates the feasibility of replacing DDQN with Proximal Policy Optimization (PPO), a policy-gradient method known for its stability and adaptability, within the G2RL framework. Using the original G2RL environment configuration and reward structure, this study compares the performance of PPO and DDQN under identical conditions. Both models were trained on a single random map with 10 dynamic obstacles and tested on the same map with 60 obstacles. The results reveal that while the DDQN implementation failed to replicate the original paper's reported performance, PPO demonstrated robustness under dynamic conditions and showed potential as a viable alternative for hierarchical frameworks. This study highlights the importance of reproducibility in RL research and showcases PPO's adaptability, even though its overall performance requires further optimization for real-world applications. |
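For context on the architecture the abstract summarizes, below is a minimal, illustrative Python sketch of the hierarchical G2RL-style loop: an A* planner computes a global guidance path on the static map, and a local policy (DDQN in the original framework, PPO in this study) chooses per-step actions that are rewarded for staying on that guidance. PPO itself optimizes the clipped surrogate objective $L^{CLIP}(\theta)=\mathbb{E}_t[\min(r_t(\theta)\hat{A}_t,\ \mathrm{clip}(r_t(\theta),1-\epsilon,1+\epsilon)\hat{A}_t)]$ with probability ratio $r_t(\theta)=\pi_\theta(a_t\mid s_t)/\pi_{\theta_{\mathrm{old}}}(a_t\mid s_t)$, which is the source of the stability the abstract refers to. The grid setup, reward shaping, and stubbed policy below are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of a G2RL-style hierarchical loop (assumed structure,
# not the paper's code): A* supplies global guidance; a local policy acts.
import heapq
import random

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # 4-connected moves + wait

def a_star(grid, start, goal):
    """Global planner: shortest path on the static map (dynamic obstacles ignored)."""
    def h(p):  # Manhattan-distance heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0, start, [start])]  # (f, g, cell, path)
    seen = {start}
    while frontier:
        _, g, pos, path = heapq.heappop(frontier)
        if pos == goal:
            return path
        for dx, dy in MOVES[:4]:  # the planner never waits
            nxt = (pos[0] + dx, pos[1] + dy)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0 and nxt not in seen):
                seen.add(nxt)
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None  # goal unreachable

def local_policy(observation):
    """Stand-in for the trained local agent (PPO or DDQN); random for this sketch."""
    return random.randrange(len(MOVES))

def step_reward(pos, guidance):
    """Assumed G2RL-style shaping: bonus for landing on the global path,
    small time penalty otherwise."""
    return 1.0 if pos in guidance else -0.01

if __name__ == "__main__":
    grid = [[0] * 8 for _ in range(8)]
    grid[3][2] = grid[3][3] = grid[3][4] = 1  # a static wall
    guidance = set(a_star(grid, (0, 0), (7, 7)))
    pos, ret = (0, 0), 0.0
    for _ in range(50):  # one rollout with the stub policy
        dx, dy = MOVES[local_policy(pos)]
        nxt = (pos[0] + dx, pos[1] + dy)
        if 0 <= nxt[0] < 8 and 0 <= nxt[1] < 8 and grid[nxt[0]][nxt[1]] == 0:
            pos = nxt  # blocked moves are treated as a wait
        ret += step_reward(pos, guidance)
        if pos == (7, 7):
            break
    print(f"return: {ret:.2f}, reached goal: {pos == (7, 7)}")
```

In the paper's setting, the random stub is replaced by a policy trained against this kind of guidance-following reward, which is how the local learner inherits the global planner's route while still reacting to dynamic obstacles the planner cannot see.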
    