A novel reinforcement learning-based multi-operator differential evolution with cubic spline for the path planning problem
| Published in | Artificial Intelligence Review, Vol. 58, No. 5, p. 142 |
|---|---|
| Main Authors | , , , |
| Format | Journal Article |
| Language | English |
| Published | Dordrecht: Springer Netherlands (Springer Nature B.V.), 24.02.2025 |
| Subjects | |
| ISSN | 1573-7462, 0269-2821 |
| DOI | 10.1007/s10462-025-11129-6 |
| Summary: | Path planning in autonomous driving systems remains a critical challenge, requiring algorithms capable of generating safe, efficient, and reliable routes. Existing state-of-the-art methods, including graph-based and sampling-based approaches, often produce sharp, suboptimal paths and struggle in complex search spaces, while trajectory-based algorithms suffer from high computational costs. Recently, meta-heuristic optimization algorithms have shown effective performance but often lack learning ability due to their inherent randomness. This paper introduces a unified benchmarking framework, named Reda's Path Planning Benchmark 2024 (RP2B-24), alongside two novel reinforcement learning (RL)-based path-planning algorithms: Q-Spline Multi-Operator Differential Evolution (QSMODE), which uses Q-learning (Q-tables), and Deep Q-Spline Multi-Operator Differential Evolution (DQSMODE), which is based on deep Q-networks (DQN). Both algorithms are integrated under a single framework and enhanced with cubic spline interpolation to improve path smoothness and adaptability. The proposed RP2B-24 library comprises 50 distinct benchmark problems, offering a comprehensive and generalizable testing ground for diverse path-planning algorithms. Unlike traditional approaches, RL in QSMODE/DQSMODE is not merely a parameter-adjustment method but is fully used to generate paths from the accumulated search experience, improving path quality. QSMODE/DQSMODE introduces a unique self-training update mechanism for the Q-table and DQN based on candidate paths within the algorithm's population, complemented by a secondary update method that increases population diversity through random action selection. An adaptive RL switching probability dynamically alternates between these Q-table update modes. DQSMODE and QSMODE demonstrated superior performance, outperforming 22 state-of-the-art algorithms, including IMODEII. The two algorithms ranked first and second in the Friedman and SNE-SR ranking tests, achieving scores of 99.2877 (DQSMODE) and 93.0463 (QSMODE), with statistically significant results in the Wilcoxon test. The practical applicability of the algorithm was validated on a ROS-based system using a four-wheel differential-drive robot, which successfully followed the planned paths in two driving scenarios, demonstrating the algorithm's feasibility and effectiveness in real-world settings. The source code for the proposed benchmark and algorithm is publicly available for further research and experimentation at https://github.com/MohamedRedaMu/RP2B24-Benchmark and https://github.com/MohamedRedaMu/QSMODEAlgorithm. |
|---|---|
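The abstract notes that candidate paths in QSMODE/DQSMODE are smoothed with cubic spline interpolation before evaluation. The snippet below is a minimal, illustrative sketch of that kind of smoothing step using SciPy's `CubicSpline`; the function name `smooth_path`, the chord-length parameterization, and the sampling density are assumptions made for illustration and are not taken from the released QSMODE/DQSMODE code.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def smooth_path(waypoints, samples_per_segment=20):
    """Interpolate a candidate path's waypoints with a cubic spline.

    waypoints: (N, 2) array of (x, y) points, e.g. one individual from a
    DE population, including start and goal. Returns a densely sampled
    smooth path as an (M, 2) array.
    """
    waypoints = np.asarray(waypoints, dtype=float)
    # Parameterize by cumulative chord length so uneven spacing between
    # waypoints does not distort the resulting curve.
    deltas = np.linalg.norm(np.diff(waypoints, axis=0), axis=1)
    t = np.concatenate(([0.0], np.cumsum(deltas)))
    spline = CubicSpline(t, waypoints, axis=0)
    t_dense = np.linspace(t[0], t[-1], samples_per_segment * (len(waypoints) - 1))
    return spline(t_dense)

# Example: smooth a crude four-waypoint candidate path.
raw = [(0.0, 0.0), (2.0, 3.0), (5.0, 2.5), (8.0, 6.0)]
path = smooth_path(raw)
print(path.shape)  # (60, 2): a dense, smooth polyline through the waypoints
```

In a DE-based planner of this kind, the densely sampled spline (rather than the raw waypoints) would typically be what is checked for collisions and scored by the path-length and smoothness terms of the objective.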
| ISSN: | 1573-7462, 0269-2821 |
| DOI: | 10.1007/s10462-025-11129-6 |