Transfer learning-based sparse-reward meta-Q-learning algorithm for active SLAM

Bibliographic Details
Published in: Expert Systems with Applications, Vol. 299, p. 129799
Main Authors: Liu, Xin; Wen, Shuhuan; Guo, Zhengzheng; Liu, Huaping
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.03.2026
ISSN: 0957-4174
DOI: 10.1016/j.eswa.2025.129799

Summary: Path planning is a widely researched topic, especially in complex unknown environments where dynamic changes and sensor drift introduce significant uncertainties. This paper proposes a transfer-learning-based prioritized experience replay meta-Q-learning algorithm. To solve the sparse-reward problem caused by dynamic changes and sensor drift in complex unknown environments, a path-planning algorithm based on sparse-reward meta-Q-learning is proposed. In the adaptation stage, the learned advantage function and transfer learning are introduced to improve the decision-making capability and generalization of the algorithm, ensuring stable performance under uncertain conditions. To address the low sample efficiency of the reinforcement learning experience replay buffer, sampling weights are designed, and bias estimation is used to maximize performance on new tasks in meta-learning. These refinements bring the policy closer to optimal action decisions and further shorten training time. Experimental results demonstrate that, compared with state-of-the-art meta-reinforcement learning algorithms, the proposed algorithm is more robust to uncertainty: it effectively avoids obstacles and successfully accomplishes localization and mapping tasks in dynamic and noisy environments, validating its reliability under uncertain conditions. Code is available at: https://github.com/XinLiu98/Transfer-Learning-based-sparse-reward-meta-Q-learning-Algorithm-for-Active-SLAM.git
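The abstract's "sampling weights" and "bias estimation" refer to the standard ingredients of prioritized experience replay: transitions are sampled in proportion to their priority, and importance-sampling weights correct the resulting bias. The sketch below illustrates that generic mechanism only; the function name `per_sample` and the hyperparameters `alpha`, `beta`, and `eps` are illustrative assumptions, not the paper's actual formulation or values.

```python
# Minimal sketch of prioritized experience replay sampling, assuming
# proportional prioritization: priority p_i = |TD error_i| + eps,
# sampling probability P(i) = p_i^alpha / sum_j p_j^alpha, and
# importance-sampling weight w_i = (N * P(i))^(-beta), max-normalized.
import numpy as np

def per_sample(td_errors, alpha=0.6, beta=0.4, eps=1e-6, k=2, rng=None):
    """Sample k transition indices and their importance-sampling weights."""
    rng = rng or np.random.default_rng(0)
    priorities = np.abs(td_errors) + eps           # p_i = |delta_i| + eps
    scaled = priorities ** alpha
    probs = scaled / scaled.sum()                  # P(i)
    idx = rng.choice(len(td_errors), size=k, p=probs)
    # Weights undo the bias from non-uniform sampling; dividing by the
    # maximum keeps all weights in (0, 1] for update stability.
    weights = (len(td_errors) * probs[idx]) ** (-beta)
    weights /= weights.max()
    return idx, weights

idx, w = per_sample(np.array([0.1, 2.0, 0.5, 1.2]))
```

Transitions with large TD error (here the value 2.0) are drawn more often, while their down-weighted updates keep the expected gradient close to that of uniform sampling.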