Path Planning of Maritime Autonomous Surface Ships in Unknown Environment with Reinforcement Learning

Bibliographic Details
Published in	Cognitive Systems and Signal Processing, Vol. 1006, pp. 127-137
Main Authors	Wang, Chengbo; Zhang, Xinyu; Li, Ruijie; Dong, Peifang
Format	Book Chapter
Language	English
Published	Singapore: Springer Singapore, 2019
Series	Communications in Computer and Information Science
Subjects	Collision avoidance; Maritime autonomous surface ships; Path planning; Reinforcement learning
ISBN	9789811379857 9811379858
ISSN	1865-0929 1865-0937
DOI	10.1007/978-981-13-7986-4_12

Summary:	Recently, artificial intelligence algorithms, represented by reinforcement learning and deep learning, have promoted the development of autonomous driving technology. For the shipping industry, research and development of maritime autonomous surface ships (MASS) has both academic value and practical significance. In an unknown environment, MASS interacts with the environment to carry out behavioral decision-making, intelligent collision avoidance, and path planning. Reinforcement learning balances exploration and exploitation, improving its behavior by interacting with the environment to obtain reward data. Thus, to achieve intelligent collision avoidance and path planning for MASS in unknown environments, a MASS path planning algorithm based on reinforcement learning is established. First, the research status of unmanned ships and reinforcement learning is reviewed, and the four basic elements of reinforcement learning are analyzed: environment model, incentive function, value function, and strategy. Second, the port environment model, sensor model, MASS behavioral space, reward function, and action selection strategy are designed separately. The reward function consists of two parts: avoiding obstacles and approaching the target point. Finally, a simulation experiment built on Python and Pygame was carried out with the Rizhao Harbor District as a case study to verify that the method has better self-adaptability. The model successfully avoids obstacles through online trial-and-error self-learning and plans adaptive paths in unknown environments.
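
The summary names a reward built from obstacle avoidance and target approach, plus an action selection strategy that balances exploration and exploitation. The sketch below illustrates that general idea with tabular Q-learning and an epsilon-greedy policy. It is not the chapter's implementation: the 5x5 grid, the reward magnitudes, the hyperparameters, and all function names are assumptions chosen for illustration, and the record does not state which reinforcement learning algorithm the authors used.

```python
import random

# Hypothetical 5x5 grid (0 = open water, 1 = obstacle); not the paper's port model.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def distance(cell, goal=GOAL):
    """Manhattan distance from a cell to the target point."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < 5 and 0 <= c < 5) or GRID[r][c] == 1:
        return state, -10.0, True  # obstacle-avoidance term: collision penalty
    nxt = (r, c)
    if nxt == GOAL:
        return nxt, 10.0, True  # assumed bonus for reaching the target
    # Target-approach term: +1 when closer, -1 when farther, minus a step cost.
    return nxt, float(distance(state) - distance(nxt)) - 0.1, False


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    return values.index(max(values))


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table by trial and error; hyperparameters are assumptions."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap episode length
            a = choose_action(q, state, epsilon, rng)
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else max(
                q.get((nxt, b), 0.0) for b in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned policy without exploration and record the path."""
    rng = random.Random(0)  # unused when epsilon is 0, kept for the signature
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, ACTIONS[choose_action(q, state, 0.0, rng)])
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of grid cells the trained agent visits from `START`. Treating a collision as terminal is one possible design choice; a continuing task with a per-collision penalty would be an equally plausible reading of the summary.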