Path Planning of Maritime Autonomous Surface Ships in Unknown Environment with Reinforcement Learning
| Published in | Cognitive Systems and Signal Processing Vol. 1006; pp. 127 - 137 |
|---|---|
| Main Authors | , , , |
| Format | Book Chapter |
| Language | English |
| Published | Singapore: Springer, 2019 (Springer Singapore) |
| Series | Communications in Computer and Information Science |
| ISBN | 9789811379857 9811379858 |
| ISSN | 1865-0929 1865-0937 |
| DOI | 10.1007/978-981-13-7986-4_12 |
| Summary: | Artificial intelligence algorithms, notably reinforcement learning and deep learning, have recently advanced autonomous driving technology. For the shipping industry, research and development of maritime autonomous surface ships (MASS) has both academic value and practical significance. In an unknown environment, a MASS interacts with its surroundings to carry out behavioral decision-making, intelligent collision avoidance, and path planning. Reinforcement learning balances exploration and exploitation, improving the agent's behavior from the reward data it obtains by interacting with the environment. To achieve intelligent collision avoidance and path planning for MASS in unknown environments, a path planning algorithm based on reinforcement learning is therefore established. First, the research status of unmanned ships and reinforcement learning is reviewed, and the four basic elements of reinforcement learning are analyzed: environment model, incentive function, value function, and strategy. Second, the port environment model, sensor model, MASS behavioral space, reward function, and action selection strategy are each designed; the reward function combines two terms, one for avoiding obstacles and one for approaching the target point. Finally, a simulation experiment built on Python and Pygame, with Rizhao Harbor District as a case study, is used to verify the method's self-adaptability. The model successfully avoids obstacles through online trial-and-error self-learning and plans adaptive paths in unknown environments. |
|---|---|
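
The summary names a reward function built from obstacle avoidance and target approach, plus an action selection strategy that balances exploration and exploitation. The record does not say which algorithm or reward values the chapter uses, so the following is only a minimal sketch under stated assumptions: tabular Q-learning with epsilon-greedy selection on a hypothetical 5x5 grid. The grid, the reward magnitudes, and every name (`GRID`, `step`, `choose_action`, `train`, `greedy_path`) are illustrative and are not taken from the chapter.

```python
import random

# Hypothetical 5x5 grid standing in for a harbor chart; 1 marks an obstacle.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    r, c = state[0] + action[0], state[1] + action[1]
    # Assumed penalty: leaving the grid or hitting an obstacle is a collision,
    # and the ship stays where it was.
    if not (0 <= r < 5 and 0 <= c < 5) or GRID[r][c] == 1:
        return state, -10.0, False
    if (r, c) == GOAL:
        return (r, c), 100.0, True
    # Assumed approach term: change in Manhattan distance to the goal,
    # minus a small per-step cost.
    before = abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])
    after = abs(r - GOAL[0]) + abs(c - GOAL[1])
    return (r, c), float(before - after) - 0.1, False


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    return values.index(max(values))


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning by online trial and error; returns the Q-table."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(5) for c in range(5)}
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on episode length
            a = choose_action(q, state, epsilon, rng)
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned greedy policy from START and record visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = q[state].index(max(q[state]))
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the list of cells the learned policy visits from `START`. A real harbor model would replace the grid with chart and sensor data, and the reward weights would need tuning.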