The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place

Bibliographic Details
Published in	Biomimetics (Basel, Switzerland) Vol. 8; no. 2; p. 240
Main Authors	Kim, Byeongjun; Kwon, Gunam; Park, Chaneun; Kwon, Nam Kyu
Format	Journal Article
Language	English
Published	Switzerland: MDPI AG, 01.06.2023
Subjects	Algorithms; Artificial intelligence; Assembly lines; Collaboration; Control algorithms; Decomposition; deep reinforcement learning; Grasping; Pick-and-Place; Reinforcement; robot manipulator; Robots; Soft Actor-Critic; Success; task decomposition; United States
Online Access	Get full text
ISSN	2313-7673
DOI	10.3390/biomimetics8020240

Summary:	This paper proposes a reinforcement learning algorithm for the Pick-and-Place task, one of the high-level tasks of robot manipulators, based on task decomposition and a dedicated reward system. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one grasping task. One reaching task approaches the object, and the other reaches the place position. Each reaching task is carried out by the optimal policy of its own agent, trained using Soft Actor-Critic (SAC). Unlike the two reaching tasks, grasping is implemented via simple logic, which is easy to design but may result in improper gripping. To support the grasping task, a dedicated reward system for approaching the object is designed using individual axis-based weights. To verify the validity of the proposed method, we carry out various experiments in the MuJoCo physics engine with the Robosuite framework. According to the simulation results of four trials, the robot manipulator picked up the object and released it at the goal position with an average success rate of 93.2%.
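
The summary does not give the reward formula or the weight values, so the following Python sketch only illustrates the general idea of an axis-weighted approach reward and the three-subtask ordering. The names `AXIS_WEIGHTS`, `axis_weighted_reward`, `SUBTASKS`, and `next_subtask`, the weight values, and the weighted absolute-distance form are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Optional, Sequence

# Hypothetical per-axis weights (x, y, z). The values used in the paper
# are not stated in the summary; these are placeholders.
AXIS_WEIGHTS = (1.0, 1.0, 2.0)


def axis_weighted_reward(
    gripper_pos: Sequence[float],
    object_pos: Sequence[float],
    weights: Sequence[float] = AXIS_WEIGHTS,
) -> float:
    """Assumed reward form: negative weighted absolute distance per axis.

    A larger weight on one axis penalizes misalignment on that axis more,
    which is one way an approach reward could favor gripper poses that
    make a simple grasp more likely to succeed.
    """
    return -sum(
        w * abs(g - o) for w, g, o in zip(weights, gripper_pos, object_pos)
    )


# Subtask order as described in the summary: approach the object (SAC agent),
# grasp (simple logic), reach the place position (second SAC agent).
SUBTASKS = ("approach_object", "grasp", "reach_place_position")


def next_subtask(current: str) -> Optional[str]:
    """Return the subtask that follows `current`, or None after the last one."""
    index = SUBTASKS.index(current)
    return SUBTASKS[index + 1] if index + 1 < len(SUBTASKS) else None
```

With these placeholder weights, an offset along z is penalized twice as heavily as an equal offset along x or y, so an agent trained on such a reward would be pushed to correct height errors first. Which axes the paper actually emphasizes is not stated in the summary.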