Opportunities and challenges in applying reinforcement learning to robotic manipulation: An industrial case study

Bibliographic Details
Published in: Manufacturing Letters, Vol. 35, pp. 1019–1030
Main Authors: Toner, Tyler; Saez, Miguel; Tilbury, Dawn M.; Barton, Kira
Format: Journal Article
Language: English
Published: Elsevier Ltd, August 2023
Subjects: Wire harness installation; Industrial robotics; Reinforcement learning; Smart manufacturing
Online Access: https://www.sciencedirect.com/science/article/pii/S2213846323001128
ISSN: 2213-8463
DOI: 10.1016/j.mfglet.2023.08.055

Abstract: As manufacturing moves towards a more agile paradigm, industrial robots are expected to perform more complex tasks in less structured environments, complicating the use of traditional automation techniques and leading to growing interest in data-driven methods, like reinforcement learning (RL). In this work, we explore the process of applying RL to enable automation of a challenging industrial manipulation task. We focus on wire harness installation as a motivating example, which presents challenges for traditional automation due to the nonlinear dynamics of the deformable harness. A physical system was developed involving a three-terminal harness manipulated by a 6-DOF UR5 robot, with control enabled through a ROS interface. Modifications were made to the harness to enable simplified grasping and marker-based visual tracking. We detail the development of an RL formulation of the problem, subject to practical constraints on control and sensing motivated by the physical system. We develop a simulator and a basic scripted policy with which to safely generate a dataset of high-quality behaviors, then apply a state-of-the-art model-free offline RL algorithm, TD3+BC, to learn a policy to serve as a safe starting point on the physical system. Despite extensive tuning, we find that the algorithm fails to achieve acceptable performance. We propose three failure modalities to explain the learning performance, related to control frequency, task symmetry arising from problem simplifications, and unexpected policy complexity, and discuss opportunities for future applications.
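
For context on the algorithm the abstract names: TD3+BC (Fujimoto and Gu, 2021) augments the standard TD3 actor objective with a behavior-cloning term that keeps the learned policy close to the actions logged in the offline dataset, with a normalizer that keeps the two terms on a comparable scale. The sketch below illustrates that actor loss only; it is a minimal PyTorch sketch, and the network sizes, state/action dimensions, and batch here are hypothetical placeholders, not the authors' implementation.

```python
# Illustrative sketch of the TD3+BC actor loss (Fujimoto & Gu, 2021).
# All network sizes and dimensions are hypothetical, not from the paper.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, ALPHA = 17, 6, 2.5  # placeholder dims; ALPHA as in the TD3+BC paper

actor = nn.Sequential(
    nn.Linear(STATE_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, ACTION_DIM), nn.Tanh(),  # actions scaled to [-1, 1]
)
critic = nn.Sequential(
    nn.Linear(STATE_DIM + ACTION_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

def td3_bc_actor_loss(states, dataset_actions):
    # Standard TD3 actor term: the critic's value of the policy's own actions ...
    pi = actor(states)
    q = critic(torch.cat([states, pi], dim=-1))
    # ... scaled by lambda = alpha / mean|Q| so the RL and BC terms stay comparable,
    lam = ALPHA / q.abs().mean().detach()
    # ... plus a behavior-cloning penalty toward the logged dataset actions.
    bc = ((pi - dataset_actions) ** 2).mean()
    return -lam * q.mean() + bc

# Example update on one batch drawn from an offline dataset:
states = torch.randn(256, STATE_DIM)
actions = torch.randn(256, ACTION_DIM).clamp(-1.0, 1.0)
opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
opt.zero_grad()
td3_bc_actor_loss(states, actions).backward()
opt.step()
```

Note the detach on the normalizer: lambda is treated as a constant during backpropagation, so gradients flow only through the Q term and the behavior-cloning term.
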
Author Details
– Toner, Tyler (twtoner@umich.edu), University of Michigan, 2505 Hayward St., Ann Arbor, MI 48109, USA
– Saez, Miguel, General Motors, GM Tech Center Rd., Warren, MI 48092, USA
– Tilbury, Dawn M., University of Michigan, 2505 Hayward St., Ann Arbor, MI 48109, USA
– Barton, Kira, University of Michigan, 2505 Hayward St., Ann Arbor, MI 48109, USA
Copyright: 2023 The Author(s)
License: This is an open access article under the CC BY-NC-ND license.