Opportunities and challenges in applying reinforcement learning to robotic manipulation: An industrial case study

Bibliographic Details
Published in: Manufacturing Letters, Vol. 35, pp. 1019–1030
Main Authors: Toner, Tyler; Saez, Miguel; Tilbury, Dawn M.; Barton, Kira
Format: Journal Article
Language: English
Published: Elsevier Ltd, August 2023
Subjects: Wire harness installation; Industrial robotics; Reinforcement learning; Smart manufacturing
Online Access: https://www.sciencedirect.com/science/article/pii/S2213846323001128
ISSN: 2213-8463
DOI: 10.1016/j.mfglet.2023.08.055

Abstract: As manufacturing moves towards a more agile paradigm, industrial robots are expected to perform more complex tasks in less structured environments, complicating the use of traditional automation techniques and leading to growing interest in data-driven methods, like reinforcement learning (RL). In this work, we explore the process of applying RL to enable automation of a challenging industrial manipulation task. We focus on wire harness installation as a motivating example, which presents challenges for traditional automation due to the nonlinear dynamics of the deformable harness. A physical system was developed involving a three-terminal harness manipulated by a 6-DOF UR5 robot, with control enabled through a ROS interface. Modifications were made to the harness to enable simplified grasping and marker-based visual tracking. We detail the development of an RL formulation of the problem, subject to practical constraints on control and sensing motivated by the physical system. We develop a simulator and a basic scripted policy with which to safely generate a dataset of high-quality behaviors, then apply a state-of-the-art model-free offline RL algorithm, TD3+BC, to learn a policy to serve as a safe starting point on the physical system. Despite extensive tuning, we find that the algorithm fails to achieve acceptable performance. We propose three failure modalities to explain the learning performance, related to control frequency, task symmetry arising from problem simplifications, and unexpected policy complexity, and discuss opportunities for future applications.
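
For context on the algorithm the abstract names: TD3+BC (Fujimoto and Gu, 2021) augments the standard TD3 actor objective with a behavior-cloning term that keeps the learned policy close to the actions logged in the offline dataset, with a normalizer that keeps the two terms on a comparable scale. The sketch below illustrates that actor loss only; it is a minimal PyTorch sketch, and the network sizes, state/action dimensions, and batch here are hypothetical placeholders, not the authors' implementation.

```python
# Illustrative sketch of the TD3+BC actor loss (Fujimoto & Gu, 2021).
# All network sizes and dimensions are hypothetical, not from the paper.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, ALPHA = 17, 6, 2.5  # placeholder dims; ALPHA as in the TD3+BC paper

actor = nn.Sequential(
    nn.Linear(STATE_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, ACTION_DIM), nn.Tanh(),  # actions scaled to [-1, 1]
)
critic = nn.Sequential(
    nn.Linear(STATE_DIM + ACTION_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

def td3_bc_actor_loss(states, dataset_actions):
    # Standard TD3 actor term: the critic's value of the policy's own actions ...
    pi = actor(states)
    q = critic(torch.cat([states, pi], dim=-1))
    # ... scaled by lambda = alpha / mean|Q| so the RL and BC terms stay comparable,
    lam = ALPHA / q.abs().mean().detach()
    # ... plus a behavior-cloning penalty toward the logged dataset actions.
    bc = ((pi - dataset_actions) ** 2).mean()
    return -lam * q.mean() + bc

# Example update on one batch drawn from an offline dataset:
states = torch.randn(256, STATE_DIM)
actions = torch.randn(256, ACTION_DIM).clamp(-1.0, 1.0)
opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
opt.zero_grad()
td3_bc_actor_loss(states, actions).backward()
opt.step()
```

Note the detach on the normalizer: lambda is treated as a constant during backpropagation, so gradients flow only through the Q term and the behavior-cloning term.
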
Author Details
– Toner, Tyler (twtoner@umich.edu), University of Michigan, 2505 Hayward St., Ann Arbor, MI 48109, USA
– Saez, Miguel, General Motors, GM Tech Center Rd., Warren, MI 48092, USA
– Tilbury, Dawn M., University of Michigan, 2505 Hayward St., Ann Arbor, MI 48109, USA
– Barton, Kira, University of Michigan, 2505 Hayward St., Ann Arbor, MI 48109, USA
Copyright: 2023 The Author(s)
License: This is an open access article under the CC BY-NC-ND license.