Humanoids Learning to Walk: A Natural CPG-Actor-Critic Architecture
| Published in | Frontiers in neurorobotics Vol. 7; no. 5; p. 5 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | Switzerland: Frontiers Research Foundation, 01.01.2013; Frontiers Media S.A |
| Subjects | |
| ISSN | 1662-5218 |
| DOI | 10.3389/fnbot.2013.00005 |
| Summary: | The identification of learning mechanisms for locomotion has long been a subject of research, but many challenges remain. Dynamic systems theory (DST) offers a novel approach to humanoid learning through environmental interaction. Reinforcement learning (RL) offers a promising method to adaptively link the dynamic system to the environment it interacts with via a reward-based value system. In this paper, we propose a model that integrates these perspectives and apply it to the case of a humanoid (NAO) robot learning to walk, an ability that emerges from its value-based interaction with the environment. In the model, a simplified central pattern generator (CPG) architecture inspired by neuroscientific research and DST is integrated with an actor-critic approach to RL (cpg-actor-critic). In the cpg-actor-critic architecture, least-squares temporal-difference (LSTD) based learning converges to the optimal solution quickly by using natural gradient learning and balancing exploration and exploitation. Furthermore, rather than using a traditional (designer-specified) reward, it uses a dynamic value function as a stability indicator that adapts to the environment. The results obtained are analyzed using a novel DST-based embodied cognition approach. Learning to walk, from this perspective, is a process of integrating levels of sensorimotor activity and value. |
|---|---|
| Bibliography: | Edited by: Jeffrey L. Krichmar, University of California Irvine, USA. Reviewed by: Mehdi Khamassi, CNRS, France; Poramate Manoonpong, Georg-August-Universität Göttingen, Germany; Calogero M. Oddo, Scuola Superiore Sant’Anna, Italy |