Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation

Bibliographic Details
Published in	Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 13729-13738
Main Authors	James, Stephen; Wada, Kentaro; Laidlow, Tristan; Davison, Andrew J.
Format	Conference Proceeding
Language	English
Published	IEEE 01.06.2022
Subjects	Computer vision; Machine vision; Pattern recognition; Q-learning; Robot vision systems; Task analysis; Vision applications and systems; Others; Robot vision; Visualization
ISSN	1063-6919
DOI	10.1109/CVPR52688.2022.01337

More Information
Summary:	We present a coarse-to-fine discretisation method that enables the use of discrete reinforcement learning approaches in place of unstable and data-inefficient actor-critic methods in continuous robotics domains. This approach builds on the recently released ARM algorithm, replacing its continuous next-best pose agent with a discrete one that uses coarse-to-fine Q-attention. Given a voxelised scene, coarse-to-fine Q-attention learns what part of the scene to 'zoom' into. When this 'zooming' behaviour is applied iteratively, it results in a near-lossless discretisation of the translation space and allows the use of a discrete-action deep Q-learning method. We show that our new coarse-to-fine algorithm achieves state-of-the-art performance on several difficult sparsely rewarded RLBench vision-based robotics tasks, and can train real-world policies, tabula rasa, in a matter of minutes, with as few as 3 demonstrations.
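
The iterative 'zooming' described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the learned Q-attention network is replaced by a hypothetical stand-in (`toy_q`, which simply scores voxels by their distance to a known target), and the grid size `n`, the number of levels `depth`, and all function names are assumptions chosen for illustration. The sketch only shows why repeatedly picking the best voxel and re-voxelising around it gives a fine translation resolution from a small discrete action set.

```python
import numpy as np

def voxel_centres(centre, extent, n):
    """Centres of an n x n x n grid covering a cube of side `extent` around `centre`."""
    step = extent / n
    offsets = (np.arange(n) + 0.5) * step - extent / 2.0
    gx, gy, gz = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    return centre + np.stack([gx, gy, gz], axis=-1)  # shape (n, n, n, 3)

def coarse_to_fine(q_fn, centre, extent, n=16, depth=2):
    """Pick the highest-scoring voxel, then zoom into it; repeat `depth` times."""
    centre = np.asarray(centre, dtype=float)
    for _ in range(depth):
        centres = voxel_centres(centre, extent, n)
        q = q_fn(centres)                              # one score per voxel, shape (n, n, n)
        idx = np.unravel_index(np.argmax(q), q.shape)  # discrete action: a voxel index
        centre = centres[idx]                          # next grid is centred on that voxel
        extent = extent / n                            # and spans only that voxel
    return centre, extent  # selected translation and final voxel size

def toy_q(target):
    """Hypothetical stand-in for a learned Q-function: closer voxels score higher."""
    target = np.asarray(target, dtype=float)
    def q_fn(centres):
        return -np.linalg.norm(centres - target, axis=-1)
    return q_fn

# Illustrative 1 m workspace cube; the target is an arbitrary point inside it.
target = np.array([0.31, -0.12, 0.45])
pos, res = coarse_to_fine(toy_q(target), centre=[0.0, 0.0, 0.5], extent=1.0, n=16, depth=2)
print(pos, res)
```

Under these assumptions, each level chooses among n^3 discrete actions, while the final voxel size shrinks to extent / n^depth, so two levels of a 16^3 grid resolve the same space as a single 256^3 grid would.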