DeepGRU: Deep Gesture Recognition Utility

Bibliographic Details
Published in: Advances in Visual Computing, Vol. 11844, pp. 16-31
Main Authors: Maghoumi, Mehran; LaViola, Joseph J.
Format: Book Chapter
Language: English
Published: Springer International Publishing AG, Switzerland, 2019
Series: Lecture Notes in Computer Science
ISBN: 3030337197; 9783030337193
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/978-3-030-33720-9_2

More Information
Summary: We propose DeepGRU, a novel end-to-end deep network model informed by recent developments in deep learning for gesture and action recognition that is streamlined and device-agnostic. DeepGRU, which uses only raw skeleton, pose, or vector data, is quickly understood, implemented, and trained, and yet achieves state-of-the-art results on challenging datasets. At the heart of our method lies a set of stacked gated recurrent units (GRUs), two fully connected layers, and a novel global attention model. We evaluate our method on seven publicly available datasets, containing varying numbers of samples and spanning a broad range of interactions (full-body, multi-actor, hand gestures, etc.). In all but one case we outperform the state-of-the-art pose-based methods. For instance, we achieve recognition accuracies of 84.9% and 92.3% on the cross-subject and cross-view tests of the NTU RGB+D dataset, respectively, and 100% recognition accuracy on the UT-Kinect dataset. We show that even in the absence of powerful hardware or a large amount of training data, and with as few as four samples per class, DeepGRU can be trained in under 10 minutes while beating traditional methods specifically designed for small training sets, making it an enticing choice for rapid application prototyping and development.
Bibliography: The original version of this chapter was revised: the name of an author was tagged incorrectly. The correction to this chapter is available at https://doi.org/10.1007/978-3-030-33720-9_54
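Architecture sketch: The summary above describes DeepGRU as a stack of gated recurrent units followed by a global attention model and two fully connected layers over raw skeleton, pose, or vector sequences. The PyTorch snippet below is a minimal, hypothetical sketch of that kind of pipeline; the layer sizes, the simple learned attention, and all hyperparameters are illustrative assumptions and do not reproduce the authors' exact configuration.

# Hypothetical, minimal DeepGRU-like sequence classifier (not the authors' code).
# Layer sizes and the attention formulation are illustrative assumptions.
import torch
import torch.nn as nn


class DeepGRUSketch(nn.Module):
    def __init__(self, input_dim: int, num_classes: int, hidden: int = 256):
        super().__init__()
        # Stacked GRU encoder over raw skeleton/pose/vector frames.
        self.encoder = nn.GRU(input_dim, hidden, num_layers=2, batch_first=True)
        # Generic learned global attention over the encoder outputs
        # (a stand-in for the paper's global attention model).
        self.attn_score = nn.Linear(hidden, 1)
        # Two fully connected layers ending in class logits.
        self.fc = nn.Sequential(
            nn.Linear(hidden, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, input_dim) sequence of pose/skeleton features.
        h, _ = self.encoder(x)                               # (batch, time, hidden)
        weights = torch.softmax(self.attn_score(h), dim=1)   # (batch, time, 1)
        context = (weights * h).sum(dim=1)                   # (batch, hidden)
        return self.fc(context)                              # (batch, num_classes)


# Example: classify a batch of 8 sequences of 60 frames of 75-dim pose vectors.
model = DeepGRUSketch(input_dim=75, num_classes=10)
logits = model(torch.randn(8, 60, 75))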