CNN-based and DTW features for human activity recognition on depth maps

Bibliographic Details
Published in: Neural computing & applications, Vol. 33, no. 21, pp. 14551-14563
Main Authors: Trelinski, Jacek; Kwolek, Bogdan
Format: Journal Article
Language: English
Published: London, Springer London, 01.11.2021; Springer Nature B.V.
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-021-06097-1

Summary: In this work, we present a new algorithm for human action recognition on raw depth maps. First, for each class we train a separate one-against-all convolutional neural network (CNN) to extract class-specific features representing the person's shape. Each class-specific multivariate time-series is processed by a Siamese multichannel 1D CNN or a multichannel 1D CNN to determine features representing actions. Next, we calculate statistical features for the nonzero pixels representing the person's shape in each depth map. On multivariate time-series of these features we determine Dynamic Time Warping (DTW) features, which are based on the DTW distances between all training time-series. Finally, each class-specific feature vector is concatenated with the DTW feature vector. For each action category we train a multiclass classifier that predicts a probability distribution over class labels. From the pool of such classifiers we select a subset such that an ensemble built on them achieves the best classification accuracy. Action recognition is performed by a soft voting ensemble that averages the distributions calculated by the classifiers with the largest discriminative power. We demonstrate experimentally that on the MSR-Action3D and UTD-MHAD datasets the proposed algorithm attains promising results and outperforms several state-of-the-art depth-based algorithms.
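
The two steps of the summary that are easiest to show in isolation are the DTW features and the soft voting. The sketch below is illustrative only and is not the authors' implementation. It assumes a Euclidean frame-to-frame cost, assumes the raw DTW distances to the training sequences are used directly as the feature vector, and uses hypothetical function names (`dtw_distance`, `dtw_features`, `soft_vote`). The CNN feature extraction and the selection of classifiers for the ensemble are not covered.

```python
import math
from typing import List, Sequence

# One multivariate time-series: a sequence of per-frame feature vectors.
Series = Sequence[Sequence[float]]


def dtw_distance(a: Series, b: Series) -> float:
    """Classic O(len(a) * len(b)) DTW with a Euclidean local cost (assumed)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat frame of b
                                 cost[i][j - 1],      # repeat frame of a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def dtw_features(query: Series, training: List[Series]) -> List[float]:
    """Describe a sequence by its DTW distance to every training sequence."""
    return [dtw_distance(query, t) for t in training]


def soft_vote(distributions: List[List[float]]) -> int:
    """Average the class distributions of several classifiers; return arg-max."""
    k = len(distributions[0])
    mean = [sum(p[c] for p in distributions) / len(distributions)
            for c in range(k)]
    return max(range(k), key=mean.__getitem__)
```

For example, `dtw_features(query, training)` returns one distance per training sequence, so the feature vector length equals the training set size, and `soft_vote([[0.6, 0.4], [0.2, 0.8]])` averages the two distributions before picking a label.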