Two-Stream Auto-Encoder Network for Unsupervised Skeleton-Based Action Recognition

Bibliographic Details
Published in: Shanghai jiao tong da xue xue bao. Yi xue ban, Vol. 30, no. 2, p. 330
Main Authors: Wang, Gang; Guan, Yaonan; Li, Dewei
Format: Journal Article
Language: English
Published: Shanghai: Shanghai Jiaotong University Press, 01.04.2025
ISSN: 1674-8115

Summary: Representation learning from unlabeled skeleton data is a challenging task. Prior unsupervised learning algorithms mainly rely on the modeling ability of recurrent neural networks to extract action representations. However, the structural information of the skeleton data, which also plays a critical role in action recognition, is rarely explored by existing unsupervised methods. To address this limitation, we propose a novel two-stream autoencoder network that combines the topological information of skeleton data with its temporal information. Specifically, we encode the graph structure with a graph convolutional network (GCN) and integrate the extracted GCN-based representations into the gated recurrent unit (GRU) stream. We then design a transfer module to merge the representations of the two streams adaptively. Based on the characteristics of the two-stream autoencoder, a unified loss function composed of multiple tasks is proposed to update the learnable parameters of our model. Comprehensive experiments on the NW-UCLA, UWA3D, and NTU-RGBD 60 datasets demonstrate that the proposed method achieves excellent performance among unsupervised skeleton-based methods and even performs comparably to, or better than, numerous supervised skeleton-based methods.
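The abstract outlines a two-stream design in which a GCN stream encodes joint topology, a GRU stream models temporal dynamics, and a transfer module fuses the two before decoding. The following PyTorch-style sketch is only a rough illustration of that idea: the layer sizes, the gating-based fusion, the plain reconstruction loss, and all names (GCNLayer, TwoStreamAE) are assumptions for illustration, not the authors' implementation or their multi-task objective.

# Minimal sketch of a two-stream (GCN + GRU) skeleton autoencoder, assuming
# PyTorch and a fixed, normalized joint adjacency matrix; sizes, the gating
# fusion, and the plain reconstruction loss are illustrative placeholders.
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One graph-convolution step over the skeleton joints: X' = relu(A X W)."""
    def __init__(self, in_dim, out_dim, adjacency):
        super().__init__()
        self.register_buffer("A", adjacency)            # (J, J) adjacency matrix
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):                               # x: (B, T, J, C)
        x = torch.einsum("ij,btjc->btic", self.A, x)    # aggregate neighboring joints
        return torch.relu(self.linear(x))


class TwoStreamAE(nn.Module):
    def __init__(self, adjacency, num_joints=25, coord_dim=3, hidden=128):
        super().__init__()
        # Structural stream: GCN over joints, flattened per frame.
        per_joint = hidden // num_joints + 1
        self.gcn = GCNLayer(coord_dim, per_joint, adjacency)
        gcn_out = per_joint * num_joints
        # Temporal stream: GRU over the raw per-frame joint coordinates.
        self.gru = nn.GRU(num_joints * coord_dim, hidden, batch_first=True)
        # Transfer module (assumed form): a learned gate mixing the two streams.
        self.gate = nn.Sequential(nn.Linear(gcn_out + hidden, hidden), nn.Sigmoid())
        self.proj = nn.Linear(gcn_out, hidden)
        # Decoder reconstructs the frame sequence from the fused representation.
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_joints * coord_dim)

    def forward(self, x):                               # x: (B, T, J, C)
        B, T, J, C = x.shape
        g = self.gcn(x).reshape(B, T, -1)               # structural features per frame
        h, _ = self.gru(x.reshape(B, T, J * C))         # temporal features per frame
        alpha = self.gate(torch.cat([g, h], dim=-1))    # adaptive mixing weights
        fused = alpha * self.proj(g) + (1 - alpha) * h  # merged two-stream code
        dec, _ = self.decoder(fused)
        return self.out(dec).reshape(B, T, J, C), fused


if __name__ == "__main__":
    J = 25
    A = torch.eye(J)                                    # placeholder adjacency
    model = TwoStreamAE(A, num_joints=J)
    clip = torch.randn(4, 30, J, 3)                     # 4 clips, 30 frames, 25 joints
    recon, z = model(clip)
    loss = nn.functional.mse_loss(recon, clip)          # reconstruction term only here
    loss.backward()
    print(recon.shape, z.shape, loss.item())

In the paper this reconstruction term would be only one part of the unified multi-task loss used to train both streams jointly.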