A motion-aware and temporal-enhanced Spatial–Temporal Graph Convolutional Network for skeleton-based human action segmentation

Bibliographic Details
Published in Neurocomputing (Amsterdam) Vol. 580; p. 127482
Main Authors Chai, Shurong; Jain, Rahul Kumar; Liu, Jiaqing; Teng, Shiyu; Tateyama, Tomoko; Li, Yinhao; Chen, Yen-Wei
Format Journal Article
Language English
Published Elsevier B.V. 01.05.2024
ISSN 0925-2312
EISSN 1872-8286
DOI 10.1016/j.neucom.2024.127482

Abstract Action segmentation is an important task for understanding the actions in a video. Most conventional action recognition methods can recognize only a single action in a given input video, so the input must be a pre-trimmed video containing only one type of action. In contrast, temporal action segmentation (TAS) aims to segment a temporally untrimmed video sequence along the time axis. Consequently, it has wider application prospects in various fields. Previously proposed TAS methods use only RGB video as input to segment the actions, but RGB video is not robust against diverse backgrounds. Skeleton-based features are more resilient because they do not incorporate any background information, yet there has been limited research exploring this feature modality. To this end, we propose a motion-aware and temporal-enhanced spatial–temporal graph convolutional network for skeleton-based human action segmentation. Our framework contains a motion-aware module, a multi-scale temporal convolutional network, a temporal-enhanced graph convolutional network module, and a refinement module. Our method can efficiently capture motion information and long-range dependencies using skeleton features while improving temporal modeling. We have conducted experiments on four publicly available datasets to demonstrate the effectiveness of the proposed method. The code is available at https://github.com/11yxk/openpack.
ArticleNumber 127482
Author Chen, Yen-Wei
Teng, Shiyu
Jain, Rahul Kumar
Tateyama, Tomoko
Chai, Shurong
Li, Yinhao
Liu, Jiaqing
Author_xml – sequence: 1
  givenname: Shurong
  surname: Chai
  fullname: Chai, Shurong
  organization: Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan
– sequence: 2
  givenname: Rahul Kumar
  orcidid: 0000-0002-0768-2193
  surname: Jain
  fullname: Jain, Rahul Kumar
  organization: Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan
– sequence: 3
  givenname: Jiaqing
  surname: Liu
  fullname: Liu, Jiaqing
  organization: Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan
– sequence: 4
  givenname: Shiyu
  surname: Teng
  fullname: Teng, Shiyu
  organization: Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan
– sequence: 5
  givenname: Tomoko
  surname: Tateyama
  fullname: Tateyama, Tomoko
  organization: Department of Intelligent Information Engineering, Fujita Health University, Fujita, Japan
– sequence: 6
  givenname: Yinhao
  surname: Li
  fullname: Li, Yinhao
  organization: Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan
– sequence: 7
  givenname: Yen-Wei
  orcidid: 0000-0002-5952-0188
  surname: Chen
  fullname: Chen, Yen-Wei
  email: chen@is.ritsumei.ac.jp
  organization: Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan
CitedBy_id crossref_primary_10_1007_s00371_024_03688_6
Cites_doi 10.1109/CVPR46437.2021.01653
10.1109/CVPR.2016.216
10.1109/WACV48630.2021.00089
10.1109/WACV45572.2020.9093535
10.1609/aaai.v32i1.12328
10.1109/CVPR.2017.75
10.1109/ICCV51070.2023.01258
10.1109/CVPR42600.2020.00022
10.3390/s16010115
10.3390/s20154083
10.1109/CVPR.2019.00369
10.1007/978-3-030-69541-5_3
10.1016/j.neunet.2005.06.042
10.1016/j.neucom.2022.09.071
10.1109/CVPR42600.2020.01404
10.1109/TPAMI.2022.3183112
10.1109/WACV48630.2021.00237
10.1109/CVPR.2016.341
10.1109/CVPR.2017.143
10.1109/CVPR.2015.7298714
10.1109/TAI.2021.3076974
10.1145/3132734.3132739
10.1145/3441628
10.1109/CVPR.2017.113
10.1109/CVPR.2017.633
10.1109/CVPR.2018.00705
10.1109/ICCV48922.2021.01311
10.1109/CVPR.2019.01230
ContentType Journal Article
Copyright 2024 Elsevier B.V.
Copyright_xml – notice: 2024 Elsevier B.V.
DBID AAYXX
CITATION
DOI 10.1016/j.neucom.2024.127482
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1872-8286
ExternalDocumentID 10_1016_j_neucom_2024_127482
S0925231224002534
ISSN 0925-2312
IngestDate Thu Oct 16 04:44:24 EDT 2025
Thu Apr 24 23:10:35 EDT 2025
Sat Mar 30 16:20:01 EDT 2024
IsPeerReviewed true
IsScholarly true
Keywords Skeleton-based action recognition
Human action segmentation
Graph convolutional network
Video understanding
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c372t-d67bc6da84bde5338c4b0a274b069a60acfcb0e6ee458eece1d8109b4f1157143
ORCID 0000-0002-5952-0188
0000-0002-0768-2193
PublicationCentury 2000
PublicationDate 2024-05-01
2024-05-00
PublicationDateYYYYMMDD 2024-05-01
PublicationDate_xml – month: 05
  year: 2024
  text: 2024-05-01
  day: 01
PublicationDecade 2020
PublicationTitle Neurocomputing (Amsterdam)
PublicationYear 2024
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
10.1016/j.neucom.2024.127482_b47
10.1016/j.neucom.2024.127482_b48
Behrmann (10.1016/j.neucom.2024.127482_b9) 2022
Niemann (10.1016/j.neucom.2024.127482_b49) 2020; 20
10.1016/j.neucom.2024.127482_b53
10.1016/j.neucom.2024.127482_b10
10.1016/j.neucom.2024.127482_b54
10.1016/j.neucom.2024.127482_b11
10.1016/j.neucom.2024.127482_b55
10.1016/j.neucom.2024.127482_b12
Filtjens (10.1016/j.neucom.2024.127482_b22) 2022
Chai (10.1016/j.neucom.2024.127482_b25) 2023
10.1016/j.neucom.2024.127482_b37
Plizzari (10.1016/j.neucom.2024.127482_b36) 2021; 208
Ding (10.1016/j.neucom.2024.127482_b39) 2017
Yi (10.1016/j.neucom.2024.127482_b8) 2021
Hu (10.1016/j.neucom.2024.127482_b46) 2022
Lee (10.1016/j.neucom.2024.127482_b50) 2022
Wang (10.1016/j.neucom.2024.127482_b7) 2020
Xu (10.1016/j.neucom.2024.127482_b28) 2018
10.1016/j.neucom.2024.127482_b40
Sun (10.1016/j.neucom.2024.127482_b14) 2022
10.1016/j.neucom.2024.127482_b24
Matsubayashi (10.1016/j.neucom.2024.127482_b44) 2023
Ding (10.1016/j.neucom.2024.127482_b1) 2022
10.1016/j.neucom.2024.127482_b6
Ahmad (10.1016/j.neucom.2024.127482_b17) 2021; 2
10.1016/j.neucom.2024.127482_b26
10.1016/j.neucom.2024.127482_b4
10.1016/j.neucom.2024.127482_b3
Uchiyama (10.1016/j.neucom.2024.127482_b41) 2023
Yue (10.1016/j.neucom.2024.127482_b15) 2022
Inoshita (10.1016/j.neucom.2024.127482_b43) 2023
Singhania (10.1016/j.neucom.2024.127482_b5) 2021
Veličković (10.1016/j.neucom.2024.127482_b27) 2017
10.1016/j.neucom.2024.127482_b33
10.1016/j.neucom.2024.127482_b34
Wang (10.1016/j.neucom.2024.127482_b51) 2022
10.1016/j.neucom.2024.127482_b30
10.1016/j.neucom.2024.127482_b19
Lea (10.1016/j.neucom.2024.127482_b2) 2016
10.1016/j.neucom.2024.127482_b13
Zhou (10.1016/j.neucom.2024.127482_b35) 2022
10.1016/j.neucom.2024.127482_b16
Hamilton (10.1016/j.neucom.2024.127482_b29) 2017; 30
Rohrbach (10.1016/j.neucom.2024.127482_b38) 2012
Duan (10.1016/j.neucom.2024.127482_b52) 2022
Dhiman (10.1016/j.neucom.2024.127482_b31) 2021; 17
Graves (10.1016/j.neucom.2024.127482_b56) 2005; 18
Wagh (10.1016/j.neucom.2024.127482_b45) 2023
Kipf (10.1016/j.neucom.2024.127482_b18) 2016
10.1016/j.neucom.2024.127482_b21
10.1016/j.neucom.2024.127482_b23
Ordóñez (10.1016/j.neucom.2024.127482_b42) 2016; 16
Dhiman (10.1016/j.neucom.2024.127482_b32) 2019
Yoshimura (10.1016/j.neucom.2024.127482_b20) 2022
References_xml – reference: F. Yu, V. Koltun, T. Funkhouser, Dilated residual networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 472–480.
– reference: Y. Ioannou, D. Robertson, R. Cipolla, A. Criminisi, Deep roots: Improving cnn efficiency with hierarchical filter groups, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1231–1240.
– volume: 208
  year: 2021
  ident: b36
  article-title: Skeleton-based action recognition via spatial and temporal transformer networks
  publication-title: Comput. Vis. Image Underst.
– reference: J. Liu, N. Akhtar, A. Mian, Skepxels: Spatio-temporal image representation of human skeleton joints for action recognition, in: CVPR Workshops, 2019, pp. 10–19.
– year: 2018
  ident: b28
  article-title: How powerful are graph neural networks?
– reference: Y.A. Farha, J. Gall, Ms-tcn: Multi-stage temporal convolutional network for action segmentation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 3575–3584.
– reference: B. Singh, T.K. Marks, M. Jones, O. Tuzel, M. Shao, A multi-stream bi-directional recurrent neural network for fine-grained action detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 1961–1970.
– reference: S. Karaman, L. Seidenari, A. Del Bimbo, Fast saliency based pooling of fisher encoded dense trajectories, in: ECCV THUMOS Workshop, Vol. 1, No. 2, 2014, p. 5.
– reference: P. Lei, S. Todorovic, Temporal deformable residual networks for action segmentation in videos, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6742–6751.
– reference: D. Yang, Y. Wang, A. Dantcheva, Q. Kong, L. Garattoni, G. Francesca, F. Bremond, LAC-Latent Action Composition for Skeleton-based Action Segmentation, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 13679–13690.
– year: 2017
  ident: b27
  article-title: Graph attention networks
– start-page: 262
  year: 2023
  end-page: 263
  ident: b43
  article-title: Exploring cross modality feature fusion for activity recognition at OpenPack challenge 2022
  publication-title: 2023 IEEE International Conference on Pervasive Computing and Communications Workshops and Other Affiliated Events
– year: 2022
  ident: b52
  article-title: DG-STGCN: Dynamic spatial-temporal modeling for skeleton-based action recognition
– year: 2021
  ident: b5
  article-title: Coarse to fine multi-resolution temporal convolutional network
– reference: Z. Cao, T. Simon, S.-E. Wei, Y. Sheikh, Realtime multi-person 2d pose estimation using part affinity fields, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 7291–7299.
– year: 2022
  ident: b35
  article-title: Hypergraph transformer for skeleton-based action recognition
– start-page: 36
  year: 2016
  end-page: 52
  ident: b2
  article-title: Segmental spatiotemporal cnns for fine-grained action segmentation
  publication-title: Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, the Netherlands, October 11–14, 2016, Proceedings, Part III 14
– reference: A. Richard, J. Gall, Temporal action detection using a statistical language model, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 3131–3140.
– year: 2022
  ident: b50
  article-title: Hierarchically decomposed graph convolutional networks for skeleton-based action recognition
– start-page: 52
  year: 2022
  end-page: 68
  ident: b9
  article-title: Unified fully and timestamp supervised temporal action segmentation via sequence to sequence translation
  publication-title: Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXV
– start-page: 257
  year: 2023
  end-page: 258
  ident: b44
  article-title: OpenPack challenge 2022 report: Impact of data cleaning and time alignment on activity recognition
  publication-title: 2023 IEEE International Conference on Pervasive Computing and Communications Workshops and Other Affiliated Events
– reference: Z. Liu, H. Zhang, Z. Chen, Z. Wang, W. Ouyang, Disentangling and unifying graph convolutions for skeleton-based action recognition, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 143–152.
– volume: 17
  start-page: 1
  year: 2021
  end-page: 24
  ident: b31
  article-title: Part-wise spatio-temporal attention driven CNN-based 3D human action recognition
  publication-title: ACM Trans. Multimed. Comput. Commun. Appl.
– start-page: 225
  year: 2019
  end-page: 230
  ident: b32
  article-title: Skeleton-based view invariant deep features for human activity recognition
  publication-title: 2019 IEEE Fifth International Conference on Multimedia Big Data
– year: 2022
  ident: b20
  article-title: OpenPack: A large-scale dataset for recognizing packaging works in IoT-enabled logistic environments
– reference: Y. Chen, Z. Zhang, C. Yuan, B. Li, Y. Deng, W. Hu, Channel-wise topology refinement graph convolution for skeleton-based action recognition, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 13359–13368.
– year: 2021
  ident: b8
  article-title: Asformer: Transformer for action segmentation
– volume: 18
  start-page: 602
  year: 2005
  end-page: 610
  ident: b56
  article-title: Framewise phoneme classification with bidirectional LSTM and other neural network architectures
  publication-title: Neural Netw.
– reference: M.-H. Chen, B. Li, Y. Bao, G. AlRegib, Action segmentation with mixed temporal domain adaptation, in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2020, pp. 605–614.
– reference: Y. Ben-Shabat, X. Yu, F. Saleh, D. Campbell, C. Rodriguez-Opazo, H. Li, S. Gould, The ikea asm dataset: Understanding people assembling furniture through actions, objects and pose, in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021, pp. 847–859.
– start-page: 267
  year: 2023
  end-page: 269
  ident: b25
  article-title: A spatial-temporal graph convolutional networks-based approach for the OpenPack challenge 2022
  publication-title: 2023 IEEE International Conference on Pervasive Computing and Communications Workshops and Other Affiliated Events
– reference: L. Shi, Y. Zhang, J. Cheng, H. Lu, Two-stream adaptive graph convolutional networks for skeleton-based action recognition, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 12026–12035.
– reference: C. Liu, Y. Hu, Y. Li, S. Song, J. Liu, PKU-MMD: A large scale benchmark for skeleton-based human action understanding, in: Proceedings of the Workshop on Visual Analysis in Smart and Connected Communities, 2017, pp. 1–8.
– volume: 20
  start-page: 4083
  year: 2020
  ident: b49
  article-title: Lara: Creating a dataset for human activity recognition in logistics using semantic attributes
  publication-title: Sensors
– start-page: 259
  year: 2023
  end-page: 261
  ident: b45
  article-title: Precise human activity recognition for the OpenPack challenge 2022
  publication-title: 2023 IEEE International Conference on Pervasive Computing and Communications Workshops and Other Affiliated Events
– year: 2022
  ident: b46
  article-title: Spatial temporal graph attention network for skeleton-based action recognition
– volume: 30
  year: 2017
  ident: b29
  article-title: Inductive representation learning on large graphs
  publication-title: Adv. Neural Inf. Process. Syst.
– reference: Y. Ishikawa, S. Kasai, Y. Aoki, H. Kataoka, Alleviating over-segmentation errors by detecting action boundaries, in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021, pp. 2322–2331.
– reference: Y. Huang, Y. Sugano, Y. Sato, Improving action segmentation via graph-based temporal reasoning, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 14024–14034.
– start-page: 264
  year: 2023
  end-page: 266
  ident: b41
  article-title: Transformer-based time series classification for the OpenPack challenge 2022
  publication-title: 2023 IEEE International Conference on Pervasive Computing and Communications Workshops and Other Affiliated Events
– reference: Y. Du, W. Wang, L. Wang, Hierarchical recurrent neural network for skeleton based action recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1110–1118.
– year: 2017
  ident: b39
  article-title: Tricornet: A hybrid temporal convolutional and recurrent network for video action segmentation
– reference: S.-H. Gao, Q. Han, Z.-Y. Li, P. Peng, L. Wang, M.-M. Cheng, Global2local: Efficient structure search for video action segmentation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 16805–16814.
– year: 2016
  ident: b18
  article-title: Semi-supervised classification with graph convolutional networks
– reference: S. Yan, Y. Xiong, D. Lin, Spatial temporal graph convolutional networks for skeleton-based action recognition, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32, No. 1, 2018.
– volume: 2
  start-page: 128
  year: 2021
  end-page: 145
  ident: b17
  article-title: Graph convolutional neural network for human action recognition: A comprehensive survey
  publication-title: IEEE Trans. Artif. Intell.
– start-page: 1194
  year: 2012
  end-page: 1201
  ident: b38
  article-title: A database for fine grained activity detection of cooking activities
  publication-title: 2012 IEEE Conference on Computer Vision and Pattern Recognition
– year: 2022
  ident: b14
  article-title: Human action recognition from various data modalities: A review
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
– year: 2022
  ident: b1
  article-title: Temporal action segmentation: An analysis of modern technique
– reference: L. Shi, Y. Zhang, J. Cheng, H. Lu, Decoupled spatial-temporal attention network for skeleton-based action-gesture recognition, in: Proceedings of the Asian Conference on Computer Vision, 2020.
– reference: C. Lea, M.D. Flynn, R. Vidal, A. Reiter, G.D. Hager, Temporal convolutional networks for action segmentation and detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 156–165.
– year: 2022
  ident: b22
  article-title: Skeleton-based action segmentation with multi-stage spatial-temporal graph convolutional neural networks
  publication-title: IEEE Trans. Emerg. Top. Comput.
– volume: 16
  start-page: 115
  year: 2016
  ident: b42
  article-title: Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition
  publication-title: Sensors
– year: 2022
  ident: b51
  article-title: Skeleton-based action recognition via temporal-channel aggregation
– start-page: 34
  year: 2020
  end-page: 51
  ident: b7
  article-title: Boundary-aware cascade networks for temporal action segmentation
  publication-title: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXV 16
– year: 2022
  ident: b15
  article-title: Action recognition based on RGB and skeleton data sets: A survey
  publication-title: Neurocomputing
– ident: 10.1016/j.neucom.2024.127482_b10
  doi: 10.1109/CVPR46437.2021.01653
– start-page: 1194
  year: 2012
  ident: 10.1016/j.neucom.2024.127482_b38
  article-title: A database for fine grained activity detection of cooking activities
– start-page: 52
  year: 2022
  ident: 10.1016/j.neucom.2024.127482_b9
  article-title: Unified fully and timestamp supervised temporal action segmentation via sequence to sequence translation
– ident: 10.1016/j.neucom.2024.127482_b3
  doi: 10.1109/CVPR.2016.216
– ident: 10.1016/j.neucom.2024.127482_b21
  doi: 10.1109/WACV48630.2021.00089
– ident: 10.1016/j.neucom.2024.127482_b6
  doi: 10.1109/WACV45572.2020.9093535
– ident: 10.1016/j.neucom.2024.127482_b19
  doi: 10.1609/aaai.v32i1.12328
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b51
– ident: 10.1016/j.neucom.2024.127482_b37
– year: 2021
  ident: 10.1016/j.neucom.2024.127482_b8
– ident: 10.1016/j.neucom.2024.127482_b30
– ident: 10.1016/j.neucom.2024.127482_b47
  doi: 10.1109/CVPR.2017.75
– ident: 10.1016/j.neucom.2024.127482_b16
  doi: 10.1109/ICCV51070.2023.01258
– ident: 10.1016/j.neucom.2024.127482_b34
  doi: 10.1109/CVPR42600.2020.00022
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b20
– volume: 16
  start-page: 115
  issue: 1
  year: 2016
  ident: 10.1016/j.neucom.2024.127482_b42
  article-title: Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition
  publication-title: Sensors
  doi: 10.3390/s16010115
– volume: 20
  start-page: 4083
  issue: 15
  year: 2020
  ident: 10.1016/j.neucom.2024.127482_b49
  article-title: Lara: Creating a dataset for human activity recognition in logistics using semantic attributes
  publication-title: Sensors
  doi: 10.3390/s20154083
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b50
– volume: 30
  year: 2017
  ident: 10.1016/j.neucom.2024.127482_b29
  article-title: Inductive representation learning on large graphs
  publication-title: Adv. Neural Inf. Process. Syst.
– volume: 208
  year: 2021
  ident: 10.1016/j.neucom.2024.127482_b36
  article-title: Skeleton-based action recognition via spatial and temporal transformer networks
  publication-title: Comput. Vis. Image Underst.
– ident: 10.1016/j.neucom.2024.127482_b26
  doi: 10.1109/CVPR.2019.00369
– start-page: 262
  year: 2023
  ident: 10.1016/j.neucom.2024.127482_b43
  article-title: Exploring cross modality feature fusion for activity recognition at OpenPack challenge 2022
– start-page: 257
  year: 2023
  ident: 10.1016/j.neucom.2024.127482_b44
  article-title: OpenPack challenge 2022 report: Impact of data cleaning and time alignment on activity recognition
– year: 2017
  ident: 10.1016/j.neucom.2024.127482_b27
– ident: 10.1016/j.neucom.2024.127482_b54
  doi: 10.1007/978-3-030-69541-5_3
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b52
– volume: 18
  start-page: 602
  issue: 5–6
  year: 2005
  ident: 10.1016/j.neucom.2024.127482_b56
  article-title: Framewise phoneme classification with bidirectional LSTM and other neural network architectures
  publication-title: Neural Netw.
  doi: 10.1016/j.neunet.2005.06.042
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b46
– start-page: 264
  year: 2023
  ident: 10.1016/j.neucom.2024.127482_b41
  article-title: Transformer-based time series classification for the OpenPack challenge 2022
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b1
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b35
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b15
  article-title: Action recognition based on RGB and skeleton data sets: A survey
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2022.09.071
– ident: 10.1016/j.neucom.2024.127482_b12
  doi: 10.1109/CVPR42600.2020.01404
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b14
  article-title: Human action recognition from various data modalities: A review
  publication-title: IEEE Trans. Pattern Anal. Mach. Intell.
  doi: 10.1109/TPAMI.2022.3183112
– start-page: 36
  year: 2016
  ident: 10.1016/j.neucom.2024.127482_b2
  article-title: Segmental spatiotemporal CNNs for fine-grained action segmentation
– ident: 10.1016/j.neucom.2024.127482_b11
  doi: 10.1109/WACV48630.2021.00237
– year: 2016
  ident: 10.1016/j.neucom.2024.127482_b18
– ident: 10.1016/j.neucom.2024.127482_b53
  doi: 10.1109/CVPR.2016.341
– start-page: 259
  year: 2023
  ident: 10.1016/j.neucom.2024.127482_b45
  article-title: Precise human activity recognition for the OpenPack challenge 2022
– year: 2017
  ident: 10.1016/j.neucom.2024.127482_b39
– year: 2018
  ident: 10.1016/j.neucom.2024.127482_b28
– ident: 10.1016/j.neucom.2024.127482_b13
  doi: 10.1109/CVPR.2017.143
– year: 2021
  ident: 10.1016/j.neucom.2024.127482_b5
– ident: 10.1016/j.neucom.2024.127482_b33
  doi: 10.1109/CVPR.2015.7298714
– volume: 2
  start-page: 128
  issue: 2
  year: 2021
  ident: 10.1016/j.neucom.2024.127482_b17
  article-title: Graph convolutional neural network for human action recognition: A comprehensive survey
  publication-title: IEEE Trans. Artif. Intell.
  doi: 10.1109/TAI.2021.3076974
– ident: 10.1016/j.neucom.2024.127482_b48
  doi: 10.1145/3132734.3132739
– volume: 17
  start-page: 1
  issue: 3
  year: 2021
  ident: 10.1016/j.neucom.2024.127482_b31
  article-title: Part-wise spatio-temporal attention driven CNN-based 3D human action recognition
  publication-title: ACM Trans. Multimed. Comput. Commun. Appl.
  doi: 10.1145/3441628
– ident: 10.1016/j.neucom.2024.127482_b4
  doi: 10.1109/CVPR.2017.113
– year: 2022
  ident: 10.1016/j.neucom.2024.127482_b22
  article-title: Skeleton-based action segmentation with multi-stage spatial-temporal graph convolutional neural networks
  publication-title: IEEE Trans. Emerg. Top. Comput.
– start-page: 34
  year: 2020
  ident: 10.1016/j.neucom.2024.127482_b7
  article-title: Boundary-aware cascade networks for temporal action segmentation
– ident: 10.1016/j.neucom.2024.127482_b55
  doi: 10.1109/CVPR.2017.633
– start-page: 225
  year: 2019
  ident: 10.1016/j.neucom.2024.127482_b32
  article-title: Skeleton-based view invariant deep features for human activity recognition
– ident: 10.1016/j.neucom.2024.127482_b40
  doi: 10.1109/CVPR.2018.00705
– ident: 10.1016/j.neucom.2024.127482_b23
  doi: 10.1109/ICCV48922.2021.01311
– start-page: 267
  year: 2023
  ident: 10.1016/j.neucom.2024.127482_b25
  article-title: A spatial-temporal graph convolutional networks-based approach for the OpenPack challenge 2022
– ident: 10.1016/j.neucom.2024.127482_b24
  doi: 10.1109/CVPR.2019.01230
StartPage 127482
SubjectTerms Graph convolutional network
Human action segmentation
Skeleton-based action recognition
Video understanding
Title A motion-aware and temporal-enhanced Spatial–Temporal Graph Convolutional Network for skeleton-based human action segmentation
URI https://dx.doi.org/10.1016/j.neucom.2024.127482
Volume 580