A motion-aware and temporal-enhanced Spatial–Temporal Graph Convolutional Network for skeleton-based human action segmentation
| Published in | Neurocomputing (Amsterdam) Vol. 580; p. 127482 |
|---|---|
| Main Authors | Chai, Shurong; Jain, Rahul Kumar; Liu, Jiaqing; Teng, Shiyu; Tateyama, Tomoko; Li, Yinhao; Chen, Yen-Wei |
| Format | Journal Article |
| Language | English |
| Published | Elsevier B.V, 01.05.2024 |
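| Subjects | Graph convolutional network; Human action segmentation; Skeleton-based action recognition; Video understanding |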
| ISSN | 0925-2312 (ISSN); 1872-8286 (EISSN) |
| DOI | 10.1016/j.neucom.2024.127482 |
| Abstract | Action segmentation is an important approach to understanding actions in video. Most conventional action recognition methods can recognize only a single action in a given input video, so the input must be a pre-trimmed video containing only one type of action. In contrast, temporal action segmentation (TAS) aims to segment a temporally untrimmed video sequence along the time axis. Consequently, it has wider application prospects in various fields. Previously proposed TAS methods use only RGB video as input to segment the actions, but RGB video is not robust against diverse backgrounds. Skeleton-based features are more resilient because they do not incorporate any background information, yet there has been limited research exploring this feature modality. To this end, we propose a motion-aware and temporal-enhanced spatial–temporal graph convolutional network for skeleton-based human action segmentation. Our framework contains a motion-aware module, a multi-scale temporal convolutional network, a temporal-enhanced graph convolutional network module, and a refinement module. Our method efficiently captures motion information and long-range dependencies from skeleton features while improving temporal modeling. We conducted experiments on four publicly available datasets to demonstrate the effectiveness of the proposed method. The code is available at https://github.com/11yxk/openpack. |
| ArticleNumber | 127482 |
| Author | Chen, Yen-Wei; Teng, Shiyu; Jain, Rahul Kumar; Tateyama, Tomoko; Chai, Shurong; Li, Yinhao; Liu, Jiaqing |
| Author_xml | 1. Shurong Chai, Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan; 2. Rahul Kumar Jain (ORCID 0000-0002-0768-2193), Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan; 3. Jiaqing Liu, Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan; 4. Shiyu Teng, Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan; 5. Tomoko Tateyama, Department of Intelligent Information Engineering, Fujita Health University, Fujita, Japan; 6. Yinhao Li, Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan; 7. Yen-Wei Chen (ORCID 0000-0002-5952-0188, email chen@is.ritsumei.ac.jp), Graduate School of Information and Engineering, Ritsumeikan University, Shiga, Japan |
| ContentType | Journal Article |
| Copyright | 2024 Elsevier B.V. |
| Discipline | Computer Science |
| EISSN | 1872-8286 |
| ExternalDocumentID | 10_1016_j_neucom_2024_127482 S0925231224002534 |
| ISSN | 0925-2312 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Skeleton-based action recognition; Human action segmentation; Graph convolutional network; Video understanding |
| ORCID | 0000-0002-5952-0188 (Chen, Yen-Wei); 0000-0002-0768-2193 (Jain, Rahul Kumar) |
| PublicationDate | 2024-05-01 |
| PublicationTitle | Neurocomputing (Amsterdam) |
| PublicationYear | 2024 |
| Publisher | Elsevier B.V |
| StartPage | 127482 |
| SubjectTerms | Graph convolutional network; Human action segmentation; Skeleton-based action recognition; Video understanding |
| Title | A motion-aware and temporal-enhanced Spatial–Temporal Graph Convolutional Network for skeleton-based human action segmentation |
| URI | https://dx.doi.org/10.1016/j.neucom.2024.127482 |
| Volume | 580 |