Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping

Bibliographic Details
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 12418-12427
Main Authors: Lohit, Suhas; Wang, Qiao; Turaga, Pavan
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2019
ISSN: 1063-6919
DOI: 10.1109/CVPR.2019.01271

Abstract: Many time-series classification problems involve developing metrics that are invariant to temporal misalignment. In human activity analysis, temporal misalignment arises for several reasons, including differing initial phase, sensor sampling rates, and elastic time-warps due to subject-specific biomechanics. Past work in this area has only looked at reducing intra-class variability by elastic temporal alignment. In this paper, we propose a hybrid model-based and data-driven approach to learn warping functions that not only reduce intra-class variability, but also increase inter-class separation. We call this a temporal transformer network (TTN). TTN is an interpretable differentiable module that can be easily integrated at the front end of a classification network. The module is capable of reducing intra-class variance by generating input-dependent warping functions, which lead to rate-robust representations. At the same time, it increases inter-class variance by learning warping functions that are more discriminative. We show improvements over strong baselines in 3D action recognition on challenging datasets using the proposed framework. The improvements are especially pronounced when training sets are smaller.
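
To make the idea of an input-dependent warping function concrete, the following is a minimal sketch and not the authors' implementation. This record does not describe how TTN parameterizes its warps, so the softmax-plus-cumulative-sum construction below is an assumption chosen only because it yields a monotone warp with fixed endpoints. The function names `warp_from_logits` and `apply_warp` are hypothetical, and in a real model the input `v` would be produced by a trainable network inside an autodiff framework.

```python
import math


def warp_from_logits(v):
    """Turn T-1 unconstrained values into a monotone warp of length T.

    Assumption (not taken from the record): a softmax gives positive
    increments that sum to 1, and their running sum gives a warp gamma
    with gamma[0] == 0 and gamma[-1] == 1.
    """
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    gamma = [0.0]
    acc = 0.0
    for e in exps:
        acc += e / total
        gamma.append(acc)
    gamma[-1] = 1.0  # remove floating-point round-off at the endpoint
    return gamma


def apply_warp(x, gamma):
    """Resample the sequence x (length T >= 2) at positions gamma[t] * (T - 1).

    Linear interpolation keeps the output piecewise-linear in gamma,
    which is what would let gradients flow back to the warp parameters.
    """
    T = len(x)
    out = []
    for g in gamma:
        pos = g * (T - 1)
        lo = min(int(math.floor(pos)), T - 2)
        frac = pos - lo
        out.append((1.0 - frac) * x[lo] + frac * x[lo + 1])
    return out


# Usage: equal logits give the identity warp; unequal logits stretch
# some parts of the sequence and compress others.
signal = [0.0, 1.0, 4.0, 9.0, 16.0]
identity = warp_from_logits([0.0, 0.0, 0.0, 0.0])
warped = apply_warp(signal, warp_from_logits([2.0, -1.0, 0.5, 0.0]))
```

In this sketch the warped sequence has the same length as the input, so it could be passed unchanged to a downstream classifier, mirroring the "front end" placement described in the abstract.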
Author Affiliations: Lohit, Suhas (Arizona State Univ); Wang, Qiao (Arizona State Univ); Turaga, Pavan (Arizona State Univ)
Discipline: Applied Sciences
EISBN: 9781728132938, 1728132932
Subject Terms: Action Recognition; Representation Learning; RGBD sensors and analytics; Statistical Learning
URI: https://ieeexplore.ieee.org/document/8954044