A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction

Bibliographic Details
Published in 2021 20th International Conference on Advanced Robotics (ICAR), pp. 968-973
Main Authors Akyol, Gamze; Sariel, Sanem; Aksoy, Eren Erdal
Format Conference Proceeding
Language English
Published IEEE 06.12.2021
Subjects
DOI 10.1109/ICAR53236.2021.9659385

Abstract Despite decades of research, understanding human manipulation activities is, and has always been, one of the most attractive and challenging research topics in computer vision and robotics. Recognition and prediction of observed human manipulation actions have their roots in the applications related to, for instance, human-robot interaction and robot learning from demonstration. The current research trend heavily relies on advanced convolutional neural networks to process the structured Euclidean data, such as RGB camera images. These networks, however, come with immense computational complexity to be able to process high dimensional raw data. Different from the related works, we here introduce a deep graph autoencoder to jointly learn recognition and prediction of manipulation tasks from symbolic scene graphs, instead of relying on the structured Euclidean data. Our network has a variational autoencoder structure with two branches: one for identifying the input graph type and one for predicting the future graphs. The input of the proposed network is a set of semantic graphs which store the spatial relations between subjects and objects in the scene. The network output is a label set representing the detected and predicted class types. We benchmark our new model against different state-of-the-art methods on two different datasets, MANIAC and MSRC-9, and show that our proposed model can achieve better performance. We also release our source code https://github.com/gamzeakyol/GNet.
Author Akyol, Gamze
Sariel, Sanem
Aksoy, Eren Erdal
Author_xml – sequence: 1
  givenname: Gamze
  surname: Akyol
  fullname: Akyol, Gamze
  email: akyolga@itu.edu.tr
  organization: Artificial Intelligence and Robotics Laboratory, Istanbul Technical University,Faculty of Computer and Informatics Engineering,Maslak,Turkey
– sequence: 2
  givenname: Sanem
  surname: Sariel
  fullname: Sariel, Sanem
  email: sariel@itu.edu.tr
  organization: Artificial Intelligence and Robotics Laboratory, Istanbul Technical University,Faculty of Computer and Informatics Engineering,Maslak,Turkey
– sequence: 3
  givenname: Eren Erdal
  surname: Aksoy
  fullname: Aksoy, Eren Erdal
  email: eren.aksoy@hh.se
  organization: School of Information Technology, Center for Applied Intelligent Systems Research, Halmstad University,Halmstad,Sweden
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ICAR53236.2021.9659385
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 1665436840
9781665436847
EndPage 973
ExternalDocumentID 9659385
Genre orig-research
GrantInformation_xml – fundername: Scientific and Technological Research Council of Turkey
  grantid: 119E-436
  funderid: 10.13039/501100004410
GroupedDBID 6IE
6IL
CBEJK
RIE
RIL
ID FETCH-LOGICAL-i239t-47128e4d67e0060ac5b4043cfe5f30247cd3923c15fa6ae84723646539f087123
IEDL.DBID RIE
IngestDate Thu Jun 29 18:37:38 EDT 2023
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i239t-47128e4d67e0060ac5b4043cfe5f30247cd3923c15fa6ae84723646539f087123
OpenAccessLink https://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-46345
PageCount 6
ParticipantIDs ieee_primary_9659385
PublicationCentury 2000
PublicationDate 2021-Dec.-6
PublicationDateYYYYMMDD 2021-12-06
PublicationDate_xml – month: 12
  year: 2021
  text: 2021-Dec.-6
  day: 06
PublicationDecade 2020
PublicationTitle 2021 20th International Conference on Advanced Robotics (ICAR)
PublicationTitleAbbrev ICAR
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
Score 1.8028843
SourceID ieee
SourceType Publisher
StartPage 968
SubjectTerms Benchmark testing
Computer vision
Human-robot interaction
Predictive models
Robot vision systems
Semantics
Title A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction
URI https://ieeexplore.ieee.org/document/9659385
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE