A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction

Bibliographic Details
Published in	2021 20th International Conference on Advanced Robotics (ICAR) pp. 968 - 973
Main Authors	Akyol, Gamze; Sariel, Sanem; Aksoy, Eren Erdal
Format	Conference Proceeding
Language	English
Published	IEEE 06.12.2021
Subjects	Benchmark testing; Computer vision; Human-robot interaction; Predictive models; Robot vision systems; Semantics
DOI	10.1109/ICAR53236.2021.9659385

Abstract	Despite decades of research, understanding human manipulation activities is, and has always been, one of the most attractive and challenging research topics in computer vision and robotics. Recognition and prediction of observed human manipulation actions have their roots in the applications related to, for instance, human-robot interaction and robot learning from demonstration. The current research trend heavily relies on advanced convolutional neural networks to process the structured Euclidean data, such as RGB camera images. These networks, however, come with immense computational complexity to be able to process high dimensional raw data. Different from the related works, we here introduce a deep graph autoencoder to jointly learn recognition and prediction of manipulation tasks from symbolic scene graphs, instead of relying on the structured Euclidean data. Our network has a variational autoencoder structure with two branches: one for identifying the input graph type and one for predicting the future graphs. The input of the proposed network is a set of semantic graphs which store the spatial relations between subjects and objects in the scene. The network output is a label set representing the detected and predicted class types. We benchmark our new model against different state-of-the-art methods on two different datasets, MANIAC and MSRC-9, and show that our proposed model can achieve better performance. We also release our source code https://github.com/gamzeakyol/GNet.
Author Affiliations	1. Akyol, Gamze (akyolga@itu.edu.tr): Artificial Intelligence and Robotics Laboratory, Istanbul Technical University, Faculty of Computer and Informatics Engineering, Maslak, Turkey
	2. Sariel, Sanem (sariel@itu.edu.tr): Artificial Intelligence and Robotics Laboratory, Istanbul Technical University, Faculty of Computer and Informatics Engineering, Maslak, Turkey
	3. Aksoy, Eren Erdal (eren.aksoy@hh.se): School of Information Technology, Center for Applied Intelligent Systems Research, Halmstad University, Halmstad, Sweden
EISBN	1665436840; 9781665436847
Funding	Scientific and Technological Research Council of Turkey (grant ID: 119E-436; funder ID: 10.13039/501100004410)
OpenAccessLink	https://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-46345
PageCount	6
URI	https://ieeexplore.ieee.org/document/9659385