Fully Convolutional Encoder-Decoder With an Attention Mechanism for Practical Pedestrian Trajectory Prediction

Bibliographic Details
Published in	IEEE Transactions on Intelligent Transportation Systems, Vol. 23, no. 11, pp. 20046-20060
Main Authors	Chen, Kai; Song, Xiao; Yuan, Haitao; Ren, Xiaoxiang
Format	Journal Article
Language	English
Published	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.11.2022
Subjects	Algorithms; attention mechanism; Barriers; Coders; Convolution; Convolutional neural networks; Data processing; Datasets; Encoders-Decoders; Feature extraction; Force; Image annotation; Image segmentation; long short-term memory (LSTM); Markov processes; Pedestrian behavior; Pedestrians; Predictive models; Trajectory
ISSN	1524-9050 (print); 1558-0016 (electronic)
DOI	10.1109/TITS.2022.3170874

Abstract	Pedestrian trajectory prediction from video is essential for many practical traffic applications. Most existing methods are based on fully connected long short-term memory (LSTM) networks and perform well on public datasets. However, these methods have three defects: a) most rely on manual annotations to obtain information about the environment surrounding the subject pedestrian, which limits practical application; b) the interaction among pedestrians and obstacles in a scene is little studied, which leads to greater prediction error; c) traditional LSTM methods are based on the previous moment and ignore the correlation between the future and distant past states of the pedestrian, which produces unrealistic trajectories. To tackle these problems, we first, in the data processing stage, use an image semantic segmentation algorithm to obtain multi-category obstacle information and design an end-to-end "Siamese Position Extraction" model to obtain more accurate pedestrian interaction data. Second, we design an end-to-end fully convolutional LSTM encoder-decoder with an attention mechanism (FLEAM) to overcome the shortcomings of LSTM. Third, we compare FLEAM with several state-of-the-art LSTM-based prediction methods on multiple video sequences from the ETH, UCY and MOT20 datasets. The results show that our approach matches the prediction error of the best state-of-the-art method. FLEAM nevertheless has more potential for practical application because it does not rely on manually annotated data. We further validate FLEAM using manually annotated data and find that it then produces much lower prediction error.
Author_xml	1. Chen, Kai (ORCID 0000-0002-2436-1420; chen_kai@nuaa.edu.cn), College of Mechanical and Electrical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China; 2. Song, Xiao (ORCID 0000-0003-4279-426X; songxiao@buaa.edu.cn), School of Cyber Science and Technology, Beihang University, Beijing, China; 3. Yuan, Haitao (ORCID 0000-0001-8475-419X; haitao.yuan@njit.edu), Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ, USA; 4. Ren, Xiaoxiang (370726684@qq.com), Wendong New District Middle School, Shanxi, China
CODEN	ITISFG
Copyright	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Discipline	Engineering
Genre	orig-research
GrantInformation_xml	National Key Research and Development Program of China (grant 2018YFB1702703; funder ID 10.13039/501100012166); National Natural Science Foundation of China (NSFC) (grants 61473013, 61802015, 61703011; funder ID 10.13039/501100001809); Open Fund of China State Key Laboratory of Intelligent Manufacturing System Technology (funder ID 10.13039/501100020732)
IsPeerReviewed	true
IsScholarly	true
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037
PageCount	15
URI	https://ieeexplore.ieee.org/document/9768201 https://www.proquest.com/docview/2734387593