TLtrack: Combining Transformers and a Linear Model for Robust Multi-Object Tracking

Bibliographic Details
Published in: AI (Basel), Vol. 5, No. 3, pp. 938-947
Main Authors: He, Zuojie; Zhao, Kai; Zeng, Dan
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 01.09.2024
ISSN: 2673-2688
DOI: 10.3390/ai5030047

More Information
Summary: Multi-object tracking (MOT) aims at estimating the locations and identities of objects in videos. Many modern multi-object tracking systems follow the tracking-by-detection paradigm, consisting of a detector followed by a method for associating detections into tracks. The most basic approach associates detections through motion-based similarity heuristics. Motion models aim to exploit motion information to estimate future locations and therefore play an important role in the performance of association. Recently, a large-scale dataset, DanceTrack, in which objects have uniform appearance and diverse motion patterns, was proposed. With existing hand-crafted motion models, it is hard to achieve decent results on DanceTrack because of the lack of prior knowledge. In this work, we present a motion-based algorithm named TLtrack, which adopts a hybrid strategy to make motion estimates based on confidence scores. For high-confidence detections, TLtrack employs transformers to predict their locations. For low-confidence detections, it uses a simple linear model that estimates locations from the trajectory's historical information. TLtrack thus considers both the historical information of the trajectory and the latest movements. Our experimental results on the DanceTrack dataset show that our method achieves the best performance compared with other motion models.
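
The summary above describes a confidence-gated hybrid motion model: detections with high confidence scores are propagated by a transformer, while low-confidence detections fall back to a linear estimate computed from trajectory history. The Python sketch below illustrates only that gating logic; the threshold value, the class and function names, and the placeholder transformer are assumptions for illustration, not the authors' implementation.

    # Minimal sketch of the confidence-gated motion estimation outlined in the
    # abstract. HIGH_CONF_THRESH, TransformerMotionModel, and linear_predict
    # are illustrative names, not TLtrack's actual code.
    from dataclasses import dataclass, field
    from typing import List, Tuple

    Box = Tuple[float, float, float, float]  # (cx, cy, w, h)

    HIGH_CONF_THRESH = 0.6  # assumed cutoff between the two branches


    @dataclass
    class Track:
        history: List[Box] = field(default_factory=list)  # past boxes, oldest first

        def last(self) -> Box:
            return self.history[-1]


    def linear_predict(track: Track) -> Box:
        """Constant-velocity estimate from the last two boxes in the trajectory."""
        if len(track.history) < 2:
            return track.last()
        (x1, y1, _, _), (x2, y2, w2, h2) = track.history[-2], track.history[-1]
        return (2 * x2 - x1, 2 * y2 - y1, w2, h2)


    class TransformerMotionModel:
        """Stand-in for a learned transformer that attends over recent motion."""

        def predict(self, track: Track) -> Box:
            # A real model would encode the recent boxes and regress the next one;
            # here we simply reuse the linear estimate as a placeholder.
            return linear_predict(track)


    def estimate_next_location(track: Track, det_score: float,
                               transformer: TransformerMotionModel) -> Box:
        """Hybrid strategy: transformer branch for high-confidence detections,
        linear model over trajectory history for low-confidence ones."""
        if det_score >= HIGH_CONF_THRESH:
            return transformer.predict(track)
        return linear_predict(track)


    if __name__ == "__main__":
        track = Track(history=[(10.0, 10.0, 5.0, 5.0), (12.0, 11.0, 5.0, 5.0)])
        model = TransformerMotionModel()
        print(estimate_next_location(track, det_score=0.9, transformer=model))
        print(estimate_next_location(track, det_score=0.3, transformer=model))

In this sketch the gating decision is made per detection from its score alone; how the transformer is trained and how the predicted boxes feed into the association step are described in the paper itself.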