Data-Driven Imputation Method for Traffic Data in Sectional Units of Road Links

Bibliographic Details
Published in	IEEE Transactions on Intelligent Transportation Systems, Vol. 17, no. 6, pp. 1762-1771
Main Authors	Tak, Sehyun; Woo, Soomin; Yeo, Hwasoo
Format	Journal Article
Language	English
Published	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2016
Subjects	Algorithms; big data; Correlation; Data processing; Detectors; Imputation; intelligent transportation system; Intelligent transportation systems; Junctions; kNN; Missing data; Roads; Sensors; Traffic engineering; Traffic flow
ISSN	1524-9050, 1558-0016
DOI	10.1109/TITS.2016.2530312

Summary:	Missing data imputation is a critical step in data processing for intelligent transportation systems. This paper proposes a data-driven imputation method for road sections that exploits their spatial and temporal correlation using a modified k-nearest neighbor method. Unlike conventional algorithms, this computationally distributable method imputes the missing data of an entire section, which contains multiple mutually correlated sensors, at once. This greatly increases computational efficiency compared with methods that impute individual sensors. In addition, the geometrical property of each section is conserved; in other words, the continuity of the traffic properties that each sensor captures is preserved, which increases imputation accuracy. The paper compares the proposed method with others, such as nearest historical data and expectation maximization, across varying missing data types, missing ratios, traffic states, and day types. The results show that the proposed algorithm performs better for almost all missing types, missing ratios, day types, and traffic states. When the missing data type cannot be identified or several missing types are mixed, the proposed algorithm delivers accurate and stable imputation performance.
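
The section-level idea in the summary can be illustrated with a minimal sketch. This is a plain k-nearest-neighbor imputer, not the authors' modified method, whose details this record does not give. The function name `impute_section`, the root-mean-square distance, the unweighted neighbor mean, and the sample speed values are all assumptions made for illustration only.

```python
import math


def impute_section(query, history, k=3):
    """Fill the None entries of `query` from its k nearest rows in `history`.

    query   : readings of all sensors in one road section at one time step,
              with None marking a missing reading (hypothetical layout)
    history : complete readings of the same sensors at past time steps
    """
    observed = [i for i, v in enumerate(query) if v is not None]
    missing = [i for i, v in enumerate(query) if v is None]
    if not missing:
        return list(query)
    if not observed:
        raise ValueError("at least one sensor in the section must be observed")
    if not history:
        raise ValueError("history must contain at least one complete row")

    def distance(row):
        # Root-mean-square difference over the sensors observed in the query.
        return math.sqrt(
            sum((row[i] - query[i]) ** 2 for i in observed) / len(observed)
        )

    neighbors = sorted(history, key=distance)[:k]

    filled = list(query)
    for i in missing:
        # Every missing sensor is filled from the same neighbor set, so the
        # imputed values stay mutually consistent along the section.
        filled[i] = sum(row[i] for row in neighbors) / len(neighbors)
    return filled


# Made-up speeds (km/h) for a section with four sensors.
history = [
    [60.0, 58.0, 55.0, 50.0],
    [62.0, 60.0, 56.0, 52.0],
    [30.0, 28.0, 25.0, 20.0],
    [32.0, 29.0, 26.0, 22.0],
    [61.0, 59.0, 54.0, 51.0],
]
query = [31.0, None, None, 21.0]
result = impute_section(query, history, k=2)
print(result)
```

Because the neighbor search runs once per section rather than once per sensor, both missing readings in the example are filled from the same two congested-state rows. This loosely mirrors the efficiency and consistency argument in the summary, under the assumptions stated above.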