Research on the railway multi-source homonymous geographical entity matching algorithm based on dynamic time warping

Bibliographic Details
Published in Intelligent decision technologies Vol. 18; no. 3; pp. 1879-1891
Main Authors Gong, Weiwei, Zhou, Lingyun, Zhou, Langya, Bao, Jingjng, Chen, Cheng
Format Journal Article
Language English
Published London, England SAGE Publications 01.01.2024
Sage Publications Ltd
ISSN 1872-4981
1875-8843
DOI 10.3233/IDT-240684

Summary: This study explores an efficient technique for matching multisource homonymous geographical entities in railways, addressing the problem of identifying entities that share a name across data sources. It focuses on railway line vector spatial data. Building on statistical feature matching of attribute data, a curve similarity calculation method based on the dynamic time warping (DTW) algorithm is designed to achieve better local elastic matching, overcoming the limitations of the Fréchet algorithm. The empirical study uses railway line layer data from two data sources within Beijing's jurisdiction, fusing 6237 segment lines from source 2 with 105 long lines from source 1. The structures of the two data sources are compared statistically, and text similarity is calculated by applying cosine similarity to TF-IDF representations and taking the maximum similarity value. The DTW algorithm for curve similarity is then implemented in Python. The experimental results show an average DTW distance of 3.92, a standard deviation of 4.63, and a mode of 0.005. Similarity measurement results indicate that 95.53% of records fall within the predetermined threshold, demonstrating the effectiveness and applicability of the method. The authors conclude that the findings enhance the accuracy of railway data matching, promote the informatization of the railway industry, and can help improve railway operational efficiency and system performance.
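The record does not reproduce the article's code, so the following is only a minimal sketch of the kind of DTW curve comparison the summary describes, not the authors' implementation. It assumes each railway line is a list of (x, y) vertices and uses plain Euclidean distance between vertices; the function names `dtw_distance` and `within_threshold`, and any threshold value passed to them, are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two point sequences.

    Assumption: a and b are non-empty lists of (x, y) tuples, and the local
    cost is the Euclidean distance between vertices. The article may use a
    different local cost or normalisation.
    """
    if not a or not b:
        raise ValueError("both polylines need at least one vertex")
    n, m = len(a), len(b)
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: advance a only, advance b only, advance both.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def within_threshold(a, b, threshold):
    """Hypothetical helper: treat two lines as a match if DTW <= threshold."""
    return dtw_distance(a, b) <= threshold


# A long line and a more densely sampled version of the same geometry.
long_line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
resampled = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
# The same shape shifted one unit sideways.
shifted = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]

print(dtw_distance(long_line, resampled))
print(dtw_distance(long_line, shifted))
```

Because DTW lets one vertex align with several vertices on the other line, the resampled line scores a distance of zero against the original even though the vertex counts differ. This is the local elastic matching the summary contrasts with the Fréchet approach. The shifted line accumulates one unit of cost per aligned vertex.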