Geocoding Freeform Placenames: An Example of Deciphering the Czech National Immigration Database

Bibliographic Details
Published in: ISPRS International Journal of Geo-Information, Vol. 10, no. 5, p. 335
Main Authors: Šimbera, Jan; Drbohlav, Dušan; Štych, Přemysl
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 01.05.2021
ISSN: 2220-9964
DOI: 10.3390/ijgi10050335

Summary: The growth of international migration and its societal and political impacts bring a greater need for accurate data to measure, understand and control migration flows. However, in the Czech immigration database, the birthplaces of immigrants are kept only in freeform text fields, which is a substantial obstacle to further processing because of numerous transcription and spelling errors. This study overcomes the obstacle by deploying a custom geocoding engine based on GeoNames, tailored transcription rules and fuzzy matching. The engine achieves good accuracy even for noisy data without depending on third-party services, which results in lower costs than comparable approaches. The results are presented on a subnational level for immigrants coming to Czechia from the USA, Ukraine, Moldova and Vietnam, revealing important spatial patterns that are invisible on the national level.
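
As a rough illustration of the general idea named in the summary (transcription rules followed by fuzzy matching against a gazetteer), the sketch below normalizes a noisy placename and looks for the closest gazetteer entry. It is not the authors' engine: the tiny in-memory gazetteer (with approximate coordinates), the three transcription rules, the function names `normalize` and `geocode`, and the 0.75 similarity cutoff are all assumptions made for this example, and Python's standard `difflib` stands in for whatever matcher the study actually uses.

```python
import difflib
import unicodedata

# Hypothetical mini-gazetteer: (name, latitude, longitude), coordinates approximate.
# A real engine would load such records from a GeoNames dump instead.
GAZETTEER = [
    ("Kyiv", 50.45, 30.52),
    ("Kharkiv", 49.99, 36.23),
    ("Chisinau", 47.01, 28.86),
    ("Hanoi", 21.03, 105.85),
]

# Hypothetical transcription rules (source spelling -> canonical spelling).
# They are illustrative only and are not the rules used in the study.
TRANSCRIPTION_RULES = [("ch", "kh"), ("j", "y"), ("w", "v")]


def normalize(text):
    """Strip diacritics, lowercase, and apply the transcription rules."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    result = stripped.lower().strip()
    for source, target in TRANSCRIPTION_RULES:
        result = result.replace(source, target)
    return result


def geocode(raw_name, cutoff=0.75):
    """Return the best-matching gazetteer record, or None if nothing is close enough."""
    # Normalize gazetteer names the same way as the query so both sides are comparable.
    index = {normalize(name): (name, lat, lon) for name, lat, lon in GAZETTEER}
    candidates = difflib.get_close_matches(
        normalize(raw_name), list(index), n=1, cutoff=cutoff
    )
    return index[candidates[0]] if candidates else None
```

Under these assumptions, `geocode("Charkiv")` resolves to the Kharkiv record because the `ch` to `kh` rule makes the spellings identical, while an unrelated string such as `"Xyzzy"` returns `None`. The cutoff trades recall for precision: lowering it accepts noisier spellings at the cost of more false matches.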