Collecting, Integrating, Enriching and Republishing Open City Data as Linked Data

Bibliographic Details
Published in	The Semantic Web - ISWC 2015, pp. 57-75
Main Authors	Bischof, Stefan; Martin, Christoph; Polleres, Axel; Schneider, Patrik
Format	Book Chapter
Language	English
Published	Cham: Springer International Publishing, 2015
Series	Lecture Notes in Computer Science
Subjects	City Data; Link Open Data; Open Data Source; Target Indicator; Triple Store
Online Access	Get full text
ISBN	3319250094 9783319250090
ISSN	0302-9743 1611-3349
DOI	10.1007/978-3-319-25010-6_4

Summary:	Access to high-quality, recent data is crucial both for decision makers in cities and for the public. Likewise, infrastructure providers could offer more tailored solutions to cities based on such data. However, even though many data sets containing relevant indicators about cities are available as open data, integrating and analyzing them is cumbersome, since collection is still a manual process and the sources are not connected to each other upfront. Further, disjoint indicators and cities across the available data sources lead to a large proportion of missing values when integrating these sources. In this paper we present a platform for collecting, integrating, and enriching open data about cities in a reusable and comparable manner: we have integrated various open data sources and present approaches for predicting missing values, using standard regression methods in combination with principal component analysis (PCA) to improve the quality and number of predicted values. Since indicators and cities overlap only partially across data sets, we focus in particular on predicting indicator values across data sets, extending, adapting, and evaluating our prediction model for this purpose. As a “side product” we learn ontology mappings (simple equations and sub-properties) for pairs of indicators from different data sets. Finally, we republish the integrated and predicted values as linked open data.
Bibliography:	Compared to an informal, preliminary version of this paper presented at the Know@LOD 2015 workshop, Sections 5, 6, and 8 are entirely new, and more data sources have been integrated.