Adding measurement error to location data to protect subject confidentiality while allowing for consistent estimation of exposure effects
Published in | Journal of the Royal Statistical Society Series C: Applied Statistics Vol. 69; no. 5; pp. 1251 - 1268 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Oxford, Wiley, 01.11.2020, Oxford University Press |
ISSN | 0035-9254 1467-9876 |
DOI | 10.1111/rssc.12439 |
Summary: | In public use data sets, it is desirable not to report a respondent's location precisely to protect subject confidentiality. However, the direct use of perturbed location data to construct explanatory exposure variables for regression models will generally make naive estimates of all parameters biased and inconsistent. We propose an approach where a perturbation vector, consisting of a random distance at a random angle, is added to a respondent's reported geographic co-ordinates. We show that, as long as the distribution of the perturbation is public and there is an underlying prior population density map, external researchers can construct unbiased and consistent estimates of location-dependent exposure effects by using numerical integration techniques over all possible actual locations, although coefficient confidence intervals are wider than if the true location data were known. We examine our method by using a Monte Carlo simulation exercise and apply it to a real world example using data on perceived and actual distance to a health facility in Tanzania. |
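The idea described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes, purely for illustration, that the perturbation distance is uniform on [0, max_dist] and the angle is uniform on [0, 2π), and it replaces the unknown exposure with its expected value over all possible actual locations, weighted by a prior population density. The names `perturb`, `expected_exposure`, `prior_density`, `exposure`, and `max_dist` are hypothetical, and the estimator in the article may differ from this simple averaging step.

```python
import numpy as np


def perturb(xy, max_dist, rng):
    """Add a random distance at a random angle to a true location.

    Assumption: distance ~ Uniform(0, max_dist), angle ~ Uniform(0, 2*pi).
    """
    angle = rng.uniform(0.0, 2.0 * np.pi)
    dist = rng.uniform(0.0, max_dist)
    offset = dist * np.array([np.cos(angle), np.sin(angle)])
    return np.asarray(xy, dtype=float) + offset


def expected_exposure(reported, exposure, prior_density, max_dist,
                      n_dist=200, n_angle=360):
    """Average an exposure over all candidate actual locations.

    Candidates lie on a midpoint grid in (distance, angle) around the
    reported point. Under the uniform assumptions above, the perturbation
    density is constant on this grid, so each candidate is weighted only
    by the prior population density at that location.
    """
    d = (np.arange(n_dist) + 0.5) * max_dist / n_dist
    a = (np.arange(n_angle) + 0.5) * 2.0 * np.pi / n_angle
    dd, aa = np.meshgrid(d, a, indexing="ij")
    # Actual location = reported location minus the perturbation vector.
    candidates = np.stack(
        [reported[0] - dd * np.cos(aa), reported[1] - dd * np.sin(aa)],
        axis=-1,
    )
    weights = prior_density(candidates)
    weights = weights / weights.sum()
    return float((weights * exposure(candidates)).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    facility = np.array([0.0, 0.0])          # hypothetical facility location
    reported = perturb([3.0, 4.0], 5.0, rng)

    def flat_prior(points):
        # Hypothetical flat population density map.
        return np.ones(points.shape[:-1])

    def distance_to_facility(points):
        return np.linalg.norm(points - facility, axis=-1)

    print(expected_exposure(reported, distance_to_facility, flat_prior, 5.0))
```

A data user would pass the published perturbation parameters and a real population density map in place of `flat_prior`; the grid sizes `n_dist` and `n_angle` trade accuracy of the numerical integration against run time.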