Accounting for Spatial Autocorrelation in Algorithm-Driven Hedonic Models: A Spatial Cross-Validation Approach

Bibliographic Details
Published in	The Journal of Real Estate Finance and Economics, Vol. 68, no. 2, pp. 235-273
Main Authors	Deppner, Juergen; Cajias, Marcelo
Format	Journal Article
Language	English
Published	New York, NY: Springer US, 01.02.2024; Springer Nature B.V.
Subjects	Algorithms; Automated valuation models; Bias; Credit risk; Decision making; Economics; Economics and Finance; Financial Services; Generalizability; Hedonic modeling; Housing prices; Machine learning; Mass appraisal; Optimism; Regional/Spatial Science; Risk management; Spatial autocorrelation; Spatial cross-validation; Valuation; Valuation methods
ISSN	0895-5638, 1573-045X
DOI	10.1007/s11146-022-09915-y

More Information
Summary:	Data-driven machine learning algorithms have initiated a paradigm shift in hedonic house price and rent modeling through their ability to capture highly complex and non-monotonic relationships. Their superior accuracy compared to parametric model alternatives has been demonstrated repeatedly in the literature. However, the statistical independence of the data implicitly assumed by resampling-based error estimates is unlikely to hold in a real estate context as price-formation processes in property markets are inherently spatial, which leads to spatial dependence structures in the data. When performing conventional cross-validation techniques for model selection and model assessment, spatial dependence between training and test data may lead to undetected overfitting and overoptimistic perception of predictive power. This study sheds light on the bias in cross-validation errors of tree-based algorithms induced by spatial autocorrelation and proposes a bias-reduced spatial cross-validation strategy. The findings confirm that error estimates from non-spatial resampling methods are overly optimistic, whereas spatially conscious techniques are more dependable and can increase generalizability. As accurate and unbiased error estimates are crucial to automated valuation methods, our results prove helpful for applications including, but not limited to, mass appraisal, credit risk management, portfolio allocation and investment decision making.
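
The summary contrasts conventional resampling with spatially conscious resampling. As a rough illustration of the general idea only, the sketch below assigns observations to cross-validation folds by square grid block, so that nearby observations never straddle the training/test boundary. This record does not describe the authors' actual strategy; the function names, the block size, and the synthetic coordinates are all hypothetical.

```python
import random


def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Assign observations to folds by spatial block (illustrative only).

    Every point that falls in the same square grid block of side
    `block_size` receives the same fold, so a block is never split
    between training and test data.
    """
    # Map each (x, y) point to the integer index of its grid block.
    blocks = [(int(x // block_size), int(y // block_size)) for x, y in coords]
    unique_blocks = sorted(set(blocks))
    # Shuffle blocks (not individual points), then deal them out round-robin.
    rng = random.Random(seed)
    rng.shuffle(unique_blocks)
    fold_of_block = {b: i % n_folds for i, b in enumerate(unique_blocks)}
    return [fold_of_block[b] for b in blocks], blocks


def train_test_indices(folds, k):
    """Return (train, test) index lists for held-out fold k."""
    test = [i for i, f in enumerate(folds) if f == k]
    train = [i for i, f in enumerate(folds) if f != k]
    return train, test


# Hypothetical data: 200 listings scattered over a 10 x 10 area.
data_rng = random.Random(42)
coords = [(data_rng.uniform(0, 10), data_rng.uniform(0, 10)) for _ in range(200)]
folds, blocks = spatial_block_folds(coords, block_size=2.5, n_folds=4)
```

Blocking alone does not remove dependence between adjacent blocks; a buffer zone around each test block is one common refinement, and whether it is needed depends on the range of the spatial autocorrelation in the data.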