Graph neural networks for house price prediction: do or don’t?

Bibliographic Details
Published in: International journal of data science and analytics, Vol. 20, no. 4, pp. 3563-3593
Main Authors: Geerts, Margot; vanden Broucke, Seppe; De Weerdt, Jochen
Format: Journal Article
Language: English
Published: Cham, Springer International Publishing, 01.10.2025
Springer Nature B.V.
ISSN: 2364-415X, 2364-4168
DOI: 10.1007/s41060-024-00682-y

Summary: The domain of house price prediction, also referred to as real estate appraisal, has recently seen a shift from traditional statistical methodologies toward machine learning and deep learning techniques. As housing data is characterized by heterogeneous tabular data, and is subject to spatial dependencies, there is an exigent need for predictive models capable of capturing these complexities. Specifically, graph neural networks (GNNs) have been posited to discern spatial relationships by structuring housing data as graphs. Nevertheless, recent approaches frequently neglect alternative methods for graph construction, or lack a systematic comparative framework for different GNN approaches. Moreover, tree-based models, which are considered the state-of-the-art for tabular data, along with other contemporary methods, are often overlooked when evaluating GNNs. Therefore, this paper performs a comprehensive benchmark of graph construction methods and prevalent GNN models. Furthermore, we compare GNN approaches for house price prediction against an extensive suite of statistical, machine learning, and deep learning models. The results, drawn from six diverse housing datasets, reveal that GNNs are unsuccessful in surpassing machine learning and deep learning baselines. In particular, optimizing the graph structure yields only marginal improvements, with k-nearest neighbor graphs generally exhibiting superior performance. Among the GNN architectures evaluated, GraphSAGE and Transformer-based models demonstrate superior accuracy compared to other GNN variants. Ultimately, the findings suggest a general recommendation against the adoption of GNNs in favor of tree-based models such as LightGBM and CatBoost for house price prediction tasks.