Graph neural networks for house price prediction: do or don’t?
| Published in | International journal of data science and analytics Vol. 20; no. 4; pp. 3563 - 3593 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | Cham, Springer International Publishing, 01.10.2025, Springer Nature B.V |
| ISSN | 2364-415X 2364-4168 |
| DOI | 10.1007/s41060-024-00682-y |
| Summary: | The domain of house price prediction, also referred to as real estate appraisal, has recently seen a shift from traditional statistical methodologies toward machine learning and deep learning techniques. As housing data is characterized by heterogeneous tabular data, and is subject to spatial dependencies, there is an exigent need for predictive models capable of capturing these complexities. Specifically, graph neural networks (GNNs) have been posited to discern spatial relationships by structuring housing data as graphs. Nevertheless, recent approaches frequently neglect alternative methods for graph construction, or lack a systematic comparative framework for different GNN approaches. Moreover, tree-based models, which are considered the state-of-the-art for tabular data, along with other contemporary methods, are often overlooked when evaluating GNNs. Therefore, this paper performs a comprehensive benchmark of graph construction methods and prevalent GNN models. Furthermore, we compare GNN approaches for house price prediction against an extensive suite of statistical, machine learning, and deep learning models. The results, drawn from six diverse housing datasets, reveal that GNNs are unsuccessful in surpassing machine learning and deep learning baselines. In particular, optimizing the graph structure yields only marginal improvements, with k-nearest neighbor graphs generally exhibiting superior performance. Among the GNN architectures evaluated, GraphSAGE and Transformer-based models demonstrate superior accuracy compared to other GNN variants. Ultimately, the findings suggest a general recommendation against the adoption of GNNs in favor of tree-based models such as LightGBM and CatBoost for house price prediction tasks. |
|---|---|