SAMA: Spatially-Aware Model-Agnostic Machine Learning Framework for Geophysical Data

Bibliographic Details
Published in IEEE Access Vol. 11; pp. 7436-7449
Main Authors Yamani, Asma Z., Katterbauer, Klemens, Alshehri, Abdallah A., Al-Zaidy, Rabeah A.
Format Journal Article
Language English
Published Piscataway IEEE 2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 2169-3536
DOI 10.1109/ACCESS.2023.3236802

Summary: Geophysical data is a form of spatial data that suffers from various limitations when conventional machine learning algorithms and evaluation techniques are applied to it. A key limitation facing models trained on geophysical data is their inability to generalize well when deployed to predict from new, unseen data. We address the problem of inaccurate performance assessments of machine learning models that stems from violating independence assumptions during the feature selection and evaluation phases of the learning process. Our proposed spatially-aware and model-agnostic (SAMA) framework provides a suite of spatially-aware feature generation, feature selection, and model validation algorithms that account for the spatial characteristics of geophysical data. The framework is model-agnostic, as it tackles data-related challenges that are not affected by the specific machine learning algorithm used to fit the data. To demonstrate the effectiveness of the proposed approach, it is applied to the water saturation mapping problem using a novel geophysical dataset to train a prediction model. The proposed spatially-aware model obtains an R² of 0.620 and an RMSE of 0.220 for predicting water saturation for the Whole Region of the reservoir model box, and an R² of 0.161 and an RMSE of 0.263 for the Interwell Region. Extensive experiments on 5 additional unseen datasets show that the model maintains stable performance across different datasets, which demonstrates the ability of the SAMA framework to produce robust models that are transferable to new datasets.
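
Illustrative note: the summary attributes over-optimistic scores to violated independence assumptions during evaluation. One generic way to reduce that effect is to hold out whole spatial blocks instead of random samples, so that near neighbours never sit on both sides of a split. The sketch below is not the SAMA algorithm, whose details are not given in this record; the function names, the square-block scheme, and the synthetic grid are all assumptions made for illustration. It also shows how the two reported metrics, R² and RMSE, are conventionally computed.

```python
import math
import random
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Assign samples to folds by spatial block (hypothetical helper).

    coords: list of (x, y) positions; block_size: edge length of a square block.
    All samples in one block share a fold, so neighbouring points are not
    split between training and validation.
    """
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        key = (math.floor(x / block_size), math.floor(y / block_size))
        blocks[key].append(idx)

    # Shuffle the blocks (not the samples) and deal them out round-robin.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)

    folds = [[] for _ in range(n_folds)]
    for i, key in enumerate(keys):
        folds[i % n_folds].extend(blocks[key])
    return folds


def r2_and_rmse(y_true, y_pred):
    """Return (R^2, RMSE) using the standard definitions."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Synthetic 10 x 10 grid of sample locations (illustrative data only).
coords = [(float(x), float(y)) for x in range(10) for y in range(10)]
folds = spatial_block_folds(coords, block_size=5.0, n_folds=2)
```

Each entry of `folds` can serve as a validation set while the remaining folds train the model; because the split is made on blocks, any regressor can be used unchanged, which is the sense in which such a scheme is model-agnostic.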