Forecasting maximal and minimal air temperatures using explainable machine learning: Shapley additive explanation versus local interpretable model-agnostic explanations

Bibliographic Details
Published in Stochastic Environmental Research and Risk Assessment, Vol. 39, no. 6, pp. 2551-2581
Main Authors Daif, Noureddine; Di Nunno, Fabio; Granata, Francesco; Difi, Salah; Kisi, Ozgur; Heddam, Salim; Kim, Sungwon; Adnan, Rana Muhammad; Zounemat-Kermani, Mohammad
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.06.2025
Springer Nature B.V
ISSN 1436-3240, 1436-3259
DOI 10.1007/s00477-025-02984-4

Summary: This study investigates the performance of four boosting machine learning models, AdaBoost, XGBoost, CatBoost, and LightGBM, for forecasting maximal (Tmax) and minimal (Tmin) air temperatures at six lead times: the same day and 1, 7, 15, 21, and 30 days ahead. Daily temperature data from the USGS 02187010 weather station (South Carolina, USA) were used for model training and validation. To address the challenges posed by the non-linearity and complexity of climate data, the models were integrated with explainable artificial intelligence (XAI) techniques, specifically SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), which provide insights into the role of input features in shaping predictions. Results indicate that forecasting accuracy declines with increasing lead time. Among the tested models, CatBoost1 consistently exhibited the best performance. For Tmax forecasting on the validation set, CatBoost1 yielded a correlation coefficient (R) of 0.900, Nash–Sutcliffe efficiency (NSE) of 0.810, root mean squared error (RMSE) of 3.447 °C, mean absolute error (MAE) of 2.571 °C, Willmott’s Index (WI) of 0.947, Legates and McCabe Index (LM) of 0.615, explained variance score (EVS) of 0.810, and absolute percentage bias (APB) of 15.415%. For Tmin, CatBoost1 achieved R = 0.941, NSE = 0.885, RMSE = 2.618 °C, MAE = 1.952 °C, WI = 0.969, LM = 0.709, EVS = 0.885, and APB = 54.360%. These findings demonstrate that boosting models, when combined with explainable AI techniques, offer a robust and transparent framework for temperature forecasting, supporting their application in climate risk management, agriculture, and energy planning.
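The eight scores named in the summary can be computed from paired observed and forecast values. The following is a minimal sketch using commonly cited textbook definitions, not the authors' code: the function name `forecast_metrics` is hypothetical, and the APB formula in particular (sum of absolute errors divided by sum of absolute observations) is an assumption, since this record does not state the definition used in the article.

```python
import numpy as np

def forecast_metrics(obs, pred):
    """Return the eight scores named in the summary for paired series.

    Definitions are common textbook forms; the APB form is an assumption.
    """
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = obs - pred                 # forecast error per sample
    obs_mean = obs.mean()
    dev = obs - obs_mean             # observed deviation from the observed mean
    return {
        # Pearson correlation coefficient
        "R": float(np.corrcoef(obs, pred)[0, 1]),
        # Nash-Sutcliffe efficiency: 1 - SSE / variability of observations
        "NSE": float(1.0 - np.sum(err**2) / np.sum(dev**2)),
        "RMSE": float(np.sqrt(np.mean(err**2))),
        "MAE": float(np.mean(np.abs(err))),
        # Willmott's index of agreement
        "WI": float(1.0 - np.sum(err**2)
                    / np.sum((np.abs(pred - obs_mean) + np.abs(dev))**2)),
        # Legates and McCabe index (absolute-value analogue of NSE)
        "LM": float(1.0 - np.sum(np.abs(err)) / np.sum(np.abs(dev))),
        # Explained variance score
        "EVS": float(1.0 - np.var(err) / np.var(obs)),
        # Absolute percentage bias (assumed form, in percent)
        "APB": float(100.0 * np.sum(np.abs(err)) / np.sum(np.abs(obs))),
    }

# Toy example: forecasts that are uniformly 1 degree C too warm.
scores = forecast_metrics([10.0, 12.0, 14.0, 16.0], [11.0, 13.0, 15.0, 17.0])
```

With a constant offset like this, R and EVS stay at 1 while NSE, WI, and LM drop below 1, which illustrates why the summary reports several scores rather than the correlation alone.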