Comprehensive Evaluation of Satellite-Based Rainfall Measurements Through Rain Gauge Validation Using Advanced Statistical Regression and Machine Learning Models by Using Python

Bibliographic Details
Published in Water Resources Management Vol. 39; no. 9; pp. 4563-4587
Main Author Sumith, K V
Format Journal Article
Language English
Published Dordrecht Springer Netherlands 01.07.2025
Springer Nature B.V
ISSN 0920-4741
1573-1650
DOI 10.1007/s11269-025-04168-9

More Information
Summary: The accuracy of rainfall data is crucial for climate monitoring, disaster prevention, and water resource management. This study evaluates the effectiveness of various regression and machine-learning models for rainfall prediction using satellite-based and ground-based gauge data. The models tested include Linear, Ridge, Lasso, and Polynomial Regression, Random Forest, Decision Tree, Gradient Boosting Machine (GBM), Support Vector Machines, and Artificial Neural Networks (ANN). Without normalization, the Linear, Ridge, and Lasso regression models performed similarly, with R² values of 0.60 for training and 0.57 for testing, indicating reasonable accuracy but limited generalization. Polynomial regression achieved a higher training R² (0.66) but overfitted significantly, with test performance dropping to R² = 0.50. Random Forest and GBM performed well in training (R² = 0.94–0.95) but declined in testing (R² = 0.43–0.47), indicating some overfitting. After normalization using Min-Max and Z-score scaling, the regression models exhibited stable performance. Polynomial regression became more consistent but still displayed signs of overfitting. The machine learning models, particularly Random Forest and ANN, generalized better after normalization, with ANN achieving the best test R² of 0.60. Normalization, particularly Min-Max and Z-score scaling, significantly improved model performance, and statistical analysis confirmed these improvements. The study highlights the potential of machine learning models, particularly Random Forest and ANN, for accurate rainfall prediction in applications such as flood warning systems, irrigation planning, and water resource management.
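The evaluation procedure summarized above (Min-Max and Z-score normalization, then comparing training and testing R²) can be illustrated with a minimal sketch. This is not the study's code or data: the synthetic satellite and gauge values, the 150/50 split, the single-feature least-squares model, and all function names are assumptions chosen for illustration, and only the Python standard library is used.

```python
import random
import statistics


def min_max_scale(values, lo, hi):
    """Map values to [0, 1] using bounds lo and hi taken from the training set."""
    span = hi - lo
    return [(v - lo) / span for v in values]


def z_score_scale(values, mean, std):
    """Centre on the training mean and divide by the training standard deviation."""
    return [(v - mean) / std for v in values]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (hypothetical): satellite estimates and noisy gauge readings.
rng = random.Random(0)
satellite = [rng.uniform(0.0, 50.0) for _ in range(200)]
gauge = [0.8 * s + 2.0 + rng.gauss(0.0, 5.0) for s in satellite]

# Hold out the last 50 samples for testing (assumed split, not the study's).
split = 150
x_train, x_test = satellite[:split], satellite[split:]
y_train, y_test = gauge[:split], gauge[split:]

# Scaling parameters come from the training set only, to avoid test-set leakage.
lo, hi = min(x_train), max(x_train)
mean, std = statistics.fmean(x_train), statistics.pstdev(x_train)

variants = {
    "raw": (x_train, x_test),
    "min-max": (min_max_scale(x_train, lo, hi), min_max_scale(x_test, lo, hi)),
    "z-score": (z_score_scale(x_train, mean, std), z_score_scale(x_test, mean, std)),
}

results = {}
for name, (xtr, xte) in variants.items():
    slope, intercept = fit_line(xtr, y_train)
    r2_train = r2_score(y_train, [slope * v + intercept for v in xtr])
    r2_test = r2_score(y_test, [slope * v + intercept for v in xte])
    results[name] = (r2_train, r2_test)
    print(f"{name:8s} train R2 = {r2_train:.3f}  test R2 = {r2_test:.3f}")
```

A large gap between training and testing R² is the overfitting signal the summary refers to. In this single-feature least-squares sketch the three variants yield matching R² values, because an ordinary least-squares fit is unaffected by affine rescaling of its input; scale-sensitive models such as regularized regression, Support Vector Machines, or neural networks are where normalization would be expected to change the result.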