Comprehensive Evaluation of Satellite-Based Rainfall Measurements Through Rain Gauge Validation Using Advanced Statistical Regression and Machine Learning Models by Using Python
| Published in | Water resources management Vol. 39; no. 9; pp. 4563 - 4587 |
|---|---|
| Main Author | |
| Format | Journal Article |
| Language | English |
| Published | Dordrecht, Springer Netherlands, 01.07.2025, Springer Nature B.V |
| Subjects | |
| ISSN | 0920-4741 1573-1650 |
| DOI | 10.1007/s11269-025-04168-9 |
| Summary: | The accuracy of rainfall data is crucial for climate monitoring, disaster prevention, and water resource management. This study evaluates the effectiveness of various regression and machine-learning models for rainfall prediction using satellite-based and ground-based gauge data. The models tested include Linear, Ridge, Lasso, and Polynomial Regression, Random Forest, Decision Tree, Gradient Boosting Machine (GBM), Support Vector Machines, and Artificial Neural Networks (ANN). Without normalization, the Linear, Ridge, and Lasso regression models performed similarly, with R² values of 0.60 for training and 0.57 for testing, indicating reasonable accuracy but limited generalization. Polynomial regression reached a higher training R² (0.66) but overfit substantially, with test performance dropping to R² = 0.50. Random Forest and GBM performed well in training (R² = 0.94–0.95) but declined in testing (R² = 0.43–0.47), indicating overfitting. After Min-Max and Z-score normalization, the regression models exhibited stable performance; polynomial regression became more consistent but still showed signs of overfitting. The machine learning models, particularly Random Forest and ANN, generalized better after normalization, with ANN achieving the best test R² of 0.60. Normalization with Min-Max and Z-score significantly improved model performance, and statistical analysis confirmed these improvements. The results highlight the potential of machine learning models, particularly Random Forest and ANN, for accurate rainfall prediction in applications such as flood warning systems, irrigation planning, and water resource management. |
|---|---|
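
The comparison described in the summary (fit a regression model on raw, Min-Max-scaled, and Z-score-scaled inputs, then compare training and testing R²) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the data are synthetic, the 0.8 slope, noise level, and 150/50 split are arbitrary assumptions, and the helper names (`min_max_scale`, `z_score_scale`, `fit_line`, `r_squared`, `evaluate`) are hypothetical. Only the Python standard library is used.

```python
import random
from statistics import mean, pstdev


def min_max_scale(values, lo, hi):
    """Map values to [0, 1] using bounds taken from the training split."""
    span = hi - lo
    return [(v - lo) / span for v in values]


def z_score_scale(values, mu, sigma):
    """Standardize values using the training mean and standard deviation."""
    return [(v - mu) / sigma for v in values]


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - my) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): satellite estimates in mm and
# noisy "gauge" readings. These are NOT values from the study.
rng = random.Random(0)
satellite = [rng.uniform(0.0, 50.0) for _ in range(200)]
gauge = [0.8 * s + rng.gauss(0.0, 5.0) for s in satellite]

split = 150  # arbitrary train/test split for illustration
x_train, x_test = satellite[:split], satellite[split:]
y_train, y_test = gauge[:split], gauge[split:]


def evaluate(x_tr, x_te):
    """Fit on the training inputs and return (train R², test R²)."""
    a, b = fit_line(x_tr, y_train)
    r2_train = r_squared(y_train, [a + b * x for x in x_tr])
    r2_test = r_squared(y_test, [a + b * x for x in x_te])
    return r2_train, r2_test


# Scaling parameters come from the training split only, so no
# information from the test split leaks into the fit.
lo, hi = min(x_train), max(x_train)
mu, sigma = mean(x_train), pstdev(x_train)

results = {
    "raw": evaluate(x_train, x_test),
    "min_max": evaluate(min_max_scale(x_train, lo, hi),
                        min_max_scale(x_test, lo, hi)),
    "z_score": evaluate(z_score_scale(x_train, mu, sigma),
                        z_score_scale(x_test, mu, sigma)),
}

for name, (r2_tr, r2_te) in results.items():
    print(f"{name:8s} train R2 = {r2_tr:.3f}  test R2 = {r2_te:.3f}")
```

In this sketch the three rows agree up to floating-point error, because ordinary least squares is invariant to affine rescaling of its input. Scaling typically matters for penalized or gradient-trained models such as Ridge, Lasso, SVM, or ANN, which would need a fuller library-based setup than shown here.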