Evaluation of the bias and precision of regression techniques and machine learning approaches in total dissolved solids modeling of an urban aquifer

Bibliographic Details
Published in	Environmental science and pollution research international Vol. 26; no. 2; pp. 1821 - 1833
Main Authors	Pan, Conglian; Ng, Kelvin Tsun Wai; Fallah, Bahareh; Richter, Amy
Format	Journal Article
Language	English
Published	Berlin/Heidelberg: Springer Berlin Heidelberg; Springer Nature B.V., 01.01.2019
Subjects	Aquatic Pollution; Aquifers; Artificial intelligence; Artificial neural networks; Atmospheric Protection/Air Quality Control/Air Pollution; Back propagation; Bias; Canada; data collection; Drinking water; Earth and Environmental Science; Ecotoxicology; Environment; Environmental Chemistry; Environmental Health; Environmental monitoring; Environmental Monitoring - methods; Environmental science; Extreme values; Groundwater; Groundwater - chemistry; Groundwater quality guidelines; Irrigation water; Landfill; Landfills; Learning algorithms; Machine Learning; Mathematical models; Modelling; Models, Statistical; monitoring; Neural networks; Performance evaluation; polymerase chain reaction; Quality assessment; Quality control; Regression analysis; Research Article; Statistical analysis; Total dissolved solids; Waste disposal sites; Waste Water Technology; Water Management; Water Pollutants - analysis; Water Pollution - statistics & numerical data; Water Pollution Control; Water quality; wells; Principal component regression; Bias and precision; Artificial neural network; Multivariate statistical analysis; Machine learning methods
ISSN	0944-1344; 1614-7499
DOI	10.1007/s11356-018-3751-y

Summary:	Total dissolved solids (TDS) is modeled for an aquifer near an unlined landfill in Canada. Canadian Drinking Water Guidelines and other indices are used to evaluate TDS concentrations in 27 monitoring wells surrounding the landfill. This study aims to predict TDS concentrations using three modeling approaches: dual-step multiple linear regression (MLR), hybrid principal component regression (PCR), and backpropagation neural networks (BPNN). The bias and precision of each model are then analyzed using performance evaluation metrics and statistical indices. TDS is one of the most important parameters in assessing the suitability of water for irrigation and in overall groundwater quality assessment. Good agreement was observed between the MLR1 model and field data, although multicollinearity issues exist. Percentage errors of hybrid PCR were comparable to those of the dual-step MLR method. Percentage error for hybrid PCR was found to be inversely proportional to TDS concentration, a pattern not observed for dual-step MLR. Larger errors were obtained from the BPNN models, and higher percentage errors were observed in monitoring wells with lower TDS concentrations. All models in this study adequately describe the data in the testing stage (R² > 0.86). Generally, the dual-step MLR and hybrid PCR models fared better (average R² = 0.981 and 0.974, respectively), while BPNN models performed worse (average R² = 0.904). For this dataset, both regression and machine learning models are better suited to predicting mid-range values than extreme values. Advanced regression methods (hybrid PCR and dual-step MLR) are more advantageous than BPNN.
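
The summary reports model performance as percentage error and R². As a minimal sketch of how these two metrics are commonly computed, the Python below uses hypothetical TDS values (not data from the study) and assumes a fixed 50 mg/L overprediction at every well. Under that assumption, the percentage error shrinks as TDS rises, which is one way an inverse relationship between percentage error and concentration can arise. The function names are illustrative and are not taken from the article.

```python
from statistics import mean


def percentage_errors(observed, predicted):
    """Absolute percentage error per well: |pred - obs| * 100 / obs."""
    return [abs(p - o) * 100.0 / o for o, p in zip(observed, predicted)]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical TDS concentrations in mg/L; NOT data from the study.
observed = [500.0, 1000.0, 2000.0, 4000.0]
# Assumption: every prediction overshoots by the same 50 mg/L.
predicted = [o + 50.0 for o in observed]

errors = percentage_errors(observed, predicted)
for obs, err in zip(observed, errors):
    print(f"TDS {obs:7.1f} mg/L -> percentage error {err:.2f}%")
print(f"R^2 = {r_squared(observed, predicted):.4f}")
```

In this toy case R² stays high even though the lowest-TDS well carries the largest percentage error, so the two metrics can give different impressions of the same predictions.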