Modeling the global ocean distribution of dissolved cadmium based on machine learning—SHAP algorithm
| Published in | The Science of the total environment Vol. 958; p. 177951 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Netherlands, Elsevier B.V., 01.01.2025 |
| ISSN | 0048-9697, 1879-1026 |
| DOI | 10.1016/j.scitotenv.2024.177951 |
| Summary: | Cadmium (Cd) is a bio-essential trace metal in the ocean that can be toxic at high concentrations, significantly impacting the marine environment and phytoplankton growth. Its concentration varies nearly in proportion to that of phosphate (PO4), although the mechanism is not fully understood. At low concentrations, evidence indicates that Cd can act as an enzyme cofactor in biological processes. The spatial distribution of dissolved cadmium (dCd) remains poorly understood because current observational data are limited. Using the available observations, this study applied machine learning methods to reconstruct a global dataset of dCd, aiming to improve the accuracy and comprehensiveness of dCd cycling analyses. A comparison of five machine learning algorithms (artificial neural network, support vector machine, Lasso regression, k-nearest neighbors, and random forest) found that the random forest model performed best (R² = 0.99, RMSE = 0.035 nmol kg⁻¹, MAE = 0.019 nmol kg⁻¹, MAPE = 0.345), reducing bias by 25 % compared to previous studies. Using the SHapley Additive exPlanations (SHAP) approach, this study explored the factors influencing the dCd distribution at various depths and discussed the potential causes of changes in the Cd-PO4 relationship. The results showed that the temporal and spatial variability of Cd was influenced by surface biological processes, deep-sea mineralization, and seawater stratification. Variations in the Cd-PO4 relationship were linked to differences in biological fractionation inside and outside high-nutrient, low-chlorophyll (HNLC) regions, as well as the mixing of water masses with different Cd:PO4 ratios. Further analysis indicated that more than 80 % of the particles that degrade into Cd and PO4 were produced in HNLC regions.
This study highlights the broad potential of machine learning in oceanography, offering a global perspective on Cd cycling and new insights into the mechanisms driving element cycling.
• Selecting the optimal algorithm to reduce the bias of the model output by 25 %.
• First application of a novel interpretation method in the study of trace metal cycling.
• Biological processes and regeneration influence Cd distribution and Cd:PO4 ratio.
• Over 80 % of particles degrade into Cd and PO4 within HNLC regions. |
|---|---|
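
The summary reports four goodness-of-fit metrics for the random forest model (R², RMSE, MAE, MAPE). As an illustration only, the sketch below computes them from their standard textbook definitions in plain Python. The function name `regression_metrics` and the sample concentrations are hypothetical and are not taken from the article, and MAPE is returned as a fraction rather than a percentage, which is an assumption about the convention the article uses.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, RMSE, MAE and MAPE (as a fraction) for paired values.

    Assumes the observed values are non-zero (needed for MAPE) and not
    all identical (needed for R^2).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]

    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "mape": sum(abs(r / o) for r, o in zip(residuals, observed)) / n,
    }


# Hypothetical dCd concentrations in nmol/kg, for illustration only.
observed = [0.10, 0.40, 0.80, 1.00]
predicted = [0.12, 0.38, 0.82, 0.98]
metrics = regression_metrics(observed, predicted)
```

RMSE and MAE carry the units of the data (here nmol kg⁻¹), while R² and MAPE are dimensionless, which is why the summary quotes units for only two of the four values.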