Comparison and application of SOFM, fuzzy c-means and k-means clustering algorithms for natural soil environment regionalization in China

Bibliographic Details
Published in: Environmental research Vol. 216; no. Pt 2; p. 114519
Main Authors: Zhao, Wenhao; Ma, Jin; Liu, Qiyuan; Song, Jing; Tysklind, Mats; Liu, Chengshuai; Wang, Dong; Qu, Yajing; Wu, Yihang; Wu, Fengchang
Format: Journal Article
Language: English
Published: Netherlands, Elsevier Inc, 01.01.2023
ISSN: 0013-9351, 1096-0953
DOI: 10.1016/j.envres.2022.114519

More Information
Summary: Soil attributes and their environmental drivers show different patterns along different geographical directions and have distinct regional characteristics. These differences may strongly affect the migration and transformation of substances such as organic matter and soil elements, as well as the environmental impacts of pollutants. Regional soil characteristics should therefore be considered when regionalizing for environmental management. However, no comprehensive evaluation or systematic classification of the natural soil environment has been established for China. Here, we established an index system for natural soil environmental regionalization (NSER) by combining literature data obtained through bibliometrics with the analytic hierarchy process (AHP). Based on the index system, we collected spatial distribution data for 14 indexes at the national scale. Three clustering algorithms, self-organizing feature mapping (SOFM), fuzzy c-means (FCM) and k-means (KM), were then used to classify and define the natural soil environment. We used four cluster validity indexes (CVI) to evaluate the models: the Davies-Bouldin index (DB), Silhouette index (Sil) and Calinski-Harabasz index (CH) for FCM and KM, and the clustering quality index (CQI) for SOFM. Analysis and comparison of the results showed that when the number of clusters was 13, the FCM clustering algorithm achieved the optimal clustering results (DB = 1.16, Sil = 0.78, CH = 6.77 × 10^6), allowing the natural soil environment of China to be divided into 12 regions with distinct characteristics. Our study provides a comprehensive set of scientific methods for regionalization research based on spatial data and has important reference value for improving soil environmental management according to local conditions in China.
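The summary compares clustering models by scoring each candidate number of clusters with validity indexes. As a rough illustration only, the sketch below runs a plain k-means and computes the Davies-Bouldin index (lower is better) for several values of k. It is not the authors' implementation: the function names `kmeans` and `davies_bouldin`, the synthetic two-dimensional data, and the seeds are all assumptions made for this example, and the record does not say which software the study used.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on tuples of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids (assumption)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def davies_bouldin(points, centroids, labels):
    """Davies-Bouldin index: mean over clusters of the worst scatter/separation ratio."""
    k = len(centroids)
    # Scatter: mean distance from each cluster's members to its centroid.
    scatter = []
    for j in range(k):
        d = [math.dist(p, centroids[j])
             for p, lab in zip(points, labels) if lab == j]
        scatter.append(sum(d) / len(d) if d else 0.0)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                     for j in range(k) if j != i)
    return total / k


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-in data: three 2-D blobs, not the paper's 14 indexes.
    centers = [(0.0, 0.0), (8.0, 0.0), (4.0, 7.0)]
    data = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
            for cx, cy in centers for _ in range(30)]
    for k in range(2, 6):
        cents, labs = kmeans(data, k)
        print(k, round(davies_bouldin(data, cents, labs), 3))
```

The Silhouette and Calinski-Harabasz indexes named in the summary would be compared the same way, except that higher values indicate better clustering. Because k-means depends on its initial centroids, a real comparison would normally repeat each k with several initializations.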
• Propose a data-driven regionalization scheme based on three clustering algorithms.
• Establish the regionalization index system using bibliometrics and the analytic hierarchy process.
• Use four cluster validity indexes to evaluate the clustering results of different models.
• Further confirm the advantages of the fuzzy clustering algorithm in soil regionalization.
• Divide China's natural soil environment into 12 regions, each with its own characteristics.
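The analytic hierarchy process mentioned above is commonly used to turn pairwise importance judgments into index weights. The sketch below shows one standard way to do that, the row geometric-mean approximation, together with Saaty's consistency ratio. The 3 × 3 comparison matrix is hypothetical, and the record does not state which weighting procedure or judgments the authors used.

```python
import math

# Saaty's random consistency index for matrix sizes 3..9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priority weights by the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI; values below 0.1 are usually accepted."""
    n = len(matrix)
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    ratios = [sum(a * w for a, w in zip(row, weights)) / weights[i]
              for i, row in enumerate(matrix)]
    lambda_max = sum(ratios) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


if __name__ == "__main__":
    # Hypothetical judgments for three criteria: entry [i][j] states how much
    # more important criterion i is than criterion j.
    judgments = [
        [1.0, 2.0, 4.0],
        [0.5, 1.0, 2.0],
        [0.25, 0.5, 1.0],
    ]
    w = ahp_weights(judgments)
    print([round(x, 4) for x in w])
    print(round(consistency_ratio(judgments, w), 4))
```

This example matrix is perfectly consistent, so the weights come out in the ratio 4:2:1 and the consistency ratio is zero. Judgments collected from experts or literature would normally be less tidy, which is why the ratio is checked.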