A novel density peaks clustering algorithm based on Hopkins statistic
| Published in | Expert systems with applications Vol. 201; p. 116892 |
|---|---|
| Main Authors | , , , |
| Format | Journal Article |
| Language | English |
| Published | New York; Elsevier Ltd; 01.09.2022; Elsevier BV |
| ISSN | 0957-4174 1873-6793 |
| DOI | 10.1016/j.eswa.2022.116892 |
| Summary: | Density peaks clustering (DPC) is a promising algorithm due to its straightforward and easy implementation. However, most of its improved variants still rely on expert knowledge, strong prior information, or complex iterations to identify the cluster centers, which inevitably adds subjectivity and instability. Moreover, some crisp and sensitive density metrics sometimes reduce the representativeness of the centers, resulting in poor clustering. To this end, we propose an enhanced algorithm called density peaks clustering based on Hopkins Statistic (DPC-AHS). The main property of the method is the automatic identification of cluster centers without prior information. Specifically, with a two-stage strategy, we first select some objects as candidate centers by linear regression and residual analysis. Subsequently, inspired by the idea of optimization, we design a novel validity index (AHS) that replaces the original decision graph to find the desired centers among the candidates. Another novel part of DPC-AHS is the proposed adjusted-k-nearest neighbors (A-kNN), which dynamically defines the neighbors during the process and further enhances robustness against outliers. Finally, we compare the performance of DPC-AHS with 7 state-of-the-art methods on synthetic, UCI, and image datasets. Experiments on 25 datasets and in-depth case discussions from 5 perspectives demonstrate that our algorithm is feasible and effective in clustering and center identification. • A novel density peaks clustering algorithm based on Hopkins Statistic (DPC-AHS) is proposed. • DPC-AHS can automatically find clusters and centers without manual participation. • A cluster validity index, AHS, with low complexity is designed to evaluate clustering. • Experiments and discussions on various datasets show the effectiveness of our method. • DPC-AHS requires only one parameter and can be applied to high-dimensional data. |
|---|---|
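
The summary names the Hopkins statistic as the basis of the AHS validity index but does not define either one. For orientation only, the following is a minimal sketch of the classical Hopkins statistic, not of the paper's AHS index or its A-kNN scheme, which this record does not describe. The function name `hopkins_statistic`, the default sample size `m`, the uniform sampling over the bounding box, and the choice to raise distances to the power of the data dimension are all assumptions of this sketch (some formulations use plain distances instead). Values near 0.5 suggest spatially random data, while values near 1 suggest a strong cluster tendency.

```python
import numpy as np


def hopkins_statistic(X, m=None, seed=0):
    """Classical Hopkins statistic of an (n, d) data matrix (illustrative sketch).

    m    : number of probe points (assumed default: about 10% of n)
    seed : seed for NumPy's random generator
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if m is None:
        m = max(1, n // 10)
    rng = np.random.default_rng(seed)

    # w: distance from each of m sampled data points to its nearest
    # *other* data point (the point itself is excluded).
    idx = rng.choice(n, size=m, replace=False)
    dist_real = np.linalg.norm(X[idx, None, :] - X[None, :, :], axis=2)
    dist_real[np.arange(m), idx] = np.inf
    w = dist_real.min(axis=1)

    # u: distance from each of m uniformly drawn points (inside the
    # bounding box of the data) to its nearest data point.
    lo, hi = X.min(axis=0), X.max(axis=0)
    U = rng.uniform(lo, hi, size=(m, d))
    u = np.linalg.norm(U[:, None, :] - X[None, :, :], axis=2).min(axis=1)

    # Distances are raised to the power d (an assumed convention).
    num = (u ** d).sum()
    return float(num / (num + (w ** d).sum()))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two tight, well-separated blobs versus uniform noise.
    blobs = np.vstack([
        rng.normal(0.0, 0.1, size=(100, 2)),
        rng.normal(10.0, 0.1, size=(100, 2)),
    ])
    noise = rng.uniform(0.0, 1.0, size=(500, 2))
    print("clustered:", hopkins_statistic(blobs))
    print("uniform:  ", hopkins_statistic(noise))
```

The sketch uses brute-force distance computation for clarity; for large datasets a spatial index would be the more practical choice.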