Clustering Algorithm by Boundary Detection Base on Entropy of KNN
| Published in | IEEE Access, Vol. 13, pp. 62353-62366 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Piscataway: IEEE, 2025. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN | 2169-3536 |
| DOI | 10.1109/ACCESS.2025.3555772 |
| Summary: | Clustering analysis is widely applied across many fields, and clustering algorithms based on boundary detection have performed well. In this work, we propose a clustering algorithm by boundary detection based on entropy of KNN (CBDEK). The nearest neighbors of a border point lie only within a specific directional range. We therefore define the entropy of KNN (EK) to identify cluster boundaries accurately. Because entropy measures uncertainty, it can quantify how likely a point is to be a border point: a lower EK indicates an uneven distribution of neighbors and thus a higher likelihood that the point lies on a border. The border points are then clustered based on the directional similarity of their nearest neighbors. Specifically, if one border point is a neighbor of another and most of their nearest neighbors show directional similarity, the two are considered to belong to the same cluster. To allocate the remaining (interior) points efficiently, we assign the label of each border point to the interior points located within its maximum nearest neighbor sub-block. In addition, the algorithm mitigates noise using average distance and box plot analysis. The effectiveness of CBDEK is demonstrated by a comparative evaluation of nine algorithms on 24 datasets. |
|---|---|
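
The summary's central idea, that a point whose nearest neighbors all lie in a narrow range of directions is probably a border point, can be illustrated with a small sketch. This is not the paper's EK definition, which the record does not give. It assumes 2-D points, a brute-force neighbor search, and equal angular sectors; the names `knn_indices`, `directional_entropy`, `k`, and `n_sectors` are hypothetical.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i] (brute force, Euclidean)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]

def directional_entropy(points, i, k=8, n_sectors=8):
    """Shannon entropy (in bits) of the directions from points[i] to its k neighbors.

    ASSUMPTION: directions are binned into n_sectors equal angular sectors
    in 2-D. The actual EK in the paper may be defined differently.
    """
    counts = [0] * n_sectors
    px, py = points[i]
    two_pi = 2 * math.pi
    for j in knn_indices(points, i, k):
        qx, qy = points[j]
        # Shift by half a sector so sectors are centered on 0, 45, 90, ... degrees.
        angle = (math.atan2(qy - py, qx - px) + math.pi / n_sectors) % two_pi
        sector = min(int(angle / two_pi * n_sectors), n_sectors - 1)
        counts[sector] += 1
    total = sum(counts)
    # Empty sectors contribute nothing to the entropy.
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

# Toy data: a 5 x 5 grid standing in for a single square cluster.
grid = [(x, y) for x in range(5) for y in range(5)]
center_ek = directional_entropy(grid, grid.index((2, 2)))
corner_ek = directional_entropy(grid, grid.index((0, 0)))
print(f"center: {center_ek:.2f} bits, corner: {corner_ek:.2f} bits")
```

On this toy grid the center point has neighbors in every sector while the corner point has neighbors in only three, so the corner scores lower, in line with the summary's statement that a lower EK suggests a border point. Any threshold separating border from interior points would be a further assumption.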