Chameleon 2: An Improved Graph-Based Clustering Algorithm
| Published in | ACM Transactions on Knowledge Discovery from Data, Vol. 13, no. 1, pp. 1-27 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | 01.01.2019 |
| ISSN | 1556-4681 1556-472X |
| DOI | 10.1145/3299876 |
Summary: Traditional clustering algorithms fail to produce human-like results when confronted with data of variable density, complex distributions, or in the presence of noise. We propose an improved graph-based clustering algorithm called Chameleon 2, which overcomes several drawbacks of state-of-the-art clustering approaches. We modified the internal cluster quality measure and added an extra step to ensure algorithm robustness. Our results reveal a significant positive impact on the clustering quality measured by Normalized Mutual Information on 32 artificial datasets used in the clustering literature. This significant improvement is also confirmed on real-world datasets.

The performance of clustering algorithms such as DBSCAN is extremely parameter sensitive, and exhaustive manual parameter tuning is necessary to obtain a meaningful result. All hierarchical clustering methods are very sensitive to cutoff selection, and a human expert is often required to find the true cutoff for each clustering result. We present an automated cutoff selection method that enables the Chameleon 2 algorithm to generate high-quality clustering in autonomous mode.
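
The summary reports clustering quality as Normalized Mutual Information (NMI). As a rough illustration of what that score measures, the sketch below computes NMI between a ground-truth labeling and a predicted clustering in plain Python. It assumes the arithmetic-mean normalization, which is one common convention; the record does not say which variant the authors used, and the function names are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(truth, predicted):
    """NMI with arithmetic-mean normalization (an assumed convention)."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")
    n = len(truth)
    joint = Counter(zip(truth, predicted))  # co-occurrence counts
    count_t = Counter(truth)
    count_p = Counter(predicted)
    # Mutual information: sum over pairs of p(t,p) * log(p(t,p) / (p(t) p(p)))
    mi = sum(
        (c / n) * log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    denom = (entropy(truth) + entropy(predicted)) / 2
    if denom == 0:
        # Both labelings put every point in a single cluster.
        return 1.0
    return mi / denom


truth = [0, 0, 1, 1]
same_up_to_renaming = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]
score_same = normalized_mutual_information(truth, same_up_to_renaming)
score_unrelated = normalized_mutual_information(truth, unrelated)
```

The score is invariant to how cluster labels are named, so a clustering that matches the ground truth up to relabeling scores 1, while a clustering that is statistically independent of the ground truth scores 0.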