Initialization of cluster refinement algorithms: a review and comparative study

Bibliographic Details
Published in: 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541), Vol. 1, pp. 297-302
Main Authors: He, Ji; Lan, Man; Tan, Chew-Lim; Sung, Sam-Yuan; Low, Hwee-Boon
Format: Conference Proceeding
Language: English
Published: Piscataway, NJ: IEEE, 2004
ISBN: 0780383591; 9780780383593
ISSN: 1098-7576
DOI: 10.1109/IJCNN.2004.1379917

More Information
Summary: Many iterative refinement clustering methods depend on the initial state of the model and can converge only to one of their local optima. Since finding the global optimum is NP-hard, studying initialization methods that lead toward a good sub-optimal solution is of great value. This paper reviews the cluster initialization methods in the literature by categorizing them into three major families: random sampling methods, distance optimization methods, and density estimation methods. In addition, using a set of quantitative measures, we assess their performance on a number of synthetic and real-life data sets. Our controlled benchmark identifies two distance optimization methods, SCS and KKZ, as complements to the learning characteristics of k-means, producing better cluster separation in the output solution.
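As an illustration of the distance optimization family highlighted in the summary, the following is a minimal Python sketch of a KKZ-style farthest-point initialization for k-means. It is not code from the paper; the function name kkz_init and the scikit-learn usage shown in the trailing comments are illustrative assumptions.

```python
import numpy as np

def kkz_init(X, k):
    """Pick k initial centers with a KKZ-style farthest-point heuristic.

    Sketch only: the first center is the point with the largest norm;
    each subsequent center is the point farthest from its nearest
    already-chosen center, spreading the seeds apart.
    """
    centers = [X[np.argmax(np.linalg.norm(X, axis=1))]]
    for _ in range(1, k):
        # Distance from every point to its closest chosen center.
        dists = np.linalg.norm(
            X[:, None, :] - np.asarray(centers)[None, :, :], axis=2
        ).min(axis=1)
        centers.append(X[np.argmax(dists)])
    return np.asarray(centers)

# Hypothetical usage: seed k-means with the KKZ centers instead of random ones.
# from sklearn.cluster import KMeans
# X = np.random.rand(500, 2)
# km = KMeans(n_clusters=5, init=kkz_init(X, 5), n_init=1).fit(X)
```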