K-Harmonic means type clustering algorithm for mixed datasets

Bibliographic Details
Published in	Applied soft computing Vol. 48; pp. 39 - 49
Main Authors	Ahmad, Amir; Hashmi, Sarosh
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.11.2016
Subjects	Categorical attributes; Clustering; K-Harmonic means clustering; Mixed data; Numeric attributes
ISSN	1568-4946 1872-9681
DOI	10.1016/j.asoc.2016.06.019

Summary:
• A K-Harmonic clustering algorithm for mixed data is presented to reduce the random initialization problem of partitional algorithms.
• The proposed clustering algorithm uses a distance measure developed for mixed datasets.
• The experimental results suggest that the clustering results are quite insensitive to random initialization.
• The proposed algorithm performed better than other clustering algorithms on various datasets.

K-means type clustering algorithms for mixed data, which consists of numeric and categorical attributes, suffer from the cluster center initialization problem: the final clustering results depend on the initial cluster centers. Random cluster center initialization is a popular initialization technique; however, the clustering results are not consistent across different cluster center initializations. The K-Harmonic means clustering algorithm tries to overcome this problem for pure numeric data. In this paper, we extend the K-Harmonic means clustering algorithm to mixed datasets. We propose a definition of a cluster center and a distance measure, which are used with the cost function of the K-Harmonic means clustering algorithm in the proposed algorithm. Experiments were carried out with pure categorical datasets and mixed datasets. The results suggest that the proposed clustering algorithm is quite insensitive to cluster center initialization. Comparative studies with other clustering algorithms show that the proposed algorithm produces better clustering results.
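
The summary refers to the cost function of K-Harmonic means, which replaces the minimum distance used by K-means with the harmonic mean of a point's distances to all cluster centers. The following is a minimal sketch of that cost, assuming the standard K-Harmonic means form (sum over points of K divided by the sum of inverse p-th powers of the distances). The names `khm_cost` and `mixed_distance` are hypothetical, and `mixed_distance` is only a simple stand-in (squared Euclidean on numeric attributes plus a weighted count of categorical mismatches); it is not the distance measure or cluster center definition proposed in the paper, which this record does not specify.

```python
import numpy as np


def mixed_distance(x_num, x_cat, c_num, c_cat, gamma=1.0):
    """Illustrative dissimilarity for one mixed record (an assumption,
    NOT the paper's measure): squared Euclidean distance on the numeric
    part plus gamma times the number of categorical mismatches."""
    diff = np.asarray(x_num, dtype=float) - np.asarray(c_num, dtype=float)
    num_part = float(np.sum(diff ** 2))
    cat_part = sum(a != b for a, b in zip(x_cat, c_cat))
    return num_part + gamma * cat_part


def khm_cost(distances, p=2.0, eps=1e-12):
    """K-Harmonic means cost for an (n_points, K) matrix of
    dissimilarities: sum_i K / sum_k (1 / d_ik ** p).
    Distances are clipped at eps to avoid division by zero."""
    d = np.maximum(np.asarray(distances, dtype=float), eps)
    k = d.shape[1]
    return float(np.sum(k / np.sum(d ** (-p), axis=1)))


# Usage: two mixed records, two hypothetical cluster centers.
points = [([1.0, 2.0], ["a", "b"]), ([5.0, 5.0], ["c", "c"])]
centers = [([1.0, 0.0], ["a", "c"]), ([5.0, 4.0], ["c", "c"])]
dist = np.array([[mixed_distance(xn, xc, cn, cc) for cn, cc in centers]
                 for xn, xc in points])
cost = khm_cost(dist, p=2.0)
```

Because every center contributes to every point's term, a poorly placed center still receives a gradient from distant points, which is the usual explanation for the reduced sensitivity to initialization that the summary reports.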