A fuzzy k-prototype clustering algorithm for mixed numeric and categorical data

Bibliographic Details
Published in	Knowledge-Based Systems, Vol. 30, pp. 129-135
Main Authors	Ji, Jinchao; Pang, Wei; Zhou, Chunguang; Han, Xiao; Wang, Zhe
Format	Journal Article
Language	English
Published	Elsevier B.V., 01.06.2012
Subjects	Algorithms; Attribute significance; Clustering; Data mining; Dissimilarity measure; Fuzzy clustering; Mixed data
ISSN	0950-7051 1872-7409
DOI	10.1016/j.knosys.2012.01.006

Summary:	In many applications, data objects are described by both numeric and categorical features. The k-prototype algorithm is one of the most important algorithms for clustering this type of data. However, it performs a hard partition, which may misclassify data objects that lie on the boundaries between regions, and its dissimilarity measure relies solely on a user-given parameter to adjust the significance of attributes. In this paper, we first combine the mean and the fuzzy centroid to represent the prototype of a cluster, and employ a new measure based on the co-occurrence of values to evaluate the dissimilarity between data objects and cluster prototypes. This measure also takes into account the significance of different attributes to the clustering process. We then present our algorithm for clustering mixed data. Finally, the performance of the proposed method is demonstrated by a series of experiments on four real-world datasets, in comparison with traditional clustering algorithms.
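The summary gives no formulas, so the following is only a rough sketch of the general idea of fuzzy clustering over mixed data, not the authors' algorithm. It assumes a fuzzy c-means style membership update, a membership-weighted mean for numeric features, and a "fuzzy centroid" stored as a weight distribution over the categories of each categorical feature. The categorical term used here (one minus the centroid weight of the object's category) and the fixed weight `gamma` are simplifying assumptions that stand in for the paper's co-occurrence-based measure and its attribute-significance weighting. All function and parameter names are hypothetical.

```python
import numpy as np

def fuzzy_memberships(dist, m=2.0, eps=1e-12):
    """Fuzzy c-means style update: memberships from an (n, k) dissimilarity matrix."""
    d = np.maximum(dist, eps)              # avoid division by zero
    inv = d ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def update_prototypes(X_num, X_cat, U, n_categories, m=2.0):
    """Numeric part: membership-weighted mean. Categorical part: per feature,
    a (k, n_cat) matrix of normalized membership-weighted category frequencies."""
    W = U ** m                                         # (n, k)
    means = (W.T @ X_num) / W.sum(axis=0)[:, None]     # (k, p)
    cat_centroids = []
    for j, n_cat in enumerate(n_categories):
        onehot = np.eye(n_cat)[X_cat[:, j]]            # (n, n_cat)
        counts = W.T @ onehot                          # (k, n_cat)
        cat_centroids.append(counts / counts.sum(axis=1, keepdims=True))
    return means, cat_centroids

def mixed_dissimilarity(X_num, X_cat, means, cat_centroids, gamma=1.0):
    """Squared Euclidean distance on numeric features plus gamma times a simple
    categorical term (ASSUMPTION: not the paper's co-occurrence measure)."""
    d_num = ((X_num[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)  # (n, k)
    d_cat = np.zeros_like(d_num)
    for j, centroid in enumerate(cat_centroids):
        # 1 minus the weight each cluster's centroid gives to the object's category
        d_cat += 1.0 - centroid[:, X_cat[:, j]].T
    return d_num + gamma * d_cat

def fuzzy_k_prototypes_sketch(X_num, X_cat, k, n_categories,
                              m=2.0, gamma=1.0, n_iter=50, seed=0):
    """Alternate prototype and membership updates for a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(k), size=X_num.shape[0])   # random soft initialization
    for _ in range(n_iter):
        means, cat_centroids = update_prototypes(X_num, X_cat, U, n_categories, m)
        dist = mixed_dissimilarity(X_num, X_cat, means, cat_centroids, gamma)
        U = fuzzy_memberships(dist, m)
    return U, means, cat_centroids
```

Here `X_num` is an `(n, p)` float array, `X_cat` an `(n, q)` integer array of category codes starting at 0, and `n_categories` lists the number of categories per categorical feature. Each row of the returned `U` sums to 1, so an object near a cluster boundary receives split memberships instead of a single hard label.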