An improved k-prototypes clustering algorithm for mixed numeric and categorical data

Bibliographic Details
Published in Neurocomputing (Amsterdam) Vol. 120; pp. 590-596
Main Authors Ji, Jinchao; Bai, Tian; Zhou, Chunguang; Ma, Chao; Wang, Zhe
Format Journal Article
Language English
Published Amsterdam Elsevier B.V 23.11.2013
Elsevier
ISSN 0925-2312
EISSN 1872-8286
DOI 10.1016/j.neucom.2013.04.011

Abstract Data objects with mixed numeric and categorical attributes are commonly encountered in the real world. The k-prototypes algorithm is one of the principal algorithms for clustering this type of data. In this paper, we propose an improved k-prototypes algorithm for clustering mixed data. We first introduce the concept of the distribution centroid to represent the prototype of the categorical attributes in a cluster. We then combine the mean with the distribution centroid to represent the prototype of a cluster with mixed attributes, and on that basis propose a new measure of the dissimilarity between data objects and cluster prototypes. This measure takes into account the significance of different attributes to the clustering process. Finally, we present our algorithm for clustering mixed data and demonstrate its performance in a series of experiments on four real-world datasets, in comparison with traditional clustering algorithms.

• We propose a new representation for the prototype of a cluster with mixed attributes.
• We give a new measure to assess the dissimilarity between data objects and prototypes.
• This measure considers the significance of each attribute to the clustering process.
• Our algorithm can calculate the significance of each attribute to the clustering.
• Our algorithm achieves better results in terms of clustering accuracy.
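The two ideas named in the abstract, a distribution centroid for categorical attributes and a weighted dissimilarity against a mixed prototype, can be illustrated with a minimal sketch. This record does not reproduce the paper's formulas, so everything below is an assumption made for illustration: the function names, the use of relative frequencies as the centroid, the squared difference on numeric attributes, the "one minus frequency" categorical term, and the externally supplied weights are not taken from the paper.

```python
from collections import Counter


def distribution_centroid(values):
    """Relative frequency of each category among one attribute's values in a cluster.

    Assumption: the centroid is modelled as a plain frequency table.
    """
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def mixed_dissimilarity(obj_num, obj_cat, proto_mean, proto_dist,
                        weights_num, weights_cat):
    """Weighted dissimilarity between one object and one mixed prototype.

    Numeric part: weighted squared difference from the cluster mean.
    Categorical part: weighted (1 - frequency of the object's category),
    so a category that is rare or absent in the cluster costs more.
    Both terms and the weights are illustrative assumptions.
    """
    numeric = sum(w * (x - m) ** 2
                  for w, x, m in zip(weights_num, obj_num, proto_mean))
    categorical = sum(w * (1.0 - dist.get(value, 0.0))
                      for w, value, dist in zip(weights_cat, obj_cat, proto_dist))
    return numeric + categorical
```

For a cluster whose single categorical attribute takes the values red, red, red, blue, the sketch gives the centroid {red: 0.75, blue: 0.25}. An object with numeric value 4.0 and category blue, compared with a prototype mean of 2.0 under unit weights, then scores 4.0 + 0.75 = 4.75. Raising a categorical weight makes mismatches on that attribute count for more, which is the role the abstract assigns to attribute significance.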
Author_xml – sequence: 1
  givenname: Jinchao
  surname: Ji
  fullname: Ji, Jinchao
  email: jinchao0374@163.com
  organization: College of Computer Science and Technology, Jilin University, Changchun 130012, China
– sequence: 2
  givenname: Tian
  surname: Bai
  fullname: Bai, Tian
  email: dayton915@gmail.com
  organization: College of Computer Science and Technology, Jilin University, Changchun 130012, China
– sequence: 3
  givenname: Chunguang
  surname: Zhou
  fullname: Zhou, Chunguang
  email: cgzhou@jlu.edu.cn
  organization: College of Computer Science and Technology, Jilin University, Changchun 130012, China
– sequence: 4
  givenname: Chao
  surname: Ma
  fullname: Ma, Chao
  email: billmach0913@gmail.com
  organization: College of Computer Science and Technology, Jilin University, Changchun 130012, China
– sequence: 5
  givenname: Zhe
  surname: Wang
  fullname: Wang, Zhe
  email: wzj0431@gmail.com
  organization: College of Computer Science and Technology, Jilin University, Changchun 130012, China
ContentType Journal Article
Copyright 2013 Elsevier B.V.
2014 INIST-CNRS
Discipline Computer Science
Applied Sciences
EISSN 1872-8286
EndPage 596
IsPeerReviewed true
IsScholarly true
Keywords Dissimilarity measure
Clustering
Data mining
Mixed data
Attribute significance
Cluster analysis
Data type
Data analysis
Prototype
Similarity
Cluster
Categorical data
Center of mass
Selection criterion
Language English
License CC BY 4.0
PageCount 7
PublicationDate 2013-11-23
PublicationPlace Amsterdam
PublicationTitle Neurocomputing (Amsterdam)
PublicationYear 2013
Publisher Elsevier B.V
Elsevier
StartPage 590
SubjectTerms Algorithms
Applied sciences
Attribute significance
Clustering
Computer science; control theory; systems
Data mining
Data processing. List processing. Character string processing
Dissimilarity measure
Exact sciences and technology
Memory organisation. Data processing
Mixed data
Software
Title An improved k-prototypes clustering algorithm for mixed numeric and categorical data
URI https://dx.doi.org/10.1016/j.neucom.2013.04.011
https://www.proquest.com/docview/1506385543
Volume 120