Non-dominated sorting genetic algorithm using fuzzy membership chromosome for categorical data clustering

Bibliographic Details
Published in: Applied soft computing Vol. 30; pp. 113 - 122
Main Authors: Yang, Chao-Lung; Kuo, R.J.; Chien, Chia-Hsuan; Quyen, Nguyen Thi Phuong
Format: Journal Article
Language: English
Published: Elsevier B.V 01.05.2015
ISSN: 1568-4946, 1872-9681
DOI: 10.1016/j.asoc.2015.01.031

Summary:
• Integrate the non-dominated Pareto sorting in multi-objective genetic algorithm.
• Consider both clustering compactness and fuzzy separation in the objective function.
• Utilize the fuzzy membership chromosome to reduce the computational time of solution selection.
• Evaluate the algorithm on the real-life datasets collected from UCI.

This research proposes a clustering algorithm for categorical data, the non-dominated sorting genetic algorithm-fuzzy membership chromosome (NSGA-FMC). Based on the K-modes method, it combines a fuzzy genetic algorithm with multi-objective optimization to improve clustering quality on categorical data. The proposed method uses fuzzy membership values as the chromosome. This chromosome setting enables a more efficient solution selection technique, which selects a solution from the non-dominated Pareto front based on the largest fuzzy membership. Two objective functions, fuzzy compactness within a cluster (π) and separation among clusters (sep), are used to optimize the clustering quality. A series of experiments on three UCI categorical datasets compared the clustering results of NSGA-FMC with two existing methods: genetic algorithm fuzzy K-modes (GA-FKM) and multi-objective genetic algorithm-based fuzzy clustering of categorical attributes (MOGA (π, sep)). Adjusted Rand index (ARI), π, sep, and computation time were used as performance indexes. The experimental results showed that the proposed method obtains better clustering quality in terms of ARI, π, and sep simultaneously, with shorter computation time.
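
The selection step described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that π is minimized and sep is maximized, and that "largest fuzzy membership" is scored as the mean, over objects, of each object's highest membership value; the record does not specify either detail. The names `dominates`, `pareto_front`, `mean_max_membership`, and `select_solution` are hypothetical.

```python
from typing import List, Sequence, Tuple

Score = Tuple[float, float]            # (pi, sep) for one candidate solution
Membership = Sequence[Sequence[float]]  # rows = objects, columns = clusters


def dominates(a: Score, b: Score) -> bool:
    """True if a dominates b. Assumption: pi is minimized, sep is maximized."""
    pi_a, sep_a = a
    pi_b, sep_b = b
    no_worse = pi_a <= pi_b and sep_a >= sep_b
    strictly_better = pi_a < pi_b or sep_a > sep_b
    return no_worse and strictly_better


def pareto_front(scores: Sequence[Score]) -> List[int]:
    """Indices of the solutions that no other solution dominates."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def mean_max_membership(u: Membership) -> float:
    """Assumed selection score: average of each object's largest membership."""
    return sum(max(row) for row in u) / len(u)


def select_solution(scores: Sequence[Score],
                    memberships: Sequence[Membership]) -> int:
    """Pick the non-dominated solution with the largest membership score."""
    front = pareto_front(scores)
    return max(front, key=lambda i: mean_max_membership(memberships[i]))


# Toy data: four candidate solutions, two objects, two clusters.
scores = [(1.0, 5.0), (2.0, 6.0), (2.5, 5.5), (1.5, 4.0)]
memberships = [
    [[0.6, 0.4], [0.7, 0.3]],
    [[0.9, 0.1], [0.8, 0.2]],
    [[1.0, 0.0], [1.0, 0.0]],  # crisp, but dominated on (pi, sep)
    [[0.5, 0.5], [0.5, 0.5]],
]
print(pareto_front(scores))                  # indices on the front
print(select_solution(scores, memberships))  # chosen solution index
```

Because the membership matrix is already the chromosome, a score of this kind needs no extra decoding step, which is one plausible reading of why the summary reports a cheaper selection.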