Initializing Partition-Optimization Algorithms


Bibliographic Details
Published in IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 6, no. 1, pp. 144-157
Main Author Maitra, R.
Format Journal Article
Language English
Published United States IEEE 01.01.2009
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1545-5963
1557-9964
DOI 10.1109/TCBB.2007.70244

Summary: Clustering datasets is a challenging problem needed in a wide array of applications. Partition-optimization approaches, such as k-means or expectation-maximization (EM) algorithms, are sub-optimal and find solutions in the vicinity of their initialization. This paper proposes a staged approach to specifying initial values by finding a large number of local modes and then obtaining representatives from the most separated ones. Results on test experiments are excellent. We also provide a detailed comparative assessment of the suggested algorithm with many commonly-used initialization approaches in the literature. Finally, the methodology is applied to two datasets on diurnal microarray gene expressions and industrial releases of mercury.
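
Illustration: the general idea of a staged initialization (first collect more candidate centers than needed, then keep the most separated ones) can be sketched roughly as below. This is not the paper's algorithm, whose details are not given in this record. It is a minimal sketch under assumptions: candidate "local modes" are approximated by a few k-means (Lloyd) iterations from randomly sampled points, and the "most separated" ones are chosen by a greedy maximin rule. The names `lloyd`, `maximin_select`, `staged_init` and the parameter `n_candidates` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd(points, centers, iters=20):
    """Plain k-means (Lloyd) iterations from the given starting centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # Each center moves to the mean of its group; an empty group keeps its center.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def maximin_select(candidates, k):
    """Greedily keep k candidates, each as far as possible from those already kept."""
    chosen = [candidates[0]]
    while len(chosen) < k:
        nxt = max(candidates, key=lambda c: min(sq_dist(c, s) for s in chosen))
        chosen.append(nxt)
    return chosen


def staged_init(points, k, n_candidates, seed=0):
    """Assumed two-stage scheme: many rough candidates, then the k most separated."""
    rng = random.Random(seed)
    starts = rng.sample(points, n_candidates)
    candidates = lloyd(points, starts, iters=5)  # stage 1: rough candidate centers
    return maximin_select(candidates, k)         # stage 2: most separated ones


# Synthetic data (made up for this sketch): three well-separated 2-D groups.
rng = random.Random(1)
true_means = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (mx + rng.gauss(0, 0.5), my + rng.gauss(0, 0.5))
    for mx, my in true_means
    for _ in range(50)
]

init = staged_init(points, k=3, n_candidates=15)
final = lloyd(points, init)  # k-means started from the staged initial values
```

Because k-means only refines the centers it is given, starting it from well-separated candidates is one plausible way to avoid two initial centers landing in the same group.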