Fuzzy C-means method for clustering microarray data
| Published in | Bioinformatics Vol. 19, no. 8, pp. 973-980 |
|---|---|
| Main Authors | , | 
| Format | Journal Article | 
| Language | English | 
| Published | Oxford: Oxford University Press, 22.05.2003. Oxford Publishing Limited (England); Oxford University Press (OUP) |
| Subjects | |
| ISSN | 1367-4803 1460-2059 1367-4811  | 
| DOI | 10.1093/bioinformatics/btg119 | 
| Summary: | Motivation: Clustering analysis of data from DNA microarray hybridization studies is essential for identifying biologically relevant groups of genes. Partitional clustering methods such as K-means or self-organizing maps assign each gene to a single cluster. However, these methods do not provide information about the influence of a given gene on the overall shape of clusters. Here we apply a fuzzy partitioning method, Fuzzy C-means (FCM), to attribute cluster membership values to genes. Results: A major problem in applying the FCM method to clustering microarray data is the choice of the fuzziness parameter m. We show that the commonly used value m = 2 is not appropriate for some data sets, and that optimal values for m vary widely from one data set to another. We propose an empirical method, based on the distribution of distances between genes in a given data set, to determine an adequate value for m. By setting threshold levels for the membership values, genes that are tightly associated with a given cluster can be selected. Using a yeast cell cycle data set as an example, we show that this selection increases the overall biological significance of the genes within the cluster. Availability: Supplementary text and Matlab functions are available at http://www-igbmc.u-strasbg.fr/fcm/ Contact: doulaye@titus.u-strasbg.fr |
|---|---|
| Bibliography: | ark:/67375/HXZ-TRSC97Z2-3; local:190973; istex:77327D2E4A6A032978E578CDC57C2EC6AD9DD4CF; PII:1460-2059 |
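
The summary describes two steps that a short example may make more concrete: computing fuzzy membership values with FCM for a chosen fuzziness parameter m, and keeping only the genes whose highest membership exceeds a threshold. The following is a minimal NumPy sketch of the standard FCM update rules, not the authors' Matlab functions. The names `fuzzy_c_means` and `select_core_members`, the toy data, and the values m = 1.5 and threshold = 0.7 are illustrative assumptions, and the sketch does not implement the paper's empirical method for choosing m.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard FCM on rows of X; returns (centroids, memberships).

    Hypothetical helper for illustration; m must be greater than 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centroids are means of the data weighted by u^m.
        W = U ** m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every row to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        # u_ik is proportional to d_ik^(-2/(m-1)); d2 is already squared.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centroids, U


def select_core_members(U, threshold=0.5):
    """Return each row's best cluster and a mask of rows above threshold."""
    labels = U.argmax(axis=1)
    keep = U.max(axis=1) >= threshold
    return labels, keep


# Toy stand-in for an expression matrix: 40 "genes" x 4 "conditions",
# drawn from two well-separated groups (synthetic data, not from the paper).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 4)), rng.normal(3.0, 0.3, (20, 4))])

centroids, U = fuzzy_c_means(X, c=2, m=1.5)
labels, keep = select_core_members(U, threshold=0.7)
print("genes kept per cluster:", np.bincount(labels[keep], minlength=2))
```

As m approaches 1 the memberships approach a hard partition, and as m grows they flatten toward 1/c for every gene, which is why a fixed threshold can select very different numbers of genes depending on m.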