Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm

Bibliographic Details
Published in Bioinformatics Vol. 22; no. 1; pp. 58-67
Main Authors Grotkjær, Thomas; Winther, Ole; Regenberg, Birgitte; Nielsen, Jens; Hansen, Lars Kai
Format Journal Article
Language English
Published Oxford Oxford University Press 01.01.2006
Oxford Publishing Limited (England)
ISSN 1367-4803
1460-2059
1367-4811
DOI 10.1093/bioinformatics/bti746

Abstract
Motivation: Hierarchical and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools in the display and analysis of whole genome DNA microarray expression data. However, the results of hierarchical clustering are sensitive to outliers, and most relocation methods give results that depend on the initialization of the algorithm. Therefore, it is difficult to assess the significance of the results. We have developed a consensus clustering algorithm, where the final result is averaged over multiple clustering runs, giving a robust and reproducible clustering, capable of capturing small signal variations. The algorithm preserves valuable properties of hierarchical clustering, which is useful for visualization and interpretation of the results.

Results: We show for the first time that one can take advantage of multiple clustering runs in DNA microarray analysis by collecting re-occurring clustering patterns in a co-occurrence matrix. The results show that consensus clustering obtained from clustering multiple times with Variational Bayes Mixtures of Gaussians or K-means significantly reduces the classification error rate for a simulated dataset. The method is flexible and it is possible to find consensus clusters from different clustering algorithms. Thus, the algorithm can be used as a framework to test in a quantitative manner the homogeneity of different clustering algorithms. We compare the method with a number of state-of-the-art clustering methods. It is shown that the method is robust and gives low classification error rates for a realistic, simulated dataset. The algorithm is also demonstrated for real datasets. It is shown that more biologically meaningful transcriptional patterns can be found without conservative statistical or fold-change exclusion of data.

Availability: Matlab source code for the clustering algorithm ClusterLustre, and the simulated dataset for testing, are available upon request from T.G. and O.W.
Contact: tg@biocentrum.dtu.dk and owi@imm.dtu.dk
Supplementary information: http://www.cmb.dtu.dk/
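
The co-occurrence idea in the abstract can be illustrated with a small sketch. This is not the authors' ClusterLustre code (which is Matlab and available only on request); it is a minimal Python illustration under stated assumptions. The function names `cooccurrence_matrix` and `consensus_clusters` are hypothetical, the three label vectors are made-up stand-ins for repeated K-means or mixture-model runs, and average linkage on the distance 1 - co-occurrence is an assumed choice for the final hierarchical step.

```python
from itertools import combinations


def cooccurrence_matrix(label_runs):
    """Fraction of runs in which items i and j fall in the same cluster.

    label_runs: one list of cluster labels per clustering run. Only equality
    of labels within a run is used, so label names may differ between runs.
    """
    n = len(label_runs[0])
    counts = [[0] * n for _ in range(n)]
    for labels in label_runs:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    runs = len(label_runs)
    return [[c / runs for c in row] for row in counts]


def consensus_clusters(cooc, k):
    """Average-linkage agglomeration on the distance 1 - co-occurrence.

    Assumed stand-in for the hierarchical step; stops at k clusters.
    """
    clusters = [[i] for i in range(len(cooc))]

    def avg_dist(a, b):
        return sum(1.0 - cooc[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of current clusters and merge them.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]


# Three hypothetical runs over four genes; the second run uses swapped labels.
runs = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
cooc = cooccurrence_matrix(runs)
groups = consensus_clusters(cooc, k=2)
print(groups)
```

Because the matrix only records whether two genes share a cluster in a given run, runs with permuted label names or from different clustering algorithms can be pooled, which is the flexibility the abstract refers to. Pairs that co-occur in every run get distance 0 and merge first.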
Author_xml – sequence: 1
  givenname: Thomas
  surname: Grotkjær
  fullname: Grotkjær, Thomas
  organization: Center for Microbial Biotechnology BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
– sequence: 2
  givenname: Ole
  surname: Winther
  fullname: Winther, Ole
  organization: Informatics and Mathematical Modelling, Building 321, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
– sequence: 3
  givenname: Birgitte
  surname: Regenberg
  fullname: Regenberg, Birgitte
  organization: Center for Microbial Biotechnology BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
– sequence: 4
  givenname: Jens
  surname: Nielsen
  fullname: Nielsen, Jens
  organization: Center for Microbial Biotechnology BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
– sequence: 5
  givenname: Lars Kai
  surname: Hansen
  fullname: Hansen, Lars Kai
  organization: Informatics and Mathematical Modelling, Building 321, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
BackLink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=17426177 (record in Pascal Francis)
https://www.ncbi.nlm.nih.gov/pubmed/16257984 (record in MEDLINE/PubMed)
CODEN BOINFP
CitedBy_id crossref_primary_10_1109_TCBB_2014_2359433
crossref_primary_10_1186_gb_2006_7_11_r108
crossref_primary_10_1089_omi_2008_0074
crossref_primary_10_1186_gb_2006_7_11_r107
crossref_primary_10_1093_bib_bbs057
crossref_primary_10_1109_TCBB_2008_33
crossref_primary_10_1109_TCBB_2012_108
crossref_primary_10_1016_j_ces_2007_08_017
crossref_primary_10_1016_j_csda_2009_05_031
crossref_primary_10_1002_widm_1053
crossref_primary_10_1089_omi_2009_0118
crossref_primary_10_1109_TNB_2012_2208198
crossref_primary_10_1111_j_1365_2958_2010_07209_x
crossref_primary_10_1186_1755_8794_6_41
crossref_primary_10_1198_106186007X237838
crossref_primary_10_1137_100804395
crossref_primary_10_1093_bioinformatics_btm463
crossref_primary_10_3389_fonc_2022_887318
crossref_primary_10_1016_j_ymeth_2015_03_017
crossref_primary_10_1016_j_segan_2018_100181
crossref_primary_10_1186_1471_2164_13_313
crossref_primary_10_1016_j_tim_2013_04_009
crossref_primary_10_1128_AEM_00548_06
crossref_primary_10_1186_gb_2009_10_5_r47
crossref_primary_10_1016_j_artmed_2008_07_014
crossref_primary_10_1186_1471_2164_9_341
crossref_primary_10_1002_cem_1048
crossref_primary_10_1038_s41598_021_00678_9
crossref_primary_10_1093_bioinformatics_btp521
crossref_primary_10_1371_journal_pone_0171429
crossref_primary_10_1109_TCBB_2016_2622692
crossref_primary_10_1038_msb_2011_80
Cites_doi 10.1093/bioinformatics/18.3.413
10.1023/A:1023949509487
10.1093/bioinformatics/18.2.275
10.2174/1389202043348472
10.1016/B978-155860797-2/50012-3
10.1073/pnas.95.25.14863
10.1093/bioinformatics/btg046
10.1093/bioinformatics/btf877
10.1093/bioinformatics/17.suppl_1.S22
10.1101/gr.180801
10.1091/mbc.11.12.4241
10.1111/j.2517-6161.1995.tb02031.x
10.1073/pnas.96.6.2907
10.1186/gb-2002-3-2-research0009
10.1038/10343
10.1074/jbc.M304478200
10.1126/science.278.5338.680
10.1165/ajrcmb.27.2.f247
10.1093/bioinformatics/btg107
10.1152/physiolgenomics.00139.2003
10.1073/pnas.132656399
10.1073/pnas.091062498
10.1093/bioinformatics/18.5.735
10.1101/gr.397002
10.1089/106652701753307485
10.1093/bioinformatics/btg245
10.1016/S0092-8674(00)00015-5
10.1093/bioinformatics/btg232
10.1080/01621459.1963.10500845
10.1038/75556
ContentType Journal Article
Copyright 2006 INIST-CNRS
Copyright Oxford University Press (England) Jan 1, 2006
Discipline Biology
EISSN 1460-2059
1367-4811
EndPage 67
ExternalDocumentID 1006449811
16257984
17426177
10_1093_bioinformatics_bti746
ark_67375_HXZ_CMBFXDXW_M
Genre Research Support, Non-U.S. Gov't
Journal Article
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Visualization
MATLAB software
DNA chip
Error rate
K means algorithm
Gene expression
Microarray
Algorithm
Capture
Clustering
Original document
Signal
Classification
Bioinformatics
Language English
License CC BY 4.0
Notes To whom correspondence should be addressed.
istex:455E6F40C2346CA981BBEA1EFCDF773BB84D3EA1
Associate Editor: Joaquin Dopazo
ark:/67375/HXZ-CMBFXDXW-M
PMID 16257984
PublicationCentury 2000
PublicationDate 2006-01-01
PublicationDateYYYYMMDD 2006-01-01
PublicationDate_xml – month: 01
  year: 2006
  text: 2006-01-01
  day: 01
PublicationDecade 2000
PublicationPlace Oxford
PublicationPlace_xml – name: Oxford
– name: England
PublicationTitle Bioinformatics
PublicationTitleAlternate Bioinformatics
PublicationYear 2006
Publisher Oxford University Press
Oxford Publishing Limited (England)
Publisher_xml – name: Oxford University Press
– name: Oxford Publishing Limited (England)
References Eisen (2023012408334145900_b9) 1998; 95
Bro (2023012408334145900_b6) 2003; 278
Sharan (2023012408334145900_b33) 2003; 19
Jones (2023012408334145900_b22) 2003; 16
MacKay (2023012408334145900_b25) 2003
Benjamini (2023012408334145900_b5) 1995; 57
Gasch (2023012408334145900_b13) 2000; 11
Hastie (2023012408334145900_b20) 2001
Dubey (2023012408334145900_b8) 2004
Reiner (2023012408334145900_b30) 2003; 19
Monti (2023012408334145900_b27) 2003; 52
Goldstein (2023012408334145900_b17) 2002; 12
Kaminski (2023012408334145900_b23) 2002; 27
Smet (2023012408334145900_b34) 2002; 18
Grotkjær (2023012408334145900_b18) 2004; 4
Rocke (2023012408334145900_b32) 2003; 19
Ward (2023012408334145900_b40) 1963; 58
Pan (2023012408334145900_b28) 2002; 3
DeRisi (2023012408334145900_b7) 1997; 278
Hansen (2023012408334145900_b19) 2000
Fred (2023012408334145900_b11) 2002
Tusher (2023012408334145900_b38) 2001; 98
Gibbons (2023012408334145900_b16) 2002; 12
Ghosh (2023012408334145900_b15) 2002; 18
Attias (2023012408334145900_b3) 2000
Ramoni (2023012408334145900_b29) 2002; 99
Kamvar (2023012408334145900_b24) 2002
Tavazoie (2023012408334145900_b37) 1999; 22
Venet (2023012408334145900_b39) 2003; 19
McLachlan (2023012408334145900_b26) 2002; 18
Ashburner (2023012408334145900_b2) 2001; 11
Geller (2023012408334145900_b14) 2003; 19
Falkenauer (2023012408334145900_b10) 2003
Tamayo (2023012408334145900_b36) 1999; 96
Hughes (2023012408334145900_b21) 2000; 102
Strehl (2023012408334145900_b35) 2002; 3
Rocke (2023012408334145900_b31) 2001; 8
Bar-Joseph (2023012408334145900_b4) 2001; 17
Fred (2023012408334145900_b12) 2003
Ashburner (2023012408334145900_b1) 2000; 25
References_xml – volume: 3
  start-page: 583
  year: 2002
  ident: 2023012408334145900_b35
  article-title: Cluster ensembles—a knowledge reuse framework for combining multiple partitions
  publication-title: J. Mach. Learn. Res.
– volume: 18
  start-page: 413
  year: 2002
  ident: 2023012408334145900_b26
  article-title: A mixture model-based approach to the clustering of microarray expression data
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/18.3.413
– volume: 52
  start-page: 91
  year: 2003
  ident: 2023012408334145900_b27
StartPage 58
SubjectTerms Algorithms
Biological and medical sciences
Cluster Analysis
Computational Biology - methods
Computer Simulation
Deoxyribonucleic acid
DNA
Fundamental and applied biological sciences. Psychology
Gene Expression Profiling
General aspects
Genes, Fungal
Genome
Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)
Models, Statistical
Normal Distribution
Oligonucleotide Array Sequence Analysis - methods
Open Reading Frames
Pattern Recognition, Automated
Relocation
Sequence Alignment
Title Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm
URI https://api.istex.fr/ark:/67375/HXZ-CMBFXDXW-M/fulltext.pdf
https://www.ncbi.nlm.nih.gov/pubmed/16257984
https://www.proquest.com/docview/198662854
https://www.proquest.com/docview/19437386
https://www.proquest.com/docview/67606353
Volume 22