Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm

Bibliographic Details
Published in	Bioinformatics Vol. 22; no. 1; pp. 58 - 67
Main Authors	Grotkjær, Thomas; Winther, Ole; Regenberg, Birgitte; Nielsen, Jens; Hansen, Lars Kai
Format	Journal Article
Language	English
Published	Oxford: Oxford University Press, 01.01.2006; Oxford Publishing Limited (England)
Subjects	Algorithms Biological and medical sciences Cluster Analysis Computational Biology - methods Computer Simulation Deoxyribonucleic acid DNA Fundamental and applied biological sciences. Psychology Gene Expression Profiling General aspects Genes, Fungal Genome Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects) Models, Statistical Normal Distribution Oligonucleotide Array Sequence Analysis - methods Open Reading Frames Pattern Recognition, Automated Relocation Sequence Alignment Visualization MATLAB software DNA chip Error rate K means algorithm Gene expression Microarray Algorithm Capture Clusterin Original document Signal Classification Bioinformatics
ISSN	1367-4803 1460-2059 1367-4811
DOI	10.1093/bioinformatics/bti746

Abstract	Motivation: Hierarchical and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools in the display and analysis of whole genome DNA microarray expression data. However, the results of hierarchical clustering are sensitive to outliers, and most relocation methods give results which are dependent on the initialization of the algorithm. Therefore, it is difficult to assess the significance of the results. We have developed a consensus clustering algorithm, where the final result is averaged over multiple clustering runs, giving a robust and reproducible clustering, capable of capturing small signal variations. The algorithm preserves valuable properties of hierarchical clustering, which is useful for visualization and interpretation of the results. Results: We show for the first time that one can take advantage of multiple clustering runs in DNA microarray analysis by collecting re-occurring clustering patterns in a co-occurrence matrix. The results show that consensus clustering obtained from clustering multiple times with Variational Bayes Mixtures of Gaussians or K-means significantly reduces the classification error rate for a simulated dataset. The method is flexible and it is possible to find consensus clusters from different clustering algorithms. Thus, the algorithm can be used as a framework to test in a quantitative manner the homogeneity of different clustering algorithms. We compare the method with a number of state-of-the-art clustering methods. It is shown that the method is robust and gives low classification error rates for a realistic, simulated dataset. The algorithm is also demonstrated for real datasets. It is shown that more biologically meaningful transcriptional patterns can be found without conservative statistical or fold-change exclusion of data. Availability: Matlab source code for the clustering algorithm ClusterLustre, and the simulated dataset for testing are available upon request from T.G. and O.W. Contact: tg@biocentrum.dtu.dk and owi@imm.dtu.dk Supplementary information: http://www.cmb.dtu.dk/
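The co-occurrence idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' ClusterLustre code, which is in Matlab and available only on request. Everything here is an assumption made for illustration: the function names, the toy data, the use of plain K-means as the base method, the choice of one run per possible pair of initial centers (instead of random restarts), the distance 1 - co-occurrence, and average linkage for the final hierarchical step.

```python
from itertools import combinations


def kmeans_labels(points, centers, iters=20):
    """Lloyd's K-means from the given initial centers; returns one label per point."""
    centers = list(centers)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(len(centers)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty center to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def cooccurrence_matrix(points, k):
    """Fraction of runs in which each pair of points lands in the same cluster."""
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    runs = 0
    # One run per choice of k data points as initial centers (hypothetical
    # scheme chosen for reproducibility; random restarts are the usual choice).
    for init in combinations(points, k):
        labels = kmeans_labels(points, init)
        runs += 1
        for i in range(n):
            for j in range(n):
                counts[i][j] += labels[i] == labels[j]
    return [[c / runs for c in row] for row in counts]


def consensus_clusters(cooc, n_clusters):
    """Average-linkage agglomeration using 1 - co-occurrence as the distance."""
    clusters = [[i] for i in range(len(cooc))]

    def dist(a, b):
        return sum(1.0 - cooc[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = combinations(range(len(clusters)), 2)
        x, y = min(pairs, key=lambda xy: dist(clusters[xy[0]], clusters[xy[1]]))
        clusters[x].extend(clusters.pop(y))  # y > x, so index x stays valid
    return clusters


# Toy data: two well-separated blocks of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
cooc = cooccurrence_matrix(points, k=2)
groups = consensus_clusters(cooc, n_clusters=2)
print(sorted(sorted(g) for g in groups))
```

Individual K-means runs may disagree depending on their starting centers. The averaged matrix only records how often two points end up together, so the final grouping depends on patterns that recur across runs rather than on any single initialization. On this toy data the two consensus groups are the two blocks of points.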
Author	Grotkjær, Thomas Hansen, Lars Kai Nielsen, Jens Regenberg, Birgitte Winther, Ole
Author_xml	– sequence: 1 givenname: Thomas surname: Grotkjær fullname: Grotkjær, Thomas organization: Center for Microbial Biotechnology BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark – sequence: 2 givenname: Ole surname: Winther fullname: Winther, Ole organization: Informatics and Mathematical Modelling, Building 321, Technical University of Denmark DK-2800 Kgs. Lyngby, Denmark – sequence: 3 givenname: Birgitte surname: Regenberg fullname: Regenberg, Birgitte organization: Center for Microbial Biotechnology BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark – sequence: 4 givenname: Jens surname: Nielsen fullname: Nielsen, Jens organization: Center for Microbial Biotechnology BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark – sequence: 5 givenname: Lars Kai surname: Hansen fullname: Hansen, Lars Kai organization: Informatics and Mathematical Modelling, Building 321, Technical University of Denmark DK-2800 Kgs. Lyngby, Denmark
BackLink	http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=17426177 (record in Pascal Francis) https://www.ncbi.nlm.nih.gov/pubmed/16257984 (record in MEDLINE/PubMed)
CODEN	BOINFP
ContentType	Journal Article
Copyright	2006 INIST-CNRS Copyright Oxford University Press(England) Jan 1, 2006
DatabaseName	Istex CrossRef Pascal-Francis Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed Aluminium Industry Abstracts Biotechnology Research Abstracts Ceramic Abstracts Computer and Information Systems Abstracts Corrosion Abstracts Electronics & Communications Abstracts Engineered Materials Abstracts Materials Business File Mechanical & Transportation Engineering Abstracts Nucleic Acids Abstracts Oncogenes and Growth Factors Abstracts Solid State and Superconductivity Abstracts METADEX Technology Research Database ANTE: Abstracts in New Technology & Engineering Engineering Research Database Aerospace Database Copper Technical Reference Library AIDS and Cancer Research Abstracts Materials Research Database ProQuest Computer Science Collection ProQuest Health & Medical Complete (Alumni) Civil Engineering Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional Biotechnology and BioEngineering Abstracts MEDLINE - Academic
Discipline	Biology
EISSN	1460-2059 1367-4811
EndPage	67
ExternalDocumentID	1006449811 16257984 17426177 10_1093_bioinformatics_bti746 ark_67375_HXZ_CMBFXDXW_M
Genre	Research Support, Non-U.S. Gov't Journal Article
IsPeerReviewed	true
IsScholarly	true
Issue	1
Keywords	Visualization MATLAB software DNA chip Error rate K means algorithm Gene expression Microarray Algorithm Capture Clusterin Original document Signal Classification Bioinformatics
Language	English
License	CC BY 4.0
Notes	To whom correspondence should be addressed. Associate Editor: Joaquin Dopazo
PMID	16257984
PageCount	10
PublicationCentury	2000
PublicationDate	2006-01-01
PublicationPlace	Oxford
PublicationTitle	Bioinformatics
PublicationYear	2006
Publisher	Oxford University Press Oxford Publishing Limited (England)
doi: 10.1101/gr.397002 – volume: 8 start-page: 557 year: 2001 ident: 2023012408334145900_b31 article-title: A model for measurement error for gene expression arrays publication-title: J. Comput. Biol. doi: 10.1089/106652701753307485 – volume: 19 start-page: 1817 year: 2003 ident: 2023012408334145900_b14 article-title: Transformation and normalization of oligonucleotide microarray data publication-title: Bioinformatics doi: 10.1093/bioinformatics/btg245 – volume: 102 start-page: 109 year: 2000 ident: 2023012408334145900_b21 article-title: Functional discovery via a compendium of expression profiles publication-title: Cell doi: 10.1016/S0092-8674(00)00015-5 – volume: 19 start-page: 1787 year: 2003 ident: 2023012408334145900_b33 article-title: CLICK and EXPANDER: a system for clustering and visualizing gene expression data publication-title: Bioinformatics doi: 10.1093/bioinformatics/btg232 – volume: 58 start-page: 236 year: 1963 ident: 2023012408334145900_b40 article-title: Hierarchical grouping to optimize an objective function publication-title: J. Am. Stat. Assoc. doi: 10.1080/01621459.1963.10500845 – volume: 25 start-page: 25 year: 2000 ident: 2023012408334145900_b1 article-title: Gene Ontology: tool for the unification of biology. the gene ontology consortium publication-title: Nat. Genet. doi: 10.1038/75556 – start-page: 128 year: 2003 ident: 2023012408334145900_b12 article-title: Robust data clustering
Snippet	Motivation: Hierarchical and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools in the display and analysis of whole...
StartPage	58
SubjectTerms	Algorithms Biological and medical sciences Cluster Analysis Computational Biology - methods Computer Simulation Deoxyribonucleic acid DNA Fundamental and applied biological sciences. Psychology Gene Expression Profiling General aspects Genes, Fungal Genome Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects) Models, Statistical Normal Distribution Oligonucleotide Array Sequence Analysis - methods Open Reading Frames Pattern Recognition, Automated Relocation Sequence Alignment
Title	Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm
URI	https://api.istex.fr/ark:/67375/HXZ-CMBFXDXW-M/fulltext.pdf https://www.ncbi.nlm.nih.gov/pubmed/16257984 https://www.proquest.com/docview/198662854 https://www.proquest.com/docview/19437386 https://www.proquest.com/docview/67606353
Volume	22
linkProvider	Oxford University Press