Integrative clustering methods for high-dimensional molecular data

Bibliographic Details
Published in Translational cancer research Vol. 3; no. 3; pp. 202 - 216
Main Authors Chalise, Prabhakar; Koestler, Devin C; Bimali, Milan; Yu, Qing; Fridley, Brooke L
Format Journal Article
Language English
Published China 01.06.2014
ISSN 2218-676X
EISSN 2219-6803
DOI 10.3978/j.issn.2218-676X.2014.06.03

Abstract High-throughput 'omic' data, such as gene expression, DNA methylation, and DNA copy number, have played an instrumental role in furthering our understanding of the molecular basis of human health and disease. Because cells with similar morphological characteristics can exhibit entirely different molecular profiles, and because these discrepancies might further our understanding of patient-level variability in clinical outcomes, there is significant interest in using high-throughput 'omic' data to identify novel molecular subtypes of a disease. While numerous clustering methods have been proposed for identifying molecular subtypes, most were developed for a single 'omic' data type and may not be appropriate when more than one 'omic' data type is collected on study subjects. Given that complex diseases, such as cancer, arise as a result of genomic, epigenomic, transcriptomic, and proteomic alterations, integrative clustering methods for the simultaneous clustering of multiple 'omic' data types have great potential to aid in molecular subtype discovery. Traditionally, data integration has been performed manually and ad hoc, by combining the results obtained from clustering each 'omic' data type separately on the same set of patient samples. However, such methods often result in inconsistent assignment of subjects to the molecular cancer subtypes. Recently, several methods have been proposed in the literature that offer a rigorous framework for the simultaneous integration of multiple 'omic' data types in a single comprehensive analysis. In this paper, we present a systematic review of existing integrative clustering methods.
Author Affiliation Department of Biostatistics, University of Kansas Medical Center, Kansas City, KS 66160, USA (all five authors: Chalise, Prabhakar; Koestler, Devin C; Bimali, Milan; Yu, Qing; Fridley, Brooke L)
Copyright Pioneer Bioscience Publishing Company. All rights reserved.
Grant Information NCI NIH HHS, grant R21 CA182715; NCI NIH HHS, grant P30 CA168524; NIGMS NIH HHS, grant P20 GM103418
Keywords cophenetic correlation; mixture models; latent models; non-negative matrix factorization; consensus clustering
PMID 25243110
PageCount 15
URI https://www.ncbi.nlm.nih.gov/pubmed/25243110
https://pubmed.ncbi.nlm.nih.gov/PMC4166480