COMBING: Clustering in Oncology for Mathematical and Biological Identification of Novel Gene Signatures

Bibliographic Details
Published in IEEE/ACM Transactions on Computational Biology and Bioinformatics Vol. 19; no. 6; pp. 3317-3331
Main Authors Battistella, Enzo; Vakalopoulou, Maria; Sun, Roger; Estienne, Theo; Lerousseau, Marvin; Nikolaev, Sergey; Andres, Emilie Alvarez; Carre, Alexandre; Niyoteka, Stephane; Robert, Charlotte; Paragios, Nikos; Deutsch, Eric
Format Journal Article
Language English
Published United States IEEE 01.11.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Institute of Electrical and Electronics Engineers
ISSN 1545-5963
1557-9964
2374-0043
DOI 10.1109/TCBB.2021.3123910

Abstract Precision medicine is a paradigm shift in healthcare relying heavily on genomics data. However, the complexity of biological interactions, the large number of genes as well as the lack of comparisons on the analysis of data, remain a tremendous bottleneck regarding clinical adoption. In this paper, we introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers. Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm. It offers modularity as concerns metric functions and scalability, while being able to automatically determine the best number of clusters. Our evaluation includes both mathematical and biological criteria to define a quantitative metric. The recovered signature is applied to a variety of biological tasks, including screening of biological pathways and functions, and characterization relevance on tumor types and subtypes. Quantitative comparisons among different distance metrics, commonly used clustering methods and a referential gene signature used in the literature, confirm state of the art performance of our approach. In particular, our signature, based on 27 genes, reports at least 30 times better mathematical significance (average Dunn's Index) and 25% better biological significance (average Enrichment in Protein-Protein Interaction) than those produced by other referential clustering methods. Finally, our signature reports promising results on distinguishing immune inflammatory and immune desert tumors, while reporting a high balanced accuracy of 92% on tumor types classification and averaged balanced accuracy of 68% on tumor subtypes classification, which represents, respectively 7% and 9% higher performance compared to the referential signature.
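Note The abstract names two of its evaluation measures, Dunn's Index and balanced accuracy, without defining them. The following is a minimal Python sketch of their textbook definitions, not the authors' implementation. The function names, the choice of Euclidean distance, and the toy data are assumptions made for illustration only; the record does not state which distance or averaging scheme the paper uses.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Textbook Dunn's index: smallest between-cluster distance divided by
    the largest within-cluster diameter (Euclidean distance is an assumption)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("Dunn's index needs at least two clusters")

    # Largest distance between two points that share a cluster.
    max_diameter = max(
        (math.dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    if max_diameter == 0.0:
        raise ValueError("every cluster has zero diameter; index is undefined")

    # Smallest distance between two points that belong to different clusters.
    min_separation = min(
        math.dist(a, b)
        for (_, first), (_, second) in combinations(clusters.items(), 2)
        for a in first
        for b in second
    )
    return min_separation / max_diameter


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Toy data (hypothetical): two tight clusters that lie far apart.
    pts = [(0, 0), (0, 1), (5, 0), (5, 1)]
    print(dunn_index(pts, [0, 0, 1, 1]))  # 5.0
    print(balanced_accuracy(["a", "a", "a", "b"], ["a", "a", "b", "b"]))  # 0.8333...
```

A higher Dunn's index indicates compact, well-separated clusters. Balanced accuracy weights every class equally, which matters when tumor types or subtypes are unevenly represented.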
Author_xml – sequence: 1
  givenname: Enzo
  orcidid: 0000-0001-7053-5666
  surname: Battistella
  fullname: Battistella, Enzo
  email: en.battistella@gmail.com
  organization: CentraleSupélec, Mathématiques et Informatique pour la Complexité et les Systèmes, Université Paris-Saclay, Gif-sur-Yvette, France
– sequence: 2
  givenname: Maria
  orcidid: 0000-0003-0791-1264
  surname: Vakalopoulou
  fullname: Vakalopoulou, Maria
  email: maria.vakalopoulou@centralesupelec.fr
  organization: CentraleSupélec, Mathématiques et Informatique pour la Complexité et les Systèmes, Université Paris-Saclay, Gif-sur-Yvette, France
– sequence: 3
  givenname: Roger
  orcidid: 0000-0001-9866-6449
  surname: Sun
  fullname: Sun, Roger
  email: roger.sun@gustaveroussy.fr
  organization: Gustave Roussy-CentraleSupélec-TheraPanacea, Noesia Center of Artificial Intelligence in Radiation Therapy and Oncology, Gustave Roussy Cancer Campus, Villejuif, France
– sequence: 4
  givenname: Theo
  orcidid: 0000-0002-7569-5033
  surname: Estienne
  fullname: Estienne, Theo
  email: theo.estienne@centralesupelec.fr
  organization: CentraleSupélec, Mathématiques et Informatique pour la Complexité et les Systèmes, Université Paris-Saclay, Gif-sur-Yvette, France
– sequence: 5
  givenname: Marvin
  orcidid: 0000-0001-6421-708X
  surname: Lerousseau
  fullname: Lerousseau, Marvin
  email: marvin.lerousseau@gmail.com
  organization: CentraleSupélec, Mathématiques et Informatique pour la Complexité et les Systèmes, Université Paris-Saclay, Gif-sur-Yvette, France
– sequence: 6
  givenname: Sergey
  orcidid: 0000-0001-8587-2307
  surname: Nikolaev
  fullname: Nikolaev, Sergey
  email: sergey.nikolaev@gustaveroussy.fr
  organization: Institut Gustave Roussy, Inserm U1030 Molecular Radiotherapy and Innovative Therapeutics, Université Paris-Saclay, Villejuif, France
– sequence: 7
  givenname: Emilie Alvarez
  orcidid: 0000-0002-2441-1766
  surname: Andres
  fullname: Andres, Emilie Alvarez
  email: emilie.alvarez-andres@u-psud.fr
  organization: Gustave Roussy-CentraleSupélec-TheraPanacea, Noesia Center of Artificial Intelligence in Radiation Therapy and Oncology, Gustave Roussy Cancer Campus, Villejuif, France
– sequence: 8
  givenname: Alexandre
  orcidid: 0000-0002-4793-835X
  surname: Carre
  fullname: Carre, Alexandre
  email: alexandre.carre@gustaveroussy.fr
  organization: Gustave Roussy-CentraleSupélec-TheraPanacea, Noesia Center of Artificial Intelligence in Radiation Therapy and Oncology, Gustave Roussy Cancer Campus, Villejuif, France
– sequence: 9
  givenname: Stephane
  orcidid: 0000-0001-6360-7086
  surname: Niyoteka
  fullname: Niyoteka, Stephane
  email: Stephane.NIYOTEKA@gustaveroussy.fr
  organization: Gustave Roussy-CentraleSupélec-TheraPanacea, Noesia Center of Artificial Intelligence in Radiation Therapy and Oncology, Gustave Roussy Cancer Campus, Villejuif, France
– sequence: 10
  givenname: Charlotte
  orcidid: 0000-0001-8838-1343
  surname: Robert
  fullname: Robert, Charlotte
  email: ch.robert@gustaveroussy.fr
  organization: Gustave Roussy-CentraleSupélec-TheraPanacea, Noesia Center of Artificial Intelligence in Radiation Therapy and Oncology, Gustave Roussy Cancer Campus, Villejuif, France
– sequence: 11
  givenname: Nikos
  orcidid: 0000-0002-9668-4763
  surname: Paragios
  fullname: Paragios, Nikos
  email: nikos.paragios@ecp.fr
  organization: Gustave Roussy-CentraleSupélec-TheraPanacea, Noesia Center of Artificial Intelligence in Radiation Therapy and Oncology, Gustave Roussy Cancer Campus, Villejuif, France
– sequence: 12
  givenname: Eric
  orcidid: 0000-0002-8223-3697
  surname: Deutsch
  fullname: Deutsch, Eric
  email: eric.deutsch@gustaveroussy.fr
  organization: Gustave Roussy-CentraleSupélec-TheraPanacea, Noesia Center of Artificial Intelligence in Radiation Therapy and Oncology, Gustave Roussy Cancer Campus, Villejuif, France
BackLink https://www.ncbi.nlm.nih.gov/pubmed/34714749 (MEDLINE/PubMed record)
https://centralesupelec.hal.science/hal-03530265 (HAL record)
CODEN ITCBCY
CitedBy_id crossref_primary_10_1093_bioinformatics_btae341
crossref_primary_10_1038_s41746_025_01471_y
crossref_primary_10_3390_diagnostics14141472
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Distributed under a Creative Commons Attribution 4.0 International License
Discipline Biology
Computer Science
EISSN 1557-9964
EndPage 3331
ExternalDocumentID oai:HAL:hal-03530265v1
34714749
10_1109_TCBB_2021_3123910
9594710
Genre orig-research
Research Support, Non-U.S. Gov't
Journal Article
GrantInformation_xml – fundername: ARC
  grantid: SIGNIT201801286
– fundername: Fondation pour la Recherche Médicale
  grantid: DIC20161236437
  funderid: 10.13039/501100002915
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords Biomarkers
Multi-tumor association
Clustering
Predictive Signature
Genomics
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0
other-oa
OpenAccessLink https://proxy.k.utb.cz/login?url=https://centralesupelec.hal.science/hal-03530265
PMID 34714749
PQID 2748563820
PQPubID 85499
PageCount 15
PublicationDate 2022-11-01
PublicationDateYYYYMMDD 2022-11-01
PublicationDate_xml – month: 11
  year: 2022
  text: 2022-11-01
  day: 01
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: New York
PublicationTitle IEEE/ACM transactions on computational biology and bioinformatics
PublicationTitleAbbrev TCBB
PublicationTitleAlternate IEEE/ACM Trans Comput Biol Bioinform
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Institute of Electrical and Electronics Engineers
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
– name: Institute of Electrical and Electronics Engineers
StartPage 3317
SubjectTerms Accuracy
Algorithms
Bioinformatics
Biology
Biomarkers
Classification
Cluster Analysis
Clustering
Clustering algorithms
Computer Science
Gene Expression Profiling - methods
Genes
Genomics
Humans
Indexes
Inflammation
Mathematical analysis
Measurement
Modularity
multi-tumor association
Neoplasms - genetics
Pattern Recognition, Automated - methods
Precision medicine
predictive signature
Protein interaction
Proteins
Tumors
Title COMBING: Clustering in Oncology for Mathematical and Biological Identification of Novel Gene Signatures
URI https://ieeexplore.ieee.org/document/9594710
https://www.ncbi.nlm.nih.gov/pubmed/34714749
https://www.proquest.com/docview/2748563820
https://www.proquest.com/docview/2590086246
https://centralesupelec.hal.science/hal-03530265
Volume 19