Investigating the Efficacy of Nonlinear Dimensionality Reduction Schemes in Classifying Gene and Protein Expression Studies


Bibliographic Details
Published in IEEE/ACM transactions on computational biology and bioinformatics Vol. 5; no. 3; pp. 368-384
Main Authors Lee, George; Rodriguez, Carlos; Madabhushi, Anant
Format Journal Article
Language English
Published United States IEEE 01.07.2008
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1545-5963
1557-9964
2374-0043
DOI 10.1109/TCBB.2008.36

Abstract The recent explosion in procurement and availability of high-dimensional gene and protein expression profile data sets for cancer diagnostics has necessitated the development of sophisticated machine learning tools with which to analyze them. While some investigators are focused on identifying informative genes and proteins that play a role in specific diseases, other researchers have attempted instead to use patients' expression profiles to prognosticate disease status. A major limitation in the ability to accurately classify these high-dimensional data sets stems from the "curse of dimensionality," occurring in situations where the number of genes or peptides significantly exceeds the total number of patient samples. Previous attempts at dealing with this issue have mostly centered on the use of a dimensionality reduction (DR) scheme, Principal Component Analysis (PCA), to obtain a low-dimensional projection of the high-dimensional data. However, linear PCA and other linear DR methods, which rely on Euclidean distances to estimate object similarity, do not account for the inherent underlying nonlinear structure associated with most biomedical data. While some researchers have begun to explore nonlinear DR methods for computer vision problems such as face detection and recognition, to the best of our knowledge, few such attempts have been made for classification and visualization of high-dimensional biomedical data. The motivation behind this work is to identify the appropriate DR methods for analysis of high-dimensional gene and protein expression studies. Toward this end, we empirically and rigorously compare three nonlinear (Isomap, Locally Linear Embedding, and Laplacian Eigenmaps) and three linear DR schemes (PCA, Linear Discriminant Analysis, and Multidimensional Scaling) with the intent of determining a reduced subspace representation in which the individual object classes are more easily discriminable.
Owing to the inherent nonlinear structure of gene and protein expression studies, our claim is that the nonlinear DR methods provide a more truthful low-dimensional representation of the data compared to the linear DR schemes. Evaluation of the DR schemes was done by 1) assessing the discriminability of two supervised classifiers (Support Vector Machine and C4.5 Decision Trees) in the different low-dimensional data embeddings and 2) applying five cluster validity measures to evaluate the size, distance, and tightness of object aggregates in the low-dimensional space. For each of the seven evaluation measures considered, statistically significant improvement in the quality of the embeddings across 10 cancer data sets via the use of three nonlinear DR schemes over three linear DR techniques was observed. Similar trends were observed when linear and nonlinear DR was applied to the high-dimensional data following feature pruning to isolate the most informative features. Qualitative evaluation of the low-dimensional data embedding obtained via the six DR methods further suggests that the nonlinear schemes are better able to identify potential novel classes (e.g., cancer subtypes) within the data.
Author Rodriguez, Carlos
Madabhushi, Anant
Lee, George
AuthorAffiliation 1 Rutgers, The State University of New Jersey, Department of Biomedical Engineering, Piscataway, NJ 08854, USA
2 University of Puerto Rico, Mayagüez, PR 00681-9000
Author_xml – sequence: 1
  givenname: George
  surname: Lee
  fullname: Lee, George
  email: geolee@eden.rutgers.edu
  organization: Rutgers University, Piscataway
– sequence: 2
  givenname: Carlos
  surname: Rodriguez
  fullname: Rodriguez, Carlos
  email: carlos@evri.com
  organization: University of Puerto Rico, Mayagüez
– sequence: 3
  givenname: Anant
  surname: Madabhushi
  fullname: Madabhushi, Anant
  email: anantm@rci.rutgers.edu
  organization: Rutgers University, Piscataway
BackLink https://www.ncbi.nlm.nih.gov/pubmed/18670041
CODEN ITCBCY
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2008
Discipline Biology
EISSN 1557-9964
EndPage 384
ExternalDocumentID oai:pubmedcentral.nih.gov:2562675
PMC2562675
2328962871
18670041
10_1109_TCBB_2008_36
4492764
Genre orig-research
Research Support, Non-U.S. Gov't
Journal Article
Research Support, N.I.H., Extramural
GrantInformation_xml – fundername: NCI NIH HHS
  grantid: R21 CA127186
– fundername: NCI NIH HHS
  grantid: R03 CA128081
– fundername: NCI NIH HHS
  grantid: R03CA128081-01
– fundername: NCI NIH HHS
  grantid: R21CA127186-01
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords Bioinformatics (genome or protein) databases
Clustering, classification, and association rules
Feature extraction or construction
Data and knowledge visualization
Data mining
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
Notes These datasets were downloaded from the Biomedical Kent-Ridge Repositories at http://sdmc.lit.org.sg/GEDatasets/Datasets and http://sdmc.i2r.a-star.edu.sg/rp, and from the Gene Expression Omnibus (GEO) Repository at http://www.ncbi.nlm.nih.gov/geo/.
PMID 18670041
PageCount 17
PublicationCentury 2000
PublicationDate 2008-07-01
PublicationDateYYYYMMDD 2008-07-01
PublicationDate_xml – month: 07
  year: 2008
  text: 2008-07-01
  day: 01
PublicationDecade 2000
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: New York
PublicationTitle IEEE/ACM transactions on computational biology and bioinformatics
PublicationTitleAbbrev TCBB
PublicationTitleAlternate IEEE/ACM Trans Comput Biol Bioinform
PublicationYear 2008
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 368
SubjectTerms Algorithms
Availability
Bioinformatics
Bioinformatics (genome or protein) databases
Biomedical measurements
Cancer
Clustering, classification, and association rules
Data and knowledge visualization
Data Interpretation, Statistical
Data mining
Decision trees
Discriminant analysis
Diseases
Explosions
Feature extraction or construction
Gene Expression Profiling - methods
Machine learning
Nonlinear Dynamics
Pattern Recognition, Automated - methods
Principal component analysis
Procurement
Protein engineering
Reproducibility of Results
Sensitivity and Specificity
Software
Studies
Title Investigating the Efficacy of Nonlinear Dimensionality Reduction Schemes in Classifying Gene and Protein Expression Studies
URI https://ieeexplore.ieee.org/document/4492764
https://www.ncbi.nlm.nih.gov/pubmed/18670041
https://www.proquest.com/docview/863293417
https://www.proquest.com/docview/20497689
https://www.proquest.com/docview/69370057
https://www.proquest.com/docview/869584247
https://pubmed.ncbi.nlm.nih.gov/PMC2562675
http://doi.org/10.1109/TCBB.2008.36
UnpaywallVersion submittedVersion
Volume 5