Investigating the Efficacy of Nonlinear Dimensionality Reduction Schemes in Classifying Gene and Protein Expression Studies
The recent explosion in procurement and availability of high-dimensional gene- and protein-expression profile datasets for cancer diagnostics has necessitated the development of sophisticated machine learning tools with which to analyze them. A major limitation in the ability to accurately classify th...
| Published in | IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 5, no. 3, pp. 368-384 |
|---|---|
| Main Authors | Lee, George; Rodriguez, Carlos; Madabhushi, Anant |
| Format | Journal Article |
| Language | English |
| Published | United States; IEEE; 01.07.2008; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN | 1545-5963, 1557-9964, 2374-0043 |
| DOI | 10.1109/TCBB.2008.36 |
| Abstract | The recent explosion in procurement and availability of high-dimensional gene- and protein-expression profile datasets for cancer diagnostics has necessitated the development of sophisticated machine learning tools with which to analyze them. While some investigators are focused on identifying informative genes and proteins that play a role in specific diseases, other researchers have attempted instead to use patients' expression profiles to prognosticate disease status. A major limitation in the ability to accurately classify these high-dimensional datasets stems from the 'curse of dimensionality', occurring in situations where the number of genes or peptides significantly exceeds the total number of patient samples. Previous attempts at dealing with this issue have mostly centered on the use of a dimensionality reduction (DR) scheme, Principal Component Analysis (PCA), to obtain a low-dimensional projection of the high-dimensional data. However, linear PCA and other linear DR methods, which rely on Euclidean distances to estimate object similarity, do not account for the inherent underlying nonlinear structure associated with most biomedical data. While some researchers have begun to explore nonlinear DR methods for computer vision problems such as face detection and recognition, to the best of our knowledge, few such attempts have been made for classification and visualization of high-dimensional biomedical data. The motivation behind this work is to identify the appropriate DR methods for analysis of high-dimensional gene- and protein-expression studies. Towards this end, we empirically and rigorously compare three nonlinear (Isomap, Locally Linear Embedding, Laplacian Eigenmaps) and three linear DR schemes (PCA, Linear Discriminant Analysis, Multidimensional Scaling) with the intent of determining a reduced subspace representation in which the individual object classes are more easily discriminable. Owing to the inherent nonlinear structure of gene- and protein-expression studies, our claim is that the nonlinear DR methods provide a more truthful low-dimensional representation of the data compared to the linear DR schemes. Evaluation of the DR schemes was done by (i) assessing the discriminability of two supervised classifiers (Support Vector Machine and C4.5 Decision Trees) in the different low-dimensional data embeddings and (ii) using five cluster validity measures to evaluate the size, distance, and tightness of object aggregates in the low-dimensional space. For each of the seven evaluation measures considered, statistically significant improvement in the quality of the embeddings across 10 cancer datasets via the use of three nonlinear DR schemes over three linear DR techniques was observed. Similar trends were observed when linear and nonlinear DR was applied to the high-dimensional data following feature pruning to isolate the most informative features. Qualitative evaluation of the low-dimensional data embedding obtained via the six DR methods further suggests that the nonlinear schemes are better able to identify potential novel classes (e.g., cancer subtypes) within the data. |
| Author | Lee, George; Rodriguez, Carlos; Madabhushi, Anant |
| AuthorAffiliation | 1 Rutgers, The State University of New Jersey, Department of Biomedical Engineering, Piscataway, NJ 08854, USA; 2 University of Puerto Rico, Mayagüez, PR 00681-9000 |
| Author_xml | 1. Lee, George (geolee@eden.rutgers.edu), Rutgers University, Piscataway; 2. Rodriguez, Carlos (carlos@evri.com), University of Puerto Rico, Mayagüez; 3. Madabhushi, Anant (anantm@rci.rutgers.edu), Rutgers University, Piscataway |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/18670041 (record in MEDLINE/PubMed) |
| CODEN | ITCBCY |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2008 |
| Discipline | Biology |
| EISSN | 1557-9964 |
| EndPage | 384 |
| ExternalDocumentID | oai:pubmedcentral.nih.gov:2562675 PMC2562675 2328962871 18670041 10_1109_TCBB_2008_36 4492764 |
| Genre | orig-research; Research Support, Non-U.S. Gov't; Journal Article; Research Support, N.I.H., Extramural |
| GrantInformation_xml | NCI NIH HHS: R21 CA127186; R03 CA128081; R03CA128081-01; R21CA127186-01 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Keywords | Bioinformatics (genome or protein) databases; Clustering, classification, and association rules; Feature extraction or construction; Data and knowledge visualization; Data mining |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| Notes | These datasets were downloaded from the Biomedical Kent-Ridge Repositories at http://sdmc.lit.org.sg/GEDatasets/Datasets, http://sdmc.i2r.a-star.edu.sg/rp and the Gene Expression Omnibus (GEO) Repository at http://www.ncbi.nlm.nih.gov/geo/. |
| OpenAccessLink | https://proxy.k.utb.cz/login?url=http://doi.org/10.1109/TCBB.2008.36 |
| PMID | 18670041 |
| PageCount | 17 |
| PublicationDate | 2008-07-01 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States – name: New York |
| PublicationTitle | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
| PublicationTitleAbbrev | TCBB |
| PublicationTitleAlternate | IEEE/ACM Trans Comput Biol Bioinform |
| PublicationYear | 2008 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 368 |
| SubjectTerms | Algorithms; Availability; Bioinformatics; Bioinformatics (genome or protein) databases; Biomedical measurements; Cancer; Clustering, classification, and association rules; Data and knowledge visualization; Data Interpretation, Statistical; Data mining; Decision trees; Discriminant analysis; Diseases; Explosions; Feature extraction or construction; Gene Expression Profiling - methods; Machine learning; Nonlinear Dynamics; Pattern Recognition, Automated - methods; Principal component analysis; Procurement; Protein engineering; Reproducibility of Results; Sensitivity and Specificity; Software; Studies |
| URI | https://ieeexplore.ieee.org/document/4492764 https://www.ncbi.nlm.nih.gov/pubmed/18670041 https://www.proquest.com/docview/863293417 https://www.proquest.com/docview/20497689 https://www.proquest.com/docview/69370057 https://www.proquest.com/docview/869584247 https://pubmed.ncbi.nlm.nih.gov/PMC2562675 http://doi.org/10.1109/TCBB.2008.36 |
| Volume | 5 |
| Citation | Lee, G; Rodriguez, C; Madabhushi, A. Investigating the Efficacy of Nonlinear Dimensionality Reduction Schemes in Classifying Gene and Protein Expression Studies. IEEE/ACM transactions on computational biology and bioinformatics, vol. 5, no. 3, p. 368 (2008-07-01). The Institute of Electrical and Electronics Engineers, Inc. (IEEE). ISSN 1545-5963; eISSN 1557-9964. doi:10.1109/TCBB.2008.36 |