An Introspective Comparison of Random Forest-Based Classifiers for the Analysis of Cluster-Correlated Data by Way of RF++

Bibliographic Details
Published in PLoS ONE Vol. 4; no. 9; p. e7087
Main Authors Karpievitch, Yuliya V., Hill, Elizabeth G., Leclerc, Anthony P., Dabney, Alan R., Almeida, Jonas S.
Format Journal Article
Language English
Published United States Public Library of Science 18.09.2009
Public Library of Science (PLoS)
ISSN 1932-6203
DOI 10.1371/journal.pone.0007087

Abstract Many mass spectrometry-based studies, as well as other biological experiments, produce cluster-correlated data. Failure to account for correlation among observations may cause a classification algorithm to overfit the training data and produce overoptimistic estimated error rates, and may make subsequent classifications unreliable. The current common practice for dealing with replicated data is to average each subject's set of replicate samples, which reduces the dataset size and loses information. In this manuscript we compare three approaches to dealing with cluster-correlated data: unmodified Breiman's Random Forest (URF), a forest grown using subject-level averages (SLA), and RF++ with subject-level bootstrapping (SLB). RF++, a novel Random Forest-based algorithm implemented in C++, handles cluster-correlated data through a modification of the original resampling algorithm and accommodates subject-level classification. Subject-level bootstrapping is an alternative sampling method that obviates the need to average or otherwise reduce each set of replicates to a single independent sample. Our experiments show nearly identical median classification and variable selection accuracy for SLB forests and URF forests when applied to both simulated and real datasets. However, the run-time estimated error rate was severely underestimated for URF forests. Predictably, SLA forests were more severely affected by the reduction in sample size, which led to poorer classification and variable selection accuracy. Perhaps most importantly, our results suggest that it is reasonable to use URF for the analysis of cluster-correlated data. Two caveats should be noted: first, correct classification error rates must be obtained using a separate test dataset, and second, an additional post-processing step is required to obtain subject-level classifications. RF++ is shown to be an effective alternative for classifying both clustered and non-clustered data.
Source code and stand-alone compiled command-line and easy-to-use graphical user interface (GUI) versions of RF++ for Windows and Linux, as well as a user manual (Supplementary File S2), are available for download at: http://sourceforge.org/projects/rfpp/ under the GNU public license.
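The two ideas the abstract leans on, resampling whole subjects instead of individual replicates and reducing replicate-level predictions to one label per subject, can be illustrated with a minimal Python sketch. This is not the RF++ implementation (which is written in C++ and whose details are not given in this record). The function names, the use of row indices, and the majority-vote rule for the post-processing step are assumptions made purely for illustration.

```python
import random
from collections import Counter, defaultdict


def subject_level_bootstrap(subject_ids, rng):
    """Hypothetical helper: bootstrap subjects, not rows.

    subject_ids[i] is the subject that replicate (row) i belongs to.
    Returns (in_bag_rows, out_of_bag_rows) as lists of row indices.
    All replicates of a subject land on the same side of the split.
    """
    rows_by_subject = defaultdict(list)
    for row, subject in enumerate(subject_ids):
        rows_by_subject[subject].append(row)

    subjects = sorted(rows_by_subject)
    # Draw as many subjects as there are, with replacement.
    drawn = [rng.choice(subjects) for _ in subjects]
    drawn_set = set(drawn)

    in_bag = [row for s in drawn for row in rows_by_subject[s]]
    out_of_bag = [row for s in subjects if s not in drawn_set
                  for row in rows_by_subject[s]]
    return in_bag, out_of_bag


def subject_level_vote(subject_ids, replicate_predictions):
    """Assumed post-processing rule: majority vote over a subject's replicates.

    Ties go to the label seen first for that subject.
    """
    votes = defaultdict(Counter)
    for subject, label in zip(subject_ids, replicate_predictions):
        votes[subject][label] += 1
    return {s: counts.most_common(1)[0][0] for s, counts in votes.items()}
```

Because a subject's replicates are never split between the in-bag and out-of-bag sets, an out-of-bag error computed this way is not flattered by correlated replicates of the same subject appearing in the training sample, which is the failure mode the abstract reports for the unmodified forest.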
Audience Academic
AuthorAffiliation 1 Department of Statistics, Texas A&M University, College Station, Texas, United States of America
2 Division of Biostatistics and Epidemiology, Department of Medicine, Medical University of South Carolina, Charleston, South Carolina, United States of America
3 Department of Computer Science, College of Charleston, Charleston, South Carolina, United States of America
4 Department of Bioinformatics and Computational Biology, The University of Texas, M. D. Anderson Cancer Center, Houston, Texas, United States of America
University of East Piedmont, Italy
ContentType Journal Article
Copyright COPYRIGHT 2009 Public Library of Science
2009 Karpievitch et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Karpievitch et al. 2009
Discipline Sciences (General)
DocumentTitleAlternate RF++ Clustered Data Classifier
EISSN 1932-6203
Genre Journal Article
Research Support, N.I.H., Extramural
GeographicLocations United States--US
Texas
South Carolina
GrantInformation_xml – fundername: NCI NIH HHS
  grantid: R25-CA-90301
– fundername: NLM NIH HHS
  grantid: T15 LM007438
– fundername: NLM NIH HHS
  grantid: 1-T15-LM07438-01
– fundername: NIDCR NIH HHS
  grantid: K25 DE016863
– fundername: NCI NIH HHS
  grantid: R25 CA090301
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 9
Language English
License This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
cc-by
Creative Commons Attribution License
Notes Conceived and designed the experiments: YVK EGH ARD. Performed the experiments: YVK APL. Analyzed the data: YVK APL JSA. Wrote the paper: YVK EGH APL ARD JSA.
OpenAccessLink http://journals.scholarsportal.info/openUrl.xqy?doi=10.1371/journal.pone.0007087
PMID 19763254
PublicationDate 2009-09-18
PublicationPlace United States
San Francisco, USA
PublicationTitle PloS one
PublicationTitleAlternate PLoS One
PublicationYear 2009
Publisher Public Library of Science (PLoS)
References C Strobl (ref34) 2008; 9
G Izmirlian (ref19) 2004; 1020
StartPage e7087
SubjectTerms Algorithms
Alzheimer's disease
Alzheimers disease
Bioinformatics
Biomarkers
Classification
Cluster Analysis
Clusters
Comparative analysis
Computer Simulation
Correlation analysis
Data mining
Data processing
Decision trees
Downloading
Error correction
Forests
Gene Expression Profiling - methods
Genetics and Genomics/Bioinformatics
Graphical user interface
Indexing
Machine learning
Mass spectrometry
Mass spectroscopy
Mathematics/Statistics
Models, Genetic
Models, Statistical
Molecular Biology/Bioinformatics
Normal distribution
Oligonucleotide Array Sequence Analysis - methods
Ovarian cancer
Pattern Recognition, Automated - methods
Post-processing
Post-production processing
Proteins
Proteomics
Resampling
Sampling methods
Scientific imaging
Software
Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization - methods
Statistical methods
Variables
Windows (computer programs)
Title An Introspective Comparison of Random Forest-Based Classifiers for the Analysis of Cluster-Correlated Data by Way of RF
URI https://www.ncbi.nlm.nih.gov/pubmed/19763254
https://www.proquest.com/docview/1292225821
https://www.proquest.com/docview/734050052
https://pubmed.ncbi.nlm.nih.gov/PMC2739274
https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0007087&type=printable
https://doaj.org/article/8d84ad47f0dd4885aa7ff186ec47530a
http://dx.doi.org/10.1371/journal.pone.0007087
Volume 4