selectBoost: a general algorithm to enhance the performance of variable selection methods
| Published in | Bioinformatics, Vol. 37, no. 5, pp. 659-668 |
|---|---|
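| Main Authors | Bertrand, Frédéric; Aouadi, Ismaïl; Jung, Nicolas; Carapito, Raphael; Vallat, Laurent; Bahram, Seiamak; Maumy-Bertrand, Myriam |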
| Format | Journal Article | 
| Language | English | 
| Published | England: Oxford University Press (OUP), 05.05.2021 |
| ISSN | 1367-4803 1367-4811 1460-2059 |
| DOI | 10.1093/bioinformatics/btaa855 | 
| Abstract | Abstract
Motivation
With the growth of big data, variable selection has become one of the critical challenges in statistics. Although many methods have been proposed in the literature, their performance in terms of recall (sensitivity) and precision (positive predictive value) is limited when the number of variables far exceeds the number of observations or when the variables are highly correlated.
Results
In this article, we propose a general algorithm, which improves the precision of any existing variable selection method. This algorithm is based on highly intensive simulations and takes into account the correlation structure of the data. Our algorithm can either produce a confidence index for variable selection or be used in an experimental design planning perspective. We demonstrate the performance of our algorithm on both simulated and real data. We then apply it in two different ways to improve biological network reverse-engineering.
Availability and implementation
Code is available as the SelectBoost package on CRAN, https://cran.r-project.org/package=SelectBoost. Some network reverse-engineering functionalities are available in the Patterns package on CRAN, https://cran.r-project.org/package=Patterns.
Supplementary information
Supplementary data are available at Bioinformatics online. | 
    
|---|---|
    
| Author | Bertrand, Frédéric; Aouadi, Ismaïl; Jung, Nicolas; Carapito, Raphael; Vallat, Laurent; Bahram, Seiamak; Maumy-Bertrand, Myriam |
    
| AuthorAffiliation | 1 Institut de Recherche Mathématique Avancée, CNRS UMR 7501, Labex IRMIA, Université de Strasbourg, Strasbourg, France; 2 Université de Technologie de Troyes, ICD, ROSAS, M2S, Troyes, France; 3 ImmunoRhumatologie Moléculaire, INSERM UMR_S 1109, LabEx TRANSPLANTEX, Centre de Recherche d’Immunologie et d’Hématologie, Fédération de Médecine Translationnelle de Strasbourg (FMTS), Université de Strasbourg, Strasbourg, France; 4 Laboratoire International Associé (LIA) INSERM, Strasbourg (France) - Nagano (Japan), Strasbourg, France; 5 Fédération Hospitalo-Universitaire (FHU) OMICARE, Laboratoire Central d’Immunologie, Pôle de Biologie, Nouvel Hôpital Civil, Hôpitaux Universitaires de Strasbourg, Strasbourg, France |
    
    
| Author_xml | – sequence: 1 givenname: Frédéric orcidid: 0000-0002-0837-8281 surname: Bertrand fullname: Bertrand, Frédéric email: frederic.bertrand@utt.fr organization: Institut de Recherche Mathématique Avancée, CNRS UMR 7501, Labex IRMIA, Université de Strasbourg, Strasbourg, France – sequence: 2 givenname: Ismaïl surname: Aouadi fullname: Aouadi, Ismaïl organization: ImmunoRhumatologie Moléculaire, INSERM UMR_S 1109, LabEx TRANSPLANTEX, Centre de Recherche d’Immunologie et d’Hématologie, Fédération de Médecine Translationnelle de Strasbourg (FMTS), Université de Strasbourg, Strasbourg, France – sequence: 3 givenname: Nicolas surname: Jung fullname: Jung, Nicolas organization: Institut de Recherche Mathématique Avancée, CNRS UMR 7501, Labex IRMIA, Université de Strasbourg, Strasbourg, France – sequence: 4 givenname: Raphael surname: Carapito fullname: Carapito, Raphael organization: ImmunoRhumatologie Moléculaire, INSERM UMR_S 1109, LabEx TRANSPLANTEX, Centre de Recherche d’Immunologie et d’Hématologie, Fédération de Médecine Translationnelle de Strasbourg (FMTS), Université de Strasbourg, Strasbourg, France – sequence: 5 givenname: Laurent surname: Vallat fullname: Vallat, Laurent organization: ImmunoRhumatologie Moléculaire, INSERM UMR_S 1109, LabEx TRANSPLANTEX, Centre de Recherche d’Immunologie et d’Hématologie, Fédération de Médecine Translationnelle de Strasbourg (FMTS), Université de Strasbourg, Strasbourg, France – sequence: 6 givenname: Seiamak surname: Bahram fullname: Bahram, Seiamak organization: ImmunoRhumatologie Moléculaire, INSERM UMR_S 1109, LabEx TRANSPLANTEX, Centre de Recherche d’Immunologie et d’Hématologie, Fédération de Médecine Translationnelle de Strasbourg (FMTS), Université de Strasbourg, Strasbourg, France – sequence: 7 givenname: Myriam surname: Maumy-Bertrand fullname: Maumy-Bertrand, Myriam organization: Institut de Recherche Mathématique Avancée, CNRS UMR 7501, Labex IRMIA, Université de Strasbourg, Strasbourg, France  | 
    
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/33016991 (MEDLINE/PubMed record); https://hal.science/hal-03206128 (HAL record) |
    
    
| CitedBy_id | crossref_primary_10_1038_s41375_021_01221_5 crossref_primary_10_1109_TPAMI_2023_3340990  | 
    
    
| ContentType | Journal Article | 
    
| Copyright | The Author(s) 2020. Published by Oxford University Press. Distributed under a Creative Commons Attribution 4.0 International License |
    
    
| Discipline | Biology Statistics Computer Science  | 
    
| EISSN | 1460-2059 1367-4811  | 
    
| EndPage | 668 | 
    
| ExternalDocumentID | 10.1093/bioinformatics/btaa855 PMC8097688 oai:HAL:hal-03206128v1 33016991 10_1093_bioinformatics_btaa855  | 
    
| Genre | Research Support, Non-U.S. Gov't Journal Article  | 
    
| GrantInformation_xml | grantid: UMR 7501; grantid: UR 201923174L; grantid: ANR-11-LABX-0070_TRANSPLANTEX; grantid: ANR-11-LABX-0055_IRMIA; grantid: UMR_S 1109 |
    
    
| IsDoiOpenAccess | true | 
    
| IsOpenAccess | true | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | 5 | 
    
| Keywords | model selection regression classification regularization prediction dimension cancer pls | 
    
| Language | English | 
    
| License | This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com http://creativecommons.org/licenses/by-nc/4.0 The Author(s) 2020. Published by Oxford University Press. Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0  | 
    
    
| Notes | The authors wish it to be known that Ismaïl Aouadi and Nicolas Jung contributed equally. |
    
| ORCID | 0000-0002-0837-8281 0000-0002-4615-1512 0000-0002-7036-442X 0000-0002-5226-7706  | 
    
| OpenAccessLink | https://academic.oup.com/bioinformatics/article-pdf/37/5/659/37808861/btaa855.pdf |
    
| PMID | 33016991 | 
    
    
| PageCount | 10 | 
    
    
    
    
    
| Snippet | With the growth of big data, variable selection has become one of the critical challenges in statistics. Although many methods have been proposed... | 
    
    
| StartPage | 659 | 
    
| SubjectTerms | Algorithms; Applications; Big Data; Bioinformatics; Computer Science; Human health and pathology; Life Sciences; Methodology; Original Papers; Research Design; Software; Statistics | 
    
| Title | selectBoost: a general algorithm to enhance the performance of variable selection methods | 
    
| URI | https://www.ncbi.nlm.nih.gov/pubmed/33016991 https://www.proquest.com/docview/2448636406 https://hal.science/hal-03206128 https://pubmed.ncbi.nlm.nih.gov/PMC8097688 https://academic.oup.com/bioinformatics/article-pdf/37/5/659/37808861/btaa855.pdf  | 
    
    
| Volume | 37 | 
    
    