Least squares after model selection in high-dimensional sparse models
| Published in | arXiv.org |
|---|---|
| Main Authors | Belloni, Alexandre; Chernozhukov, Victor |
| Format | Paper; Journal Article |
| Language | English |
| Published | Ithaca: Cornell University Library, arXiv.org, 20.03.2013 |
| Subjects | Convergence; Economic models; Estimating techniques; Estimators; Goodness of fit; Least squares; Regression analysis; Regression models; Sparsity; Mathematics - Probability; Mathematics - Statistics Theory; Statistics - Methodology; Statistics - Theory |
| Online Access | https://arxiv.org/abs/1001.0188 |
| ISSN | 2331-8422 |
| DOI | 10.48550/arxiv.1001.0188 |
| Journal Reference | Bernoulli 2013, Vol. 19, No. 2, 521-547 |
| Abstract | In this article we study post-model selection estimators that apply ordinary least squares (OLS) to the model selected by first-step penalized estimators, typically Lasso. It is well known that Lasso can estimate the nonparametric regression function at nearly the oracle rate, and is thus hard to improve upon. We show that the OLS post-Lasso estimator performs at least as well as Lasso in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the Lasso-based model selection "fails" in the sense of missing some components of the "true" regression model. By the "true" model, we mean the best s-dimensional approximation to the nonparametric regression function chosen by the oracle. Furthermore, the OLS post-Lasso estimator can perform strictly better than Lasso, in the sense of a strictly faster rate of convergence, if the Lasso-based model selection correctly includes all components of the "true" model as a subset and also achieves sufficient sparsity. In the extreme case, when Lasso perfectly selects the "true" model, the OLS post-Lasso estimator becomes the oracle estimator. An important ingredient in our analysis is a new sparsity bound on the dimension of the model selected by Lasso, which guarantees that this dimension is at most of the same order as the dimension of the "true" model. Our rate results are nonasymptotic and hold in both parametric and nonparametric models. Moreover, our analysis is not limited to the Lasso estimator acting as a selector in the first step, but also applies to any other estimator, for example, various forms of thresholded Lasso, with good rates and good sparsity properties. Our analysis covers both traditional thresholding and a new practical, data-driven thresholding scheme that induces additional sparsity subject to maintaining a certain goodness of fit. The latter scheme has theoretical guarantees similar to those of Lasso or OLS post-Lasso, but it dominates those procedures as well as traditional thresholding in a wide variety of experiments. |