Natural coordinate descent algorithm for L1-penalised regression in generalised linear models

Bibliographic Details
Published in: Computational Statistics & Data Analysis, Vol. 97, pp. 60-70
Main Author: Michoel, Tom
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.05.2016
ISSN: 0167-9473, 1872-7352
DOI: 10.1016/j.csda.2015.11.009

More Information
Summary: The problem of finding the maximum likelihood estimates for the regression coefficients in generalised linear models with an ℓ1 sparsity penalty is shown to be equivalent to minimising the unpenalised maximum log-likelihood function over a box with boundary defined by the ℓ1-penalty parameter. In one-parameter models, or when a single coefficient is estimated at a time, this result implies a generic soft-thresholding mechanism which leads to a novel coordinate descent algorithm for generalised linear models that is entirely described in terms of the natural formulation of the model and is guaranteed to converge to the true optimum. A prototype implementation for logistic regression, tested on two large-scale cancer gene expression datasets, shows that this algorithm is efficient, particularly so when a solution is computed at set values of the ℓ1-penalty parameter as opposed to along a regularisation path. Source code and test data are available from http://tmichoel.github.io/glmnat/.
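To illustrate the soft-thresholding coordinate descent mechanism the summary refers to, here is a minimal Python sketch of ℓ1-penalised logistic regression by coordinate descent. Note this is *not* the paper's glmnat algorithm (which works directly in the natural formulation of the model): it is the standard IRLS-style variant with a quadratic approximation, shown only to make the generic soft-thresholding update concrete. All function names and the penalty scaling are illustrative assumptions, not taken from the paper's source code.

```python
import numpy as np

def soft_threshold(z, gamma):
    # Generic soft-thresholding operator: sign(z) * max(|z| - gamma, 0).
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def l1_logistic_cd(X, y, lam, n_iter=100):
    """Coordinate descent for L1-penalised logistic regression.

    Minimises -(1/n) * loglik(beta) + lam * ||beta||_1 by forming the
    usual IRLS quadratic approximation around the current estimate and
    soft-thresholding one coefficient at a time (glmnet-style sketch,
    not the natural-coordinate algorithm of the paper).
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))          # predicted probabilities
        w = np.clip(mu * (1.0 - mu), 1e-5, None) # IRLS weights, bounded away from 0
        z = eta + (y - mu) / w                   # working response
        for j in range(p):
            # Partial residual with coordinate j removed.
            r_j = z - X @ beta + X[:, j] * beta[j]
            num = np.sum(w * X[:, j] * r_j)
            denom = np.sum(w * X[:, j] ** 2)
            # Soft-thresholding update for a single coefficient.
            beta[j] = soft_threshold(num, n * lam) / denom
    return beta
```

A large penalty drives every coefficient to exactly zero via the threshold, while a small penalty leaves the informative coefficients non-zero; the update for each coordinate is a closed-form soft-thresholding step, which is what makes a single-coefficient (or one-parameter) treatment attractive.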