Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets

Bibliographic Details
Published in: Statistics in medicine, Vol. 29, no. 7-8, pp. 770-777
Main Authors: Heinze, Georg; Puhr, Rainer
Format: Journal Article
Language: English
Published: Chichester, UK: John Wiley & Sons, Ltd, 30.03.2010
Wiley Subscription Services, Inc
ISSN: 0277-6715, 1097-0258
DOI: 10.1002/sim.3794

Summary: Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified into several subsets, e.g. matched pairs or blocks. Log odds ratio estimates are usually found by maximizing the conditional likelihood. This approach eliminates all strata‐specific parameters by conditioning on the number of events within each stratum. However, in the analyses of both an animal experiment and a lung cancer case–control study, conditional maximum likelihood (CML) resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by using Cytel Inc.'s well‐known LogXact software, which provides a median unbiased estimate and exact or mid‐p confidence intervals. Here, we suggest and outline point and interval estimation based on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27–38) bias correction method (CFL). We present comparative analyses of both studies, demonstrating some advantages of CFL over competitors. We report on a small‐sample simulation study where CFL log odds ratio estimates were almost unbiased, whereas LogXact estimates showed some bias and CML estimates exhibited serious bias. Confidence intervals based on the penalized conditional likelihood had close‐to‐nominal coverage rates, and the corresponding tests yielded the highest power among all methods compared. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is available at: http://www.muw.ac.at/msi/biometrie/programs. Copyright © 2010 John Wiley & Sons, Ltd.
Bibliography:istex:5E75A907B539E194331563635F220758C35D3A15
ArticleID:SIM3794
ark:/67375/WNG-BQ2HGNCB-T
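
As an illustration of the penalized conditional likelihood mentioned in the summary, the following Python sketch treats the simplest setting: 1:1 matched pairs with a single covariate. It is not the authors' SAS program. The function names, the golden-section search, and the search bounds are assumptions made for this example. The sketch also assumes that the penalty is half the log of the conditional information, in the spirit of Firth's correction, and that at least one pair has differing covariate values.

```python
import math

def cond_loglik(beta, diffs):
    """Conditional log-likelihood for 1:1 matched pairs.

    diffs[i] is the covariate of the case minus that of the control in pair i.
    Each pair contributes log(1 / (1 + exp(-beta * d))).
    """
    total = 0.0
    for d in diffs:
        eta = beta * d
        # Numerically stable form of -log(1 + exp(-eta))
        if eta >= 0:
            total += -math.log1p(math.exp(-eta))
        else:
            total += eta - math.log1p(math.exp(eta))
    return total

def cond_information(beta, diffs):
    """Information of the conditional likelihood: sum of d^2 * p * (1 - p)."""
    info = 0.0
    for d in diffs:
        p = 1.0 / (1.0 + math.exp(-beta * d))
        info += d * d * p * (1.0 - p)
    return info

def penalized_cond_loglik(beta, diffs):
    """Firth-type penalized version: log-likelihood plus 0.5 * log(information)."""
    return cond_loglik(beta, diffs) + 0.5 * math.log(cond_information(beta, diffs))

def maximize(f, lo=-15.0, hi=15.0, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi].

    The bounds are an assumption of this sketch, chosen for covariate
    differences of magnitude at most 1.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # maximizer lies in [a, d]
        else:
            a = c  # maximizer lies in [c, b]
    return (a + b) / 2.0

# Hypothetical data: five discordant pairs, the case exposed in every one.
diffs = [1.0] * 5

# The unpenalized conditional log-likelihood keeps increasing with beta
# (monotone likelihood), so the CML estimate is infinite.
monotone = cond_loglik(5.0, diffs) < cond_loglik(10.0, diffs)

# The penalized version has a finite maximizer.
beta_hat = maximize(lambda b: penalized_cond_loglik(b, diffs))
print(monotone, round(beta_hat, 3))
```

For a binary exposure with n10 pairs in which only the case is exposed and n01 pairs in which only the control is exposed, this penalized criterion has the closed-form maximizer log((n10 + 0.5) / (n01 + 0.5)). For the data above that is log(11), about 2.398, whereas the unpenalized estimate log(n10 / n01) is infinite.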