Dirichlet process mixture models for regression discontinuity designs
| Published in | Statistical Methods in Medical Research, Vol. 32, no. 1, pp. 55-70 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | London, England: SAGE Publications (Sage Publications Ltd), 01.01.2023 |
| ISSN | 0962-2802, 1477-0334 |
| DOI | 10.1177/09622802221129044 |
| Summary | The regression discontinuity design is a quasi-experimental design that estimates the causal effect of a treatment when its assignment is defined by a threshold for a continuous variable. The regression discontinuity design assumes that subjects with measurements within a bandwidth around the threshold belong to a common population, so that the threshold can be seen as a randomising device assigning treatment to those falling just above the threshold and withholding it from those who fall below. Bandwidth selection represents a compelling decision for the regression discontinuity design analysis, as results may be highly sensitive to its choice. A few methods to select the optimal bandwidth, mainly from the econometric literature, have been proposed. However, their use in practice is limited. We propose a methodology that, tackling the problem from an applied point of view, considers units’ exchangeability, that is, their similarity with respect to measured covariates, as the main criterion to select subjects for the analysis, irrespective of their distance from the threshold. We cluster the sample using a Dirichlet process mixture model to identify balanced and homogeneous clusters. Our proposal exploits the posterior similarity matrix, which contains the pairwise probabilities that two observations are allocated to the same cluster in the Markov chain Monte Carlo sample. Thus, we include in the regression discontinuity design analysis only those clusters for which we have stronger evidence of exchangeability. We illustrate the validity of our methodology with both a simulated experiment and a motivating example on the effect of statins on cholesterol levels. |
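
The posterior similarity matrix mentioned in the summary can be illustrated with a short sketch. It is not the authors' implementation: the function name `posterior_similarity_matrix` and the toy allocation draws are hypothetical, and the code assumes only that the MCMC output is available as an array of cluster labels with one row per draw and one column per observation. Entry (i, j) is estimated as the fraction of draws in which observations i and j share a cluster label.

```python
import numpy as np


def posterior_similarity_matrix(allocations):
    """Estimate pairwise co-clustering probabilities from MCMC draws.

    allocations: array-like of shape (n_draws, n_obs) holding the cluster
    label of each observation in each draw (hypothetical input format).
    Returns an (n_obs, n_obs) matrix whose (i, j) entry is the share of
    draws in which observations i and j have the same label.
    """
    z = np.asarray(allocations)
    if z.ndim != 2:
        raise ValueError("allocations must have shape (n_draws, n_obs)")
    # same[s, i, j] is True when i and j share a label in draw s.
    same = z[:, :, None] == z[:, None, :]
    # Averaging over draws gives the Monte Carlo estimate of each probability.
    return same.mean(axis=0)


# Toy example: 4 MCMC draws for 4 observations (made-up labels).
draws = np.array([
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
])
psm = posterior_similarity_matrix(draws)
print(psm)
```

Because only label equality within a draw is compared, the estimate is unaffected by label switching between draws. The broadcasted comparison needs memory proportional to n_draws × n_obs², so a loop over draws may be preferable for long chains or large samples.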