How accurate are Bayes factor-based null hypothesis tests? A simulation study
| Main Authors | , , |
|---|---|
| Format | Journal Article | 
| Language | English | 
| Published | 12.06.2024 |
| Subjects | |
| DOI | 10.48550/arxiv.2406.08022 | 
| Summary: | Bayes factor null hypothesis tests provide a viable alternative to frequentist measures of evidence quantification. Bayes factors for realistic data sets in areas like psychology cannot be calculated exactly and require numerical approximations to complex integrals. Crucially, the accuracy of these approximations, i.e., whether an approximate Bayes factor corresponds to the exact Bayes factor, is unknown and may depend on the data, prior, and likelihood. We have recently developed a novel statistical procedure, marginal simulation-based calibration (SBC) for Bayes factors, to test whether the Bayes factors computed for a given analysis are accurate. Here, we use marginal SBC for Bayes factors and calibration plots to test whether Bayes factors are calculated accurately for some common cognitive designs. We use the bridgesampling/brms packages in R. We run analyses for three designs commonly used in psychology and psycholinguistics: (a) a design with random effects for subjects only, (b) a Latin square design with crossed random effects for subjects and items, but a single fixed factor, and (c) a Latin square 2x2 design with crossed random effects for subjects and items. We find that Bayes factor estimates are accurate when the bridgesampling algorithm does not issue a warning message, but can be biased and liberal when a warning message is shown. These results support the use of brms/bridgesampling for null hypothesis Bayes factor tests in commonly used factorial designs. They also suggest that when a warning message is issued, Bayes factor results should not be trusted. The results show that it is practical to check whether Bayes factors are computed correctly. |
|---|---|
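
The calibration idea mentioned in the summary can be illustrated with a toy example. The sketch below is not taken from the paper and does not use brms or bridgesampling; it assumes a simple normal-mean test (H0: mu = 0 versus H1: mu ~ Normal(0, tau^2), known sigma) in which the Bayes factor has a closed form. It relies on one general property: if the model and the data are both simulated from the prior, the posterior probability of H1 averaged over simulations should match its prior probability. All function names and parameter values here (`bf10`, `sbc_mean_posterior`, `n_sims`, `tau`) are hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(x, var):
    """Density of a zero-mean normal distribution with variance `var` at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def bf10(ybar, n, sigma=1.0, tau=1.0):
    """Exact Bayes factor for H1: mu ~ Normal(0, tau^2) against H0: mu = 0.

    The sample mean of n observations has marginal variance sigma^2/n under H0
    and sigma^2/n + tau^2 under H1 (toy model assumed for this sketch).
    """
    s2 = sigma ** 2 / n
    return normal_pdf(ybar, s2 + tau ** 2) / normal_pdf(ybar, s2)


def sbc_mean_posterior(n_sims=20000, n=20, prior_h1=0.5,
                       sigma=1.0, tau=1.0, seed=1, bf=bf10):
    """Average posterior probability of H1 over prior-predictive simulations."""
    rng = random.Random(seed)
    prior_odds = prior_h1 / (1.0 - prior_h1)
    total = 0.0
    for _ in range(n_sims):
        # 1. Draw the true model from its prior probability.
        is_h1 = rng.random() < prior_h1
        # 2. Draw the parameter from that model's prior, then the sample mean.
        mu = rng.gauss(0.0, tau) if is_h1 else 0.0
        ybar = rng.gauss(mu, sigma / math.sqrt(n))
        # 3. Turn the Bayes factor under test into a posterior model probability.
        post_odds = bf(ybar, n, sigma, tau) * prior_odds
        total += post_odds / (1.0 + post_odds)
    return total / n_sims


if __name__ == "__main__":
    # An accurate Bayes factor should give an average close to prior_h1.
    print("exact BF:   ", round(sbc_mean_posterior(), 3))
    # A deliberately inflated Bayes factor should push the average above it.
    inflated = lambda y, n, s, t: 3.0 * bf10(y, n, s, t)
    print("inflated BF:", round(sbc_mean_posterior(bf=inflated), 3))
```

In a real analysis the closed-form `bf10` would be replaced by the approximate Bayes factor under test, for example one obtained by bridge sampling, and a clear gap between the averaged posterior probability and the prior probability would point to an inaccurate estimate.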