Bolstering Heuristics for Statistical Validation of Prediction Algorithms

Bibliographic Details
Published in: 2015 International Workshop on Pattern Recognition in NeuroImaging, pp. 77-80
Main Authors: Mendelson, Alex F.; Zuluaga, Maria A.; Hutton, Brian F.; Ourselin, Sebastien
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2015
DOI: 10.1109/PRNI.2015.16

Summary: Machine learning research in image-based computer-aided diagnosis is a field characterised by rich models and relatively small datasets. In this regime, conventional statistical tests for cross-validation results may no longer be optimal due to variability in training set quality. We present a principle by which existing statistical tests can be conservatively extended to make use of arbitrary numbers of repeated experiments. We apply this to the problems of interval estimation and pairwise comparison for the accuracy of classification algorithms, and test the resulting procedures on real and synthetic classification tasks. The interval coverages in the synthetic task are notably improved, and the comparison has both increased power and reduced type I error. Experiments on the ADNI dataset show that the low replicability of split-half based tests can be dramatically improved.
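
The paper's exact bolstering procedure is not reproduced in this record, but the idea sketched in the abstract, repeating a split-half experiment many times and combining the results conservatively rather than averaging them, can be illustrated with a short sketch. Everything below (the synthetic data, the logistic-regression classifier, and the union-of-intervals combination rule) is an illustrative assumption, not the authors' method:

```python
# A minimal sketch (not the paper's procedure): repeated split-half accuracy
# estimation with a deliberately conservative combined interval. Dataset,
# classifier, and the union-of-intervals rule are illustrative assumptions.
import numpy as np
from scipy.stats import beta
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedShuffleSplit

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion."""
    lo = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lo, hi

# Small-sample, high-dimensional setting, as in the abstract.
X, y = make_classification(n_samples=120, n_features=50, random_state=0)
splitter = StratifiedShuffleSplit(n_splits=20, test_size=0.5, random_state=0)

intervals = []
for train_idx, test_idx in splitter.split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    correct = int((clf.predict(X[test_idx]) == y[test_idx]).sum())
    intervals.append(clopper_pearson(correct, len(test_idx)))

# Conservative combination: report the envelope (union) of the per-repeat
# intervals, so that disagreement between repeats widens the reported
# interval instead of being averaged away.
lower = min(lo for lo, _ in intervals)
upper = max(hi for _, hi in intervals)
print(f"conservative accuracy interval: [{lower:.3f}, {upper:.3f}]")
```

The envelope rule is deliberately pessimistic: variability in training-set quality across repeated splits, the problem the abstract highlights, inflates rather than shrinks the reported uncertainty.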