Binary Classifier Evaluation Without Ground Truth

Bibliographic Details
Published in: 2017 Ninth International Conference on Advances in Pattern Recognition (ICAPR), pp. 1-6
Main Authors: Fedorchuk, Maksym; Lamiroy, Bart
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2017
DOI: 10.1109/ICAPR.2017.8593175

Summary: In this paper we study statistically sound ways of comparing classifiers in the absence of fully reliable reference data. Building on previously published partial frameworks, we explore a more comprehensive approach to comparing and ranking classifiers that is robust to incomplete, erroneous, or missing reference evaluation data. We show that a generalized McNemar's test gives reliable confidence measures for the ranking of two classifiers under the assumption that a better-than-random reference classifier exists. We extend its use to cases where its traditional formulation is notoriously unstable, and we provide a computational context that allows it to be applied to large amounts of data. Our classifier evaluation model is generic and applies to any set of binary classifiers. We have specifically tested and validated it on synthetic and real data from document image binarization.
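The summary refers to a generalized McNemar's test for ranking two classifiers against a reference. For orientation, here is a minimal sketch of the standard exact (binomial) McNemar test that such a generalization builds on; the function name and the example counts are hypothetical, and this is not the paper's own formulation:

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact (binomial) two-sided McNemar test.

    b: samples where classifier A is correct and B is wrong (w.r.t. the reference)
    c: samples where A is wrong and B is correct
    Under H0 (equal error rates), the discordant counts follow Binomial(b + c, 0.5).
    Returns the two-sided p-value.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: the classifiers are indistinguishable
    k = min(b, c)
    # Doubled lower-tail probability of Binomial(n, 0.5), capped at 1.
    p = 2.0 * sum(comb(n, i) for i in range(k + 1)) / 2.0 ** n
    return min(p, 1.0)

# Hypothetical example: among the discordant samples, A beats B 40 times
# and B beats A 15 times; a small p-value favors ranking A above B.
p = mcnemar_exact(40, 15)
```

The exact binomial form is used here because the usual chi-square approximation is exactly the regime the abstract flags as unstable when the discordant counts are small.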