Binary Classifier Evaluation Without Ground Truth

Bibliographic Details
Published in: 2017 Ninth International Conference on Advances in Pattern Recognition (ICAPR), pp. 1-6
Main Authors: Fedorchuk, Maksym; Lamiroy, Bart
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2017
DOI: 10.1109/ICAPR.2017.8593175

Summary: In this paper we study statistically sound ways of comparing classifiers in the absence of fully reliable reference data. Building on previously published partial frameworks, we explore a more comprehensive approach to comparing and ranking classifiers that is robust to incomplete, erroneous, or missing reference evaluation data. We show that a generalized McNemar's test gives reliable confidence measures for the ranking of two classifiers under the assumption that a better-than-random reference classifier exists. We extend its use to cases where its traditional formulation is notoriously unstable, and we provide a computational context that allows it to be applied to large amounts of data. Our classifier evaluation model is generic and applies to any set of binary classifiers. We have specifically tested and validated it on synthetic and real data from document image binarization.
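The summary refers to a generalized McNemar's test for ranking two classifiers against a reference. For orientation, here is a minimal sketch of the standard exact (binomial) McNemar test that such a generalization builds on; the function name and the example counts are hypothetical, and this is not the paper's own formulation:

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact (binomial) two-sided McNemar test.

    b: samples where classifier A is correct and B is wrong (w.r.t. the reference)
    c: samples where A is wrong and B is correct
    Under H0 (equal error rates), the discordant counts follow Binomial(b + c, 0.5).
    Returns the two-sided p-value.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: the classifiers are indistinguishable
    k = min(b, c)
    # Doubled lower-tail probability of Binomial(n, 0.5), capped at 1.
    p = 2.0 * sum(comb(n, i) for i in range(k + 1)) / 2.0 ** n
    return min(p, 1.0)

# Hypothetical example: among the discordant samples, A beats B 40 times
# and B beats A 15 times; a small p-value favors ranking A above B.
p = mcnemar_exact(40, 15)
```

The exact binomial form is used here because the usual chi-square approximation is exactly the regime the abstract flags as unstable when the discordant counts are small.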