Blocked 3×2 Cross-Validated t-Test for Comparing Supervised Classification Learning Algorithms

Bibliographic Details
Published in	Neural computation Vol. 26; no. 1; pp. 208 - 235
Main Authors	Yu, Wang; Ruibo, Wang; Huichen, Jia; Jihong, Li
Format	Journal Article
Language	English
Published	MIT Press (MIT Press Journals, The), One Rogers Street, Cambridge, MA 02142-1209, USA, 01.01.2014
Subjects	Algorithms; Comparative analysis; Learning; Letters; Simulation
ISSN	0899-7667; 1530-888X
DOI	10.1162/NECO_a_00532

Summary:	In research on machine learning algorithms for classification tasks, comparing the performance of algorithms is extremely important, and the machine learning literature often does so with a statistical significance test on the generalization error. In view of the randomness of partitions in cross-validation, a new blocked 3×2 cross-validation is proposed in this letter to estimate the generalization error. We then conduct an analysis of variance of the blocked 3×2 cross-validated estimator. A relatively conservative variance estimator is put forward that accounts for the correlation between any two two-fold cross-validations, which was previously neglected in the 5×2 cross-validated t- and F-tests. A corresponding t-test using this variance estimator is presented to compare the performance of algorithms. Simulated results show that the performance of our test is comparable with that of 5×2 cross-validated tests but with lower computational complexity.
Bibliography:	January, 2014
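
The summary names a blocked 3×2 cross-validation but does not spell out how it is built. The sketch below shows one plausible reading, not the authors' code. It assumes the data are shuffled once into four near-equal blocks, and that the three two-fold partitions pair block 0 with each of the other three blocks in turn. The function names `blocked_3x2_partitions` and `blocked_3x2_cv_error`, and the `holdout_error` callback, are hypothetical. The variance estimator and the t-statistic are left out because the summary does not give their form.

```python
import random


def blocked_3x2_partitions(n, seed=0):
    """Return three (half_a, half_b) index splits built from four blocks.

    Assumption: four near-equal blocks, with block 0 paired in turn with
    blocks 1, 2 and 3; the remaining two blocks form the other half.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    blocks = [idx[i::4] for i in range(4)]  # four near-equal blocks
    splits = []
    for partner in (1, 2, 3):
        others = [b for b in (1, 2, 3) if b != partner]
        half_a = sorted(blocks[0] + blocks[partner])
        half_b = sorted(blocks[others[0]] + blocks[others[1]])
        splits.append((half_a, half_b))
    return splits


def blocked_3x2_cv_error(n, holdout_error, seed=0):
    """Average the six hold-out errors over the three two-fold splits.

    holdout_error(train_idx, test_idx) is a user-supplied function that
    trains on train_idx and returns the error rate on test_idx.
    """
    errors = []
    for half_a, half_b in blocked_3x2_partitions(n, seed):
        errors.append(holdout_error(half_a, half_b))  # train on a, test on b
        errors.append(holdout_error(half_b, half_a))  # train on b, test on a
    return sum(errors) / len(errors), errors
```

Under this construction, any two halves taken from different partitions share exactly one block, so the overlap between training sets is the same for every pair of two-fold cross-validations. Three partitions require six training runs, compared with ten for a 5×2 scheme.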