Revisiting Agnostic PAC Learning
PAC learning, dating back to Valiant '84 and Vapnik and Chervonenkis '64, '74, is a classic model for studying supervised learning. In the agnostic setting, we have access to a hypothesis set \mathbf{H} and a training set of labeled samples drawn i.i.d. from an unknown data distribution D. The goal is to produce a classifier that is competitive with the hypothesis in \mathbf{H} having the least probability of mispredicting the label of a new sample from D. Empirical Risk Minimization (ERM) is a natural learning algorithm, where one simply outputs the hypothesis from \mathbf{H} making the fewest mistakes on the training data. This simple algorithm is known to have an optimal error in terms of the VC-dimension of \mathbf{H} and the number of samples. In this work, we revisit agnostic PAC learning and first show that ERM, and in fact any proper learning algorithm, is sub-optimal if we treat the performance of the best hypothesis in \mathbf{H} as a parameter. We then complement this lower bound with the first learning algorithm achieving an optimal error. Our algorithm introduces several new ideas that we hope may find further applications in learning theory.
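The abstract's description of Empirical Risk Minimization lends itself to a short illustration. The following is a minimal Python sketch of ERM over a toy finite hypothesis class of threshold classifiers; the function name `erm`, the hypothesis class, and the synthetic data are assumptions for illustration only and are not taken from the paper.

```python
import numpy as np

def erm(hypotheses, X, y):
    """Empirical Risk Minimization (illustrative sketch): return the hypothesis
    in H with the fewest mistakes (lowest empirical error) on the training set."""
    errors = [np.mean(h(X) != y) for h in hypotheses]
    return hypotheses[int(np.argmin(errors))]

# Toy agnostic setting: labels follow a threshold rule, corrupted by 10% noise.
rng = np.random.default_rng(0)
X = rng.uniform(size=200)
y = (X > 0.4).astype(int)
y[rng.random(200) < 0.1] ^= 1  # label noise, so no hypothesis is perfect

# Hypothetical finite hypothesis set H: one threshold classifier per cutoff t.
H = [lambda x, t=t: (x > t).astype(int) for t in np.linspace(0, 1, 21)]

best = erm(H, X, y)
print("empirical error of ERM output:", np.mean(best(X) != y))
```

This sketch only illustrates the ERM rule the abstract refers to; the paper's contribution is a different, improper learning algorithm whose error improves on ERM when the best hypothesis's error is treated as a parameter.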
| Published in | Proceedings, Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 1968-1982 |
|---|---|
| Main Authors | , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 27.10.2024 |
| ISSN | 2575-8454 |
| DOI | 10.1109/FOCS61266.2024.00118 |