Personalized Prediction By Learning Halfspace Reference Classes Under Well-Behaved Distribution

Bibliographic Details
Main Authors	Huang, Jizhou; Juba, Brendan
Format	Journal Article
Language	English
Published	19.09.2025
Subjects	Computer Science - Learning
DOI	10.48550/arxiv.2509.15592

Summary:	In machine learning applications, predictive models are trained to serve future queries across the entire data distribution. However, real-world data often demands excessively complex models to achieve competitive performance, which sacrifices interpretability. Hence, the growing deployment of machine learning models in high-stakes applications, such as healthcare, motivates the search for methods that give accurate and explainable predictions. This work proposes a Personalized Prediction scheme, where an easy-to-interpret predictor is learned per query. In particular, we wish to produce a "sparse linear" classifier with competitive performance specifically on some sub-population that includes the query point. The goal of this work is to study the PAC-learnability of this prediction model for sub-populations represented by "halfspaces" in a label-agnostic setting. We first give a distribution-specific PAC-learning algorithm for learning reference classes for personalized prediction. By leveraging both the reference-class learning algorithm and a list learner of sparse linear representations, we prove the first upper bound, $O(\mathrm{opt}^{1/4})$, for personalized prediction with sparse linear classifiers and homogeneous halfspace subsets. We also evaluate our algorithms on a variety of standard benchmark data sets.
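
The problem setup in the summary can be illustrated with a toy sketch. This is not the authors' algorithm and carries none of its guarantees: it is a brute-force stand-in that samples random homogeneous halfspaces containing the query, fits the simplest possible sparse linear classifier (a 1-sparse rule, sign(s * x_j)) on each, and keeps the pair with the lowest empirical conditional error. The function name `personalized_predict` and the parameters `n_candidates` and `min_mass` are hypothetical, introduced only for this illustration.

```python
import numpy as np

def personalized_predict(X, y, query, n_candidates=200, min_mass=0.2, seed=0):
    """Toy personalized prediction (illustrative only, not the paper's method).

    X: (n, d) features; y: (n,) labels in {-1, +1}; query: (d,) point.
    Searches random homogeneous halfspaces {x : <w, x> >= 0} that contain
    the query, and on each one picks the 1-sparse classifier sign(s * x_j)
    with the lowest empirical error restricted to that halfspace.
    """
    rng = np.random.default_rng(seed)
    signs = np.where(X >= 0, 1, -1)  # prediction of sign(x_j) for every j
    best = None
    for _ in range(n_candidates):
        w = rng.standard_normal(X.shape[1])
        w /= np.linalg.norm(w)
        if w @ query < 0:            # flip so the halfspace contains the query
            w = -w
        mask = X @ w >= 0
        if mask.mean() < min_mass:   # assumed guard against tiny reference classes
            continue
        # Conditional error of sign(+x_j) for each j; sign(-x_j) has 1 - error.
        err_pos = (signs[mask] != y[mask][:, None]).mean(axis=0)
        err_both = np.stack([err_pos, 1.0 - err_pos])  # row 0: s=+1, row 1: s=-1
        row, j = np.unravel_index(np.argmin(err_both), err_both.shape)
        err = float(err_both[row, j])
        if best is None or err < best["error"]:
            best = {"w": w, "feature": int(j),
                    "sign": 1 if row == 0 else -1, "error": err}
    if best is None:
        raise ValueError("no candidate halfspace met the minimum mass requirement")
    x_sign = 1 if query[best["feature"]] >= 0 else -1
    best["prediction"] = best["sign"] * x_sign
    return best
```

The returned dictionary exposes both the reference class (the normal vector `w`) and the classifier (`feature`, `sign`), which is the sense in which the prediction is explainable: the query is classified by a single feature, and the reported error applies only to the sub-population that shares its halfspace.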