PAC–Bayes Guarantees for Data-Adaptive Pairwise Learning

Bibliographic Details
Published in: Entropy (Basel, Switzerland), Vol. 27, No. 8, p. 845
Main Authors: Zhou, Sijia; Lei, Yunwen; Kabán, Ata
Format: Journal Article
Language: English
Published: Switzerland: MDPI AG, 08.08.2025
ISSN: 1099-4300
DOI: 10.3390/e27080845

More Information
Summary: We study the generalization properties of stochastic optimization methods under adaptive data sampling schemes, focusing on the setting of pairwise learning, which is central to tasks such as ranking, metric learning, and AUC maximization. Unlike pointwise learning, pairwise methods must address statistical dependencies between input pairs, a challenge that existing analyses do not adequately handle when sampling is adaptive. In this work, we extend a general framework that integrates two algorithm-dependent approaches, algorithmic stability and PAC–Bayes analysis, for this purpose. Specifically, we examine (1) Pairwise Stochastic Gradient Descent (Pairwise SGD), widely used across machine learning applications, and (2) Pairwise Stochastic Gradient Descent Ascent (Pairwise SGDA), common in adversarial training. Our analysis avoids artificial randomization and instead leverages the inherent stochasticity of the gradient updates. Our results yield generalization guarantees of order n^{-1/2} under non-uniform adaptive sampling strategies, covering both smooth and non-smooth convex settings. We believe these findings address a significant gap in the theory of pairwise learning with adaptive sampling.
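
The summary names two concrete algorithms. As a rough illustration of the first, the sketch below implements pairwise SGD with a non-uniform adaptive sampling rule in plain NumPy. The pairwise hinge loss, the softmax-over-losses sampling scheme, and all names here are illustrative assumptions, not the authors' actual method; Pairwise SGDA would additionally take a coupled ascent step on an adversarial variable.

import numpy as np

def pairwise_sgd(X, y, T=1000, eta=0.1, seed=0):
    """Minimal sketch: pairwise SGD with adaptive, non-uniform sampling.

    Pairwise objective (as in AUC maximization or ranking): the score
    difference w @ (x_i - x_j) should be large for pairs with y_i > y_j.
    The sampling distribution over pairs is re-weighted at each step by
    the current loss (an assumed adaptive scheme, for illustration only).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    # All ordered pairs (i, j) with y_i > y_j define pairwise examples.
    pairs = [(i, j) for i in range(n) for j in range(n) if y[i] > y[j]]
    D = np.array([X[i] - X[j] for i, j in pairs])  # pair difference vectors
    for t in range(T):
        # Adaptive non-uniform sampling: pairs with larger current hinge
        # loss max(0, 1 - w @ d) are drawn with higher probability.
        losses = np.maximum(0.0, 1.0 - D @ w)
        p = np.exp(losses - losses.max())
        p /= p.sum()
        k = rng.choice(len(pairs), p=p)
        # Subgradient step on the sampled pair's hinge loss,
        # with a 1/sqrt(t) step-size schedule.
        if losses[k] > 0.0:
            w += (eta / np.sqrt(t + 1)) * D[k]
    return w

The n^{-1/2} rate quoted in the summary then refers to the gap between the empirical pairwise risk driven down by such updates and the corresponding population risk, as a function of the sample size n.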