Single-loop Projection-free and Projected Gradient-based Algorithms for Nonconvex-concave Saddle Point Problems with Bilevel Structure
| Published in | arXiv.org |
|---|---|
| Format | Paper |
| Language | English |
| Published | Ithaca: Cornell University Library, arXiv.org, 13.05.2024 |
| ISSN | 2331-8422 |
| Summary: | In this paper, we explore a broad class of constrained saddle point problems with a bilevel structure, wherein the upper-level objective function is nonconvex-concave and smooth over compact and convex constraint sets, subject to a strongly convex lower-level objective function. This class of problems finds wide applicability in machine learning, encompassing robust multi-task learning, adversarial learning, and robust meta-learning. Our study extends the current literature in two main directions: (i) We consider a more general setting where the upper-level function is not necessarily strongly concave or linear in the maximization variable. (ii) While existing methods for solving saddle point problems with a bilevel structure are projection-based algorithms, we propose a one-sided projection-free method employing a linear minimization oracle. Specifically, by utilizing regularization and nested approximation techniques, we introduce a novel single-loop one-sided projection-free algorithm requiring \(\mathcal{O}(\epsilon^{-4})\) iterations to attain an \(\epsilon\)-stationary solution. Subsequently, we develop an efficient single-loop fully projected gradient-based algorithm capable of achieving an \(\epsilon\)-stationary solution within \(\mathcal{O}(\epsilon^{-5}\log(1/\epsilon))\) iterations. When the upper-level objective function is linear in the maximization component, our results improve to \(\mathcal{O}(\epsilon^{-3})\) and \(\mathcal{O}(\epsilon^{-4})\), respectively. Finally, we test our proposed methods against state-of-the-art algorithms on a robust multi-task regression problem to showcase the superiority of our algorithms. |
|---|---|
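The setting described in the summary can be written as a generic bilevel min-max template. This is a sketch consistent with the description above (upper-level smooth and nonconvex-concave over compact convex sets, strongly convex lower level); the paper's exact formulation may differ:

```latex
\min_{x \in \mathcal{X}} \; \max_{y \in \mathcal{Y}} \;
  f\bigl(x, y, z^{*}(x, y)\bigr)
\quad \text{s.t.} \quad
z^{*}(x, y) = \operatorname*{arg\,min}_{z} \; g(x, y, z),
```

where \(f\) is smooth, nonconvex in \(x\) and concave in \(y\), the sets \(\mathcal{X}\) and \(\mathcal{Y}\) are compact and convex, and \(g\) is strongly convex in \(z\) so that the lower-level solution \(z^{*}(x, y)\) is unique.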
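The "projection-free" ingredient mentioned in the summary replaces costly projections with a linear minimization oracle (LMO). As a minimal illustrative sketch only (this is the classical Frank-Wolfe update over an \(\ell_1\) ball, not the paper's single-loop algorithm; all function names here are illustrative):

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle over the l1 ball:
    argmin_{||s||_1 <= radius} <grad, s>.
    The minimizer puts all mass on the coordinate with largest |grad|."""
    i = np.argmax(np.abs(grad))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s

def frank_wolfe_step(x, grad, step, radius=1.0):
    """One projection-free update: query the LMO and move toward its
    output vertex. The iterate stays feasible as a convex combination."""
    s = lmo_l1_ball(grad, radius)
    return x + step * (s - x)
```

For example, minimizing \(\tfrac{1}{2}\|x-b\|^2\) over the unit \(\ell_1\) ball with the standard step size \(2/(k+2)\) drives the objective down while every iterate remains feasible, without ever computing a projection.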