Single-loop Projection-free and Projected Gradient-based Algorithms for Nonconvex-concave Saddle Point Problems with Bilevel Structure

Bibliographic Details
Published in: arXiv.org
Main Authors: Mohammad Mahdi Ahmadi, Erfan Yazdandoost Hamedani
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 13.05.2024
ISSN: 2331-8422

Summary: In this paper, we explore a broad class of constrained saddle point problems with a bilevel structure, wherein the upper-level objective function is nonconvex-concave and smooth over compact and convex constraint sets, subject to a strongly convex lower-level objective function. This class of problems finds wide applicability in machine learning, encompassing robust multi-task learning, adversarial learning, and robust meta-learning. Our study extends the current literature in two main directions: (i) We consider a more general setting where the upper-level function is not necessarily strongly concave or linear in the maximization variable. (ii) While existing methods for solving saddle point problems with a bilevel structure are projection-based algorithms, we propose a one-sided projection-free method employing a linear minimization oracle. Specifically, by utilizing regularization and nested approximation techniques, we introduce a novel single-loop one-sided projection-free algorithm, requiring \(\mathcal{O}(\epsilon^{-4})\) iterations to attain an \(\epsilon\)-stationary solution. Subsequently, we develop an efficient single-loop fully projected gradient-based algorithm capable of achieving an \(\epsilon\)-stationary solution within \(\mathcal{O}(\epsilon^{-5}\log(1/\epsilon))\) iterations. When the upper-level objective function is linear in the maximization component, our results improve to \(\mathcal{O}(\epsilon^{-3})\) and \(\mathcal{O}(\epsilon^{-4})\), respectively. Finally, we test our proposed methods against state-of-the-art algorithms on a robust multi-task regression problem to showcase their superiority.
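To fix ideas, one common way to write such a saddle point problem with bilevel structure (an assumed generic formulation; the paper's exact notation may differ) is

\[ \min_{x \in X} \max_{y \in Y} \; f\big(x, y, \theta^*(x)\big) \quad \text{where} \quad \theta^*(x) = \operatorname{arg\,min}_{\theta} \, g(x, \theta), \]

with \(X\) and \(Y\) compact convex sets, \(f\) smooth and nonconvex in \(x\) but concave in \(y\), and \(g\) strongly convex in \(\theta\).

The sketch below illustrates, on a toy instance, the kind of single-loop one-sided projection-free update the abstract describes: one gradient step tracking the lower-level solution (nested approximation), one regularized projected ascent step in \(y\), and one conditional-gradient step in \(x\) via a linear minimization oracle. All problem data, step sizes, and function choices here are illustrative assumptions, not the paper's algorithm or constants.

import numpy as np

rng = np.random.default_rng(0)
d_x, d_y, d_th = 5, 4, 3

# Hypothetical toy data (chosen only to make the sketch runnable).
A = rng.standard_normal((d_th, d_x))
B = rng.standard_normal((d_y, d_x))
C = rng.standard_normal((d_y, d_th))
mu_g = 1.0    # strong convexity of the lower-level objective g
mu_reg = 0.1  # regularization handling a merely concave y-component

# Upper level (linear in y here, the case with the improved rates):
#   f(x, y, th) = y^T (B x + C th)
# Lower level (strongly convex in th):
#   g(x, th) = 0.5 ||th - A x||^2 + 0.5 mu_g ||th||^2

def lmo_simplex(grad):
    # Linear minimization oracle over the simplex X: the minimizer of
    # <grad, s> over the simplex is the vertex at the smallest coordinate.
    s = np.zeros_like(grad)
    s[np.argmin(grad)] = 1.0
    return s

def proj_ball(v, radius=1.0):
    # Euclidean projection onto the ball Y of the given radius.
    n = np.linalg.norm(v)
    return v if n <= radius else v * (radius / n)

x = np.full(d_x, 1.0 / d_x)   # start at the simplex center
y = np.zeros(d_y)
th = np.zeros(d_th)
eta, sigma, gamma = 0.05, 0.1, 0.1  # illustrative step sizes

for k in range(2000):
    # (1) Nested approximation: one gradient step tracking theta*(x).
    th = th - gamma * ((th - A @ x) + mu_g * th)
    # (2) Regularized projected ascent in y; the -mu_reg * y term is the
    #     regularization added for the merely concave case.
    y = proj_ball(y + sigma * (B @ x + C @ th - mu_reg * y))
    # (3) One-sided projection-free step in x: LMO plus a convex
    #     combination keeps x in the simplex without any projection.
    s = lmo_simplex(B.T @ y)
    x = x + eta * (s - x)

print("x:", np.round(x, 3))
print("Frank-Wolfe gap:", float((B.T @ y) @ (x - lmo_simplex(B.T @ y))))

The Frank-Wolfe gap printed at the end is one common surrogate for stationarity of the projection-free block; the paper's \(\epsilon\)-stationarity measure for the full saddle point problem is more involved.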