A linearly convergent doubly stochastic Gauss–Seidel algorithm for solving linear equations and a certain class of over-parameterized optimization problems

Bibliographic Details
Published in	Mathematical Programming, Vol. 176, no. 1-2, pp. 465–496
Main Authors	Razaviyayn, Meisam; Hong, Mingyi; Reyhanian, Navid; Luo, Zhi-Quan
Format	Journal Article
Language	English
Published	Berlin/Heidelberg: Springer Berlin Heidelberg, 01.07.2019; Springer Nature B.V.
Subjects	Algorithms | Calculus of Variations and Optimal Control; Optimization | Combinatorics | Convergence | Error detection | Feasibility | Full Length Paper | Iterative methods | Linear equations | Machine learning | Mathematical and Computational Physics | Mathematical Methods in Physics | Mathematics | Mathematics and Statistics | Mathematics of Computing | Numerical Analysis | Optimization | Parameterization | Theoretical | Gauss–Seidel algorithm | Linear systems of equations | Nonuniform block coordinate descent algorithm | 49xx Calculus of variations and optimal control; optimization | Over-parameterized optimization
ISSN	0025-5610 1436-4646
DOI	10.1007/s10107-019-01404-0

More Information
Summary:	Consider the classical problem of solving a general linear system of equations Ax = b. It is well known that the Gauss–Seidel (G–S) scheme, with or without successive over-relaxation, and many of its variants may not converge when A is neither diagonally dominant nor symmetric positive definite. Can we have a linearly convergent G–S type algorithm that works for any A? In this paper we answer this question affirmatively by proposing a doubly stochastic G–S algorithm that is provably linearly convergent (in the mean square error sense) for any feasible linear system of equations. The key to the algorithm design is a nonuniform doubly stochastic scheme for picking the equation and the variable in each update step, together with a stepsize rule. These techniques also generalize to certain iterative alternating projection algorithms for solving the linear feasibility problem Ax ≤ b with an arbitrary A, as well as to high-dimensional minimization problems for training over-parameterized models in machine learning. Our results demonstrate that a carefully designed randomization scheme can make an otherwise divergent G–S algorithm converge.
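The idea of randomly picking both an equation and a variable at each step can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `doubly_stochastic_gs`, the choice to sample entry (i, j) with probability proportional to a_ij^2, and the step size 1/n (which, for a dense nonsingular A, gives a decrease of the squared error in expectation). The paper's actual sampling probabilities and stepsize rule may differ.

```python
import random


def doubly_stochastic_gs(A, b, iters=5000, seed=0):
    """Illustrative sketch only: sample an (equation, variable) pair,
    then correct that one variable using that one equation's residual."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])

    # Candidate (equation i, variable j) pairs: the nonzero entries of A.
    entries = [(i, j) for i in range(m) for j in range(n) if A[i][j] != 0.0]
    # Assumed nonuniform sampling: probability proportional to a_ij^2.
    weights = [A[i][j] ** 2 for i, j in entries]
    # Assumed step size (not the paper's rule).
    alpha = 1.0 / n

    x = [0.0] * n
    for _ in range(iters):
        i, j = rng.choices(entries, weights=weights, k=1)[0]
        residual = b[i] - sum(A[i][k] * x[k] for k in range(n))
        # With alpha = 1 this step would satisfy equation i exactly;
        # the damped step moves x_j only part of the way.
        x[j] += alpha * residual / A[i][j]
    return x


# A small system whose matrix is neither diagonally dominant nor symmetric.
A = [[1.0, 3.0], [2.0, 1.0]]
x_true = [1.0, -2.0]
b = [sum(a * v for a, v in zip(row, x_true)) for row in A]

x = doubly_stochastic_gs(A, b)
err = max(abs(u - v) for u, v in zip(x, x_true))
print("max abs error:", err)
```

Each update touches a single coordinate and reads a single row, so the per-step cost is that of one row-vector product. The fixed seed only makes the run reproducible.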