Random-reshuffled SARAH does not need full gradient computations
| Published in | Optimization Letters Vol. 18, no. 3, pp. 727-749 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | Berlin/Heidelberg: Springer Berlin Heidelberg, 01.04.2024 |
| Subjects | |
| ISSN | 1862-4472, 1862-4480 |
| DOI | 10.1007/s11590-023-02081-x |
| Summary: | The StochAstic Recursive grAdient algoritHm (SARAH) is a variance-reduced variant of the Stochastic Gradient Descent algorithm that needs a full gradient of the objective function from time to time. In this paper, we remove the necessity of a full gradient computation. This is achieved by using a randomized reshuffling strategy and aggregating the stochastic gradients obtained in each epoch. The aggregated stochastic gradients serve as an estimate of the full gradient in the SARAH algorithm. We provide a theoretical analysis of the proposed approach and conclude the paper with numerical experiments that demonstrate its efficiency. |
|---|---|
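The summary describes the core mechanism: within each epoch the component indices are visited in a random permutation, the stochastic gradients computed along the way are aggregated, and that aggregate replaces the full-gradient pass that plain SARAH would otherwise perform at the start of the next epoch. The following is a minimal sketch of this idea for a least-squares objective; the function `shuffled_sarah`, the step size, and all variable names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def shuffled_sarah(A, b, w, step=0.02, epochs=30, rng=None):
    """Random-reshuffled SARAH-style loop for least squares
    f(w) = (1/2n) * ||A w - b||^2  (illustrative sketch, not the paper's code).

    Instead of recomputing a full gradient at the start of every epoch, the
    recursion is re-seeded with the average of the stochastic gradients
    aggregated during the previous epoch."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]

    def grad_i(x, i):
        # Stochastic gradient of the i-th component f_i(x) = 0.5 * (a_i @ x - b_i)^2.
        return A[i] * (A[i] @ x - b[i])

    # Only the very first epoch is seeded with a true full gradient;
    # later epochs reuse the aggregated stochastic gradients instead.
    v = A.T @ (A @ w - b) / n
    for _ in range(epochs):
        agg = np.zeros_like(w)        # aggregate of this epoch's stochastic gradients
        w_prev = w.copy()
        w = w - step * v              # first step of the epoch uses the seed estimate
        for i in rng.permutation(n):  # random reshuffling: each index used exactly once
            g_new = grad_i(w, i)
            agg += g_new / n
            v = g_new - grad_i(w_prev, i) + v   # SARAH recursive gradient estimate
            w_prev = w.copy()
            w = w - step * v
        v = agg                       # aggregated gradients replace the full-gradient pass
    return w


# Small usage example on a synthetic noiseless problem.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
w_star = rng.standard_normal(5)
b = A @ w_star
w_hat = shuffled_sarah(A, b, np.zeros(5), rng=rng)
print("distance to the true solution:", np.linalg.norm(w_hat - w_star))
```

Note that this sketch keeps the standard SARAH recursion for the inner loop and only changes how each epoch is seeded, which is the modification the summary attributes to the paper; step size and epoch count are arbitrary choices for the toy problem.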