Randomized Greedy Algorithms for Neural Network Optimization in Solving Partial Differential Equations

Bibliographic Details
Published in: Journal of Scientific Computing, Vol. 105, no. 1, p. 26
Main Authors: Xu, Jinchao; Xu, Xiaofeng
Format: Journal Article
Language: English
Published: New York, Springer US, 01.10.2025
Springer Nature B.V.
ISSN: 0885-7474, 1573-7691
DOI: 10.1007/s10915-025-03050-5

Summary: Greedy algorithms have been successfully analyzed and applied in training neural networks for solving variational problems, with guaranteed convergence orders. In this paper, we extend the analysis of the orthogonal greedy algorithm (OGA) to convex optimization problems arising from the solution of partial differential equations and establish its optimal convergence rate. This result broadens the applicability of OGA by generalizing its optimal convergence rate from function approximation to convex optimization problems. We also address the practical applicability of greedy algorithms, which is limited by the significant computational cost of subproblems that involve an exhaustive search over a discrete dictionary. We propose the more practical approach of randomly discretizing the dictionary at each iteration of the greedy algorithm. We quantify the required size of the randomized discrete dictionary and prove that, with high probability, the proposed algorithm realizes a weak greedy algorithm and achieves optimal convergence orders. Through numerous numerical experiments on function approximation and on linear and nonlinear elliptic partial differential equations, we validate our analysis of the optimal convergence rate and demonstrate the advantage of randomized discrete dictionaries over a deterministic one, showing orders-of-magnitude reductions in the size of the discrete dictionary, particularly in higher dimensions.
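
The idea of redrawing the discrete dictionary at every greedy step can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal illustration for least-squares function approximation in one dimension, assuming a ReLU dictionary relu(w*x + b) with w in {-1, +1} and b drawn uniformly from [-1, 1]. The function name `randomized_oga`, the sampling ranges, and the candidate count are hypothetical choices made for this example.

```python
import numpy as np

def randomized_oga(f_vals, x, n_iters, n_candidates, rng):
    """Sketch of an orthogonal greedy algorithm with a randomly sampled dictionary.

    f_vals : samples of the target function on the grid x
    Returns the chosen (w, b) pairs and the RMS error after each iteration.
    All parameter ranges below are illustrative assumptions.
    """
    selected = []          # chosen dictionary elements evaluated on x
    params = []            # (w, b) of each chosen element
    errors = []
    residual = f_vals.copy()
    for _ in range(n_iters):
        # Fresh random discretization of the dictionary at this iteration.
        w = rng.choice([-1.0, 1.0], size=n_candidates)
        b = rng.uniform(-1.0, 1.0, size=n_candidates)
        G = np.maximum(x[:, None] * w[None, :] + b[None, :], 0.0)

        # Greedy step: pick the candidate best correlated with the residual,
        # normalizing each candidate and skipping identically zero ones.
        norms = np.linalg.norm(G, axis=0)
        keep = norms > 1e-12
        scores = np.zeros(n_candidates)
        scores[keep] = np.abs(G[:, keep].T @ residual) / norms[keep]
        best = int(np.argmax(scores))
        selected.append(G[:, best])
        params.append((w[best], b[best]))

        # Orthogonal step: project the target onto the span of all chosen elements.
        A = np.column_stack(selected)
        coef, *_ = np.linalg.lstsq(A, f_vals, rcond=None)
        residual = f_vals - A @ coef
        errors.append(float(np.sqrt(np.mean(residual ** 2))))
    return params, errors

# Example: approximate sin(2*pi*x) on [0, 1] with 15 neurons.
x = np.linspace(0.0, 1.0, 200)
target = np.sin(2.0 * np.pi * x)
params, errors = randomized_oga(target, x, n_iters=15, n_candidates=64,
                                rng=np.random.default_rng(0))
```

Because each iteration projects onto a growing span, the discrete error cannot increase from one step to the next (up to rounding). How many random candidates are needed for the selection to qualify as a weak greedy step is the question the paper quantifies, and this sketch makes no claim about that.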