The difficulty of computing stable and accurate neural networks: On the barriers of deep learning and Smale’s 18th problem

Bibliographic Details
Published in: Proceedings of the National Academy of Sciences (PNAS), Vol. 119, No. 12, pp. 1–10
Main Authors: Colbrook, Matthew J.; Antun, Vegard; Hansen, Anders C.
Format: Journal Article
Language: English
Published: United States: National Academy of Sciences, 22.03.2022
ISSN: 0027-8424, 1091-6490
DOI: 10.1073/pnas.2107151119


More Information
Summary: Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, current DL methods typically suffer from instability, even when universal approximation properties guarantee the existence of stable neural networks (NNs). We address this paradox by demonstrating basic well-conditioned problems in scientific computing where one can prove the existence of NNs with great approximation qualities; however, there does not exist any algorithm, even randomized, that can train (or compute) such a NN. For any positive integers K > 2 and L, there are cases where simultaneously 1) no randomized training algorithm can compute a NN correct to K digits with probability greater than 1/2; 2) there exists a deterministic training algorithm that computes a NN with K − 1 correct digits, but any such (even randomized) algorithm needs arbitrarily many training data; and 3) there exists a deterministic training algorithm that computes a NN with K − 2 correct digits using no more than L training samples. These results imply a classification theory describing conditions under which (stable) NNs with a given accuracy can be computed by an algorithm. We begin this theory by establishing sufficient conditions for the existence of algorithms that compute stable NNs in inverse problems. We introduce fast iterative restarted networks (FIRENETs), which we both prove and numerically verify are stable. Moreover, we prove that only 𝓞(|log(ϵ)|) layers are needed for an ϵ-accurate solution to the inverse problem.
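The 𝓞(|log(ϵ)|) depth bound follows the usual pattern for linearly convergent iterative schemes: if each layer (one restarted iteration) contracts the reconstruction error by a fixed factor, the number of layers needed to reach accuracy ϵ grows only logarithmically in 1/ϵ. The sketch below is an illustration of that counting argument, not the authors' FIRENET construction; the contraction factor `nu` and initial error `err0` are hypothetical placeholders.

```python
import math

def layers_needed(eps: float, nu: float = 0.5, err0: float = 1.0) -> int:
    """Layers required by a linearly convergent scheme to reach accuracy eps.

    If each layer contracts the error by a factor nu < 1, then after n layers
    the error is err0 * nu**n.  Requiring err0 * nu**n <= eps gives
    n >= log(err0 / eps) / log(1 / nu), i.e. n = O(|log(eps)|).
    """
    return math.ceil(math.log(err0 / eps) / math.log(1.0 / nu))

# Tightening the accuracy by orders of magnitude adds only linearly many layers:
for eps in (1e-2, 1e-4, 1e-8):
    print(f"eps = {eps:.0e} -> {layers_needed(eps)} layers")
```

Note how doubling the number of correct digits roughly doubles the layer count, which is the logarithmic scaling claimed in the summary.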
Edited by Ronald DeVore, Texas A&M University, College Station, TX; received April 16, 2021; accepted October 26, 2021
Author contributions: M.J.C., V.A., and A.C.H. designed research, performed research, and wrote the paper.
M.J.C. and V.A. contributed equally to this work.