DiffMoment: an adaptive optimization technique for convolutional neural network

Bibliographic Details
Published in: Applied Intelligence (Dordrecht, Netherlands), Vol. 53, No. 13, pp. 16844-16858
Main Authors: Bhakta, Shubhankar; Nandi, Utpal; Si, Tapas; Ghosal, Sudipta Kr; Changdar, Chiranjit; Pal, Rajat Kumar
Format: Journal Article
Language: English
Published: New York: Springer US, 01.07.2023 (Springer Nature B.V.)
ISSN: 0924-669X, 1573-7497
DOI: 10.1007/s10489-022-04382-7

Summary: Stochastic Gradient Descent (SGD) is a very popular basic optimizer applied in the learning algorithms of deep neural networks. However, it uses a fixed step size for every epoch without considering gradient behaviour to determine the step size. Improved SGD optimizers such as AdaGrad, Adam, AdaDelta, RAdam, and RMSProp make step sizes adaptive in every epoch. However, these optimizers depend on the square roots of exponential moving averages (EMA) of squared previous gradients, momentums, or both, and cannot exploit local changes in gradients or momentums. To reduce these limitations, a novel optimizer is presented in this paper in which the step size is adjusted for each parameter based on the changing information between the 1st and the 2nd moment estimates (i.e., diffMoment). The experimental results show that diffMoment offers better performance than the AdaGrad, Adam, AdaDelta, RAdam, and RMSProp optimizers. It is also noticed that diffMoment performs uniformly better for training Convolutional Neural Networks (CNNs) with different activation functions.
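
This record gives only the abstract, not the paper's update equations, so the following is a minimal Python/NumPy sketch of what a diffMoment-style step could look like as the abstract describes it: keep Adam's two EMA moment estimates and additionally modulate each parameter's step size by the local change in those estimates between consecutive iterations. The function name diffmoment_step, the state dictionary, and the exact modulation formula are illustrative assumptions, not the authors' published rule.

import numpy as np

def diffmoment_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Initialize optimizer state on the first call.
    if "m" not in state:
        state["m"] = np.zeros_like(param)
        state["v"] = np.zeros_like(param)
        state["t"] = 0
    state["t"] += 1
    t = state["t"]
    m_prev, v_prev = state["m"], state["v"]

    # Adam-style EMAs: 1st moment (gradient) and 2nd moment (squared gradient).
    m = beta1 * m_prev + (1 - beta1) * grad
    v = beta2 * v_prev + (1 - beta2) * grad ** 2

    # Local change in the moment estimates between consecutive steps --
    # the "changing information" the abstract refers to (hypothetical form).
    dm = np.abs(m - m_prev)
    dv = np.abs(v - v_prev)

    # Bias-corrected estimates, as in Adam.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # Hypothetical per-parameter modulation: a larger local change in the
    # moments yields a larger step for that parameter.
    scale = 1.0 + dm / (np.sqrt(dv) + eps)

    state["m"], state["v"] = m, v
    return param - lr * scale * m_hat / (np.sqrt(v_hat) + eps)

Usage, assuming the gradient comes from some forward/backward pass:

state = {}
w = np.array([1.0, -2.0])
g = np.array([0.1, -0.3])          # gradient for this step
w = diffmoment_step(w, g, state)   # one optimization step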