Guided parallelized stochastic gradient descent for delay compensation

Bibliographic Details
Published in Applied soft computing Vol. 102; p. 107084
Main Author Sharma, Anuraganand
Format Journal Article
Language English
Published Elsevier B.V 01.04.2021
Subjects
Online Access Get full text
ISSN 1568-4946
1872-9681
DOI 10.1016/j.asoc.2021.107084


Abstract The stochastic gradient descent (SGD) algorithm and its variations have been used effectively to optimize neural network models. However, with the rapid growth of big data and deep learning, SGD is no longer the most suitable choice because it optimizes the error function sequentially by nature. This has led to the development of parallel SGD algorithms, such as asynchronous SGD (ASGD) and synchronous SGD (SSGD), to train deep neural networks. Parallelization, however, introduces high variance due to the delay in parameter (weight) updates. We address this delay in our proposed algorithm and try to minimize its impact. We employ guided SGD (gSGD), which encourages consistent examples to steer the convergence by compensating for the unpredictable deviation caused by the delay. Its convergence rate is similar to that of A/SSGD; however, some additional (parallel) processing is required to compensate for the delay. The experimental results demonstrate that the proposed approach mitigates the impact of the delay on classification accuracy. The guided approach with SSGD clearly outperforms standard A/SSGD and, for some benchmark datasets, even achieves an accuracy close to that of sequential SGD.
•Its convergence rate of O(1/(ρT) + σ²) shows its applicability to real-time systems.
•The proposed method outperforms synchronous/asynchronous SGD.
•The proposed method is compatible with other variations of SGD, such as RMSprop.
•The delay in parameter updates arises because several gradients are computed in parallel.
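To make the delay-compensation idea in the abstract concrete, the following is a minimal sketch in Python of synchronous parallel SGD with a guided combination step. It is an illustration under stated assumptions, not the paper's gSGD algorithm: the linear least-squares model, the loss-based batch weighting, and the names predict, batch_gradient, guided_parallel_sgd, n_workers, and lr are all hypothetical. Only the overall pattern follows the abstract: several mini-batch gradients are computed against the same stale copy of the weights (the source of the delay), and the combination step gives more influence to the more consistent (lower-loss) batches to damp the delay-induced variance.

```python
import numpy as np

def predict(w, X):
    # linear model used purely for illustration
    return X @ w

def batch_gradient(w, X, y):
    # gradient of the mean-squared-error loss on one mini-batch
    residual = predict(w, X) - y
    return X.T @ residual / len(y)

def guided_parallel_sgd(X, y, n_workers=4, epochs=50, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    idx = np.arange(len(y))
    for _ in range(epochs):
        rng.shuffle(idx)
        # each round: one mini-batch per worker, all gradients computed
        # against the same stale copy of w (this is the update delay)
        for shard in np.array_split(idx, max(1, len(idx) // (n_workers * 32))):
            batches = np.array_split(shard, n_workers)
            grads, losses = [], []
            for b in batches:  # would run concurrently in a real SSGD setup
                if len(b) == 0:
                    continue
                grads.append(batch_gradient(w, X[b], y[b]))
                losses.append(np.mean((predict(w, X[b]) - y[b]) ** 2))
            if not grads:
                continue
            # "guided" weighting (assumption): lower-loss, more consistent
            # batches get more influence, damping delay-induced variance
            weights = 1.0 / (np.asarray(losses) + 1e-8)
            weights /= weights.sum()
            w -= lr * sum(wt * g for wt, g in zip(weights, grads))
    return w

if __name__ == "__main__":
    # small synthetic regression problem to exercise the sketch
    rng = np.random.default_rng(1)
    X = rng.normal(size=(512, 5))
    true_w = rng.normal(size=5)
    y = X @ true_w + 0.1 * rng.normal(size=512)
    w_hat = guided_parallel_sgd(X, y)
    print("recovered weights:", np.round(w_hat, 2))
    print("true weights:     ", np.round(true_w, 2))
```

Plain averaging of the worker gradients would recover ordinary SSGD; the loss-based reweighting is the hedged stand-in for the guidance that the paper describes.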
ArticleNumber 107084
Author Sharma, Anuraganand
Author_xml – sequence: 1
  givenname: Anuraganand
  surname: Sharma
  fullname: Sharma, Anuraganand
  email: sharma_au@usp.ac.fj
  organization: School of Computing, Information and Mathematical Sciences, The University of the South Pacific, Fiji
CitedBy_id crossref_primary_10_3390_pr9020339
crossref_primary_10_1016_j_bspc_2023_105450
crossref_primary_10_1109_JIOT_2024_3403178
crossref_primary_10_1007_s10489_024_05564_1
crossref_primary_10_29132_ijpas_1475183
crossref_primary_10_3390_math10173206
crossref_primary_10_3390_s21206872
crossref_primary_10_1016_j_energy_2024_131901
crossref_primary_10_38155_ksbd_1477120
crossref_primary_10_1016_j_vehcom_2022_100532
crossref_primary_10_1109_ACCESS_2022_3158977
crossref_primary_10_1109_ACCESS_2024_3465793
crossref_primary_10_1109_ACCESS_2023_3275086
crossref_primary_10_1111_tgis_12982
crossref_primary_10_1007_s40747_024_01705_8
crossref_primary_10_1016_j_ndteint_2023_102860
Cites_doi 10.1016/j.asoc.2018.09.038
10.1016/j.csbj.2014.11.005
10.1016/j.inffus.2004.04.004
10.1186/s13634-016-0355-x
10.1038/s41524-019-0221-0
10.1373/clinchem.2005.058339
ContentType Journal Article
Copyright 2021 Elsevier B.V.
Copyright_xml – notice: 2021 Elsevier B.V.
DBID AAYXX
CITATION
DOI 10.1016/j.asoc.2021.107084
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1872-9681
ExternalDocumentID 10_1016_j_asoc_2021_107084
S1568494621000077
ID FETCH-LOGICAL-c300t-59abd0b77de6fa64c075813becec864e62af6576aedbadd2e1cc407b06bc80253
IEDL.DBID AIKHN
ISSN 1568-4946
IngestDate Thu Apr 24 23:11:26 EDT 2025
Tue Jul 01 01:50:08 EDT 2025
Fri Feb 23 02:40:57 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Deep learning
Asynchronous/synchronous stochastic gradient descent
Stochastic gradient descent
Gradient Methods
Classification
Language English
LinkModel DirectLink
ParticipantIDs crossref_primary_10_1016_j_asoc_2021_107084
crossref_citationtrail_10_1016_j_asoc_2021_107084
elsevier_sciencedirect_doi_10_1016_j_asoc_2021_107084
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate April 2021
2021-04-00
PublicationDateYYYYMMDD 2021-04-01
PublicationDate_xml – month: 04
  year: 2021
  text: April 2021
PublicationDecade 2020
PublicationTitle Applied soft computing
PublicationYear 2021
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
References Moreira, Fiesler (b19) 1995
Kingma, Ba (b30) 2014
Low, Gonzalez, Kyrola, Bickson, Guestrin, Hellerstein (b5) 2010
Bubeck (b24) 2015
Kourou, Exarchos, Exarchos, Karamouzis, Fotiadis (b1) 2015; 13
Zinkevich, Weimer, Li, Smola (b14) 2010
Sharma (b27) 2018; 73
Solberg, Lahti (b34) 2005; 51
Zhang, Xiao (b11) 2015
Qiu, Wu, Ding, Xu, Feng (b2) 2016; 2016
(b4) 2019
Brown, Wyatt, Harris, Yao (b7) 2005; 6
Dheeru, Karra Taniskidou (b31) 2019
Schmidt, Marques, Botti, Marques (b3) 2019; 5
Michalewicz, Fogel (b21) 2004
Zeiler (b28) 2012
Stich, Cordonnier, Jaggi (b16) 2018
Chu, Kim, Lin, Yu, Bradski, Olukotun, Ng (b6) 2007
Duchi, Hazan, Singer (b29) 2011; 12
Dean, Corrado, Monga, Chen, Devin, Le, Mao, Ranzato, Senior, Tucker, Yang, Ng (b8) 2012
Lacoste-Julien, Schmidt, Bach (b23) 2012
Alistarh, De Sa, Konstantinov (b25) 2018
Lei, Hu, Li, Tang (b13) 2019
S. Zheng, Q. Meng, T. Wang, W. Chen, N. Yu, Z.-M. Ma, T.-Y. Liu, Asynchronous stochastic gradient descent with delay compensation, in: International Conference on Machine Learning, 2017, pp. 4120–4129.
Bishop (b17) 1995
Agarwal, Duchi (b15) 2011
Frank, Mark A. Hall, Ian H. Witten (b32) 2016
Sridhar (b22) 2015
Wang, Roosta, Xu, Mahoney (b12) 2018
Anton, Bivens, Davis (b18) 2012
J. Laurikkala, M. Juhola, E. Kentala, Informal identification of outliers in medical data, in: Fifth International Workshop on Intelligent Data Analysis in Medicine and Pharmacology, Berlin, Germany, 2000, pp. 20–24.
Meka (b26) 2019
Crane, Roosta (b10) 2019
C.-C. Yu, B.-D. Liu, A backpropagation algorithm with adaptive learning rate and momentum coefficient, in: Proceedings of the 2002 International Joint Conference on Neural Networks, 2002. IJCNN ’02, 2002, pp. 1218–1223.
Zinkevich (10.1016/j.asoc.2021.107084_b14) 2010
Dheeru (10.1016/j.asoc.2021.107084_b31) 2019
Sridhar (10.1016/j.asoc.2021.107084_b22) 2015
Kingma (10.1016/j.asoc.2021.107084_b30) 2014
Stich (10.1016/j.asoc.2021.107084_b16) 2018
Qiu (10.1016/j.asoc.2021.107084_b2) 2016; 2016
10.1016/j.asoc.2021.107084_b20
Michalewicz (10.1016/j.asoc.2021.107084_b21) 2004
10.1016/j.asoc.2021.107084_b33
Zeiler (10.1016/j.asoc.2021.107084_b28) 2012
Wang (10.1016/j.asoc.2021.107084_b12) 2018
Meka (10.1016/j.asoc.2021.107084_b26) 2019
Low (10.1016/j.asoc.2021.107084_b5) 2010
Crane (10.1016/j.asoc.2021.107084_b10) 2019
Agarwal (10.1016/j.asoc.2021.107084_b15) 2011
Duchi (10.1016/j.asoc.2021.107084_b29) 2011; 12
Bishop (10.1016/j.asoc.2021.107084_b17) 1995
Bubeck (10.1016/j.asoc.2021.107084_b24) 2015
Lacoste-Julien (10.1016/j.asoc.2021.107084_b23) 2012
Zhang (10.1016/j.asoc.2021.107084_b11) 2015
Kourou (10.1016/j.asoc.2021.107084_b1) 2015; 13
Chu (10.1016/j.asoc.2021.107084_b6) 2007
Solberg (10.1016/j.asoc.2021.107084_b34) 2005; 51
Frank (10.1016/j.asoc.2021.107084_b32) 2016
10.1016/j.asoc.2021.107084_b9
(10.1016/j.asoc.2021.107084_b4) 2019
Moreira (10.1016/j.asoc.2021.107084_b19) 1995
Sharma (10.1016/j.asoc.2021.107084_b27) 2018; 73
Schmidt (10.1016/j.asoc.2021.107084_b3) 2019; 5
Dean (10.1016/j.asoc.2021.107084_b8) 2012
Anton (10.1016/j.asoc.2021.107084_b18) 2012
Alistarh (10.1016/j.asoc.2021.107084_b25) 2018
Lei (10.1016/j.asoc.2021.107084_b13) 2019
Brown (10.1016/j.asoc.2021.107084_b7) 2005; 6
References_xml – year: 2019
  ident: b4
  article-title: Mapreduce - an overview, sciencedirect topics [WWW document]
– start-page: 906
  year: 2012
  end-page: 999
  ident: b18
  article-title: Partial derivatives
  publication-title: Calculus, Early Transcendentals
– start-page: 1223
  year: 2012
  end-page: 1231
  ident: b8
  article-title: Large scale distributed deep networks
  publication-title: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, NIPS’12
– volume: 5
  start-page: 1
  year: 2019
  end-page: 36
  ident: b3
  article-title: Recent advances and applications of machine learning in solid-state materials science
  publication-title: npj Comput. Mater.
– year: 2015
  ident: b24
  article-title: Convex optimization: Algorithms and complexity
– year: 2014
  ident: b30
  article-title: Adam: A method for stochastic optimization
– volume: 2016
  start-page: 67
  year: 2016
  ident: b2
  article-title: A survey of machine learning for big data processing
  publication-title: EURASIP J. Adv. Signal Process.
– year: 2012
  ident: b28
  article-title: ADADELTA: An adaptive learning rate method
– year: 2016
  ident: b32
  article-title: The WEKA workbench
  publication-title: Online Appendix for Data Mining: Practical Machine Learning Tools and Techniques
– start-page: 281
  year: 2007
  end-page: 288
  ident: b6
  article-title: Map-reduce for machine learning on multicore
  publication-title: Advances in Neural Information Processing Systems 19
– start-page: 4452
  year: 2018
  end-page: 4463
  ident: b16
  article-title: Sparsified SGD with memory
  publication-title: Proceedings of the 32Nd International Conference on Neural Information Processing Systems, NIPS’18
– year: 2019
  ident: b26
  article-title: CS289ML: Notes on Convergence of Gradient Descent
– year: 1995
  ident: b19
  article-title: Neural Networks with Adaptive Learning Rate and Momentum Terms
– year: 2019
  ident: b31
  article-title: UCI Machine Learning Repository
– reference: J. Laurikkala, M. Juhola, E. Kentala, Informal identification of outliers in medical data, in: Fifth International Workshop on Intelligent Data Analysis in Medicine and Pharmacology, Berlin, Germany, 2000, pp. 20–24.
– start-page: 1
  year: 2019
  end-page: 7
  ident: b13
  article-title: Stochastic gradient descent for nonconvex learning without bounded gradient assumptions
  publication-title: IEEE Trans. Neural Netw. Learn. Syst.
– volume: 51
  start-page: 2326
  year: 2005
  end-page: 2332
  ident: b34
  article-title: Detection of outliers in reference distributions: Performance of Horn’s algorithm
  publication-title: Clin. Chem.
– volume: 73
  start-page: 1068
  year: 2018
  end-page: 1080
  ident: b27
  article-title: Guided stochastic gradient descent algorithm for inconsistent datasets
  publication-title: Appl. Soft Comput.
– start-page: 2332
  year: 2018
  end-page: 2342
  ident: b12
  article-title: GIANT: Globally improved approximate Newton method for distributed optimization
  publication-title: Advances in Neural Information Processing Systems 31
– reference: C.-C. Yu, B.-D. Liu, A backpropagation algorithm with adaptive learning rate and momentum coefficient, in: Proceedings of the 2002 International Joint Conference on Neural Networks, 2002. IJCNN ’02, 2002, pp. 1218–1223.
– volume: 12
  start-page: 2121
  year: 2011
  end-page: 2159
  ident: b29
  article-title: Adaptive subgradient methods for online learning and stochastic optimization
  publication-title: J. Mach. Learn. Res.
– reference: S. Zheng, Q. Meng, T. Wang, W. Chen, N. Yu, Z.-M. Ma, T.-Y. Liu, Asynchronous stochastic gradient descent with delay compensation, in: International Conference on Machine Learning, 2017, pp. 4120–4129.
– year: 2015
  ident: b22
  article-title: Parallel Machine Learning with Hogwild!
– volume: 13
  start-page: 8
  year: 2015
  end-page: 17
  ident: b1
  article-title: Machine learning applications in cancer prognosis and prediction
  publication-title: Comput. Struct. Biotechnol. J.
– year: 2012
  ident: b23
  article-title: A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
– year: 1995
  ident: b17
  article-title: Neural Networks for Pattern Recognition
– year: 2004
  ident: b21
  article-title: How to Solve It: Modern Heuristics
– start-page: 169
  year: 2018
  end-page: 178
  ident: b25
  article-title: The convergence of stochastic gradient descent in asynchronous shared memory
  publication-title: Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, PODC ’18
– start-page: 873
  year: 2011
  end-page: 881
  ident: b15
  article-title: Distributed delayed stochastic optimization
  publication-title: Advances in Neural Information Processing Systems 24
– start-page: 2595
  year: 2010
  end-page: 2603
  ident: b14
  article-title: Parallelized stochastic gradient descent
  publication-title: Advances in Neural Information Processing Systems 23
– volume: 6
  start-page: 5
  year: 2005
  end-page: 20
  ident: b7
  article-title: Diversity creation methods: A survey and categorisation
  publication-title: Inf. Fusion
– start-page: 9498
  year: 2019
  end-page: 9508
  ident: b10
  article-title: DINGO: Distributed Newton-type method for gradient-norm optimization
  publication-title: Advances in Neural Information Processing Systems 32
– year: 2010
  ident: b5
  article-title: GraphLab: A new framework for parallel machine learning
– start-page: 362
  year: 2015
  end-page: 370
  ident: b11
  article-title: DiSCO: Distributed optimization for self-concordant empirical loss
  publication-title: Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, ICML’15
– start-page: 1
  year: 2019
  ident: 10.1016/j.asoc.2021.107084_b13
  article-title: Stochastic gradient descent for nonconvex learning without bounded gradient assumptions
  publication-title: IEEE Trans. Neural Netw. Learn. Syst.
– ident: 10.1016/j.asoc.2021.107084_b20
– year: 2019
  ident: 10.1016/j.asoc.2021.107084_b26
– start-page: 906
  year: 2012
  ident: 10.1016/j.asoc.2021.107084_b18
  article-title: Partial derivatives
– volume: 73
  start-page: 1068
  year: 2018
  ident: 10.1016/j.asoc.2021.107084_b27
  article-title: Guided stochastic gradient descent algorithm for inconsistent datasets
  publication-title: Appl. Soft Comput.
  doi: 10.1016/j.asoc.2018.09.038
– volume: 13
  start-page: 8
  year: 2015
  ident: 10.1016/j.asoc.2021.107084_b1
  article-title: Machine learning applications in cancer prognosis and prediction
  publication-title: Comput. Struct. Biotechnol. J.
  doi: 10.1016/j.csbj.2014.11.005
– start-page: 9498
  year: 2019
  ident: 10.1016/j.asoc.2021.107084_b10
  article-title: DINGO: Distributed Newton-type method for gradient-norm optimization
– year: 1995
  ident: 10.1016/j.asoc.2021.107084_b17
– ident: 10.1016/j.asoc.2021.107084_b9
– year: 2004
  ident: 10.1016/j.asoc.2021.107084_b21
– year: 2015
  ident: 10.1016/j.asoc.2021.107084_b24
– start-page: 4452
  year: 2018
  ident: 10.1016/j.asoc.2021.107084_b16
  article-title: Sparsified SGD with memory
– start-page: 169
  year: 2018
  ident: 10.1016/j.asoc.2021.107084_b25
  article-title: The convergence of stochastic gradient descent in asynchronous shared memory
– start-page: 1223
  year: 2012
  ident: 10.1016/j.asoc.2021.107084_b8
  article-title: Large scale distributed deep networks
– start-page: 362
  year: 2015
  ident: 10.1016/j.asoc.2021.107084_b11
  article-title: DiSCO: Distributed optimization for self-concordant empirical loss
– year: 1995
  ident: 10.1016/j.asoc.2021.107084_b19
– year: 2016
  ident: 10.1016/j.asoc.2021.107084_b32
  article-title: The WEKA workbench
– year: 2019
  ident: 10.1016/j.asoc.2021.107084_b4
– volume: 6
  start-page: 5
  year: 2005
  ident: 10.1016/j.asoc.2021.107084_b7
  article-title: Diversity creation methods: A survey and categorisation
  publication-title: Inf. Fusion
  doi: 10.1016/j.inffus.2004.04.004
– start-page: 2595
  year: 2010
  ident: 10.1016/j.asoc.2021.107084_b14
  article-title: Parallelized stochastic gradient descent
– volume: 2016
  start-page: 67
  year: 2016
  ident: 10.1016/j.asoc.2021.107084_b2
  article-title: A survey of machine learning for big data processing
  publication-title: EURASIP J. Adv. Signal Process.
  doi: 10.1186/s13634-016-0355-x
– volume: 12
  start-page: 2121
  year: 2011
  ident: 10.1016/j.asoc.2021.107084_b29
  article-title: Adaptive subgradient methods for online learning and stochastic optimization
  publication-title: J. Mach. Learn. Res.
– volume: 5
  start-page: 1
  year: 2019
  ident: 10.1016/j.asoc.2021.107084_b3
  article-title: Recent advances and applications of machine learning in solid-state materials science
  publication-title: npj Comput. Mater.
  doi: 10.1038/s41524-019-0221-0
– start-page: 2332
  year: 2018
  ident: 10.1016/j.asoc.2021.107084_b12
  article-title: GIANT: Globally improved approximate Newton method for distributed optimization
– year: 2014
  ident: 10.1016/j.asoc.2021.107084_b30
– volume: 51
  start-page: 2326
  year: 2005
  ident: 10.1016/j.asoc.2021.107084_b34
  article-title: Detection of outliers in reference distributions: Performance of Horn’s algorithm
  publication-title: Clin. Chem.
  doi: 10.1373/clinchem.2005.058339
– start-page: 281
  year: 2007
  ident: 10.1016/j.asoc.2021.107084_b6
  article-title: Map-reduce for machine learning on multicore
– year: 2015
  ident: 10.1016/j.asoc.2021.107084_b22
– ident: 10.1016/j.asoc.2021.107084_b33
– year: 2012
  ident: 10.1016/j.asoc.2021.107084_b23
– year: 2012
  ident: 10.1016/j.asoc.2021.107084_b28
– year: 2010
  ident: 10.1016/j.asoc.2021.107084_b5
– year: 2019
  ident: 10.1016/j.asoc.2021.107084_b31
– start-page: 873
  year: 2011
  ident: 10.1016/j.asoc.2021.107084_b15
  article-title: Distributed delayed stochastic optimization
SSID ssj0016928
Score 2.4085574
Snippet Stochastic gradient descent (SGD) algorithm and its variations have been effectively used to optimize neural network models. However, with the rapid growth of...
SourceID crossref
elsevier
SourceType Enrichment Source
Index Database
Publisher
StartPage 107084
SubjectTerms Asynchronous/synchronous stochastic gradient descent
Classification
Deep learning
Gradient Methods
Stochastic gradient descent
Title Guided parallelized stochastic gradient descent for delay compensation
URI https://dx.doi.org/10.1016/j.asoc.2021.107084
Volume 102
hasFullText 1
inHoldings 1
isFullTextHit
isPrint