A statistical model of neural network learning via the Cramer–Rao lower bound
| Published in | Journal of the Korean Statistical Society, Vol. 50, No. 3, pp. 756–772 |
|---|---|
| Main Authors | Kim, Tae Yoon; Park, Inho |
| Format | Journal Article |
| Language | English |
| Published | Singapore: Springer Singapore / Korean Statistical Society (한국통계학회), 01.09.2021 |
| Subjects | Statistics |
| Online Access | Get full text |
| ISSN | 1226-3192 (print), 2005-2863 (electronic) |
| DOI | 10.1007/s42952-021-00122-8 |
| Abstract | Neural networks (NNs) remain black boxes despite their many success stories, chiefly because, whenever their serendipitous successes appear, they offer only the complex structure of the underlying network together with a huge validation data set. In this paper, we propose a statistical NN learning model, tied to the concept of the universal Turing computer, for regression prediction. Based on this model, we define "statistically successful NN (SSNN) learning," mainly by calculating the well-known Cramer–Rao lower bound for the averaged square error (ASE) of NN learning. Using this formal definition, we propose an ASE-based NN learning (ANL) algorithm. The ANL algorithm not only attains the Cramer–Rao lower bound but also provides an effective way to map the complicated geometry of the ASE over the hyper-parameter space of the NN, which frees the ANL from the need for a huge validation data set. A simple numerical simulation and a real data analysis are carried out to evaluate the performance of the ANL and to show how to implement it. |
|---|---|
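The abstract's two central quantities, the Cramer–Rao lower bound and the averaged square error (ASE), can be illustrated in the simplest possible setting. The sketch below uses linear-Gaussian regression, where ordinary least squares attains the bound; this is an illustrative assumption only, not the paper's NN model, and all variable names are invented for the example.

```python
import numpy as np

# Illustrative sketch (linear-Gaussian model, not the paper's NN setting):
# y = X @ theta + eps, eps ~ N(0, sigma^2 I). For unbiased estimators of
# theta, the Cramer-Rao lower bound on the covariance is the inverse
# Fisher information, I(theta)^{-1} = sigma^2 (X^T X)^{-1}.
rng = np.random.default_rng(0)
n, p, sigma = 200, 3, 0.5
X = rng.normal(size=(n, p))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + rng.normal(scale=sigma, size=n)

fisher = X.T @ X / sigma**2          # Fisher information matrix
crlb = np.linalg.inv(fisher)         # Cramer-Rao lower bound on Cov(theta_hat)

theta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS attains the bound here

# Averaged square error (ASE) of the fitted predictor over the sample
ase = np.mean((X @ theta_hat - y) ** 2)
```

In the paper the analogous bound is computed for NN regression, where the Fisher information involves the network's gradients with respect to its weights; the linear case above only shows how the bound and the ASE fit together.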
| KCI Citation Count | 0 |
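The abstract also credits the ANL algorithm with mapping the geometry of the ASE over the NN's hyper-parameter space in place of a large validation set. The generic idea, training a network at each grid point and recording the training ASE, can be sketched as follows; this is a hedged illustration, not the paper's ANL algorithm, and the one-hidden-layer network, training loop, and grid are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_ase(X, y, hidden, epochs=2000, lr=0.05):
    """Fit a one-hidden-layer tanh network by gradient descent on the
    ASE = mean squared residual, and return the final training ASE.
    (Hypothetical helper for illustration; not from the paper.)"""
    n, p = X.shape
    W1 = rng.normal(scale=0.5, size=(p, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # hidden activations, (n, hidden)
        r = H @ w2 + b2 - y                # residuals
        # gradients of ASE = mean(r^2) via backpropagation
        gw2 = 2 * H.T @ r / n
        gb2 = 2 * r.mean()
        gH = np.outer(r, w2) * (1 - H**2) * 2 / n
        gW1 = X.T @ gH
        gb1 = gH.sum(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        w2 -= lr * gw2; b2 -= lr * gb2
    return np.mean((np.tanh(X @ W1 + b1) @ w2 + b2 - y) ** 2)

# Toy regression target: noisy sine curve
X = rng.uniform(-2, 2, size=(100, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=100)

# ASE surface along one hyper-parameter axis (number of hidden units)
ase_by_hidden = {h: train_ase(X, y, h) for h in (1, 2, 4, 8)}
```

The dictionary traces the ASE along a single hyper-parameter axis; the record's abstract suggests the ANL does this mapping in a way that also exploits the Cramer–Rao bound, which this sketch does not attempt.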
| Author | Kim, Tae Yoon; Park, Inho |
| Author details | Kim, Tae Yoon (Department of Statistics, Keimyung University; ORCID 0000-0002-9869-8632; tykim@kmu.ac.kr); Park, Inho (Department of Statistics, Pukyong National University) |
| BackLink | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002762876 (access content in the National Research Foundation of Korea, NRF) |
| ContentType | Journal Article |
| Copyright | Korean Statistical Society 2021 |
| DOI | 10.1007/s42952-021-00122-8 |
| Discipline | Statistics |
| EISSN | 2005-2863 |
| EndPage | 772 |
| ISSN | 1226-3192 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Keywords | Validation; Cramer–Rao lower bound; ANL algorithm; Statistical NN learning model; Statistically successful NN learning; MSC: 68T05, 68Q32 |
| Language | English |
| ORCID | 0000-0002-9869-8632 |
| PageCount | 17 |
| PublicationDate | 2021-09-01 |
| PublicationPlace | Singapore |
| PublicationTitle | Journal of the Korean Statistical Society |
| PublicationTitleAbbrev | J. Korean Stat. Soc |
| PublicationYear | 2021 |
| Publisher | Springer Singapore; Korean Statistical Society (한국통계학회) |
| References | – Bergmeir, C., & Benítez, J. M. (2012). Neural networks in R using the Stuttgart Neural Network Simulator: RSNNS. Journal of Statistical Software, 42, 1–26.
– Delvenne, J.-C. (2009). What is a universal computing machine? Applied Mathematics and Computation, 4, 1368–1374. doi:10.1016/j.amc.2009.04.057
– Günther, F., & Fritsch, S. (2010). neuralnet: Training of neural networks. The R Journal, 2, 30–38. doi:10.32614/RJ-2010-006
– Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359–366. doi:10.1016/0893-6080(89)90020-8
– James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer. doi:10.1007/978-1-4614-7138-7
– Krogh, A., & Hertz, J. (1992). A simple weight decay can improve generalization. Advances in Neural Information Processing Systems, 4, 950–957.
– Liao, Z., Drummond, T., Reid, I., & Carneiro, G. (2020). Approximate Fisher information matrix to characterise the training of deep neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 15–26. doi:10.1109/TPAMI.2018.2876413
– Shao, J. (1998). Mathematical Statistics. Springer.
– Siegelmann, H. T., & Sontag, E. D. (1995). On the computational power of neural nets. Journal of Computer and System Science, 50, 132–150. doi:10.1006/jcss.1995.1013
– Venables, W. N., & Ripley, B. D. (2002). Modern Applied Statistics with S. Springer. doi:10.1007/978-0-387-21706-2
– [CR8: no bibliographic details given in this record] |
| StartPage | 756 |
| SubjectTerms | Applied Statistics; Bayesian Inference; Mathematics and Statistics; Research Article; Statistical Theory and Methods; Statistics; Statistics and Computing / Statistics Programs; 통계학 (Statistics) |
| Title | A statistical model of neural network learning via the Cramer–Rao lower bound |
| URI | https://link.springer.com/article/10.1007/s42952-021-00122-8 https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002762876 |
| Volume | 50 |