A statistical model of neural network learning via the Cramer–Rao lower bound
| Published in | Journal of the Korean Statistical Society, Vol. 50, No. 3, pp. 756–772 |
|---|---|
| Main Authors | Kim, Tae Yoon; Park, Inho |
| Format | Journal Article |
| Language | English |
| Published | Singapore: Springer Singapore / Korean Statistical Society (한국통계학회), 01.09.2021 |
| Subjects | Statistics |
| Online Access | Get full text |
| ISSN | 1226-3192 (print), 2005-2863 (electronic) |
| DOI | 10.1007/s42952-021-00122-8 |
| Abstract | Neural networks (NNs) remain black boxes despite their many success stories, chiefly because, whenever their serendipitous successes appear, they offer only the complex structure of the underlying network together with a huge validation data set. In this paper, we propose a statistical NN learning model, tied to the concept of the universal Turing computer, for regression prediction. Based on this model, we define "statistically successful NN (SSNN) learning," mainly by calculating the well-known Cramer–Rao lower bound for the averaged square error (ASE) of NN learning. Using this formal definition, we propose an ASE-based NN learning (ANL) algorithm. The ANL algorithm not only attains the Cramer–Rao lower bound but also provides an effective way to map the complicated geometry of the ASE over the hyper-parameter space of the NN, which frees the ANL from the need for a huge validation data set. A simple numerical simulation and a real data analysis are carried out to evaluate the performance of the ANL and to show how to implement it. |
|---|---|
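The abstract's two central quantities, the Cramer–Rao lower bound and the averaged square error (ASE), can be illustrated in the simplest possible setting. The sketch below uses linear-Gaussian regression, where ordinary least squares attains the bound; this is an illustrative assumption only, not the paper's NN model, and all variable names are invented for the example.

```python
import numpy as np

# Illustrative sketch (linear-Gaussian model, not the paper's NN setting):
# y = X @ theta + eps, eps ~ N(0, sigma^2 I). For unbiased estimators of
# theta, the Cramer-Rao lower bound on the covariance is the inverse
# Fisher information, I(theta)^{-1} = sigma^2 (X^T X)^{-1}.
rng = np.random.default_rng(0)
n, p, sigma = 200, 3, 0.5
X = rng.normal(size=(n, p))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + rng.normal(scale=sigma, size=n)

fisher = X.T @ X / sigma**2          # Fisher information matrix
crlb = np.linalg.inv(fisher)         # Cramer-Rao lower bound on Cov(theta_hat)

theta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS attains the bound here

# Averaged square error (ASE) of the fitted predictor over the sample
ase = np.mean((X @ theta_hat - y) ** 2)
```

In the paper the analogous bound is computed for NN regression, where the Fisher information involves the network's gradients with respect to its weights; the linear case above only shows how the bound and the ASE fit together.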
| KCI Citation Count | 0 |
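The abstract also credits the ANL algorithm with mapping the geometry of the ASE over the NN's hyper-parameter space in place of a large validation set. The generic idea, training a network at each grid point and recording the training ASE, can be sketched as follows; this is a hedged illustration, not the paper's ANL algorithm, and the one-hidden-layer network, training loop, and grid are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_ase(X, y, hidden, epochs=2000, lr=0.05):
    """Fit a one-hidden-layer tanh network by gradient descent on the
    ASE = mean squared residual, and return the final training ASE.
    (Hypothetical helper for illustration; not from the paper.)"""
    n, p = X.shape
    W1 = rng.normal(scale=0.5, size=(p, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # hidden activations, (n, hidden)
        r = H @ w2 + b2 - y                # residuals
        # gradients of ASE = mean(r^2) via backpropagation
        gw2 = 2 * H.T @ r / n
        gb2 = 2 * r.mean()
        gH = np.outer(r, w2) * (1 - H**2) * 2 / n
        gW1 = X.T @ gH
        gb1 = gH.sum(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        w2 -= lr * gw2; b2 -= lr * gb2
    return np.mean((np.tanh(X @ W1 + b1) @ w2 + b2 - y) ** 2)

# Toy regression target: noisy sine curve
X = rng.uniform(-2, 2, size=(100, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=100)

# ASE surface along one hyper-parameter axis (number of hidden units)
ase_by_hidden = {h: train_ase(X, y, h) for h in (1, 2, 4, 8)}
```

The dictionary traces the ASE along a single hyper-parameter axis; the record's abstract suggests the ANL does this mapping in a way that also exploits the Cramer–Rao bound, which this sketch does not attempt.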
| Author | Kim, Tae Yoon; Park, Inho |
| Author details | Kim, Tae Yoon (Department of Statistics, Keimyung University; ORCID 0000-0002-9869-8632; tykim@kmu.ac.kr); Park, Inho (Department of Statistics, Pukyong National University) |
| BackLink | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002762876 (access content in the National Research Foundation of Korea, NRF) |
| ContentType | Journal Article |
| Copyright | Korean Statistical Society 2021 |
| DOI | 10.1007/s42952-021-00122-8 |
| Discipline | Statistics |
| EISSN | 2005-2863 |
| EndPage | 772 |
| ISSN | 1226-3192 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Keywords | Validation; Cramer–Rao lower bound; ANL algorithm; Statistical NN learning model; Statistically successful NN learning; MSC: 68T05, 68Q32 |
| Language | English |
| ORCID | 0000-0002-9869-8632 |
| PageCount | 17 |
| PublicationDate | 2021-09-01 |
| PublicationPlace | Singapore |
| PublicationTitle | Journal of the Korean Statistical Society |
| PublicationTitleAbbrev | J. Korean Stat. Soc |
| PublicationYear | 2021 |
| Publisher | Springer Singapore; Korean Statistical Society (한국통계학회) |
| References | – Bergmeir, C., & Benítez, J. M. (2012). Neural networks in R using the Stuttgart Neural Network Simulator: RSNNS. Journal of Statistical Software, 42, 1–26.
– Delvenne, J.-C. (2009). What is a universal computing machine? Applied Mathematics and Computation, 4, 1368–1374. doi:10.1016/j.amc.2009.04.057
– Günther, F., & Fritsch, S. (2010). neuralnet: Training of neural networks. The R Journal, 2, 30–38. doi:10.32614/RJ-2010-006
– Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359–366. doi:10.1016/0893-6080(89)90020-8
– James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer. doi:10.1007/978-1-4614-7138-7
– Krogh, A., & Hertz, J. (1992). A simple weight decay can improve generalization. Advances in Neural Information Processing Systems, 4, 950–957.
– Liao, Z., Drummond, T., Reid, I., & Carneiro, G. (2020). Approximate Fisher information matrix to characterise the training of deep neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 15–26. doi:10.1109/TPAMI.2018.2876413
– Shao, J. (1998). Mathematical Statistics. Springer.
– Siegelmann, H. T., & Sontag, E. D. (1995). On the computational power of neural nets. Journal of Computer and System Science, 50, 132–150. doi:10.1006/jcss.1995.1013
– Venables, W. N., & Ripley, B. D. (2002). Modern Applied Statistics with S. Springer. doi:10.1007/978-0-387-21706-2
– [CR8: no bibliographic details given in this record] |
| StartPage | 756 |
| SubjectTerms | Applied Statistics; Bayesian Inference; Mathematics and Statistics; Research Article; Statistical Theory and Methods; Statistics; Statistics and Computing / Statistics Programs; 통계학 (Statistics) |
| Title | A statistical model of neural network learning via the Cramer–Rao lower bound |
| URI | https://link.springer.com/article/10.1007/s42952-021-00122-8 https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002762876 |
| Volume | 50 |