Information geometry of the EM and em algorithms for neural networks
To realize an input-output relation given by noise-contaminated examples, it is effective to use a stochastic model of neural networks. When the model network includes hidden units whose activation values are neither specified nor observed, it is useful to estimate the hidden variables from the observed or specified input-output data on the basis of the stochastic model. Two algorithms, the EM and em algorithms, have been proposed for this purpose. The EM algorithm is an iterative statistical technique based on the conditional expectation, while the em algorithm is a geometrical one given by information geometry: it iteratively minimizes the Kullback-Leibler divergence in the manifold of neural networks. The two algorithms are equivalent in most cases. The present paper gives a unified information-geometrical framework for studying stochastic models of neural networks, focusing on the EM and em algorithms, and proves a condition that guarantees their equivalence. Examples include: (1) the stochastic multilayer perceptron, (2) mixtures of experts, and (3) the normal mixture model.
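To make the alternating structure described in the abstract concrete, the sketch below runs the classical EM iteration for a two-component one-dimensional normal mixture, the kind of model listed as example (3), where the EM and em algorithms typically coincide. This is a minimal illustrative sketch, not code from the paper: the function name `em_normal_mixture`, the initialization heuristic, and the synthetic data are assumptions made here for demonstration. The e-step computes the conditional expectation of the hidden component labels (the e-projection onto the data manifold); the m-step re-fits the parameters from those expected labels (the m-projection onto the model manifold), and each full iteration decreases the Kullback-Leibler divergence between data and model.

```python
import numpy as np

def em_normal_mixture(x, n_iter=100):
    """EM for a two-component 1-D normal mixture (illustrative sketch only)."""
    # Crude initialization (an assumption of this sketch, not from the paper).
    w = 0.5
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])
    for _ in range(n_iter):
        # e-step: conditional expectation of the hidden labels,
        # i.e. posterior responsibility of component 1 for each sample.
        d0 = np.exp(-(x - mu[0]) ** 2 / (2 * var[0])) / np.sqrt(2 * np.pi * var[0])
        d1 = np.exp(-(x - mu[1]) ** 2 / (2 * var[1])) / np.sqrt(2 * np.pi * var[1])
        r = w * d1 / ((1 - w) * d0 + w * d1)
        # m-step: maximum-likelihood re-fit given the expected labels,
        # which decreases the Kullback-Leibler divergence to the data.
        w = r.mean()
        mu = np.array([np.average(x, weights=1 - r), np.average(x, weights=r)])
        var = np.array([np.average((x - mu[0]) ** 2, weights=1 - r),
                        np.average((x - mu[1]) ** 2, weights=r)])
    return w, mu, var

# Usage on synthetic data: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 700)])
print(em_normal_mixture(x))
```

With well-separated data such as the above, the estimates should settle near the generating parameters (mixing weight around 0.7, means near -2 and 3); for less separated components the iteration still converges, but only to a local minimum of the divergence.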
| Published in | Neural Networks, Vol. 8, No. 9, pp. 1379-1408 |
|---|---|
| Main Author | Amari, Shun-ichi |
| Format | Journal Article |
| Language | English |
| Published | Oxford: Elsevier Ltd, 1995 |
| ISSN | 0893-6080 1879-2782 |
| DOI | 10.1016/0893-6080(95)00003-8 |