Modelling high-dimensional data by mixtures of factor analyzers

Bibliographic Details
Published in	Computational Statistics & Data Analysis, Vol. 41, no. 3, pp. 379-388
Main Authors	McLachlan, G.J., Peel, D., Bean, R.W.
Format	Journal Article; Conference Proceeding
Language	English
Published	Amsterdam: Elsevier B.V. (Elsevier Science; Elsevier), 28.01.2003
Series	Computational Statistics & Data Analysis
Subjects	EM algorithm; Exact sciences and technology; Factor analyzers; Mathematics; Mixture modelling; Multivariate analysis; Nonparametric inference; Probability and statistics; Sciences and techniques of general use; Statistics; Mixed distribution; Density estimation; Discriminant analysis; Factor analysis; Cluster analysis (statistics); Eigenvector; Gaussian distribution; Non parametric estimation; Multivariate distribution; Statistical estimation; Covariance matrix; Modeling; Mixture; Model matching; Covariance
ISSN	0167-9473 1872-7352
DOI	10.1016/S0167-9473(02)00183-4

Summary:	We focus on mixtures of factor analyzers from the perspective of a method for model-based density estimation from high-dimensional data, and hence for the clustering of such data. This approach enables a normal mixture model to be fitted to a sample of n data points of dimension p, where p is large relative to n. The number of free parameters is controlled through the dimension of the latent factor space. By working in this reduced space, it allows a model for each component-covariance matrix with complexity lying between that of the isotropic and full covariance structure models. We shall illustrate the use of mixtures of factor analyzers in a practical example that considers the clustering of cell lines on the basis of gene expressions from microarray experiments.
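
The summary's claim that the latent factor dimension controls the number of free parameters can be illustrated with a small count. The sketch below is not taken from the record itself: it assumes the standard factor-analytic form for each component covariance, Sigma = B B^T + D, with B a p x q loading matrix and D diagonal, and it introduces q as a hypothetical symbol for the latent dimension. The function name is likewise illustrative only.

```python
def covariance_free_parameters(p: int, q: int) -> dict:
    """Count free parameters in one p x p component-covariance matrix.

    Assumes the factor-analytic form Sigma = B B^T + D, where B is a
    p x q loading matrix and D is diagonal (an assumption of this
    sketch, not stated in the summary).
    """
    if not 0 <= q < p:
        raise ValueError("expected 0 <= q < p")
    return {
        # sigma^2 * I: a single variance shared by all p dimensions
        "isotropic": 1,
        # p*q loadings plus p diagonal terms, minus q(q-1)/2 because
        # B is only identified up to a rotation of the q factors
        "factor_analytic": p * q + p - q * (q - 1) // 2,
        # unrestricted symmetric matrix
        "full": p * (p + 1) // 2,
    }


# Hypothetical sizes: p = 1000 measured variables, q = 5 latent factors.
counts = covariance_free_parameters(p=1000, q=5)
for structure, n_params in counts.items():
    print(f"{structure:>16}: {n_params}")
```

Under these assumptions, and for q small relative to p, the factor-analytic count grows roughly linearly in p rather than quadratically, which is what places it between the isotropic and full covariance structures when p is large relative to n.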