Modelling high-dimensional data by mixtures of factor analyzers

Bibliographic Details
Published in: Computational Statistics & Data Analysis, Vol. 41, no. 3, pp. 379-388
Main Authors: McLachlan, G.J.; Peel, D.; Bean, R.W.
Format: Journal Article; Conference Proceeding
Language: English
Published: Amsterdam, Elsevier B.V, 28.01.2003
Elsevier Science
Elsevier
Series: Computational Statistics & Data Analysis
ISSN: 0167-9473
1872-7352
DOI: 10.1016/S0167-9473(02)00183-4

Summary: We focus on mixtures of factor analyzers from the perspective of a method for model-based density estimation from high-dimensional data, and hence for the clustering of such data. This approach enables a normal mixture model to be fitted to a sample of n data points of dimension p, where p is large relative to n. The number of free parameters is controlled through the dimension of the latent factor space. By working in this reduced space, it allows a model for each component-covariance matrix with complexity lying between that of the isotropic and full covariance structure models. We shall illustrate the use of mixtures of factor analyzers in a practical example that considers the clustering of cell lines on the basis of gene expressions from microarray experiments.
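
The summary's point that the latent dimension controls the number of free parameters can be illustrated with a small count. The sketch below is not taken from the article: it assumes the usual factor-analytic form for a component covariance, Sigma = B B^T + D, with B a p x q loading matrix and D diagonal, for which the standard count is pq + p - q(q-1)/2 free parameters. The function name and the values of p and q are hypothetical, chosen only for illustration.

```python
def covariance_parameter_counts(p: int, q: int) -> dict:
    """Free parameters in ONE component-covariance matrix of dimension p.

    Assumes the factor-analytic form Sigma = B B^T + D (B is p x q, D diagonal);
    this parameterization is an assumption, not quoted from the record above.
    """
    if not 0 < q < p:
        raise ValueError("need 0 < q < p")
    return {
        # sigma^2 * I: a single variance shared by all p coordinates
        "isotropic": 1,
        # p*q loadings + p diagonal variances, minus q(q-1)/2 for the
        # rotational indeterminacy of the loadings
        "factor_analyzer": p * q + p - q * (q - 1) // 2,
        # unrestricted symmetric p x p matrix
        "full": p * (p + 1) // 2,
    }


# Illustrative values only: p = 1000 variables, q = 5 latent factors.
counts = covariance_parameter_counts(p=1000, q=5)
print(counts)
# {'isotropic': 1, 'factor_analyzer': 5990, 'full': 500500}
```

Under these assumptions the factor-analytic count grows linearly in p for fixed q, while the full covariance grows quadratically, which is one way to read the "complexity lying between" remark in the summary.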