A tractable Multi-Partitions Clustering
In the framework of model-based clustering, a model allowing several latent class variables is proposed. This model assumes that the distribution of the observed data can be factorized into several independent blocks of variables. Each block is assumed to follow a latent class model (i.e., mixt...
| Main Authors | , |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | 22.01.2018 |
| Subjects | |
| DOI | 10.48550/arxiv.1801.07063 |
| Summary: | In the framework of model-based clustering, a model allowing several latent class variables is proposed. This model assumes that the distribution of the observed data can be factorized into several independent blocks of variables. Each block is assumed to follow a latent class model (i.e., a mixture with a conditional independence assumption). The proposed model includes variable selection as a special case and is able to cope with the mixed-data setting. The simplicity of the model allows the partition of the variables into blocks and the mixture parameters to be estimated simultaneously, thus avoiding the need to run an EM algorithm for each possible partition of the variables into blocks. For the proposed method, a model is defined by the number of blocks, the number of clusters inside each block, and the partition of the variables into blocks. Model selection can be done with two information criteria, the BIC and the MICL, for which an efficient optimization is proposed. The performance of the model is investigated on simulated and real data. It is shown that the proposed method gives a rich interpretation of the dataset at hand (i.e., analysis of the partition of the variables into blocks and analysis of the clusters produced by each block of variables). |
|---|---|
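
The factorization described in the summary can be sketched as a formula. The notation below is an assumption made for illustration and is not taken from this record: $B$ is the number of blocks, $G_b$ the number of clusters in block $b$, $\Omega_b$ the set of variables assigned to block $b$, $\pi_{bk}$ the mixing proportions, and $\alpha_{jk}$ the parameters of variable $j$ in component $k$ of its block.

$$
p(x_i \mid \theta) \;=\; \prod_{b=1}^{B} \; \sum_{k=1}^{G_b} \pi_{bk} \prod_{j \in \Omega_b} p(x_{ij} \mid \alpha_{jk}),
\qquad \sum_{k=1}^{G_b} \pi_{bk} = 1 \ \text{for each } b .
$$

The outer product expresses the independence between blocks, the sum is the latent class mixture within a block, and the inner product is the conditional independence of the variables given the block's latent class. Under this reading, a block with a single component ($G_b = 1$) would hold variables that carry no clustering information, which is one plausible way to see variable selection as a special case.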