Bayesian versus data driven model selection for microarray data

Bibliographic Details
Published in Natural Computing Vol. 14; no. 3; pp. 393-402
Main Authors Giancarlo, Raffaele; Lo Bosco, Giosué; Utro, Filippo
Format Journal Article
Language English
Published Dordrecht Springer Netherlands 01.09.2015
Springer Nature B.V
ISSN 1567-7818
1572-9796
DOI 10.1007/s11047-014-9446-5

Summary: Clustering is one of the best-known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is a particular instance of the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In what follows, for ease of reference, we refer to that instance simply as model selection. It is an important part of any statistical analysis. The techniques used for solving it are mainly either Bayesian or data-driven, and both are based on internal knowledge; that is, they use information obtained by processing the input data. Although both techniques have been evaluated in the realm of microarray data analysis, their merits relative to each other have not been assessed. Here we fill this gap in the literature by comparing three Bayesian methods against several state-of-the-art data-driven model selection methods. Our results show that, although in some cases Bayesian methods guarantee good results, they cannot compete with the data-driven methods in their ability to predict the correct number of clusters in a dataset.