Biomarker Selection and Classification of “-Omics” Data Using a Two-Step Bayes Classification Framework

Bibliographic Details
Published in: BioMed research international Vol. 2013; no. 2013; pp. 1-9
Main Authors: Tongsima, Sissades; Varavithya, Vara; Shaw, Philip James; Kulawonganunchai, Supasak; Prueksaaroon, Supakit; Assawamakin, Anunchai; Ruangrajitpakorn, Taneth
Format: Journal Article
Language: English
Published: Cairo, Egypt, Hindawi Publishing Corporation, 01.01.2013
John Wiley & Sons, Inc
ISSN: 2314-6133, 2314-6141
DOI: 10.1155/2013/148014

Summary: Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly. Here, a novel two-step machine-learning framework is presented to address this need. First, a Naïve Bayes estimator is used to rank features, on the expectation that the top-ranked features are the most informative for predicting the underlying biological classes. The top-ranked features are then used in a Hidden Naïve Bayes classifier to construct a classification prediction model. To obtain the minimum set of the most informative biomarkers, the bottom-ranked features are removed from the Naïve Bayes-filtered feature list one at a time, and the classification accuracy of the Hidden Naïve Bayes classifier is checked for each pruned feature set. The performance of the proposed two-step Bayes classification framework was tested on different types of -omics datasets, including gene expression microarray, single nucleotide polymorphism microarray (SNP array), and surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) proteomic data. The proposed framework matched and, in some cases, outperformed other classification methods in terms of prediction accuracy, minimum number of classification markers, and computational time.
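
The rank-then-prune procedure described in the summary can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: a plain Gaussian Naïve Bayes model stands in for the Hidden Naïve Bayes classifier, the ranking score (training accuracy of a single-feature model) is an assumption, and the function names, the `top_k` parameter, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict

def fit_gnb(X, y, features):
    """Fit a Gaussian Naive Bayes model on the given feature indices."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for f in features:
            vals = [r[f] for r in rows]
            mean = sum(vals) / len(vals)
            # Small floor keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-6
            stats.append((mean, var))
        model[label] = (math.log(len(rows) / len(X)), stats)
    return model

def predict_gnb(model, features, row):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, stats) in model.items():
        score = log_prior
        for f, (mean, var) in zip(features, stats):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (row[f] - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(X_train, y_train, X_test, y_test, features):
    """Train on one split, report the fraction correct on another."""
    model = fit_gnb(X_train, y_train, features)
    hits = sum(predict_gnb(model, features, r) == t
               for r, t in zip(X_test, y_test))
    return hits / len(y_test)

def two_step_select(X_train, y_train, X_val, y_val, top_k):
    n_features = len(X_train[0])
    # Step 1 (assumed criterion): rank features by the training accuracy
    # of a single-feature Naive Bayes model and keep the top_k.
    ranked = sorted(
        range(n_features),
        key=lambda f: accuracy(X_train, y_train, X_train, y_train, [f]),
        reverse=True,
    )[:top_k]
    # Step 2: drop the bottom-ranked feature one at a time and keep the
    # smallest set whose validation accuracy is at least the best seen.
    # Gaussian NB is a stand-in here for Hidden Naive Bayes.
    best_features = list(ranked)
    best_acc = accuracy(X_train, y_train, X_val, y_val, best_features)
    current = list(ranked)
    while len(current) > 1:
        current = current[:-1]
        acc = accuracy(X_train, y_train, X_val, y_val, current)
        if acc >= best_acc:
            best_features, best_acc = list(current), acc
    return best_features, best_acc

# Toy data (hypothetical): feature 0 separates the classes, 1 and 2 do not.
X_train = [[0.1, 5.0, 1.0], [0.2, 3.0, 2.0], [0.0, 4.0, 1.5], [0.3, 6.0, 2.5],
           [2.1, 5.1, 1.1], [1.9, 3.2, 2.2], [2.0, 4.1, 1.4], [2.2, 5.9, 2.4]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_val = [[0.15, 4.0, 2.0], [2.05, 4.0, 2.0], [0.05, 5.5, 1.2], [1.95, 5.5, 1.2]]
y_val = [0, 1, 0, 1]

selected, val_acc = two_step_select(X_train, y_train, X_val, y_val, top_k=3)
print(selected, val_acc)
```

Accepting a smaller set whenever accuracy does not drop is one possible stopping rule; the summary states only that accuracy is checked for each pruned feature set.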
Academic Editor: Florencio Pazos