Multiclass cancer classification and biomarker discovery using GA-based algorithms

Bibliographic Details
Published in	Bioinformatics Vol. 21; no. 11; pp. 2691-2697
Main Authors	Liu, Jane Jijun; Cutler, Gene; Li, Wuxiong; Pan, Zheng; Peng, Sihua; Hoey, Tim; Chen, Liangbiao; Ling, Xuefeng Bruce
Format	Journal Article
Language	English
Published	Oxford: Oxford University Press, 01.06.2005; Oxford Publishing Limited (England)
Subjects	Algorithms; Artificial Intelligence; Bioinformatics; Biological and medical sciences; Biomarkers, Tumor - classification; Biomarkers, Tumor - genetics; Biomarkers, Tumor - metabolism; Data mining; Diagnosis, Computer-Assisted - methods; Fundamental and applied biological sciences. Psychology; Gene Expression Profiling - methods; General aspects; Humans; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Neoplasm Proteins - classification; Neoplasm Proteins - genetics; Neoplasm Proteins - metabolism; Neoplasms - classification; Neoplasms - diagnosis; Neoplasms - genetics; Neoplasms - metabolism; Oligonucleotide Array Sequence Analysis - methods; Pattern Recognition, Automated - methods; Reproducibility of Results; Sensitivity and Specificity; Tumors; High throughput screening; Treatment; Gene; Genetic algorithm; Classification; DNA chip; Development; Biological marker; Malignant tumor; Microarray
ISSN	1367-4803; 0266-7061; 1460-2059; 1367-4811
DOI	10.1093/bioinformatics/bti419

Summary:	Motivation: The development of microarray-based high-throughput gene profiling has led to the hope that this technology could provide an efficient and accurate means of diagnosing and classifying tumors, as well as predicting prognoses and effective treatments. However, the large amount of data generated by microarrays requires effective reduction of discriminant gene features into reliable sets of tumor biomarkers for such multiclass tumor discrimination. The availability of reliable sets of biomarkers, especially serum biomarkers, should have a major impact on our understanding and treatment of cancer. Results: We have combined genetic algorithm (GA) and all paired (AP) support vector machine (SVM) methods for multiclass cancer categorization. Predictive features can be automatically determined through iterative GA/SVM, leading to very compact sets of non-redundant cancer-relevant genes with the best classification performance reported to date. Interestingly, these different classifier sets harbor only modest overlapping gene features but have similar levels of accuracy in leave-one-out cross-validations (LOOCV). Further characterization of these optimal tumor discriminant features, including the use of nearest shrunken centroids (NSC), analysis of annotations and literature text mining, reveals previously unappreciated tumor subclasses and a series of genes that could be used as cancer biomarkers. With this approach, we believe that microarray-based multiclass molecular analysis can be an effective tool for cancer biomarker discovery and subsequent molecular cancer diagnosis. Contact: xuefeng_ling@yahoo.com Supplementary information: http://www.fishgenome.org/publication/Liu/bioinformatics/
Bibliography:	To whom correspondence should be addressed at Amgen San Francisco, South San Francisco, CA 94080, USA.
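
Illustrative sketch:	The summary describes a genetic algorithm (GA) searching for compact gene subsets, scored by an all-paired multiclass classifier under leave-one-out cross-validation (LOOCV). The following is a minimal sketch of that general idea, not the authors' implementation. To stay dependency-free, each pairwise learner is a nearest-centroid rule standing in for the binary SVM named in the summary. The function names, the GA settings (population size, mutation rate, per-feature penalty) and the synthetic data are all assumptions made for illustration.

```python
import random
from itertools import combinations


def pairwise_vote(train_X, train_y, x, feats):
    """All-pairs (one-vs-one) voting over every class pair.

    Each binary learner is a nearest-centroid rule, used here as a
    stand-in (assumption) for the binary SVM named in the summary.
    """
    classes = sorted(set(train_y))
    dist = {}
    for c in classes:
        rows = [r for r, label in zip(train_X, train_y) if label == c]
        centroid = [sum(r[f] for r in rows) / len(rows) for f in feats]
        dist[c] = sum((x[f] - m) ** 2 for f, m in zip(feats, centroid))
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[a if dist[a] <= dist[b] else b] += 1
    return max(classes, key=lambda c: votes[c])


def loocv_accuracy(X, y, feats):
    """Leave-one-out accuracy using only the feature indices in feats."""
    if not feats:
        return 0.0
    hits = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += pairwise_vote(train_X, train_y, X[i], feats) == y[i]
    return hits / len(X)


def ga_select(X, y, n_feats, pop_size=20, generations=15, mut_rate=0.05, seed=0):
    """Toy GA over bitmasks of features; all settings are illustrative."""
    rng = random.Random(seed)

    def fitness(mask):
        feats = [j for j, bit in enumerate(mask) if bit]
        # A small per-feature penalty (assumption) favors compact subsets.
        return loocv_accuracy(X, y, feats) - 0.01 * len(feats)

    pop = [[int(rng.random() < 0.3) for _ in range(n_feats)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            cut = rng.randrange(1, n_feats)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            children.append([bit ^ (rng.random() < mut_rate) for bit in child])
        pop = elite + children
    best = max(pop, key=fitness)
    return [j for j, bit in enumerate(best) if bit]


def make_toy(seed=1):
    """Synthetic data: features 0 and 1 separate three classes, 2 to 5 are noise."""
    rng = random.Random(seed)
    centers = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}
    X, y = [], []
    for label, (c0, c1) in centers.items():
        for _ in range(6):
            noise = [rng.gauss(0, 1) for _ in range(4)]
            X.append([c0 + rng.gauss(0, 0.5), c1 + rng.gauss(0, 0.5)] + noise)
            y.append(label)
    return X, y


X, y = make_toy()
selected = ga_select(X, y, n_feats=6)
print("selected features:", selected)
print("LOOCV accuracy:", loocv_accuracy(X, y, selected))
```

Scoring each candidate subset by LOOCV accuracy is what ties the search to classification performance, and the penalty term is one simple way to push the search toward small sets. A real microarray analysis would replace the stand-in learner with an actual SVM and would have to cope with thousands of gene features.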