An efficient feature selection and classification system for microarray cancer data using genetic algorithm and deep belief networks

Bibliographic Details
Published in	Multimedia tools and applications Vol. 84; no. 8; pp. 4393 - 4434
Main Authors	Lawrence, Morolake Oladayo, Jimoh, Rasheed Gbenga, Yahya, Waheed Babatunde
Format	Journal Article
Language	English
Published	New York Springer US 01.03.2025 Springer Nature B.V
Subjects	Belief networks; Biological properties; Biomarkers; Cancer; Classification; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Datasets; Diagnosis; Feature selection; Genetic algorithms; Leukemia; Multimedia Information Systems; Performance indices; Prostate; Special Purpose and Application-Based Systems; Track 2: Medical Applications of Multimedia; Deep learning; Genetic algorithm; Monte-carlo experiment; Deep Belief networks; Prostate cancer; Leukaemia
ISSN	1573-7721, 1380-7501
DOI	10.1007/s11042-024-18802-y

Summary:	Cancer is one of the most devastating health conditions in the world. In the diagnosis and treatment of the various forms of cancer, studies have shown that early detection by clinical methods usually takes a considerably long time. This motivates the search for an alternative, non-clinical means of diagnosing cancer cases using microarray technology. Therefore, this study develops an efficient feature selection and classification method for high-dimensional microarray cancer data by combining Genetic Algorithms (GA) and Deep Belief Networks (DBN). The study employed a GA to select the most informative gene biomarkers and a DBN to classify biological samples. In a Monte Carlo experiment, the simulated and real-life microarray datasets were partitioned into 95% training and 5% test samples, with 100 to 1000 epochs. The classifier was constructed using the training datasets, while its efficiency was assessed on the test sample using the Misclassification Error Rate (MER), Sensitivity, Specificity, and Receiver Operating Characteristic (ROC) analysis. The GA and DBN were implemented using the R statistical packages caret and deepnet. The proposed GADBN method was applied to simulated and real-life datasets, yielding out-of-bag average classification accuracies of 98.8% and 93.1%, respectively. The proposed GADBN outperformed the other existing classifiers and was more efficient, with a smaller MER for the Leukaemia 1 (0.18), Prostate 1 (0.35) and Prostate 3 (0.09) datasets than some of the existing methods under the various performance indices considered. The proposed model is a powerful and effective instrument for identifying useful features in microarray cancer data and classifying the cancer types accordingly.
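The evaluation protocol in the summary (a random 95%/5% train/test partition repeated in a Monte Carlo experiment, scored by Misclassification Error Rate, Sensitivity, and Specificity) can be illustrated with a minimal sketch. This is not the authors' R implementation: the function names `monte_carlo_split` and `performance_indices` are hypothetical, binary class labels are assumed, and the indices use the standard confusion-matrix definitions rather than anything stated in the record.

```python
import random


def monte_carlo_split(n_samples, test_fraction=0.05, rng=None):
    """Return (train_idx, test_idx) for one random partition.

    Hypothetical helper: the 5% test fraction mirrors the summary,
    but the splitting procedure itself is an assumption.
    """
    rng = rng or random.Random()
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Keep at least one test sample even for very small datasets.
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def performance_indices(y_true, y_pred, positive=1):
    """Compute MER, sensitivity, and specificity for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    return {
        # Fraction of all samples assigned to the wrong class.
        "mer": (fp + fn) / total,
        # True positive rate; undefined when there are no positives.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # True negative rate; undefined when there are no negatives.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    train_idx, test_idx = monte_carlo_split(100, rng=random.Random(0))
    print(len(train_idx), len(test_idx))
    print(performance_indices([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a full Monte Carlo run, the split and scoring would be repeated many times and the indices averaged; the classifier that produces `y_pred` is deliberately left out of this sketch.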