An efficient feature selection and classification system for microarray cancer data using genetic algorithm and deep belief networks
| Published in | Multimedia tools and applications Vol. 84; no. 8; pp. 4393 - 4434 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | New York, Springer US, 01.03.2025; Springer Nature B.V |
| Subjects | |
| ISSN | 1573-7721 1380-7501 |
| DOI | 10.1007/s11042-024-18802-y |
| Summary: | Cancer is one of the most devastating health conditions in the world. Studies have shown that detecting cancer early by clinical methods usually takes a considerably long time, which motivates the search for an alternative, non-clinical diagnosis of cancer cases using microarray technology. This study therefore develops an efficient feature selection and classification method for high-dimensional microarray cancer data by combining Genetic Algorithms (GA) and Deep Belief Networks (DBN). A GA selects the most informative gene biomarkers, and a DBN classifies the biological samples. In a Monte Carlo experiment, simulated and real-life microarray datasets were partitioned into 95% training and 5% test samples, with 100 to 1000 epochs. The classifier was constructed on the training data, and its efficiency was assessed on the test samples using the Misclassification Error Rate (MER), Sensitivity, Specificity, and Receiver Operating Characteristic analysis. The GA and DBN were implemented using the R packages Caret and Deepnet. On the simulated and real-life datasets, the proposed GADBN method yielded out-of-bag average classification accuracies of 98.8% and 93.1%, respectively. GADBN outperformed the other existing classifiers and achieved a smaller MER on the Leukaemia 1 (0.18), Prostate 1 (0.35), and Prostate 3 (0.09) datasets than some of the existing methods under the performance indices considered. The proposed model is a powerful and effective instrument for identifying useful features in microarray cancer data and classifying cancer types accordingly. |
|---|---|
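
The evaluation protocol described in the summary (repeated random 95%/5% partitions scored by Misclassification Error Rate, sensitivity, and specificity) can be sketched roughly as follows. This is a minimal standard-library Python illustration, not the authors' R implementation with Caret and Deepnet. The function names `monte_carlo_split` and `evaluate` are hypothetical, binary class labels are assumed, and the GA feature selection and DBN classifier themselves are not shown.

```python
import random


def monte_carlo_split(n_samples, test_fraction=0.05, rng=None):
    """Randomly partition sample indices into training and test sets.

    Assumption: one call corresponds to one Monte Carlo replicate;
    the 5% default mirrors the split quoted in the summary.
    """
    rng = rng or random.Random()
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Keep at least one test sample even for very small datasets.
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def evaluate(y_true, y_pred, positive=1):
    """Return MER, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    return {
        # Misclassification Error Rate: share of wrongly labelled samples.
        "mer": (fp + fn) / total,
        # Sensitivity: share of positive samples that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of negative samples correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }
```

For example, `evaluate([1, 1, 0, 0], [1, 0, 0, 0])` gives an MER of 0.25, a sensitivity of 0.5, and a specificity of 1.0. In a full experiment, each replicate would fit the classifier on the training indices and average these metrics over all replicates.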