Performance Comparison Between Deep Neural Network and Machine Learning Based Classifiers for Huntington Disease Prediction From Human DNA Sequence

Bibliographic Details
Published in IEEE Transactions on Computational Biology and Bioinformatics Vol. 22; no. 1; pp. 52-63
Main Authors Vishnuppriya, C., Tamilpavai, G.
Format Journal Article
Language English
Published United States IEEE 01.01.2025
ISSN 2998-4165, 1557-9964
DOI 10.1109/TCBB.2024.3493203

More Information
Summary: Huntington Disease (HD) is a neurodegenerative disorder that causes psychiatric disturbances, movement problems, weight loss and sleep problems. It needs to be addressed at an early stage of life. Deep Learning (DL) based systems can now help physicians by providing a second opinion when treating a patient's disease. In this work, human deoxyribonucleic acid (DNA) sequences are analyzed with a Deep Neural Network (DNN) algorithm to predict HD. The main objective is to identify whether a human DNA sequence is affected by HD. Human DNA sequences are collected from the National Center for Biotechnology Information (NCBI), and synthetic human DNA data are also constructed for processing. The DNA sequence data are then converted to numerical form with the Chaos Game Representation (CGR) method, and the resulting numerical values are used for feature extraction: mean, median, standard deviation, entropy, contrast, correlation, energy and homogeneity. Additionally, the counts of adenine, thymine, guanine and cytosine are extracted from the DNA sequence data itself. The extracted features are used as input to the DNN classifier and to other machine learning based classifiers: Neural Network (NN), Support Vector Machine (SVM), Random Forest (RF) and Classification Tree with Forward Pruning (CT WFP). Six performance measures are used: Accuracy, Sensitivity, Specificity, Precision, F1 score and Matthews Correlation Coefficient (MCC). The study concludes that DNN, NN, SVM and RF achieve 100% accuracy and that CT WFP achieves an accuracy of 87%.
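The summary names three steps that are easy to illustrate: the CGR conversion, the nucleotide counts, and the six performance measures. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The corner layout in `CORNERS` is one common CGR convention and may differ from the one used in the article, and the function names `cgr_points`, `base_counts` and `binary_metrics` are hypothetical. The statistical and texture features (mean, entropy, contrast and so on) are not reproduced here because the record does not say how they are computed.

```python
from collections import Counter
from math import sqrt

# Assumed corner layout (a common CGR convention; the article's choice is not stated).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Return the CGR trajectory of a DNA string.

    Each point lies halfway between the previous point and the corner
    assigned to the current nucleotide, starting from the centre of the
    unit square.
    """
    x, y = 0.5, 0.5
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def base_counts(sequence):
    """Count adenine, thymine, guanine and cytosine in a DNA string."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in "ATGC"}


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, tn, fp, fn):
    """Compute the six measures from binary confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "precision": precision,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    example = "CAGCAGCAGCAG"  # illustrative string only, not data from the study
    print(cgr_points(example)[:3])
    print(base_counts(example))
    print(binary_metrics(tp=4, tn=3, fp=1, fn=2))
```

The CGR points are typically binned into a two-dimensional grid before image-style features are extracted. That step is omitted here because the grid resolution would be a further assumption.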