Enhanced the prediction approach of diabetes using an autoencoder with regularization and deep neural network

Bibliographic Details
Published in: Periodicals of Engineering and Natural Sciences (PEN), Vol. 10, no. 6, pp. 156-167
Main Authors: Ismael, Hussein A.; Al-A’araji, Nabeel H.; Shukur, Baheja Khudair
Format: Journal Article
Language: English
Published: 31.12.2022
ISSN: 2303-4521
DOI: 10.21533/pen.v10.i6.918

Summary: Diabetes mellitus is considered one of the most common and severe diseases worldwide. A precise and early diagnosis of diabetes is essential to avoid complications and is of crucial importance to the medical care that patients receive. To achieve that, we need a model that predicts diabetes. Many prediction models exist, but they suffer from problems such as poor prediction accuracy and high time complexity. The prediction process depends heavily on the important features. In this paper, we therefore propose a new model called CAER-DNN, which combines an unsupervised technique for generating new important features with a deep neural network for the prediction itself. The unsupervised technique, a complete autoencoder with regularization techniques (CAER), is used to reconstruct the original features and thereby produces newly learned features. CAER concentrates training on the most important learned features and leaves out the less important ones, which improves the performance of the prediction process. These important features are then used as input to the deep neural network that predicts diabetes. Our model is applied to two datasets: the Pima Indian and Mendeley diabetes datasets. Under 10-fold cross-validation, the model achieves high performance on the Pima Indian dataset (F1-score 97.38%, accuracy, recall 97.25%, specificity 97.59%, precision 97.53%). On the Mendeley diabetes dataset, evaluated with the holdout technique, it also achieves high performance (F1-score 94.51%, accuracy 98.48%, recall 91.74%, balanced accuracy 98.21%, precision 98.21%). Compared with existing machine learning and deep learning techniques, our model performs better.
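The evaluation measures named in the summary can all be derived from a binary confusion matrix, and 10-fold cross-validation amounts to splitting the sample indices into ten held-out folds. The following is a minimal sketch of both, not the authors' code. The names `confusion`, `metrics`, and `k_fold_indices` are hypothetical, labels are assumed to be 0/1 with 1 meaning diabetic, "accuracy-balance" is assumed to mean balanced accuracy (the mean of recall and specificity), and the folds are plain contiguous splits without shuffling or stratification.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    tn: int  # non-diabetic, predicted non-diabetic
    fn: int  # diabetic, predicted non-diabetic


def confusion(y_true, y_pred):
    """Count the four outcomes for 0/1 labels (assumption: 1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
    )


def _ratio(num, den):
    # Return 0.0 instead of raising when a class is absent from a fold.
    return num / den if den else 0.0


def metrics(c):
    """Standard binary classification measures from a confusion matrix."""
    recall = _ratio(c.tp, c.tp + c.fn)
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": _ratio(2 * precision * recall, precision + recall),
        # Assumed reading of "accuracy-balance" in the summary.
        "balanced_accuracy": (recall + specificity) / 2,
    }


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size
```

In a cross-validated evaluation, a model would be fitted on each `train` split, `metrics(confusion(...))` would be computed on the matching `test` split, and the per-fold values would be averaged. A holdout evaluation uses a single such split instead of ten.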