Enhanced the prediction approach of diabetes using an autoencoder with regularization and deep neural network

Bibliographic Details
Published in	Periodicals of Engineering and Natural Sciences (PEN) Vol. 10; no. 6; pp. 156 - 167
Main Authors	Ismael, Hussein A., Al-A’araji, Nabeel H., Shukur, Baheja Khudair
Format	Journal Article
Language	English
Published	31.12.2022
ISSN	2303-4521
DOI	10.21533/pen.v10.i6.918

Summary:	Diabetes mellitus is considered one of the most common and severe diseases worldwide. Precise, early diagnosis is essential to avoid complications and is crucial to the medical care that patients receive, which calls for a model that predicts diabetes. Many prediction models exist, but they suffer from problems such as poor prediction accuracy and time complexity. Because prediction depends heavily on the important features, this paper proposes a new model, CAER-DNN, that combines an unsupervised technique for generating new important features with a deep neural network for prediction. The unsupervised technique, a complete autoencoder with regularization (CAER), reconstructs the original features to produce newly learned features. It concentrates training on the most important learned features and disregards less important ones, thereby improving prediction performance. These important features are then used as input to the deep neural network that predicts diabetes. The model is applied to two datasets: the Pima Indian and Mendeley diabetes datasets. Under 10-fold cross-validation, the model achieves high performance on the Pima Indian dataset (F1-score 97.38%, accuracy, recall 97.25%, specificity 97.59%, precision 97.53%). On the Mendeley diabetes dataset, evaluated with the holdout technique, it achieves F1-score 94.51%, accuracy 98.48%, recall 91.74%, accuracy-balance 98.21%, and precision 98.21%. The model outperformed the existing machine learning and deep learning techniques it was compared with.
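The evaluation measures named in the summary can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; this record does not give the paper's own formulas, so treating "accuracy-balance" as balanced accuracy (the mean of recall and specificity) is an assumption. The function name `binary_metrics` and the sample counts are hypothetical and are not results from the paper.

```python
def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion counts.

    tp/fn: diabetic cases predicted correctly/incorrectly.
    tn/fp: non-diabetic cases predicted correctly/incorrectly.
    """
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)          # also called sensitivity
    specificity = _ratio(tn, tn + fp)     # true-negative rate
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    # Assumption: "accuracy-balance" means balanced accuracy.
    balanced_accuracy = (recall + specificity) / 2
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under 10-fold cross-validation, such measures would typically be computed per fold and then averaged, whereas a holdout evaluation computes them once on the held-out split.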