A cohort study on the predictive capability of body composition for diabetes mellitus using machine learning

Bibliographic Details
Published in Journal of diabetes and metabolic disorders Vol. 23; no. 1; pp. 773-781
Main Authors Nematollahi, Mohammad Ali; Askarinejad, Amir; Asadollahi, Arefeh; Bazrafshan, Mehdi; Sarejloo, Shirin; Moghadami, Mana; Sasannia, Sarvin; Farjam, Mojtaba; Homayounfar, Reza; Pezeshki, Babak; Amini, Mitra; Roshanzamir, Mohamad; Alizadehsani, Roohallah; Bazrafshan, Hanieh; Bazrafshan drissi, Hamed; Tan, Ru-San; Acharya, U. Rajendra; Islam, Mohammed Shariful Sheikh
Format Journal Article
Language English
Published Cham Springer International Publishing 01.06.2024
BioMed Central Ltd
Nature Publishing Group
ISSN 2251-6581
DOI 10.1007/s40200-023-01350-x

Summary: Purpose: We applied machine learning to study associations between regional body fat distribution and diabetes mellitus in a population of community-dwelling adults, in order to investigate how well body composition predicts diabetes. We retrospectively analyzed a subset of data from the published Fasa cohort study using individual standard classifiers as well as ensemble learning algorithms.
Methods: We measured segmental body composition using the Tanita Analyzer BC-418 MA (Tanita Corp, Japan). The following features were input to our machine learning model: fat-free mass, fat percentage, basal metabolic rate, total body water, right arm fat-free mass, right leg fat-free mass, trunk fat-free mass, trunk fat percentage, sex, age, right leg fat percentage, and right arm fat percentage. We performed classification into diabetes vs. no diabetes classes using linear support vector machine, decision tree, stochastic gradient descent, logistic regression, Gaussian naïve Bayes, k-nearest neighbors (k = 3 and k = 4), and multi-layer perceptron, as well as ensemble learning using random forest, gradient boosting, adaptive boosting, XGBoost, and ensemble voting classifiers with Top3 and Top4 algorithms. We analyzed 4661 subjects (mean age 47.64 ± 9.37 years, range 35 to 70 years; 2155 male, 2506 female), stratified into 571 subjects with and 4090 subjects without a self-declared history of diabetes.
Results: Age, fat mass, and fat percentages in the legs, arms, and trunk were positively associated with diabetes; fat-free mass in the legs, arms, and trunk was negatively associated. Using XGBoost, our model attained the best accuracy, precision, recall, and F1-score of 89.96%, 90.20%, 89.65%, and 89.91%, respectively.
Conclusions: Our machine learning model showed that regional body fat compositions were predictive of diabetes status.
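
For readers less familiar with the four metrics quoted in the summary, the following is a minimal sketch of how accuracy, precision, recall, and F1-score are commonly defined for a two-class problem. It is not the authors' code. The `Confusion` class, the helper function names, and the example confusion counts are hypothetical and chosen only for illustration; only the class sizes (571 and 4090) and the quoted precision and recall come from the summary. The summary does not say how the metrics were averaged or how the data were split, so the sketch makes no assumption about that.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts for a binary classifier (hypothetical container, not from the paper)."""
    tp: int  # diabetes predicted, diabetes present
    fp: int  # diabetes predicted, diabetes absent
    fn: int  # no diabetes predicted, diabetes present
    tn: int  # no diabetes predicted, diabetes absent


def accuracy(c: Confusion) -> float:
    """Share of all subjects classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: Confusion) -> float:
    """Share of predicted-positive subjects that are truly positive."""
    predicted_pos = c.tp + c.fp
    return c.tp / predicted_pos if predicted_pos else 0.0


def recall(c: Confusion) -> float:
    """Share of truly positive subjects that were predicted positive."""
    actual_pos = c.tp + c.fn
    return c.tp / actual_pos if actual_pos else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Class sizes as stated in the summary.
N_DIABETES = 571
N_NO_DIABETES = 4090
prevalence = N_DIABETES / (N_DIABETES + N_NO_DIABETES)

# Hypothetical confusion counts, for illustration only (not study data).
example = Confusion(tp=90, fp=10, fn=10, tn=90)

# Harmonic mean of the precision and recall quoted in the summary.
f1_from_reported = f1_score(0.9020, 0.8965)

if __name__ == "__main__":
    print(f"prevalence of diabetes in the sample: {prevalence:.4f}")
    print(f"example accuracy:  {accuracy(example):.4f}")
    print(f"example precision: {precision(example):.4f}")
    print(f"example recall:    {recall(example):.4f}")
    print(f"F1 from quoted precision and recall: {f1_from_reported:.4f}")
```

The harmonic mean of the quoted precision and recall lands very close to the quoted F1-score. A small gap would be expected if, for instance, the reported figures were averaged over classes or folds, which the summary does not specify.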