The SMART Framework: Selection of Machine Learning Algorithms With ReplicaTions-A Case Study on the Microvascular Complications of Diabetes

Over 34 million people in the US have diabetes, a major cause of blindness, renal failure, and amputations. Machine learning (ML) models can predict high-risk patients to help prevent adverse outcomes. Selecting the 'best' prediction model for a given disease, population, and clinical appl...

Full description

Saved in:
Bibliographic Details
Published inIEEE journal of biomedical and health informatics Vol. 26; no. 2; pp. 809 - 817
Main Authors Swan, Breanna P., Mayorga, Maria E., Ivy, Julie S.
Format Journal Article
LanguageEnglish
Published United States IEEE 01.02.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text
ISSN2168-2194
2168-2208
2168-2208
DOI10.1109/JBHI.2021.3094777

Cover

More Information
Summary:Over 34 million people in the US have diabetes, a major cause of blindness, renal failure, and amputations. Machine learning (ML) models can predict high-risk patients to help prevent adverse outcomes. Selecting the 'best' prediction model for a given disease, population, and clinical application is challenging due to the hundreds of health-related ML models in the literature and the increasing availability of ML methodologies. To support this decision process, we developed the Selection of Machine-learning Algorithms with ReplicaTions (SMART) Framework that integrates building and selecting ML models with decision theory. We build ML models and estimate performance for multiple plausible future populations with a replicated nested cross-validation technique. We rank ML models by simulating decision-maker priorities, using a range of accuracy measures (e.g., AUC) and robustness metrics from decision theory (e.g., minimax Regret). We present the SMART Framework through a case study on the microvascular complications of diabetes using data from the ACCORD clinical trial. We compare selections made by risk-averse, -neutral, and -seeking decision-makers, finding agreement in 80% of the risk-averse and risk-neutral selections, with the risk-averse selections showing consistency for a given complication. We also found that the models that best predicted outcomes in the validation set were those with low performance variance on the testing set, indicating a risk-averse approach in model selection is ideal when there is a potential for high population feature variability. The SMART Framework is a powerful, interactive tool that incorporates various ML algorithms and stakeholder preferences, generalizable to new data and technological advancements.
Bibliography:ObjectType-Case Study-2
SourceType-Scholarly Journals-1
content type line 14
ObjectType-Feature-4
ObjectType-Report-1
ObjectType-Article-3
ObjectType-Article-1
ObjectType-Feature-2
content type line 23
ISSN:2168-2194
2168-2208
2168-2208
DOI:10.1109/JBHI.2021.3094777