The SMART Framework: Selection of Machine Learning Algorithms With ReplicaTions - A Case Study on the Microvascular Complications of Diabetes


Bibliographic Details
Published in	IEEE Journal of Biomedical and Health Informatics, Vol. 26, no. 2, pp. 809-817
Main Authors	Swan, Breanna P.; Mayorga, Maria E.; Ivy, Julie S.
Format	Journal Article
Language	English
Published	United States: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.02.2022
Subjects	Algorithms; Blindness; Case reports; Case studies; Complications; Data models; Data-driven modeling; Decision making; Decision theory; Diabetes; Diabetes Complications; Diabetes mellitus; Diabetes Mellitus - diagnosis; Humans; Learning algorithms; Machine Learning; Measurement; Microvasculature; Minimax technique; Prediction models; Predictive models; Renal failure; Risk; Risk aversion; Risk groups; Robustness; Sociology; Statistics
ISSN	2168-2194, 2168-2208
DOI	10.1109/JBHI.2021.3094777

Summary:	Over 34 million people in the US have diabetes, a major cause of blindness, renal failure, and amputations. Machine learning (ML) models can predict high-risk patients to help prevent adverse outcomes. Selecting the 'best' prediction model for a given disease, population, and clinical application is challenging due to the hundreds of health-related ML models in the literature and the increasing availability of ML methodologies. To support this decision process, we developed the Selection of Machine-learning Algorithms with ReplicaTions (SMART) Framework, which integrates building and selecting ML models with decision theory. We build ML models and estimate performance for multiple plausible future populations with a replicated nested cross-validation technique. We rank ML models by simulating decision-maker priorities, using a range of accuracy measures (e.g., AUC) and robustness metrics from decision theory (e.g., minimax regret). We present the SMART Framework through a case study on the microvascular complications of diabetes using data from the ACCORD clinical trial. We compare selections made by risk-averse, risk-neutral, and risk-seeking decision-makers, finding agreement in 80% of the risk-averse and risk-neutral selections, with the risk-averse selections showing consistency for a given complication. We also found that the models that best predicted outcomes in the validation set were those with low performance variance on the testing set, indicating that a risk-averse approach to model selection is ideal when there is a potential for high population feature variability. The SMART Framework is a powerful, interactive tool that incorporates various ML algorithms and stakeholder preferences and is generalizable to new data and technological advancements.
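As an illustration of the ranking step the summary describes, the following is a minimal sketch of selecting a model by minimax regret across replications, next to a mean-based and a best-case choice. The model names, the scores, and the mapping of risk-neutral to "highest mean" and risk-seeking to "highest best case" are assumptions made for this example; they are not taken from the article.

```python
# Hypothetical AUC scores: one list per candidate model, one entry per
# replication (each replication standing in for a plausible future population).
# All names and numbers here are invented for illustration.
scores = {
    "logistic": [0.70, 0.72, 0.71],
    "forest":   [0.80, 0.66, 0.76],
    "boosting": [0.75, 0.71, 0.73],
}


def worst_case_regret(scores):
    """Return each model's largest regret over all replications.

    Regret in a replication is the gap between the best score any model
    achieved in that replication and this model's score.
    """
    n_reps = len(next(iter(scores.values())))
    best_per_rep = [max(s[j] for s in scores.values()) for j in range(n_reps)]
    return {
        model: max(best_per_rep[j] - s[j] for j in range(n_reps))
        for model, s in scores.items()
    }


def select_minimax_regret(scores):
    """Risk-averse choice: the model whose worst-case regret is smallest."""
    regret = worst_case_regret(scores)
    return min(regret, key=regret.get)


def select_highest_mean(scores):
    """Assumed risk-neutral choice: the model with the highest mean score."""
    return max(scores, key=lambda m: sum(scores[m]) / len(scores[m]))


def select_maximax(scores):
    """Assumed risk-seeking choice: the model with the highest best case."""
    return max(scores, key=lambda m: max(scores[m]))


if __name__ == "__main__":
    print("minimax regret:", select_minimax_regret(scores))
    print("highest mean:  ", select_highest_mean(scores))
    print("maximax:       ", select_maximax(scores))
```

With these invented scores the minimax-regret rule prefers the steadier model, while the mean-based and best-case rules prefer the model with the highest peak, which is the kind of disagreement between decision-maker profiles that the summary compares.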