The SMART Framework: Selection of Machine Learning Algorithms With ReplicaTions-A Case Study on the Microvascular Complications of Diabetes
Over 34 million people in the US have diabetes, a major cause of blindness, renal failure, and amputations. Machine learning (ML) models can predict high-risk patients to help prevent adverse outcomes. Selecting the 'best' prediction model for a given disease, population, and clinical appl...
Saved in:
| Published in | IEEE journal of biomedical and health informatics Vol. 26; no. 2; pp. 809 - 817 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published |
United States
IEEE
01.02.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects | |
| Online Access | Get full text |
| ISSN | 2168-2194 2168-2208 2168-2208 |
| DOI | 10.1109/JBHI.2021.3094777 |
Cover
| Summary: | Over 34 million people in the US have diabetes, a major cause of blindness, renal failure, and amputations. Machine learning (ML) models can predict high-risk patients to help prevent adverse outcomes. Selecting the 'best' prediction model for a given disease, population, and clinical application is challenging due to the hundreds of health-related ML models in the literature and the increasing availability of ML methodologies. To support this decision process, we developed the Selection of Machine-learning Algorithms with ReplicaTions (SMART) Framework that integrates building and selecting ML models with decision theory. We build ML models and estimate performance for multiple plausible future populations with a replicated nested cross-validation technique. We rank ML models by simulating decision-maker priorities, using a range of accuracy measures (e.g., AUC) and robustness metrics from decision theory (e.g., minimax Regret). We present the SMART Framework through a case study on the microvascular complications of diabetes using data from the ACCORD clinical trial. We compare selections made by risk-averse, -neutral, and -seeking decision-makers, finding agreement in 80% of the risk-averse and risk-neutral selections, with the risk-averse selections showing consistency for a given complication. We also found that the models that best predicted outcomes in the validation set were those with low performance variance on the testing set, indicating a risk-averse approach in model selection is ideal when there is a potential for high population feature variability. The SMART Framework is a powerful, interactive tool that incorporates various ML algorithms and stakeholder preferences, generalizable to new data and technological advancements. |
|---|---|
| Bibliography: | ObjectType-Case Study-2 SourceType-Scholarly Journals-1 content type line 14 ObjectType-Feature-4 ObjectType-Report-1 ObjectType-Article-3 ObjectType-Article-1 ObjectType-Feature-2 content type line 23 |
| ISSN: | 2168-2194 2168-2208 2168-2208 |
| DOI: | 10.1109/JBHI.2021.3094777 |