Interpretable machine learning for early neurological deterioration prediction in atrial fibrillation-related stroke

Bibliographic Details
Published in Scientific Reports Vol. 11; no. 1; pp. 20610 - 9
Main Authors Kim, Seong-Hwan; Jeon, Eun-Tae; Yu, Sungwook; Oh, Kyungmi; Kim, Chi Kyung; Song, Tae-Jin; Kim, Yong-Jae; Heo, Sung Hyuk; Park, Kwang-Yeol; Kim, Jeong-Min; Park, Jong-Ho; Choi, Jay Chol; Park, Man-Seok; Kim, Joon-Tae; Choi, Kang-Ho; Hwang, Yang Ha; Kim, Bum Joon; Chung, Jong-Won; Bang, Oh Young; Kim, Gyeongmoon; Seo, Woo-Keun; Jung, Jin-Man
Format Journal Article
Language English
Published London Nature Publishing Group UK 18.10.2021
Nature Publishing Group
Nature Portfolio
ISSN 2045-2322
DOI 10.1038/s41598-021-99920-7

Summary: We aimed to develop a novel prediction model for early neurological deterioration (END) based on an interpretable machine learning (ML) algorithm for atrial fibrillation (AF)-related stroke and to evaluate the prediction accuracy and feature importance of ML models. Data from multicenter prospective stroke registries in South Korea were collected. After stepwise data preprocessing, we utilized logistic regression, support vector machine, extreme gradient boosting, light gradient boosting machine (LightGBM), and multilayer perceptron models. We used the Shapley additive explanation (SHAP) method to evaluate feature importance. Of the 3,213 stroke patients, the 2,363 who had arrived at the hospital within 24 h of symptom onset and had available information regarding END were included. Of these, 318 (13.5%) had END. The LightGBM model showed the highest area under the receiver operating characteristic curve (0.772; 95% confidence interval, 0.715–0.829). The feature importance analysis revealed that fasting glucose level and the National Institutes of Health Stroke Scale score were the most influential factors. Among ML algorithms, the LightGBM model was particularly useful for predicting END, as it revealed new and diverse predictors. Additionally, the effects of the features on the predictive power of the model were individualized using the SHAP method.
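
The summary says that feature effects were "individualized" with SHAP. As a rough illustration of the idea underneath that method, the sketch below computes exact Shapley values for a single patient by averaging each feature's marginal contribution over every feature ordering. Everything in it is an assumption made for illustration: `toy_risk`, its coefficients, and the `baseline` and `patient` values are hypothetical and are not the published LightGBM model or registry data. Practical SHAP tooling uses faster algorithms than this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance.

    Averages each feature's marginal contribution over all feature
    orderings, so it is only feasible for a handful of features.
    """
    names = list(x)
    contrib = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)      # start from the reference patient
        prev = predict(current)
        for name in order:
            current[name] = x[name]   # switch this feature to its actual value
            new = predict(current)
            contrib[name] += new - prev
            prev = new
    return {name: total / len(orderings) for name, total in contrib.items()}


# Hypothetical toy risk score with made-up coefficients (NOT the study's model).
def toy_risk(p):
    return (0.02 * p["nihss"]
            + 0.001 * p["glucose"]
            + 0.0001 * p["nihss"] * p["glucose"])


baseline = {"nihss": 4, "glucose": 100}   # hypothetical reference patient
patient = {"nihss": 12, "glucose": 180}   # hypothetical patient to explain

phi = shapley_values(toy_risk, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.4f}")
```

The per-feature values add up to the difference between the patient's score and the baseline score, which is what lets the contributions be read patient by patient rather than only as a global ranking.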