Long-term survival prediction in patients with acute brain lesions using ensemble machine learning algorithms: a cohort study with combined national health insurance service and its self-run hospital database

Bibliographic Details
Published in Journal of Big Data Vol. 12; no. 1; pp. 79 - 14
Main Authors Park, Dougho, Hong, Daeyoung, Jin, Suntak, Kim, Jong Hun, Kim, Hyoung Seop
Format Journal Article
Language English
Published Cham Springer International Publishing 01.04.2025
Springer Nature B.V
SpringerOpen
ISSN 2196-1115
DOI 10.1186/s40537-025-01133-6

Summary: Machine learning algorithms have recently made it possible to estimate the long-term prognosis of patients with acute brain lesions. We propose machine learning (ML) models that predict long-term survival after stroke and traumatic brain injury (TBI) based on a combined database of the National Health Insurance Service (NHIS) and its self-run hospital. This retrospective cohort study included adults aged ≥ 20 years who were diagnosed with stroke or TBI between 2009 and 2018. We categorized the participants into long-term care insurance (LTCI) and non-LTCI beneficiaries. The dependent variable was survival until December 31, 2018. Random forest (RF), extreme gradient boosting (XGB), and algorithm stacking methods were used for prediction. The LTCI group comprised 1839 patients and the non-LTCI group 2950 patients. The LTCI group was older (76.1 ± 9.4 years vs. 62.4 ± 13.6 years; P < 0.001) and had a lower survival rate (52.3% vs. 23.6%; P < 0.001). In the LTCI group, the stacking algorithm showed the highest area under the receiver operating characteristic curve (AUC) at 0.864, with an F1-score of 0.872 (sensitivity, 0.914; specificity, 0.814; positive predictive value, 0.833; negative predictive value, 0.903). Both RF and XGB had an AUC of 0.834. In the non-LTCI group, XGB showed the highest AUC at 0.810, with an F1-score of 0.745 (sensitivity, 0.961; specificity, 0.659; positive predictive value, 0.889; negative predictive value, 0.856). The RF and stacking algorithms showed AUCs of 0.780 and 0.801, respectively. The ML algorithms applied in this study showed valid performance, with high accuracy and AUC values, in predicting long-term survival in patients with acute brain lesions using a range of variables from the combined database of the NHIS and its self-run hospital.
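As an illustration of how the reported metrics relate to one another, the F1-score is conventionally the harmonic mean of positive predictive value (precision) and sensitivity (recall). The sketch below is not taken from the article; the function name is hypothetical, and it assumes the conventional definition applies to the LTCI stacking result quoted in the summary.

```python
def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of precision (PPV) and recall (sensitivity).

    Hypothetical helper for illustration; not code from the article.
    """
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Values quoted in the summary for the stacking model in the LTCI group.
ltci_f1 = f1_from_ppv_and_sensitivity(ppv=0.833, sensitivity=0.914)
print(f"{ltci_f1:.3f}")  # 0.872, matching the F1-score reported for this group
```

Under this definition, the LTCI sensitivity and positive predictive value reproduce the reported F1-score of 0.872.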