Long-term survival prediction in patients with acute brain lesions using ensemble machine learning algorithms: a cohort study with combined national health insurance service and its self-run hospital database
| Published in | Journal of Big Data Vol. 12; no. 1; pp. 79 - 14 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | Cham: Springer International Publishing, 01.04.2025; Springer Nature B.V; SpringerOpen |
| ISSN | 2196-1115 |
| DOI | 10.1186/s40537-025-01133-6 |
| Summary: | Machine learning algorithms have recently made it possible to elucidate the long-term prognosis of patients with acute brain lesions. We propose machine learning (ML) models that predict long-term survival after stroke and traumatic brain injury (TBI) based on a combined database of the National Health Insurance Service (NHIS) and its self-run hospital. This retrospective cohort study included adults aged ≥ 20 years who were diagnosed with stroke or TBI between 2009 and 2018. We categorized the participants into long-term care insurance (LTCI) and non-LTCI beneficiaries. The dependent variable was survival until December 31, 2018. Random forest (RF), extreme gradient boosting (XGB), and algorithm stacking methods were used for prediction. Of all patients, 1839 were in the LTCI group and 2950 were in the non-LTCI group. The LTCI group was older (76.1 ± 9.4 years vs. 62.4 ± 13.6 years; P < 0.001) and had a lower survival rate (52.3% vs. 23.6%; P < 0.001). In the LTCI group, the stacking algorithm showed the highest area under the receiver operating characteristic curve (AUC), 0.864, with an F1-score of 0.872 (sensitivity, 0.914; specificity, 0.814; positive predictive value, 0.833; negative predictive value, 0.903). Both RF and XGB had an AUC of 0.834. In the non-LTCI group, XGB showed the highest AUC, 0.810, with an F1-score of 0.745 (sensitivity, 0.961; specificity, 0.659; positive predictive value, 0.889; negative predictive value, 0.856). The RF and stacking algorithms showed AUCs of 0.780 and 0.801, respectively. The ML algorithms applied in this study achieved high accuracy and AUC values for predicting long-term survival in patients with acute brain lesions, using a range of variables from the combined database of the NHIS and its self-run hospital. |
|---|---|
| ISSN: | 2196-1115 |
| DOI: | 10.1186/s40537-025-01133-6 |
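
The summary reports an F1-score alongside sensitivity and positive predictive value (PPV). As a minimal sketch, assuming the F1-score is the usual harmonic mean of sensitivity (recall) and PPV (precision), the relationship can be checked against the figures reported for the stacking model in the LTCI group. The function name `f1_from_sensitivity_and_ppv` is hypothetical and is not taken from the article.

```python
def f1_from_sensitivity_and_ppv(sensitivity: float, ppv: float) -> float:
    """Return the harmonic mean of sensitivity (recall) and PPV (precision).

    Assumption: this is the standard F1 definition; the article's exact
    computation is not described in the record above.
    """
    if sensitivity + ppv == 0:
        # Both inputs are zero, so the harmonic mean is defined as zero here.
        return 0.0
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Values reported in the summary for the stacking model, LTCI group.
ltci_f1 = f1_from_sensitivity_and_ppv(sensitivity=0.914, ppv=0.833)
print(round(ltci_f1, 3))
```

Under this assumption, the result rounds to the F1-score of 0.872 quoted in the summary for that group.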