Application of machine learning in predicting survival outcomes involving real-world data: a scoping review

Bibliographic Details
Published in	BMC medical research methodology Vol. 23; no. 1; article 268 (11 pages)
Main Authors	Huang, Yinan; Li, Jieni; Li, Mai; Aparasu, Rajender R.
Format	Journal Article
Language	English
Published	London: BioMed Central, 13 November 2023 (publisher also listed as BioMed Central Ltd; Springer Nature B.V; BMC)
Subjects	Algorithms; Analysis; Artificial intelligence; Cancer; Care and treatment; Chronic diseases; Datasets; Diagnosis; Disease; Electronic health records; Event history analysis; Health aspects; Health Sciences; Health services; Humans; Machine Learning; Medical prognosis; Medical research; Medicine; Medicine & Public Health; Methods; Neural network; Neural networks; Neural Networks, Computer; Oncology; Patient outcomes; Patients; Prediction Methods for Rare Diseases or Outcomes; Prognosis; Random survival forest; Real-world datasets; Statistical Theory and Methods; Statistics for Life Sciences; Survival analysis; Theory of Medicine/Bioethics; Treatment Outcome; United States
ISSN	1471-2288
DOI	10.1186/s12874-023-02078-1

Summary:	Background: Despite the interest in machine learning (ML) algorithms for analyzing real-world data (RWD) in healthcare, the use of ML in predicting time-to-event data, a common scenario in clinical practice, is less explored. ML models are capable of algorithmically learning from large, complex datasets and can offer advantages in predicting time-to-event data. We reviewed the recent applications of ML for survival analysis using RWD in healthcare. Methods: PubMed and Embase were searched from database inception through March 2023 to identify peer-reviewed English-language studies of ML models for predicting time-to-event outcomes using RWD. Two reviewers extracted information on the data source, patient population, survival outcome, ML algorithms, and the area under the curve (AUC). Results: Of 257 citations, 28 publications were included. Random survival forests (N = 16, 57%) and neural networks (N = 11, 39%) were the most popular ML algorithms. The AUC varied across these ML models (median 0.789, range 0.6–0.950). ML algorithms were predominantly considered for predicting overall survival in oncology (N = 12, 43%). ML survival models were mostly used to predict disease prognosis or clinical events (N = 27, 96%) in oncology, while fewer were used for treatment outcomes (N = 1, 4%). Conclusions: Two ML algorithms, random survival forests and neural networks, are the ones mainly applied to RWD to predict survival outcomes such as disease prognosis or clinical events in oncology. This review shows that opportunities remain to apply these ML algorithms to inform treatment decision-making in clinical practice. More methodological work is also needed to ensure the utility and applicability of ML models for survival outcomes.