Personalized Risk Prediction in Clinical Oncology Research: Applications and Practical Issues Using Survival Trees and Random Forests

Bibliographic Details
Published in	Journal of biopharmaceutical statistics, Vol. 28, no. 2, pp. 333-349
Main Authors	Hu, Chen; Steingrimsson, Jon Arni
Format	Journal Article
Language	English
Published	England: Taylor & Francis, 04.03.2018
Subjects	Accuracy; Algorithms; Breast cancer; Breast Neoplasms - drug therapy; Breast Neoplasms - mortality; Cancer; CART; Computer Simulation - statistics & numerical data; Female; Humans; Medical Oncology - methods; Medical Oncology - statistics & numerical data; Monte Carlo Method; Oncology; Precision Medicine - methods; Precision Medicine - statistics & numerical data; Progression-Free Survival; Proportional Hazards Models; Research Design - statistics & numerical data; Risk Factors; risk prediction; Survival Analysis; survival forests; survival trees
ISSN	1054-3406, 1520-5711
DOI	10.1080/10543406.2017.1377730

More Information
Summary:	A crucial component of making individualized treatment decisions is to accurately predict each patient's disease risk. In clinical oncology, disease risks are often measured through time-to-event data, such as overall survival and progression/recurrence-free survival, and are often subject to censoring. Risk prediction models based on recursive partitioning methods are becoming increasingly popular largely due to their ability to handle nonlinear relationships, higher-order interactions, and/or high-dimensional covariates. The most popular recursive partitioning methods are versions of the Classification and Regression Tree (CART) algorithm, which builds a simple, interpretable tree-structured model. With the aim of increasing prediction accuracy, the random forest algorithm averages multiple CART trees, creating a flexible risk prediction model. Risk prediction models used in clinical oncology commonly use both traditional demographic and tumor pathological factors as well as high-dimensional genetic markers and treatment parameters from multimodality treatments. In this article, we describe the most commonly used extensions of the CART and random forest algorithms to right-censored outcomes. We focus on how they differ from the methods for noncensored outcomes, and how the different splitting rules and methods for cost-complexity pruning impact these algorithms. We demonstrate these algorithms by analyzing a randomized Phase III clinical trial of breast cancer. We also conduct Monte Carlo simulations to compare the prediction accuracy of survival forests with more commonly used regression models under various scenarios. These simulation studies aim to evaluate how sensitive the prediction accuracy is to the underlying model specifications, the choice of tuning parameters, and the degrees of missing covariates.
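
Illustration:	The summary mentions splitting rules for right-censored outcomes without naming one in this record. As a minimal sketch, assuming the two-sample log-rank statistic as the splitting rule (a common choice for survival trees, not confirmed here as the authors' method), the Python below scores candidate splits of one covariate. The function names `logrank_statistic` and `best_threshold` and the toy data are hypothetical.

```python
def logrank_statistic(times, events, in_left):
    """Two-sample log-rank chi-square statistic for one candidate split.

    times   : follow-up time for each patient
    events  : 1 if the event was observed, 0 if right-censored
    in_left : True if the patient would fall in the left child node
    """
    # Only times at which an event was actually observed contribute.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t.
        n = sum(1 for ti in times if ti >= t)
        n_left = sum(1 for ti, g in zip(times, in_left) if ti >= t and g)
        # Events occurring exactly at time t, overall and in the left node.
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d_left = sum(
            1 for ti, e, g in zip(times, events, in_left)
            if ti == t and e == 1 and g
        )
        observed_minus_expected += d_left - d * n_left / n
        if n > 1:
            p = n_left / n
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return observed_minus_expected ** 2 / variance


def best_threshold(x, times, events):
    """Return (threshold, statistic) maximizing the log-rank statistic
    over splits of the form x <= threshold."""
    best = (None, 0.0)
    for c in sorted(set(x))[:-1]:  # the largest value would leave one node empty
        stat = logrank_statistic(times, events, [xi <= c for xi in x])
        if stat > best[1]:
            best = (c, stat)
    return best


# Toy data (hypothetical): four patients, all events observed.
x = [1, 2, 3, 4]
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
threshold, statistic = best_threshold(x, times, events)
```

Censored patients enter the risk sets until their censoring time but never count as events, which is the main difference from a CART split on a fully observed outcome. A survival forest would repeat this kind of search on bootstrap samples and random covariate subsets and then average the resulting trees.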