Censoring Sensitivity Analysis for Benchmarking Survival Machine Learning Methods

Bibliographic Details
Published in: Sci, Vol. 7, No. 1, p. 18
Main Authors: Báskay, János; Mezei, Tamás; Banczerowski, Péter; Horváth, Anna; Joó, Tamás; Pollner, Péter
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 01.03.2025
ISSN: 2413-4155
DOI: 10.3390/sci7010018

More Information
Summary: (1) Background: Survival analysis models in clinical research must effectively handle censored data, where complete survival times are unknown for some subjects. While established methodologies exist for validating standard machine learning models, current benchmarking approaches rarely assess model robustness under varying censoring conditions. This limitation creates uncertainty about model reliability in real-world applications where censoring patterns may differ from training data. We address this gap by introducing a systematic benchmarking methodology focused on censoring sensitivity. (2) Methods: We developed a benchmarking framework that assesses survival models through controlled modification of censoring conditions. Five models were evaluated: Cox proportional hazards, survival tree, random survival forest, gradient-boosted survival analysis, and mixture density networks. The framework systematically reduced observation periods and increased censoring rates while measuring performance through multiple metrics following Bayesian hyperparameter optimization. (3) Results: Model performance showed greater sensitivity to increased censoring rates than to reduced observation periods. Non-linear models, especially mixture density networks, exhibited higher vulnerability to data quality degradation. Statistical comparisons became increasingly challenging with higher censoring rates due to widened confidence intervals. (4) Conclusions: Our methodology provides a new standard for evaluating survival analysis models, revealing the critical impact of censoring on model performance. These findings offer practical guidance for model selection and development in clinical applications, emphasizing the importance of robust censoring handling strategies.
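To make the censoring-sensitivity idea in (2) concrete, the sketch below shows one way an observation period can be progressively shortened to raise the censoring rate in a synthetic cohort. This is a minimal Python illustration using assumed synthetic data (exponential event and censoring times); it is not the authors' benchmarking framework, and the function apply_censoring and all parameter values are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic cohort: true event times (in months) and an
# independent censoring process; not the paper's data or code.
n = 1000
event_time = rng.exponential(scale=24.0, size=n)
censor_time = rng.exponential(scale=48.0, size=n)

def apply_censoring(event_time, censor_time, max_follow_up):
    """Administratively censor at a reduced observation period.

    Returns observed times and event indicators (1 = event observed,
    0 = censored), mimicking one axis of a censoring-sensitivity sweep.
    """
    cutoff = np.minimum(censor_time, max_follow_up)
    observed = np.minimum(event_time, cutoff)
    event = (event_time <= cutoff).astype(int)
    return observed, event

# Sweep progressively shorter observation periods and report the
# resulting censoring rate; a benchmark could pair each setting with
# model refitting and metric evaluation (e.g., concordance index).
for max_follow_up in (60, 36, 24, 12, 6):
    t_obs, e = apply_censoring(event_time, censor_time, max_follow_up)
    print(f"follow-up <= {max_follow_up:>2} months: "
          f"censoring rate = {1 - e.mean():.2f}")

In a full benchmark, each censoring setting would be followed by hyperparameter optimization and evaluation of every candidate model, so that performance can be compared across censoring conditions rather than at a single fixed rate.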