Censoring Sensitivity Analysis for Benchmarking Survival Machine Learning Methods
| Published in | Sci Vol. 7; no. 1; p. 18 |
|---|---|
| Main Authors | , , , , , |
| Format | Journal Article |
| Language | English |
| Published | Basel: MDPI AG, 01.03.2025 |
| Subjects | |
| ISSN | 2413-4155 |
| DOI | 10.3390/sci7010018 |
| Summary: | (1) Background: Survival analysis models in clinical research must effectively handle censored data, where complete survival times are unknown for some subjects. While established methodologies exist for validating standard machine learning models, current benchmarking approaches rarely assess model robustness under varying censoring conditions. This limitation creates uncertainty about model reliability in real-world applications where censoring patterns may differ from training data. We address this gap by introducing a systematic benchmarking methodology focused on censoring sensitivity. (2) Methods: We developed a benchmarking framework that assesses survival models through controlled modification of censoring conditions. Five models were evaluated: Cox proportional hazards, survival tree, random survival forest, gradient-boosted survival analysis, and mixture density networks. The framework systematically reduced observation periods and increased censoring rates while measuring performance through multiple metrics following Bayesian hyperparameter optimization. (3) Results: Model performance showed greater sensitivity to increased censoring rates than to reduced observation periods. Non-linear models, especially mixture density networks, exhibited higher vulnerability to data quality degradation. Statistical comparisons became increasingly challenging with higher censoring rates due to widened confidence intervals. (4) Conclusions: Our methodology provides a new standard for evaluating survival analysis models, revealing the critical impact of censoring on model performance. These findings offer practical guidance for model selection and development in clinical applications, emphasizing the importance of robust censoring handling strategies. |
|---|---|
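
The summary describes two controlled modifications of censoring conditions: shortening the observation period and increasing the censoring rate. The record does not say how the authors implemented either step, so the following is only a minimal sketch of what such perturbations might look like. The function names `truncate_follow_up` and `raise_censoring_rate`, the event coding (1 = event observed, 0 = censored), and the choice to draw the new censoring time uniformly before the original event time are all assumptions made for illustration, not details taken from the article.

```python
import random


def truncate_follow_up(times, events, horizon):
    """Shorten the observation period (hypothetical helper).

    Any subject whose recorded time exceeds `horizon` is administratively
    censored at `horizon`; everyone else is left unchanged.
    """
    new_times, new_events = [], []
    for t, e in zip(times, events):
        if t > horizon:
            new_times.append(horizon)
            new_events.append(0)  # still at risk at the horizon: censored
        else:
            new_times.append(t)
            new_events.append(e)
    return new_times, new_events


def raise_censoring_rate(times, events, target_rate, rng):
    """Increase the share of censored subjects (hypothetical helper).

    Randomly selected observed events are converted to censored records
    until roughly `target_rate` of all subjects are censored.
    """
    n = len(times)
    already_censored = sum(1 for e in events if e == 0)
    needed = max(0, round(target_rate * n) - already_censored)
    event_idx = [i for i, e in enumerate(events) if e == 1]
    to_flip = rng.sample(event_idx, min(needed, len(event_idx)))

    new_times, new_events = list(times), list(events)
    for i in to_flip:
        # Assumption: the subject drops out at a uniformly random time
        # no later than the originally observed event time.
        new_times[i] = rng.uniform(0, times[i])
        new_events[i] = 0
    return new_times, new_events


if __name__ == "__main__":
    times = [2.0, 5.0, 8.0, 10.0]
    events = [1, 1, 0, 1]
    print(truncate_follow_up(times, events, horizon=6.0))
    print(raise_censoring_rate(times, events, 0.75, random.Random(0)))
```

In a benchmark of this kind, each perturbed dataset would then be passed to the models under comparison and scored with the chosen metrics, so that performance can be tracked as a function of horizon and censoring rate.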