Development of a High‐Performance Ultrasound Prediction Model for the Diagnosis of Endometrial Cancer: An Interpretable XGBoost Algorithm Utilizing SHAP Analysis
| Published in | Journal of ultrasound in medicine |
|---|---|
| Main Authors | , , , , , |
| Format | Journal Article |
| Language | English |
| Published | England 29.09.2025 |
| Subjects | |
| ISSN | 0278-4297 1550-9613 |
| DOI | 10.1002/jum.70082 |
| Summary: | To develop and validate an ultrasonography-based machine learning (ML) model for predicting malignant endometrial and cavitary lesions.
This retrospective study was conducted on patients with pathologically confirmed results following transvaginal or transrectal ultrasound from 2021 to 2023. Endometrial ultrasound features were characterized using the International Endometrial Tumor Analysis (IETA) terminology. The dataset was randomly divided (7:3) into training and validation sets. LASSO (least absolute shrinkage and selection operator) regression was applied for feature selection, and an extreme gradient boosting (XGBoost) model was developed. Performance was assessed via receiver operating characteristic (ROC) analysis, calibration, decision curve analysis, sensitivity, specificity, and accuracy.
Among 1080 patients, 6 had a non-measurable endometrium. Of the remaining 1074 cases, 641 were premenopausal and 433 postmenopausal. On the test set, the XGBoost model achieved an area under the curve (AUC) of 0.845 (0.781-0.909) in the premenopausal group, with relatively low sensitivity (0.588, 0.442-0.722) and relatively high specificity (0.923, 0.863-0.959). In the postmenopausal group, the AUC was 0.968 (0.944-0.992), with both sensitivity (0.895, 0.778-0.956) and specificity (0.931, 0.839-0.974) relatively high. SHapley Additive exPlanations (SHAP) analysis identified the key predictors. In premenopausal women these were endometrial-myometrial junction, endometrial thickness, endometrial echogenicity, color Doppler flow score, and vascular pattern; in postmenopausal women, endometrial thickness, endometrial-myometrial junction, endometrial echogenicity, and color Doppler flow score.
The XGBoost-based model exhibited excellent predictive performance, particularly in postmenopausal patients. SHAP analysis further enhances interpretability by identifying key ultrasonographic predictors of malignancy. |
|---|---|
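
The evaluation quantities named in the summary (a random 7:3 split, sensitivity, specificity, accuracy, and the area under the ROC curve) can be illustrated with a minimal, standard-library-only sketch. The function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made for illustration; none of them come from the study, and this is not the authors' implementation.

```python
import random


def split_70_30(items, seed=0):
    """Randomly divide items 7:3 into training and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def confusion_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity, and accuracy at an assumed threshold.

    Assumes both classes (1 = malignant, 0 = benign) are present.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print(confusion_metrics(y_true, y_score))
print(roc_auc(y_true, y_score))
```

On the toy data, one malignant case scores below the threshold and one benign case scores above it, so sensitivity and specificity are both 3/4. The AUC is threshold-independent, which is why a model can pair a high AUC with a comparatively low sensitivity at a particular cut-off, as reported for the premenopausal group.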