An online clustering algorithm predicting model for prostate cancer based on PHI-related variables and PI-RADS in different PSA populations

Bibliographic Details
Published in Cancer Cell International Vol. 25; no. 1; pp. 44 - 11
Main Authors Hu, Jiyuan; Miao, Qi; Ren, Jiayi; Su, Hongbo; Zhang, Xianlu; Bi, Jianbin; Zhang, Gejun
Format Journal Article
Language English
Published London BioMed Central 13.02.2025
Springer Nature B.V
BMC
ISSN 1475-2867
DOI 10.1186/s12935-025-03677-2

Summary: Background and aim: Prostate cancer is the most common malignancy in men. Current diagnostic methods that rely on total PSA (TPSA) or the Prostate Health Index (PHI) alone lack specificity. Some studies have built nomograms to predict risk, but these are not easily visualized. This study aims to find the best negative predictive value (NPV) for PHI and then build a clustering model that displays prostate cancer risk categories, which is particularly useful for patients with PSA > 20 and can be applied in clinical work. Method: The training cohort comprised 708 patients and the validation cohort 143, divided into three groups by PSA level. We determined optimal and customized PHI cut-off values, calculated NPV and positive predictive value (PPV), and selected logistic regression as the best of several machine-learning algorithms. We then identified the significant variables and constructed a clustering algorithm. Finally, the model was validated and made available online for clinical use. Results: The optimal PHI cut-off lower limits for the PSA > 4, PSA 4–20, and PSA > 20 subgroups were 23.85, 24.35, and 40.75, with upper limits of 142.9, 143, and 135.6, respectively. For the PSA > 4 and PSA 4–20 subgroups, the clustering model built on the optimal cohort had higher silhouette coefficients (0.433 and 0.526) than the model built on the customized PHI cohort (0.432 and 0.452). The PSA > 20 subgroup had the highest silhouette coefficient, 0.572. In the validation cohort, AUC values for the three subgroups were 0.761, 0.823, and 0.833, with accuracy rates of 88.81%, 90.38%, and 82.05%. Conclusion: Our clustering model categorizes patients into distinct risk groups with clear visualization and was stable and reliable in the validation cohort, which may aid early diagnosis of prostate cancer in clinical practice.
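
The summary reports NPV and PPV at PHI cut-off values without showing how they are computed. As a minimal sketch only, the Python below computes both from a list of PHI values and biopsy outcomes at a single cut-off. The function name `npv_ppv`, the rule that PHI at or above the cut-off counts as test-positive, and the toy data are all assumptions made for illustration; they are not taken from the article, and only the cut-off 23.85 is a value quoted in the summary.

```python
from typing import Sequence, Tuple


def npv_ppv(phi: Sequence[float],
            has_cancer: Sequence[bool],
            cutoff: float) -> Tuple[float, float]:
    """Return (NPV, PPV), treating PHI >= cutoff as test-positive.

    The positivity rule is an illustrative assumption, not the
    article's documented procedure.
    """
    tp = fp = tn = fn = 0
    for value, cancer in zip(phi, has_cancer):
        test_positive = value >= cutoff
        if test_positive and cancer:
            tp += 1          # true positive
        elif test_positive and not cancer:
            fp += 1          # false positive
        elif not test_positive and cancer:
            fn += 1          # false negative
        else:
            tn += 1          # true negative
    # NPV = TN / (TN + FN); PPV = TP / (TP + FP); NaN if undefined
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return npv, ppv


# Made-up toy data, NOT patient data from the study
phi_values = [18.0, 22.5, 30.1, 45.0, 60.2, 150.0]
biopsy_positive = [False, False, False, True, False, True]

# 23.85 is the lower limit the summary reports for the PSA > 4 subgroup
npv, ppv = npv_ppv(phi_values, biopsy_positive, cutoff=23.85)
print(f"NPV={npv:.2f}, PPV={ppv:.2f}")
```

In this sketch a low cut-off is judged by its NPV (how safely it rules cancer out) and a high cut-off by its PPV (how reliably it rules cancer in), which is one plausible reading of the lower and upper limits in the summary.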