An online clustering algorithm predicting model for prostate cancer based on PHI-related variables and PI-RADS in different PSA populations

Bibliographic Details
Published in	Cancer Cell International Vol. 25; no. 1; article 44 (11 pages)
Main Authors	Hu, Jiyuan; Miao, Qi; Ren, Jiayi; Su, Hongbo; Zhang, Xianlu; Bi, Jianbin; Zhang, Gejun
Format	Journal Article
Language	English
Published	London: BioMed Central, 13.02.2025 (Springer Nature B.V; BMC)
Subjects	Algorithms; Antigens; Biomarkers; Biomedical and Life Sciences; Biomedicine; Biopsy; Cancer Research; Cell Biology; China; Clinical application; Clustering; Clustering model; Data collection; Ethics; Hospitals; Males; Malignancy; Patients; PHI; Prostate cancer; Risk groups; Software packages
ISSN	1475-2867
DOI	10.1186/s12935-025-03677-2

Summary:	Background and aim: Prostate cancer is the most common male malignancy. Current diagnostic methods that rely on TPSA or PHI alone lack specificity. Some studies have built nomograms for predicting risk, but these are not easily visualized. Our study aims to find the best negative predictive value (NPV) for PHI, then build a clustering model that displays prostate cancer risk categories, is particularly useful for patients with PSA > 20, and can be applied in clinical work. Method: We enrolled 708 patients in the training cohort and 143 in the validation cohort, divided into three groups by PSA level. We then determined optimal and customized PHI cut-off values, calculated NPV and PPV, and selected logistic regression as the best method among several machine-learning algorithms. Next, the significant variables were identified and a clustering algorithm was constructed. Finally, the model was validated and made available online for further clinical application. Results: The optimal lower PHI cut-offs for the PSA > 4, PSA 4–20, and PSA > 20 subgroups were 23.85, 24.35, and 40.75, with upper limits of 142.9, 143, and 135.6, respectively. For the PSA > 4 and PSA 4–20 subgroups, the clustering model of the optimal cohort showed higher Silhouette coefficients (0.433 and 0.526) than that of the customized PHI cohort (0.432 and 0.452). The PSA > 20 subgroup had the highest Silhouette coefficient, 0.572. In the validation cohort, the AUC values for the three subgroups were 0.761, 0.823, and 0.833, with accuracy rates of 88.81%, 90.38%, and 82.05%. Conclusion: Our clustering model effectively categorizes patients into distinct risk groups with clear visualization and has demonstrated stability and reliability in the validation cohort, potentially aiding early diagnosis of prostate cancer in clinical practice.
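The summary reports NPV and PPV at chosen PHI cut-offs but does not say how they were computed. As an illustration only, the standard definitions can be sketched as below. The function name `npv_ppv`, the convention that a value at or above the cut-off counts as test-positive, and the toy data are all assumptions made for this example. None of them come from the study.

```python
from typing import Sequence, Tuple


def npv_ppv(values: Sequence[float],
            has_cancer: Sequence[bool],
            cutoff: float) -> Tuple[float, float]:
    """Return (NPV, PPV) for a marker at a given cut-off.

    Assumption: a value >= cutoff is treated as test-positive.
    """
    tp = fp = tn = fn = 0
    for value, cancer in zip(values, has_cancer):
        if value >= cutoff:
            if cancer:
                tp += 1  # test-positive, cancer present
            else:
                fp += 1  # test-positive, no cancer
        else:
            if cancer:
                fn += 1  # test-negative, cancer present
            else:
                tn += 1  # test-negative, no cancer
    # NPV = TN / (TN + FN); PPV = TP / (TP + FP)
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return npv, ppv


# Hypothetical toy data, NOT taken from the study.
phi = [18.0, 22.5, 30.1, 45.0, 60.2, 21.0]
cancer = [False, False, False, True, True, True]

# 23.85 is the lower cut-off the summary reports for PSA > 4;
# it is reused here only to make the example concrete.
npv, ppv = npv_ppv(phi, cancer, cutoff=23.85)
print(f"NPV={npv:.3f}, PPV={ppv:.3f}")
```

Sweeping `cutoff` over a grid and keeping the value with the highest NPV would be one plausible way to search for a lower limit, though the summary does not state that the authors did this.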