Construction of a machine learning-based screening model for IgD myeloma

Bibliographic Details
Published in	Clinica chimica acta Vol. 577; p. 120488
Main Authors	Zhou, Manli, Feng, Sisi
Format	Journal Article
Language	English
Published	Netherlands, Elsevier B.V., 01.09.2025
Subjects	Aged; Algorithms; Female; Humans; IgD myeloma; Immunoglobulin D - blood; Machine Learning; Male; Middle Aged; Model; Multiple Myeloma - blood; Multiple Myeloma - diagnosis
ISSN	0009-8981 1873-3492
DOI	10.1016/j.cca.2025.120488

More Information
Summary:	• Five machine learning algorithms were used to integrate medical data and improve prediction accuracy. • The random forest model shows high accuracy in screening for IgD myeloma patients. • An online tool was developed to predict IgD myeloma. Immunoglobulin D (IgD) myeloma is a rare subtype of multiple myeloma (MM), accounting for approximately 1%–2% of all MM cases. Because serum IgD levels are low, IgD MM produces only subtle M-protein spikes on routine serum electrophoresis, making it prone to misdiagnosis and underdiagnosis. This study aimed to develop a machine learning (ML) model that screens for IgD MM using readily available complete blood count and biochemical test data. Clinical data from 83 newly diagnosed IgD MM patients and 166 non-IgD MM patients were included. Decision tree, random forest, support vector machine (SVM), stochastic gradient descent (SGD), and adaptive boosting (AdaBoost) algorithms were used for model construction. Predictive performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration curve analysis, and decision curve analysis. The random forest-based screening model performed best, incorporating seven key features: lactate dehydrogenase (LDH), albumin, creatinine, calcium (Ca), β2-microglobulin, age, and hemoglobin (Hb). It achieved an AUC of 0.954 (95% CI 0.930–0.977), with a sensitivity of 0.958, a specificity of 0.747, a positive predictive value of 88.3%, and a negative predictive value of 89.9%. The model was also evaluated in the validation cohort. The random forest model shows potential for screening IgD MM patients, particularly where IgD immunotyping is not routinely performed in clinical practice. This could help clinicians with early diagnosis and personalized treatment strategies, thereby optimizing the use of medical resources.
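The summary reports AUC, sensitivity, specificity, and positive and negative predictive values. As a reading aid, the sketch below shows one common way to compute these quantities for a binary screening model. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the record does not state how the study derived its figures.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def screening_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = IgD MM, 0 = non-IgD MM).

    Hypothetical helper; assumes every denominator is non-zero, i.e. both
    classes occur in y_true and both predictions occur in y_pred.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return ScreeningMetrics(
        sensitivity=tp / (tp + fn),  # share of true cases that are flagged
        specificity=tn / (tn + fp),  # share of non-cases that are cleared
        ppv=tp / (tp + fp),          # share of flagged patients who are cases
        npv=tn / (tn + fn),          # share of cleared patients who are non-cases
    )


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive scores higher
    than a randomly chosen negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up toy data, NOT taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = screening_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics)
print(f"AUC = {auc_value:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, and predictive values also depend on the proportion of cases in the cohort, whereas AUC summarizes ranking quality across all thresholds.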