Machine learning driven biomarker selection for medical diagnosis


Bibliographic Details
Published in PLoS ONE Vol. 20; no. 6; p. e0322620
Main Authors Bavikadi, Divyagna; Agarwal, Ayushi; Ganta, Shashank; Chung, Yunro; Song, Lusheng; Qiu, Ji; Shakarian, Paulo
Format Journal Article
Language English
Published United States Public Library of Science 11.06.2025
Public Library of Science (PLoS)
ISSN 1932-6203
DOI 10.1371/journal.pone.0322620

Summary: Recent advances in experimental methods have enabled researchers to collect data on thousands of analytes simultaneously. This has led to correlational studies that associate molecular measurements with diseases such as Alzheimer’s, liver, and gastric cancer. However, using thousands of biomarkers selected from the analytes is not practical for real-world medical diagnosis and is likely undesirable because of the potential for spurious correlations. In this study, we evaluate 4 methods for biomarker selection and 5 machine learning (ML) classifiers for identifying correlations, for 20 approaches in all. We found that contemporary methods outperform previously reported logistic regression in cases where 3 and 10 biomarkers are permitted. With specificity fixed at 0.9, ML approaches produced a sensitivity of 0.240 (3 biomarkers) and 0.520 (10 biomarkers), while standard logistic regression provided a sensitivity of 0.000 (3 biomarkers) and 0.040 (10 biomarkers). We also noted that causal-based methods for biomarker selection performed best when fewer biomarkers were permitted, while univariate feature selection performed best when a greater number of biomarkers were permitted.
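
The summary reports sensitivity at a fixed specificity of 0.9. As a rough illustration of how such a figure can be computed from classifier scores, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `sensitivity_at_specificity`, the thresholding rule (predict positive when the score is at or above the threshold), and the toy data are all assumptions made for this example.

```python
from typing import Sequence


def sensitivity_at_specificity(
    scores: Sequence[float],
    labels: Sequence[int],
    target_specificity: float = 0.9,
) -> float:
    """Best sensitivity over score thresholds whose specificity meets the target.

    Hypothetical helper: labels are 1 (diseased) or 0 (control), and a sample
    is predicted positive when its score is >= the threshold.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    best = 0.0
    # Candidate thresholds: each observed score, plus +inf (nobody predicted positive).
    for threshold in sorted(set(scores)) + [float("inf")]:
        specificity = sum(s < threshold for s in neg) / len(neg)
        sensitivity = sum(s >= threshold for s in pos) / len(pos)
        if specificity >= target_specificity:
            best = max(best, sensitivity)
    return best


# Toy data (invented for illustration): 10 controls and 5 cases.
control_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80]
case_scores = [0.30, 0.50, 0.60, 0.70, 0.90]
scores = control_scores + case_scores
labels = [0] * len(control_scores) + [1] * len(case_scores)

print(sensitivity_at_specificity(scores, labels, 0.9))  # prints 0.8
```

In this toy data one control scores 0.80, so a threshold of 0.50 keeps 9 of 10 controls negative (specificity 0.9) while catching 4 of 5 cases. Fixing specificity this way lets approaches with different score scales be compared at the same false-positive rate.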
Competing Interests: No authors have competing interests.