Serum Raman spectroscopy combined with multiple classification models for rapid diagnosis of breast cancer


Bibliographic Details
Published in Photodiagnosis and photodynamic therapy Vol. 40; p. 103115
Main Authors Li, Hongtao, Wang, Shanshan, Zeng, Qinggang, Chen, Chen, Lv, Xiaoyi, Ma, Mingrui, Su, Haihua, Ma, Binlin, Chen, Cheng, Fang, Jingjing
Format Journal Article
Language English
Published Netherlands Elsevier B.V 01.12.2022
ISSN 1572-1000
1873-1597
DOI 10.1016/j.pdpdt.2022.103115

Summary:
• The differences in biochemical components between the serum of breast cancer patients and healthy controls were analyzed.
• Four classification models were used to distinguish breast cancer patients from healthy subjects, with a highest accuracy of 100%.
• Compared with previous similar spectral studies, this experiment has a higher classification accuracy.

Breast cancer is the malignant tumor with the highest incidence in women. Current diagnostic methods are time-consuming, costly, and dependent on physician experience. In this study, we used serum Raman spectroscopy combined with multiple classification algorithms to implement an auxiliary diagnosis method for breast cancer, which can aid the early diagnosis of breast cancer patients. We analyzed the serum Raman spectra of 171 patients with invasive ductal carcinoma (IDC) and 100 healthy volunteers. The analysis showed differences in the concentrations of nucleic acids, carotenoids, amino acids, and lipids in their blood. These differences provide a theoretical basis for this experiment. First, we used adaptive iteratively reweighted penalized least squares (airPLS) for baseline correction and Savitzky-Golay (SG) smoothing for denoising, to reduce the effect of noise on the experiment. Then, principal component analysis (PCA) was used to extract features. Finally, we built four classification models: support vector machine (SVM), decision tree (DT), linear discriminant analysis (LDA), and Neural Network Language Model (NNLM). The LDA, SVM, and NNLM models achieved 100% accuracy. As a supplement, we added a classification experiment on the raw data. Comparing the results of the two experiments, we concluded that the NNLM was the best model. The results show the reliability of combining serum Raman spectroscopy with classification models under large-sample conditions.
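The Savitzky-Golay smoothing step named in the summary can be illustrated with a minimal sketch. The summary does not state the window length or polynomial order the authors used, so the 5-point quadratic window below is an assumption chosen for illustration, and the function name `sg_smooth5` is hypothetical. In practice a library routine such as `scipy.signal.savgol_filter` would normally be used instead.

```python
# Illustrative sketch only: a 5-point quadratic Savitzky-Golay smoother.
# The window size and polynomial order are assumptions; the summary does
# not report the settings used in the study.

# Standard convolution weights for a 5-point quadratic/cubic SG filter.
SG5_WEIGHTS = (-3, 12, 17, 12, -3)
SG5_NORM = 35  # the weights sum to 35


def sg_smooth5(intensities):
    """Return a smoothed copy of a spectrum (sequence of intensities).

    Each interior point is replaced by the value of a least-squares
    quadratic fitted to the 5-point window centred on it. The two points
    at each end are copied unchanged, which is a simplification.
    """
    y = list(intensities)
    out = y[:]  # edge points stay as they are
    for i in range(2, len(y) - 2):
        window = y[i - 2:i + 3]
        out[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window)) / SG5_NORM
    return out


# Usage: smooth a short, made-up intensity trace containing one spike.
noisy_trace = [10.0, 11.0, 10.5, 18.0, 10.8, 11.2, 10.9]
smoothed_trace = sg_smooth5(noisy_trace)
```

Because the filter fits a local polynomial rather than taking a plain moving average, it tends to preserve peak shape better, which is why it is a common choice for denoising spectra before feature extraction.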