Fusing features of speech for depression classification based on higher-order spectral analysis

Bibliographic Details
Published in	Speech communication Vol. 143; pp. 46 - 56
Main Authors	Miao, Xiaolin; Li, Yao; Wen, Min; Liu, Yongyan; Julian, Ibegbu Nnamdi; Guo, Hao
Format	Journal Article
Language	English
Published	Amsterdam: Elsevier B.V. / Elsevier Science Ltd, 01.09.2022
Subjects	Accuracy; Algorithms; Artificial neural networks; Classification; Classifier; Collaboration; Deep learning; Depression; Diagnosis; Feature extraction; Feature recognition; Higher-order spectrum analysis; Machine learning; Mental depression; Mental health; Model accuracy; Neural networks; Patients; Repositories; Spectrum analysis; Speech recognition; Speech-related features; Support vector machines; Voice recognition
ISSN	0167-6393, 1872-7182
DOI	10.1016/j.specom.2022.07.006

Summary:	• Proposes a feature fusion method based on the higher-order spectral analysis (HOSA) of bi-spectral features (BSFs) and non-linear bi-coherent features (BCFs).
• The HOSA of speech-related features not only achieves high accuracy in speech emotion recognition but is also accurate for depression recognition.
• Fused features were more accurate than the speech-related features obtained from HOSA alone, which in turn were more accurate than the COVAREP-extracted speech-related features.

Approximately 300 million people worldwide suffer from depression, and more than 60% of psychiatric patients do not have access to mental health services because of the shortage of psychiatrists and the high costs of clinical diagnosis and treatment. Correct and efficient diagnosis of depression can help overcome these barriers, and automatic detection of depressive symptoms can improve both the accuracy and the availability of diagnosis. This paper proposes a fusion feature that combines bispectral features and bicoherent features obtained by higher-order spectral analysis. The fusion feature fuses these higher-order spectral features with traditional speech features that have classification weights greater than 100 and are extracted by using A Collaborative Voice Analysis Repository (COVAREP). Experiments were performed on the Depression Sub-Challenge Dataset of the Audio/Visual Emotion Challenge 2017. The support vector machine and k-nearest neighbor classification algorithms were used as the traditional machine learning models, and a convolutional neural network was used as the deep learning model to verify the proposed features. Under the support vector machine, the accuracies of the COVAREP-extracted speech-related features, the higher-order spectral features, and their fusion feature were 63.15%, 68.42%, and 73.68%, respectively. Under the k-nearest neighbor algorithm, the corresponding accuracies were 68.18%, 72.73%, and 77.27%. For the convolutional neural network, the corresponding accuracies were 70%, 77%, and 85%. The results demonstrate that the fusion feature achieves high recognition accuracy and can improve the accuracy of depression identification with both traditional machine learning and deep learning models.
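
The summary names bispectral and bicoherent features without giving their computation. As a rough illustration only, the following sketch estimates a bispectrum and a normalized bicoherence with a standard direct (FFT-based) method averaged over windowed segments, then concatenates a few summary statistics into one vector. It is not the authors' pipeline: the function names, the segment length `nfft`, the hop size, the choice of summary statistics, and the random placeholder signal are all assumptions made for this example.

```python
import numpy as np

def bispectrum_bicoherence(x, nfft=64, hop=32):
    """Direct FFT-based estimate averaged over Hann-windowed segments.

    Returns (bispec, bicoh) on the grid 0 <= f1, f2 < nfft // 2, where
    bispec[f1, f2] ~ E[X(f1) X(f2) conj(X(f1 + f2))].
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(nfft)
    half = nfft // 2
    f = np.arange(half)
    idx_sum = f[:, None] + f[None, :]        # index of f1 + f2, always < nfft
    triple_sum = np.zeros((half, half), dtype=complex)
    power_pair = np.zeros((half, half))
    power_sum = np.zeros((half, half))
    n_segments = 0
    for start in range(0, len(x) - nfft + 1, hop):
        seg = x[start:start + nfft]
        seg = (seg - seg.mean()) * window    # remove DC, then taper
        X = np.fft.fft(seg)
        pair = np.outer(X[:half], X[:half])  # X(f1) * X(f2)
        third = X[idx_sum]                   # X(f1 + f2)
        triple_sum += pair * np.conj(third)
        power_pair += np.abs(pair) ** 2
        power_sum += np.abs(third) ** 2
        n_segments += 1
    if n_segments == 0:
        raise ValueError("signal is shorter than nfft")
    bispec = triple_sum / n_segments
    # Cauchy-Schwarz normalization keeps the bicoherence within [0, 1].
    denom = np.sqrt((power_pair / n_segments) * (power_sum / n_segments))
    bicoh = np.abs(bispec) / np.maximum(denom, 1e-12)
    return bispec, bicoh

def summary_features(matrix):
    """Hypothetical summary statistics of a bi-frequency magnitude map."""
    mag = np.abs(matrix)
    return np.array([mag.mean(), mag.std(), mag.max()])

# Placeholder signal; a real experiment would use recorded speech samples.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)

bispec, bicoh = bispectrum_bicoherence(signal)
# Simple fusion by concatenation (an assumption, not the paper's weighting).
fused = np.concatenate([summary_features(bispec), summary_features(bicoh)])
print(fused.shape)  # (6,)
```

A vector like `fused` could then be joined with conventional speech features and passed to a classifier such as a support vector machine or k-nearest neighbor model. How the paper selects and weights the features it fuses is not specified in this record.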