Automatic Depression Level Assessment Based on Mel Spectrum and Deep Learning Technology

Bibliographic Details
Published in: 2025 6th International Conference on Electronic Communication and Artificial Intelligence (ICECAI), pp. 475-478
Main Authors: Zhang, Tao; Chen, Yanbo; Yue, Jingsong; Li, Mi
Format: Conference Proceeding
Language: English
Published: IEEE, 20.06.2025
DOI: 10.1109/ICECAI66283.2025.11170639

Summary: Traditional diagnostic methods rely on professional inquiries and patient reports, which are time-consuming and susceptible to subjective bias. Assisted diagnosis and early prediction based on artificial intelligence have therefore attracted growing attention in depression detection. This study proposes an automatic depression recognition algorithm based on audio analysis and deep learning. Mel spectrograms are extracted from the audio signal and passed to a depression assessment model that combines a ResNet module, a Transformer, and a multi-scale CNN: the ResNet module extracts local features, the Transformer captures global dependencies, and multi-scale feature extraction enhances the model's expressive ability. Experiments were carried out on the DAIC-WOZ database, which contains audio data from 189 subjects. The model achieved an MAE of 4.28 and an RMSE of 5.31 on the depression detection task, offering a new perspective and method for identifying depression from audio.
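For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how MAE (mean absolute error) and RMSE (root mean squared error) are computed for a regression task. It assumes the model predicts one numeric depression score per subject; the summary does not name the scoring scale, and the function names and sample values below are purely illustrative, not data or code from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average of |true - predicted|."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average squared difference."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)


# Hypothetical per-subject scores (assumption: NOT taken from the paper).
true_scores = [2, 10, 15, 7]
pred_scores = [4, 8, 15, 11]

print("MAE :", mae(true_scores, pred_scores))
print("RMSE:", rmse(true_scores, pred_scores))
```

Because RMSE squares each error before averaging, it is never smaller than MAE and penalizes large individual errors more heavily, which is consistent with the reported RMSE (5.31) exceeding the reported MAE (4.28).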