Different Performances of Machine Learning Models to Classify Dysphonic and Non-Dysphonic Voices

Bibliographic Details
Published in Journal of Voice, Vol. 39, no. 3, pp. 577-590
Main Authors Leite, Danilo Rangel Arruda; de Moraes, Ronei Marcos; Lopes, Leonardo Wanderley
Format Journal Article
Language English
Published United States Elsevier Inc 01.05.2025
ISSN 0892-1997
1873-4588
DOI 10.1016/j.jvoice.2022.11.001

Summary: This study analyzed the performance of 10 machine learning (ML) classifiers in discriminating between dysphonic and non-dysphonic voices, using a variance threshold to select and reduce the acoustic measurements fed to the classifiers. We analyzed voice samples from 435 individuals (337 female and 98 male), with a mean age of 41.07 ± 13.73 years, of whom 384 were dysphonic and 51 were non-dysphonic. From each sustained /ε/ vowel sample, 34 acoustic measurements were extracted, including traditional perturbation and noise measurements, cepstral/spectral measurements, and measurements based on nonlinear models. The variance threshold method was used to select the best set of acoustic measurements. We tested the performance of the selected set with 10 ML classifiers using precision, sensitivity, specificity, accuracy, and F1-score. The kappa coefficient was used to verify reproducibility between the two datasets (training and testing). The naive Bayes (NB) and stochastic gradient descent classifier (SGDC) models performed best in terms of accuracy, area under the curve (AUC), sensitivity, and specificity with the reduced set of 15 acoustic measures compared with the full set of 34 acoustic measures. SGDC and NB obtained the best results, with accuracies of 0.91 and 0.76, respectively. Both classifiers showed moderate agreement, with kappa values of 0.57 (SGDC) and 0.45 (NB). Among the tested models, NB and SGDC performed best in discriminating between dysphonic and non-dysphonic voices from a set of 15 acoustic measures.
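
The two generic techniques named in the summary, variance-threshold feature selection and confusion-matrix metrics, can be illustrated with a minimal sketch. This is not the authors' code: the summary does not state the threshold value, the software used, or how kappa was computed across the training and testing sets. The function names (variance_threshold, select_columns, binary_metrics), the default threshold of 0.0, and the toy data are assumptions made for illustration only. Cohen's kappa is computed here between true and predicted labels, which may differ from how the study applied it.

```python
from statistics import pvariance


def variance_threshold(rows, threshold=0.0):
    """Return indices of columns whose population variance exceeds `threshold`.

    `rows` is a list of equal-length feature vectors (one per voice sample).
    The threshold value is an assumption; the summary does not report it.
    """
    columns = list(zip(*rows))
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def select_columns(rows, keep):
    """Keep only the columns listed in `keep`, preserving their order."""
    return [[row[i] for i in keep] for row in rows]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute the metrics named in the summary from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((tn + fn) / n) * ((tn + fp) / n)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Toy data (hypothetical): three features, the first one constant.
    features = [[1.0, 5.0, 0.0], [1.0, 7.0, 1.0], [1.0, 9.0, 0.0], [1.0, 11.0, 1.0]]
    keep = variance_threshold(features)
    print("kept columns:", keep)
    print("reduced data:", select_columns(features, keep))
    # Hypothetical labels: 1 = dysphonic, 0 = non-dysphonic.
    print(binary_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1]))
```

With a class split as uneven as the one reported (384 dysphonic versus 51 non-dysphonic), accuracy alone can look high for a classifier that rarely predicts the minority class, which is one reason to read specificity and kappa alongside it.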