Different Performances of Machine Learning Models to Classify Dysphonic and Non-Dysphonic Voices

Bibliographic Details
Published in	Journal of Voice Vol. 39; no. 3; pp. 577-590
Main Authors	Leite, Danilo Rangel Arruda; de Moraes, Ronei Marcos; Lopes, Leonardo Wanderley
Format	Journal Article
Language	English
Published	United States Elsevier Inc 01.05.2025
Subjects	Acoustic; Acoustics; Adult; Artificial intelligence; Bayes Theorem; Case-Control Studies; Dysphonia - classification; Dysphonia - diagnosis; Dysphonia - physiopathology; Female; Humans; Machine Learning; Male; Middle Aged; Otolaryngology; Predictive Value of Tests; Reproducibility of Results; Signal Processing, Computer-Assisted; Sound Spectrography; Speech Acoustics; Speech Production Measurement; Voice; Voice disorders; Voice Quality; Young Adult
ISSN	0892-1997, 1873-4588
DOI	10.1016/j.jvoice.2022.11.001

More Information
Summary:	This study analyzed the performance of 10 machine learning (ML) classifiers in discriminating between dysphonic and non-dysphonic voices, using a variance threshold to select and reduce the acoustic measurements supplied to the classifiers. We analyzed 435 samples from individuals (337 female and 98 male) with a mean age of 41.07 ± 13.73 years, of whom 384 were dysphonic and 51 were non-dysphonic. From the sustained /ε/ vowel sample, 34 acoustic measurements were extracted, including traditional perturbation and noise measurements, cepstral/spectral measurements, and measurements based on nonlinear models. The variance method was used to select the best set of acoustic measurements. We tested the performance of the selected set with 10 ML classifiers using precision, sensitivity, specificity, accuracy, and F1-score. The kappa coefficient was used to verify the reproducibility between the two datasets (training and testing). The naive Bayes (NB) and stochastic gradient descent classifier (SGDC) models performed best in terms of accuracy, AUC, sensitivity, and specificity with the reduced dataset of 15 acoustic measures compared with the full dataset of 34 acoustic measures. SGDC and NB achieved accuracies of 0.91 and 0.76, respectively. These two classifiers showed moderate agreement, with kappa values of 0.57 (SGDC) and 0.45 (NB). Among the tested models, NB and SGDC performed best in discriminating between dysphonic and non-dysphonic voices from a set of 15 acoustic measures.
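
The summary names two computational steps: variance-based selection of acoustic measures and evaluation with precision, sensitivity, specificity, accuracy, F1-score, and kappa. The following is a minimal Python sketch of both, not the authors' implementation. The function names, the toy numbers, and the threshold are hypothetical, dysphonic is assumed to be the positive class, and kappa is computed here between predicted and true labels, which may differ from how the study applied it to the training and testing sets.

```python
from statistics import pvariance


def select_by_variance(rows, threshold):
    """Return indices of feature columns whose population variance exceeds threshold.

    rows: list of samples, each a list of acoustic measures (toy data here).
    threshold: hypothetical cutoff; the summary does not state the value used.
    """
    columns = list(zip(*rows))
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def binary_metrics(tp, fp, fn, tn):
    """Compute evaluation metrics from confusion-matrix counts.

    Assumes the positive class is "dysphonic" and that no denominator is zero.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / total
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "kappa": kappa,
    }


# Toy example: the first column is constant, so it carries no information.
toy_rows = [
    [1.0, 0.5, 3.0],
    [1.0, 0.7, 1.0],
    [1.0, 0.6, 2.0],
]
kept = select_by_variance(toy_rows, threshold=0.05)

# Hypothetical confusion-matrix counts, not taken from the study.
scores = binary_metrics(tp=40, fp=5, fn=10, tn=45)
```

With a class balance as skewed as the one reported (384 dysphonic versus 51 non-dysphonic), accuracy alone can look high even for a weak classifier, which is one reason to report specificity and kappa alongside it.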