Noise-Robust DSP-Assisted Neural Pitch Estimation With Very Low Complexity

Bibliographic Details
Published in	Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (2024) pp. 11851 - 11855
Main Authors	Subramani, Krishna; Valin, Jean-Marc; Buthe, Jan; Smaragdis, Paris; Goodwin, Mike
Format	Conference Proceeding
Language	English
Published	IEEE 14.04.2024
Subjects	Artificial neural networks; Estimation; Frequency estimation; instantaneous frequency; Pitch estimation; Real-time systems; Signal processing; Signal processing algorithms; Speech coding
ISSN	2379-190X
DOI	10.1109/ICASSP48485.2024.10447962

Summary:	Pitch estimation is an essential step of many speech processing algorithms, including speech coding, synthesis, and enhancement. Recently, pitch estimators based on deep neural networks (DNNs) have been outperforming well-established DSP-based techniques. Unfortunately, these new estimators can be impractical to deploy in real-time systems, both because of their relatively high complexity and because some require significant lookahead. We show that a hybrid estimator using a small DNN with traditional DSP-based features can match or exceed the performance of pure DNN-based models, with a complexity and algorithmic delay comparable to traditional DSP-based algorithms. We further demonstrate that this hybrid approach can provide benefits for a neural vocoding task.