Language Model Score Regularization for Speech Recognition

Bibliographic Details
Published in: Chinese Journal of Electronics, Vol. 28, No. 3, pp. 604-609
Main Authors: Zhang, Yike; Zhang, Pengyuan; Yan, Yonghong
Format: Journal Article
Language: English
Published: Published by the IET on behalf of the CIE, 01.05.2019
ISSN: 1022-4653, 2075-5597
DOI: 10.1049/cje.2019.03.015

Summary: Inspired by the fact that back-off and interpolated smoothing algorithms have a significant effect on statistical language modeling, this paper proposes a sentence-level language model (LM) score regularization algorithm to improve the fault tolerance of LMs to recognition errors. The proposed algorithm is applicable to both count-based LMs and neural network LMs. Instead of predicting the occurrence of a sequence of words under a fixed-order Markov assumption, the authors use a composite model, consisting of models of different orders with either n-gram or skip-gram features, to estimate the probability of the word sequence. To simplify implementation, they derive a connection between bidirectional neural networks and the proposed algorithm. Experiments were carried out on the Switchboard corpus. Results on N-best list re-scoring show that the proposed algorithm achieves consistent word error rate reductions when applied to count-based LMs, feedforward neural network (FNN) LMs, and recurrent neural network (RNN) LMs.
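
As a rough illustration of the idea described in the abstract, and not the authors' exact formulation, the Python sketch below scores each hypothesis in an N-best list with a weighted combination of toy count-based models of several orders rather than a single fixed-order model. The toy corpus, the add-one smoothing, the interpolation weights, and the N-best record layout are all hypothetical placeholders.

# A minimal sketch of composite multi-order LM scoring for N-best
# re-scoring, under the assumptions stated above.
import math
from collections import Counter
from itertools import chain

def build_counts(corpus, order):
    """Collect history and history+word counts for a toy order-n model."""
    counts = Counter()
    for sent in corpus:
        toks = ["<s>"] * (order - 1) + sent + ["</s>"]
        for i in range(order - 1, len(toks)):
            hist = tuple(toks[i - order + 1:i])
            counts[hist] += 1               # count of the history
            counts[hist + (toks[i],)] += 1  # count of history followed by word
    return counts

def logprob(sent, order, counts, vocab_size):
    """Add-one-smoothed log-probability of a sentence under one order."""
    toks = ["<s>"] * (order - 1) + sent + ["</s>"]
    lp = 0.0
    for i in range(order - 1, len(toks)):
        hist = tuple(toks[i - order + 1:i])
        lp += math.log((counts[hist + (toks[i],)] + 1) / (counts[hist] + vocab_size))
    return lp

def composite_score(sent, models, weights, vocab_size):
    """Composite LM score: weighted sum over models of different orders."""
    return sum(w * logprob(sent, o, c, vocab_size)
               for (o, c), w in zip(models, weights))

# Toy training corpus and a two-hypothesis "N-best list" (acoustic score, words).
corpus = [["hello", "world"], ["hello", "there"], ["hello", "world", "again"]]
vocab_size = len(set(chain.from_iterable(corpus)) | {"</s>"})
models = [(n, build_counts(corpus, n)) for n in (1, 2, 3)]
weights = (0.2, 0.5, 0.3)  # hypothetical interpolation weights

nbest = [(-12.0, ["hello", "world"]), (-11.5, ["hello", "word"])]
best = max(nbest, key=lambda h: h[0] + composite_score(h[1], models, weights, vocab_size))
print(best[1])  # hypothesis preferred after composite-LM re-scoring

The paper also covers skip-gram features and neural network LMs as component models; the sketch keeps only the count-based n-gram case for brevity.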