Learning Domain Specific Language Models for Automatic Speech Recognition through Machine Translation
| Format | Journal Article |
|---|---|
| Language | English |
| Published | 21.09.2021 |
| Summary: | Automatic Speech Recognition (ASR) systems have gained popularity in recent years through their widespread use in smartphones and smart speakers. Building an ASR system for a task-specific scenario depends on the availability of utterances that match both the style of the task and the language in question. In our work, we target a scenario in which task-specific text data is available in a language different from the target language for which an ASR Language Model (LM) is needed. We use Neural Machine Translation (NMT) as an intermediate step to first obtain translations of the task-specific text data. We then train LMs on the 1-best and N-best translations and study ways to improve on such a baseline LM. We develop a procedure to derive word confusion networks from NMT beam search graphs and evaluate LMs trained on these confusion networks. With experiments on the WMT20 chat translation task dataset, we demonstrate that NMT confusion networks can help to reduce the perplexity of both n-gram and recurrent neural network LMs compared to those trained only on N-best translations. |
|---|---|
| DOI: | 10.48550/arxiv.2110.10261 |
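
The summary above describes translating task-specific text with NMT, keeping the N-best hypotheses, training LMs on them, and comparing perplexities. Below is a minimal, self-contained sketch of that last step only: the N-best hypotheses and the held-out sentence are hard-coded placeholders standing in for real NMT beam-search output, and the add-one-smoothed bigram model is an illustrative stand-in for the n-gram and recurrent neural network LMs evaluated in the paper, whose exact pipeline is not reproduced here.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"

def bigrams(sentences):
    """Yield (history, word) pairs with sentence-boundary tokens."""
    for sent in sentences:
        tokens = [BOS] + sent.split() + [EOS]
        yield from zip(tokens, tokens[1:])

# Placeholder N-best translations: stand-ins for NMT beam-search output
# obtained by translating source-language task-specific text.
nbest_translations = [
    "what time does the store open",
    "what time will the store open",
    "when does the store open",
]

# Placeholder held-out target-language text for perplexity evaluation.
held_out = ["what time does the shop open"]

# Pool the N-best hypotheses and count bigrams / histories over them.
bigram_counts = Counter(bigrams(nbest_translations))
history_counts = Counter(h for h, _ in bigrams(nbest_translations))
vocab = {w for pair in bigram_counts for w in pair}

def prob(history, word):
    """Add-one (Laplace) smoothed bigram probability."""
    return (bigram_counts[(history, word)] + 1) / (history_counts[history] + len(vocab))

def perplexity(sentences):
    """Per-token perplexity of the bigram LM on the given sentences."""
    log_prob, n_tokens = 0.0, 0
    for h, w in bigrams(sentences):
        log_prob += math.log(prob(h, w))
        n_tokens += 1
    return math.exp(-log_prob / n_tokens)

print(f"held-out perplexity: {perplexity(held_out):.2f}")
```

In the setting described by the summary, the same perplexity comparison would be repeated for LMs trained on the 1-best translations, the N-best translations, and text derived from the NMT confusion networks, with the confusion networks expected to yield the lowest perplexity.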