Convolutional Sequence to Sequence Model with Non-Sequential Greedy Decoding for Grapheme to Phoneme Conversion
Published in | 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2486-2490 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.04.2018 |
ISSN | 2379-190X |
DOI | 10.1109/ICASSP.2018.8462678 |
Summary: | The greedy decoding method used in conventional sequence-to-sequence models is prone to compounding errors, mainly because it makes inferences in a fixed order, regardless of whether the model's previous guesses are correct. We propose a non-sequential greedy decoding method that generalizes previously proposed greedy decoding schemes. At each inference step, the proposed method determines not only which token to emit but also which position in the output sequence to fill. Specifically, it allows the model to resolve easy parts first, so that the additional context helps the model infer hard parts more easily later. We study a grapheme-to-phoneme conversion task with a fully convolutional encoder-decoder model that embeds the proposed decoding method. Experimental results show that our model outperforms the state-of-the-art model in terms of both phoneme error rate and word error rate. |
---|---|