Convolutional Sequence to Sequence Model with Non-Sequential Greedy Decoding for Grapheme to Phoneme Conversion

Bibliographic Details
Published in: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2486 - 2490
Main Authors: Moon-Jung Chae, Kyubyong Park, Jinhyun Bang, Soobin Suh, Jonghyuk Park, Namju Kim, Longhun Park
Format: Conference Proceeding
Language: English
Published: IEEE, 01.04.2018
ISSN: 2379-190X
DOI: 10.1109/ICASSP.2018.8462678


More Information
Summary: The greedy decoding method used in conventional sequence-to-sequence models is prone to compounding errors, mainly because it makes inferences in a fixed order, regardless of whether or not the model's previous guesses are correct. We propose a non-sequential greedy decoding method that generalizes the greedy decoding schemes proposed in the past. The proposed method determines not only which token to emit, but also which position in the output sequence to infer at each inference step. Specifically, it allows the model to consider easy parts first, helping the model infer hard parts more easily later by providing more information. We study a grapheme-to-phoneme conversion task with a fully convolutional encoder-decoder model that embeds the proposed decoding method. Experimental results show that our model outperforms the state-of-the-art model in terms of both phoneme error rate and word error rate.
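The core idea of the summary — filling the output positions in order of model confidence rather than strictly left to right — can be sketched as follows. This is a minimal illustration, not the paper's implementation: `score_fn` and `toy_score_fn` are hypothetical stand-ins for the convolutional decoder's per-position output distributions, and the "confidence grows with context" behavior is simulated rather than learned.

```python
import numpy as np

def nonsequential_greedy_decode(score_fn, length, vocab_size, mask_id=0):
    """Non-sequential greedy decoding sketch: at each step, pick the
    (position, token) pair the model is most confident about among the
    still-undecided positions, instead of always decoding left to right."""
    output = np.full(length, mask_id)           # all positions start masked
    filled = np.zeros(length, dtype=bool)
    for _ in range(length):
        probs = score_fn(output).astype(float)  # (length, vocab_size) scores
        probs[filled] = -np.inf                 # never revisit decided positions
        pos, tok = np.unravel_index(np.argmax(probs), probs.shape)
        output[pos] = tok
        filled[pos] = True
    return output

def toy_score_fn(partial, mask_id=0):
    """Hypothetical scorer: each position prefers a fixed target token, and
    its confidence rises as more of the sequence is already decided,
    mimicking 'easy parts first, hard parts later with more context'."""
    target = [3, 1, 2]
    base = [0.9, 0.6, 0.8]                      # position 1 is the "hard" one
    n_decided = sum(1 for t in partial if t != mask_id)
    probs = np.full((len(target), 5), 0.01)
    for i, t in enumerate(target):
        probs[i, t] = min(0.99, base[i] + 0.05 * n_decided)
    return probs

decoded = nonsequential_greedy_decode(toy_score_fn, length=3, vocab_size=5)
print(decoded)  # positions are filled in the order 0, 2, 1 — easiest first
```

With the toy scorer above, the decoder settles the confident outer positions before committing to the uncertain middle one, which is the intuition the abstract describes.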