Convolutional Sequence to Sequence Model with Non-Sequential Greedy Decoding for Grapheme to Phoneme Conversion

Bibliographic Details
Published in: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2486 - 2490
Main Authors: Moon-Jung Chae, Kyubyong Park, Jinhyun Bang, Soobin Suh, Jonghyuk Park, Namju Kim, Longhun Park
Format: Conference Proceeding
Language: English
Published: IEEE, 01.04.2018
ISSN: 2379-190X
DOI: 10.1109/ICASSP.2018.8462678


More Information
Summary: The greedy decoding method used in conventional sequence-to-sequence models is prone to compounding errors, mainly because it makes inferences in a fixed order, regardless of whether or not the model's previous guesses are correct. We propose a non-sequential greedy decoding method that generalizes the greedy decoding schemes proposed in the past. The proposed method determines not only which token to emit, but also which position in the output sequence to infer at each inference step. Specifically, it allows the model to consider easy parts first, helping the model infer hard parts more easily later by providing more information. We study a grapheme-to-phoneme conversion task with a fully convolutional encoder-decoder model that embeds the proposed decoding method. Experimental results show that our model outperforms the state-of-the-art model in terms of both phoneme error rate and word error rate.
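The core idea of the summary — filling the output positions in order of model confidence rather than strictly left to right — can be sketched as follows. This is a minimal illustration, not the paper's implementation: `score_fn` and `toy_score_fn` are hypothetical stand-ins for the convolutional decoder's per-position output distributions, and the "confidence grows with context" behavior is simulated rather than learned.

```python
import numpy as np

def nonsequential_greedy_decode(score_fn, length, vocab_size, mask_id=0):
    """Non-sequential greedy decoding sketch: at each step, pick the
    (position, token) pair the model is most confident about among the
    still-undecided positions, instead of always decoding left to right."""
    output = np.full(length, mask_id)           # all positions start masked
    filled = np.zeros(length, dtype=bool)
    for _ in range(length):
        probs = score_fn(output).astype(float)  # (length, vocab_size) scores
        probs[filled] = -np.inf                 # never revisit decided positions
        pos, tok = np.unravel_index(np.argmax(probs), probs.shape)
        output[pos] = tok
        filled[pos] = True
    return output

def toy_score_fn(partial, mask_id=0):
    """Hypothetical scorer: each position prefers a fixed target token, and
    its confidence rises as more of the sequence is already decided,
    mimicking 'easy parts first, hard parts later with more context'."""
    target = [3, 1, 2]
    base = [0.9, 0.6, 0.8]                      # position 1 is the "hard" one
    n_decided = sum(1 for t in partial if t != mask_id)
    probs = np.full((len(target), 5), 0.01)
    for i, t in enumerate(target):
        probs[i, t] = min(0.99, base[i] + 0.05 * n_decided)
    return probs

decoded = nonsequential_greedy_decode(toy_score_fn, length=3, vocab_size=5)
print(decoded)  # positions are filled in the order 0, 2, 1 — easiest first
```

With the toy scorer above, the decoder settles the confident outer positions before committing to the uncertain middle one, which is the intuition the abstract describes.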