A Hybrid Text-to-Speech System That Combines Concatenative and Statistical Synthesis Units

Bibliographic Details
Published in: IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, no. 5, pp. 1278-1288
Main Authors: Tiomkin, S; Malah, D; Shechtman, S; Kons, Z
Format: Journal Article
Language: English
Published: Piscataway, NJ: IEEE (Institute of Electrical and Electronics Engineers), 01.07.2011
ISSN: 1558-7916, 1558-7924
DOI: 10.1109/TASL.2010.2089679

Summary: Concatenative synthesis and statistical synthesis are the two main approaches to text-to-speech (TTS) synthesis. Concatenative TTS (CTTS) stores segments of natural speech features, selected from a recorded speech database. Consequently, CTTS systems enable speech synthesis with natural quality. However, as the footprint of the stored data is reduced, desired segments are not always available in the stored data, and audible discontinuities may result. On the other hand, statistical TTS (STTS) systems, despite having a smaller footprint than CTTS, synthesize speech that is free of such discontinuities. Yet, in general, STTS produces less natural speech than CTTS, as it often sounds muffled. The muffling effect is due to over-smoothing of model-generated speech features. To gain the advantages of both approaches, we propose in this work to combine CTTS and STTS into a hybrid TTS (HTTS) system. Each utterance representation in HTTS is constructed from natural segments and model-generated segments in an interwoven fashion via a hybrid dynamic path algorithm. Reported listening tests demonstrate the validity of the proposed approach.
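
Illustrative note: the summary does not spell out the hybrid dynamic path algorithm, so the following is only a toy sketch of the general idea of interweaving two segment sources with dynamic programming. It is not the authors' method. The function name `hybrid_path`, the per-position costs, and the single `switch_cost` penalty are all assumptions made for illustration.

```python
from typing import List, Tuple

NATURAL, MODEL = 0, 1  # hypothetical labels for the two segment sources


def hybrid_path(
    natural_cost: List[float],
    model_cost: List[float],
    switch_cost: float,
) -> Tuple[float, List[int]]:
    """Pick, per segment position, a natural or a model-generated unit.

    Minimizes the sum of per-position costs plus `switch_cost` for every
    change of source between neighbouring positions (Viterbi-style DP).
    All costs are illustrative assumptions, not values from the paper.
    """
    n = len(natural_cost)
    if n == 0 or n != len(model_cost):
        raise ValueError("cost lists must be non-empty and of equal length")

    costs = (natural_cost, model_cost)
    # best[s] = lowest total cost of any path ending in source s so far
    best = [costs[NATURAL][0], costs[MODEL][0]]
    back: List[Tuple[int, int]] = []  # back-pointers for positions 1..n-1

    for t in range(1, n):
        new_best = [0.0, 0.0]
        pointers = [0, 0]
        for s in (NATURAL, MODEL):
            other = 1 - s
            stay = best[s]
            switch = best[other] + switch_cost
            if stay <= switch:
                new_best[s], pointers[s] = stay + costs[s][t], s
            else:
                new_best[s], pointers[s] = switch + costs[s][t], other
        best = new_best
        back.append((pointers[0], pointers[1]))

    # Trace the cheapest path back from the final position.
    state = NATURAL if best[NATURAL] <= best[MODEL] else MODEL
    total = best[state]
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return total, path
```

In this sketch, a high cost at one position for the natural source (standing in for a missing or poorly matching stored segment) lets the path switch to the model-generated source for that position only, provided the switching penalty is small enough.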