Dialogue act based expressive speech synthesis in limited domain for the Czech language

Bibliographic Details
Published in	Informatica (Ljubljana) Vol. 44; no. 2; pp. 147 - 165
Main Authors	Grůber, Martin; Matoušek, Jindřich; Hanzlíček, Zdeněk; Tihelka, Daniel
Format	Journal Article
Language	English
Published	Ljubljana: Slovenian Society Informatika / Slovensko društvo Informatika, 01.06.2020
Subjects	Acoustics; Algorithms; Cameras; Domains; Human subjects; Methods; Quality; Speech; Speech recognition
ISSN	0350-5596, 1854-3871
DOI	10.31449/inf.v44i2.2559

Summary:	This paper deals with expressive speech synthesis in a dialogue. Dialogue acts, which serve as discrete expressive categories, are used to describe expressivity. The aim of the work is to create a procedure for developing expressive speech synthesis for a dialogue system in a limited domain. The domain is limited here to dialogues between a human and a computer on the topic of reminiscing about personal photographs. To incorporate expressivity into synthetic speech, current algorithms used for neutral speech synthesis are modified. An expressive speech corpus is recorded and annotated using a predefined set of dialogue acts, and its acoustic analysis is performed. Unit selection and HMM-based methods are used to synthesize expressive speech, and an evaluation using listening tests is presented. The listeners assess two basic aspects of synthetic expressive speech for isolated utterances: speech quality and expressivity perception. The evaluation is also performed for utterances in a dialogue to assess the appropriateness of synthetic expressive speech. It can be concluded that synthetic expressive speech is rated positively even though its quality is worse than that of neutral speech synthesis. Nevertheless, synthetic expressive speech is able to transmit expressivity to listeners and to improve the naturalness of the synthetic speech.