Optimizing sub-cost functions for segment selection based on perceptual evaluations in concatenative speech synthesis

Bibliographic Details
Published in: 2004 IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 1, pp. I-657
Main Authors: Toda, T., Kawai, H., Tsuzaki, M.
Format: Conference Proceeding
Language: English, Japanese
Published: Piscataway, N.J., IEEE, 28.09.2004
ISBN: 9780780384842, 0780384849
ISSN: 1520-6149
DOI: 10.1109/ICASSP.2004.1326071

Summary: In concatenative speech synthesis, various factors affect the naturalness of synthetic speech. The cost for segment selection is calculated by integrating several sub-costs that capture the degradation of naturalness caused by these factors. In this paper, we optimize, based on perceptual evaluations, each sub-cost function that converts a linguistic feature or an acoustic parameter into a sub-cost. Two types of perceptual experiments are performed, with test sets constructed by controlling the variation of the sub-costs, to evaluate the independent effect of each sub-cost and the interactions between them. We show the effectiveness of perceptually optimizing the sub-cost functions through a preference test comparing synthetic speech before and after the optimization.