Prosody modification in Filipino speech synthesis using dynamic time warping

Bibliographic Details
Published in: IEEE TENCON 2003: Conference on Convergent Technologies for the Asia-Pacific Region, October 15-17, 2003, Bangalore, India, Vol. 1, pp. 397-401
Main Authors: Co, M.O.; Guevara, R.C.L.
Format: Conference Proceeding
Language: English
Published: IEEE, 2003
ISBN: 0780381629, 9780780381629
DOI: 10.1109/TENCON.2003.1273353

Summary: Prosody has two components: microprosody and macroprosody. Microprosody is determined solely by the individual speech sounds, whereas macroprosody reflects the speaker's choice of intonation. The paper addresses the latter and reports the results of using dynamic time warping (DTW) to change the macroprosody of speech segments in a concatenative synthesizer. Prerecorded Filipino words uttered in isolation are stored in a corpus. When text is typed, the corpus is searched for the utterances of its words. The acoustic features of macroprosody are extracted from prerecorded utterances of selected Filipino sentences and modified through DTW. These features are then embedded in the speech segments using TD-PSOLA. The synthesis process achieves an average mean opinion score (MOS) of 2.64 in the acceptability test. In a separate test, 98% of the synthesized Filipino sentences were correctly identified as either declarative or interrogative.
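
Illustrative note: the summary names dynamic time warping as the alignment step but does not give its details. The sketch below is a generic textbook DTW applied to two one-dimensional pitch (F0) contours, followed by a simple resampling of the target contour onto the source's frame count. It is not the authors' implementation. The function names (`dtw_path`, `warp_to_source_length`), the absolute-difference local cost, and the sample contour values are all assumptions made for illustration.

```python
def dtw_path(source, target):
    """Generic DTW between two 1-D sequences.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs aligning source[i] with target[j].
    """
    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(source[i - 1] - target[j - 1])  # assumed local cost
            cost[i][j] = d + min(cost[i - 1][j],      # advance source only
                                 cost[i][j - 1],      # advance target only
                                 cost[i - 1][j - 1])  # advance both

    # Backtrack from the end to recover the alignment path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


def warp_to_source_length(source, target):
    """Map the target contour onto the source's frames via the DTW path.

    Each source frame receives the mean of the target values aligned to it.
    """
    _, path = dtw_path(source, target)
    buckets = [[] for _ in source]
    for i, j in path:
        buckets[i].append(target[j])
    return [sum(b) / len(b) for b in buckets]


if __name__ == "__main__":
    # Hypothetical F0 contours in Hz (made-up values, not from the paper).
    word_contour = [100.0, 110.0, 120.0, 110.0, 100.0]
    sentence_contour = [100.0, 120.0, 100.0]
    print(warp_to_source_length(word_contour, sentence_contour))
```

In a pipeline like the one the summary describes, a warped contour of this kind would then be handed to a pitch-modification stage such as TD-PSOLA. That stage is outside the scope of this sketch.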