Quality of synthetic speech : perceptual dimensions, influencing factors, and instrumental assessment

This book reviews research towards perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals. They are: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and int...

Bibliographic Details
Main Author: Hinterleitner, Florian.
Format: eBook
Language: English
Published: Singapore : Springer, [2017]
Series: T-labs series in telecommunication services.
Subjects:
ISBN: 9789811037344
9789811037337
Physical Description: 1 online resource

LEADER 05939cam a2200505Ii 4500
001 100001
003 CZ-ZlUTB
005 20240914112435.0
006 m o d
007 cr cnu|||unuuu
008 170411s2017 si ob 000 0 eng d
040 |a N$T  |b eng  |e rda  |e pn  |c N$T  |d N$T  |d GW5XE  |d EBLCP  |d OCLCF  |d YDX  |d UAB  |d ESU  |d AZU  |d UPM  |d OCLCA  |d VT2  |d OTZ  |d OCLCQ  |d IOG  |d U3W  |d CAUOI  |d OCLCQ  |d KSU  |d EZ9  |d WYU  |d OCLCQ  |d UKMGB  |d UKAHL  |d OCLCQ  |d ERF  |d UKBTH  |d LEATE  |d OCLCQ 
020 |a 9789811037344  |q (electronic bk.) 
020 |z 9789811037337 
024 7 |a 10.1007/978-981-10-3734-4  |2 doi 
035 |a (OCoLC)982121294  |z (OCoLC)982156932  |z (OCoLC)982244262  |z (OCoLC)982327851  |z (OCoLC)982394785  |z (OCoLC)982529816  |z (OCoLC)982738414  |z (OCoLC)988383479  |z (OCoLC)999522251  |z (OCoLC)1005793895  |z (OCoLC)1011849632  |z (OCoLC)1048145298  |z (OCoLC)1066633937  |z (OCoLC)1086475366  |z (OCoLC)1112547622  |z (OCoLC)1112841744  |z (OCoLC)1112953962  |z (OCoLC)1116922704  |z (OCoLC)1122818746  |z (OCoLC)1127156212 
100 1 |a Hinterleitner, Florian. 
245 1 0 |a Quality of synthetic speech :  |b perceptual dimensions, influencing factors, and instrumental assessment /  |c Florian Hinterleitner. 
264 1 |a Singapore :  |b Springer,  |c [2017] 
300 |a 1 online resource 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
490 1 |a T-labs series in telecommunication services 
504 |a Includes bibliographical references. 
506 |a Full text is available only from IP addresses of computers at Tomas Bata University in Zlín, or via remote access for staff and students 
520 |a This book reviews research towards perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals. They are: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and intelligibility, (iv) absence of disturbances, and (v) calmness. Moreover, a test protocol for the efficient identification of those dimensions in a listening test is introduced. Furthermore, several factors influencing these dimensions are examined. In addition, different techniques for the instrumental quality assessment of TTS signals are introduced, reviewed and tested. Finally, the requirements for the integration of an instrumental quality measure into a concatenative TTS system are examined. 
505 0 |a Acknowledgements; Contents; Acronyms; Abstract; 1 Introduction; 1.1 Motivation; 1.2 Outline; References; 2 Speech Synthesis; 2.1 Setup of a Speech Synthesizer; 2.1.1 Natural Language Processing (NLP); 2.1.2 Prosody Generation; 2.1.3 Concatenation and Generation of Speech-Signal Parameters; 2.1.4 Speech Signal Generation; 2.2 The Mary Text-to-Speech System (MaryTTS); References; 3 Auditory and Instrumental Quality Evaluation Metrics; 3.1 What Is Perceptual Quality?; 3.2 Taxonomy for the Quality Assessment of Synthetic Speech; 3.2.1 Glass Box Versus Black Box. 
505 8 |a 3.2.2 Laboratory Versus Field Studies; 3.2.3 Linguistic Versus Acoustic; 3.2.4 Auditory Versus Instrumental; 3.3 Auditory Quality Evaluation Metrics; 3.3.1 Functional Tests (the content of this section has previously been published in a slightly different version in [6]); 3.3.2 Judgment Tests (parts of the content of this section have previously been published in a slightly different version in [13] and [6]); 3.4 Instrumental Quality Evaluation Metrics; 3.4.1 Reference-Based Measures (parts of the content of this section have previously been published in a slightly different version in [21]). 
505 8 |a 3.4.2 Reference-Free Measures; References; 4 Perceptual Quality Dimensions; 4.1 State-of-the-Art Perceptual Quality Dimensions (parts of the content of this section have previously been published in a slightly different version in [1]); 4.1.1 Study: Kraft and Portele (Kraft1995); 4.1.2 Study: Mayo et al. I (Mayo2005); 4.1.3 Study: Viswanathan and Viswanathan (Vis2005); 4.1.4 Study: Seget (Seget2007); 4.1.5 Study: Hinterleitner (Hint2010); 4.1.6 Study: Mayo et al. II (Mayo2011); 4.1.7 Restrictions of Discussed Studies. 
505 8 |a 4.2 Semantic Differential and Factor Analysis (parts of the content of this section have previously been published in a slightly different version in [13]); 4.2.1 Experimental Setup; 4.2.2 Statistical Analysis; 4.3 Sorting Task and Multidimensional Scaling (parts of the content of this section have previously been published in a slightly different version in [16]); 4.3.1 Experimental Setup; 4.3.2 Statistical Analysis; 4.4 Summary of the SD/FA and ST/MDS Studies (parts of the content of this section have previously been published in a slightly different version in [16]). 
505 8 |a 4.5 Universal Perceptual Quality Dimensions; 4.5.1 Naturalness of Voice; 4.5.2 Prosodic Quality; 4.5.3 Fluency and Intelligibility; 4.5.4 Absence of Disturbances; 4.5.5 Calmness; 4.5.6 Instructions for TTS Quality Assessment; 4.6 Summary; References; 5 Influencing Factors on Perceptual Quality; 5.1 Influence of the Application (parts of the content of this section have previously been published in a slightly different version in [1]); 5.1.1 Pretest; 5.1.2 Main Test (the content of this section has previously been published in a slightly different version in [10]); 5.1.3 Conclusions. 
590 |a SpringerLink  |b Springer Complete eBooks 
650 0 |a Speech synthesis. 
650 0 |a Speech processing systems. 
650 0 |a Text-to-speech software. 
650 0 |a Telecommunication. 
655 7 |a elektronické knihy  |7 fd186907  |2 czenas 
655 9 |a electronic books  |2 eczenas 
776 0 8 |i Printed edition:  |z 9789811037337 
830 0 |a T-labs series in telecommunication services. 
856 4 0 |u https://proxy.k.utb.cz/login?url=https://link.springer.com/10.1007/978-981-10-3734-4  |y Full text 
992 |c NTK-SpringerENG 
999 |c 100001  |d 100001 
993 |x NEPOSILAT  |y EIZ