Quality of synthetic speech : perceptual dimensions, influencing factors, and instrumental assessment
This book reviews research towards perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals. They are: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and int...
Main Author: | Hinterleitner, Florian |
---|---|
Format: | eBook |
Language: | English |
Published: | Singapore : Springer, [2017] |
Series: | T-labs series in telecommunication services. |
Subjects: | Speech synthesis; Speech processing systems; Text-to-speech software; Telecommunication |
ISBN: | 9789811037344 9789811037337 |
Physical Description: | 1 online resource |
LEADER | 05939cam a2200505Ii 4500 | ||
---|---|---|---|
001 | 100001 | ||
003 | CZ-ZlUTB | ||
005 | 20240914112435.0 | ||
006 | m o d | ||
007 | cr cnu|||unuuu | ||
008 | 170411s2017 si ob 000 0 eng d | ||
040 | |a N$T |b eng |e rda |e pn |c N$T |d N$T |d GW5XE |d EBLCP |d OCLCF |d YDX |d UAB |d ESU |d AZU |d UPM |d OCLCA |d VT2 |d OTZ |d OCLCQ |d IOG |d U3W |d CAUOI |d OCLCQ |d KSU |d EZ9 |d WYU |d OCLCQ |d UKMGB |d UKAHL |d OCLCQ |d ERF |d UKBTH |d LEATE |d OCLCQ | ||
020 | |a 9789811037344 |q (electronic bk.) | ||
020 | |z 9789811037337 | ||
024 | 7 | |a 10.1007/978-981-10-3734-4 |2 doi | |
035 | |a (OCoLC)982121294 |z (OCoLC)982156932 |z (OCoLC)982244262 |z (OCoLC)982327851 |z (OCoLC)982394785 |z (OCoLC)982529816 |z (OCoLC)982738414 |z (OCoLC)988383479 |z (OCoLC)999522251 |z (OCoLC)1005793895 |z (OCoLC)1011849632 |z (OCoLC)1048145298 |z (OCoLC)1066633937 |z (OCoLC)1086475366 |z (OCoLC)1112547622 |z (OCoLC)1112841744 |z (OCoLC)1112953962 |z (OCoLC)1116922704 |z (OCoLC)1122818746 |z (OCoLC)1127156212 | ||
100 | 1 | |a Hinterleitner, Florian. | |
245 | 1 | 0 | |a Quality of synthetic speech : |b perceptual dimensions, influencing factors, and instrumental assessment / |c Florian Hinterleitner. |
264 | 1 | |a Singapore : |b Springer, |c [2017] | |
300 | |a 1 online resource | ||
336 | |a text |b txt |2 rdacontent | ||
337 | |a computer |b c |2 rdamedia | ||
338 | |a online resource |b cr |2 rdacarrier | ||
490 | 1 | |a T-labs series in telecommunication services | |
504 | |a Includes bibliographical references. | ||
506 | |a Full text is available only from IP addresses of computers at Tomas Bata University in Zlín or via remote access for staff and students | ||
520 | |a This book reviews research towards perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals. They are: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and intelligibility, (iv) absence of disturbances, and (v) calmness. Moreover, a test protocol for the efficient identification of those dimensions in a listening test is introduced. Furthermore, several factors influencing these dimensions are examined. In addition, different techniques for the instrumental quality assessment of TTS signals are introduced, reviewed and tested. Finally, the requirements for the integration of an instrumental quality measure into a concatenative TTS system are examined. | ||
505 | 0 | |a Acknowledgements; Contents; Acronyms; Abstract; 1 Introduction; 1.1 Motivation; 1.2 Outline; References; 2 Speech Synthesis; 2.1 Setup of a Speech Synthesizer; 2.1.1 Natural Language Processing (NLP); 2.1.2 Prosody Generation; 2.1.3 Concatenation and Generation of Speech-Signal Parameters; 2.1.4 Speech Signal Generation; 2.2 The Mary Text-to-Speech System (MaryTTS); References; 3 Auditory and Instrumental Quality Evaluation Metrics; 3.1 What Is Perceptual Quality?; 3.2 Taxonomy for the Quality Assessment of Synthetic Speech; 3.2.1 Glass Box Versus Black Box. | |
505 | 8 | |a 3.2.2 Laboratory Versus Field Studies; 3.2.3 Linguistic Versus Acoustic; 3.2.4 Auditory Versus Instrumental; 3.3 Auditory Quality Evaluation Metrics; 3.3.1 Functional Tests (the content of this section has previously been published in a slightly different version in [6]); 3.3.2 Judgment Tests (parts of the content of this section have previously been published in a slightly different version in [13] and [6]); 3.4 Instrumental Quality Evaluation Metrics; 3.4.1 Reference-Based Measures (parts of the content of this section have previously been published in a slightly different version in [21]). | |
505 | 8 | |a 3.4.2 Reference-Free Measures; References; 4 Perceptual Quality Dimensions; 4.1 State-of-the-Art Perceptual Quality Dimensions (parts of the content of this section have previously been published in a slightly different version in [1]); 4.1.1 Study: Kraft and Portele (Kraft1995); 4.1.2 Study: Mayo et al. I (Mayo2005); 4.1.3 Study: Viswanathan and Viswanathan (Vis2005); 4.1.4 Study: Seget (Seget2007); 4.1.5 Study: Hinterleitner (Hint2010); 4.1.6 Study: Mayo et al. II (Mayo2011); 4.1.7 Restrictions of Discussed Studies. | |
505 | 8 | |a 4.2 Semantic Differential and Factor Analysis (parts of the content of this section have previously been published in a slightly different version in [13]); 4.2.1 Experimental Setup; 4.2.2 Statistical Analysis; 4.3 Sorting Task and Multidimensional Scaling (parts of the content of this section have previously been published in a slightly different version in [16]); 4.3.1 Experimental Setup; 4.3.2 Statistical Analysis; 4.4 Summary of the SD/FA and ST/MDS Studies (parts of the content of this section have previously been published in a slightly different version in [16]). | |
505 | 8 | |a 4.5 Universal Perceptual Quality Dimensions; 4.5.1 Naturalness of Voice; 4.5.2 Prosodic Quality; 4.5.3 Fluency and Intelligibility; 4.5.4 Absence of Disturbances; 4.5.5 Calmness; 4.5.6 Instructions for TTS Quality Assessment; 4.6 Summary; References; 5 Influencing Factors on Perceptual Quality; 5.1 Influence of the Application (parts of the content of this section have previously been published in a slightly different version in [1]); 5.1.1 Pretest; 5.1.2 Main Test (the content of this section has previously been published in a slightly different version in [10]); 5.1.3 Conclusions. | |
590 | |a SpringerLink |b Springer Complete eBooks | ||
650 | 0 | |a Speech synthesis. | |
650 | 0 | |a Speech processing systems. | |
650 | 0 | |a Text-to-speech software. | |
650 | 0 | |a Telecommunication. | |
655 | 7 | |a elektronické knihy |7 fd186907 |2 czenas | |
655 | 9 | |a electronic books |2 eczenas | |
776 | 0 | 8 | |i Printed edition: |z 9789811037337 |
830 | 0 | |a T-labs series in telecommunication services. | |
856 | 4 | 0 | |u https://proxy.k.utb.cz/login?url=https://link.springer.com/10.1007/978-981-10-3734-4 |y Full text |
992 | |c NTK-SpringerENG | ||
999 | |c 100001 |d 100001 | ||
993 | |x NEPOSILAT |y EIZ |