Are we using enough listeners? No! An empirically-supported critique of Interspeech 2014 TTS evaluations
Mirjam Wester, Cassia Valentini-Botinhao and Gustav Eje Henter
- CSTR, University of Edinburgh
Introduction
[Figure: Number of comparisons found to be significantly different (y-axis) against number of listeners (x-axis, 20 to 180). Panels: Naturalness, Similarity. Series: ER, ES, ALL, EE.]
[Figure: Rank correlation (y-axis, 0.75 to 1.00) against number of listeners (x-axis, 20 to 180). Panels: Naturalness, Similarity. Series: ER, ES, ALL, EE.]
[Figure: Number of comparisons found to be significantly different (y-axis) against number of datapoints (x-axis). Panels: Naturalness, Similarity. Series: ER, ES, ALL, EE.]