CAN STANDARD ANALYSIS TOOLS BE USED ON DECOMPRESSED SPEECH?
R.J.J.H. van Son Institute of Phonetic Sciences/ACLC University of Amsterdam Herengracht 338, 1016CG Amsterdam Rob.van.Son@hum.uva.nl
Introduction
Microphone change: From HF condenser (Sennheiser MKH 105) to head-mounted dynamic (Shure SM10A)
Sony Minidisc: ATRAC3 on Walkman MZ-R909
Ogg Vorbis (40 kbs): 1.0rc3, 45 kbs effective (factor 15.5)
Ogg Vorbis (80 kbs): 1.0rc3, 85 kbs effective (factor 8.3)
MP3 (192 kbs): LAME 3.92, 204 kbs effective (factor 3.5)
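The compression factors quoted above can be checked with a little arithmetic. The sketch below assumes the uncompressed reference was 44.1 kHz, 16-bit, mono PCM (705.6 kbs). The slides do not state this, but that rate gives factors close to the quoted ones. The function names are illustrative only.

```python
# Assumption (not stated in the slides): the uncompressed reference
# recording is 44.1 kHz, 16-bit, mono PCM.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 1


def pcm_bitrate_kbs(sample_rate_hz=SAMPLE_RATE_HZ,
                    bits_per_sample=BITS_PER_SAMPLE,
                    channels=CHANNELS):
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def compression_factor(effective_kbs, source_kbs=None):
    """Ratio of the uncompressed bit rate to the effective compressed bit rate."""
    if source_kbs is None:
        source_kbs = pcm_bitrate_kbs()
    return source_kbs / effective_kbs


if __name__ == "__main__":
    # Effective bit rates as listed on the slide.
    codecs = [
        ("Ogg Vorbis (40 kbs)", 45.0),
        ("Ogg Vorbis (80 kbs)", 85.0),
        ("MP3 (192 kbs)", 204.0),
    ]
    for name, effective in codecs:
        print(f"{name}: factor {compression_factor(effective):.1f}")
```

The effective bit rates on the slide are rounded, so the computed factors can differ from the quoted ones in the first decimal.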
Vowels Vowel-like Nasals Total
N = 785 786 3549
2.5
       Vowels  Vowel-like  Nasals  Fricatives  Total
N =    2415    853         795     863         4926
3.2 5.4 7.6 5.3
Sony MD > Ogg Vorbis (80 kbs) > MP3 (192 kbs)
[Figure: F0, F1, F2, F3 and CoG by segment class: Vowels, Vowel-like, Nasals, Fricatives (N = 863)]
Weakest Link Determines RMS Error (Sony Minidisc)
Total Error = Sum of Component RMS Errors
Sony MD
Compression cascade
Vowels < 0.7 semitone
Nasals < 0.3 semitone
Holds for Low bit-rates (40 kbs) for Pitch and Formants
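The error bounds above are given in semitones. The slides do not say how the RMS error was computed, so the following is only a sketch of one common definition: convert each pair of measurements (original versus decompressed) to a semitone difference, then take the root mean square. The function names are hypothetical.

```python
import math


def semitone_difference(f_ref_hz, f_test_hz):
    """Signed distance from f_ref_hz to f_test_hz in semitones (12 per octave)."""
    return 12.0 * math.log2(f_test_hz / f_ref_hz)


def rms_semitone_error(ref_hz, test_hz):
    """Root-mean-square semitone difference over paired measurements.

    ref_hz:  values measured on the original recording (Hz)
    test_hz: values measured on the decompressed recording (Hz)
    """
    if len(ref_hz) != len(test_hz) or not ref_hz:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [semitone_difference(r, t) ** 2 for r, t in zip(ref_hz, test_hz)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up F0 values for illustration only, not data from the study.
    original = [110.0, 118.0, 125.0, 131.0]
    decompressed = [111.0, 117.0, 126.5, 130.0]
    print(f"RMS error: {rms_semitone_error(original, decompressed):.2f} semitones")
```

Working in semitones rather than Hz makes the error comparable across speakers and across measures with different frequency ranges.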
Pitch & Formants: Weakest Link
CoG: Sum of Component RMS Errors
Solution: (Partial) Translation of Formats, i.e., No Decompression
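The two cascade models named here can be written down directly. In this sketch, "weakest link" is read as the largest single-stage RMS error and "sum" as the plain sum of the stage errors. That reading of the slide titles is an assumption, and the numbers are made up for illustration.

```python
def weakest_link_error(component_rms):
    """Cascade error if the worst single stage dominates."""
    return max(component_rms)


def summed_error(component_rms):
    """Cascade error if the stage errors simply add up."""
    return sum(component_rms)


def closer_model(observed_rms, component_rms):
    """Name the model whose prediction is nearer to the observed cascade error."""
    d_weakest = abs(observed_rms - weakest_link_error(component_rms))
    d_sum = abs(observed_rms - summed_error(component_rms))
    return "weakest link" if d_weakest <= d_sum else "sum"


if __name__ == "__main__":
    # Hypothetical per-stage RMS errors for a three-stage cascade
    # (e.g. Sony MD, Ogg Vorbis, MP3); not values from the study.
    stages = [0.7, 0.3, 0.2]
    print("weakest link predicts:", weakest_link_error(stages))
    print("sum predicts:", summed_error(stages))
    print("observed 0.75 fits:", closer_model(0.75, stages))
```

Comparing an observed cascade error against both predictions is one simple way to decide which model describes a given measure.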
Low bit-rates (40 kbs)
Repeated Compression
Microphone Choice