Speech Processing 15-492/18-492: Speech Synthesis, Building Voices


  1. Speech Processing 15-492/18-492 Speech Synthesis Building Voices

  2. Building a Voice
     - Designing the Prompts
     - Recording the Prompts
     - Labeling the Utterances
     - Finding parameters (F0, MCEP)
     - Building the synthesis voice
     - Tuning and Testing

  3. Software Requirements
     - Festival Speech Synthesizer
       - Free software, language-independent synthesizer
       - Multiplatform: Windows, Linux, OSX
       - Used for research and commercial synthesis
     - Festvox
       - Voice building tools
       - Scripts, instructions, example databases
       - Used for over 40 different languages

  4. Festival Speech Synthesis
     - After Installation
       - festival --tts stuff.txt
       - festival
       - festival> (SayText "hello world")
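
The two invocations on this slide can also be assembled from a script. The following is a minimal sketch, not part of the original slides: the helper names `say_text_expr` and `tts_command` are hypothetical, and the code only builds the command line and the Scheme expression (it does not run Festival, which is assumed to be installed and on the PATH).

```python
import shlex


def say_text_expr(text):
    """Build the Scheme expression (SayText "...") typed at the festival> prompt."""
    # Escape backslashes and double quotes so the text stays inside one Scheme string.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'(SayText "{escaped}")'


def tts_command(path):
    """Argument list for reading a text file aloud: festival --tts <file>."""
    return ["festival", "--tts", path]


if __name__ == "__main__":
    print(shlex.join(tts_command("stuff.txt")))
    print(say_text_expr("hello world"))
```

The list returned by `tts_command` could be passed to `subprocess.run` on a machine where Festival is available.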

  5. Building Synthetic Voices
     - http://festvox.org/bsv
       - Look at the section on “Telling the Time”

  6. Automatic Labeling

  7. Automatic Labeling (bad)

  8. Parameterization
     - Extract pitch marks from data
       - Find voiced/unvoiced regions
       - Add “fake” pitch marks during unvoiced regions
     - Extract MFCC pitch synchronously
       - Instead of a fixed frame advance (e.g. 5ms)
       - Extract it at each pitch mark
       - Try to capture the spectrum at the pitch period
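
The idea of filling unvoiced regions with "fake" pitch marks, and then cutting one analysis frame per mark, can be sketched as below. This is an illustration under stated assumptions rather than the Festvox implementation: the function names, the 20 ms gap threshold, and the 10 ms fake spacing are all hypothetical choices, and the spectral analysis itself is left out.

```python
def fill_unvoiced(marks_ms, max_gap_ms=20, fake_period_ms=10):
    """Insert evenly spaced fake pitch marks into long gaps.

    marks_ms: sorted pitch-mark times in milliseconds (from voiced speech).
    A gap longer than max_gap_ms is treated as an unvoiced region
    (assumed threshold) and filled every fake_period_ms (assumed spacing).
    """
    filled = []
    for i, t in enumerate(marks_ms):
        if i > 0:
            prev = marks_ms[i - 1]
            if t - prev > max_gap_ms:
                fake = prev + fake_period_ms
                while fake < t:
                    filled.append(fake)
                    fake += fake_period_ms
        filled.append(t)
    return filled


def frames_at_marks(samples, mark_indices, half_width):
    """Cut one frame centred on each pitch mark (clipped at the signal edges).

    mark_indices are sample positions, so the frame advance follows the
    pitch marks instead of a fixed step.
    """
    return [samples[max(0, m - half_width): m + half_width] for m in mark_indices]


if __name__ == "__main__":
    print(fill_unvoiced([0, 8, 16, 56, 64]))
    print(frames_at_marks(list(range(10)), [2, 5], 2))
```

Each frame returned by `frames_at_marks` would then be handed to whatever cepstral analysis is in use, so that the spectrum is sampled once per pitch period.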

  9. Pitchmarks

  10. Building an LDOM synthesizer
      - Build cluster tree on each unit type
        - Not just on phones
        - Tag phones with the word they come from
        - d_limited and d_domain are treated as different
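
Tagging phones with their source word can be sketched as follows. This is an assumption-laden illustration: the function name `group_by_unit_type` and the phone transcriptions in the example are hypothetical, and the cluster tree that would be built for each group is not shown.

```python
from collections import defaultdict


def group_by_unit_type(words):
    """Group phone occurrences by unit type, named phone_word.

    words: list of (word, [phones]) in utterance order.
    Returns a dict mapping each unit type (e.g. 'd_limited') to the
    positions of its occurrences in the phone sequence.
    """
    groups = defaultdict(list)
    index = 0
    for word, phones in words:
        for phone in phones:
            groups[f"{phone}_{word.lower()}"].append(index)
            index += 1
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical transcriptions, for illustration only.
    utterance = [
        ("limited", ["l", "ih", "m", "ih", "t", "ih", "d"]),
        ("domain", ["d", "ow", "m", "ey", "n"]),
    ]
    for unit_type, positions in group_by_unit_type(utterance).items():
        print(unit_type, positions)
```

Because the word is part of the key, the /d/ in "limited" and the /d/ in "domain" land in separate groups, and each group would get its own cluster tree.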

  11. Tuning and Testing
      - Test it on some real data
        - Ensure number/symbol expansions are correct
        - Prompts should probably be word expanded
        - Flight US187 -> flight u s one eight seven
      - Remove bad prompts
        - Or fix labels
      - Remember to keep access to the speaker
        - If you have to update the system, you need the same speaker available
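
The word expansion in the flight example can be sketched as below. This is a minimal illustration, not a general text normalizer: the function names are hypothetical, and it assumes that mixed or all-uppercase tokens are spelled letter by letter and that digits are read one at a time, which suits flight numbers but not quantities or dates.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def expand_token(token):
    """Expand one token into a list of lowercase words."""
    # Ordinary words (not all uppercase) pass through, lowercased.
    if token.isalpha() and not token.isupper():
        return [token.lower()]
    words = []
    for ch in token:
        if ch in "0123456789":
            words.append(DIGIT_WORDS[int(ch)])  # read digits one by one
        elif ch.isalpha():
            words.append(ch.lower())            # spell out letters
        # any other character (punctuation, symbols) is dropped
    return words


def expand_prompt(text):
    """Expand every whitespace-separated token and rejoin with spaces."""
    return " ".join(w for tok in text.split() for w in expand_token(tok))


if __name__ == "__main__":
    print(expand_prompt("Flight US187"))
```

Running the expansion over every prompt before recording makes it easy to check that what the speaker reads matches what the synthesizer will later generate.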

  12. Summary
      - Building a voice
        - Database design, recording, labeling
        - Parameter extraction and model building
      - Limited domain synthesis
