19.05.2009 1
Emotional Speech Synthesis
State of the art 2009 Felix Burkhardt
outline
how to model and why simulate emotions?
emotions in speech
introduction to speech synthesis
approaches
examples, examples, examples
conclusion and outlook
…everyone except a psychologist knows what an emotion is (Young 1973)
dominance, valence
pleasantness, relevance, coping potential,
emotion cube (axes: arousal, valence, dominance)
emotions placed in the cube: anger, joy, sadness, content, neutral, despair, boredom
source: Burkhardt 2001
aspects of emotion modeling in human-machine interaction:
source: Batliner et al 2006
fun, e.g. emotional greetings
prosthesis
emotional chat avatars
gaming, believable characters
adapted dialog design
adapted persona design
target-group specific advertising
…
believable agents
…
artificial humans
time
source: Reynolds et al 2003
descriptive layers of speech
source: TUB emotional database
spectrograms from emotional acted speech: frightened, sad, happy, bored, neutral, angry
actors vs. reality
Berlin EmoDB: 10 actors x 7 emotions x 10 sentences
alternatives:
induced data, e.g. Aibo
television, radio data
EmoDB: Burkhardt et al 2005
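The recording design above is a full cross of actors, emotions and sentences. A minimal sketch of that grid, assuming placeholder actor and sentence ids; the emotion labels are also an assumption (six follow the spectrogram slide, "disgusted" is added here as the seventh). The result is the nominal size of the design, not necessarily the number of recordings in the released corpus.

```python
from itertools import product

# Placeholder ids, not the speaker or sentence codes used in the corpus.
ACTORS = [f"actor_{i:02d}" for i in range(1, 11)]
SENTENCES = [f"sentence_{i:02d}" for i in range(1, 11)]
# Assumed label set: six labels from the spectrogram slide plus "disgusted".
EMOTIONS = ["frightened", "sad", "happy", "bored", "neutral", "angry", "disgusted"]


def design_grid(actors, emotions, sentences):
    """Return every (actor, emotion, sentence) combination of a full-factorial design."""
    return list(product(actors, emotions, sentences))


grid = design_grid(ACTORS, EMOTIONS, SENTENCES)
print(len(grid))  # 10 * 7 * 10 combinations
```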
EmotionML, incubator group at W3C
Example, embedded in SSML:
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice gender="female">
    <prosody contour="(0%,+20Hz)(10%,+30%)(40%,+10Hz)">
      Hi, I am sad now but start getting angry...
    </prosody>
  </voice>
  <emotion>
    <category name="sadness" set="basic" intensity="0.6"/>
    <timing start="10%" end="50%"/>
  </emotion>
  <emotion>
    <category name="anger" set="basic" intensity="0.4"/>
    <timing start="50%" end="100%"/>
  </emotion>
</speak>
http://www.w3.org/2005/Incubator/emotion/
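A minimal sketch of how the emotion annotations in the example could be read, using Python's standard xml.etree.ElementTree. The string is a well-formed copy of the slide example; the function name read_emotions and the returned tuple layout are hypothetical. It follows the draft syntax shown on the slide and has not been checked against any later EmotionML version.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"

# Well-formed copy of the slide example (draft incubator syntax).
DOC = """<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice gender="female">
    <prosody contour="(0%,+20Hz)(10%,+30%)(40%,+10Hz)">
      Hi, I am sad now but start getting angry...
    </prosody>
  </voice>
  <emotion>
    <category name="sadness" set="basic" intensity="0.6"/>
    <timing start="10%" end="50%"/>
  </emotion>
  <emotion>
    <category name="anger" set="basic" intensity="0.4"/>
    <timing start="50%" end="100%"/>
  </emotion>
</speak>"""


def percent(value):
    """Convert a string such as '10%' to the fraction 0.1."""
    return float(value.rstrip("%")) / 100.0


def read_emotions(xml_text):
    """Return (name, intensity, start, end) for every emotion element."""
    root = ET.fromstring(xml_text)
    result = []
    # The unprefixed elements inherit the default (SSML) namespace.
    for emotion in root.findall(f"{{{SSML_NS}}}emotion"):
        category = emotion.find(f"{{{SSML_NS}}}category")
        timing = emotion.find(f"{{{SSML_NS}}}timing")
        result.append((
            category.get("name"),
            float(category.get("intensity")),
            percent(timing.get("start")),
            percent(timing.get("end")),
        ))
    return result


emotions = read_emotions(DOC)
for name, intensity, start, end in emotions:
    print(name, intensity, start, end)
```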
source: Loquendo
speech synthesis systems
voice response systems
arbitrary speech synthesizers
re (copy)-synthesis, voice transformation
text-to-speech (unknown input)
concept-to-speech (input from text-generation system)
voice conversion
NLP (natural language processing): preprocessing, morpho-syntactic analysis, transcription, prosody modeling
passed from NLP to DSP: phonetic transcription, prosody track
DSP (digital speech processing): unit concatenation / search, prosody fitting, edge smoothing
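The two-stage architecture above can be sketched as a toy pipeline: an NLP stage that turns text into phonemes with a prosody track, and a DSP stage that consumes them. Everything here is an assumption for illustration: the lexicon entries, the fixed segment duration, the linearly declining pitch and the function names. The DSP stage is a stub that only counts output samples.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    phoneme: str       # phonetic transcription symbol
    duration_ms: int   # prosody track: segment duration
    f0_hz: float       # prosody track: pitch target


# Toy pronunciation lexicon (hypothetical, SAMPA-like symbols).
LEXICON = {"hello": ["h", "@", "l", "@U"], "world": ["w", "3:", "l", "d"]}


def preprocess(text):
    """Stand-in for text normalisation: lower-case words, strip punctuation."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def transcribe(words):
    """Lexicon lookup; a real system adds letter-to-sound rules for unknown words."""
    phonemes = []
    for word in words:
        phonemes.extend(LEXICON[word])
    return phonemes


def model_prosody(phonemes, base_f0=120.0, duration_ms=80):
    """Assign a fixed duration and a pitch that falls by 20 % over the utterance."""
    last = max(len(phonemes) - 1, 1)
    return [Segment(p, duration_ms, base_f0 * (1.0 - 0.2 * i / last))
            for i, p in enumerate(phonemes)]


def nlp(text):
    """NLP stage: text in, phonetic transcription plus prosody track out."""
    return model_prosody(transcribe(preprocess(text)))


def dsp(segments, sample_rate=16000):
    """DSP stub: return how many samples a synthesizer would have to generate."""
    return sum(s.duration_ms for s in segments) * sample_rate // 1000


segments = nlp("Hello, world!")
print(len(segments), dsp(segments))
```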
system modeling, signal modeling
expert systems, formant synthesis, articulatory synthesis, vocal tract shape synthesis, concatenative synthesis
coding of units
type of units: syllables, diphones, allophones, subsegments
parametric coded: LPC (linear predictive coding), MFCC (mel frequency cepstral), MBR (multi band resynthesis), formants
waveform coded: PCM, LDM (linear delta mod.)
hybrid approaches: MBRPSOLA, RELP
statistical model generated: HMM (hidden Markov models), ANN (neural nets)
rule based, data based
non-uniform unit selection
pseudo articulatory
1780: articulatory, von Kempelen
1980: formant synthesis, e.g. Dec Talk
1990: PSOLA based synthesis, e.g. Elan
2000: non-uniform unit selection, e.g. RealSpeak
historic: flexible, artificial sounding, domain independent
modern: not flexible, natural sounding, domain dependent
source: Klatt80 formant synthesizer (Klatt 1980)
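The basic building block of a Klatt-style formant synthesizer is a second-order digital resonator, y[n] = A x[n] + B y[n-1] + C y[n-2], with coefficients derived from formant frequency F and bandwidth BW. A minimal sketch of that one building block, not of the full synthesizer; the function names and the formant values in the usage lines are illustrative assumptions.

```python
import math


def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz):
    """Coefficients A, B, C of the second-order resonator."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalises the gain at 0 Hz to one
    return a, b, c


def resonate(signal, freq_hz, bandwidth_hz, sample_rate_hz):
    """Filter a list of samples with one formant resonator."""
    a, b, c = resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz)
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def cascade(signal, formants, sample_rate_hz):
    """Pass the signal through several resonators in series."""
    for freq_hz, bandwidth_hz in formants:
        signal = resonate(signal, freq_hz, bandwidth_hz, sample_rate_hz)
    return signal


# Illustrative formant settings (frequency, bandwidth) in Hz, not taken from the slides.
FORMANTS = [(500.0, 60.0), (1500.0, 90.0), (2500.0, 150.0)]
impulse = [1.0] + [0.0] * 199
response = cascade(impulse, FORMANTS, 10000)
print(len(response))
```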
based on MBROLA synthesis engine
NOT a complete text-to-speech system
prosody filter between natural language and digital speech signal processing modules
as multilingual as MBROLA, which currently supports 35 languages
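A prosody filter of the kind described above sits between the NLP output and the MBROLA input. A minimal sketch, assuming MBROLA's .pho text format (one phoneme per line: name, duration in ms, then optional pairs of position in percent and pitch in Hz; lines starting with ";" are comments). The function modify_pho, the phoneme names and the scaling factors are hypothetical and do not reproduce the rules of any existing system.

```python
def modify_pho(pho_text, pitch_factor=1.0, duration_factor=1.0):
    """Scale all durations and pitch targets of a .pho description."""
    out = []
    for line in pho_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(";"):
            out.append(line)  # keep comments and blank lines unchanged
            continue
        fields = stripped.split()
        phoneme, duration = fields[0], float(fields[1])
        targets = fields[2:]  # alternating: position (%), pitch (Hz)
        new_fields = [phoneme, str(round(duration * duration_factor))]
        for position, f0 in zip(targets[0::2], targets[1::2]):
            new_fields += [position, str(round(float(f0) * pitch_factor))]
        out.append(" ".join(new_fields))
    return "\n".join(out)


# Hypothetical neutral input; "_" marks silence.
NEUTRAL = """; hypothetical neutral input
_ 100
h 60 0 120
a 120 50 130 100 110
_ 100"""

# Illustrative "sad" setting: lower pitch, slower speech (assumed values).
sad = modify_pho(NEUTRAL, pitch_factor=0.8, duration_factor=1.25)
print(sad)
```

Because only the phoneme and prosody description is rewritten, such a filter works with any voice the synthesis engine offers, which is why it inherits the engine's language coverage.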
emoSpeak is integrated
Marc Schröder
the system can be freely
source: Schröder 2004
neutral, sad: phase vocoder (Greg Beller, IRCAM)
neutral, angry: PSOLA - LPC conversion (Murtaza Bulut et al, USC)
Laughter synthesis by LPC synthesis and mass-spring model (Shiva Sundaram, USC 2007)
woman, as boy, as man, man, breathy, whispery, tense: mixed LF + harmonic model (Olivier Rosec, FranceTelecom 2009)
neutral, sad, angry, crying, content: prosody rules + phonation model, EmoSyn (Burkhardt, 2000)
sad, angry: DEC Talk prosody rules, AffectEditor
neutral, joy: prosody rules, EmoFilt (Burkhardt, 1999)
joy, angry: prosody rules for dimensions, three inventories for soft, normal and tense speech, MARY
neutral, joy: HMM, models spectral and prosodic features (Tokyo Institute, Kobayashi Lab)
Katrin extralinguistic units product research CTTS with expressive units Damian Shouty fun personality voices
anger, fear: formant synthesis, MIT Kismet robot
happy, sad: concatenative, Oudeyer: Sony pet robots
bicycle: 1961, articulatory, first song ever, Bell Labs, Gerstman & Mathews
aria: 1993, articulatory, Pavarobotti, Ingo Titze
donna nobis: 2007, articulatory, vocal tract lab, Peter Birkholz
http://emosamples.syntheticspeech.de
emotions are part of natural speech
simulation possible by either modeling the process or including emotional data
text-to-speech still struggles with intelligible, neutral speech
first steps: speaking styles, extralinguistics
first apps: fun, gaming
discrepancy between natural but inflexible vs. artificial sounding but flexible
solutions, short to medium term:
very large databases
hybrid parametric - non-uniform unit selection
voice transformation techniques
high quality source-filter model based synthesis
solutions in the long run:
physical modeling