Emotional Speech Synthesis: State of the art 2009, Felix Burkhardt



SLIDE 1

19.05.2009

Emotional Speech Synthesis

State of the art 2009 Felix Burkhardt

SLIDE 2

Emotional Speech Synthesis - Felix Burkhardt

outline

  • how to model and why simulate emotions?
  • emotions in speech
  • introduction to speech synthesis approaches
  • examples, examples, examples
  • conclusion and outlook

SLIDE 3

contents

  • how to model and why simulate emotions?
  • emotions in speech
  • overview on speech synthesis
  • examples, examples, examples
  • conclusion, outlook

SLIDE 4

emotion models

…everyone except a psychologist knows what an emotion is (Young 1973)

  • categories, e.g. anger, joy, …
  • dimensions, e.g. activation, dominance, valence
  • appraisals, e.g. novelty, intrinsic pleasantness, relevance, coping potential

[Figure: emotion cube with the axes arousal, valence, and dominance; plotted emotions: anger, joy, sadness, content, neutral, despair, boredom. Source: Burkhardt 2001]

SLIDE 5

why model emotional behaviour?

aspects of emotion modeling in human-machine interaction:

source: Batliner et al 2006

SLIDE 6

applications of emotional tts

  • fun, e.g. emotional greetings
  • prosthesis
  • emotional chat avatars
  • gaming, believable characters
  • adapted dialog design
  • adapted persona design
  • target-group specific advertising
  • …
  • believable agents
  • …
  • artificial humans

[Figure: time axis]

SLIDE 7

aspects of emotional tts

SLIDE 8

contents

  • why simulate emotions?
  • emotions in speech
  • overview on speech synthesis
  • examples, examples, examples
  • conclusion, outlook

SLIDE 9

speech features

source: Reynolds et al 2003

descriptive layers of speech

SLIDE 10

emotion in speech

source: TUB emotional database

[Figure: spectrograms from emotional acted speech; panels: frightened, sad, happy, bored, neutral, angry]

SLIDE 11

emotional data?

  • actors vs. reality
  • Berlin EmoDB: 10 actors x 7 emotions x 10 sentences
  • alternatives: induced data, e.g. Aibo; television, radio data

EmoDB: Burkhardt et al 2005

SLIDE 12

how to describe emotion?

EmotionML, incubator group at W3C

Example, embedded in SSML:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice gender="female">
    <prosody contour="(0%,+20Hz)(10%,+30%)(40%,+10Hz)">
      Hi, I am sad now but start getting angry...
    </prosody>
  </voice>
  <emotion>
    <category name="sadness" set="basic" intensity="0.6"/>
    <timing start="10%" end="50%"/>
  </emotion>
  <emotion>
    <category name="anger" set="basic" intensity="0.4"/>
    <timing start="50%" end="100%"/>
  </emotion>
</speak>

http://www.w3.org/2005/Incubator/emotion/
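
A minimal sketch of how a consumer might read the emotion annotations from the example above, using only the Python standard library. The element and attribute names are taken from the slide (a 2009 incubator draft) and may differ in later EmotionML versions; the function name `emotion_spans` and the shortened prosody text are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on <speak> in the slide's example.
SSML_NS = "http://www.w3.org/2001/10/synthesis"

DOC = """<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice gender="female">
    <prosody contour="(0%,+20Hz)(10%,+30%)(40%,+10Hz)">Hi</prosody>
  </voice>
  <emotion>
    <category name="sadness" set="basic" intensity="0.6"/>
    <timing start="10%" end="50%"/>
  </emotion>
  <emotion>
    <category name="anger" set="basic" intensity="0.4"/>
    <timing start="50%" end="100%"/>
  </emotion>
</speak>"""


def emotion_spans(xml_text):
    """Return (category, intensity, start, end) for each <emotion> element."""
    root = ET.fromstring(xml_text)
    ns = {"s": SSML_NS}  # children inherit the default namespace of <speak>
    spans = []
    for emotion in root.findall("s:emotion", ns):
        category = emotion.find("s:category", ns)
        timing = emotion.find("s:timing", ns)
        spans.append((
            category.get("name"),
            float(category.get("intensity")),
            timing.get("start"),
            timing.get("end"),
        ))
    return spans


if __name__ == "__main__":
    for span in emotion_spans(DOC):
        print(span)
```

The timing values are kept as percentage strings because the slide does not say how they map to samples or phonemes.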

SLIDE 13

loquendo tts director

source: Loquendo

SLIDE 14

contents

  • why simulate emotions?
  • emotions in speech
  • introduction to speech synthesis approaches
  • examples, examples, examples
  • conclusion, outlook

SLIDE 15

speech synthesis taxonomy

[Figure: taxonomy tree; labels in reading order]

  • speech synthesis systems
  • voice response systems
  • arbitrary speech synthesizers
  • re (copy)-synthesis, voice transformation
  • text-to-speech (unknown input)
  • concept-to-speech (input from text-generation system)
  • voice conversion

SLIDE 16

tts process chain

  • NLP (natural language processing): preprocessing, morpho-syntactic analysis, transcription, prosody modeling
  • passed from NLP to DSP: phonetic transcription, prosody track
  • DSP (digital speech processing): unit concatenation / search, prosody fitting, edge smoothing
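
The ordering of the stages in this chain can be sketched as a small pipeline. This is only an illustration of the data flow, not a working synthesizer: the `Utterance` container, the letter-per-phoneme transcription, and the fixed duration and falling pitch values are all assumptions, and the DSP stages are empty placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Utterance:
    text: str
    tokens: List[str] = field(default_factory=list)
    phonemes: List[str] = field(default_factory=list)  # phonetic transcription
    # Prosody track: one (duration_ms, f0_hz) pair per phoneme.
    prosody: List[Tuple[int, float]] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


# --- NLP stages (toy implementations) ---
def preprocessing(u):
    u.tokens = u.text.lower().replace(",", " ").replace(".", " ").split()


def morpho_syntactic_analysis(u):
    pass  # placeholder: a real system would tag and parse the tokens here


def transcription(u):
    # Toy rule: one symbol per letter stands in for a real phonetic transcription.
    u.phonemes = [ch for token in u.tokens for ch in token]


def prosody_modeling(u):
    # Toy rule: constant duration and a linearly falling pitch from 120 to 100 Hz.
    n = len(u.phonemes)
    u.prosody = [(80, 120.0 - 20.0 * i / max(n - 1, 1)) for i in range(n)]


# --- DSP stages (placeholders) ---
def unit_concatenation(u):
    pass


def prosody_fitting(u):
    pass


def edge_smoothing(u):
    pass


NLP_STAGES = [preprocessing, morpho_syntactic_analysis, transcription, prosody_modeling]
DSP_STAGES = [unit_concatenation, prosody_fitting, edge_smoothing]


def run_chain(text):
    """Run all stages in order and record which stage ran when."""
    u = Utterance(text)
    for stage in NLP_STAGES + DSP_STAGES:
        stage(u)
        u.log.append(stage.__name__)
    return u
```

The phonetic transcription and the prosody track are the only data handed from the NLP stages to the DSP stages, which is the point where an emotional prosody modification could be applied.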

SLIDE 17

synthesis approaches

[Figure: taxonomy of synthesis approaches; labels in reading order]

  • system modeling, signal modeling
  • expert systems
  • formant synthesis
  • articulatory synthesis
  • vocal tract shape synthesis
  • concatenative synthesis
  • coding of units
  • type of units: syllables, diphones, allophones, subsegments
  • parametric coded: LPC (linear predictive coding), MFCC (mel frequency cepstral coefficients), MBR (multi band resynthesis), formants
  • waveform coded: PCM, LDM (linear delta mod.)
  • hybrid approaches: MBRPSOLA, RELP
  • statistical model generated: HMM (hidden Markov models), ANN (neural nets)
  • rule based, data based
  • non-uniform unit selection
  • pseudo articulatory

SLIDE 18

historic development

[Figure: timeline of synthesis approaches]

  • 1780: articulatory synthesis, von Kempelen
  • 1980: formant synthesis, e.g. DECtalk
  • 1990: PSOLA based synthesis, e.g. Elan
  • 2000: non-uniform unit selection, e.g. RealSpeak

historic approaches: flexible, artificial sounding, domain independent
modern approaches: not flexible, natural sounding, domain dependent

SLIDE 19

system modeling

SLIDE 20

source filter model

source: Klatt80 formant synthesizer (Klatt 1980)
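
As a rough illustration of the source filter idea, the sketch below passes a simple source (an impulse train at the fundamental frequency) through a cascade of second-order digital resonators, one per formant. The resonator difference equation and coefficient formulas follow the form given in Klatt (1980); the formant frequencies and bandwidths, the impulse-train source, and all function names are assumptions chosen for illustration and do not reproduce the full Klatt80 synthesizer.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, sample_rate):
    """Coefficients of the resonator y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    return a, b, c


def resonate(signal, freq_hz, bw_hz, sample_rate):
    """Filter a list of samples with one formant resonator."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, sample_rate)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def impulse_train(f0_hz, duration_s, sample_rate):
    """Very crude voiced source: one unit impulse per pitch period."""
    period = int(round(sample_rate / f0_hz))
    n = int(round(duration_s * sample_rate))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def synthesize_vowel(f0_hz=120.0,
                     formants=((700.0, 130.0), (1220.0, 70.0), (2600.0, 160.0)),
                     duration_s=0.1, sample_rate=10000):
    """Source (impulse train) followed by a cascade of formant filters.

    The (frequency, bandwidth) pairs are illustrative values, not measured data.
    """
    signal = impulse_train(f0_hz, duration_s, sample_rate)
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, sample_rate)
    return signal
```

Because source and filter are separate, the pitch contour (source) and the formant targets (filter) can be changed independently, which is what makes this kind of model flexible for emotional modifications.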

SLIDE 21

contents

  • why simulate emotions?
  • emotions in speech
  • overview on speech synthesis
  • examples, examples, examples
  • conclusion, outlook

SLIDE 22

examples: emofilt

  • open source Java program based on the MBROLA synthesis engine
  • NOT a complete text-to-speech system
  • prosody filter between the natural language and digital speech signal processing modules
  • as multilingual as MBROLA, which currently supports 35 languages
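
To make the idea of a prosody filter between the natural language and signal processing modules concrete, here is a minimal sketch that rescales durations and pitch targets in MBROLA's phoneme input format, where each line holds a phoneme, a duration in milliseconds, and optional pairs of position (percent) and F0 (Hz). It is not emofilt's code (emofilt is a Java program with its own rule set); the function name `modify_pho` and the scaling factors are assumptions for illustration only.

```python
def modify_pho(pho_text, pitch_factor=1.0, duration_factor=1.0):
    """Scale all durations and F0 targets of an MBROLA .pho description."""
    out = []
    for line in pho_text.splitlines():
        stripped = line.strip()
        # Keep empty lines and comment lines (starting with ';') unchanged.
        if not stripped or stripped.startswith(";"):
            out.append(line)
            continue
        fields = stripped.split()
        phoneme, duration_ms = fields[0], float(fields[1])
        targets = fields[2:]  # alternating position (%) and F0 (Hz) values
        new_fields = [phoneme, str(round(duration_ms * duration_factor))]
        for position, f0_hz in zip(targets[0::2], targets[1::2]):
            new_fields += [position, str(round(float(f0_hz) * pitch_factor))]
        out.append(" ".join(new_fields))
    return "\n".join(out)


if __name__ == "__main__":
    neutral = "; toy input\nh 60\na 100 0 120 100 140\n_ 200"
    # Hypothetical setting: 50 % higher pitch, twice the speech rate.
    print(modify_pho(neutral, pitch_factor=1.5, duration_factor=0.5))
```

Since the filter only touches the symbolic phoneme and prosody description, it works with any voice and language the synthesis engine supports.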

SLIDE 23

examples: emoSpeak

  • emoSpeak is integrated into the MARY text-to-speech framework by DFKI
  • in his Ph.D. thesis, Marc Schröder investigated how to assign rule-based modifications of speech to emotional dimensions
  • the system can be freely downloaded

source: Schröder 2004

SLIDE 24

examples voice conversion

  • phase vocoder, Greg Beller, IRCAM; audio samples: neutral, sad
  • PSOLA - LPC conversion, Murtaza Bulut et al, USC; audio samples: neutral, angry

SLIDE 25

examples voice transformation

  • laughter synthesis by LPC synthesis and mass-spring model, Shiva Sundaram, USC 2007
  • mixed LF + harmonic model, Olivier Rosec, FranceTelecom 2009; audio samples: woman, as boy, as man; man, breathy, whispery, tense

SLIDE 26

examples formant synthesis

  • EmoSyn, Burkhardt, 2000: prosody rules + phonation model; audio samples: neutral, sad, angry, crying, content
  • AffectEditor, J. Cahn, MIT 1998: DECtalk prosody rules; audio samples: sad, angry

SLIDE 27

examples diphone synthesis

  • EmoFilt, Burkhardt, 1999: prosody rules; audio samples: neutral, joy
  • MARY, M. Schröder, DFKI: prosody rules for dimensions, three inventories for soft, normal and tense speech; audio samples: joy, angry

SLIDE 28

examples statistical based

  • HMM models spectral and prosodic features, Tokyo Institute, Kobayashi Lab; audio samples: neutral, joy

SLIDE 29

examples unit selection

  • Katrin: extralinguistic units
  • product research: CTTS with expressive units
  • Damian, Shouty: fun personality voices

SLIDE 30

examples non human

  • formant synthesis, MIT Kismet robot; audio samples: anger, fear
  • concatenative, Oudeyer: Sony pet robots; audio samples: happy, sad

SLIDE 31

examples singing

  • bicycle, 1961: articulatory, first song ever, Bell Labs, Gerstman & Mathews
  • aria, 1993: articulatory, Pavarobotti, Ingo Titze
  • donna nobis, 2007: articulatory, vocal tract lab, Peter Birkholz

SLIDE 32

more examples …

http://emosamples.syntheticspeech.de

SLIDE 33

contents

  • why simulate emotions?
  • emotions in speech
  • overview on speech synthesis
  • examples, examples, examples
  • conclusion, outlook

SLIDE 34

conclusion

  • emotions are part of natural speech
  • simulation is possible by either modeling the process or including emotional data
  • text-to-speech still struggles with intelligible, neutral speech
  • first steps: speaking styles, extralinguistics
  • first apps: fun, gaming

SLIDE 35

outlook

  • discrepancy between natural but inflexible vs. artificial sounding but flexible solutions
  • short to middle term: very large databases; hybrid parametric / non-uniform unit selection; voice transformation techniques; high quality source filter model based synthesis
  • solutions in the long run: physical modeling

SLIDE 36

references