Sign Language Avatars: Animation and Comprehensibility
Michael Kipp, Alexis Heloir, Quan Nguyen (PowerPoint presentation)
SLIDE 1

Sign Language Avatars

Animation and Comprehensibility

Michael Kipp* Alexis Heloir Quan Nguyen

research funded by:

DFKI Embodied Agents Research Group Exzellenzcluster Multimodal Computing and Interaction Universität des Saarlandes

* University of Applied Sciences Augsburg

SLIDE 2

Sign language avatars... for the internet

SLIDE 3

Deaf: 500,000

  • Sign language is a real language [Stokoe 1960]
  • Specific SL for every country (ASL, DGS, LSF, BSL ...)
  • Sign language is the primary means of communication

SLIDE 4

Deaf: 500,000

Sign language as a first language
  • Sign language is a real language [Stokoe 1960]
  • Specific SL for every country (ASL, DGS, LSF, BSL ...)
  • Sign language is the primary means of communication
  • Spoken language is a foreign language
  • 80% of deaf pupils leave school with significant reading/writing problems

Example: "what is your name?" = YOUR NAME WHAT

SLIDE 5

Video (commercial): expensive, not editable, not interactive, comprehensible (95%)
Avatar: inexpensive, editable, possibly interactive, comprehension limited

SLIDE 6

Prior Work

  • No standard writing system for sign language

Glosses: based on meaning, tool for learning

Notation: based on form, tool for science (Stokoe notation, HamNoSys)

  • Milestones

ViSiCAST (2000-2003): face-to-face translation // mocap

eSIGN (2002-2004): internet // procedural animation SiGML

60% comprehensibility

  • Recent projects

More flexible notations: Zebedee (LIMSI), PDTS-SiGML (U East Anglia)

Avatars for American SL (Huenerfauth et al. // DePaul Univ.), Italian SL (ATLAS project), Czech SL (U West Bohemia) ...

YOUR NAME WHAT

SLIDE 7

GUIDO, eSIGN, Televirtual (2003) Greta, U Paris 8 (2006)

What's your Agent's Native Language?

LIMSI, SNCF, web sourds (2006) SmartBody ICT (2004) EMBR, DFKI (2009) Max Elckerlyc Marc ... Paula GeSSyCa ...

SLIDE 8

GUIDO, eSIGN, Televirtual (2003) Greta, U Paris 8 (2006)

What's your Agent's Native Language?

LIMSI, SNCF, web sourds (2006) SmartBody ICT (2004) EMBR, DFKI (2009) Max Elckerlyc Marc ... Paula GeSSyCa ...

  • control language
  • validating quality
  • speech-gesture sync.
  • lip syncing
  • locomotion
  • rich form vocabulary
  • validation by "understanding"

Universal Communicators

SLIDE 9

Point of Departure

  • Goal: Make every ECA "sign language ready"
  • EMBR: EMBodied agent Realizer

➡ open source
➡ own animation language: EMBRScript

  • Comprehensibility?
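To make the pose-based idea behind a realizer like EMBR concrete, here is a minimal sketch in plain Python. This is NOT actual EMBRScript syntax; the class and attribute names (KeyPose, hand_shape, ...) are illustrative assumptions only. It shows the core concept: an animation is specified as time-stamped key poses that the realizer interpolates between.

```python
# Schematic sketch of pose-based animation specification.
# Illustrative only: not the real EMBRScript language or data model.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class KeyPose:
    time_ms: int                              # when this pose is reached
    hand_shape: Optional[str] = None          # e.g. "flat", "index" (invented names)
    facial_expression: Optional[str] = None   # e.g. "brows_up" (invented name)

@dataclass
class PoseSequence:
    start_ms: int
    poses: List[KeyPose] = field(default_factory=list)

    def duration_ms(self) -> int:
        # Duration from sequence start to the last key pose.
        return self.poses[-1].time_ms - self.start_ms if self.poses else 0

seq = PoseSequence(start_ms=0, poses=[
    KeyPose(time_ms=0, hand_shape="flat"),
    KeyPose(time_ms=400, hand_shape="index", facial_expression="brows_up"),
    KeyPose(time_ms=800, hand_shape="index"),
])
print(seq.duration_ms())  # 800
```

The realizer's job is then to interpolate smoothly between these key poses, which is what makes the script-level description compact.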
SLIDE 10

Toward "sign language ready"

  • Hand shapes: 10 => 60+ (finger alphabet...)
SLIDE 11

Toward "sign language ready"

  • Hand shapes: 10 => 60+ (finger alphabet...)
  • Torso: lean/orientation, shoulder raises
SLIDE 12

Toward "sign language ready"

  • Hand shapes: 10 => 60+ (finger alphabet...)
  • Torso: lean/orientation, shoulder raises
  • Facial expression: higher amplitude
  • Mouth: sophisticated viseme set; should allow lipreading; use text-to-speech for visemes
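The idea of driving mouth animation from text-to-speech phoneme timings can be sketched as below. The phoneme set and viseme names are illustrative assumptions, not the actual inventory used in EMBR or any TTS engine.

```python
# Hedged sketch: mapping TTS phoneme timings to mouth shapes (visemes).
# Phoneme and viseme names here are invented for illustration.
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "a": "open", "o": "rounded", "u": "rounded",
}

def visemes_from_tts(phoneme_timings):
    """phoneme_timings: list of (phoneme, start_ms) pairs from a TTS engine.
    Unknown phonemes fall back to a neutral mouth shape."""
    return [(PHONEME_TO_VISEME.get(p, "neutral"), t) for p, t in phoneme_timings]

# Hypothetical timings for the German mouthing of "NAME":
print(visemes_from_tts([("n", 0), ("a", 80), ("m", 200), ("a", 280)]))
# [('neutral', 0), ('open', 80), ('bilabial', 200), ('open', 280)]
```

A viseme track built this way can then be layered onto the manual animation, which matters especially for German SL, where mouthing carries meaning.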

SLIDE 13

Toward "sign language ready"

  • Hand shapes: 10 => 60+ (finger alphabet...)
  • Torso: lean/orientation, shoulder raises
  • Facial expression: higher amplitude
  • Mouth: sophisticated viseme set; should allow lipreading; use text-to-speech for visemes

  • Gaze: separate eye-ball from head movement
SLIDE 14

Toward "sign language ready"

  • Hand shapes: 10 => 60+ (finger alphabet...)
  • Torso: lean/orientation, shoulder raises
  • Facial expression: higher amplitude
  • Mouth: sophisticated viseme set; should allow lipreading; use text-to-speech for visemes

  • Gaze: separate eye-ball from head movement
SLIDE 15

Animating Sign Language: Attempt I

  • Video: human signer's utterance
  • Imitate utterance using EMBRScript
  • Show EMBR animation
SLIDE 16

Failed!

SLIDE 17

Reasons

  • Ambiguity in sign language

fewer grammatical constructs

  • Single sign level

formational manual features

situational nonmanual features (almost impossible)

mouthing especially important in German SL

  • Utterance level

facial expression for sentence mode

eyebrows + posture for information structure

face as a visual focus point

  • Casual signing style makes sign harder to read

human signers compensate with all of the above

SLIDE 18

Consequences

  • Working hypothesis: avatars with current animation methods are unable to produce understandable "spontaneous" sign language

  • Therefore:

➡ Overarticulate
➡ Involve Deaf experts
➡ Focus on nonmanual features
➡ Consider random facial movement

SLIDE 19

Original / Overarticulated Remake / Avatar

SLIDE 20

Attempt II

  • Overarticulated remake

➡ transcribe glosses
➡ recording

  • Gloss-based animation (lexicalized)

➡ compatible with EMBRScript
➡ tool support
➡ implications for HamNoSys

SLIDE 21

SLIDE 21-25

[Architecture diagram, built up across these slides: animation pipelines from HamNoSys and from video, producing animation via the intermediate representations BML and EMBRScript (Heloir, Kipp 2010; Kipp et al. 2010)]
SLIDE 26

[Diagram: animation hierarchy from single pose to pose sequence to many sequences]

Sample utterance: YOUR NAME WHAT

An utterance is a sequence of glosses (YOUR, NAME, WHAT); each gloss is realized as a pose sequence, i.e. gloss = pose, pose, pose.
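The gloss/pose hierarchy on this slide can be sketched as plain data. The dictionary layout and pose labels below are illustrative assumptions, not the actual EMBR data model.

```python
# Sketch of gloss-based animation: an utterance is a list of glosses,
# each gloss carries its own pose sequence. Names are illustrative.
utterance = [
    {"gloss": "YOUR", "poses": ["p1", "p2", "p3"]},
    {"gloss": "NAME", "poses": ["p4", "p5"]},
    {"gloss": "WHAT", "poses": ["p6", "p7", "p8"]},
]

def flatten(utt):
    """Concatenate the per-gloss pose sequences into one animation track."""
    return [pose for gloss in utt for pose in gloss["poses"]]

print(flatten(utterance))
# ['p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8']
```

Because glosses are lexicalized units, a gloss animated once can be reused in every utterance that contains it, which is the basis of the reuse factor reported in the evaluation.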

SLIDE 27

Evaluation

  • Corpus

11 utterances (154 glosses) from German Deaf e-learning portal

quite complex sentences

  • Animation

higher duration for remake (factor 1.8) and for animation (factor 2.3)

gloss reuse factor = 1.6 (95 gloss lexemes)

  • Experiment

13 Deaf test subjects (6m / 7f), aged 33-55

Each session 1.5 - 2 hrs (videotaped)

Pure sign language environment: Deaf assistant, use of pictograms

Warm-up: 3 easy avatar sentences
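The corpus figures above are internally consistent: 154 gloss tokens drawn from 95 distinct gloss lexemes yield the stated reuse factor of about 1.6.

```python
# Checking the corpus statistics from this slide.
gloss_tokens = 154    # glosses in the 11 utterances
gloss_lexemes = 95    # distinct gloss lexemes
reuse_factor = gloss_tokens / gloss_lexemes
print(round(reuse_factor, 1))  # 1.6
```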

SLIDE 28

Delta Testing

SLIDE 29

Analysis

  • Analysis of videos by Deaf experts

➡ Subjects' own ratings are usually misleading [Huenerfauth et al. 2008]

  • Objective measure: count correctly recalled glosses

➡ only partial understanding?

  • Subjective measure: expert rates understanding for each utterance

  • Combine measures [Sheard et al. 2004]
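The two measures described above can be sketched as follows. The equal-weight average is an illustrative assumption; the actual combination in Sheard et al. 2004 may be weighted differently.

```python
# Sketch of combining an objective and a subjective comprehension measure.
# The 50/50 weighting is an assumption for illustration.

def objective_score(recalled_glosses, total_glosses):
    """Objective measure: fraction of glosses correctly recalled."""
    return recalled_glosses / total_glosses

def combined_score(recalled, total, expert_rating):
    """expert_rating: subjective expert judgment of understanding in [0, 1]."""
    return 0.5 * objective_score(recalled, total) + 0.5 * expert_rating

score = combined_score(recalled=6, total=10, expert_rating=0.8)
print(round(score, 2))  # 0.7
```

The point of combining them is that gloss recall alone may reflect only partial understanding, while expert ratings alone are subjective; averaging hedges both weaknesses.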
SLIDE 30

Results

avatar / absolute: 41.4 %

avatar / relative: 58.4 %

SLIDE 31

Discussion

  • Comprehensibility

➡ original video = 71 % ("shockingly low")
➡ overarticulated remake = 82 %
➡ avatar = 58.4 % => close to state of the art

  • Novel aspects:

➡ complex content ➡ direct comparison with human signers

  • Delta testing factors out difficulties inherent to the material (dialect, speed, bad grammar)

➡ focus on the real "delta" between avatar and human
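The relative figure follows from the absolute scores: normalizing the avatar's absolute comprehensibility by the human original's score isolates the "delta" attributable to the avatar itself.

```python
# Delta testing arithmetic from the Results and Discussion slides.
avatar_absolute = 41.4   # % comprehensibility of the avatar
original_video = 71.0    # % comprehensibility of the human signer video

avatar_relative = 100 * avatar_absolute / original_video
print(round(avatar_relative, 1))
# 58.3 (the slides report 58.4 %, presumably computed from unrounded data)
```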

SLIDE 32

Conclusions

  • How to make an ECA sign!

➡ EMBRScript as an interface language
➡ SL synthesis research: nonmanuals and prosody
➡ Delta testing for comprehensibility

  • Signing avatars can profit from ECAs, and vice versa
  • 2nd Workshop on Sign Language Translation and Avatar Technology @ ACM ASSETS 2011, Dundee!

Thanks!

First workshop: Berlin, January 2011. Thanks to: Peter Schaar, Iris König, Silke Matthes, Thomas Hanke