


Baltic HLT, Oct 8 2010, K. Jokinen

Uncertainty in Spoken Dialogue Management

Kristiina Jokinen
University of Tartu and University of Helsinki

Natural Interaction

Goal oriented

  • speakers have intentions
  • task

Symbolic

  • learnt conventions
  • communicative signs, words

Affective

  • affinity, social bonds
  • emotions, attitudes

Rational agents

  • coordination of action
  • cooperation
  • grounding = build shared knowledge

Multimodal

  • speech, gesture, face, posture


Constructive Dialogue Modelling

The agents monitor each other and each other’s actions in the communicative situation and react to the situation according to their beliefs, intentions and interpretation of the situation, to build shared knowledge and achieve a goal (Jokinen, 2009)

Contact

  • Hearing/seeing/touching distance

Perception

  • Recognition of meaningful symbols

Understanding

  • Meaning creation for the symbols in the context

Reaction

  • Production of one’s own behaviour


Hesitation/Uncertainty Situations

Jokinen & Allwood (2010)

lack of own ability to continue

  • knowledge, skills

lack of permission to continue

  • situational issues

lack of willingness to continue

  • attitude


Examples

  • it’s just a…
  • Make a distance….
  • probably ….
  • I don’t know


Hesitation-related phenomena

  • Fillers: filled pauses, discourse markers, editing terms, parentheticals
  • Disfluencies
  • Self-corrections (repairs)
  • Retractions: reformulation or restart of one’s utterance
  • Non-verbal aspects: hand gestures, body posture
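The verbal phenomena listed above could be counted in a transcript roughly as follows. This is a minimal sketch, not the annotation procedure used in the study: the marker inventories `FILLED_PAUSES` and `EDITING_TERMS` and the function `tag_hesitations` are hypothetical names introduced for illustration only.

```python
import re

# Hypothetical marker inventories for illustration; not taken from the talk.
FILLED_PAUSES = {"uh", "uhh", "um", "umm", "er", "erm"}
EDITING_TERMS = ("i mean", "you know", "i don't know")


def tag_hesitations(utterance):
    """Count filled pauses, editing terms and trailing-off marks in one utterance."""
    # Normalise case and typographic apostrophes before tokenising.
    text = utterance.lower().replace("\u2019", "'")
    tokens = re.findall(r"[a-z']+", text)
    filled = sum(1 for tok in tokens if tok in FILLED_PAUSES)

    # Search multi-word editing terms in the re-joined token string.
    joined = " ".join(tokens)
    editing = sum(
        len(re.findall(r"\b" + re.escape(term) + r"\b", joined))
        for term in EDITING_TERMS
    )

    # Two or more dots, or an ellipsis character, count as trailing off.
    trailing = len(re.findall(r"\.{2,}|\u2026", utterance))
    return {"filled_pauses": filled, "editing_terms": editing, "trailing_off": trailing}


result = tag_hesitations("uhh I mean... I don't know")
```

Non-verbal aspects (hand gestures, body posture) are outside the scope of such a text-only count and would need the annotated video data.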


Perception of hesitation signals

  • Hesitation markers (uhh, umm)
  • Pauses, silence
  • Slower speaking rate
  • Higher pitch
  • Fundamental frequency F0 rises before pauses that occur at major syntactic boundaries
  • Gesturing

Compensatory pattern, Carlson & Gustafson (2006)

  • F0 contours, pausing, retardation, creaky voice, syntax
  • the total duration increase counts rather than the contribution by syntactic or prosodic factors as such


Multi-level Hybrid Method

  • Top-down analysis of the collected data
      • at different meaning levels: words, syntactic phrases, dialogue acts, gestures, face, posture,…
      • different tagging schemes: AMI, MUMIN, standardisation
      • manual annotation of what the humans observe
  • Bottom-up analysis of the collected data
      • at different signal levels: speech, eye-gaze, face, gesture recognition
      • different technical constraints: accuracy
      • automatic annotation of what ”happens”
  • Correlations and classifications between
  • Crossing points


Annotation – human observations

  • Speech, facial expressions, gestures, body posture
  • Using the Anvil annotation tool


Praat Analysis

F0 variation, pitch, intensity


F0 for it’s just a…

Fundamental frequency rises before pauses at major syntactic constructs, but falls if a pause occurs in the middle
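One way to check the rise/fall observation above on an extracted pitch track is to fit a line to the voiced frames just before a pause. The sketch below assumes a per-frame F0 list in Hz with `0.0` marking unvoiced frames, which is a convention chosen here and not necessarily how the Praat output in the study was processed; the function name and the `window` parameter are hypothetical.

```python
def f0_slope_before_pause(f0_hz, pause_start, window=10):
    """Least-squares F0 slope (Hz per frame) over voiced frames before a pause.

    f0_hz: per-frame F0 values; 0.0 marks an unvoiced frame (assumed convention).
    pause_start: index of the first frame of the pause.
    window: number of frames before the pause to inspect.
    Returns None if fewer than two voiced frames are available.
    """
    start = max(0, pause_start - window)
    # Keep (frame index, F0) pairs for voiced frames only.
    points = [
        (i, f) for i, f in enumerate(f0_hz[start:pause_start], start) if f > 0
    ]
    if len(points) < 2:
        return None

    n = len(points)
    mean_x = sum(i for i, _ in points) / n
    mean_y = sum(f for _, f in points) / n
    sxx = sum((i - mean_x) ** 2 for i, _ in points)
    sxy = sum((i - mean_x) * (f - mean_y) for i, f in points)
    return sxy / sxx  # positive: F0 rises into the pause; negative: it falls


rising = f0_slope_before_pause([100.0, 102.0, 104.0, 106.0, 108.0, 0.0, 0.0], 5)
falling = f0_slope_before_pause([120.0, 118.0, 116.0, 0.0], 3)
```

A positive slope before a pause would be consistent with a pause at a major syntactic boundary, a negative one with a pause in the middle of a construct; deciding which case applies still requires the syntactic annotation.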


F0 of probably & I don’t know


Face activity (blue peaks) correlates with speech activity (green circles) and manual facial gesture

[Figure: Face Activity, Facial Labels, Speech]

Thanks to Stefan Scherer
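The kind of correlation shown in the figure can be quantified, for instance, with a Pearson coefficient between a face-activity signal and a binary speech-activity signal sampled on the same frames. This is only an illustrative sketch: the slide does not say which measure was used, and the sample values below are invented for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns None when either sequence is constant (correlation undefined).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


# Invented example values: normalised face activity and speech on/off per frame.
face_activity = [0.1, 0.8, 0.9, 0.2, 0.1, 0.7]
speech_active = [0, 1, 1, 0, 0, 1]
r = pearson(face_activity, speech_active)
```

A frame-wise coefficient like this ignores time lags between face and speech activity; the temporal relation between the two signals would need a lagged comparison.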


Conclusions

  • Verbal expressions of hesitation and uncertainty are accompanied by facial expressions and gesturing that help the partners to understand the underlying reasons for hesitation
  • Situational and attitudinal hesitation seems to be accompanied by large, specific gesturing which carries social conventions of symbolic gestures
  • Hesitation and uncertainty are also used in the coordination of interaction and the activity that the speakers are involved in
  • Semantic theme of hesitation and uncertainty: non-continuation of the current conversational topic


Holistic View of Interaction

  • Communicative signals are used as means to manage the social situation in which the agents find themselves, as a reaction to conversational understanding, and they form communicative patterns rather than function as individual signs of communication
  • What is the context (levels, activities, culture…)
  • Cf. Compensatory pattern
  • Deviations from the expected temporal patterns

Future Work

  • Relation between speech and gesture parameters
  • Temporal correlations
  • Segmentation: what is the smallest unit
  • How the speakers learn to observe hesitation signals
  • Intercultural comparison: culturally accepted hesitation markers vs. interpretation of these signals (e.g. shoulder shrug)


Future Views in the context of Baltic HLT

  • Natural language interaction extended to the whole communication situation
      • Integrates language and speech technology
  • Corpus collection
      • Speech data: individual words to read speech to conversational speech
      • What type of data, activity, what kind of equipment
  • Analysis
      • Annotation levels (tags for dialogue acts, gestures, etc. on pragmatic level)
      • Comparison of interaction strategies
      • Possible use of speech recognisers, morphological and syntactic parsers; also face and gesture recognition


Thank you!