NICT Use Cases and Requirements for New Models of Human Language to Support Mobile Conversational Systems


W3C Workshop on Conversational Applications, June 2010. NICT Use Cases and Requirements for New Models of Human Language to Support Mobile Conversational Systems. Chiori Hori and Teruhisa Misu, Spoken Language Communication Group, NICT, Japan.


SLIDE 1

NICT Use Cases and Requirements for New Models of Human Language to Support Mobile Conversational Systems

Chiori Hori and Teruhisa Misu
Spoken Language Communication Group, NICT, Japan

W3C Workshop on Conversational Applications, June 2010

SLIDE 2

NICT Spoken Language Communication Group

1986: ATR (Advanced Telecommunications Research Institute International)
2006: NICT

Speech-to-Speech Translation + Spoken Dialog System

SLIDE 3

Multi-Party Communication Between Asian Language Speakers

SLIDE 4

Kyoto Tour Guide System

Human-to-Human Dialog

SLIDE 5

Tour Guide Dialog Discourse

Arc labels in the dialog scenario diagram (input : output, eps = empty symbol):

  eps : Grt(start)
  Stt(prf(spot/general)) : eps
  eps : Extrct_kywd
  eps : Mk_rcmdlist(kwd)
  eps : Set_tgt
  eps : Expln(tgt)
  eps : OQ(DST)
  Rqst(rcmd) : eps
  eps : Set_rcmdlist
  eps : Rcmd(tgt)
  eps : Chck_forloop(rcmdlist)
  Accept : Set_imprs(Decided)
  eps : Cnfrm(dcst)
  eps : Prcs4imprs(tgt)
  Neutral : Keep(tgt, rcmdlist)
  Decided : Mv(tgt, rcmdlist, dcsnlist)
  Negative : Remove(tgt, rcmdlist)
  Experienced : Remove(tgt, rcmdlist)
  eps : Trnst(if_forloop_done)
  eps : Stt(next_act)
  eps : Trnst(if_forloop_not_end)
  eps : Agree
  Stt(ro_requirement) : Grt(end)
  eps : Stt(prcs(rcmd))
  Stt(exprnc)/ : Set_imprs(Experienced)
  Stt(imprs(Next/Bad)) : Set_imprs(Neutral/Bad)
  eps : Rspns2imprs
  eps : Grt(end)
  eps : Chck_rcmdlist
  eps : Trnst(if_data_in_rcmdlist)
  eps : Rqst(dcsn4rcmdlist)
  Stt(no_prefered_tgt) : eps
  eps : Trnst(if_nodata_in_rcmdlist)
  Stt(prf(tgt)) : Set A as tgt Set_imprs(Positive)
  eps : Set_imprs(Decided)
  Good : Set_imprs(Positive)

Dialog stages:

  • Make a recommendation list
  • Recommend each spot in the list
  • Check user’s preference
  • Confirm users’ final decision
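The loop over the recommendation list can be sketched in Python. This is only an illustration of the Keep / Mv / Remove arc labels on this slide, not the authors' implementation: the function names, the spot names, and the impressions are hypothetical, and only the four impression values that have a list-update arc (Neutral, Decided, Negative, Experienced) are handled.

```python
def process_impression(tgt, impression, rcmdlist, dcsnlist):
    """Apply the list update named on the matching arc for one spot."""
    if impression == "Decided":
        # Mv(tgt, rcmdlist, dcsnlist): move the spot to the decision list.
        rcmdlist.remove(tgt)
        dcsnlist.append(tgt)
    elif impression in ("Negative", "Experienced"):
        # Remove(tgt, rcmdlist): drop the spot from the recommendations.
        rcmdlist.remove(tgt)
    elif impression == "Neutral":
        # Keep(tgt, rcmdlist): leave both lists unchanged.
        pass
    else:
        raise ValueError(f"unknown impression: {impression}")


def run_recommendation_loop(rcmdlist, impressions):
    """Visit each recommended spot once and return (remaining, decided)."""
    remaining = list(rcmdlist)   # copy, so the caller's list is untouched
    dcsnlist = []
    for tgt in list(remaining):  # snapshot, because `remaining` shrinks
        process_impression(tgt, impressions[tgt], remaining, dcsnlist)
    return remaining, dcsnlist


# Hypothetical spots and impressions, for illustration only.
spots = ["Kinkaku-ji", "Kiyomizu-dera", "Nijo Castle"]
user_impressions = {
    "Kinkaku-ji": "Decided",
    "Kiyomizu-dera": "Neutral",
    "Nijo Castle": "Negative",
}
print(run_recommendation_loop(spots, user_impressions))
# (['Kiyomizu-dera'], ['Kinkaku-ji'])
```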

SLIDE 6
Goal of Spoken Dialog System

  • Accept users’ spontaneous dialog behavior
  • Mimic guides’ dialog behavior as in the data

Issues:
  1. Spontaneous speech recognition
  2. Robust user concept understanding
  3. Flexible dialog management (DM)
  4. Expandable DM platform

Corpus-based DM

SLIDE 7
Corpus-based Dialog Management

  • Human-to-Human dialog corpus: annotation of tags representing “user concept” + “system actions”
  • Statistical models of humans’ dialog behaviors
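One way such a statistical model could be estimated from an annotated corpus is sketched here. It assumes the corpus has been reduced to (user concept tag, system action tag) pairs and that arc weights are negative log relative frequencies, a common convention for weighted transducers. The slides do not state the actual estimation method; the tag names are borrowed from the tour guide scenario and the counts are invented.

```python
import math
from collections import Counter, defaultdict


def estimate_action_weights(pairs):
    """Map each concept tag to {action tag: -log P(action | concept)}."""
    counts = defaultdict(Counter)
    for concept, action in pairs:
        counts[concept][action] += 1

    weights = {}
    for concept, actions in counts.items():
        total = sum(actions.values())
        # Frequent actions get small weights, rare actions get large ones.
        weights[concept] = {a: -math.log(n / total) for a, n in actions.items()}
    return weights


# Hypothetical annotated turns: the guide answered a recommendation
# request with Rcmd(tgt) three times and with Expln(tgt) once.
corpus = [
    ("Rqst(rcmd)", "Rcmd(tgt)"),
    ("Rqst(rcmd)", "Rcmd(tgt)"),
    ("Rqst(rcmd)", "Rcmd(tgt)"),
    ("Rqst(rcmd)", "Expln(tgt)"),
]
weights = estimate_action_weights(corpus)
best_action = min(weights["Rqst(rcmd)"], key=weights["Rqst(rcmd)"].get)
print(best_action)
```

Under this convention the most frequent guide behavior becomes the cheapest arc, so a lowest-weight search reproduces it.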

SLIDE 8

Advantage of WFST-based DM

Different styles of scenario: IF-THEN rules, finite state automaton, and statistical dialog management
  → General Description for Dialog Scenario
  → (convert) WFST description
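To illustrate the idea of a single target representation, here is a minimal sketch that turns two of the scenario styles into one kind of weighted arc. The `Arc` type, the function names, and the default state name are hypothetical; the slides do not describe the actual conversion procedure.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    src: str       # source state
    inp: str       # input symbol (user concept tag)
    out: str       # output symbol (system action tag)
    weight: float
    dst: str       # destination state


def rules_to_arcs(rules, state="s0", weight=0.0):
    """Each IF-THEN rule (IF concept THEN action) becomes a self-loop arc."""
    return [Arc(state, concept, action, weight, state) for concept, action in rules]


def fsa_to_arcs(transitions, weight=0.0):
    """Each automaton transition (src, concept, action, dst) becomes one arc."""
    return [Arc(src, c, a, weight, dst) for src, c, a, dst in transitions]


# Both styles end up as the same kind of object, so they can share one engine.
arcs = rules_to_arcs([("From_<city>", "Fill_ORG"), ("To_<city>", "Fill_DST")])
arcs += fsa_to_arcs([("s0", "ε", "Ask_ORG", "s1")])
for arc in arcs:
    print(arc)
```

A statistical model would fit the same shape by supplying non-zero weights instead of the default `0.0`.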

SLIDE 9

Weighted Finite State Transducer

  • 1. States and arcs
  • 2. A pair of input and output symbols with weights
  • 3. Transition is determined by the weights.

Slot-Filling for Origin and Destination

User input                    System response
Input         Concept tag     Action tag     Response
ε             ε               Ask_ORG        From where?
From Osaka.   From_<city>     Fill_ORG       ε
ε             ε               Ask_DST        To where?
To Tokyo      To_<city>       Fill_DST       ε

Scenario WFST (arc labels, input : output / weight)

  ε : Ask_ORG/0
  From_<city> : Fill_ORG/0
  ε : Ask_DST/1
  To_<city> : Fill_DST/0
  ε : ε/0
  ε : exit/2

* Slot handling
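The slot-filling scenario can be sketched as a small transducer in Python. The arc labels and weights are taken from the slide, but the state numbering and topology are assumptions (the diagram layout is not recoverable from the text), the ε:ε/0 arc is left out, and the greedy lowest-weight choice is a simplification of a full WFST search.

```python
EPS = "ε"

# (source state, input concept tag, output action tag, weight, next state)
# Labels and weights follow the slide; the states are assumed.
ARCS = [
    (0, EPS, "Ask_ORG", 0, 1),
    (1, "From_<city>", "Fill_ORG", 0, 2),
    (2, "To_<city>", "Fill_DST", 0, 4),  # user already gave the destination
    (2, EPS, "Ask_DST", 1, 3),           # otherwise ask for it (costs more)
    (3, "To_<city>", "Fill_DST", 0, 4),
    (4, EPS, "exit", 2, 5),
]
FINAL_STATE = 5


def step(state, concept=None):
    """Take the lowest-weight usable arc and return (action, next_state, consumed).

    An arc is usable if its input is ε or equals the pending concept tag.
    """
    usable = [a for a in ARCS if a[0] == state and a[1] in (EPS, concept)]
    if not usable:
        raise ValueError(f"no usable arc from state {state} for {concept!r}")
    _, inp, action, _, nxt = min(usable, key=lambda a: a[3])
    return action, nxt, inp != EPS


def run(concepts):
    """Feed a sequence of concept tags and collect the emitted action tags."""
    pending = list(concepts)
    state, actions = 0, []
    while state != FINAL_STATE:
        concept = pending[0] if pending else None
        action, state, consumed = step(state, concept)
        if consumed:
            pending.pop(0)
        actions.append(action)
    return actions


# The user says "From Osaka to Tokyo" in one utterance, so Ask_DST is skipped.
print(run(["From_<city>", "To_<city>"]))
# ['Ask_ORG', 'Fill_ORG', 'Fill_DST', 'exit']
```

In a turn-by-turn dialog, `step(2)` with no pending input returns `("Ask_DST", 3, False)`, which corresponds to the "To where?" prompt in the table.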

SLIDE 10

Spoken Language Understanding WFST

<word-class label="station"> Tokyo Kyoto </word-class>
<keyword-class label="origin"> (station) </keyword-class>
<keyword-class label="time"> six seven eight nine ten eleven twelve </keyword-class>
<keyword-class label="destination"> (station) </keyword-class>
<plan repeat="true"> from,(origin) to,(destination) </plan>
<depart> at,(time) </depart>
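What this description expresses can be approximated with a plain keyword spotter. The sketch is not a WFST compiler and does not parse the markup: the dictionaries mirror the word and keyword classes on the slide by hand, and the function name and the (label, value) output format are assumptions.

```python
# Word and keyword classes copied from the slide's description.
WORD_CLASSES = {"station": {"Tokyo", "Kyoto"}}
KEYWORD_CLASSES = {
    "origin": WORD_CLASSES["station"],
    "destination": WORD_CLASSES["station"],
    "time": {"six", "seven", "eight", "nine", "ten", "eleven", "twelve"},
}
# Trigger word -> keyword class expected right after it
# ("from,(origin)", "to,(destination)", "at,(time)").
PATTERNS = {"from": "origin", "to": "destination", "at": "time"}


def understand(utterance):
    """Return (label, value) pairs for every trigger word followed by a keyword."""
    words = utterance.replace(",", " ").replace(".", " ").split()
    concepts = []
    for trigger, value in zip(words, words[1:]):
        label = PATTERNS.get(trigger.lower())
        if label and value in KEYWORD_CLASSES[label]:
            concepts.append((label, value))
    return concepts


print(understand("I want to go from Tokyo to Kyoto at six"))
# [('origin', 'Tokyo'), ('destination', 'Kyoto'), ('time', 'six')]
```

Words outside the patterns ("I want to go") are simply skipped, which is the robustness to spontaneous phrasing that the understanding component is meant to provide.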

SLIDE 11

Kyoto Tour Guide System using WFST-based Dialog Management

SLIDE 12

Problems in implementing with SRGS/SISR

  • 1. Context-sensitive ASR

Statistical language models for ASR need to be tuned to the current dialogue context, which is determined by the previous system prompt and the dialogue situation.

  • 2. Separation of ASR and Natural Language Understanding

We need speech recognition systems that are more robust to natural language expressions. N-gram language models are one solution. Consequently, we will need a framework for attaching semantic annotations to ASR results afterward.

  • 3. Spoken Language Understanding using WFSTs

To realize context-sensitive semantic annotation for SLU, we need a way to describe WFSTs.
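The first requirement, context-sensitive ASR, can be illustrated by rescoring recognition hypotheses with a language model chosen by the previous system prompt. Everything here is an assumption made for illustration: the unigram models stand in for real N-gram models, the probabilities and acoustic scores are invented, and the context keys reuse the Ask_ORG and Ask_DST action tags from the slot-filling example.

```python
import math

# Hypothetical per-context unigram probabilities, keyed by the previous
# system action. A real system would use N-gram models.
CONTEXT_LMS = {
    "Ask_ORG": {"from": 0.4, "to": 0.1, "osaka": 0.25, "tokyo": 0.25},
    "Ask_DST": {"from": 0.1, "to": 0.4, "osaka": 0.25, "tokyo": 0.25},
}
FLOOR = 1e-4  # probability assigned to words the model has not seen


def lm_score(words, context):
    """Log probability of the word sequence under the context's model."""
    lm = CONTEXT_LMS[context]
    return sum(math.log(lm.get(w, FLOOR)) for w in words)


def rescore(nbest, context):
    """Pick the hypothesis with the best acoustic + language model score.

    `nbest` is a list of (hypothesis text, acoustic log score) pairs.
    """
    best = max(nbest, key=lambda h: h[1] + lm_score(h[0].lower().split(), context))
    return best[0]


# Two acoustically tied hypotheses are resolved by the dialog context.
hypotheses = [("from Tokyo", -1.0), ("to Tokyo", -1.0)]
print(rescore(hypotheses, "Ask_ORG"))  # from Tokyo
print(rescore(hypotheses, "Ask_DST"))  # to Tokyo
```

The recognizer output is still plain text at this point, so semantic labels would be attached in a separate step afterward, as the second item describes.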