Speech Processing 15-492/18-492: Spoken Dialog Systems, Advanced Concepts in Dialog


  1. Speech Processing 15-492/18-492 Spoken Dialog Systems Advanced Concepts in Dialog

  2. Spoken Dialog Systems
     - Basic steps for machine conversation:
       - Take speech to text (ASR)
       - Extract meaning from text (Parsing)
       - Interpret meaning, decide what to do next (DM)
       - Generate text to convey decision (NLG)
       - Speak text (TTS)
     - When these work well, actual conversations are possible
       - Still can be too machine-like to be natural
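
The five steps listed above can be sketched as a chain of functions. This is a minimal illustration only: every function here is a hypothetical stub (keyword matching instead of a real recognizer or parser), and the prompts are borrowed from the example dialog rather than from any actual system code.

```python
def recognize(audio):
    """ASR stub: the 'audio' is already a transcript; just normalize it."""
    return audio.lower().strip()

def parse(text):
    """Parsing stub: extract a destination concept by keyword match."""
    if "airport" in text:
        return {"destination": "airport"}
    return {}

def decide(meaning, state):
    """DM stub: merge new concepts into the state, then pick the next act."""
    state.update(meaning)
    if "destination" not in state:
        return ("request", "destination")
    if "origin" not in state:
        return ("request", "origin")
    return ("inform", "schedule")

def generate(act):
    """NLG stub: map a dialog act to a prompt string."""
    prompts = {
        ("request", "destination"): "Where do you want to go?",
        ("request", "origin"): "Where are you leaving from?",
        ("inform", "schedule"): "Just a minute. Let me check that for you.",
    }
    return prompts[act]

def speak(text):
    """TTS stub: return the text that would be synthesized."""
    return f"S: {text}"

def handle_turn(audio, state):
    # ASR -> Parsing -> DM -> NLG -> TTS, one stage feeding the next.
    return speak(generate(decide(parse(recognize(audio)), state)))

state = {}
reply = handle_turn("I need to go to the airport.", state)
print(reply)
```

Because each stage consumes only the previous stage's output, a mistake early in the chain is passed along unchanged to every later stage.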

  3. Example
     S: Welcome to MyBus. Which itinerary do you want information for?
     U: I need to go to the airport.
     S: Where are you leaving from?
     U: Downtown.
     S: Just a minute. Let me check that for you.
     S: There is a 28X leaving downtown at 7:03 PM. It will arrive at the airport at 7:37 PM.
     S: You can say when is the next bus, when is the previous bus, start a new query, or goodbye.
     U: When is the previous bus?
     S: Just a second. There is a 28X leaving downtown at 4:20 PM. It will arrive at the airport at 4:56 PM.
     S: You can say when is the next bus, when is the previous bus, start a new query, or goodbye.
     U: Goodbye.
     S: Thank you for using MyBus. Goodbye.

  4. Spoken Dialog Systems
     - Of course, all those steps can be error-prone
       - In fact, rarely do all work 100% correctly
     - What happens when errors occur?
       - Degraded machine conversation
     - Errors can chain, compounding problems
       - Misrecognition → Bad parse → Wrong decision
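
A rough way to see how chained errors compound is to multiply per-stage success rates. The numbers below are hypothetical, not measurements from any system, and the calculation assumes stages fail independently, which is a simplification.

```python
import math

# Hypothetical per-stage success rates (assumed for illustration only).
stage_accuracy = {"ASR": 0.85, "Parsing": 0.90, "DM": 0.95}

def end_to_end(accuracy_by_stage):
    """Probability that every stage succeeds, assuming independent failures."""
    return math.prod(accuracy_by_stage.values())

overall = end_to_end(stage_accuracy)
print(f"{overall:.3f}")
```

Under these assumptions the whole chain succeeds less often than its weakest stage does on its own.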

  5. Example
     S: Welcome to MyBus.
     S: Which itinerary do you want information for?
     U: uh… I wanna go from Pittsburgh Airport to the Cathedral of Learning.
     S: Which itinerary do you want information for?
     U: I said I wanna go from Pittsburgh Airport to the Cathedral of Learning.
     S: Which itinerary do you want information for?
     U: What?
     S: Which itinerary do you want information for?
     U: Well… I wanna go from Pittsburgh Airport to the…
     S: Which itinerary do you want information for?
     S: Just a minute. Let me check that for you.
     U: What?
     S: Which itinerary do you want…
     S: Which itinerary do you want information for?

  6. Overview
     - What to do with imperfect ASR results?
       - Can anything useful be extracted or is there no meaning at all in what was recognized?
     - How can a conversation that’s going badly recover and end up successful?
     - What approaches can be used for dialog decision making?

  7. Handling Imperfect Recognition
     - Even with good recognition, you cannot blindly trust ASR output…
     - Confidence annotation: Helios
       - Given the current state of the dialog, how confident is the system that the input matches the user’s intention?
       - Logistic regression based on speech, parsing, dialog features
       - Training from corpus of transcribed data
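
A logistic-regression confidence score of the kind described above can be sketched as follows. The feature names, weights, and bias are all hypothetical and are not taken from Helios; a real annotator would fit them on a corpus of transcribed dialogs.

```python
import math

# Hypothetical weights for one speech, one parsing, and one dialog feature.
WEIGHTS = {"asr_score": 2.0, "parse_coverage": 1.5, "matches_expectation": 1.0}
BIAS = -2.5

def confidence(features):
    """Logistic regression: squash a weighted feature sum into (0, 1)."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# A clean recognition that fully parses and fits what the system expected.
good = confidence({"asr_score": 0.9, "parse_coverage": 1.0, "matches_expectation": 1.0})
# A noisy recognition with a partial parse and an unexpected concept.
poor = confidence({"asr_score": 0.3, "parse_coverage": 0.2, "matches_expectation": 0.0})
print(f"good={good:.2f} poor={poor:.2f}")
```

The score is then available to later decisions, such as whether a concept should be confirmed with the user.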

  8. Grounding Concept Values
     - Grounding: process where conversation participants establish common understanding
     - For each understood concept, choose among 3 possible actions
       - Explicit confirmation: ask user a direct question, wait for a positive response before accepting
         “To the airport. Is this correct?”
       - Implicit confirmation: repeat what was understood, accept unless user indicates it was wrong
         “To the airport. Where are you leaving from?”
       - No action: silently accept without informing user
     - Best choice can be situationally dependent
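
One simple way to choose among the three grounding actions is to threshold a confidence score. This is only a sketch: the thresholds and function names are assumptions for illustration, and a fixed threshold ignores the situational dependence noted above.

```python
def grounding_action(confidence, high=0.9, low=0.5):
    """Pick a grounding action from a confidence score (thresholds are assumed)."""
    if confidence >= high:
        return "no_action"
    if confidence >= low:
        return "implicit_confirm"
    return "explicit_confirm"

def render(action, heard, next_question):
    """Build the system prompt for the chosen grounding action."""
    if action == "explicit_confirm":
        return f"{heard} Is this correct?"
    if action == "implicit_confirm":
        return f"{heard} {next_question}"
    return next_question  # no action: move on without echoing the concept

for score in (0.95, 0.7, 0.2):
    action = grounding_action(score)
    print(action, "->", render(action, "To the airport.", "Where are you leaving from?"))
```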

  9. Enabling Confirmation in Olympus
     Steps: Write Prompts → Create Policies → Attach Policies to Concepts
     Special Rosetta template prompts:
         $Rosetta::Templates::act{"implicit_confirm"} = {
             "origin_place" => "Leaving from <origin_place>.",
             …
         }

  10. Enabling Confirmation in Olympus
      Steps: Write Prompts → Create Policies → Attach Policies to Concepts
      Confirmation policies config file (Configurations/DesktopSAPI/expl_impl.pol):
          EXPLORATION_MODE=epsilon-greedy
          EXPLORATION_PARAMETER=0.1
                       ACCEPT  EXPL_CONF  IMPL_CONF
          INACTIVE     1       -          -
          CONFIDENT    -       5          10
          UNCONFIDENT  -       10         5
          GROUNDED     1       -          -
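
The policy file above pairs a utility table with epsilon-greedy exploration. The sketch below shows generic epsilon-greedy selection over that table; it is not Olympus code, and treating "-" as "action unavailable in this state" is an assumption about how the file is read.

```python
import random

EPSILON = 0.1  # mirrors EXPLORATION_PARAMETER in the policy file

# Utilities copied from the table; "-" entries are omitted (assumed unavailable).
POLICY = {
    "INACTIVE":    {"ACCEPT": 1},
    "CONFIDENT":   {"EXPL_CONF": 5, "IMPL_CONF": 10},
    "UNCONFIDENT": {"EXPL_CONF": 10, "IMPL_CONF": 5},
    "GROUNDED":    {"ACCEPT": 1},
}

def choose_action(state, rng, epsilon=EPSILON):
    """With probability epsilon explore a random action, else take the best one."""
    utilities = POLICY[state]
    if rng.random() < epsilon:
        return rng.choice(sorted(utilities))  # explore
    return max(utilities, key=utilities.get)  # exploit

rng = random.Random(0)  # seeded so the run is repeatable
picks = [choose_action("CONFIDENT", rng) for _ in range(1000)]
share_impl = picks.count("IMPL_CONF") / len(picks)
print(f"IMPL_CONF chosen in {share_impl:.0%} of CONFIDENT turns")
```

With two available actions and epsilon of 0.1, the higher-utility action is expected about 95% of the time: 90% from exploitation plus half of the 10% exploration draws.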

  11. Enabling Confirmation in Olympus
      Steps: Write Prompts → Create Policies → Attach Policies to Concepts
      When defining concepts in the dialog manager, indicate which policy to apply. Each concept definition gives the concept name first, then the policy name:
          DEFINE_AGENCY( CPerformTask,
              DEFINE_CONCEPTS(
                  INT_USER_CONCEPT(query_type, "impl")
                  STRING_USER_CONCEPT(origin_place, "expl_impl")
                  …
              )
              …
          )

  12. Handling Non-Understandings
      - No meaning can be extracted from user input
        S: Where do you want to go?
        U: (no parse)
        S: ???
      - Many possible system responses:
        S: Where do you want to go?
        S: Could you repeat that?
        S: For example, you can say, “Downtown”.
        S: Which route are you looking for?
        …

  13. Non-Understanding Policies
      - Repeat question
        - May work if temporary channel issue caused ASR problems
        - Frustrating to user if continued
      - Provide example of what to say
        - Can assist unfamiliar users
        - Annoys users who already said the example and weren’t understood
      - Change topic
        - Gets user to talk about something else
        - Still have to get original question answered

  14. Non-Understanding Policies
      - Two types of policies:
        - Handcrafted/deterministic
          - Design a (small) space of dialog states
          - Set a utility to each action in each state
        - Data-driven
          - Learn optimal weight based on collected dialogs
          - Exploration/exploitation trade-off at runtime
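
A handcrafted, deterministic policy of the first type could look like the sketch below: a small state space, a utility for each action in each state, and a plain argmax. The state names, action names, and utilities are all hypothetical choices made for illustration.

```python
# Hypothetical utilities: one row per dialog state, one entry per recovery action.
UTILITY = {
    "first_failure":    {"repeat_question": 3, "give_example": 2, "change_topic": 1},
    "second_failure":   {"repeat_question": 1, "give_example": 3, "change_topic": 2},
    "repeated_failure": {"repeat_question": 0, "give_example": 1, "change_topic": 3},
}

def dialog_state(consecutive_failures):
    """Bucket the count of consecutive non-understandings into a small state space."""
    if consecutive_failures <= 1:
        return "first_failure"
    if consecutive_failures == 2:
        return "second_failure"
    return "repeated_failure"

def recovery_action(consecutive_failures):
    """Deterministic policy: always take the highest-utility action for the state."""
    utilities = UTILITY[dialog_state(consecutive_failures)]
    return max(utilities, key=utilities.get)

plan = [recovery_action(n) for n in (1, 2, 3, 4)]
print(plan)
```

A data-driven policy would keep the same table shape but learn the utilities from collected dialogs, and would occasionally pick a non-maximal action at runtime to keep exploring.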
