What's so Hard about Natural Language Understanding? Alan Ritter - PowerPoint PPT Presentation



SLIDE 1

What’s so Hard about Natural Language Understanding?

Alan Ritter Computer Science and Engineering The Ohio State University

Collaborators: Jiwei Li, Dan Jurafsky (Stanford), Bill Dolan, Michel Galley, Jianfeng Gao (MSR), Colin Cherry (Google), Jeniya Tabassum (Ohio State), Alexander Konovalov (Ohio State), Wei Xu (Ohio State), Brendan O'Connor (UMass)


SLIDE 3
SLIDE 4
SLIDE 5

Q: Why are we so good at Speech, MT (but bad at NLU)? A: People naturally translate and transcribe.

SLIDE 6

Q: Why are we so good at Speech, MT (but bad at NLU)? A: People naturally translate and transcribe.

Q: Large, End-to-End Datasets for NLU?
  • Web-scale Conversations?
  • Web-scale Structured Data?


SLIDE 8

Data-Driven Conversation

  • Twitter: ~500 million public SMS-style conversations per month
  • Goal: Learn conversational agents directly from massive volumes of data.


SLIDES 10-14

Noisy Channel Model

[Ritter, Cherry, Dolan EMNLP 2011]

Input:  Who wants to come over for dinner tomorrow?
Output: { Yum ! I } { want to } { be there } { tomorrow ! }
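The noisy channel setup scores a candidate response r for an input i by combining a "translation" model p(i | r), learned from aligned message-response pairs, with a response language model p(r). A minimal sketch of the scoring step, with all phrase tables and probabilities invented for illustration (not the paper's models):

```python
import math

# Toy "translation" model: p(input phrase | response phrase).
# Values are invented for illustration.
phrase_tm = {
    ("dinner tomorrow", "be there tomorrow !"): 0.4,
    ("dinner tomorrow", "i 'm busy"): 0.05,
}

# Toy response language model: p(response). Also invented.
response_lm = {
    "be there tomorrow !": 0.01,
    "i 'm busy": 0.02,
}

def noisy_channel_score(inp, response):
    """log p(inp | response) + log p(response), with a small floor
    for unseen pairs."""
    tm = phrase_tm.get((inp, response), 1e-6)
    lm = response_lm.get(response, 1e-6)
    return math.log(tm) + math.log(lm)

# Rank candidate responses for the input under the combined score.
candidates = ["be there tomorrow !", "i 'm busy"]
best = max(candidates, key=lambda r: noisy_channel_score("dinner tomorrow", r))
```

The channel model rewards responses that "explain" the input, which is what lets phrase-based SMT machinery generate replies rather than translations.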

SLIDE 15

Neural Conversation

[Sordoni et al. 2015] [Xu et al. 2016] [Wen et al. 2016] [Li et al. 2016] [Kannan et al. 2016] [Serban et al. 2016]


SLIDES 17-24

How old are you?
i 'm 16 .
16 ?
i don 't know what you 're talking about   [Bad Action]
you don 't know what you 're saying
i don 't know what you 're talking about   [Outcome]

Slide Credit: Jiwei Li

SLIDE 25

Deep Reinforcement Learning

[Li, Monroe, Ritter, Galley, Gao, Jurafsky EMNLP 2016]

Encoding: how | old | are | you → State

How old are you?

SLIDE 26

Deep Reinforcement Learning

[Li, Monroe, Ritter, Galley, Gao, Jurafsky EMNLP 2016]

Encoding: how | old | are | you → State
Decoding: EOS → I’m | 16 | . | EOS → Action

How old are you? i 'm 16 .

SLIDE 27

Learning: Policy Gradient

REINFORCE Algorithm (Williams, 1992)

Encoding: how | old | are | you → State
Decoding: EOS → I’m | 16 | . | EOS → Action (what we want to learn)

How old are you? i 'm 16 .
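In this view the sampled reply is an action, and REINFORCE nudges the policy parameters by reward times the gradient of the log-probability of the sampled action. A bandit-sized sketch of that update rule (the candidate responses and rewards below are invented; the real policy is a seq2seq model over word sequences):

```python
import math
import random

random.seed(0)

# A softmax policy over three canned responses; one logit each.
responses = ["i 'm 16 .", "i don 't know", "sixteen !"]
# Invented rewards: penalize the dull, conversation-killing reply.
reward = {"i 'm 16 .": 1.0, "i don 't know": -1.0, "sixteen !": 0.5}
theta = [0.0, 0.0, 0.0]

def probs(theta):
    z = [math.exp(t) for t in theta]
    s = sum(z)
    return [x / s for x in z]

lr = 0.1
for _ in range(500):
    p = probs(theta)
    # Sample an action (a reply) from the current policy.
    a = random.choices(range(3), weights=p)[0]
    r = reward[responses[a]]
    # REINFORCE: grad of log pi(a) wrt logit i is 1[i == a] - p_i.
    for i in range(3):
        theta[i] += lr * r * ((1.0 if i == a else 0.0) - p[i])

best = responses[max(range(3), key=lambda i: theta[i])]
```

After training, the policy concentrates on the highest-reward reply; the same update applies unchanged when the action space is a sequence of words.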

SLIDES 28-30

Q: Rewards?

A: Turing Test: Adversarial Learning (Goodfellow et al., 2014)

SLIDES 31-33

Adversarial Learning for Neural Dialogue

Real-world conversations feed a Response Generator, which generates a response; a Discriminator compares it against a sampled human response: Real or Fake?

(Alternate between training the Generator and the Discriminator)
REINFORCE Algorithm (Williams, 1992)

[Li, Monroe, Shi, Jean, Ritter, Jurafsky EMNLP 2016]
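The alternation can be sketched end to end at toy scale: a softmax generator over canned responses, a per-response logistic discriminator, and REINFORCE feeding the discriminator's "looks human" score back to the generator as reward. Everything below (responses, human data, learning rates) is invented for illustration, not the paper's setup:

```python
import math
import random

random.seed(0)

candidates = ["i 'm 16 .", "lol", "i don 't know"]
human = ["i 'm 16 .", "lol"]  # replies humans actually produce

g_theta = {c: 0.0 for c in candidates}   # generator logits
d_w = {c: 0.0 for c in candidates}       # discriminator weights

def softmax(theta):
    z = {c: math.exp(v) for c, v in theta.items()}
    s = sum(z.values())
    return {c: v / s for c, v in z.items()}

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

lr = 0.1
for _ in range(2000):
    p = softmax(g_theta)
    fake = random.choices(candidates, weights=[p[c] for c in candidates])[0]
    real = random.choice(human)
    # Discriminator step: logistic update, real toward 1, fake toward 0.
    d_w[real] += lr * (1.0 - sigmoid(d_w[real]))
    d_w[fake] -= lr * sigmoid(d_w[fake])
    # Generator step: REINFORCE, reward = D's score with a 0.5 baseline.
    reward = sigmoid(d_w[fake]) - 0.5
    for c in candidates:
        g_theta[c] += lr * reward * ((1.0 if c == fake else 0.0) - p[c])

p_final = softmax(g_theta)
```

The response humans never produce is the only one the discriminator reliably flags as fake, so the generator learns to avoid it, which is exactly the pressure the adversarial reward is meant to apply.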

SLIDE 34

Adversarial Learning Improves Response Generation

Machine Evaluator: Adversarial Success (how often can you fool a machine)
  Adversarial Learning: 8.0%
  Standard Seq2Seq model: 4.9%

Human Evaluator (vs. vanilla generation model):
  Adversarial Win: 62%   Adversarial Lose: 18%   Tie: 20%

Slide Credit: Jiwei Li

[Bowman et al. 2016]

SLIDES 35-38

Q: Why are we so good at Speech, MT (but bad at NLU)? A: People naturally translate and transcribe.

Q: Large, End-to-End Datasets for NLU?
  • Web-scale Conversations?
  • Web-scale Structured Data?

Generates fluent open domain replies. Really Natural Language Understanding?

SLIDE 39

Learning from Distant Supervision [Mintz et al. 2009]

1) Named Entity Recognition
   Challenge: highly ambiguous labels
   [Ritter et al. EMNLP 2011]

2) Relation Extraction
   Challenge: missing data
   [Ritter et al. TACL 2013] [Konovalov et al. WWW 2017]

3) Time Normalization
   Challenge: diversity in noisy text
   [Tabassum, Ritter, Xu EMNLP 2016]

4) Event Extraction
   Challenge: lack of negative examples
   [Ritter et al. WWW 2015]

O(\theta) = \underbrace{\sum_{i}^{N} \log p_\theta(y_i \mid x_i)}_{\text{Log Likelihood}} \;-\; \underbrace{\lambda_U \, D(\tilde{p} \,\|\, \hat{p}^{\text{unlabeled}}_{\theta})}_{\text{Label regularization}}

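The distant-supervision objective pairs a log-likelihood term over distantly labeled examples with a label-regularization penalty that keeps the model's average prediction on unlabeled data close to an expected label distribution. A minimal numeric sketch, taking D as KL divergence and with every probability invented for illustration:

```python
import math

def kl(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def objective(label_probs, prior, model_marginal, lam=1.0):
    """log-likelihood of distant labels minus lam * D(prior || model)."""
    log_lik = sum(math.log(p) for p in label_probs)
    return log_lik - lam * kl(prior, model_marginal)

# Probability the model assigns to the distant label of each of 3 examples:
label_probs = [0.9, 0.8, 0.7]
# Expected label distribution (e.g. share of mentions that are entities):
prior = [0.3, 0.7]
# Model's average predicted distribution on unlabeled data:
model_marginal = [0.5, 0.5]

o = objective(label_probs, prior, model_marginal, lam=0.1)
```

The regularizer pulls the unlabeled marginal toward the prior, which is what compensates for the noisy, skewed labels distant supervision produces.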

SLIDES 41-42

Time Normalization

Distant Supervision (no human labels or rules!)

[Tabassum, Ritter, Xu EMNLP 2016]

1 Jan 2016

State-of-the-art time resolvers: { TempEx, HeidelTime, SUTime, UWTime }

SLIDES 43-52

Distant Supervision Assumption

Mercury Transit: May 9, 2016
(candidate date resolutions for nearby tweets: 8 May, 9 May, 10 May)
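Under this assumption, a tweet that mentions a database event is presumed to refer to that event's date, so weak sentence-level tags can be read off the database with no human labels. A small sketch using the slide's Mercury-transit example (the DOM/DOW/MOY/tense field names mirror the slides; the function itself is illustrative, not the paper's code):

```python
from datetime import date

def sentence_tags(event_date, tweet_date):
    """Derive weak sentence-level tags from the database event date."""
    if event_date > tweet_date:
        tense = "Future"
    elif event_date < tweet_date:
        tense = "Past"
    else:
        tense = "Present"
    return {
        "DOM": event_date.day,             # day of month (1-31)
        "DOW": event_date.strftime("%a"),  # day of week (Mon-Sun)
        "MOY": event_date.strftime("%B"),  # month of year
        "TL": tense,                       # temporal relation to the tweet
    }

event = date(2016, 5, 9)                        # Mercury transit
tags = sentence_tags(event, date(2016, 5, 8))   # tweet posted the day before
```

A tweet written on 8 May about the transit thus gets Future / May / 9 / Mon as its weak labels, without anyone annotating the text.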

SLIDES 53-56

Multiple Instance Learning Tagger [Hoffmann et al. 2011]

[Event Database] → [ Mercury, 5/9/2016 ]

Words: w1 w2 w3 … wn
Word-Level Tags: z1 z2 z3 … zn, scored by a local classifier: exp(θ · f(wi, zi))
Sentence-Level Tags: t1 t2 t3 t4 (day of month 1-31, day of week Mon-Sun, month 1-12, Past/Present/Future)
Deterministic OR connects the word-level tags to the sentence-level tags.

Maximize Conditional Likelihood: \sum_z P(z, t \mid w, \theta)
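The deterministic-OR link is the multiple-instance part: a sentence-level tag is on exactly when at least one word-level tag predicts it. A tiny sketch of that aggregation (the example words and tags are invented for illustration):

```python
words = ["Mercury", "transit", "tomorrow", "!"]
# Word-level tag predictions z_i, e.g. from the local classifier:
word_tags = ["NA", "NA", "Future", "NA"]

def deterministic_or(word_tags, tagset):
    """Sentence-level tag t is active iff some word-level tag equals t."""
    return {t: any(z == t for z in word_tags) for t in tagset}

sentence_level = deterministic_or(word_tags, ["Past", "Present", "Future"])
```

Because only the sentence-level tags are (distantly) observed, learning must reason over which word produced each active tag, which is why the model sums over z in the conditional likelihood.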

SLIDE 57

Missing Data Problem

Sentence-Level Tags: TL = Future, MOY = May, DOM = 9, DOW = Mon

SLIDES 58-62

Missing Data Extension

Missing Data Problem in Distant Supervision [Ritter et al. TACL 2013]

Words w1 … wn with word-level tags z1 … zn feed aggregated sentence-level tags t1 … t4 (mentioned in text). A parallel set of tags t′1 … t′4 is implied by the event date in the [Event Database], and variables m1 … m4 encourage agreement between the two.

SLIDE 63

Example Tags

Word: Im  Hella  excited  for  tomorrow
Tag:  NA  NA     Future   NA   Future

Word: Thnks  for  a   Christmas  party  on  fri
Tag:  NA     NA   NA  December   NA     NA  Friday

SLIDES 64-65

Evaluation

17% increase in F-score over SUTime

SLIDE 66
SLIDES 67-71

Where can we find NLU? Follow the data!

Opportunistically Gathered Data:
  • Twitter Events (Time Normalization)
  • Billions of Internet Conversations

Design Models for the Data (rather than the other way around)

Thank You!