Modern NLP for Pre-Modern Practitioners, Joel Grus, #QConAI



slide-1
SLIDE 1

Modern NLP

for

Pre-Modern Practitioners

Joel Grus @joelgrus

#QConAI #2019

slide-2
SLIDE 2

"True self-control is waiting until the movie starts to eat your popcorn."

slide-3
SLIDE 3
slide-4
SLIDE 4

starts to eat your popcorn. True self-control is waiting the movie until

slide-5
SLIDE 5
slide-6
SLIDE 6

Natural Language Understanding is Hard

slide-7
SLIDE 7

But We're Getting Better At It*

* as measured by performance on tasks we're getting better at
slide-8
SLIDE 8

As Measured by Performance on Tasks We're Getting Better at*

* tasks that would be easy if we were good at natural language understanding and that we therefore use to measure our progress toward natural language understanding

slide-9
SLIDE 9

About Me

slide-10
SLIDE 10

Obligatory Plug for AllenNLP

slide-11
SLIDE 11

A Handful of Tasks That Would Be Easy if We Were Good at Natural Language Understanding

slide-12
SLIDE 12

Parsing

slide-13
SLIDE 13

Named-Entity Recognition

slide-14
SLIDE 14

Coreference Resolution

slide-15
SLIDE 15

Machine Translation

slide-16
SLIDE 16
slide-17
SLIDE 17

Summarization

Attend QCon.ai.

slide-18
SLIDE 18

Text Classification

slide-19
SLIDE 19

Machine Comprehension

slide-20
SLIDE 20

Machine Comprehension?

slide-21
SLIDE 21

Textual Entailment

slide-22
SLIDE 22

Winograd Schemas

The conference organizer disinvited the speaker because he feared a boring talk. (he = conference organizer)

The conference organizer disinvited the speaker because he proposed a boring talk. (he = speaker)

slide-23
SLIDE 23

Language Modeling

slide-24
SLIDE 24

And many others!

slide-25
SLIDE 25

If you were good at natural language understanding, you'd also be pretty good at these tasks

slide-26
SLIDE 26

So if computers get good at each of these tasks, then...

slide-27
SLIDE 27

(I Am Being Unfair)

  • Each of these tasks is valuable on its own merits
  • Likely they are getting us closer to actual natural language understanding

slide-28
SLIDE 28

Pre-Modern NLP

slide-29
SLIDE 29

Lots of Linguistics

slide-30
SLIDE 30

Grammars

S -> NP VP
NP -> JJ NN
VP -> VBZ ADJP
ADJP -> JJ
JJ -> "Artificial"
NN -> "intelligence"
VBZ -> "is"
JJ -> "dangerous"

Artificial intelligence is dangerous
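
A minimal sketch of how a grammar like this can be applied mechanically. The rules and words are the ones on the slide; the recursive-descent parser, the function names, and the tuple-based tree format are illustrative assumptions, not something the talk prescribes.

```python
# Rules and lexicon copied from the slide's toy grammar.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["JJ", "NN"]],
    "VP": [["VBZ", "ADJP"]],
    "ADJP": [["JJ"]],
}
LEXICON = {
    "JJ": {"Artificial", "dangerous"},
    "NN": {"intelligence"},
    "VBZ": {"is"},
}


def parse(symbol, tokens, start=0):
    """Yield (tree, next_position) for each way `symbol` matches tokens[start:]."""
    if symbol in LEXICON:
        # Part-of-speech symbols consume exactly one matching word.
        if start < len(tokens) and tokens[start] in LEXICON[symbol]:
            yield (symbol, tokens[start]), start + 1
        return
    for rule in GRAMMAR.get(symbol, []):
        for children, end in match_sequence(rule, tokens, start):
            yield (symbol, *children), end


def match_sequence(symbols, tokens, start):
    """Yield (list_of_trees, next_position) for a sequence of symbols."""
    if not symbols:
        yield [], start
        return
    for tree, middle in parse(symbols[0], tokens, start):
        for trees, end in match_sequence(symbols[1:], tokens, middle):
            yield [tree, *trees], end


def parse_sentence(sentence):
    """Return the first parse tree covering the whole sentence, or None."""
    tokens = sentence.split()
    for tree, end in parse("S", tokens):
        if end == len(tokens):
            return tree
    return None


tree = parse_sentence("Artificial intelligence is dangerous")
```

Any sentence outside the hand-written rules, such as "intelligence is Artificial", simply fails to parse, which is the brittleness that the pre-modern approach has to live with.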

slide-31
SLIDE 31

Hand-Crafted Features

slide-32
SLIDE 32

Rule-Based Systems

slide-33
SLIDE 33

Modern NLP

slide-34
SLIDE 34

Theme 1: Neural Nets and Low-Dimensional Representations

slide-35
SLIDE 35

Theme 2: Putting Things in

slide-36
SLIDE 36

Theme 3:

slide-37
SLIDE 37

Theme 4:

slide-38
SLIDE 38

Theme 5: Transfer Learning

slide-39
SLIDE 39

Word Vectors

slide-40
SLIDE 40

Joel is attending an artificial intelligence conference.

[Figure: the word "artificial" is mapped to an embedding, which is used to make a prediction of the neighboring word "intelligence".]
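
The idea pictured on this slide (look up a word's embedding, then use it to predict a neighboring word) can be sketched in a few lines. The vocabulary, every number, and the names `EMBEDDINGS`, `OUTPUT_WEIGHTS`, and `predict_neighbor` are hypothetical, chosen only for illustration; in a real model the numbers are learned from a large corpus.

```python
import math

# Hypothetical toy vocabulary.
VOCAB = ["artificial", "intelligence", "conference", "popcorn"]

# One low-dimensional vector per input word (made-up values).
EMBEDDINGS = {
    "artificial": [0.3, 0.6, 0.1],
    "intelligence": [0.4, 0.5, 0.2],
    "conference": [0.1, 0.2, 0.3],
    "popcorn": [-0.5, 0.1, 0.9],
}

# One output vector per candidate neighbor word (made-up values).
OUTPUT_WEIGHTS = {
    "artificial": [0.1, 0.1, 0.1],
    "intelligence": [2.0, 3.0, 0.5],
    "conference": [1.0, 0.5, 0.2],
    "popcorn": [-1.0, 0.2, 1.5],
}


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(score - largest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def predict_neighbor(word):
    """Score every vocabulary word as a neighbor of `word`."""
    vector = EMBEDDINGS[word]
    scores = [
        sum(v * w for v, w in zip(vector, OUTPUT_WEIGHTS[candidate]))
        for candidate in VOCAB
    ]
    return dict(zip(VOCAB, softmax(scores)))


probs = predict_neighbor("artificial")
best = max(probs, key=probs.get)
```

With these made-up numbers the most probable neighbor of "artificial" is "intelligence". Training would adjust both tables until predictions like this match real text, and the embedding table is the part that gets reused afterwards.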

slide-41
SLIDE 41

Using Word Vectors

? ?

slide-42
SLIDE 42

Using Word Vectors

V N

slide-43
SLIDE 43

Using Word Vectors

N J

The official department heads all quit.

slide-44
SLIDE 44

bites man dog

slide-45
SLIDE 45

Using Context for Sequence Labeling

V N

slide-46
SLIDE 46

Using Context for Sequence Classification

slide-47
SLIDE 47

Recurrent Neural Networks

slide-48
SLIDE 48

LSTMs and GRUs

slide-49
SLIDE 49

Bidirectionality

slide-50
SLIDE 50

Generative Character-Level Modeling

slide-51
SLIDE 51

Convolutional Networks

slide-52
SLIDE 52

Sequence-to-Sequence Models

slide-53
SLIDE 53

Attention

slide-54
SLIDE 54

Large "Unsupervised" Language Models

slide-55
SLIDE 55

Contextual Embeddings

slide-56
SLIDE 56

Contextual Embeddings

The Seahawks football today

slide-57
SLIDE 57
slide-58
SLIDE 58
slide-59
SLIDE 59

word2vec

slide-60
SLIDE 60

ELMo

slide-61
SLIDE 61

ELMo

slide-62
SLIDE 62

"NLP's ImageNet moment"

slide-63
SLIDE 63

Self-Attention
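
A minimal sketch of the core computation, assuming the simplest possible setup: the input vectors serve directly as queries, keys, and values, with no learned projections and a single head. The function names are illustrative, and a real Transformer layer adds learned weight matrices and multiple heads on top of this.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(score - largest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def self_attention(vectors):
    """Replace each vector with a weighted average of all the vectors.

    The weights come from scaled dot products between the vector being
    updated (the query) and every vector in the sequence (the keys).
    """
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        outputs.append([
            sum(weight * value[i] for weight, value in zip(weights, vectors))
            for i in range(dim)
        ])
    return outputs


# Two toy 2-dimensional "word" vectors.
outputs = self_attention([[1.0, 0.0], [0.0, 1.0]])
```

Every position looks at every other position in one step, which is the contrast with an RNN that has to pass information along one token at a time.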

slide-64
SLIDE 64

RNN vs CNN vs Self-Attention

slide-65
SLIDE 65

The Transformer ("Attention Is All You Need")

slide-66
SLIDE 66

OpenAI GPT, or Transformer Decoder Language Model

slide-67
SLIDE 67

One Model to Rule Them All?

slide-68
SLIDE 68

The GLUE Benchmark

slide-69
SLIDE 69

BERT

slide-70
SLIDE 70

Task 1: Masked Language Modeling

Joel is giving a [MASK] talk at a [MASK] in San Francisco

First [MASK]: interesting, exciting, derivative, pedestrian, snooze-worthy, ...

Second [MASK]: conference, meetup, rave, coffeehouse, WeWork, ...
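
A minimal sketch of how a training example for this task could be built. BERT picks the positions to hide at random during pretraining; here they are passed in explicitly so the result is predictable. The whitespace tokenization and the name `mask_tokens` are assumptions made for illustration.

```python
MASK = "[MASK]"


def mask_tokens(tokens, positions):
    """Hide the tokens at `positions`.

    Returns the masked token list plus a {position: original_token}
    mapping, which is what the model is trained to recover.
    """
    masked = list(tokens)
    labels = {}
    for position in positions:
        labels[position] = tokens[position]
        masked[position] = MASK
    return masked, labels


# Filled-in sentence using two candidate words from the slide.
tokens = "Joel is giving a derivative talk at a conference in San Francisco".split()
masked, labels = mask_tokens(tokens, positions=[4, 8])
masked_text = " ".join(masked)
```

The model sees `masked_text` and is scored on how much probability it assigns to the hidden words in `labels`, using context on both sides of each gap.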

slide-71
SLIDE 71

Task 2: Next Sentence Prediction

[CLS] Joel is giving a talk. [SEP] The audience is enthralled. [SEP]
99% is_next_sentence, 1% is_not_next_sentence

[CLS] Joel is giving a talk. [SEP] The audience is falling asleep. [SEP]
1% is_next_sentence, 99% is_not_next_sentence
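
A minimal sketch of how a sentence pair is laid out for this task, using the [CLS] and [SEP] markers shown on the slide. The segment ids (0 for the first sentence, 1 for the second) mirror how BERT tells the two sentences apart; the whitespace tokenization and the name `build_pair_input` are assumptions for illustration, since BERT itself uses a subword tokenizer.

```python
def build_pair_input(sentence_a, sentence_b):
    """Return (tokens, segment_ids) for a two-sentence input."""
    tokens_a = sentence_a.split()
    tokens_b = sentence_b.split()
    tokens = ["[CLS]", *tokens_a, "[SEP]", *tokens_b, "[SEP]"]
    # [CLS], sentence A, and the first [SEP] belong to segment 0;
    # sentence B and the final [SEP] belong to segment 1.
    first_segment_length = len(tokens_a) + 2
    segment_ids = [0] * first_segment_length + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_pair_input(
    "Joel is giving a talk.", "The audience is enthralled."
)
pair_text = " ".join(tokens)
```

The classifier reads the representation at the [CLS] position and outputs the two probabilities (is_next_sentence, is_not_next_sentence).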

slide-72
SLIDE 72

BERT for downstream tasks

slide-73
SLIDE 73

GPT-2

slide-74
SLIDE 74

1.5 billion parameters

slide-75
SLIDE 75
slide-76
SLIDE 76

Is GPT-2 Dangerous?

PRETRAINED LANGUAGE MODEL

slide-77
SLIDE 77
slide-78
SLIDE 78
slide-79
SLIDE 79
slide-80
SLIDE 80

How Can You Use These In Your Work?

slide-81
SLIDE 81

Use Pretrained Word Vectors
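
A minimal sketch of what using pretrained vectors can look like, assuming a plain-text file with one word per line followed by its whitespace-separated components (the layout used by, for example, GloVe downloads). The three sample lines and their numbers are made up, and `io.StringIO` stands in for an open file so the sketch runs without downloading anything.

```python
import io
import math


def load_vectors(lines):
    """Parse 'word v1 v2 ...' lines into a {word: [floats]} dictionary."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical stand-in for a downloaded vector file.
sample_file = io.StringIO(
    "dog 0.9 0.1 0.0\n"
    "puppy 0.8 0.2 0.1\n"
    "popcorn 0.0 0.1 0.9\n"
)
vectors = load_vectors(sample_file)
dog_puppy = cosine_similarity(vectors["dog"], vectors["puppy"])
dog_popcorn = cosine_similarity(vectors["dog"], vectors["popcorn"])
```

With real pretrained vectors the same lookup gives features for a downstream model, and similar words tend to land close together, as `dog_puppy` versus `dog_popcorn` illustrates with the toy numbers.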

slide-82
SLIDE 82

Better Still, Use Pretrained Contextual Embeddings

slide-83
SLIDE 83

Use Pretrained BERT to Build Great Classifiers

slide-84
SLIDE 84

Use GPT-2 (small) (if you dare)

PRETRAINED LANGUAGE MODEL

slide-85
SLIDE 85

In Conclusion

  • NLP is cool
  • Modern NLP is solving really hard problems
  • (And is changing really really quickly)
  • Lots of really smart people with lots of data and lots of compute power have trained models that you can just download and use
  • So take advantage of their work!

I'm fine-tuning a transformer model!

slide-86
SLIDE 86

Thanks!

  • I'll tweet out the slides: @joelgrus

    ○ read the speaker notes
    ○ they have lots of links

  • I sometimes blog: joelgrus.com
  • AI2: allenai.org
  • AllenNLP: allennlp.org
  • GPT-2 Explorer: gpt2.apps.allenai.org
  • podcast: adversariallearning.com
slide-87
SLIDE 87

Appendix

slide-88
SLIDE 88

References

http://ruder.io/a-review-of-the-recent-history-of-nlp/
https://ankit-ai.blogspot.com/2019/03/future-of-natural-language-processing.html
https://lilianweng.github.io/lil-log/2019/01/31/generalized-language-models.html#openai-gpt