Enabling Language Models to Fill in the Blanks



slide-1
SLIDE 1

Enabling Language Models to Fill in the Blanks

Chris Donahue, Mina Lee, Percy Liang

Paper: https://arxiv.org/abs/2005.05339
Code: https://github.com/chrisdonahue/ilm
Demo: https://chrisdonahue.com/ilm

slide-2
SLIDE 2

Why filling in the blanks?

Hi Chris, Thanks for updating the draft. Can you revert the wording of the task definition?

Editing and revising

slide-3
SLIDE 3

Why filling in the blanks?

Hi Chris, Thanks for updating the draft. The modifications look Can you revert the wording of the task definition?

Editing and revising

slide-4
SLIDE 4

Why filling in the blanks?

Hi Chris, Thanks for updating the draft. The modifications look great to me. Can you revert the wording of the task definition?

Editing and revising

slide-5
SLIDE 5

Why filling in the blanks?

Hi Chris, Thanks for updating the draft. The modifications look good with one exception. Can you revert the wording of the task definition?

Editing and revising

slide-6
SLIDE 6

Why filling in the blanks?

We were lost in the dark forest. Suddenly,

Connecting ideas

slide-7
SLIDE 7

Why filling in the blanks?

We were lost in the dark forest. Suddenly, a bear emerged from the trees!

Connecting ideas

slide-8
SLIDE 8

Why filling in the blanks?

We were lost in the dark forest. Suddenly, A wave of relief washed over us and we ran over to greet the other traveler.

Connecting ideas

slide-9
SLIDE 9

Why filling in the blanks?

We were lost in the dark forest. Suddenly, we saw a flashlight in the distance. A wave of relief washed over us and we ran over to greet the other traveler.

Connecting ideas

slide-10
SLIDE 10

Text infilling

Given incomplete text with [blank]s, predict complete text

Input: She ate [blank] for [blank].
Output: She ate leftover pasta for lunch.

  • Arbitrary number of blanks
  • Variable length spans (e.g. word, sentence, paragraph)

slide-11
SLIDE 11

Previous work on text infilling

Input: She ate [blank] for [blank].
Output: She ate leftover pasta for lunch.

General-purpose models
  • GPT-3 (Brown et al., 2020): Cannot consider future context

slide-12
SLIDE 12

Previous work on text infilling

Input: She ate [mask] [mask] for [mask].
Output: She ate leftover pasta for lunch.

General-purpose models
  • GPT-3 (Brown et al., 2020): Cannot consider future context
  • BERT (Devlin et al., 2019): Must know exact number of tokens

slide-13
SLIDE 13

Previous work on text infilling

Input: She ate [blank] for [blank].
Output: She ate leftover pasta for lunch.

General-purpose models
  • GPT-3 (Brown et al., 2020): Cannot consider future context
  • BERT (Devlin et al., 2019): Must know exact number of tokens

Task-specific models
  • SA (Zhu et al., 2019): Cannot leverage pre-trained language models

slide-14
SLIDE 14

Our Idea: Infilling by Language Modeling (ILM)

  • 1. Download your favorite language model (LM)

Language Model

slide-15
SLIDE 15

Our Idea: Infilling by Language Modeling (ILM)

  • 1. Download your favorite language model (LM)
  • 2. Fine-tune the model on infilling examples

Language Model

She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]

slide-16
SLIDE 16

Our Idea: Infilling by Language Modeling (ILM)

  • 1. Manufacture infilling examples

Training time

slide-17
SLIDE 17

Our Idea: Infilling by Language Modeling (ILM)

Training time

  • 1. Manufacture infilling examples

Data: She ate leftover pasta for lunch.

She ate [blank] for [blank].
leftover pasta [answer] lunch [answer]

slide-18
SLIDE 18

Our Idea: Infilling by Language Modeling (ILM)

Training time

  • 1. Manufacture infilling examples

Data: She ate leftover pasta for lunch.

Input: She ate [blank] for [blank].
leftover pasta [answer] lunch [answer]

slide-19
SLIDE 19

Our Idea: Infilling by Language Modeling (ILM)

Training time

  • 1. Manufacture infilling examples

Data: She ate leftover pasta for lunch.

Input: She ate [blank] for [blank].
Target: leftover pasta [answer] lunch [answer]

slide-20
SLIDE 20

Our Idea: Infilling by Language Modeling (ILM)

Training time

  • 1. Manufacture infilling examples

Data: She ate leftover pasta for lunch.

New data: She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]
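
The construction of a training example from ordinary text can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `make_infilling_example` is hypothetical, and the spans to mask are passed in explicitly as substrings, whereas a real pipeline would have to choose them somehow (for example by sampling). Only the `[blank]`, `[answer]`, and `[sep]` markers come from the slides.

```python
BLANK, ANSWER, SEP = "[blank]", "[answer]", "[sep]"


def make_infilling_example(text, spans_to_mask):
    """Build one training string: masked text, [sep], then the answers.

    spans_to_mask is a list of substrings of `text`, given in left-to-right
    order (an assumption made for this sketch).
    """
    masked_parts = []
    answers = []
    cursor = 0
    for span in spans_to_mask:
        start = text.index(span, cursor)  # raises ValueError if the span is absent
        masked_parts.append(text[cursor:start])
        masked_parts.append(BLANK)
        answers.append(f"{span} {ANSWER}")
        cursor = start + len(span)
    masked_parts.append(text[cursor:])
    masked_text = "".join(masked_parts)
    return f"{masked_text} {SEP} {' '.join(answers)}"


example = make_infilling_example(
    "She ate leftover pasta for lunch.", ["leftover pasta", "lunch"]
)
print(example)
```

Because the result is a single flat string, it can be fed to a left-to-right language model with no architectural change, which is the point the slide makes with "New data".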

slide-21
SLIDE 21

Our Idea: Infilling by Language Modeling (ILM)

Training time

  • 2. Download pre-trained left-to-right LM

Language Model

She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]

slide-22
SLIDE 22

Our Idea: Infilling by Language Modeling (ILM)

Training time

  • 3. Fine-tune LM on infilling examples

Language Model

She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]

slide-23
SLIDE 23

Our Idea: Infilling by Language Modeling (ILM)

Training time

  • 3. Fine-tune LM on infilling examples

Language Model

She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]

slide-24
SLIDE 24

Our Idea: Infilling by Language Modeling (ILM)

Test time

Use fine-tuned LM to infill

Input: He drinks [blank] after [blank]. [sep]

Language Model

slide-25
SLIDE 25

Our Idea: Infilling by Language Modeling (ILM)

Test time

Use fine-tuned LM to infill

Input: He drinks [blank] after [blank]. [sep]

Language Model

Target: water [answer] running [answer]

slide-26
SLIDE 26

Our Idea: Infilling by Language Modeling (ILM)

Test time

Use fine-tuned LM to infill

Input: He drinks [blank] after [blank]. [sep]
Target: water [answer] running [answer]

Output: He drinks water after running.
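
The last step, turning the generated target back into complete text, can be sketched as below. This is an assumption-laden illustration rather than the authors' code: the helper name `fill_blanks` is hypothetical, and it assumes the model output is available as one string containing the input, `[sep]`, and the generated answers.

```python
BLANK, ANSWER, SEP = "[blank]", "[answer]", "[sep]"


def fill_blanks(sequence):
    """Substitute each generated answer into the matching [blank], in order."""
    masked_text, generated = sequence.split(SEP, 1)
    masked_text = masked_text.strip()

    # Every answer is terminated by [answer], so the final piece is empty.
    answers = [piece.strip() for piece in generated.split(ANSWER)]
    if answers and answers[-1] == "":
        answers.pop()

    parts = masked_text.split(BLANK)
    if len(parts) - 1 != len(answers):
        raise ValueError("number of answers does not match number of blanks")

    output = parts[0]
    for answer, rest in zip(answers, parts[1:]):
        output += answer + rest
    return output


print(fill_blanks(
    "He drinks [blank] after [blank]. [sep] water [answer] running [answer]"
))
```

The count check matters in practice because a sampled continuation may contain too few or too many `[answer]` markers; raising an error is one possible policy, chosen here only for simplicity.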

slide-27
SLIDE 27

Experimental setup

Data: Stories (Mostafazadeh et al., 2016), Abstracts, Lyrics

Model: BERT, SA (Zhu et al., 2019), LM, ILM (ours)

Metric: Score, Perplexity

  • 1. Human evaluation
  • 2. Quantitative evaluation

slide-28
SLIDE 28
  • 1. Human evaluation: Turing test

Identify one of the five sentences generated by machine.

Patty was excited about having her friends over. She had been working hard preparing the food. She also had the place looking spotless. All of her friends arrived and were seated at the table. Patty had a great time with her friends.

slide-29
SLIDE 29
  • 1. Human evaluation: Turing test

Identify one of the five sentences generated by machine.

Patty was excited about having her friends over. She had been working hard preparing the food. She also had the place looking spotless. All of her friends arrived and were seated at the table. Patty had a great time with her friends.

slide-30
SLIDE 30
  • 1. Human evaluation: Turing test

Identify one of the five sentences generated by machine.

Patty was excited about having her friends over. She had been working hard preparing the food. She also had the place looking spotless. All of her friends arrived and were seated at the table. Patty had a great time with her friends. [blank]

slide-31
SLIDE 31
  • 1. Human evaluation: Turing test

Identify one of the five sentences generated by machine.

Patty was excited about having her friends over. She had been working hard preparing the food. She also had the place looking spotless. All of her friends arrived and were seated at the table. Patty had a great time with her friends. [blank]

ILM: Patty knew her friends wanted pizza.

slide-32
SLIDE 32

  • 1. Human evaluation: Turing test

Identify one of the five sentences generated by machine.

Patty was excited about having her friends over. She had been working hard preparing the food. She also had the place looking spotless. All of her friends arrived and were seated at the table. Patty had a great time with her friends. [blank]

Model   Generated sentence                             Score
SA      She wasn't sure she had to go to the store.    29%
LM      She went to check the tv.                      41%
ILM     Patty knew her friends wanted pizza.           45%
BERT    favoritea ", Mary brightly said.               20%

slide-33
SLIDE 33
  • 2. Quantitative evaluation

Perplexity on the sentence infilling task

Model   Stories   Abstracts   Lyrics
LM      18.3      27.9        27.7
ILM     15.6      22.4        22.6

  • Take advantage of bidirectional context despite using unidirectional models
  • Please refer to the paper for more experiments and detailed analysis
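
For readers unfamiliar with the metric, perplexity is conventionally the exponential of the average negative log-likelihood per token, so lower is better. The sketch below uses that standard definition and then compares the LM and ILM figures reported on this slide. The function names are hypothetical, and the slide does not say exactly which tokens were scored, so this should not be read as the authors' evaluation code.

```python
import math


def perplexity(token_log_probs):
    """exp of the mean negative log-probability (natural log) per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


# (LM, ILM) perplexities as reported on the slide.
reported = {
    "Stories": (18.3, 15.6),
    "Abstracts": (27.9, 22.4),
    "Lyrics": (27.7, 22.6),
}

for dataset, (lm, ilm) in reported.items():
    print(f"{dataset}: ILM perplexity is {relative_reduction(lm, ilm):.1%} lower than LM")
```

As a sanity check on the definition, a model that assigns probability 0.5 to every token has perplexity 2.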

slide-34
SLIDE 34

Takeaways

Conceptual simplicity
  • Minimal change to standard LM training

Model-agnostic framework
  • Leverage massively pre-trained LMs

Input: Thank [blank] for [blank]!
Output: Thank you for watching!

Paper: https://arxiv.org/abs/2005.05339
Code: https://github.com/chrisdonahue/ilm
Demo: https://chrisdonahue.com/ilm