Non-Monotonic Sequential Text Generation



SLIDE 1

Non-Monotonic Sequential Text Generation

Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho

SLIDE 2

Sequential Text Generation

$Y = (y_1, y_2, \ldots, y_N)$

Example: (hi, how, are, you, ?)

SLIDE 3

Sequential Text Generation: Unconditional

Policy $\pi$ → Y

Example outputs Y: (hi, how, are, you, ?), (good, to, see, you, !), (what, time, is, it, ?)

SLIDE 4

Sequential Text Generation: Conditional

X → Policy $\pi$ → Y

Input X: 元気ですか? (Japanese for "How are you?")
Output Y: (how, are, you, ?)
Policy $\pi$: Transformer, LSTM, …

SLIDE 5

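Sequential Text Generation: Monotonic

Tokens are produced left to right, one action per step:

$\pi(a_1 \mid s_1)$ → how
$\pi(a_2 \mid s_2)$ → are
$\pi(a_3 \mid s_3)$ → you
$\pi(a_4 \mid s_4)$ → ?

Each action is a token; the state at step 3, for example, is (how, are, X).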

SLIDE 6

Sequential Text Generation: Non-Monotonic

The same four actions $\pi(a_1 \mid s_1), \ldots, \pi(a_4 \mid s_4)$ may fill the positions in any order.

Generation order: are, how, ?, you
Resulting sequence: how are you ?

SLIDE 7

Binary Tree Generating Policy

Root: are, chosen from (…, how, are, you, ?, the, …)
Left child [ ]: to be chosen from (…, how, …)
Right child [ ]: to be chosen from (…, you, ?, …)

SLIDE 8

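Binary Tree Generating Policy

Root: are, chosen from (…, how, are, you, ?, the, …)
Left child of are: how, chosen from (…, how, …), with children ∅ ∅
Right child of are: ?, chosen from (…, you, ?, …)
Left child of ?: you, chosen from (…, you, …), with children ∅ ∅
Right child of ?: ∅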

SLIDE 9

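Binary Tree Generating Policy

Generation order: are, how, ?, you
Output (in-order traversal): how are you ?

[Figure: the tree with root are, left child how, right child ?, and you as the left child of ?; all five remaining child slots are ∅.]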
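The mapping from a generation order to an output sentence can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that child slots are filled level by level, and the `Node` class, the `build_tree` and `in_order` helpers, and the `<end>` string standing in for ∅ are all hypothetical names.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional

END = "<end>"  # stands in for the ∅ symbol on the slides (assumed name)


@dataclass
class Node:
    token: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(actions: List[str]) -> Optional[Node]:
    """Consume actions in level order; END closes a slot without adding a node."""
    it = iter(actions)
    first = next(it)
    if first == END:
        return None
    root = Node(first)
    # Each queue entry is (parent, side) for a child slot still to be filled.
    slots = deque([(root, "left"), (root, "right")])
    for action in it:
        parent, side = slots.popleft()
        if action == END:
            continue
        child = Node(action)
        setattr(parent, side, child)
        slots.append((child, "left"))
        slots.append((child, "right"))
    return root


def in_order(node: Optional[Node]) -> List[str]:
    """Left subtree, then the node's token, then the right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.token] + in_order(node.right)


# The word order "are, how, ?, you" from the slide, with the five ∅ actions added.
actions = ["are", "how", "?", END, END, "you", END, END, END]
sentence = in_order(build_tree(actions))
print(" ".join(sentence))  # how are you ?
```

The tokens are emitted in the order are, how, ?, you, yet the in-order traversal recovers the sentence in its reading order.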

SLIDE 10

Binary Tree Generating Policy

[Figure: three trees, each over the tokens how, are, you, ? with five ∅ leaves, followed by an ellipsis.]
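Many different trees share the same in-order traversal, which is why a single sentence admits many generation orders. The sketch below enumerates them for the example sentence. It is illustrative only: trees are plain `(left, token, right)` tuples with `None` in place of ∅, generation order is assumed to be level order, and `all_trees` and `generation_order` are hypothetical helper names.

```python
from collections import deque


def all_trees(tokens):
    """Every binary tree whose in-order traversal equals `tokens`."""
    if not tokens:
        return [None]  # None plays the role of ∅
    trees = []
    for i, root in enumerate(tokens):
        # Everything left of the root goes in the left subtree, the rest in the right.
        for left in all_trees(tokens[:i]):
            for right in all_trees(tokens[i + 1:]):
                trees.append((left, root, right))
    return trees


def generation_order(tree):
    """Tokens in level order, skipping ∅ slots."""
    order, queue = [], deque([tree])
    while queue:
        node = queue.popleft()
        if node is None:
            continue
        left, token, right = node
        order.append(token)
        queue.extend([left, right])
    return order


trees = all_trees(["how", "are", "you", "?"])
orders = [generation_order(t) for t in trees]
print(len(trees))  # 14, the fourth Catalan number
```

Both the order shown on the slides (are, how, ?, you) and the plain left-to-right order (how, are, you, ?) appear in `orders`.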

SLIDE 11

Imitation Learning

  • Define an oracle: $\pi^*(a_t \mid s_t, X, Y)$
  • Sample sequences: $(a_1, \ldots, a_T) \sim \pi^*$
  • Minimize cost: $\mathrm{KL}\left[\pi^*(\cdot \mid s_t) \,\|\, \pi_\theta(\cdot \mid s_t)\right]$

SLIDE 12

Oracles

Oracle: only puts mass on valid actions

[Figure: a tree over the tokens A, B, C, D with ∅ leaves, and four bar charts over the vocabulary (A, B, C, D, E) labelled $\pi^*_{\text{uniform}}$.]
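A uniform oracle of this kind can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes the valid actions at a node are the distinct tokens still to be placed in that node's span, and that the only valid action for an empty span is the end symbol. The name `uniform_oracle` and the `<end>` string are hypothetical.

```python
from typing import Dict, List

END = "<end>"  # stands in for the ∅ symbol on the slides (assumed name)


def uniform_oracle(valid_tokens: List[str], vocab: List[str]) -> Dict[str, float]:
    """Equal mass on each distinct valid token; all mass on END if none remain."""
    probs = {tok: 0.0 for tok in vocab + [END]}
    distinct = set(valid_tokens)
    if not distinct:
        probs[END] = 1.0
        return probs
    for tok in distinct:
        probs[tok] = 1.0 / len(distinct)
    return probs


vocab = ["A", "B", "C", "D", "E"]

# At the root, every token of the target A B C D is valid; E never is.
root = uniform_oracle(["A", "B", "C", "D"], vocab)

# If B is chosen at the root, its left span is [A] and its right span is [C, D].
left = uniform_oracle(["A"], vocab)
right = uniform_oracle(["C", "D"], vocab)

# A finished leaf has an empty span, so the oracle asks for END.
leaf = uniform_oracle([], vocab)
```

Splitting the span around the chosen token is what keeps the in-order traversal of the finished tree equal to the target sequence.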

SLIDE 13

Oracles

Oracle: only puts mass on valid actions

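[Figure: a tree over the tokens A, B, C, D with ∅ leaves, and four bar charts over the vocabulary (A, B, C, D, E) labelled $\pi^*_{\text{uniform}}$.]

$\mathcal{L}_1 = \mathrm{KL}\left(\pi^*_{\text{uniform}} \,\|\, \pi_\theta\right)$

[Figure: two bar charts over (A, B, C, D, E), one for $\pi^*_{\text{uniform}}$ and one for $\pi_\theta$.]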
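The per-node loss compares two distributions over the vocabulary. A minimal sketch of that computation follows, using plain dictionaries; the probability values are invented for illustration and `kl_divergence` is a hypothetical helper name.

```python
import math
from typing import Dict


def kl_divergence(p: Dict[str, float], q: Dict[str, float]) -> float:
    """KL(p || q) = sum_a p(a) * log(p(a) / q(a)); terms with p(a) = 0 contribute 0."""
    return sum(pa * math.log(pa / q[a]) for a, pa in p.items() if pa > 0.0)


# Illustrative numbers only: the oracle is uniform over the valid tokens A-D,
# while the learned policy still wastes some mass on the invalid token E.
oracle = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25, "E": 0.0}
policy = {"A": 0.10, "B": 0.40, "C": 0.20, "D": 0.20, "E": 0.10}

loss = kl_divergence(oracle, policy)
print(loss)
```

The loss is zero only when the policy matches the oracle on every token the oracle supports, so mass placed on an invalid token such as E is penalised indirectly: it is taken away from the valid ones.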

SLIDE 14

Oracles

Left-right oracle: only puts mass on the "left-most" valid action

[Figure: a tree over the tokens A, B, C, D with ∅ leaves, and four bar charts over the vocabulary (A, B, C, D, E) labelled $\pi^*_{\text{left-right}}$.]
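For comparison, a left-right oracle can be sketched in the same style. Again this is an assumption-laden illustration: the valid actions are taken to be the tokens remaining in the current span, and `left_right_oracle` and `<end>` are hypothetical names.

```python
from typing import Dict, List

END = "<end>"  # stands in for the ∅ symbol on the slides (assumed name)


def left_right_oracle(valid_tokens: List[str], vocab: List[str]) -> Dict[str, float]:
    """All mass on the left-most remaining token, or on END if the span is empty."""
    probs = {tok: 0.0 for tok in vocab + [END]}
    if valid_tokens:
        probs[valid_tokens[0]] = 1.0
    else:
        probs[END] = 1.0
    return probs


vocab = ["A", "B", "C", "D", "E"]
span = ["A", "B", "C", "D"]
steps = []
while span:
    probs = left_right_oracle(span, vocab)
    token = max(probs, key=probs.get)  # the oracle is deterministic here
    steps.append(token)
    # Choosing the left-most token leaves an empty left span (END) and
    # everything else in the right span.
    span = span[1:]

print(steps)  # ['A', 'B', 'C', 'D']
```

Under this oracle every left child is ∅, so the tree degenerates into a right-leaning chain and generation proceeds in ordinary left-to-right order.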

SLIDE 15

Coaching

Weight correct actions by the learned policy

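$\pi^*_{\text{coaching}}(a \mid s) \propto \pi^*_{\text{uniform}}(a \mid s)\,\pi_\theta(a \mid s)$

[Figure: a tree fragment with nodes A, C and ∅, and bar charts over the vocabulary (A, B, C, D, E) for $\pi^*_{\text{uniform}}$, $\pi_\theta$ and $\pi^*_{\text{coaching}}$.]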

SLIDE 16

Coaching

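  • Weight valid actions by the learned policy
  • Loss reinforces preferred orders

$\pi^*_{\text{coaching}}(a \mid s) \propto \pi^*_{\text{uniform}}(a \mid s)\,\pi_\theta(a \mid s)$

$\mathrm{KL}\left(\pi^*_{\text{coaching}} \,\|\, \pi_\theta\right)$

[Figure: a tree fragment with nodes A, C and ∅, and bar charts over the vocabulary (A, B, C, D, E) for $\pi^*_{\text{uniform}}$, $\pi_\theta$ and $\pi^*_{\text{coaching}}$.]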
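The coaching oracle can be sketched as an element-wise product followed by renormalisation. The probabilities below are invented for illustration, `coaching_oracle` is a hypothetical name, and the sketch assumes the learned policy gives non-zero mass to at least one valid action (otherwise the normaliser would be zero).

```python
from typing import Dict


def coaching_oracle(uniform: Dict[str, float], policy: Dict[str, float]) -> Dict[str, float]:
    """pi*_coaching(a|s) is proportional to pi*_uniform(a|s) * pi_theta(a|s)."""
    weights = {a: uniform[a] * policy[a] for a in uniform}
    total = sum(weights.values())  # assumed > 0, see the note above
    return {a: w / total for a, w in weights.items()}


# Illustrative numbers only: A and C are the valid actions at this node.
uniform = {"A": 0.5, "B": 0.0, "C": 0.5, "D": 0.0, "E": 0.0}
policy = {"A": 0.6, "B": 0.1, "C": 0.2, "D": 0.05, "E": 0.05}

coaching = coaching_oracle(uniform, policy)
print(coaching)
```

Invalid actions keep zero mass, while among the valid ones A ends up with about three times the mass of C because the learned policy already prefers it. Training against this target therefore reinforces the orders the policy has started to favour.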

SLIDE 17

Results | Unconditional

SLIDE 18

Results | Unconditional

SLIDE 19

Results | Conditional

Word Reordering

SLIDE 20

Results | Conditional

Machine Translation

SLIDE 21

Results | Variable-Sized Text Infilling

[Figure: infilling samples drawn from $\pi(\cdot \mid \text{input})$ for the Left-Right and Non-Monotonic models.]

SLIDE 22

Results | Variable-Sized Text Infilling

SLIDE 23
  • Code & Pre-trained Models: https://github.com/wellecks/nonmonotonic_text
  • Poster #45 (Pacific Ballroom)

SLIDE 24
  • Code & Pre-trained Models: https://github.com/wellecks/nonmonotonic_text
  • Poster #45 (Pacific Ballroom)

Thank you!