Linguistic scaffolds for policy learning (Jacob Andreas, Berkeley)



slide-1
SLIDE 1

Jacob Andreas Berkeley → Microsoft Semantic Machines → MIT

Linguistic scaffolds for policy learning

slide-2
SLIDE 2

Linguistic scaffolds for policy learning

Work on language!

Jacob Andreas Berkeley → Microsoft Semantic Machines → MIT

slide-3
SLIDE 3

What RL can do for language

What language can do for RL

Crafting environment

make plank: get wood, use toolshed
make stick: get wood, use workbench
make cloth: get grass, use factory
make rope: get grass, use toolshed
make bridge: get iron, get wood
make bed∗: get wood, use toolshed
make axe∗: get wood, use workbench
make shears: get wood, use workbench
get gold: get iron, get wood
get gem: get wood, use workbench

String-editing instructions: replace the last letter of the word drop head; change the final letter to t i; add a z if the last character is a; every vowel becomes y; change only the first consonant to; first & last 3 letters; delete every vowel; replace all n s with c

slide-4
SLIDE 4

What RL can do for language

Daniel Fried, Ronghang Hu, Volkan Cirik

w/ Anja Rohrbach, L.P. Morency, Taylor Berg-Kirkpatrick, Trevor Darrell and Dan Klein

slide-5
SLIDE 5

Generation & understanding

[Anderson et al. 18]

Turn right and walk through the kitchen. Go right into the living room and stop by the rug.

slide-6
SLIDE 6

A reference game

[Frank & Goodman 12]

slide-7
SLIDE 7

“glasses”

[Frank & Goodman 12]

slide-11
SLIDE 11

The rational speech acts model

[Frank & Goodman 12]

L0( · | glasses): 1/2, 1/2
L0( · | hat): 1

slide-12
SLIDE 12

The rational speech acts model

L0( · | glasses): 1/2, 1/2
L0( · | hat): 1

S1( glasses | · ) ∝ L0( · | glasses): 1, 1/3
S1( hat | · ): 2/3

[Frank & Goodman 12]

slide-13
SLIDE 13

The rational speech acts model

S1( glasses | · ) ∝ L0( · | glasses): 1, 1/3
S1( hat | · ): 2/3

L1( · | glasses ) ∝ S1( glasses | · ): 3/4, 1/4
L1( · | hat ): 1

[Frank & Goodman 12]
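The numbers on the rational speech acts slides can be reproduced with a few lines of exact arithmetic. The sketch below is an illustration, not the speaker's code: it assumes three referents (a plain face, a face with glasses, a face with hat and glasses), two utterances, uniform priors, and no utterance cost or rationality parameter. The names `REFERENTS`, `TRUE_OF`, `literal_listener`, `speaker` and `pragmatic_listener` are hypothetical.

```python
from fractions import Fraction

# Assumed reading of the slides: three referents, two utterances.
REFERENTS = ["plain", "glasses", "hat+glasses"]
UTTERANCES = ["glasses", "hat"]
# Which referents each utterance is literally true of (an assumption).
TRUE_OF = {
    "glasses": {"glasses", "hat+glasses"},
    "hat": {"hat+glasses"},
}


def normalize(weights):
    """Scale non-negative weights to sum to 1 (all zeros stay all zeros)."""
    total = sum(weights.values())
    if total == 0:
        return dict(weights)
    return {key: value / total for key, value in weights.items()}


def literal_listener(utterance):
    """L0( . | utterance): uniform over referents the utterance is true of."""
    return normalize(
        {r: Fraction(int(r in TRUE_OF[utterance])) for r in REFERENTS}
    )


def speaker(referent):
    """S1( . | referent), proportional to L0(referent | utterance)."""
    return normalize({u: literal_listener(u)[referent] for u in UTTERANCES})


def pragmatic_listener(utterance):
    """L1( . | utterance), proportional to S1(utterance | referent)."""
    return normalize({r: speaker(r)[utterance] for r in REFERENTS})
```

Under these assumptions, `pragmatic_listener("glasses")` puts 3/4 on the glasses-only face and 1/4 on the hat-and-glasses face, and `speaker("hat+glasses")` gives 1/3 to "glasses" and 2/3 to "hat", matching the values on the slides.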

slide-14
SLIDE 14

Pragmatics

Q: Do you know what time it is?

slide-15
SLIDE 15

Pragmatics

Q: Do you know what time it is?
A: Yes

slide-16
SLIDE 16

Pragmatics

Q: Do you know what time it is?
A: Yes

I find his cooking very interesting.

[Grice 70]

slide-17
SLIDE 17

RSA game tree

[Game tree figure: speaker node with branches labeled hat and glasses]

slide-18
SLIDE 18

RSA game tree: as speaker

[Game tree figure: branches labeled hat, glasses, hat, glasses; leaf payoffs −1, +1, −1, +1; players: speaker, (listener)]

slide-21
SLIDE 21

RSA game tree: as listener

[Game tree figure: utterance “glasses”; players: (speaker), listener; unknowns marked “?”]

slide-22
SLIDE 22

RSA game tree: as listener

[Game tree figure: utterance “glasses”; players: (speaker), listener; unknowns marked “?”]

Language use is gameplay!

slide-23
SLIDE 23

A recipe for pragmatic text generation

1. Train a base listener model

[Figure: labeled face images; labels include smiley, plain, man, glasses, hat & glasses]

slide-24
SLIDE 24

A recipe for pragmatic text generation

1. Train a base listener model
2. Train a reasoning speaker to win when playing with the listener

[Game tree figure: branches labeled hat, glasses, hat, glasses; leaf payoffs −1, +1, −1, +1]

slide-25
SLIDE 25

Application: image captioning

1. Train an image retrieval / gen model

a snake is slithering away from Jenny

slide-26
SLIDE 26

Application: image captioning

2. Describe images using the listener model for search at inference time

[Figure: candidate captions “a snake is slithering away” and “the sun is in the sky”; scores shown: +1, −1, −1, −1]

[A & Klein 16, Vedantam et al. 17]
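Using the listener for search at inference time can be sketched as reranking: propose candidate captions, then keep the one that makes the listener most likely to pick the target image over a distractor. The toy below is only an illustration under stated assumptions. The word-overlap listener, the tag sets in `IMAGES`, and the function names are all hypothetical stand-ins for a trained retrieval model and a trained captioner.

```python
import math
from typing import Dict, List, Set


def listener_probs(caption: str, images: Dict[str, Set[str]]) -> Dict[str, float]:
    """Toy listener: softmax over images of the number of caption words
    that also appear in the image's tag set."""
    words = set(caption.lower().split())
    scores = {name: len(words & tags) for name, tags in images.items()}
    z = sum(math.exp(s) for s in scores.values())
    return {name: math.exp(s) / z for name, s in scores.items()}


def pragmatic_caption(
    candidates: List[str], target: str, images: Dict[str, Set[str]]
) -> str:
    """Return the candidate that maximizes the listener's probability of
    choosing the target image."""
    return max(candidates, key=lambda c: listener_probs(c, images)[target])


# Hypothetical tags: both scenes contain a sun and sky, only the target has a snake.
IMAGES = {
    "target": {"snake", "slithering", "sun", "sky", "jenny"},
    "distractor": {"sun", "sky", "jenny", "bat"},
}
# Hypothetical candidates, as a base captioner might propose them.
CANDIDATES = ["the sun is in the sky", "a snake is slithering away"]

best = pragmatic_caption(CANDIDATES, "target", IMAGES)
```

Here "the sun is in the sky" is true of both images, so the toy listener is at chance, while the snake caption singles out the target and is selected.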

slide-27
SLIDE 27

Application: image captioning

2. Describe images using the listener model as a training-time reward (“self-play”)

[Figure: captioner model → “the sun is in the sky” → retrieval loss; score shown: −1]

[Yu et al. 16, Mao et al. 16]

slide-28
SLIDE 28

Descriptive captions

[Vedantam et al. 17]

seq2seq captioner: this bird has a yellow breast with a short pointy bill

pragmatic captioner: a small yellow bird with black stripes on its body and black stripe on the wings.

slide-29
SLIDE 29

Contrastive captions without contrastive data!

(a) (b) (c)

Mike is holding a baseball bat. The snake is slithering away from Mike & Jenny.

[A & Klein 16]

slide-30
SLIDE 30

Application: instruction generation

1. Train a base instruction following model
2. Train an instruction generation model to get the follower to goal states

slide-31
SLIDE 31

Application: instruction generation

[Fried, Hu, Cirik et al. 18]

speaker-listener: Walk past the dining room table and chairs and take a right into the living room. Stop once you are on the rug.

seq2seq: Walk past the dining room table and chairs and wait there.

human: Turn right and walk through the kitchen. Go right into the living room and stop by the rug.

slide-32
SLIDE 32

Listener mode

[Fried, Hu, Cirik et al. 18]

human instruction: Go through the door on the right and continue straight. Stop in the next room in front of the bed.

(a) orange: trajectory without pragmatic inference
(b) green: trajectory with pragmatic inference

[Figure: top-down overview of trajectories; seq-to-seq vs. speaker-listener]

slide-33
SLIDE 33

The rules of the game

glasses

glasses

+1

slide-34
SLIDE 34

The rules of the game

hat hat +1

slide-35
SLIDE 35

Killer robots [Lewis et al. 17]

Bob: i can i i everything else . . . . . . . . . . . . . .
Alice: balls have zero to me to me to me to me to me to me to me to me to
Bob: you i everything else . . . . . . . . . . . . . .
Alice: balls have a ball to me to me to me to me to me to me to me

slide-37
SLIDE 37

Problems to work on

How do we use tools like self-play and tree search while remaining within the rules of natural language?

How do we do efficient search in string-valued action spaces?

slide-39
SLIDE 39

What language can do for RL

w/ Dan Klein and Sergey Levine

slide-40
SLIDE 40

A crafting game

make planks make sticks

slide-41
SLIDE 41

Learning with sketches

use saw get wood use axe get wood

slide-42
SLIDE 42

The options framework

[Sutton et al. 99]

slide-43
SLIDE 43

Unsupervised option learning

[Bacon & Precup 16]

+r

slide-44
SLIDE 44

Learning with intermediate rewards

[Kearns & Singh 02, Kulkarni et al. 16]

+r +r

slide-45
SLIDE 45

Segmenting demonstrations


[Stolle & Precup 02, Fox & Krishnan et al. 16]

slide-46
SLIDE 46

Learning from sketches

get wood, use saw

[A, Klein & Levine 17]

slide-47
SLIDE 47

Modular policies

get wood → π1, use saw → π2
get wood → π1, use axe → π3

slide-49
SLIDE 49

Modular policies

π1

get wood

TURN LEFT
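The modular-policy idea on these slides (one subpolicy per sketch symbol, with "get wood" shared across tasks) can be illustrated with a small controller that runs each subpolicy in turn. This is a sketch under assumptions, not the method from the talk: the scripted subpolicies stand in for learned ones, the `STOP` signal and the `FORWARD`, `USE` and `TURN RIGHT` actions are hypothetical, and the task-to-sketch mapping is one reading of the crafting-game slides.

```python
from typing import Callable, Dict, List

STOP = "STOP"  # hypothetical action: the current subpolicy hands back control


def make_scripted_policy(actions: List[str]) -> Callable[[int], str]:
    """Toy stand-in for a learned subpolicy: replay a fixed script, then STOP."""

    def policy(step: int) -> str:
        return actions[step] if step < len(actions) else STOP

    return policy


# One subpolicy per sketch symbol; "get wood" is shared by both tasks.
SUBPOLICIES: Dict[str, Callable[[int], str]] = {
    "get wood": make_scripted_policy(["TURN LEFT", "FORWARD", "USE"]),
    "use saw": make_scripted_policy(["FORWARD", "USE"]),
    "use axe": make_scripted_policy(["TURN RIGHT", "USE"]),
}

# Assumed sketches for the two tasks named on the crafting-game slide.
SKETCHES: Dict[str, List[str]] = {
    "make planks": ["get wood", "use saw"],
    "make sticks": ["get wood", "use axe"],
}


def rollout(task: str) -> List[str]:
    """Follow the task's sketch, running each subpolicy until it emits STOP."""
    actions: List[str] = []
    for symbol in SKETCHES[task]:
        policy = SUBPOLICIES[symbol]
        step = 0
        while (action := policy(step)) != STOP:
            actions.append(action)
            step += 1
    return actions
```

Both `rollout("make planks")` and `rollout("make sticks")` begin with the same three actions, because the "get wood" subpolicy is reused; only the second segment differs.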

slide-50
SLIDE 50

Results: crafting game

slide-51
SLIDE 51

Results: crafting game

[Plot: reward vs. episodes (×10^6); curves: Unsupervised, Sketches: modular, Sketches: joint]

slide-52
SLIDE 52

Results: locomotion

slide-53
SLIDE 53

Results: locomotion

[Plot: reward vs. timesteps (×10^8); curves: Unsupervised, Sketches: modular, Sketches: joint]

slide-54
SLIDE 54

Generalization

What if I don’t get a sketch at test time?

???

slide-55
SLIDE 55

Generalization

[Bar chart: Training vs. Adaptation, axis ticks 25 to 100; series: Sketches, Unsupervised; bar values shown: 47, 89, 76, 42]

What if I don’t get a sketch at test time?

slide-56
SLIDE 56

Moral

A little bit of (structured) language goes a long way!

slide-57
SLIDE 57

Beyond structured sketches

Language learning: itch → itctch (first & last 3 letters)

Learning from demonstrations: emboldens → emboldecs, dogtrot → dogtrot, loneliness → locelicess; vein → ???

[A, Klein & Levine 17]

slide-59
SLIDE 59

Pretraining via language learning

first & last 3 letters: wonderful → wonful

f( · ; η, )

[Branavan et al., 09]

slide-60
SLIDE 60

Concept learning

emboldens → emboldecs, vein → veic, loneliness → locelicess

L(f( · ; η, ), · )

slide-61
SLIDE 61

Concept learning

emboldens → emboldecs, vein → veic, loneliness → locelicess

every vowel becomes i

L(f( · ; η, ), · )

slide-62
SLIDE 62

Concept learning

emboldens → emboldecs, vein → veic, loneliness → locelicess

every vowel becomes i: 128.6

L(f( · ; η, ), · )

slide-63
SLIDE 63

Concept learning

emboldens → emboldecs, vein → veic, loneliness → locelicess

change consonants to c: 52.3
every vowel becomes i: 128.6

L(f( · ; η, ), · )

slide-64
SLIDE 64

Concept learning

emboldens → emboldecs, vein → veic, loneliness → locelicess

change consonants to c: 52.3
every vowel becomes i: 128.6
replace n with c: 8.3

L(f( · ; η, ), · )

slide-65
SLIDE 65

Prediction

replace n with c

L(f( · ; η, ), · )

slide-66
SLIDE 66

Evaluation

replace n with c

loonies

L(f( · ; η, ), · )

slide-67
SLIDE 67

Evaluation

replace n with c

loonies → loocies

f( · ; η, )
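The concept learning, prediction and evaluation slides can be read as a three-step loop: score candidate descriptions against the demonstrations, keep the lowest-loss description, then apply it to a new input. The toy below sketches that loop under strong assumptions. Each description is executed by a hand-written function rather than a learned interpreter f( · ; η, ), and the loss is a character mismatch count rather than the model loss behind the 128.6 / 52.3 / 8.3 scores on the slides. All names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

VOWELS = set("aeiou")

# Hand-written executors for three candidate descriptions (an assumption:
# the talk uses a learned model, not rules like these).
CANDIDATES: Dict[str, Callable[[str], str]] = {
    "every vowel becomes i": lambda w: "".join(
        "i" if ch in VOWELS else ch for ch in w
    ),
    "change consonants to c": lambda w: "".join(
        ch if ch in VOWELS else "c" for ch in w
    ),
    "replace n with c": lambda w: w.replace("n", "c"),
}

# Demonstrations taken from the slides.
EXAMPLES: List[Tuple[str, str]] = [
    ("emboldens", "emboldecs"),
    ("vein", "veic"),
    ("loneliness", "locelicess"),
]


def loss(fn: Callable[[str], str], examples: List[Tuple[str, str]]) -> int:
    """Toy loss: number of mismatched characters over all examples.
    All three candidate rules preserve word length, so zip loses nothing."""
    return sum(sum(a != b for a, b in zip(fn(x), y)) for x, y in examples)


def infer_description(examples: List[Tuple[str, str]]) -> str:
    """Concept learning step: pick the description with the lowest loss."""
    return min(CANDIDATES, key=lambda d: loss(CANDIDATES[d], examples))


best = infer_description(EXAMPLES)
prediction = CANDIDATES[best]("loonies")  # evaluation on a held-out word
```

With these toy rules the selected description is "replace n with c", and applying it to "loonies" yields "loocies", as on the evaluation slide.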

slide-68
SLIDE 68

As multitask learning

Pretraining data: wonderful → wonful, glabrous → glaous, itch → itctch (first & last 3 letters)

Training data: emboldens → emboldecs, vein → veic, loneliness → locelicess (replace n with c)

vein → veic, itch → itctch

arg min_η L(f( | ; η, ))

arg min_η L(f( | ; η, ))

???

[Caruana 97]

slide-69
SLIDE 69

As inverse reinforcement learning

vein

replace n with c

???

cost function

arg min Ê_{τ∼π} [ L(f(τ | ; η, )) ]

slide-70
SLIDE 70

As a language game…

replace n with c

speaker model listener loss

f L

<latexit sha1_base64="LWz4dIFHTSmXqy3ZapCM+I1HvOk=">ACxnicbVFda9RAFJ3Ej9ZodauPvgwuha3okojYQl+KH9CHihXctrATws3kZjs0maTzoSwx4F/xZ/lvnGSj2K0XEs6cM+fM3DtpXQhtwvCX59+6fefuxua94P6DrYePRtuPT3VlFcZr4pKnaegsRASZ0aYAs9rhVCmBZ6l+86/ewrKi0q+cUsa4xLWEiRCw7GUcno5w4DtSiFTBgaoMeTfMKurixklJXC/Xp8QDvxhVu5xe5u8MdDGf3roVzgLlI0+ZDmzTMgKVMi5I6pZ3uR31fYika5mxC+27aRmLWuCnDIuFKfHyWgcTsO+6E0QDWBMhjpJtr0tlXcligNL0DreRTWJm5AGcELbANmNdbAL2GBcwclKjpj+9pTuOyWheKfdJQ3v2X0cDpdbLMnU7u3b1utaR/9Pm1uT7cSNkbQ1KvjotwU1Fe0ehmZCITfF0gHgSri7Un4BCrhxzxew9+h6UfjR5X6qUYGp1PNmH7bA9aha9dZteQGK2P6yY4fTWNwmn0+fX48O0wyk3ylDwjExKRPXJIjsgJmRHubXgvTfen/kS9/631ZbfW/wPCHXyv/xGxd+2cM=</latexit><latexit sha1_base64="LWz4dIFHTSmXqy3ZapCM+I1HvOk=">ACxnicbVFda9RAFJ3Ej9ZodauPvgwuha3okojYQl+KH9CHihXctrATws3kZjs0maTzoSwx4F/xZ/lvnGSj2K0XEs6cM+fM3DtpXQhtwvCX59+6fefuxua94P6DrYePRtuPT3VlFcZr4pKnaegsRASZ0aYAs9rhVCmBZ6l+86/ewrKi0q+cUsa4xLWEiRCw7GUcno5w4DtSiFTBgaoMeTfMKurixklJXC/Xp8QDvxhVu5xe5u8MdDGf3roVzgLlI0+ZDmzTMgKVMi5I6pZ3uR31fYika5mxC+27aRmLWuCnDIuFKfHyWgcTsO+6E0QDWBMhjpJtr0tlXcligNL0DreRTWJm5AGcELbANmNdbAL2GBcwclKjpj+9pTuOyWheKfdJQ3v2X0cDpdbLMnU7u3b1utaR/9Pm1uT7cSNkbQ1KvjotwU1Fe0ehmZCITfF0gHgSri7Un4BCrhxzxew9+h6UfjR5X6qUYGp1PNmH7bA9aha9dZteQGK2P6yY4fTWNwmn0+fX48O0wyk3ylDwjExKRPXJIjsgJmRHubXgvTfen/kS9/631ZbfW/wPCHXyv/xGxd+2cM=</latexit><latexit sha1_base64="LWz4dIFHTSmXqy3ZapCM+I1HvOk=">ACxnicbVFda9RAFJ3Ej9ZodauPvgwuha3okojYQl+KH9CHihXctrATws3kZjs0maTzoSwx4F/xZ/lvnGSj2K0XEs6cM+fM3DtpXQhtwvCX59+6fefuxua94P6DrYePRtuPT3VlFcZr4pKnaegsRASZ0aYAs9rhVCmBZ6l+86/ewrKi0q+cUsa4xLWEiRCw7GUcno5w4DtSiFTBgaoMeTfMKurixklJXC/Xp8QDvxhVu5xe5u8MdDGf3roVzgLlI0+ZDmzTMgKVMi5I6pZ3uR31fYika5mxC+27aRmLWuCnDIuFKfHyWgcTsO+6E0QDWBMhjpJtr0tlXcligNL0DreRTWJm5AGcELbANmNdbAL2GBcwclKjpj+9pTuOyWheKfdJQ3v2X0cDpdbLMnU7u3b1utaR/9Pm1uT7cSNkbQ1KvjotwU1Fe0ehmZCITfF0gHgSri7Un4BCrhxzxew9+h6UfjR5X6qUYGp1PNmH7bA9aha9dZteQGK2P6yY4fTWNwmn0+fX48O0wyk3ylDwjExKRPXJIjsgJmRHubXgvTfen/kS9/631ZbfW/wPCHXyv/xGxd+2cM=</latexit><latexit 
sha1_base64="38u3Msd8cLInPlViofgp/9PDR90=">ACFXicbZDNSgMxFIUz9a+OrbZrN8EiIsy40aXgi7ciBXsD3SGksnctqGZEgyQhn6Am59BZ/GnbjybcxMu7CtFwIf5yY395wo5Uwbz/txKju7e/sH1UP3qObWj08atZ6WmaLQpZJLNYiIBs4EdA0zHAapApJEHPrR7K7o919BaSbFi5mnECZkItiYUWKs1Bk1Wl7bKwtvg7+CFlrVqOnUg1jSLAFhKCdaD30vNWFOlGUw8INMg0poTMygaFQRLQYV7ucDnVonxWCp7hMGl+vdFThKt50lkbybETPVmrxD/6w0zM74JcybSzICgy4/GcdG4sI0jpkCavjcAqGK2V0xnRJFqLHRuME9WC8KHu3cpxQUMVJd5gFRk4SJRQlBQWvrLC3Z/PzNtLahd9X2vb/7KEqOkVn6AL56BrdogfUQV1EUYze0Lvz4Xw6X8ucK84q8CZaK+f7F+W1ogc=</latexit><latexit sha1_base64="YgEGFlhxUQC9mG7/D4825E2/RMY=">ACu3icbVFda9RAFJ3Ej9bY6tZXwZLYSu6JD5YwRdBhT5UrOC2hZ0QbiY326GTSTofyhLzY/xZ/hsn2Sh264WEM+fMPTP3TN5IYWwc/wrCO3fv3d/afhA93Nl9Hiyt3Nmaqc5znkta32Rg0EpFM6tsBIvGo1Q5RLP86v3vX7+DbURtfpqVw2mFSyVKAUH6ls8vOAgV5WQmUMLdCTaTl19cOCsoq4X8Dfkt78YVf+cXhYfSnhzL6t7sRvgPsZ63H7usZRYcZUZU1CvdYvDtqR+jJd3wTL3pME2rsehYG5WUcaE5Pckm+/EsHoreBskI9slYp9lesMuKmrsKleUSjFkcWPTFrQVXGIXMWewAX4FS1x4qKBCk7bD6R098ExBy1r7T1k6sP92tFAZs6pyv7Mf12xqPfk/beFs+SZthWqcRcXB5VOUlvT/mFoITRyK1ceANfC35XyS9DArX+iH1AP4vGT973c4MabK2ft2P63QBYj25cZz2SDzDZjOs2OHs1S+JZ8iUm2+QpeUamJCFH5B05JqdkTniwFbwMXgdH4XGoQreOgzGzJ+QGxV+/w2htiR</latexit><latexit sha1_base64="YgEGFlhxUQC9mG7/D4825E2/RMY=">ACu3icbVFda9RAFJ3Ej9bY6tZXwZLYSu6JD5YwRdBhT5UrOC2hZ0QbiY326GTSTofyhLzY/xZ/hsn2Sh264WEM+fMPTP3TN5IYWwc/wrCO3fv3d/afhA93Nl9Hiyt3Nmaqc5znkta32Rg0EpFM6tsBIvGo1Q5RLP86v3vX7+DbURtfpqVw2mFSyVKAUH6ls8vOAgV5WQmUMLdCTaTl19cOCsoq4X8Dfkt78YVf+cXhYfSnhzL6t7sRvgPsZ63H7usZRYcZUZU1CvdYvDtqR+jJd3wTL3pME2rsehYG5WUcaE5Pckm+/EsHoreBskI9slYp9lesMuKmrsKleUSjFkcWPTFrQVXGIXMWewAX4FS1x4qKBCk7bD6R098ExBy1r7T1k6sP92tFAZs6pyv7Mf12xqPfk/beFs+SZthWqcRcXB5VOUlvT/mFoITRyK1ceANfC35XyS9DArX+iH1AP4vGT973c4MabK2ft2P63QBYj25cZz2SDzDZjOs2OHs1S+JZ8iUm2+QpeUamJCFH5B05JqdkTniwFbwMXgdH4XGoQreOgzGzJ+QGxV+/w2htiR</latexit><latexit 
sha1_base64="LMvrh5/kVgAFyuB7+QrRexaOGU="></latexit>

arg min


???

vein veic

slide-71
SLIDE 71

Results: string editing accuracy

  • Identity: 18
  • Multitask: 50
  • Meta: 62
  • This Work: 76

slide-72
SLIDE 72

Results

(a)

examples:

  • emboldens → emboldecs
  • kisses → kisses
  • loneliness → locelicess
  • vein → veic
  • dogtrot → dogtrot

input: loonies

true description: replace all n s with c
true output: loocies

pred. description: change any n to a c
pred. output: loocies
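The string-editing example on this slide can be checked by hand with a small sketch. This is not the model from the talk: it hard-codes one candidate rule ("replace all n s with c") as a hypothetical helper, `replace_n_with_c`, and only verifies that the rule agrees with the example pairs and produces the reported output for "loonies".

```python
# Hypothetical helper (not from the talk): the rule "replace all n s with c"
# written out as a plain string transformation.
def replace_n_with_c(word: str) -> str:
    return word.replace("n", "c")


# Input/output pairs as they appear on the slide.
EXAMPLES = [
    ("emboldens", "emboldecs"),
    ("kisses", "kisses"),
    ("loneliness", "locelicess"),
    ("vein", "veic"),
    ("dogtrot", "dogtrot"),
]


def consistent(rule, pairs) -> bool:
    """Return True if the rule maps every input to its listed output."""
    return all(rule(source) == target for source, target in pairs)


if __name__ == "__main__":
    # The rule should agree with all five example pairs.
    print(consistent(replace_n_with_c, EXAMPLES))
    # Apply the rule to the held-out input from the slide.
    print(replace_n_with_c("loonies"))
```

Both the true and the predicted description on the slide describe this same transformation, which is why the true and predicted outputs coincide.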

slide-73
SLIDE 73

Problems

How do we bootstrap from (unannotated) exploration of the environment alone?

How good are inferred descriptions as explanations?

slide-74
SLIDE 74

Problems

How do we bootstrap from (unannotated) exploration of the environment alone?

How good are inferred descriptions as explanations?

slide-75
SLIDE 75

Conclusions

slide-76
SLIDE 76

Conclusions

Use RL in NLP by formulating language generation / understanding as reward maximization rather than supervised learning.
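One generic way to write the contrast on this slide, as a sketch rather than the talk's own notation: the symbols below (a model p_θ, inputs x, reference outputs y*, and a reward function R) are all assumptions introduced for illustration.

```latex
\begin{align*}
\text{supervised learning:} \quad
  & \max_{\theta} \sum_{(x,\, y^{*})} \log p_{\theta}(y^{*} \mid x) \\
\text{reward maximization:} \quad
  & \max_{\theta} \sum_{x} \mathbb{E}_{y \sim p_{\theta}(\cdot \mid x)} \bigl[ R(x, y) \bigr]
\end{align*}
```

In the first objective the model is trained to reproduce a fixed reference output; in the second it is scored on whatever it generates, so the reward can measure task success (for example, whether a listener follows the instruction correctly).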

slide-77
SLIDE 77

Conclusions

Use language in (I)RL as a scaffold for learning options, goal representations, cost functions.

Languages encode 100k years of accumulated knowledge about which abstractions are useful. Take advantage of it!

slide-78
SLIDE 78

Thanks!

also…

Looking for NLP jobs? Ask me about Microsoft!

Applying to PhD programs? Ask me about MIT!

jaandrea@microsoft.com