Tracking the World State with Recurrent Entity Networks

Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun

Task

At each timestep, we receive information (in the form of a sentence) about the state of the world, and then answer a question. When new information arrives, we should update our representation of the world state. Since the world state can be decomposed into the state of each entity in the world, we only need to update the entity the new information concerns.

Architecture

The memory model:

  • input: a sequence of vectors s_1, ⋯, s_T
  • output: a set of entity representations h_1, ⋯, h_m

The world is a collection of entities, and information about each entity is stored in a single cell. Each cell j comes with a key w_j and a memory slot h_j.

The gate g_j depends on h_j, w_j, and the current input s_t. Standard gating mechanism:

g_j = σ(s_tᵀ h_j + s_tᵀ w_j)
h̃_j = ϕ(U h_j + V w_j + W s_t)
h_j ← h_j + g_j ⊙ h̃_j
h_j ← h_j / ‖h_j‖
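As a concreteness check, here is a minimal NumPy sketch of one dynamic-memory update step. The function name, shapes, and the explicit loop are illustrative choices, and ϕ is taken to be the identity for simplicity (the paper parametrizes it, e.g. as a PReLU):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def entnet_step(s_t, H, keys, U, V, W, eps=1e-8):
    """One dynamic-memory update for a sentence vector s_t.

    s_t  : (d,)   current sentence encoding
    H    : (m, d) memory slots h_1..h_m, updated in place
    keys : (m, d) key vectors w_1..w_m
    U, V, W : (d, d) shared weight matrices; phi is the identity here
    """
    for j in range(H.shape[0]):
        # gate: how much the sentence matches the memory content and the key
        g = sigmoid(s_t @ H[j] + s_t @ keys[j])
        # candidate update h~_j = phi(U h_j + V w_j + W s_t)
        h_tilde = U @ H[j] + V @ keys[j] + W @ s_t
        # gated write, then renormalize so old content slowly decays
        H[j] = H[j] + g * h_tilde
        H[j] = H[j] / (np.linalg.norm(H[j]) + eps)
    return H
```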
slide-2
SLIDE 2

11/7/2017 Tracking the World State with Recurrent Entity Networks https://paper.dropbox.com/doc/print/RqoUdaN3Xe14IOEXWr1ze?print=true 2/10

Multiple cells at multiple timesteps (diagram on slide).


Input Encoder:

  • input: a sequence of sentences
  • output: an encoding of each sentence as a fixed-size vector

s_t = Σ_i f_i ⊙ e_i, where the e_i are pretrained word embeddings and the f_i are learned multiplicative masks.
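A one-line sketch of this encoder, assuming (as above) that a sentence is a mask-weighted sum of its word embeddings; in the model the masks are trained parameters, while here they are just an input array:

```python
import numpy as np

def encode_sentence(E, F):
    """Bag-of-embeddings encoder with multiplicative masks: s = sum_i f_i * e_i.

    E : (k, d) pretrained word embeddings e_1..e_k of the sentence
    F : (k, d) positional masks f_1..f_k (learned parameters in the model)
    """
    return (F * E).sum(axis=0)
```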

Output Module:

  • input: a query vector q and the outputs h_1, ⋯, h_m of the memory model
  • output: an arbitrary vector (here, log probabilities over words)

p_j = softmax(qᵀ h_j)
u = Σ_j p_j h_j
y = R ϕ(q + H u)
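A minimal sketch of the readout, again taking ϕ as the identity; the argument names (`mems`, `H_mat`) are hypothetical, and `H_mat` stands for the matrix called H in the formula above:

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def entnet_output(q, mems, R, H_mat):
    """Readout y = R phi(q + H u), with phi = identity for simplicity.

    q     : (d,)       query vector
    mems  : (m, d)     final memory slots h_1..h_m
    R     : (vocab, d) output word matrix
    H_mat : (d, d)     the matrix H in the formula above
    """
    p = softmax(mems @ q)        # p_j = softmax(q . h_j): attention over memories
    u = p @ mems                 # u = sum_j p_j h_j
    return R @ (q + H_mat @ u)   # scores over the vocabulary
```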

Look at only one entity and drop the query:

y = R ϕ(H h_j)

Key vectors

The model should identify entities by their keys, which are trainable.

Key tying:

Use a parser to identify entities, with one memory cell per entity, and freeze each key vector to the word embedding of its entity.
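A minimal sketch of key tying. The entity list and embedding dictionary are hypothetical inputs, and initialising the memories to the keys is an assumption, not something this slide states:

```python
import numpy as np

def tie_keys(entities, embedding):
    """One memory cell per entity found by the parser, with each key frozen
    to the entity's word embedding (i.e. excluded from gradient updates).

    entities  : list of entity words, e.g. ["Mary", "kitchen", "football"]
    embedding : dict word -> (d,) pretrained embedding vector
    """
    keys = np.stack([embedding[e] for e in entities])  # w_j, kept fixed
    mems = keys.copy()  # initialising h_j to the keys (an assumed choice)
    return keys, mems
```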

Related work

LSTM recap: forget gate layer f_t = σ(W_f [h_{t−1}, x_t] + b_f); input gate layer and tanh (hyperbolic tangent) layer: i_t = σ(W_i [h_{t−1}, x_t] + b_i), C̃_t = tanh(W_C [h_{t−1}, x_t] + b_C).

LSTM/GRU vs. RENN:

  • LSTM/GRU: a scalar memory cell with full interaction; the gate is just a sigmoid layer of the input and hidden state
  • RENN: separate memory cells; the gate adds a content-based term between the input and hidden state

Memory Network vs. RENN:

  • Memory Network: stores the entire input sequence (a window of words or hidden states) as memories in dynamic long-term memory, and sequentially updates a controller's hidden state via a softmax gating over the memories
  • RENN: a fixed number of blocks, each updated with an independent gated RNN

Gated graph network vs. RENN:

  • Gated graph network: inter-network communication with edges
  • RENN: parallel/independent recurrent models

Compared to RENN, CommNN, the Interaction Network, and the Neural Physics Engine use parallel recurrent models without a gating mechanism.

Experiments

Synthetic world model task

Task details:
  • Two agents randomly placed in a 10×10 grid
  • Answer the locations of the agents based on up to T−2 supporting facts

Model details:
  • 5 memory slots
  • 20 dimensions per cell
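A toy episode generator in the spirit of this task; the sentence format and agent names are assumptions, not the paper's exact setup:

```python
import random

def make_episode(T=10, grid=10, seed=0):
    """Two agents jump around a grid; each fact reports one agent's new
    cell, and the question asks for an agent's current location."""
    rng = random.Random(seed)
    pos = {"agent1": (rng.randrange(grid), rng.randrange(grid)),
           "agent2": (rng.randrange(grid), rng.randrange(grid))}
    facts = []
    for _ in range(T - 2):  # up to T-2 supporting facts
        a = rng.choice(sorted(pos))
        pos[a] = (rng.randrange(grid), rng.randrange(grid))
        facts.append(f"{a} is at {pos[a]}")
    who = rng.choice(sorted(pos))
    return facts, f"where is {who}?", pos[who]
```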



bAbI

Details:
  • 20 memory cells
  • 100-dimensional embeddings
  • Simplification: U = V = 0, W = I, ϕ = id (the update reduces to h_j ← h_j + g_j ⊙ s_t, so each memory is a gated sum of the input vectors)

Entity matrix (figure on slide).


Interpreting representations

Recall that

y = R ϕ(H h_j)

For each entity j, find the row R_i closest to ϕ(H h_j), i.e. decode each memory slot into its nearest output word.
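A short sketch of this decoding step, reusing the names from the output-module sketch above (with ϕ again taken as the identity):

```python
import numpy as np

def decode_memories(mems, H_mat, R, vocab):
    """For each memory slot h_j, report the word whose row of R scores
    highest against phi(H h_j)."""
    words = []
    for h in mems:
        scores = R @ (H_mat @ h)  # R_i . phi(H h_j) for every word i
        words.append(vocab[int(scores.argmax())])
    return words
```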

CBT (Children's Book Test)

Input:
  • 20 sentences
  • a 21st sentence with a missing word
  • a list of candidate words

Details:
  • keys tied to the candidate words
  • dropout
  • U = V = 0, W = I, ϕ = id
  • no normalization
