SLIDE 1

Neural Turing Machines

Tristan Deleu

@tristandeleu

June 23, 2016
SLIDE 2

Deep Learning

SLIDE 3

The building blocks

[Diagram: the building blocks (Convolutional, Fully connected, and Recurrent layers, among others) are combined into networks that produce predictions for applications such as Object Recognition, Object Detection, Image Segmentation, Speech Recognition, and Language Processing.]

SLIDE 4

Examples

  • Object Detection + Predictions = Face detection
  • Speech Recognition + Predictions = Automatic speech recognition
  • Image Segmentation + Predictions = Image segmentation

SLIDE 5

Examples

  • Language Processing + Predictions = Sentiment analysis
  • Object Recognition + Language Processing + Predictions = Image captioning
  • Language Processing + Language Processing + Predictions = Machine translation

SLIDE 6

Frameworks

Theano, Torch, TensorFlow, Keras, Chainer, Neon, CNTK, MXNet, Caffe, Lasagne

SLIDE 7

Theano + Lasagne

https://github.com/Lasagne/Lasagne/blob/master/examples/mnist.py
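For reference, a minimal sketch in the spirit of the linked mnist.py script: a small multi-layer perceptron defined with Lasagne and compiled with Theano. The layer sizes and hyperparameters here are illustrative, not necessarily the ones used in the example.

```python
import theano
import theano.tensor as T
import lasagne

# Symbolic variables for a batch of flattened 28x28 images and their integer labels
input_var = T.matrix('inputs')
target_var = T.ivector('targets')

# A small multi-layer perceptron built by stacking Lasagne layers
network = lasagne.layers.InputLayer(shape=(None, 784), input_var=input_var)
network = lasagne.layers.DenseLayer(network, num_units=500,
                                    nonlinearity=lasagne.nonlinearities.rectify)
network = lasagne.layers.DenseLayer(network, num_units=10,
                                    nonlinearity=lasagne.nonlinearities.softmax)

# Cross-entropy loss and Nesterov-momentum updates, compiled into a training function
prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.categorical_crossentropy(prediction, target_var).mean()
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params,
                                            learning_rate=0.01, momentum=0.9)
train_fn = theano.function([input_var, target_var], loss, updates=updates)
```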

SLIDE 8

Neural Turing Machines

SLIDE 9

Recurrent Neural Network

[Diagram: an LSTM unrolled over time. At each step t, the cell LSTM_t receives the input x_t and the previous hidden state h_{t-1}, and produces the new hidden state h_t and the output y_t.]
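The unrolled recurrence can be written out directly. Below is a plain NumPy sketch of a vanilla RNN; the slide shows an LSTM, which adds gating on top of this same pattern, and all weight names here are illustrative.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y):
    """Unroll a vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h), y_t = W_hy h_t + b_y."""
    h = np.zeros(W_hh.shape[0])
    ys = []
    for x in xs:                                   # one iteration per time step t
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)     # new state from the input and the previous state
        ys.append(W_hy @ h + b_y)                  # output at this time step
    return ys, h
```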

SLIDE 10

Memory-augmented Networks

[Diagram: a neural network is asked about "BOAT" and consults an external knowledge base of facts: "Boats float on water", "You can't sail against the wind", "Boats do not fly", …]

  • Inspired by neuroscience
  • Memory-augmented networks: add an external memory to neural networks to act as a knowledge base
  • Keep track of intermediate computations, such as the story needed to answer the question in QA problems


Memory Networks & Dynamic Memory Networks

SLIDE 11

Memory-augmented Networks

  • Memory Networks
  • Dynamic Memory Networks
  • Neural GPU
  • Neural Stack/Queue/DeQue
  • Stack-augmented RNN

SLIDE 12

Turing Machine

[Diagram: a tape of symbols with a head that reads and writes one cell at a time. A transition table maps (current state, symbol read) to (new state, symbol to write, head move); the states are q0, q1, …]
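To make the transition table concrete, here is a toy interpreter for it; the states and rules below are made up purely for illustration.

```python
# Transition table: (state, symbol read) -> (new state, symbol to write, head move)
RULES = {
    ('q0', 1): ('q0', 1, +1),   # keep scanning 1s to the right
    ('q0', 0): ('q1', 1, -1),   # on a blank: write a 1, switch state, move left
}

def tm_step(state, tape, head):
    """Apply one Turing machine transition to (state, tape, head position)."""
    new_state, write, move = RULES[(state, tape[head])]
    tape[head] = write
    return new_state, tape, head + move

state, tape, head = 'q0', [1, 1, 1, 0, 0], 0
while (state, tape[head]) in RULES:                # halt when no rule applies
    state, tape, head = tm_step(state, tape, head)
print(state, tape)                                 # q1 [1, 1, 1, 1, 0]
```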

SLIDE 13

Neural Turing Machine

[Diagram: the same tape and read/write head, but the hand-written transition table is replaced by a neural network ("?") that maps inputs to outputs.]

SLIDE 14

Heads

[Diagram: a Turing Machine head points at exactly one cell of the tape, whereas a Neural Turing Machine head emits a normalized weighting w_t over all the rows of the memory matrix M_t, so reads and writes are blurry and differentiable.]
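In the NTM paper these weightings come from content-based addressing, refined by interpolation, shifting, and sharpening (omitted here). A NumPy sketch of content addressing and of the resulting blurry read and write:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def content_addressing(M, key, beta):
    """w_t[i] is proportional to exp(beta * cosine_similarity(key, M[i]))."""
    sims = M @ key / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
    return softmax(beta * sims)

def read(M, w):
    """Blurry read: r_t is the weighted sum of all memory rows."""
    return w @ M

def write(M, w, erase, add):
    """Blurry write: every row is partially erased, then incremented."""
    M = M * (1 - np.outer(w, erase))
    return M + np.outer(w, add)
```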

SLIDE 15

Neural Turing Machine

[Diagram: the NTM unrolled over time with a feedforward controller. At each step t, the controller FF_t receives the input x_t and the read vector from the previous step, produces the output y_t, and reads from and writes to the memory M_t through its heads.]

  • Controller
  • Read heads
  • Write heads
SLIDE 16

Neural Turing Machine

[Diagram: the same architecture with an LSTM controller: LSTM_t replaces FF_t, so the controller keeps its own internal state in addition to the external memory M_t.]

  • Controller
  • Read heads
  • Write heads
SLIDE 17

Neural Turing Machine

  • Memory
  • Controller
  • Read heads
  • Write heads

[Diagram: the four components assembled into a single NTM block that maps inputs to outputs.]
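Putting the four components together, one time step looks roughly like the sketch below. The controller, read_head, and write_head callables are hypothetical stand-ins for the trained modules, and the memory update reuses the blurry read and write sketched earlier.

```python
import numpy as np

def ntm_step(x, r_prev, M, controller, read_head, write_head):
    """One simplified NTM step with a single read head and a single write head."""
    h = controller(np.concatenate([x, r_prev]))   # controller sees the input and the previous read vector
    w_w, erase, add = write_head(h, M)            # write weighting plus erase/add vectors
    M = M * (1 - np.outer(w_w, erase)) + np.outer(w_w, add)   # blurry write
    w_r = read_head(h, M)                         # read weighting
    r = w_r @ M                                   # blurry read vector
    return h, r, M                                # controller output (pre output layer), read vector, new memory
```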
SLIDE 18

Open-source Library

medium.com/snips-ai
github.com/snipsco/ntm-lasagne

SLIDE 19

NTM-Lasagne

SLIDE 20

Algorithmic Tasks

  • Goal: learn full algorithms only from input/output examples; since the data is synthetic, we can generate as much of it as we need
  • Strong generalization: generalize beyond the data the NTM has seen during training, to longer sequences for example

[Diagram: an unknown program ("?") maps inputs to outputs; the NTM has to recover it from example pairs drawn from P(X, Y).]

SLIDE 21

Copy task

[Diagram: the input is a sequence of random bit vectors followed by an EOS marker; the target output is the same sequence.]
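Because the data is synthetic, each training example can be generated on the fly. Below is a sketch of one possible encoding of a copy example; the exact channel layout is an assumption, not necessarily the one used in NTM-Lasagne.

```python
import numpy as np

def copy_example(length, width=8):
    """Input: random bit vectors, then an EOS step on an extra channel, then blanks.
    Target: blanks while the input is presented, then the same bit vectors."""
    seq = np.random.randint(0, 2, size=(length, width))
    x = np.zeros((2 * length + 1, width + 1))
    y = np.zeros((2 * length + 1, width))
    x[:length, :width] = seq      # present the sequence
    x[length, width] = 1          # EOS marker on the extra channel
    y[length + 1:, :] = seq       # reproduce the sequence after the EOS
    return x, y
```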

SLIDE 22

Training

SLIDE 23

Copy task

SLIDE 24

Copy task

SLIDE 25

Copy task

Length 120

SLIDE 26

Copy task

Length 150

SLIDE 27

Repeat Copy task

[Diagram: the input sequence is followed by a repeat count (here ×5); the target output is the sequence repeated that many times, ending with an EOS marker.]
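A corresponding sketch for the repeat-copy task, where the repeat count is presented on an extra input channel and the target ends with an end-of-output marker; the encoding details (for example, whether the count is normalized) are assumptions.

```python
import numpy as np

def repeat_copy_example(length, repeats, width=8):
    """Input: random bit vectors, then the repeat count on an extra channel.
    Target: the sequence repeated `repeats` times, then an end-of-output marker."""
    seq = np.random.randint(0, 2, size=(length, width))
    t_in = length + 1                      # sequence + repeat-count step
    t_out = repeats * length + 1           # repeated sequence + end marker
    x = np.zeros((t_in + t_out, width + 1))
    y = np.zeros((t_in + t_out, width + 1))
    x[:length, :width] = seq               # present the sequence
    x[length, width] = repeats             # repeat count on the extra channel
    y[t_in:t_in + repeats * length, :width] = np.tile(seq, (repeats, 1))
    y[-1, width] = 1                       # end-of-output marker
    return x, y
```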

SLIDE 28

Repeat Copy task

SLIDE 29

Repeat Copy task

SLIDE 30

Associative Recall task

[Diagram: the input is a list of items (short groups of bit vectors separated by delimiters) followed by a query item; the target output is the item that came right after the query in the input.]

SLIDE 31

Associative Recall task

SLIDE 32

Associative Recall task

SLIDE 33

Priority Sort task

SLIDE 34

bAbI tasks

SLIDE 35

bAbI tasks

[Diagram: the entities Mary, John, and Sandra moving between the bathroom, the garden, and the hallway, as described by the story below.]

Mary went to the garden
John went to the garden
Mary went back to the hallway
Sandra journeyed to the bathroom
John went to the hallway
Mary went to the bathroom

SLIDE 36

bAbI tasks

SLIDE 37

Conclusion

  • The NTM is able to learn algorithms only from examples
  • It shows better generalization performance than other recurrent architectures (LSTMs, for example)
  • Fully differentiable structure (drawback: generalization is still not quite perfect)
  • A new take on Artificial Intelligence: trying to teach machines to do things the same way we would learn them ourselves

  • Resources
  • Theano: http://deeplearning.net/software/theano/
  • Lasagne: http://lasagne.readthedocs.io/en/latest/
  • NTM-Lasagne: https://github.com/snipsco/ntm-lasagne

SLIDE 38

Thank you

@tristandeleu

June 23, 2016