SLIDE 1

Neural Turing Machines

Tristan Deleu

@tristandeleu

June 23, 2016
SLIDE 2

Deep Learning

SLIDE 3

The building blocks

[Diagram: the building blocks (Convolutional, Fully connected, and Recurrent layers, among others) are combined into networks that produce predictions for applications such as Object Recognition, Object Detection, Image Segmentation, Speech Recognition, and Language Processing.]

SLIDE 4

Examples

  • Object Detection + Predictions = Face detection
  • Speech Recognition + Predictions = Automatic speech recognition
  • Image Segmentation + Predictions = Image segmentation

SLIDE 5

Examples

  • Language Processing + Predictions = Sentiment analysis
  • Object Recognition + Language Processing + Predictions = Image captioning
  • Language Processing + Language Processing + Predictions = Machine translation

SLIDE 6

Frameworks

Theano, Torch, TensorFlow, Keras, Chainer, Neon, CNTK, MXNet, Caffe, Lasagne

SLIDE 7

Theano + Lasagne

https://github.com/Lasagne/Lasagne/blob/master/examples/mnist.py
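For reference, a minimal sketch in the spirit of the linked mnist.py script: a small multi-layer perceptron defined with Lasagne and compiled with Theano. The layer sizes and hyperparameters here are illustrative, not necessarily the ones used in the example.

```python
import theano
import theano.tensor as T
import lasagne

# Symbolic variables for a batch of flattened 28x28 images and their integer labels
input_var = T.matrix('inputs')
target_var = T.ivector('targets')

# A small multi-layer perceptron built by stacking Lasagne layers
network = lasagne.layers.InputLayer(shape=(None, 784), input_var=input_var)
network = lasagne.layers.DenseLayer(network, num_units=500,
                                    nonlinearity=lasagne.nonlinearities.rectify)
network = lasagne.layers.DenseLayer(network, num_units=10,
                                    nonlinearity=lasagne.nonlinearities.softmax)

# Cross-entropy loss and Nesterov-momentum updates, compiled into a training function
prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.categorical_crossentropy(prediction, target_var).mean()
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params,
                                            learning_rate=0.01, momentum=0.9)
train_fn = theano.function([input_var, target_var], loss, updates=updates)
```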

SLIDE 8

Neural Turing Machines

SLIDE 9

Recurrent Neural Network

[Diagram: an LSTM unrolled over time. At each step t, the cell LSTM_t receives the input x_t and the previous hidden state h_{t-1}, and produces the new hidden state h_t and the output y_t.]
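The unrolled recurrence can be written out directly. Below is a plain NumPy sketch of a vanilla RNN; the slide shows an LSTM, which adds gating on top of this same pattern, and all weight names here are illustrative.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y):
    """Unroll a vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h), y_t = W_hy h_t + b_y."""
    h = np.zeros(W_hh.shape[0])
    ys = []
    for x in xs:                                   # one iteration per time step t
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)     # new state from the input and the previous state
        ys.append(W_hy @ h + b_y)                  # output at this time step
    return ys, h
```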

SLIDE 10

Memory-augmented Networks

[Diagram: a neural network is asked about "BOAT" and consults an external knowledge base of facts: "Boats float on water", "You can't sail against the wind", "Boats do not fly", …]

  • Inspired by neuroscience
  • Memory-augmented networks: add an external memory to neural networks to act as a knowledge base
  • Keep track of intermediate computations, such as the story needed to answer the question in QA problems


Memory Networks & Dynamic Memory Networks

SLIDE 11

Memory-augmented Networks

  • Memory Networks
  • Dynamic Memory Networks
  • Neural GPU
  • Neural Stack/Queue/DeQue
  • Stack-augmented RNN

SLIDE 12

Turing Machine

[Diagram: a tape of symbols with a head that reads and writes one cell at a time. A transition table maps (current state, symbol read) to (new state, symbol to write, head move); the states are q0, q1, …]
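To make the transition table concrete, here is a toy interpreter for it; the states and rules below are made up purely for illustration.

```python
# Transition table: (state, symbol read) -> (new state, symbol to write, head move)
RULES = {
    ('q0', 1): ('q0', 1, +1),   # keep scanning 1s to the right
    ('q0', 0): ('q1', 1, -1),   # on a blank: write a 1, switch state, move left
}

def tm_step(state, tape, head):
    """Apply one Turing machine transition to (state, tape, head position)."""
    new_state, write, move = RULES[(state, tape[head])]
    tape[head] = write
    return new_state, tape, head + move

state, tape, head = 'q0', [1, 1, 1, 0, 0], 0
while (state, tape[head]) in RULES:                # halt when no rule applies
    state, tape, head = tm_step(state, tape, head)
print(state, tape)                                 # q1 [1, 1, 1, 1, 0]
```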

SLIDE 13

Neural Turing Machine

[Diagram: the same tape and read/write head, but the hand-written transition table is replaced by a neural network ("?") that maps inputs to outputs.]

SLIDE 14

Heads

[Diagram: a Turing Machine head points at exactly one cell of the tape, whereas a Neural Turing Machine head emits a normalized weighting w_t over all the rows of the memory matrix M_t, so reads and writes are blurry and differentiable.]
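In the NTM paper these weightings come from content-based addressing, refined by interpolation, shifting, and sharpening (omitted here). A NumPy sketch of content addressing and of the resulting blurry read and write:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def content_addressing(M, key, beta):
    """w_t[i] is proportional to exp(beta * cosine_similarity(key, M[i]))."""
    sims = M @ key / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
    return softmax(beta * sims)

def read(M, w):
    """Blurry read: r_t is the weighted sum of all memory rows."""
    return w @ M

def write(M, w, erase, add):
    """Blurry write: every row is partially erased, then incremented."""
    M = M * (1 - np.outer(w, erase))
    return M + np.outer(w, add)
```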

SLIDE 15

Neural Turing Machine

[Diagram: the NTM unrolled over time with a feedforward controller. At each step t, the controller FF_t receives the input x_t and the read vector from the previous step, produces the output y_t, and reads from and writes to the memory M_t through its heads.]

  • Controller
  • Read heads
  • Write heads
SLIDE 16

Neural Turing Machine

[Diagram: the same architecture with an LSTM controller: LSTM_t replaces FF_t, so the controller keeps its own internal state in addition to the external memory M_t.]

  • Controller
  • Read heads
  • Write heads
SLIDE 17

Neural Turing Machine

  • Memory
  • Controller
  • Read heads
  • Write heads

[Diagram: the four components assembled into a single NTM block that maps inputs to outputs.]
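Putting the four components together, one time step looks roughly like the sketch below. The controller, read_head, and write_head callables are hypothetical stand-ins for the trained modules, and the memory update reuses the blurry read and write sketched earlier.

```python
import numpy as np

def ntm_step(x, r_prev, M, controller, read_head, write_head):
    """One simplified NTM step with a single read head and a single write head."""
    h = controller(np.concatenate([x, r_prev]))   # controller sees the input and the previous read vector
    w_w, erase, add = write_head(h, M)            # write weighting plus erase/add vectors
    M = M * (1 - np.outer(w_w, erase)) + np.outer(w_w, add)   # blurry write
    w_r = read_head(h, M)                         # read weighting
    r = w_r @ M                                   # blurry read vector
    return h, r, M                                # controller output (pre output layer), read vector, new memory
```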
SLIDE 18

Open-source Library

medium.com/snips-ai
github.com/snipsco/ntm-lasagne

SLIDE 19

NTM-Lasagne

SLIDE 20

Algorithmic Tasks

  • Goal: learn full algorithms only from input/output examples; since the data is synthetic, we can generate as much of it as we need
  • Strong generalization: generalize beyond the data the NTM has seen during training, to longer sequences for example

[Diagram: an unknown program ("?") maps inputs to outputs; the NTM has to recover it from example pairs drawn from P(X, Y).]

SLIDE 21

Copy task

[Diagram: the input is a sequence of random bit vectors followed by an EOS marker; the target output is the same sequence.]
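Because the data is synthetic, each training example can be generated on the fly. Below is a sketch of one possible encoding of a copy example; the exact channel layout is an assumption, not necessarily the one used in NTM-Lasagne.

```python
import numpy as np

def copy_example(length, width=8):
    """Input: random bit vectors, then an EOS step on an extra channel, then blanks.
    Target: blanks while the input is presented, then the same bit vectors."""
    seq = np.random.randint(0, 2, size=(length, width))
    x = np.zeros((2 * length + 1, width + 1))
    y = np.zeros((2 * length + 1, width))
    x[:length, :width] = seq      # present the sequence
    x[length, width] = 1          # EOS marker on the extra channel
    y[length + 1:, :] = seq       # reproduce the sequence after the EOS
    return x, y
```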

SLIDE 22

Training

SLIDE 23

Copy task

SLIDE 24

Copy task

SLIDE 25

Copy task

Length 120

SLIDE 26

Copy task

Length 150

SLIDE 27

Repeat Copy task

[Diagram: the input sequence is followed by a repeat count (here ×5); the target output is the sequence repeated that many times, ending with an EOS marker.]
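A corresponding sketch for the repeat-copy task, where the repeat count is presented on an extra input channel and the target ends with an end-of-output marker; the encoding details (for example, whether the count is normalized) are assumptions.

```python
import numpy as np

def repeat_copy_example(length, repeats, width=8):
    """Input: random bit vectors, then the repeat count on an extra channel.
    Target: the sequence repeated `repeats` times, then an end-of-output marker."""
    seq = np.random.randint(0, 2, size=(length, width))
    t_in = length + 1                      # sequence + repeat-count step
    t_out = repeats * length + 1           # repeated sequence + end marker
    x = np.zeros((t_in + t_out, width + 1))
    y = np.zeros((t_in + t_out, width + 1))
    x[:length, :width] = seq               # present the sequence
    x[length, width] = repeats             # repeat count on the extra channel
    y[t_in:t_in + repeats * length, :width] = np.tile(seq, (repeats, 1))
    y[-1, width] = 1                       # end-of-output marker
    return x, y
```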

SLIDE 28

Repeat Copy task

SLIDE 29

Repeat Copy task

SLIDE 30

Associative Recall task

[Diagram: the input is a list of items (short groups of bit vectors separated by delimiters) followed by a query item; the target output is the item that came right after the query in the input.]

SLIDE 31

Associative Recall task

SLIDE 32

Associative Recall task

SLIDE 33

Priority Sort task

SLIDE 34

bAbI tasks

SLIDE 35

bAbI tasks

[Diagram: the entities Mary, John, and Sandra moving between the bathroom, the garden, and the hallway, as described by the story below.]

Mary went to the garden
John went to the garden
Mary went back to the hallway
Sandra journeyed to the bathroom
John went to the hallway
Mary went to the bathroom

SLIDE 36

bAbI tasks

SLIDE 37

Conclusion

  • The NTM is able to learn algorithms only from examples
  • It shows better generalization performance than other recurrent architectures (LSTMs, for example)
  • Fully differentiable structure (drawback: generalization is still not quite perfect)
  • A new take on Artificial Intelligence: trying to teach machines to do things the same way we would learn them ourselves

  • Resources
  • Theano: http://deeplearning.net/software/theano/
  • Lasagne: http://lasagne.readthedocs.io/en/latest/
  • NTM-Lasagne: https://github.com/snipsco/ntm-lasagne

SLIDE 38

Thank you

@tristandeleu

June 23, 2016