SLIDE 1

Day 2 Lecture 6

Recurrent Neural Networks

Xavier Giró-i-Nieto

SLIDE 2

Acknowledgments

Santi Pascual

SLIDE 3

General idea

ConvNet (or CNN)

SLIDE 4

General idea

ConvNet (or CNN)

SLIDE 5

Multilayer Perceptron

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

The output depends ONLY on the current input.

SLIDE 6

Recurrent Neural Network (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

The hidden layers and the output depend on previous states of the hidden layers.

SLIDE 7

Recurrent Neural Network (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

The hidden layers and the output depend on previous states of the hidden layers.

SLIDE 8

Recurrent Neural Network (RNN)

[Figure: the RNN unrolled along time, rotated 90°: front view and side view]

SLIDE 9

Recurrent Neural Networks (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

Each node represents a layer of neurons at a single timestep.

[Figure: network unrolled over timesteps t-1, t, t+1]

SLIDE 10

Recurrent Neural Networks (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

[Figure: network unrolled over timesteps t-1, t, t+1]

The input is a SEQUENCE x(t) of any length.
SLIDE 11

Recurrent Neural Networks (RNN)

Common visual sequences: a still image, read as a spatial scan (zigzag, snake).

The input is a SEQUENCE x(t) of any length.
SLIDE 12

Recurrent Neural Networks (RNN)

Common visual sequences: a video, read as a temporal sampling of frames.

The input is a SEQUENCE x(t) of any length.

[Figure: frames sampled along time t]

SLIDE 13

Recurrent Neural Networks (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

Must learn the temporally shared weights w2 (the same w2 is applied at every timestep), in addition to w1 & w3.

[Figure: network unrolled over timesteps t-1, t, t+1]

SLIDE 14

Bidirectional RNN (BRNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

Must learn weights w2, w3, w4 & w5, in addition to w1 & w6.
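Without the figure the exact assignment of w1..w6 is ambiguous, but the generic BRNN formulation (a sketch in the spirit of Graves' book; A..F stand in for the slide's six weight matrices) has a forward and a backward hidden chain feeding a shared output:

  \overrightarrow{h}_t = f(A x_t + B \overrightarrow{h}_{t-1})
  \overleftarrow{h}_t = f(C x_t + D \overleftarrow{h}_{t+1})
  y_t = g(E \overrightarrow{h}_t + F \overleftarrow{h}_t)

The forward chain reads the sequence left to right, the backward chain right to left, and the output at each t sees both.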

SLIDE 15

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

Bidirectional RNN (BRNN)

SLIDE 16

Slide: Santi Pascual

Formulation: One hidden layer

Delay unit (z⁻¹)
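The formulas on this slide were embedded in the figure; as a sketch, the standard one-hidden-layer (Elman) recurrence they correspond to, with the delay unit z⁻¹ feeding the previous hidden state back in:

  h_t = \tanh(W x_t + U h_{t-1} + b)
  y_t = g(V h_t + c)

Here W, U and V are the input-to-hidden, recurrent (applied to the delayed state h_{t-1}) and hidden-to-output weight matrices; U is the matrix the later slides refer to.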

SLIDE 17

Slide: Santi Pascual

Formulation: Single recurrence

One-time Recurrence

SLIDE 18

Slide: Santi Pascual

Formulation: Multiple recurrences

Recurrence

One time-step recurrence vs. recurrence over T time steps
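The two expressions were images in the original deck; under the notation above, a sketch of the contrast:

  One step:  h_t = \tanh(W x_t + U h_{t-1} + b)
  T steps:   h_t = \tanh(W x_t + U \tanh(W x_{t-1} + U \tanh(\cdots + U h_{t-T}) + b) + b)

Unrolling substitutes the recurrence into itself T times, so h_{t-T} is reached only through T nested applications of U.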

SLIDE 19

Slide: Santi Pascual

RNN problems

Long-term memory vanishes because of the T nested multiplications by U.
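To make this concrete (a sketch that ignores the tanh Jacobians): the influence of the state T steps back passes through T copies of U,

  \frac{\partial h_t}{\partial h_{t-T}} \approx U^T

so if the largest singular value of U is below 1 this factor decays exponentially in T and the contribution of old inputs fades; above 1 it blows up.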

SLIDE 20

Slide: Santi Pascual

RNN problems

During training, gradients may explode or vanish because of the temporal depth. Example: back-propagation through time over 3 steps.
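Spelling out the 3-step example (a sketch; D_k = \mathrm{diag}(1 - h_k^2) is the Jacobian of the tanh at step k):

  \frac{\partial E_t}{\partial h_{t-3}} = \frac{\partial E_t}{\partial h_t} (D_t U)(D_{t-1} U)(D_{t-2} U)

Each backward step multiplies in another D U factor, so the gradient norm shrinks or grows roughly geometrically with the temporal depth.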

SLIDE 21

Long Short-Term Memory (LSTM)

SLIDE 22

Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9, no. 8 (1997): 1735-1780.

Long Short-Term Memory (LSTM)

SLIDE 23

Figure: Christopher Olah, “Understanding LSTM Networks” (2015)

Long Short-Term Memory (LSTM)

Based on a standard RNN whose neuron activates with tanh...
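In Olah's notation, the repeating module of this baseline RNN computes

  h_t = \tanh(W \cdot [h_{t-1}, x_t] + b)

where [h_{t-1}, x_t] denotes the concatenation of the previous hidden state and the current input.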

SLIDE 24

Long Short-Term Memory (LSTM)

C_t is the cell state, which flows through the entire chain...

Figure: Christopher Olah, “Understanding LSTM Networks” (2015)

SLIDE 25

Long Short-Term Memory (LSTM)

...and is updated with a sum instead of a product. This avoids memory vanishing and exploding/vanishing backprop gradients.

Figure: Christopher Olah, “Understanding LSTM Networks” (2015)

SLIDE 26

Long Short-Term Memory (LSTM)

Three gates, each governed by a sigmoid unit (bounded in [0, 1]), control the flow of information in and out of the cell.

Figure: Christopher Olah, “Understanding LSTM Networks” (2015)

SLIDE 27

Long Short-Term Memory (LSTM)

Forget Gate:

Concatenate
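The forget gate from the cited post reads the concatenated [h_{t-1}, x_t] and emits a mask in [0, 1] over each cell-state component:

  f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)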

Figure: Christopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes

SLIDE 28

Long Short-Term Memory (LSTM)

Input gate layer
New contribution to cell state

Classic neuron
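From the cited post: the input gate decides which cell components to write, while a classic tanh neuron proposes the candidate values:

  i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)
  \tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)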

Figure: Christopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes

SLIDE 29

Long Short-Term Memory (LSTM)

Update Cell State (memory):
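The update itself is additive rather than multiplicative:

  C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t

The forget gate scales the old memory and the gated candidate is added on top, which is what keeps gradients flowing along the cell state.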

Figure: Christopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes

SLIDE 30

Long Short-Term Memory (LSTM)

Output gate layer
Output to next layer
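Finally, the output gate filters a squashed copy of the cell state into the new hidden state:

  o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)
  h_t = o_t \odot \tanh(C_t)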

Figure: Christopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
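Putting slides 27-30 together, a minimal NumPy sketch of one LSTM timestep (illustrative only, not code from the lecture; the parameter names and sizes are assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM timestep, following the gate equations above."""
    W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o = params
    z = np.concatenate([h_prev, x_t])    # concatenate [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)         # forget gate: what to erase from memory
    i_t = sigmoid(W_i @ z + b_i)         # input gate: what to write
    c_tilde = np.tanh(W_c @ z + b_c)     # candidate cell values (classic neuron)
    c_t = f_t * c_prev + i_t * c_tilde   # additive cell-state update
    o_t = sigmoid(W_o @ z + b_o)         # output gate: what to expose
    h_t = o_t * np.tanh(c_t)             # hidden state for the next layer/timestep
    return h_t, c_t

# Toy usage: hidden size 4, input size 3, a length-5 random sequence.
rng = np.random.default_rng(0)
H, X = 4, 3
params = []
for _ in range(4):                       # one (W, b) pair per gate/candidate
    params += [0.1 * rng.standard_normal((H, H + X)), np.zeros(H)]
h, c = np.zeros(H), np.zeros(H)
for x in rng.standard_normal((5, X)):
    h, c = lstm_step(x, h, c, params)
print(h)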

SLIDE 31

Gated Recurrent Unit (GRU)

Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).

Similar performance to the LSTM, with less computation.
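For reference, the GRU equations from the cited paper (written in Olah's notation): a single update gate z_t plays the role of the LSTM's forget/input pair, and the cell and hidden states are merged:

  z_t = \sigma(W_z \cdot [h_{t-1}, x_t])
  r_t = \sigma(W_r \cdot [h_{t-1}, x_t])
  \tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t])
  h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t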

SLIDE 32

Applications: Machine Translation

Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).

Language IN → Language OUT

SLIDE 33

Applications: Image Classification

van den Oord, Aaron, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel Recurrent Neural Networks." arXiv preprint arXiv:1601.06759 (2016).

[Figure: Row LSTM and Diagonal BiLSTM scan patterns; classification results on MNIST]

SLIDE 34

Applications: Segmentation

Francesco Visin, Marco Ciccone, Adriana Romero, Kyle Kastner, Kyunghyun Cho, Yoshua Bengio, Matteo Matteucci, Aaron Courville, “ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation”. DeepVision CVPRW 2016.

SLIDE 35

Thanks! Q&A?

Follow me at

https://imatge.upc.edu/web/people/xavier-giro

@DocXavi /ProfessorXavi