SLIDE 1

Day 2 Lecture 6

Recurrent Neural Networks

Xavier Giró-i-Nieto

SLIDE 2

Acknowledgments

Santi Pascual

SLIDE 3

General idea

ConvNet (or CNN)

SLIDE 4

General idea

ConvNet (or CNN)

SLIDE 5

Multilayer Perceptron

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

The output depends ONLY on the current input.

SLIDE 6

Recurrent Neural Network (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

The hidden layers and the output depend on previous states of the hidden layers.

SLIDE 7

Recurrent Neural Network (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

The hidden layers and the output depend on previous states of the hidden layers.

SLIDE 8

Recurrent Neural Network (RNN)

[Figure: the RNN unrolled along time, rotated 90°: front view and side view]

SLIDE 9

Recurrent Neural Networks (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

Each node represents a layer of neurons at a single timestep.

[Figure: network unrolled over timesteps t-1, t, t+1]

SLIDE 10

Recurrent Neural Networks (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

[Figure: network unrolled over timesteps t-1, t, t+1]

The input is a SEQUENCE x(t) of any length.
SLIDE 11

Recurrent Neural Networks (RNN)

Common visual sequences: a still image, read as a spatial scan (zigzag, snake).

The input is a SEQUENCE x(t) of any length.
SLIDE 12

Recurrent Neural Networks (RNN)

Common visual sequences: a video, read as a temporal sampling of frames.

The input is a SEQUENCE x(t) of any length.

[Figure: frames sampled along time t]

SLIDE 13

Recurrent Neural Networks (RNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

Must learn the temporally shared weights w2 (the same w2 is applied at every timestep), in addition to w1 & w3.

[Figure: network unrolled over timesteps t-1, t, t+1]

SLIDE 14

Bidirectional RNN (BRNN)

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

Must learn weights w2, w3, w4 & w5, in addition to w1 & w6.
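Without the figure the exact assignment of w1..w6 is ambiguous, but the generic BRNN formulation (a sketch in the spirit of Graves' book; A..F stand in for the slide's six weight matrices) has a forward and a backward hidden chain feeding a shared output:

  \overrightarrow{h}_t = f(A x_t + B \overrightarrow{h}_{t-1})
  \overleftarrow{h}_t = f(C x_t + D \overleftarrow{h}_{t+1})
  y_t = g(E \overrightarrow{h}_t + F \overleftarrow{h}_t)

The forward chain reads the sequence left to right, the backward chain right to left, and the output at each t sees both.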

SLIDE 15

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”

Bidirectional RNN (BRNN)

SLIDE 16

Slide: Santi Pascual

Formulation: One hidden layer

Delay unit (z⁻¹)
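The formulas on this slide were embedded in the figure; as a sketch, the standard one-hidden-layer (Elman) recurrence they correspond to, with the delay unit z⁻¹ feeding the previous hidden state back in:

  h_t = \tanh(W x_t + U h_{t-1} + b)
  y_t = g(V h_t + c)

Here W, U and V are the input-to-hidden, recurrent (applied to the delayed state h_{t-1}) and hidden-to-output weight matrices; U is the matrix the later slides refer to.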

SLIDE 17

Slide: Santi Pascual

Formulation: Single recurrence

One-time Recurrence

SLIDE 18

Slide: Santi Pascual

Formulation: Multiple recurrences

Recurrence

One time-step recurrence vs. recurrence over T time steps
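The two expressions were images in the original deck; under the notation above, a sketch of the contrast:

  One step:  h_t = \tanh(W x_t + U h_{t-1} + b)
  T steps:   h_t = \tanh(W x_t + U \tanh(W x_{t-1} + U \tanh(\cdots + U h_{t-T}) + b) + b)

Unrolling substitutes the recurrence into itself T times, so h_{t-T} is reached only through T nested applications of U.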

SLIDE 19

Slide: Santi Pascual

RNN problems

Long-term memory vanishes because of the T nested multiplications by U.
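To make this concrete (a sketch that ignores the tanh Jacobians): the influence of the state T steps back passes through T copies of U,

  \frac{\partial h_t}{\partial h_{t-T}} \approx U^T

so if the largest singular value of U is below 1 this factor decays exponentially in T and the contribution of old inputs fades; above 1 it blows up.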

SLIDE 20

Slide: Santi Pascual

RNN problems

During training, gradients may explode or vanish because of the temporal depth. Example: back-propagation through time over 3 steps.
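Spelling out the 3-step example (a sketch; D_k = \mathrm{diag}(1 - h_k^2) is the Jacobian of the tanh at step k):

  \frac{\partial E_t}{\partial h_{t-3}} = \frac{\partial E_t}{\partial h_t} (D_t U)(D_{t-1} U)(D_{t-2} U)

Each backward step multiplies in another D U factor, so the gradient norm shrinks or grows roughly geometrically with the temporal depth.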

SLIDE 21

Long Short-Term Memory (LSTM)

SLIDE 22

Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9, no. 8 (1997): 1735-1780.

Long Short-Term Memory (LSTM)

SLIDE 23

Figure: Christopher Olah, “Understanding LSTM Networks” (2015)

Long Short-Term Memory (LSTM)

Based on a standard RNN whose neuron activates with tanh...
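In Olah's notation, the repeating module of this baseline RNN computes

  h_t = \tanh(W \cdot [h_{t-1}, x_t] + b)

where [h_{t-1}, x_t] denotes the concatenation of the previous hidden state and the current input.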

SLIDE 24

Long Short-Term Memory (LSTM)

C_t is the cell state, which flows through the entire chain...

Figure: Christopher Olah, “Understanding LSTM Networks” (2015)

SLIDE 25

Long Short-Term Memory (LSTM)

...and is updated with a sum instead of a product. This avoids memory vanishing and exploding/vanishing backprop gradients.

Figure: Christopher Olah, “Understanding LSTM Networks” (2015)

SLIDE 26

Long Short-Term Memory (LSTM)

Three gates, each governed by a sigmoid unit (bounded in [0, 1]), control the flow of information in and out of the cell.

Figure: Christopher Olah, “Understanding LSTM Networks” (2015)

SLIDE 27

Long Short-Term Memory (LSTM)

Forget Gate:

Concatenate
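The forget gate from the cited post reads the concatenated [h_{t-1}, x_t] and emits a mask in [0, 1] over each cell-state component:

  f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)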

Figure: Christopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes

SLIDE 28

Long Short-Term Memory (LSTM)

Input gate layer
New contribution to cell state

Classic neuron
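From the cited post: the input gate decides which cell components to write, while a classic tanh neuron proposes the candidate values:

  i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)
  \tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)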

Figure: Christopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes

SLIDE 29

Long Short-Term Memory (LSTM)

Update Cell State (memory):
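The update itself is additive rather than multiplicative:

  C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t

The forget gate scales the old memory and the gated candidate is added on top, which is what keeps gradients flowing along the cell state.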

Figure: Christopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes

SLIDE 30

Long Short-Term Memory (LSTM)

Output gate layer
Output to next layer
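Finally, the output gate filters a squashed copy of the cell state into the new hidden state:

  o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)
  h_t = o_t \odot \tanh(C_t)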

Figure: Christopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
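Putting slides 27-30 together, a minimal NumPy sketch of one LSTM timestep (illustrative only, not code from the lecture; the parameter names and sizes are assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM timestep, following the gate equations above."""
    W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o = params
    z = np.concatenate([h_prev, x_t])    # concatenate [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)         # forget gate: what to erase from memory
    i_t = sigmoid(W_i @ z + b_i)         # input gate: what to write
    c_tilde = np.tanh(W_c @ z + b_c)     # candidate cell values (classic neuron)
    c_t = f_t * c_prev + i_t * c_tilde   # additive cell-state update
    o_t = sigmoid(W_o @ z + b_o)         # output gate: what to expose
    h_t = o_t * np.tanh(c_t)             # hidden state for the next layer/timestep
    return h_t, c_t

# Toy usage: hidden size 4, input size 3, a length-5 random sequence.
rng = np.random.default_rng(0)
H, X = 4, 3
params = []
for _ in range(4):                       # one (W, b) pair per gate/candidate
    params += [0.1 * rng.standard_normal((H, H + X)), np.zeros(H)]
h, c = np.zeros(H), np.zeros(H)
for x in rng.standard_normal((5, X)):
    h, c = lstm_step(x, h, c, params)
print(h)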

SLIDE 31

Gated Recurrent Unit (GRU)

Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).

Similar performance to the LSTM, with less computation.
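For reference, the GRU equations from the cited paper (written in Olah's notation): a single update gate z_t plays the role of the LSTM's forget/input pair, and the cell and hidden states are merged:

  z_t = \sigma(W_z \cdot [h_{t-1}, x_t])
  r_t = \sigma(W_r \cdot [h_{t-1}, x_t])
  \tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t])
  h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t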

SLIDE 32

Applications: Machine Translation

Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).

Language IN → Language OUT

SLIDE 33

Applications: Image Classification

van den Oord, Aaron, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel Recurrent Neural Networks." arXiv preprint arXiv:1601.06759 (2016).

[Figure: Row LSTM and Diagonal BiLSTM scan patterns; classification results on MNIST]

SLIDE 34

Applications: Segmentation

Francesco Visin, Marco Ciccone, Adriana Romero, Kyle Kastner, Kyunghyun Cho, Yoshua Bengio, Matteo Matteucci, Aaron Courville, “ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation”. DeepVision CVPRW 2016.

SLIDE 35

Thanks! Q&A?

Follow me at

https://imatge.upc.edu/web/people/xavier-giro

@DocXavi /ProfessorXavi