ECE 6504: Deep Learning for Perception. Topics: LSTMs (intuition and variants)



SLIDE 1

ECE 6504: Deep Learning for Perception

Dhruv Batra Virginia Tech

Topics:

– LSTMs (intuition and variants)
– [Abhishek:] Lua / Torch Tutorial

SLIDE 2

Administrativia

  • HW3

– Out today
– Due in 2 weeks
– Please please please please please start early
– https://computing.ece.vt.edu/~f15ece6504/homework3/

(C) Dhruv Batra 2

SLIDE 3

RNN

  • Basic block diagram


Image Credit (slides 3-15): Christopher Olah (http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
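
The recurrence in this block diagram can be sketched in a few lines of numpy; the weight names (`Wxh`, `Whh`) and sizes below are illustrative, not from the slides.

```python
import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, bh):
    """One step of a vanilla RNN: new hidden state from input and previous state."""
    return np.tanh(x @ Wxh + h_prev @ Whh + bh)

# Tiny example: 3-dim input, 4-dim hidden state
rng = np.random.default_rng(0)
Wxh = rng.normal(size=(3, 4)) * 0.1
Whh = rng.normal(size=(4, 4)) * 0.1
bh = np.zeros(4)

h = np.zeros(4)
for x in rng.normal(size=(5, 3)):   # unroll the same cell over 5 time steps
    h = rnn_step(x, h, Wxh, Whh, bh)
```

The same weights are reused at every time step, which is exactly what makes the gradient flow analysis on the next slide matter.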

SLIDE 4

Key Problem

  • Learning long-term dependencies is hard

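
A quick numeric illustration of why: backprop through time multiplies one Jacobian per step, so when the recurrent weights have spectral norm below 1, the gradient contribution from distant steps decays geometrically. A sketch (the tanh derivative, which only shrinks things further, is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
W *= 0.9 / np.linalg.norm(W, 2)   # rescale so the largest singular value is 0.9

# Product of 50 per-step Jacobians, as in backprop through 50 time steps
grad = np.eye(8)
for _ in range(50):
    grad = grad @ W

# Spectral norm is at most 0.9**50 (about 0.005): the long-range signal vanishes
```

With norm above 1 the same product explodes instead, which is the other half of the problem.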
SLIDE 5

Meet LSTMs

  • How about we explicitly encode memory?

SLIDE 6

LSTMs Intuition: Memory

  • Cell State / Memory

SLIDE 7

LSTMs Intuition: Forget Gate

  • Should we continue to remember this “bit” of information or not?

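
The forget gate implements exactly this keep-or-drop decision: a sigmoid produces a value in (0, 1) per memory dimension, a soft "remember this bit" switch. The pre-activation values below are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One gate value per "bit" of memory: near 1 = keep, near 0 = drop
pre_activation = np.array([10.0, -10.0, 0.0])
f = sigmoid(pre_activation)
# f is approximately [1.0, 0.0, 0.5]: remember, forget, half-remember
```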
SLIDE 8

LSTMs Intuition: Input Gate

  • Should we update this “bit” of information or not?

– If so, with what?

SLIDE 9

LSTMs Intuition: Memory Update

  • Forget that + memorize this

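
"Forget that + memorize this" is one line of arithmetic: the forget gate scales the old cell state, the input gate scales the candidate memory, and the two are added. All values below are illustrative.

```python
import numpy as np

c_prev = np.array([1.0, -2.0, 0.5])    # old cell state
f = np.array([1.0, 0.0, 0.5])          # forget gate: keep, drop, half-remember
i = np.array([0.0, 1.0, 0.5])          # input gate: ignore, write, half-write
c_tilde = np.array([9.0, 3.0, 1.0])    # candidate new memory

c = f * c_prev + i * c_tilde
# c == [1.0, 3.0, 0.75]: first bit kept, second overwritten, third blended
```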
SLIDE 10

LSTMs Intuition: Output Gate

  • Should we output this “bit” of information to “deeper” layers?

SLIDE 12

LSTMs

  • A pretty sophisticated cell

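
The full cell combines the gates above into one forward step. A minimal numpy sketch, assuming the four gate pre-activations are computed from `[h_prev, x]` with a single stacked weight matrix `W` (names and shapes illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps [h_prev, x] to the four stacked gate
    pre-activations: forget, input, candidate, output."""
    hx = np.concatenate([h_prev, x])
    z = hx @ W + b
    H = h_prev.size
    f = sigmoid(z[0*H:1*H])          # forget gate
    i = sigmoid(z[1*H:2*H])          # input gate
    g = np.tanh(z[2*H:3*H])          # candidate memory
    o = sigmoid(z[3*H:4*H])          # output gate
    c = f * c_prev + i * g           # memory update: forget that + memorize this
    h = o * np.tanh(c)               # gated output to deeper layers / next step
    return h, c

rng = np.random.default_rng(0)
H, D = 4, 3
W = rng.normal(size=(H + D, 4 * H)) * 0.1
b = np.zeros(4 * H)
h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, b)
```

The additive memory update is the key: gradients can flow through `c` across many steps without being repeatedly squashed.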
SLIDE 13

LSTM Variants #1: Peephole Connections

  • Let gates see the cell state / memory

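
Concretely, each gate's pre-activation gains a term that reads the cell state. A sketch of a peephole forget gate with diagonal peephole weights `Pf` (all names illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
H, D = 4, 3
Wf = rng.normal(size=(H + D, H)) * 0.1
Pf = rng.normal(size=(H,)) * 0.1          # peephole: gate "peeks" at the memory
bf = np.zeros(H)

h_prev, x, c_prev = np.zeros(H), rng.normal(size=D), rng.normal(size=H)
f = sigmoid(np.concatenate([h_prev, x]) @ Wf + Pf * c_prev + bf)
```

Without the `Pf * c_prev` term this reduces to the standard forget gate above.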
SLIDE 14

LSTM Variants #2: Coupled Gates

  • Only memorize new if forgetting old

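
With coupled gates the input gate is not learned separately; it is tied to the forget gate as i = 1 - f, so the cell writes new content exactly where it drops old content. Illustrative values:

```python
import numpy as np

f = np.array([1.0, 0.0, 0.5])          # forget gate
c_prev = np.array([1.0, -2.0, 0.5])    # old cell state
c_tilde = np.array([9.0, 3.0, 1.0])    # candidate new memory

# Coupled update: memorize new only where old is forgotten
c = f * c_prev + (1.0 - f) * c_tilde
# c == [1.0, 3.0, 0.75]
```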
SLIDE 15

LSTM Variants #3: Gated Recurrent Units

  • Changes:

– No explicit memory; memory = hidden output
– Z = memorize new and forget old

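
A sketch of one GRU step under these changes: the hidden state doubles as the memory, and a single update gate z both forgets old and admits new (weight names illustrative).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, Wz, Wr, Wh):
    """One GRU step: no separate cell state."""
    hx = np.concatenate([h_prev, x])
    z = sigmoid(hx @ Wz)                                     # update gate
    r = sigmoid(hx @ Wr)                                     # reset gate
    h_tilde = np.tanh(np.concatenate([r * h_prev, x]) @ Wh)  # candidate state
    return (1.0 - z) * h_prev + z * h_tilde  # z = memorize new AND forget old

rng = np.random.default_rng(0)
H, D = 4, 3
Wz = rng.normal(size=(H + D, H)) * 0.1
Wr = rng.normal(size=(H + D, H)) * 0.1
Wh = rng.normal(size=(H + D, H)) * 0.1
h = gru_step(rng.normal(size=D), np.zeros(H), Wz, Wr, Wh)
```

Compared to the LSTM: one state instead of two, and two gates instead of three, because z plays the coupled forget/input role.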
SLIDE 16

RMSProp Intuition

  • Gradients ≠ direction to the optimum

– Gradients point in the direction of steepest ascent locally
– Not where we want to go long term

  • Mismatched gradient magnitudes

– Magnitude large = we should travel a small distance
– Magnitude small = we should travel a large distance


Image Credit: Geoffrey Hinton

SLIDE 17

RMSProp Intuition

  • Keep track of previous gradients to get an idea of magnitudes over the batch

  • Divide the gradient by this accumulated magnitude

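
These two ideas give the RMSProp update: accumulate a decaying average of squared gradients, then divide the step by its square root, so directions with large recent magnitudes take small steps and vice versa. A minimal sketch, with illustrative hyperparameter values:

```python
import numpy as np

def rmsprop_update(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    """RMSProp: running average of squared gradients, step divided by its root."""
    cache = decay * cache + (1.0 - decay) * grad**2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Minimize f(w) = sum(w**2); the gradient is 2w
w = np.array([5.0, -3.0])
cache = np.zeros_like(w)
for _ in range(2000):
    w, cache = rmsprop_update(w, 2.0 * w, cache)
# w ends up near the optimum at the origin
```

Note the effective step size is roughly `lr` regardless of the raw gradient magnitude, which is exactly the normalization the slide motivates.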