

SLIDE 1

Lecture 10 Recap

  • Prof. Leal-Taixé and Prof. Niessner

SLIDE 2

LeNet

  • Digit recognition: 10 classes
  • Conv -> Pool -> Conv -> Pool -> Conv -> FC
  • As we go deeper: width and height shrink, the number of filters grows
  • ~60k parameters (see the sketch below)
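
Below is a minimal sketch of this pattern in PyTorch, not the slide's own code: the layer sizes follow the classic LeNet-5 description and are assumptions here, but the parameter count does come out to roughly 60k.

```python
# Minimal LeNet-style sketch: Conv -> Pool -> Conv -> Pool -> Conv -> FC.
# Filter counts follow the classic LeNet-5 description (an assumption),
# input is a 1x32x32 grayscale digit image.
import torch
import torch.nn as nn

class LeNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(),    # 32x32x1 -> 28x28x6
            nn.AvgPool2d(2),                              # -> 14x14x6
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),   # -> 10x10x16
            nn.AvgPool2d(2),                              # -> 5x5x16
            nn.Conv2d(16, 120, kernel_size=5), nn.Tanh()  # -> 1x1x120
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, num_classes)                    # 10 digit classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))

print(sum(p.numel() for p in LeNet().parameters()))      # ~62k parameters
```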

SLIDE 3

AlexNet

[Krizhevsky et al. 2012]

  • Softmax for 1000 classes
SLIDE 4

VGGNet

[Simonyan and Zisserman 2014]

  • Striving for simplicity
  • CONV = 3x3 filters with stride 1, same convolutions
  • MAXPOOL = 2x2 filters with stride 2

SLIDE 5

VGGNet

[Figure: VGG architecture; Conv = 3x3, s = 1, same; Maxpool = 2x2, s = 2]

SLIDE 6

VGGNet

  • Conv -> Pool -> Conv -> Pool -> Conv -> FC
  • As we go deeper: width and height shrink, the number of filters grows
  • Called VGG-16: 16 layers that have weights
  • Large (138M parameters), but its simplicity makes it appealing (see the stage sketch below)
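
A sketch of the repeated VGG stage in PyTorch: stacks of same-3x3 convolutions followed by a 2x2 max pool. The channel progression (3 -> 64 -> 128) follows VGG-16; the helper name `vgg_stage` and showing only the first two stages are choices made here for illustration.

```python
# One VGG "stage": same-3x3 convolutions (stride 1, padding 1) + ReLU,
# closed by a 2x2 max pool with stride 2 that halves H and W.
import torch.nn as nn

def vgg_stage(in_ch, out_ch, num_convs):
    layers = []
    for i in range(num_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch,
                             kernel_size=3, stride=1, padding=1),  # "same" conv
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))  # halves width/height
    return nn.Sequential(*layers)

stem = nn.Sequential(vgg_stage(3, 64, 2),    # 224x224x3  -> 112x112x64
                     vgg_stage(64, 128, 2))  # 112x112x64 -> 56x56x128
```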

SLIDE 7

The problem of depth

  • As we add more and more layers, training becomes harder
  • Vanishing and exploding gradients
  • How can we train very deep nets?

SLIDE 8

Residual block

  • Two layers (input, linear, non-linearity):

x_L = f(W_L x_{L-1} + b_L)
x_{L+1} = f(W_{L+1} x_L + b_{L+1})

SLIDE 9

Residual block

  • Two layers

[Figure: residual block with a main path (two linear layers) from x_{L-1} through x_L to x_{L+1}, plus a skip connection from the input x_{L-1}]

SLIDE 10

Residual block

  • Two layers
  • Plain: x_{L+1} = f(W_{L+1} x_L + b_{L+1})
  • Residual: x_{L+1} = f(W_{L+1} x_L + b_{L+1} + x_{L-1})

SLIDE 11

Residual block

  • Two layers
  • Usually use a same convolution, since we need the same dimensions
  • Otherwise we need to convert the dimensions with a matrix of learned weights or with zero padding (see the sketch below)

[Figure: skip connection from x_{L-1} added (+) before x_{L+1}]
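
A minimal PyTorch sketch of the block just described: two same-3x3 convolutions on the main path, the skip connection added before the final non-linearity, and a learned 1x1 projection standing in for the "matrix of learned weights" when channel counts differ. The class name and layer sizes are illustrative assumptions.

```python
# Residual block sketch: out = f(conv(f(conv(x))) + skip(x)).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)  # same conv
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        # Identity skip if dimensions already match, otherwise a learned
        # 1x1 projection converts the channel count.
        self.skip = (nn.Identity() if in_ch == out_ch
                     else nn.Conv2d(in_ch, out_ch, 1))

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + self.skip(x))  # add skip before non-linearity
```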

SLIDE 12

Why do ResNets work?

  • The identity is easy for the residual block to learn
  • Guaranteed it will not hurt performance; it can only improve

[Figure: residual block as a small NN with a skip connection from x_{L-1} added (+) before x_{L+1}]

SLIDE 13

1x1 convolution

[Figure: a 5x5 image convolved with a 1x1 kernel of value 2]

What is the output size?

SLIDE 14

1x1 convolution

[Figure: the 1x1 kernel (value 2) applied to one entry of the 5x5 image: 5 * 2 = 10]
SLIDE 15

1x1 convolution

[Figure: the 1x1 kernel (value 2) applied entry-wise: 1 * 2 = 2; the full 5x5 output is the input with every value doubled]
SLIDE 16

1x1 convolution

[Figure: 5x5 input image and the resulting 5x5 output with every value doubled]

  • For one kernel (filter), it keeps the spatial dimensions and just scales the input by a number

SLIDE 17

Using 1x1 convolutions

  • Use it to shrink the number of channels
  • Further adds a non-linearity → one can learn more complex functions (see the sketch below)

[Figure: 32x32x200 input → Conv 1x1x200 (32 filters) + ReLU → 32x32x32 output]
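
A quick sketch of this channel reduction in PyTorch; the tensor sizes match the slide (200 channels shrunk to 32 at 32x32 resolution), everything else is illustrative.

```python
# Shrinking channels with a 1x1 convolution + ReLU:
# spatial size unchanged, channels 200 -> 32.
import torch
import torch.nn as nn

x = torch.randn(1, 200, 32, 32)                        # NCHW input
shrink = nn.Sequential(nn.Conv2d(200, 32, kernel_size=1), nn.ReLU())
print(shrink(x).shape)                                 # torch.Size([1, 32, 32, 32])
```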

SLIDE 18

Inception layer

  • Tired of choosing filter sizes?
  • Use them all!
  • All same convolutions
  • 3x3 max pooling with stride 1
SLIDE 19

Inception layer: computational cost

[Figure: 32x32x200 input → Conv 1x1 (16 filters) + ReLU → 32x32x16 → Conv 5x5 (92 filters) + ReLU → 32x32x92]

Multiplications: 1·1·200·32·32·16 + 5·5·16·32·32·92 ≈ 40 million

Reduction of multiplications by ~1/10 compared to applying the 5x5 convolution directly (worked out below)
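
The arithmetic behind these numbers, spelled out in Python. The ~471M baseline for the direct 5x5 convolution is computed here rather than taken from the slide; each output value of a KxK convolution over C_in channels costs K·K·C_in multiplications.

```python
# Cost of the direct 5x5 convolution vs. the 1x1 bottleneck version.
direct = 5 * 5 * 200 * 32 * 32 * 92           # 5x5 conv straight on 200 channels
bottleneck = (1 * 1 * 200 * 32 * 32 * 16      # 1x1 conv down to 16 channels
              + 5 * 5 * 16 * 32 * 32 * 92)    # 5x5 conv up to 92 channels
print(f"{direct:,} vs {bottleneck:,}")        # 471,040,000 vs 40,960,000, ~1/10
```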

SLIDE 20

Inception layer

SLIDE 21

Semantic Segmentation (FCN)

[Long et al. 2015] Fully Convolutional Networks for Semantic Segmentation (FCN)

SLIDE 22

Transfer learning

[Figure: network trained on ImageNet; early layers FROZEN, last layers re-TRAINed on a new dataset with C classes]

Donahue 2014, Razavian 2014
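
A minimal transfer-learning sketch in PyTorch following the figure: freeze the pretrained feature extractor and train only a new head. The choice of torchvision's resnet18 as the ImageNet-pretrained backbone is an assumption for illustration.

```python
# Transfer learning: frozen ImageNet backbone, new trainable head for C classes.
import torch.nn as nn
from torchvision import models

C = 10                                           # classes in the new dataset
model = models.resnet18(pretrained=True)         # trained on ImageNet
for p in model.parameters():
    p.requires_grad = False                      # FROZEN feature extractor
model.fc = nn.Linear(model.fc.in_features, C)    # new last layer: TRAINed
```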

SLIDE 23

Now you are:

  • Ready to perform image classification on any dataset
  • Ready to design your own architecture
  • Ready to deal with other problems, such as semantic segmentation (Fully Convolutional Networks)

SLIDE 24

Recurrent Neural Networks

SLIDE 25

RNNs are flexible

Classic neural networks for image classification

SLIDE 26

RNNs are flexible

Image captioning

SLIDE 27

RNNs are flexible

Language recognition

SLIDE 28

RNNs are flexible

Machine translation

SLIDE 29

RNNs are flexible

Event classification

SLIDE 30

Basic structure of an RNN

  • Multi-layer RNN

[Figure: inputs, hidden states, and outputs of a multi-layer RNN]

SLIDE 31

Basic structure of an RNN

  • Multi-layer RNN
  • The hidden state will have its own internal dynamics → a more expressive model!

[Figure: inputs, hidden states, and outputs of a multi-layer RNN]

SLIDE 32

Basic structure of an RNN

  • We want to have a notion of "time" or "sequence"

[Figure: the hidden state is computed from the current input and the previous hidden state]

[Christopher Olah] Understanding LSTMs

SLIDE 33

Basic structure of an RNN

  • We want to have a notion of "time" or "sequence"

[Figure: hidden-state update, with the parameters to be learned highlighted]

SLIDE 34

Basic structure of an RNN

  • We want to have a notion of "time" or "sequence"

[Figure: hidden state and output; note: non-linearities ignored for now]

SLIDE 35

Basic structure of an RNN

  • We want to have a notion of "time" or "sequence"
  • Same parameters for each time step = generalization!

[Figure: hidden state and output]

SLIDE 36

Basic structure of an RNN

  • Unrolling RNNs: the hidden state is the same

[Christopher Olah] Understanding LSTMs

SLIDE 37

Basic structure of an RNN

  • Unrolling RNNs

[Christopher Olah] Understanding LSTMs

SLIDE 38

Basic structure of an RNN

  • Unrolling RNNs as feedforward nets: the weights are the same! (see the sketch below)

[Figure: unrolled RNN as a feedforward net; the same weights w1, w2, w3, w4 are reused at time steps t, t+1, t+2 for the inputs x¹ₜ, x²ₜ, ...]
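
A small NumPy sketch of this unrolling, with the non-linearity omitted as on the earlier slides. The names theta_c, theta_x, theta_h and all dimensions are illustrative assumptions; the point is that the same three matrices are reused at every time step.

```python
# Vanilla RNN unrolled over a sequence: same parameters at each step.
import numpy as np

rng = np.random.default_rng(0)
theta_c = rng.normal(size=(4, 4)) * 0.1   # hidden-to-hidden weights
theta_x = rng.normal(size=(4, 3)) * 0.1   # input-to-hidden weights
theta_h = rng.normal(size=(2, 4)) * 0.1   # hidden-to-output weights

A = np.zeros(4)                           # initial hidden state A_0
for x_t in rng.normal(size=(5, 3)):       # a sequence of 5 input vectors
    A = theta_c @ A + theta_x @ x_t       # same weights at every time step
    h = theta_h @ A                       # output at this time step
```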

SLIDE 39

Backprop through an RNN

  • Unrolling RNNs as feedforward nets
  • Chain rule, all the way back to t = 0
  • Add the derivatives at different time steps for each weight

SLIDE 40

Long-term dependencies

"I moved to Germany … so I speak German fluently"

SLIDE 41

Long-term dependencies

  • Simple recurrence: let us forget the input

A_t = θ^t A_0

The same weights are multiplied over and over again

SLIDE 42

Long-term dependencies

  • Simple recurrence: A_t = θ^t A_0
  • What happens to small weights? Vanishing gradient
  • What happens to large weights? Exploding gradient

SLIDE 43

Long-term dependencies

  • Simple recurrence: A_t = θ^t A_0
  • If θ admits an eigendecomposition θ = Q Λ Q^⊤: the diagonal of Λ holds the eigenvalues, and Q is the matrix of eigenvectors

SLIDE 44

Long-term dependencies

  • Simple recurrence: A_t = θ^t A_0
  • If θ admits an eigendecomposition with orthogonal Q, we can simplify the recurrence:

A_t = Q Λ^t Q^⊤ A_0

SLIDE 45

Long-term dependencies

  • Simple recurrence: A_t = Q Λ^t Q^⊤ A_0
  • What happens to eigenvalues with magnitude less than one? Vanishing gradient
  • What happens to eigenvalues with magnitude larger than one? Exploding gradient → gradient clipping (numeric illustration below)
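
A tiny NumPy illustration of this effect; the eigenvalues 0.9 and 1.1 are values chosen here for illustration.

```python
# A_t = theta^t A_0: eigenvalues below 1 vanish, above 1 explode as t grows.
import numpy as np

A0 = np.ones(2)
theta = np.diag([0.9, 1.1])                  # eigenvalues 0.9 and 1.1
for t in (10, 50, 100):
    print(t, np.linalg.matrix_power(theta, t) @ A0)
# 0.9**100 ≈ 2.7e-5 (vanishing), 1.1**100 ≈ 1.4e4 (exploding)
```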

SLIDE 46

Long-term dependencies

  • Simple recurrence: A_t = θ^t A_0
  • Let us just make a matrix with eigenvalues = 1
  • This allows the cell to maintain its "state"

slide-47
SLIDE 47

Va Vanishing gradient

  • 1. From the weights
  • 2. From the activation functions (tanh)

At = θtA0

  • Prof. Leal-Taixé and Prof. Niessner

47

SLIDE 48

Vanishing gradient

  • 1. From the weights: A_t = θ^t A_0
  • 2. From the activation functions (tanh): the derivative of tanh is at most 1

SLIDE 49

Long Short-Term Memory

Hochreiter and Schmidhuber 1997

SLIDE 50

Long Short-Term Memory Units

  • A simple RNN has tanh as its non-linearity

SLIDE 51

Long Short-Term Memory Units

  • LSTM

SLIDE 52

Long Short-Term Memory Units

  • Key ingredients
  • Cell = transports the information through the unit

SLIDE 53

Long Short-Term Memory Units

  • Key ingredients
  • Cell = transports the information through the unit
  • Gate = removes or adds information to the cell state (sigmoid)

SLIDE 54

LSTM: step by step

  • Forget gate: decides when to erase the cell state
  • Sigmoid = output between 0 (forget) and 1 (keep)

SLIDE 55

LSTM: step by step

  • Input gate: decides which values will be updated
  • New cell state candidate: output from a tanh, in (-1, 1)

SLIDE 56

LSTM: step by step

  • Element-wise operations

SLIDE 57

LSTM: step by step

  • Output gate: decides which values will be outputted
  • Output from a tanh, in (-1, 1)

SLIDE 58

LSTM: step by step

  • Forget gate: f_t = σ(θ_xf x_t + θ_hf h_{t-1} + b_f)
  • Input gate: i_t = σ(θ_xi x_t + θ_hi h_{t-1} + b_i)
  • Output gate: o_t = σ(θ_xo x_t + θ_ho h_{t-1} + b_o)
  • Cell update: g_t = tanh(θ_xg x_t + θ_hg h_{t-1} + b_g)
  • Cell: C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t
  • Output: h_t = o_t ⊙ tanh(C_t)
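
One LSTM step written out in NumPy following the equations above; the function name, the parameter dictionary p, and its key names are illustrative assumptions.

```python
# Single LSTM step: sigmoid gates, tanh candidate, element-wise products.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev, p):
    f = sigmoid(p["xf"] @ x + p["hf"] @ h_prev + p["bf"])  # forget gate
    i = sigmoid(p["xi"] @ x + p["hi"] @ h_prev + p["bi"])  # input gate
    o = sigmoid(p["xo"] @ x + p["ho"] @ h_prev + p["bo"])  # output gate
    g = np.tanh(p["xg"] @ x + p["hg"] @ h_prev + p["bg"])  # candidate values
    C = f * C_prev + i * g                                 # cell update
    h = o * np.tanh(C)                                     # output
    return h, C
```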

SLIDE 59

LSTM: vanishing gradients?

  • Along the cell state path:
  • 1. From the weights: the forget gate can stay ≈ 1 for important information
  • 2. From the activation functions: the cell state passes through an identity function

SLIDE 60

Long Short-Term Memory Units

  • A highway for the gradient to flow

SLIDE 61

RNNs in Computer Vision

  • Caption generation

Xu et al. 2015

SLIDE 62

RNNs in Computer Vision

  • Caption generation
  • Focus is shifted to different parts of the image

Xu et al. 2015

SLIDE 63

RNNs in Computer Vision

  • Instance segmentation

Romera-Paredes et al. 2015

SLIDE 64

Final exam

SLIDE 65

Final exam

  • Multiple-choice questions
  • A series of questions with free-form answers
  • There can be questions related to the exercises → if you did the exercises, it will be easier for you to answer them

SLIDE 66

Final exam

  • Must-know topics:

– Basics of ML → from linear classifiers to NNs
– Optimization schemes (not necessary to know all the formulas, but have a good understanding of the differences between them and their behavior)
– Backpropagation: concept, math; hint: be fluent at computing backprop by hand
– Loss functions and activation functions
– CNNs: convolution, backprop
– RNNs, LSTMs

SLIDE 67

Admin

  • Exam date: July 16th at 08:00
  • There will NOT be a retake exam
  • No cheat sheet or calculator during the exam

SLIDE 68

Next semesters: new DL courses

SLIDE 69

Deep Learning at TUM

  • Keep expanding the courses on Deep Learning
  • This Introduction to Deep Learning course is the basis for a series of advanced DL lectures on different topics
  • Advanced topics are typically only for Master students

SLIDE 70

Deep Learning at TUM

  • Intro to Deep Learning
  • DL for Physics (Thuerey)
  • DL for Vision (Niessner, Leal-Taixé)
  • DL for Medical Applications (Menze)
  • DL in Robotics (Bäuml)
  • Machine Learning (Günnemann)

SLIDE 71

Advanced DL for Computer Vision

  • Deep Learning for Vision (WS18/19): syllabus

– Advanced architectures, e.g. Siamese neural networks
– Variational Autoencoders
– Generative models, e.g. GANs
– Multi-dimensional CNNs
– Bayesian Deep Learning

SLIDE 72

Advanced DL for Computer Vision

  • Deep Learning for Vision (WS18/19)

– 2V + 5P
– Must have attended the Intro to DL
– The practical part is a project that will last the whole semester
– Please do not sign up unless you are willing to spend a lot of time on the project!

SLIDE 73

Detection, Segmentation and Tracking

  • New lecture (Prof. Leal-Taixé, SS19)

– Must have attended the Intro to DL
– Common detection and segmentation frameworks (YOLO, Faster R-CNN, Mask R-CNN)
– Extension to videos → tracking
– One project that will last the whole semester

SLIDE 74

Thank you

Visual Computing (Niessner):
  • 3D scanning
  • DL in 3D understanding
  • 3D reconstruction

Dynamic Vision and Learning (Leal-Taixé):
  • Video segmentation
  • Object tracking
  • Camera localization