Making Convolutional Networks Shift-Invariant Again - Richard Zhang

slide-1
SLIDE 1

Making Convolutional Networks Shift-Invariant Again

Richard Zhang

Adobe Research

slide-2
SLIDE 2

Example classifications

[Figure: two panels, each labeled P(correct class)]

slide-4
SLIDE 4

Deep Networks are not Shift-Invariant

Azulay and Weiss. Why do deep convolutional networks generalize so poorly to small image transformations? In ArXiv, 2018.

Engstrom, Tsipras, Schmidt, Madry. Exploring the Landscape of Spatial Robustness. In ICML, 2019.

[Figure: two panels, each labeled P(correct class)]

slide-8
SLIDE 8

Why is shift-invariance lost?

“Convolutions are shift-equivariant”

“Pooling builds up shift-invariance”

…but striding ignores the Nyquist sampling theorem and aliases
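
To make the aliasing claim concrete, here is a minimal NumPy sketch (not from the slides). It assumes a 1-D signal at the highest frequency a sampled signal can hold, circular shifts, and a [1, 2, 1]/4 kernel as the low-pass filter; the helper name `blur` is illustrative.

```python
import numpy as np

def blur(x):
    """Low-pass filter with kernel [1, 2, 1] / 4 and circular padding (an assumed choice)."""
    return (np.roll(x, 1) + 2 * x + np.roll(x, -1)) / 4

# Alternating signal: the highest frequency a sampled signal can hold.
x = np.tile([1.0, -1.0], 4)      # [ 1, -1,  1, -1,  1, -1,  1, -1]
x_shifted = np.roll(x, 1)        # same signal, shifted by one sample

# Stride-2 subsampling without a filter: a one-sample shift flips every output.
naive = x[::2]                   # all +1
naive_shifted = x_shifted[::2]   # all -1

# Blurring first removes the frequency that stride 2 cannot represent.
smooth = blur(x)[::2]                  # all 0
smooth_shifted = blur(x_shifted)[::2]  # all 0
```

Without the filter, the subsampled output depends entirely on which samples the stride happens to land on. With the filter, both shifts give the same output.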

slide-28
SLIDE 28

Re-examining Max-Pooling

Max-pooling breaks shift-equivariance
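
A small worked example may help here. The sketch below applies max-pooling with kernel 2 and stride 2 to a toy 1-D signal and to the same signal shifted by one sample. The signal, the circular padding, and the name `max_pool_1d` are choices made for illustration, not details taken from the slides.

```python
import numpy as np

def max_pool_1d(x, kernel=2, stride=2):
    """1-D max-pooling with circular padding: dense max, then keep every `stride`-th value."""
    dense = np.max([np.roll(x, -i) for i in range(kernel)], axis=0)
    return dense[::stride]

x = np.array([0, 0, 1, 1, 0, 0, 1, 1])

out = max_pool_1d(x)                      # [0, 1, 0, 1]
out_shifted = max_pool_1d(np.roll(x, 1))  # [1, 1, 1, 1]
```

A one-sample shift of the input turns an alternating output into a constant one. No shift of the first output produces the second, so the layer is not shift-equivariant.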

slide-29
SLIDE 29

Shift-equivariance in VGG

  • CIFAR
  • VGG network
  • 5 max-pools
  • Test shift-equivariance condition

[Figure: network layers: pixels (32x32), conv1, pool1, conv2, pool2, conv3, pool3, conv4, pool4, conv5, pool5 (1x1), classifier, softmax]
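
One way such a test could be set up is sketched below, in 1-D with NumPy. The condition checked is Shift(F(x)) = F(Shift(x)). Circular shifts, nearest-neighbour upsampling of strided outputs, and all function names are assumptions of this sketch, not the procedure used in the talk.

```python
import numpy as np

def circular_conv(x, kernel):
    """Stride-1 circular cross-correlation; output length equals input length."""
    return sum(w * np.roll(x, -i) for i, w in enumerate(kernel))

def max_pool(x, kernel=2, stride=2):
    """Circular max-pooling: dense max followed by subsampling."""
    dense = np.max([np.roll(x, -i) for i in range(kernel)], axis=0)
    return dense[::stride]

def equivariance_gap(layer, x, shift, stride=1):
    """Mean |Shift(F(x)) - F(Shift(x))| at input resolution.

    Outputs of a strided layer are upsampled by nearest neighbour
    (an assumption made here so both sides have the same length).
    """
    shifted_output = np.roll(np.repeat(layer(x), stride), shift)
    output_of_shifted = np.repeat(layer(np.roll(x, shift)), stride)
    return float(np.mean(np.abs(shifted_output - output_of_shifted)))

rng = np.random.default_rng(0)
x = rng.standard_normal(32)  # stand-in for one row of a 32x32 image

conv = lambda v: circular_conv(v, [0.25, 0.5, 0.25])
conv_gap = equivariance_gap(conv, x, shift=1)                     # zero gap
pool_gap_odd = equivariance_gap(max_pool, x, shift=1, stride=2)   # nonzero gap
pool_gap_even = equivariance_gap(max_pool, x, shift=2, stride=2)  # zero gap
```

In this sketch the convolution commutes with every shift, while max-pooling does so only when the shift is a multiple of its stride.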

slide-32
SLIDE 32

Shift-equivariance, per layer

[Figure: shift-equivariance at each layer (pixels, conv1, pool1, conv2, pool2, conv3, pool3, conv4, pool4, conv5, pool5, classifier, softmax), on a scale from "Perfect shift-eq." to "Large deviation from shift-eq."; conv1 highlighted]

Convolution is shift-equivariant

slide-34
SLIDE 34

Shift-equivariance, per layer

[Figure: shift-equivariance at each layer (pixels, conv1, pool1, conv2, pool2, conv3, pool3, conv4, pool4, conv5, pool5, classifier, softmax), on a scale from "Perfect shift-eq." to "Large deviation from shift-eq."; pool1 highlighted]

Pooling breaks shift-equivariance

slide-38
SLIDE 38

Shift-equivariance, per layer

[Figure: shift-equivariance at each layer (pixels, conv1, pool1, conv2, pool2, conv3, pool3, conv4, pool4, conv5, pool5, classifier, softmax), on a scale from "Perfect shift-eq." to "Large deviation from shift-eq."; pool4 highlighted]

Nyquist theorem ignored when pooling; aliasing breaks shift-equivariance

slide-41
SLIDE 41

Alternative downsampling methods

  • Blur+subsample
      • Antialiasing in signal processing; image processing; graphics
  • Max-pooling
      • Performs better in deep learning applications [Scherer 2010]

Reconcile antialiasing with max-pooling

slide-50
SLIDE 50

Baseline (MaxPool): heavy aliasing

  • (1) Max (dense evaluation): no aliasing
  • (2) Subsampling: heavy aliasing

Anti-aliased (MaxBlurPool)

  • (1) Max (dense evaluation): no aliasing
  • (2) Anti-aliasing filter (Blur): no aliasing
  • (3) Subsampling: reduced aliasing

Steps (2) and (3) are evaluated together as “BlurPool”
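
The decomposition above can be sketched in a few lines of NumPy. This is an illustrative 1-D version, not the released implementation: the [1, 2, 1]/4 blur kernel, circular padding, the toy signal, and the function names are all assumptions.

```python
import numpy as np

def dense_max(x, kernel=2):
    """Step (1): max over a sliding window, evaluated at every position (stride 1)."""
    return np.max([np.roll(x, -i) for i in range(kernel)], axis=0)

def blur(x):
    """Step (2): anti-aliasing filter; the [1, 2, 1] / 4 kernel is an assumed choice."""
    return (np.roll(x, 1) + 2 * x + np.roll(x, -1)) / 4

def max_pool(x, stride=2):
    """Baseline: dense max, then subsampling."""
    return dense_max(x)[::stride]

def max_blur_pool(x, stride=2):
    """Anti-aliased: dense max, then blur and subsampling (together, BlurPool)."""
    return blur(dense_max(x))[::stride]

x = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0])
x_shifted = np.roll(x, 1)

# Baseline outputs: [0, 1, 0, 1] vs [1, 1, 1, 1]
baseline = max_pool(x), max_pool(x_shifted)
# Anti-aliased outputs: [0.5, 1, 0.5, 1] vs [0.75, 0.75, 0.75, 0.75]
antialiased = max_blur_pool(x), max_blur_pool(x_shifted)
```

On this signal the two baseline outputs differ by up to 1, while the two anti-aliased outputs differ by at most 0.25. That matches the "reduced aliasing" label: the gap shrinks but does not vanish.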

slide-51
SLIDE 51

Antialiasing any downsampling layer

  • Max Pool
      • VGG, AlexNet
  • Strided Convolution
      • ResNet, MobileNetV2
  • Average Pool
      • DenseNet
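
As a rough sketch of how the same recipe might carry over to each layer type listed above, the NumPy code below pairs each 1-D downsampling layer with an anti-aliased counterpart. The layer is evaluated at stride 1, and the subsampling is preceded by a blur. The function names, the example convolution kernel, the [1, 2, 1]/4 blur, and circular padding are assumptions for illustration. For average pooling, the blur stands in for the box filter.

```python
import numpy as np

def blur_pool(x, stride=2):
    """Blur with [1, 2, 1] / 4 (circular padding), then subsample."""
    return ((np.roll(x, 1) + 2 * x + np.roll(x, -1)) / 4)[::stride]

def conv_relu(x, kernel, stride=1):
    """Circular cross-correlation followed by ReLU, kept every `stride` samples."""
    y = sum(w * np.roll(x, -i) for i, w in enumerate(kernel))
    return np.maximum(y, 0)[::stride]

def dense_max(x, kernel=2):
    """Sliding-window max evaluated at every position."""
    return np.max([np.roll(x, -i) for i in range(kernel)], axis=0)

KERNEL = [1.0, -0.5, 0.5]  # arbitrary example weights (hypothetical)

# Each entry: (original layer, anti-aliased counterpart).
layers = {
    "max_pool": (lambda x: dense_max(x)[::2],
                 lambda x: blur_pool(dense_max(x))),
    "strided_conv": (lambda x: conv_relu(x, KERNEL, stride=2),
                     lambda x: blur_pool(conv_relu(x, KERNEL))),
    "avg_pool": (lambda x: ((x + np.roll(x, -1)) / 2)[::2],
                 lambda x: blur_pool(x)),
}

x = np.full(8, 3.0)  # constant input: nothing to alias
results = {name: (orig(x), anti(x)) for name, (orig, anti) in layers.items()}
```

On a constant input both versions of every layer agree, because the blur kernel sums to one. They differ only on signals with high-frequency content.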

slide-60
SLIDE 60

ImageNet

[Figure: axes: Shift-invariance, Accuracy; series: Baseline, Antialiased]

Antialiasing also improves accuracy
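
The slide text does not say how shift-invariance is scored. One plausible measure, assumed here, is classification consistency: how often a classifier gives the same label to two random shifts of the same input. The sketch below uses circular shifts of 1-D signals and two toy classifiers. The function `consistency` and both classifiers are hypothetical.

```python
import numpy as np

def consistency(predict, signals, n_pairs=50, seed=0):
    """Fraction of random shift pairs for which `predict` returns the same label."""
    rng = np.random.default_rng(seed)
    agree = 0
    for x in signals:
        for _ in range(n_pairs):
            s1, s2 = rng.integers(0, len(x), size=2)
            agree += predict(np.roll(x, s1)) == predict(np.roll(x, s2))
    return agree / (len(signals) * n_pairs)

# Two toy classifiers on 1-D signals (hypothetical, for illustration only).
invariant = lambda x: int(x.max() > 1.0)     # ignores where the peak is
sensitive = lambda x: int(np.argmax(x) % 2)  # depends on the peak's position

signals = np.random.default_rng(1).standard_normal((4, 8))
score_invariant = consistency(invariant, signals)
score_sensitive = consistency(sensitive, signals)
```

A fully shift-invariant classifier scores 1.0 under this measure. The position-dependent classifier disagrees with itself whenever the two shifts differ by an odd amount.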

slide-68
SLIDE 68

Discussion

Striding aliases (stride=2)

Add antialiasing filter

+ Improved shift-equivariance
+ Improved accuracy

Additionally

+ Improved stability to other perturbations
+ Improved robustness

Antialiasing code, pretrained models

https://richzhang.github.io/antialiased-cnns/

Thank you!