Robustness and geometry of deep neural networks - Alhussein Fawzi (PowerPoint presentation)



slide-1
SLIDE 1

Robustness and geometry of deep neural networks

Alhussein Fawzi DeepMind May 23rd 2019 The Mathematics of Deep Learning and Data Science University of Cambridge

1

slide-2
SLIDE 2

Recent advances in machine learning

He et al., “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification”, 2015; Karpathy et al., “Automated Image Captioning with ConvNets and Recurrent Nets”; LeCun et al., “Deep Learning”, 2015

[Figure: ImageNet classification error rate (%) over recent years; AlphaGo image from DeepMind, https://deepmind.com/research/alphago/]

2


slide-9
SLIDE 9

Robustness of classifiers to perturbations

In real world environments, images undergo perturbations.

[Figure: an image of a lampshade; the classifier f outputs “Lampshade”. After a perturbation, does f still output “Lampshade”?]

Broad range of perturbations:

  • Adversarial perturbations [Szegedy et al., ICLR 2014], [Biggio et al., PKDD 2013], ...
  • Random noise [Fawzi et al., NIPS 2016], [Franceschi et al., AISTATS 2018]
  • Structured nuisances: geometric transformations [Bruna et al., TPAMI 2013], [Jaderberg et al., NIPS 2015], occlusions [Sharif et al., CCS 2016], etc.

3


slide-11
SLIDE 11

Robustness of classifiers to perturbations (Cont’d)

Safety of machine learning systems

Better understanding of the geometry of state-of-the-art classifiers.

[Figure: decision regions for Class 1 and Class 2.]

4

slide-12
SLIDE 12

Talk outline

1. Fooling classifiers is easy: vulnerability to different perturbations.
2. Improving the robustness (i.e., “defending”) is difficult.
3. Geometric analysis of a successful defense: adversarial training.

5


slide-17
SLIDE 17

Adversarial perturbations

State-of-the-art deep neural networks have been shown to be surprisingly unstable to adversarial perturbations.

[Figure: a school bus image plus an imperceptible perturbation is classified as an ostrich. Figure from [Szegedy et al., ICLR 2014].]

Adversarial examples are found by seeking the minimal perturbation (in the ℓ2 sense) that switches the label of the classifier.

6

slide-18
SLIDE 18

Adversarial perturbations

Robustness to adversarial noise

[Figure: a data point x and its minimal perturbation r∗(x) reaching the decision boundary.]

r∗(x) = min_r ‖r‖₂ subject to f(x + r) ≠ f(x).

7
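For an affine classifier f(x) = sign(w·x + b) this minimization has a closed form: project x onto the decision boundary along w. A minimal numpy sketch (the names `minimal_perturbation`, `w`, `b`, `x` and the overshoot constant are illustrative, not from the slides):

```python
import numpy as np

def minimal_perturbation(w, b, x, overshoot=1.001):
    """Closed-form minimal l2 perturbation flipping the affine
    classifier f(x) = sign(w.x + b): project x onto the boundary,
    with a tiny overshoot so the label actually changes."""
    margin = w @ x + b
    return -overshoot * (margin / (w @ w)) * w

rng = np.random.default_rng(0)
w, x = rng.normal(size=100), rng.normal(size=100)
b = 0.5

r = minimal_perturbation(w, b, x)
assert np.sign(w @ x + b) != np.sign(w @ (x + r) + b)
# ||r||_2 is (essentially) the distance to the boundary, |w.x + b| / ||w||_2
print(np.linalg.norm(r))
```

For deep networks there is no closed form; methods such as DeepFool iterate this projection on a local linearization of f.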

slide-19
SLIDE 19

Other types of adversarial perturbations

Universal perturbations [Moosavi-Dezfooli et al., 2017]

[Figure: a single universal perturbation r fools the classifier on images labeled Flagpole, Joystick, Chihuahua, Labrador, Terrier, and Balloon.]

Geometric transformations [Fawzi et al., 2015], [Moosavi-Dezfooli et al., 2018], [Xiao et al., 2018]

8

slide-20
SLIDE 20

Finding adversarial perturbations is easy...

http://robust.vision

9

slide-21
SLIDE 21

... but designing defense mechanisms is hard!

Despite the huge number of proposed defenses, state-of-the-art classifiers are still vulnerable to small perturbations.

10



slide-29
SLIDE 29

Adversarial training

Adversarial accuracy (CIFAR-10): [Madry et al., 2017]

Adversarial training leads to state-of-the-art robustness to adversarial perturbations. But what does it actually do?

11
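Adversarial training [Madry et al., 2017] solves a min-max problem: train on worst-case inputs found by projected gradient descent (PGD) inside an ℓ∞ ball. A minimal numpy sketch of the inner maximization, checked on a toy logistic loss (the helper names and constants are illustrative, not the authors' code):

```python
import numpy as np

def pgd_linf(grad_x, x, eps, alpha, steps):
    """PGD inner maximization: ascend the loss with sign-gradient
    steps, projecting back into the l_inf ball of radius eps."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_x(x_adv))  # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)        # projection
    return x_adv

# Toy check on a logistic loss l(x) = log(1 + exp(-y * w.x)):
w = np.array([1.0, -2.0, 0.5])
y = 1.0
loss = lambda x: np.log1p(np.exp(-y * (w @ x)))
grad = lambda x: -y * w / (1.0 + np.exp(y * (w @ x)))

x = np.array([1.0, 0.0, 0.0])
x_adv = pgd_linf(grad, x, eps=0.1, alpha=0.02, steps=10)
assert loss(x_adv) > loss(x)  # the attack increased the loss
```

Adversarial training then minimizes the loss at `x_adv` instead of `x` at every training step.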


slide-33
SLIDE 33

Decision boundaries

[Figure: decision boundary cross-sections along the adversarial direction and a random direction, after normal vs. adversarial training.]

After adversarial training, the decision boundaries are flatter and more regular.

12


slide-36
SLIDE 36

Effect of adversarial training on loss landscape

[Figure: loss landscape (logit vs. label) around a test point, before and after adversarial fine-tuning.]

13

slide-37
SLIDE 37

Effect of adversarial training on loss landscape (Cont’d)

14


slide-40
SLIDE 40

Quantitative analysis: curvature decrease with adversarial training

We compute the Hessian matrix of the loss ℓ at a test point x with respect to the inputs: H_ij = ∂²ℓ / (∂x_i ∂x_j).

  • The eigenvalues of H are the curvatures of ℓ in the vicinity of x.

[Figure: eigenvalue profile of H (value vs. eigenvalue number) for the original and the adversarially trained network; the adversarial profile is much flatter.]

15
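The eigenvalue profile can be computed numerically; a small sketch using central finite differences on a toy quadratic loss whose Hessian is known (the function names are illustrative, and a real network would use automatic differentiation instead):

```python
import numpy as np

def num_grad(loss, x, h=1e-5):
    """Central-difference gradient of a scalar loss wrt the input x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (loss(x + e) - loss(x - e)) / (2 * h)
    return g

def input_hessian(loss, x, h=1e-4):
    """Finite-difference Hessian H_ij = d2 loss / (dx_i dx_j)."""
    d = x.size
    H = np.zeros((d, d))
    for i in range(d):
        e = np.zeros(d)
        e[i] = h
        H[:, i] = (num_grad(loss, x + e) - num_grad(loss, x - e)) / (2 * h)
    return 0.5 * (H + H.T)  # symmetrize away numerical noise

# Toy quadratic loss 0.5 x^T A x: its input Hessian is exactly A,
# so the eigenvalue profile is the diagonal of A.
A = np.diag([3.0, 1.0, 0.1])
loss = lambda x: 0.5 * x @ A @ x
eigs = np.linalg.eigvalsh(input_hessian(loss, np.ones(3)))
print(eigs)  # close to [0.1, 1.0, 3.0]
```

Sorting the eigenvalues by magnitude gives exactly the kind of profile plotted on the slide.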


slide-47
SLIDE 47

Relation between curvature and robustness

Locally quadratic approximation of the loss function: we derive upper and lower bounds on the minimal perturbation required to fool a classifier.

[Figure: upper and lower bounds on the robustness as a function of the curvature, for a fixed loss threshold; smaller curvature implies larger robustness.]

16
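The one-dimensional case already gives the intuition behind these bounds: along the gradient, a locally quadratic loss ℓ(x) + g·r + (ν/2)·r² reaches the fooling threshold after a distance that shrinks as the curvature ν grows. A small sketch (the function name and the numbers are illustrative, not the bounds from the talk):

```python
import numpy as np

def fooling_distance(g, nu, gap):
    """Smallest r >= 0 with g*r + 0.5*nu*r**2 = gap: the distance
    along the gradient at which a locally quadratic loss rises by
    `gap`, i.e. crosses the fooling threshold."""
    if nu == 0.0:
        return gap / g  # flat loss: distance set by the gradient alone
    return (-g + np.sqrt(g * g + 2.0 * nu * gap)) / nu

# Same gradient and same loss gap, with increasing curvature:
dists = [fooling_distance(g=1.0, nu=nu, gap=1.0) for nu in (0.0, 1.0, 5.0, 20.0)]
print(dists)  # monotonically decreasing: lower curvature => more robust
```

This is why flattening the loss (the effect of adversarial training seen above) increases the distance to the nearest adversarial example.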


slide-53
SLIDE 53

How important is the curvature decrease?

Is the curvature decrease the main effect of adversarial training leading to improved robustness? → we regularize explicitly for the curvature.

Idea: regularize the norm of the Hessian of the loss with respect to the inputs, using Hutchinson’s estimator:

‖H‖²_F = E_{z∼N(0,I)} ‖Hz‖₂²

In practice:

  • Compute Hessian-vector products with finite differences.
  • Selectively sample directions corresponding to high curvature.

CURE: regularize using ℓr = ‖∇ℓ(x + hz) − ∇ℓ(x)‖²

17
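The estimator can be checked on a toy loss whose Hessian is known, using the finite-difference Hessian-vector product from the slide (the function names and sample count are illustrative; a real network would apply this to minibatch loss gradients):

```python
import numpy as np

def hvp_fd(grad, x, z, h=1e-4):
    """Hessian-vector product via finite differences of gradients:
    H z ~= (grad(x + h z) - grad(x - h z)) / (2 h)."""
    return (grad(x + h * z) - grad(x - h * z)) / (2.0 * h)

def frob_sq_hutchinson(grad, x, n_samples=2000, seed=0):
    """Hutchinson-style estimate of ||H||_F^2 = E_{z~N(0,I)} ||H z||_2^2."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        z = rng.normal(size=x.size)
        total += np.sum(hvp_fd(grad, x, z) ** 2)
    return total / n_samples

# Toy loss 0.5 x^T A x, so H = A and ||H||_F^2 = 4 + 1 + 0.25 = 5.25.
A = np.diag([2.0, 1.0, 0.5])
grad = lambda x: A @ x
est = frob_sq_hutchinson(grad, np.ones(3))
print(est)  # close to 5.25
```

Since each estimate only needs two extra gradient evaluations, the regularizer adds little overhead to a training step.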


slide-58
SLIDE 58

CUREing deep networks trained on CIFAR-10

Accuracy on clean samples: Adversarial accuracy:

[Moosavi-Dezfooli et al., Robustness via curvature regularization, and vice versa, CVPR 2019]

18

slide-59
SLIDE 59

Upper limits on adversarial robustness

Goal: examine the existence of upper bounds on the robustness to adversarial perturbations. Relate them to quantities that we understand and can measure better: the robustness to random noise. Comparing with the robustness to random noise quantifies the power of an adversary with access to the model vs. one with no knowledge of the model.

19

slide-60
SLIDE 60

From adversarial to random noise

Robustness to adversarial noise

[Figure: a data point x and its minimal perturbation r∗(x).]

min_r ‖r‖₂ subject to f(x + r) ≠ f(x).

20

slide-61
SLIDE 61

From adversarial to random noise

Robustness to random noise

[Figure: a data point x perturbed along a random direction v.]

min_t |t| subject to f(x + tv) ≠ f(x), with v uniformly sampled from the sphere S^{D−1}.

20

slide-62
SLIDE 62

Linear classifiers

Theorem (Fawzi et al., NIPS ’16; Franceschi et al., AISTATS ’18)

For affine classifiers, with high probability (over the choice of the random perturbation),

‖r∗‖₂ = Θ( (1/√D) ‖r∗_rand‖₂ ).

In high dimensions, there is a very large gap between the two robustness measures. When the data is bounded, we therefore typically get ‖r∗‖₂ = O(1/√D).

Achieving robustness in high dimensions is difficult!

21
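The Θ(√D) gap is easy to reproduce numerically for a linear classifier; a sketch (the construction, distances, and sample counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 1000
w = rng.normal(size=D)
w /= np.linalg.norm(w)

# Place x at adversarial distance exactly 1 from the boundary w.x = 0.
x = rng.normal(size=D)
x += (1.0 - w @ x) * w
r_adv = abs(w @ x)  # minimal adversarial perturbation norm = 1

def random_robustness(v):
    """Smallest |t| with sign(w.(x + t v)) != sign(w.x): the distance
    to the boundary along the direction v."""
    return abs(w @ x) / abs(w @ v)

vs = rng.normal(size=(500, D))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)  # uniform on the sphere
ratio = np.median([random_robustness(v) for v in vs]) / r_adv
print(ratio)  # large: the Theta(sqrt(D)) gap of the theorem
```

Random directions are nearly orthogonal to w in high dimensions, so the boundary is roughly √D times farther along them than along the worst-case direction.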


slide-67
SLIDE 67

Intuition

[Figure: decision boundary, data point x, and its minimal perturbation r∗(x); along a typical random direction, the distance to the boundary scales as √d · ‖r∗(x)‖₂.]

22

slide-68
SLIDE 68

Robustness and geometry of deep networks: SPM’17

23

slide-69
SLIDE 69

Conclusions

Very simple strategies fool classifiers: adversarial, geometric, and universal perturbations, ...

Defending against adversarial perturbations is difficult; most “defenses” make the classifier robust only against specific directions.

Adversarial training essentially reduces the curvature of the loss function, leading to increased robustness.

24

slide-70
SLIDE 70

References

Szegedy et al., Intriguing properties of neural networks, ICLR 2014
Fawzi et al., Manitest: Are classifiers really invariant?, BMVC 2015
Moosavi-Dezfooli et al., DeepFool: a simple and accurate method to fool deep neural networks, CVPR 2016
Fawzi et al., Robustness of classifiers: from adversarial to random noise, NIPS 2016
Moosavi-Dezfooli et al., Universal adversarial perturbations, CVPR 2017
Uesato et al., Adversarial risk and the dangers of evaluating against weak attacks, arXiv 2018
Fawzi et al., Adversarial vulnerability of any classifier, NeurIPS 2018
Moosavi-Dezfooli et al., Robustness via curvature regularization, and vice versa, CVPR 2019
Survey: Fawzi et al., Robustness of deep networks: a geometric perspective, IEEE Signal Processing Magazine, 2017

25