Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks


Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks. Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. May 24th, 2016 @ 37th IEEE Symposium on Security and Privacy. @NicolasPapernot


SLIDE 1

Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks

Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami

May 24th, 2016 @ 37th IEEE Symposium on Security and Privacy

@NicolasPapernot

SLIDE 2

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO). Output probabilities: p0=0.01, p1=0.93, p8=0.02, pN=0.01.]

SLIDE 3

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO). Output probabilities: p0=0.01, p1=0.02, p8=0.89, pN=0.01.]

SLIDE 4

SLIDE 5

Deep Learning for Classification

SLIDE 6

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO).]

SLIDE 7

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO).]

SLIDE 8

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO).]

SLIDE 9

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO).]

SLIDE 10

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO).]

SLIDE 11

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO).]

SLIDE 12

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO).]

SLIDE 13

[Figure: neural network diagram. Input layer (M components), hidden layers (e.g., convolutional, rectified linear, …), output layer (N components). Legend: neuron; weighted link (weight is a parameter part of θO). Output probabilities: p0=0.01, p1=0.93, p8=0.02, pN=0.01.]

SLIDE 14

[Figure: pipeline from audio to meaning. Representations: Audio, Frame, State, Phoneme, Word, Sentence, Meaning. Stages: Feature Extraction, Acoustic Model, Decision Trees, Lexicon, Language Model, NLP.]

Source: Tara N. Sainath, Google @ ICML DL Workshop 2015

SLIDE 15

Adversarial Samples

SLIDE 16

CIFAR10 Dataset

[Figure: sample image labels: bird, airplane, truck, automobile, bird. Grid with axes "Input class" (0 to 9) and "Output classification" (0 to 9).]

SLIDE 17

Adversarial strategy

SLIDE 18

Defending against Adversarial Perturbations

SLIDE 19

DNN Robustness

SLIDE 20

Defense Design

  • Low impact on the architecture
  • Maintain accuracy
  • Robust in space relatively close to the legitimate distribution
  • Maintain speed of network

SLIDE 21

Softmax Layer and Probabilities
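
The body of this slide did not survive extraction. As a minimal sketch, assuming the slide refers to the standard softmax with a temperature parameter T, p_i = exp(z_i / T) / sum_j exp(z_j / T), the function below turns logits into class probabilities. The names `softmax`, `logits`, `p_T1`, and `p_T20` and the example values are hypothetical and not taken from the slides.

```python
import math

def softmax(logits, T=1.0):
    """Softmax at temperature T: p_i = exp(z_i / T) / sum_j exp(z_j / T)."""
    if T <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three classes (illustration only).
logits = [1.0, 5.0, 2.0]
p_T1 = softmax(logits, T=1.0)    # sharp distribution
p_T20 = softmax(logits, T=20.0)  # same ranking, much flatter distribution
```

A higher temperature leaves the ranking of the classes unchanged but spreads probability mass more evenly across them, which is what makes the outputs usable as soft labels.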

SLIDE 22

Defensive Distillation

SLIDE 23

Defensive Distillation

SLIDE 24

Defensive Distillation

SLIDE 25

Defensive Distillation

SLIDE 26

Defensive Distillation

SLIDE 27

Defensive Distillation

SLIDE 28

Defensive Distillation

SLIDE 29

Defensive Distillation

Set temperature T=1 for predictions
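
The preceding Defensive Distillation slides survive only as titles. The sketch below is one possible reading of the procedure: train an initial model at temperature T on hard labels, use its outputs at temperature T as soft labels, train a second model of the same shape on those soft labels at temperature T, then predict with T=1. Everything concrete here is an assumption made for illustration: the toy linear softmax model stands in for a deep network, and the data, T=20, learning rate, and all function names are hypothetical.

```python
import math
import random

def softmax(logits, T=1.0):
    """Softmax at temperature T, shifted by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_of(model, x):
    """Logits of a linear model (W, b) for input x."""
    W, b = model
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def train(xs, targets, T, epochs=200, lr=0.5, seed=0):
    """Fit a linear softmax model by SGD on cross-entropy at temperature T."""
    rng = random.Random(seed)
    n_in, n_out = len(xs[0]), len(targets[0])
    W = [[rng.uniform(-0.01, 0.01) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    for _ in range(epochs):
        for x, t in zip(xs, targets):
            p = softmax(logits_of((W, b), x), T)
            for k in range(n_out):
                g = (p[k] - t[k]) / T  # gradient of the loss w.r.t. logit k
                for i in range(n_in):
                    W[k][i] -= lr * g * x[i]
                b[k] -= lr * g
    return W, b

def predict(model, x, T=1.0):
    return softmax(logits_of(model, x), T)

def argmax(p):
    return max(range(len(p)), key=lambda k: p[k])

# Hypothetical toy data: two linearly separable classes in 2-D.
xs = [[-1.0, -1.0], [-0.9, -0.8], [1.0, 1.0], [0.9, 0.8]]
hard_labels = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
T = 20.0  # assumed distillation temperature

initial = train(xs, hard_labels, T)                      # step 1: hard labels, temperature T
soft_labels = [predict(initial, x, T) for x in xs]       # step 2: soft labels at temperature T
distilled = train(xs, soft_labels, T)                    # step 3: same model shape, soft labels
predictions = [predict(distilled, x, T=1.0) for x in xs] # step 4: T=1 for predictions
```

Only the final step, predicting with T=1, is stated on the slide itself; the earlier steps follow the usual description of distillation and should be checked against the paper.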

SLIDE 30

Intuition behind Defensive Distillation

Constraining Training | Reducing Jacobian Amplitudes

[Equation annotations: "0 if i not correct class"; "never equal to 0"]
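
The equations on this slide were lost in extraction. As a partial illustration of the "Reducing Jacobian Amplitudes" panel, the sketch below computes the Jacobian of a temperature softmax with respect to its logits, J[i][j] = (1/T) * p_i * (delta_ij - p_j), and compares its largest entry at two temperatures. This covers only the softmax layer, not the full input-to-output Jacobian of a trained network that the slide presumably refers to. The names and example logits are hypothetical.

```python
import math

def softmax(logits, T=1.0):
    """Softmax at temperature T, shifted by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(logits, T=1.0):
    """J[i][j] = d p_i / d z_j = (1/T) * p_i * (delta_ij - p_j)."""
    p = softmax(logits, T)
    n = len(logits)
    return [[p[i] * ((1.0 if i == j else 0.0) - p[j]) / T for j in range(n)]
            for i in range(n)]

def max_abs(J):
    """Largest absolute entry of a matrix given as a list of rows."""
    return max(abs(v) for row in J for v in row)

# Hypothetical logits (illustration only).
logits = [2.0, 1.0, -1.0]
amp_T1 = max_abs(softmax_jacobian(logits, T=1.0))
amp_T20 = max_abs(softmax_jacobian(logits, T=20.0))  # smaller because of the 1/T factor
```

For these logits the largest entry at T=20 is well below the one at T=1, showing how the 1/T factor damps the sensitivity of the probabilities to changes in the logits.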

SLIDE 31

Validation

SLIDE 32

Experimental Setup

SLIDE 33

[Figure: Adversarial Sample Success Rate (axis ticks 10 to 100) versus Distillation Temperature (axis ticks 1, 10, 100). Series: Adversarial Samples Success Rate (MNIST), Adversarial Samples Baseline Rate (MNIST), Adversarial Samples Success Rate (CIFAR10), Adversarial Samples Baseline Rate (CIFAR10).]

SLIDE 34

Impact on accuracy

SLIDE 35

Impact on Jacobian Amplitude

SLIDE 36

Estimation of Robustness

SLIDE 37

Conclusions

SLIDE 38

Takeaways

  • Distillation significantly reduces attack success
  • Yields model smoothness
  • Easy implementation, low overhead
  • Acceptable impact on accuracy

SLIDE 39

Questions?

@NicolasPapernot nicolas@papernot.fr https://www.papernot.fr