Attacking Machine Learning: On the Security and Privacy of Neural Networks


SLIDE 1

SESSION ID: MLAI-W03

Attacking Machine Learning:
On the Security and Privacy
of Neural Networks

Nicholas Carlini
Research Scientist, Google Brain

SLIDE 2

Act I:

On the Security and Privacy

of Neural Networks

SLIDE 3

Let's play a game

SLIDE 4

67% it is a Great Dane

SLIDE 5

83% it is an Old English Sheepdog

SLIDE 6

78% it is a Greater Swiss Mountain Dog

SLIDE 7

99.99% it is Guacamole

SLIDE 8

99.99% it is a Golden Retriever

SLIDE 9

99.99% it is Guacamole

SLIDE 10

K Eykholt, I Evtimov, E Fernandes, B Li, A Rahmati, C Xiao, A Prakash, T Kohno, D Song. 
 Robust Physical-World Attacks on Deep Learning Visual Classification. 2017

76% it is a 45 MPH Sign

SLIDE 11

Adversarial Examples

  • B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli. Evasion attacks against machine learning at test time. 2013.
  • C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. 2014.
  • I. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. 2015.
SLIDE 12

What do you think this transcribes as?

N Carlini, D Wagner. Audio Adversarial Examples: Targeted Attacks on Speech-to-Text. 2018

SLIDE 13

"It was the best of times, 
 it was the worst of times, 
 it was the age of wisdom, 
 it was the age of foolishness, 
 it was the epoch of belief, 
 it was the epoch of incredulity"

N Carlini, D Wagner. Audio Adversarial Examples: Targeted Attacks on Speech-to-Text. 2018

SLIDE 14

N Carlini, P Mishra, T Vaidya, Y Zhang, M Sherr, C Shields, D Wagner, W Zhou. Hidden Voice Commands. 2016

SLIDE 15

Constructing Adversarial Examples

SLIDE 16

[0.9,
 0.1]

SLIDE 17

[0.9,
 0.1]

SLIDE 18

[0.89,
 0.11]

SLIDE 19

[0.89,
 0.11]

SLIDE 20

[0.89,
 0.11]

SLIDE 21

[0.91,
 0.09]

SLIDE 22

[0.89,
 0.11]

SLIDE 23

[0.48,
 0.52]

SLIDE 24

This does work ... but we have calculus!
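For concreteness, here is a minimal sketch of the query-only search the preceding slides animate. Everything in it is illustrative: classify is a hypothetical black box returning a probability vector, and the greedy pixel-nudging loop is a stand-in, not the talk's actual method.

import numpy as np

def greedy_search_attack(x, classify, target, step=0.01, iters=10000):
    # Nudge one random pixel at a time; keep the change only when the
    # target class becomes more likely. Uses only black-box queries.
    x = x.copy()
    best = classify(x)[target]
    for _ in range(iters):
        cand = x.copy()
        i = np.random.randint(cand.size)
        cand.flat[i] = np.clip(cand.flat[i] + np.random.choice([-step, step]), 0.0, 1.0)
        p = classify(cand)[target]
        if p > best:
            x, best = cand, p
        if best > 0.5:  # e.g. the [0.48, 0.52] flip above
            break
    return x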

SLIDE 26

CAT + 0.001 × (adversarial perturbation) = DOG

  • I. J. Goodfellow, J. Shlens and C. Szegedy. Explaining and harnessing adversarial examples. 2015
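The cited paper's fast gradient sign method is exactly this use of calculus: a single gradient step of size epsilon. A minimal TensorFlow sketch, assuming a trained Keras-style model, inputs x, and integer labels y:

import tensorflow as tf

def fgsm(model, x, y, eps=0.001):
    # Goodfellow et al. 2015: x' = x + eps * sign(grad_x loss(x, y)).
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.sparse_categorical_crossentropy(y, model(x))
    grad = tape.gradient(loss, x)
    return tf.clip_by_value(x + eps * tf.sign(grad), 0.0, 1.0)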
SLIDE 27

What if we don't have direct access to the model?

SLIDE 28

A Ilyas, L Engstrom, A Athalye, J Lin. Black-box Adversarial Attacks with Limited Queries and Information. 2018

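The paper above estimates gradients from queries alone. A minimal sketch of the NES-style estimator it builds on; prob is a hypothetical black-box function returning the target-class probability:

import numpy as np

def nes_gradient(x, prob, sigma=0.001, n=50):
    # Antithetic sampling: probe the model at x +/- sigma*u for random
    # directions u and average the differences; no model internals needed.
    g = np.zeros_like(x)
    for _ in range(n):
        u = np.random.randn(*x.shape)
        g += u * (prob(x + sigma * u) - prob(x - sigma * u))
    return g / (2 * sigma * n)

The attacker then ascends the estimate, e.g. x = np.clip(x + lr * np.sign(nes_gradient(x, prob)), 0, 1).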

SLIDE 30

Generating adversarial examples is simple and practical

SLIDE 31

Defending against Adversarial Examples

SLIDE 32

Case Study: ICLR 2018 Defenses

A Athalye, N Carlini, D Wagner. Obfuscated Gradients Give a False
 Sense of Security: Circumventing Defenses to Adversarial Examples. 2018

SLIDE 35

2 7 4

Out of scope

SLIDE 36

2 7 4

Out of scope · Correct Defenses

SLIDE 37

2 7 4

Out of scope · Broken Defenses · Correct Defenses

SLIDE 42

The Last Hope: Adversarial Training

A Madry, A Makelov, L Schmidt, D Tsipras, A Vladu. Towards Deep Learning Models Resistant to Adversarial Attacks. 2018

SLIDE 43

Caveats

  • Requires small images (32x32)
  • Only effective for tiny perturbations
  • Training is 10-50x slower
  • And even then, only works half of the time

SLIDE 44

Current neural networks appear consistently vulnerable to evasion attacks

SLIDE 45

First reason to not use machine learning:
Lack of robustness

SLIDE 46

Act II:

On the Security and Privacy

of Neural Networks

SLIDE 47

What are the privacy problems?

Privacy of what? Training data.

SLIDE 48

1. Train
2. Predict

"Obama"

SLIDE 49

M. Fredrikson, S. Jha, T. Ristenpart. Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures. 2015.

1. Train
2. Extract

"Person 7"

SLIDE 50

N Carlini, C Liu, J Kos, Ú Erlingsson, D Song. The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks. 2018

1. Train
2. Predict

"What are you" → "doing"

SLIDE 51

N Carlini, C Liu, J Kos, Ú Erlingsson, D Song. The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks. 2018

1. Train
2. Extract

"Nicholas's SSN is" → "123-45-6789"

SLIDE 56

Extracting Training Data From Neural Networks

SLIDE 57

P(x; model) = y

1. Train
2. Predict
SLIDE 58

P("My SSN is 000-00-0000"; model) = 0.01

SLIDE 59

P("My SSN is 000-00-0001"; model) = 0.02

SLIDE 60

P("My SSN is 000-00-0002"; model) = 0.01

SLIDE 61

P("My SSN is 123-45-6788"; model) = 0.00

SLIDE 62

P("My SSN is 123-45-6789"; model) = 0.32

SLIDE 63

P("My SSN is 123-45-6790"; model) = 0.01

SLIDE 64

P("My SSN is 999-99-9998"; model) = 0.00

SLIDE 65

P("My SSN is 999-99-9999"; model) = 0.01

SLIDE 66

P("My SSN is 123-45-6789"; model) = 0.32

The answer (probably) is: "My SSN is 123-45-6789"
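A minimal sketch of the enumeration the slides just walked through, assuming a hypothetical log_prob function that returns the model's log-likelihood of a string. It also makes plain why this is expensive: every well-formatted SSN gets scored.

def extract_ssn(log_prob):
    # Score each candidate completion with the language model and
    # return the one the model believes most strongly.
    candidates = ("%03d-%02d-%04d" % (a, b, c)
                  for a in range(1000) for b in range(100) for c in range(10000))
    return max(candidates, key=lambda s: log_prob("My SSN is " + s))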

SLIDE 67

But that takes millions of queries!

SLIDE 69

Testing with Exposure

SLIDE 70

Choose between ...

Model A: accuracy 96%
Model B: accuracy 92%

SLIDE 71

Choose between ...

Model A: accuracy 96%, high memorization
Model B: accuracy 92%, no memorization

SLIDE 72

Exposure-based Testing Methodology

N Carlini, C Liu, J Kos, Ú Erlingsson, D Song. The Secret Sharer:
 Evaluating and Testing Unintended Memorization in Neural Networks. 2018

SLIDE 73

If a model memorizes completely random canaries, it is probably also memorizing other training data
SLIDE 74

P(canary; model) = y

1. Train
2. Predict

canary = "correct horse battery staple"

SLIDE 75

P(canary; model) = 0.1

1. Train
2. Predict

canary = "correct horse battery staple"

SLIDE 76

P(candidate; model) = ?

1. Train
2. Predict
SLIDE 77

P(candidate; model) = 0.6

1. Train
2. Predict
SLIDE 78

P(candidate; model) = 0.1

1. Train
2. Predict
SLIDE 79

Exposure:

Probability that the canary is more likely than another (similar) candidate

SLIDE 80

Expected: P(inserted canary; model) > P(other candidate; model)

SLIDE 81

1. Generate canary
2. Insert into training data
3. Train model
4. Compute exposure of the canary: compare its likelihood to the other candidates (a minimal sketch follows below)
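A minimal sketch of step 4, using the rank-based definition from the Secret Sharer paper: exposure is log2 of the candidate-space size minus log2 of the canary's likelihood rank. Scoring a sample of candidates, as here, approximates the full space.

import math

def exposure(canary_logprob, other_logprobs):
    # Rank the canary among all candidates by model likelihood; a fully
    # memorized canary has rank 1 and exposure log2(#candidates).
    rank = 1 + sum(1 for lp in other_logprobs if lp > canary_logprob)
    total = len(other_logprobs) + 1
    return math.log2(total) - math.log2(rank)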

SLIDE 83

Provable Defenses with Differential Privacy

SLIDE 84

But first, what is Differential Privacy?

SLIDE 85

[Two training datasets, A and B, differing in one record: can anyone tell from the trained model which was used?]
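For reference, the standard guarantee this picture illustrates: a randomized training algorithm M is (ε, δ)-differentially private if, for every pair of datasets A and B differing in a single record and every set S of possible outputs,

\Pr[M(A) \in S] \le e^{\varepsilon} \cdot \Pr[M(B) \in S] + \delta

so no single record can change what an observer of the trained model can infer by much.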

SLIDE 86

Differentially Private Stochastic Gradient Descent

M Abadi, A Chu, I Goodfellow, H B McMahan, I Mironov, K Talwar, L Zhang. Deep Learning with Differential Privacy. 2016
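The paper's recipe, sketched in numpy with illustrative names: clip each example's gradient, add Gaussian noise calibrated to the clip norm, then take an ordinary descent step.

import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip=1.0, noise_mult=1.1):
    # Clip each per-example gradient to L2 norm `clip`, sum them, add
    # Gaussian noise with std noise_mult * clip, average, and step downhill.
    clipped = [g / max(1.0, np.linalg.norm(g) / clip) for g in per_example_grads]
    noisy = sum(clipped) + np.random.normal(0.0, noise_mult * clip, params.shape)
    return params - lr * noisy / len(per_example_grads)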

SLIDE 88

The math may be scary ...
Applying differential privacy is easy

https://github.com/tensorflow/privacy

SLIDE 89

The math may be scary ...
Applying differential privacy is easy

optimizer = tf.train.GradientDescentOptimizer()
SLIDE 90

The math may be scary ...
Applying differential privacy is easy

dp_optimizer_class = dp_optimizer.make_optimizer_class(
    tf.train.GradientDescentOptimizer)
optimizer = dp_optimizer_class()

https://github.com/tensorflow/privacy
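Filled in with tutorial-style parameters from the tensorflow/privacy repository; exact import paths and argument names have shifted across versions, so treat this as a sketch rather than the definitive API:

from tensorflow_privacy.privacy.optimizers import dp_optimizer

optimizer = dp_optimizer.DPGradientDescentGaussianOptimizer(
    l2_norm_clip=1.0,       # per-example gradient clipping norm
    noise_multiplier=1.1,   # Gaussian noise scale relative to the clip norm
    num_microbatches=256,   # batch is split up for per-example clipping
    learning_rate=0.15)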

SLIDE 91

Exposure confirms differential privacy is effective

SLIDE 92

Second reason to not use machine learning:
Training Data Privacy

SLIDE 93

Act III:

Conclusions

SLIDE 94

First reason to not use machine learning:
Lack of robustness

SLIDE 96

Second reason to not use machine learning:
Training Data Privacy

SLIDE 98

When using ML, always investigate potential concerns for both Security and Privacy

SLIDE 99

Next Steps

On the privacy side ...
  • Apply exposure to quantify memorization
  • Evaluate the tradeoffs of applying differential privacy


SLIDE 100

Next Steps

On the privacy side ...
  • Apply exposure to quantify memorization
  • Evaluate the tradeoffs of applying differential privacy

On the security side ...
  • Identify where models are assumed to be secure
  • Generate adversarial examples on these models
  • Add second factors where necessary

SLIDE 101

References

  • B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli. Evasion attacks against machine learning at test time. 2013.
  • C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. 2014.
  • I. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. 2015.
  • M. Fredrikson, S. Jha, and T. Ristenpart. Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures. 2015.
  • N. Carlini, P. Mishra, T. Vaidya, Y. Zhang, M. Sherr, C. Shields, D. Wagner, and W. Zhou. Hidden Voice Commands. 2016.
  • M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang. Deep Learning with Differential Privacy. 2016.
  • K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, and D. Song. Robust Physical-World Attacks on Deep Learning Visual Classification. 2017.
  • A. Athalye, N. Carlini, and D. Wagner. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. 2018.
  • A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards Deep Learning Models Resistant to Adversarial Attacks. 2018.
  • A. Ilyas, L. Engstrom, A. Athalye, and J. Lin. Black-box Adversarial Attacks with Limited Queries and Information. 2018.
  • N. Carlini and D. Wagner. Audio Adversarial Examples: Targeted Attacks on Speech-to-Text. 2018.
  • N. Carlini, C. Liu, J. Kos, Ú. Erlingsson, and D. Song. The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks. 2018.
  • G. Andrew, S. Chien, and N. Papernot. TensorFlow Privacy. https://github.com/tensorflow/privacy. 2018.

SLIDE 102

Questions?

nicholas@carlini.com
https://nicholas.carlini.com/
