SLIDE 1

Provably Secure Machine Learning

Jacob Steinhardt

ARO Adversarial Machine Learning Workshop, September 14, 2017

SLIDE 2

Why Prove Things?

Attackers often have more motivation/resources than defenders
Heuristic defenses: arms race between attack and defense
Proofs break the arms race, provide absolute security

  • for a given threat model...

SLIDE 5

Example: Adversarial Test Images

  • [Szegedy et al., 2014]: first discovery of adversarial examples
  • [Goodfellow, Shlens, Szegedy, 2015]: Fast Gradient Sign Method (FGSM) + adversarial training
  • [Papernot et al., 2015]: defensive distillation
  • [Carlini and Wagner, 2016]: distillation is not secure
  • [Papernot et al., 2017]: FGSM + distillation only make attacks harder to find
  • [Carlini and Wagner, 2017]: all detection strategies fail
  • [Madry et al., 2017]: a secure network, finally??

1 proof = 3 years of research
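To make the FGSM entry above concrete, here is a minimal sketch of the attack on a toy model. The fixed logistic-regression "network", the random weights, and the ε budget are illustrative assumptions of mine, not details from the talk.

```python
# Minimal FGSM sketch: perturb the input by eps in the direction of the sign
# of the input gradient of the loss (toy linear model, not from the talk).
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.standard_normal(784), 0.0        # toy linear scorer on flattened 28x28 images
x, y = rng.random(784), 1.0                 # one input with true label +1
eps = 0.1                                   # l_inf budget

def loss(v):
    # logistic loss of the true label under the linear scorer
    return np.log1p(np.exp(-y * (w.dot(v) + b)))

grad = -y * w / (1.0 + np.exp(y * (w.dot(x) + b)))   # analytic d(loss)/dx
x_adv = np.clip(x + eps * np.sign(grad), 0.0, 1.0)   # FGSM step, pixels stay in [0, 1]

print(loss(x), loss(x_adv))                 # loss goes up after the attack
print(np.max(np.abs(x_adv - x)) <= eps)     # perturbation stays inside the l_inf ball
```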

SLIDE 8

Formal Verification is Hard

  • Traditional software: designed to be secure
  • ML systems: learned organically from data, no explicit design

Hard to analyze, limited levers

Other challenges:

  • adversary has access to sensitive parts of system
  • unclear what spec should be (car doesn’t crash?)

SLIDE 9

What To Prove?

  • Security against test-time attacks
  • Security against training-time attacks
  • Lack of implementation bugs

SLIDE 11

Test-time Attacks

Adversarial examples: Can we prove no adversarial examples exist?

SLIDE 12

Formal Goal

Goal: Given a classifier f : ℝ^d → {1, . . . , k} and an input x, show that there is no x′ with f(x′) ≠ f(x) and ‖x − x′‖ ≤ ε.

  • Norm: the ℓ∞-norm, ‖x‖∞ = max_{j=1,...,d} |x_j|
  • Classifier: f is a neural network
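Equivalently (a standard restatement, not taken from the slides), writing z_1, …, z_k for the class scores of f and y = f(x) for the predicted label, certifying robustness at x means showing that the margin stays positive over the entire ε-ball:

```latex
% Hedged restatement of the goal above: no label change within the l_inf ball of
% radius epsilon iff the worst-case margin of the predicted class stays positive.
\[
  \min_{\|\delta\|_\infty \le \epsilon}
    \Big( z_y(x + \delta) - \max_{j \ne y} z_j(x + \delta) \Big) > 0 .
\]
```

The approaches on the next slides bound exactly this kind of worst-case quantity over the perturbation set.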

SLIDE 13

Approach 1: Reluplex

Assume f is a ReLU network with layers x^(1), . . . , x^(L), where x_i^(l+1) = max(a_i^(l) · x^(l), 0).

Want to bound the maximum change in the output x^(L). Can write this as an integer-linear program (ILP):

  y = max(x, 0)  ⇔  x ≤ y ≤ x + b · M,  0 ≤ y ≤ (1 − b) · M,  b ∈ {0, 1}

Check robustness on 300-node networks:

  • time ranges from 1s to 4h (median 3m-4m)

[Katz, Barrett, Dill, Julian, Kochenderfer 2017]
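A minimal sketch of this big-M encoding for a single ReLU unit, using the PuLP modelling library. This is an illustration, not the Reluplex tool itself; the toy weights, the ε-box around the nominal input, and the constant M are my own assumptions.

```python
# Big-M ILP encoding of one ReLU unit y = max(w.x, 0) over an eps-box of inputs.
import pulp

w, x0, eps, M = [1.0, -2.0], [0.5, 0.3], 0.1, 100.0   # M must upper-bound |w.x| on the box

prob = pulp.LpProblem("relu_output_range", pulp.LpMaximize)
x = [pulp.LpVariable(f"x{j}", x0[j] - eps, x0[j] + eps) for j in range(2)]
pre = pulp.lpSum(w[j] * x[j] for j in range(2))        # pre-activation w.x
y = pulp.LpVariable("y", lowBound=0)                   # post-activation
b = pulp.LpVariable("b", cat="Binary")                 # b = 1  <=>  unit is inactive

# y = max(pre, 0)  <=>  pre <= y <= pre + b*M,  0 <= y <= (1 - b)*M
prob += y >= pre
prob += y <= pre + b * M
prob += y <= (1 - b) * M
prob += y                                              # objective: maximize y over the input box

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], pulp.value(y))       # largest ReLU output reachable in the box
```

Maximizing (and minimizing) such outputs layer by layer is what lets the solver bound the change in x^(L).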

SLIDE 16

Approach 2: Relax and Dualize

Still assume f is ReLU. Can write as a non-convex quadratic program instead.

Every quadratic program can be relaxed to a semi-definite program.

Advantages:

  • always polynomial-time
  • duality: get differentiable upper bounds
  • can train against upper bound to generate robust networks

[Raghunathan, S., Liang]
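The relaxation step can be illustrated on a generic quadratic program. The sketch below uses CVXPY and my own toy objective, not the actual network-verification QP of [Raghunathan, S., Liang]: lift z to Z = zz^T, drop the rank-1 constraint, and keep Z positive semidefinite.

```python
# SDP relaxation of the non-convex QP:  maximize z^T Q z  subject to  z_j^2 <= 1.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n = 5
Q = rng.standard_normal((n, n))
Q = (Q + Q.T) / 2                                   # symmetrize the objective

Z = cp.Variable((n, n), symmetric=True)
constraints = [Z >> 0] + [Z[j, j] <= 1 for j in range(n)]
sdp = cp.Problem(cp.Maximize(cp.trace(Q @ Z)), constraints)
upper_bound = sdp.solve()                           # polynomial-time certified upper bound
print(upper_bound)
```

Because the feasible set only grew, the SDP value upper-bounds the non-convex optimum, which is exactly the kind of differentiable certificate one can train against.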

SLIDE 17

Results

SLIDE 19

What To Prove?

  • Security against test-time attacks
  • Security against training-time attacks
  • Lack of implementation bugs

SLIDE 22

Training-time attacks

Attack system by manipulating training data: data poisoning

Traditional security: keep attacker away from important parts of system
Data poisoning: attacker has access to most important part of all

Huge issue in practice...

How can we keep adversary from subverting the model?

SLIDE 24

Formal Setting

Adversarial game:

  • Start with clean dataset Dc = {x1, . . . , xn}
  • Adversary adds εn bad points Dp
  • Learner trains model on D = Dc ∪ Dp, outputs model θ and incurs loss L(θ)

Learner’s goal: ensure L(θ) is low no matter what adversary does

  • under a priori assumptions,
  • or for a specific dataset Dc.

In high dimensions, most algorithms fail!
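A toy NumPy illustration of why naive learners fail in this game (my own example, not from the talk): with an ε fraction of far-away poisoned points, the empirical mean, the simplest possible "model", is dragged by a distance that grows with the dimension.

```python
# Poisoning the empirical mean: the shift scales as eps * (distance of the bad points).
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 1000, 100, 0.1

D_clean = rng.standard_normal((n, d))            # clean data Dc, true mean 0
D_poison = np.full((int(eps * n), d), 10.0)      # eps*n bad points Dp, far from the bulk

D = np.vstack([D_clean, D_poison])               # learner sees D = Dc u Dp
theta = D.mean(axis=0)                           # "training": estimate the mean

print(np.linalg.norm(D_clean.mean(axis=0)))      # ~ sqrt(d/n): small statistical error
print(np.linalg.norm(theta))                     # ~ 10 * eps/(1+eps) * sqrt(d): large bias
```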

SLIDE 31

Learning from Untrusted Data

A priori assumption: covariance of data is bounded by σ.

Theorem: as long as we have a small number of “verified” points, can be robust to any fraction of adversaries (even e.g. 90%).

Growing literature: 15+ papers since 2016 [DKKLMS16/17, LRV16, SVC16, DKS16/17, CSV17, SCV17, L17, DBS17, KKP17, S17, MV17]

[Charikar, S., Valiant 2017]
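A toy sketch of the flavor of this result (my own simplification using k-means, not the algorithm of [Charikar, S., Valiant 2017]): even when 90% of the data is adversarial, a handful of verified points is enough to select the right candidate mean from a short list.

```python
# Cluster the untrusted data into candidate means, then let a few trusted points choose.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
d = 20
good = rng.standard_normal((100, d))            # 10% genuine data, true mean 0
bad = rng.standard_normal((900, d)) + 8.0       # 90% adversarial data centered far away
X = np.vstack([good, bad])

candidates = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X).cluster_centers_
verified = rng.standard_normal((5, d))          # a few trusted draws from the true distribution
pick = candidates[np.argmin(np.linalg.norm(candidates - verified.mean(axis=0), axis=1))]

print(np.linalg.norm(X.mean(axis=0)))           # naive mean: dragged far from 0
print(np.linalg.norm(pick))                     # selected candidate: close to the true mean
```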

SLIDE 32

What about certifying a specific algorithm on a specific data set?

SLIDE 33

Certified Defenses for Data Poisoning

[S., Koh, and Liang 2017]

SLIDE 41

Impact on training loss

Worst-case impact is the solution to a bi-level optimization problem:

  maximize_{θ̂, Dp}  L(θ̂)
  subject to  θ̂ = argmin_θ Σ_{x ∈ Dc ∪ Dp} ℓ(θ; x),  Dp ⊆ F

(Very) NP-hard in general.

Key insight: approximate test loss by train loss, can then upper bound via a saddle point problem (tractable)

  • automatically generates a nearly optimal attack
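A minimal sketch of that saddle-point idea (my own toy version with logistic regression and a finite candidate pool F, not the exact procedure of [S., Koh, and Liang 2017]): alternate a descent step on the model with greedily selecting the feasible points of highest current loss; the loop both approximates the worst-case training loss and yields a concrete set of poison points.

```python
# Alternating "learner step / adversary step" on a poisoned logistic-regression problem.
import numpy as np

rng = np.random.default_rng(0)
n, d, eps, lr = 200, 5, 0.1, 0.1

Xc = rng.standard_normal((n, d))
yc = np.where(Xc[:, 0] > 0, 1.0, -1.0)                       # clean labels from a simple rule
F = [(rng.standard_normal(d), s) for s in (-1.0, 1.0) for _ in range(20)]   # feasible poison candidates

def loss_and_grad(theta, x, y):
    z = y * x.dot(theta)
    return np.log1p(np.exp(-z)), -y * x / (1.0 + np.exp(z))  # logistic loss and its gradient in theta

theta = np.zeros(d)
for _ in range(100):
    # adversary's move: the eps*n candidates with the largest loss under the current theta
    worst = sorted(F, key=lambda p: -loss_and_grad(theta, *p)[0])[: int(eps * n)]
    # learner's move: one gradient step on the clean + poisoned data
    g = sum(loss_and_grad(theta, x, y)[1] for x, y in list(zip(Xc, yc)) + worst) / (n + len(worst))
    theta -= lr * g

print(np.mean([loss_and_grad(theta, x, y)[0] for x, y in zip(Xc, yc)]))   # train-loss proxy for the damage
```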

SLIDE 42

Results

SLIDE 45

What To Prove?

  • Security against test-time attacks
  • Security against training-time attacks
  • Lack of implementation bugs

SLIDE 48

Developing Bug-Free ML Systems

[Selsam and Liang 2017]

SLIDE 49

Provable Generalization via Recursion

[Cai, Shin, and Song 2017]

SLIDE 50

Summary

Formal verification can be used in many contexts:

  • test-time attacks
  • training-time attacks
  • implementation bugs
  • checking generalization

High-level ideas:

  • cast as optimization problem: rich set of tools
  • train/optimize against certificate
  • re-design system to be amenable to proof

SLIDE 51

Are we verifying the right thing?

“Real” goal not easy to state:

  • ℓ∞-perturbations are arbitrary
  • low test error ⇒ specific inputs could still be bad

  • what does security even mean for non-convex models?

How do we specify our real end goals?

  • “my car won’t crash”
  • “my newsfeed won’t disseminate propaganda”
  • “my trading algorithm won’t lose $$$”

SLIDE 52

Acknowledgments

Collaborators:

Funding:

NIPS Workshop on Secure ML: Please submit your work!
