1. Provably Secure Machine Learning
Jacob Steinhardt
ARO Adversarial Machine Learning Workshop, September 14, 2017

2. Why Prove Things?
• Attackers often have more motivation/resources than defenders
• Heuristic defenses: an arms race between attack and defense
• Proofs break the arms race, provide absolute security
  • for a given threat model...

5. Example: Adversarial Test Images
[Szegedy et al., 2014]: first discovery of adversarial examples
[Goodfellow, Shlens, Szegedy, 2015]: Fast Gradient Sign Method (FGSM) + adversarial training
[Papernot et al., 2015]: defensive distillation
[Carlini and Wagner, 2016]: distillation is not secure
[Papernot et al., 2017]: FGSM + distillation only make attacks harder to find
[Carlini and Wagner, 2017]: all detection strategies fail
[Madry et al., 2017]: a secure network, finally??
1 proof = 3 years of research

8. Formal Verification is Hard
• Traditional software: designed to be secure
• ML systems: learned organically from data, no explicit design
Hard to analyze, limited levers.
Other challenges:
• the adversary has access to sensitive parts of the system
• unclear what the spec should be (the car doesn't crash?)

  9. What To Prove? • Security against test-time attacks • Security against training-time attacks • Lack of implementation bugs

11. Test-time Attacks
Adversarial examples: can we prove that no adversarial examples exist?

12. Formal Goal
Goal: given a classifier $f : \mathbb{R}^d \to \{1, \ldots, k\}$ and an input $x$, show that there is no $x'$ with $f(x) \neq f(x')$ and $\|x - x'\| \leq \epsilon$.
• Norm: the $\ell_\infty$-norm, $\|x\| = \max_{j=1}^{d} |x_j|$
• Classifier: $f$ is a neural network

13. [Katz, Barrett, Dill, Julian, Kochenderfer 2017] Approach 1: Reluplex
Assume $f$ is a ReLU network: layers $x^{(1)}, \ldots, x^{(L)}$, with $x^{(l+1)}_i = \max(a^{(l)}_i \cdot x^{(l)}, 0)$.
Want to bound the maximum change in the output $x^{(L)}$.
Can write this as an integer-linear program (ILP):
$y = \max(x, 0) \iff x \leq y \leq x + b \cdot M, \quad 0 \leq y \leq (1 - b) \cdot M, \quad b \in \{0, 1\}$
Checked robustness on 300-node networks:
• time ranges from 1s to 4h (median 3-4 minutes)
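
To make the big-M encoding above concrete, here is a minimal sketch that bounds the worst-case activation of a single ReLU unit over an l_inf ball of radius eps, written with the PuLP MILP modeling library. The library choice, the toy weights, and the big-M constant are my own assumptions for illustration; Reluplex itself uses a specialized simplex/SMT-style procedure rather than an off-the-shelf MILP solver, and a real certificate would reason about the full network and the classification margin.

```python
# Minimal sketch (assumptions: PuLP + CBC installed, toy 2-d weights, M chosen
# large enough to bound the pre-activation). Encodes y = max(a.x, 0) with the
# big-M constraints from the slide and maximizes y over the eps-ball around x0.
import pulp

M = 100.0          # big-M constant: assumed a priori bound on |a . x|
eps = 0.1          # allowed l_inf perturbation around the nominal input
x0 = [1.0, -0.5]   # nominal input (toy)
a = [0.7, -1.2]    # weights of a single ReLU unit

prob = pulp.LpProblem("relu_worst_case", pulp.LpMaximize)

# Perturbed input: x0[j] - eps <= x[j] <= x0[j] + eps
x = [pulp.LpVariable(f"x{j}", x0[j] - eps, x0[j] + eps) for j in range(2)]
z = pulp.lpSum(a[j] * x[j] for j in range(2))   # pre-activation a . x
y = pulp.LpVariable("y", 0, M)                  # post-activation
b = pulp.LpVariable("b", cat="Binary")          # b = 1 iff the unit is inactive

# Big-M encoding of y = max(z, 0): z <= y <= z + b*M and 0 <= y <= (1 - b)*M
prob += y >= z
prob += y <= z + b * M
prob += y <= (1 - b) * M

prob += y                                       # objective: worst-case activation
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("worst-case activation over the eps-ball:", pulp.value(y))
```

Scaling this up means one binary variable per ReLU in the network, which is consistent with the reported verification times on 300-node networks ranging from seconds to hours.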

16. [Raghunathan, S., Liang] Approach 2: Relax and Dualize
Still assume $f$ is a ReLU network.
Can write the problem as a non-convex quadratic program instead.
Every quadratic program can be relaxed to a semidefinite program.
Advantages:
• always polynomial-time
• duality: get differentiable upper bounds
• can train against the upper bound to generate robust networks
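
To illustrate the "relax to a semidefinite program" step, here is a toy sketch of the standard SDP relaxation of a non-convex quadratic program over the hypercube, written with cvxpy (an assumed dependency, not something the paper prescribes). Raghunathan et al. apply the same lifting to the quadratic program that encodes the network attack; this snippet only shows the relaxation machinery and the fact that it yields a tractable upper bound.

```python
# Toy sketch: SDP relaxation of max_{x in {-1,1}^n} x^T Q x. Lift xx^T to a
# PSD matrix variable X and drop the rank-1 constraint; the optimal value is
# a polynomial-time upper bound on the non-convex optimum. Assumes cvxpy + numpy.
import numpy as np
import cvxpy as cp

n = 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((n, n))
Q = (Q + Q.T) / 2                        # symmetrize the quadratic form

X = cp.Variable((n, n), symmetric=True)  # stands in for x x^T
constraints = [X >> 0, cp.diag(X) == 1]  # PSD + unit diagonal replaces x in {-1,1}^n
sdp = cp.Problem(cp.Maximize(cp.trace(Q @ X)), constraints)
upper_bound = sdp.solve()

print("SDP upper bound on the non-convex QP:", upper_bound)
```

Because the bound comes from a convex program, its dual gives a differentiable certificate that can be optimized during training, which is where the advantages listed above come from.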

  17. Results

  19. What To Prove? • Security against test-time attacks • Security against training-time attacks • Lack of implementation bugs

22. Training-time Attacks
Attack the system by manipulating its training data: data poisoning.
Traditional security: keep the attacker away from important parts of the system.
Data poisoning: the attacker has access to the most important part of all.
Huge issue in practice...
How can we keep the adversary from subverting the model?

24. Formal Setting
Adversarial game:
• Start with a clean dataset $D_c = \{x_1, \ldots, x_n\}$
• Adversary adds $\epsilon n$ bad points $D_p$
• Learner trains a model on $D = D_c \cup D_p$, outputs a model $\theta$, and incurs loss $L(\theta)$
Learner's goal: ensure $L(\theta)$ is low no matter what the adversary does,
• under a priori assumptions, or
• for a specific dataset $D_c$.
In high dimensions, most algorithms fail!
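
For intuition about the threat model, here is a toy numpy simulation (my own illustration, not from the talk): an eps fraction of planted points is enough to drag a naive estimator far from the truth, and the damage scales with the magnitude the adversary picks.

```python
# Toy illustration of data poisoning against a naive estimator. The adversary
# adds eps*n points far from the clean distribution; the sample mean of
# D = D_c ∪ D_p moves by roughly eps * (poison magnitude) in every coordinate.
import numpy as np

rng = np.random.default_rng(1)
n, d, eps = 1000, 100, 0.1
D_clean = rng.standard_normal((n, d))        # clean data, true mean = 0
D_poison = np.full((int(eps * n), d), 50.0)  # eps*n adversarial points
D = np.vstack([D_clean, D_poison])

print("clean mean norm:   ", np.linalg.norm(D_clean.mean(axis=0)))
print("poisoned mean norm:", np.linalg.norm(D.mean(axis=0)))
```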

31. [Charikar, S., Valiant 2017] Learning from Untrusted Data
A priori assumption: the covariance of the data is bounded by $\sigma$.
Theorem: as long as we have a small number of "verified" points, we can be robust to any fraction of adversaries (even e.g. 90%).
Growing literature: 15+ papers since 2016 [DKKLMS16/17, LRV16, SVC16, DKS16/17, CSV17, SCV17, L17, DBS17, KKP17, S17, MV17]
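
The following toy stand-in (my own sketch, not the Charikar-Steinhardt-Valiant algorithm) conveys the semi-verified flavor of the theorem: cluster the untrusted data into a few candidate estimates, then use a handful of verified points to pick among them, even when 90% of the data is adversarial. It assumes numpy and scikit-learn; the paper gives an algorithm with provable guarantees rather than this heuristic.

```python
# Heuristic illustration of learning from untrusted data with a small verified
# set: k-means proposes candidate means from the 90%-poisoned data, and the
# verified points select the candidate closest to the truth.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
d = 20
true_mean = np.ones(d)

clean = true_mean + rng.standard_normal((100, d))   # only 10% of the data is clean
poison = 8.0 + rng.standard_normal((900, d))        # 90% adversarial
untrusted = np.vstack([clean, poison])
verified = true_mean + rng.standard_normal((5, d))  # tiny trusted sample

centers = KMeans(n_clusters=10, n_init=10, random_state=0).fit(untrusted).cluster_centers_
best = min(centers, key=lambda c: np.linalg.norm(c - verified.mean(axis=0)))

print("naive mean error:   ", np.linalg.norm(untrusted.mean(axis=0) - true_mean))
print("selected mean error:", np.linalg.norm(best - true_mean))
```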

  32. What about certifying a specific algorithm on a specific data set?

  33. [S., Koh, and Liang 2017] Certified Defenses for Data Poisoning

41. Impact on training loss
Worst-case impact is the solution to a bi-level optimization problem:
$\max_{\hat\theta,\, D_p} L(\hat\theta) \quad \text{subject to} \quad \hat\theta = \arg\min_\theta \sum_{x \in D_c \cup D_p} \ell(\theta; x), \quad D_p \subseteq \mathcal{F}$
(Very) NP-hard in general.
Key insight: approximate the test loss by the train loss; the problem can then be upper bounded via a saddle-point problem (tractable).
• automatically generates a nearly optimal attack
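
Here is a rough sketch of the saddle-point idea, with heavy simplifications that are mine rather than the paper's: a linear model with hinge loss, a made-up feasible set F equal to a norm ball, and plain alternating updates. The attacker repeatedly picks the worst feasible poisoned point against the current model, the defender takes a subgradient step, and the losses collected along the way serve both as an approximate upper bound and as a concrete attack trace.

```python
# Simplified minimax sketch (not the exact certificate of the paper): alternate
# attacker best-response inside F = {x : ||x||_2 <= r} with defender
# subgradient steps on hinge loss; average the incurred losses as a rough bound.
import numpy as np

rng = np.random.default_rng(0)
n, d, eps, r, lr, steps = 200, 5, 0.1, 3.0, 0.1, 500

X = rng.standard_normal((n, d))          # clean data D_c
y = np.sign(X @ rng.standard_normal(d))  # labels from a random linear rule
theta = np.zeros(d)
bound_terms = []

for t in range(steps):
    # Attacker: hinge loss 1 - y_p * (x_p . theta) is maximized over the norm
    # ball by the point most anti-correlated with theta (label fixed to +1).
    norm = np.linalg.norm(theta)
    x_p = -r * theta / norm if norm > 0 else np.zeros(d)
    y_p = 1.0

    # Defender: subgradient step on clean loss plus eps-weighted poisoned loss.
    margins = y * (X @ theta)
    active = margins < 1
    grad = -(y[active][:, None] * X[active]).sum(axis=0) / n
    if 1 - y_p * (x_p @ theta) > 0:
        grad -= eps * y_p * x_p
    theta -= lr * grad

    # Each term is the loss the defender actually suffers against the current
    # worst-case point; their average approximates the saddle-point value.
    bound_terms.append(np.maximum(0, 1 - y * (X @ theta)).mean()
                       + eps * max(0.0, 1 - y_p * (x_p @ theta)))

print("approximate upper bound on worst-case training loss:", np.mean(bound_terms))
```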

  42. Results

  45. What To Prove? • Security against test-time attacks • Security against training-time attacks • Lack of implementation bugs

  48. [Selsam and Liang 2017] Developing Bug-Free ML Systems

  49. [Cai, Shin, and Song 2017] Provable Generalization via Recursion

50. Summary
Formal verification can be used in many contexts:
• test-time attacks
• training-time attacks
• implementation bugs
• checking generalization
High-level ideas:
• cast as an optimization problem: rich set of tools
• train/optimize against the certificate
• re-design the system to be amenable to proof
