On Adaptive Attacks to Adversarial Example Defenses Florian Tramèr - PowerPoint PPT Presentation

On Adaptive Attacks to Adversarial Example Defenses
Florian Tramèr
USENIX ScAINet, August 10th, 2020
Joint work with Nicholas Carlini, Wieland Brendel and Aleksander Madry

What Are Adversarial Examples?
[Figure: an image classified as "Tabby Cat" (88%) next to one classified as "Guacamole" (99%)]
Biggio et al., 2014; Szegedy et al., 2014; Goodfellow et al., 2015

Why Should We Care?
ML in security-critical applications:
• Malware detection
• Anti-phishing
• Ad-blocking
• Content takedown
Understanding robustness under (standard) distribution shift (Recht et al., 2019)

Many Defenses Have Been Proposed...
https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html

...But Evaluating Them Properly Is Hard
We re-evaluated 13 defenses presented at [ICLR | ICML | NeurIPS] [2018 | 2019 | 2020]
• All defenses claim to follow the best evaluation standards
• Yet, we circumvent all of them ⇒ reduce accuracy to baseline (usually 0%) in the considered threat model

Isn’t This Old News?
• Carlini & Wagner (2017) broke 10 (mainly unpublished) defenses
• Athalye et al. (2018) broke 7 defenses published at ICLR 2018

Why We Hoped Things Might Have Changed
Consensus on what constitutes a good evaluation:
Clearly defined threat model (easy to formalize)
1. White-box: adversary has access to defense parameters
2. Small perturbations: find $y'$ s.t. $y'$ is misclassified and $\lVert y - y' \rVert_p \le \varepsilon$
Adaptive (incomplete definition, surprisingly hard)
• Adversary tailors the attack to the defense (Carlini & Wagner, 2017; Athalye et al., 2018; Carlini et al., 2019; ...)
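The threat model on this slide can be made concrete with a small sketch. It is not taken from the talk: it assumes a toy binary linear classifier, picks the ℓ∞ norm as one instance of the norm bound (the slide does not legibly say which norm is meant), and all function names are hypothetical.

```python
# Minimal sketch of the white-box, small-perturbation threat model.
# Assumptions (not from the slides): a binary linear classifier and the
# l-infinity norm; every name here is illustrative only.

def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def linf_distance(y, y_adv):
    """Largest coordinate-wise change between y and y_adv."""
    return max(abs(a - b) for a, b in zip(y, y_adv))

def predict(weights, bias, y):
    """Label 1 if the linear score is positive, else label 0."""
    score = sum(w * v for w, v in zip(weights, y)) + bias
    return 1 if score > 0 else 0

def is_valid_adversarial_example(weights, bias, y, y_adv, true_label, eps):
    """y_adv counts only if it stays within eps of y AND is misclassified."""
    within_budget = linf_distance(y, y_adv) <= eps + 1e-12  # float tolerance
    misclassified = predict(weights, bias, y_adv) != true_label
    return within_budget and misclassified

def white_box_attack(weights, y, true_label, eps):
    """White-box: the adversary reads the weights directly.

    For a linear model, moving every coordinate by eps against the sign of
    its weight changes the score as much as the l-infinity budget allows.
    """
    direction = -1.0 if true_label == 1 else 1.0
    return [v + direction * eps * sign(w) for v, w in zip(y, weights)]
```

With weights `[1.0, -2.0]`, bias `0.0` and input `[0.3, 0.1]` (label 1), a budget of `eps = 0.1` is enough to flip the prediction, while `eps = 0.01` is not, so the same attack yields a valid adversarial example only in the first case.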

Evaluation Standards Seem To Be Improving
Carlini & Wagner 2017 (10 defenses)
• Some white-box
• 0/10 adaptive
Athalye et al. 2018 (7 defenses)
• All white-box
• 2/7 adaptive
T et al. 2020 (13 defenses)
• All white-box
• 9/13 adaptive
• 13/13 with code!
Authors (and reviewers) are aware of the importance of adaptive attacks in evaluations

Then Why Are Defenses Still Broken?
Many defenses are not evaluated against a strong adaptive attack

Our Work
13 case studies on how to design strong(er) adaptive attacks
Including:
• Our hypotheses when reading each defense’s paper/code
• Things we tried but that didn’t work
• Some things we didn’t try but might also have worked

How (not) to build & evaluate defenses

Don’t Intentionally Obfuscate Gradients
If this wasn’t enough... this won’t be either
Breaking specific attack techniques is not the way forward

Don’t Blindly Re-use Prior (Adaptive) Attacks
Adaptive attack strategies are not universal!
Most popular “victims”: BPDA & EOT (Athalye et al. 2018)
• Understand why an attack worked on other defenses before re-using it
• Use BPDA as a last resort (try gradient-free / decision-based attacks first)
• Before using EOT, build an attack that works for fixed randomness
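The last bullet can be illustrated with a toy randomized defense. The sketch below is an assumption-laden illustration, not the procedure from the talk or the paper: the "defense" is a hypothetical linear score with a random feature mask, and the function names are made up. It first attacks one fixed mask, and only then averages gradients over sampled masks, which is the idea behind EOT (Expectation over Transformation).

```python
import random

# Toy randomized defense (hypothetical): a random 0/1 mask is applied to the
# input features before a linear score is computed.

def sign(v):
    return (v > 0) - (v < 0)

def sample_mask(n, rng, keep_prob=0.5):
    """Draw one random feature mask from the given random generator."""
    return [1.0 if rng.random() < keep_prob else 0.0 for _ in range(n)]

def score(weights, x, mask):
    """Defended score for ONE fixed draw of the randomness."""
    return sum(w * m * v for w, m, v in zip(weights, mask, x))

def grad_score(weights, mask):
    """Gradient of the score with respect to x for one fixed mask."""
    return [w * m for w, m in zip(weights, mask)]

def eot_gradient(weights, n_samples, seed):
    """EOT: average the per-mask gradients over many sampled masks."""
    rng = random.Random(seed)
    total = [0.0] * len(weights)
    for _ in range(n_samples):
        mask = sample_mask(len(weights), rng)
        grad = grad_score(weights, mask)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]

def signed_step(x, grad, eps):
    """One signed-gradient step that lowers the score."""
    return [v - eps * sign(g) for v, g in zip(x, grad)]
```

A reasonable workflow under these assumptions: call `grad_score` with a single mask, confirm that `signed_step` lowers `score` for that same mask, and only then switch to `eot_gradient`. If the fixed-randomness attack already fails, averaging over randomness is unlikely to be the missing piece.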

Don’t Complicate The Attack
Many proposed defenses are complicated (for some reason, this is particularly true for AdvML papers in security conferences)
This is OK! Maybe the best defense has to be complex
[Figure: a defense with multiple components. Labels: (randomized) preprocessing, (non-differentiable) ..., anomaly detector]

Don’t Complicate The Attack
But attacks don’t have to be!
• Optimizing over complex defenses can be hard ($\mathcal{L} = \mu_1 \mathcal{L}_1 + \mu_2 \mathcal{L}_2 + \mu_3 \mathcal{L}_3 + \dots$)
• Evaluate each component individually; there is often a weak link
• Combining broken components rarely works

Don’t Complicate The Attack
Use feature adversaries (Sabour et al. 2015) to break multiple components at once
[Figure: classifier output ("Guac") and anomaly detector output ("OK"), shown twice]
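A rough sketch of the feature-adversary idea follows: instead of attacking each component's output, perturb the input so that its internal features match those of a chosen target input. Everything here is an illustrative assumption rather than the setup of Sabour et al. or of the talk: the feature extractor is a plain linear map `W`, the perturbation budget is measured in the ℓ∞ norm, and the optimizer is basic projected gradient descent.

```python
# Feature-adversary sketch (hypothetical linear feature extractor W).
# Goal: make features(x_adv) close to features(x_target) while keeping
# x_adv within an l-infinity ball of radius eps around the original x.

def features(W, x):
    """Linear feature map: one dot product per row of W."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def feature_loss(W, x, target_feat):
    """Squared Euclidean distance between features(x) and the target features."""
    return sum((a - b) ** 2 for a, b in zip(features(W, x), target_feat))

def feature_loss_grad(W, x, target_feat):
    """Gradient of feature_loss with respect to x: 2 * W^T (W x - target)."""
    diff = [a - b for a, b in zip(features(W, x), target_feat)]
    return [2.0 * sum(W[i][j] * diff[i] for i in range(len(W)))
            for j in range(len(x))]

def clip(v, lo, hi):
    return max(lo, min(hi, v))

def feature_adversary(W, x, x_target, eps, step=0.05, iters=100):
    """Projected gradient descent on the feature-matching loss."""
    target_feat = features(W, x_target)
    x_adv = list(x)
    for _ in range(iters):
        grad = feature_loss_grad(W, x_adv, target_feat)
        x_adv = [v - step * g for v, g in zip(x_adv, grad)]
        # Project back into the l-infinity ball around the original input.
        x_adv = [clip(v, x0 - eps, x0 + eps) for v, x0 in zip(x_adv, x)]
    return x_adv
```

The design point is that a single loss on the shared features replaces a weighted sum of per-component losses: if every downstream component (classifier, detector) reads the same features, matching the target's features would move all of their outputs at once.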

Don’t Convince Reviewers, Convince Yourself!
Really try to break your defense (others probably will...)
• An evaluation against 10 non-adaptive attacks isn’t broad
• If offered $1M to break your defense, would you use a non-adaptive attack?
• What assumptions/invariants does the defense rely on? Attack those!
Evaluation guidelines are great, but:
• Not just a check-list to appease reviewers
• They also apply to adaptive attacks (e.g., adaptive attacks should never perform worse than non-adaptive ones)
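The sanity check in the last bullet is easy to automate. The helper names below are hypothetical and the per-example result lists are assumed to come from whatever attack code an evaluation already has; the sketch only shows the comparison itself.

```python
# Sanity-check sketch: an adaptive attack should leave the defense with
# robust accuracy no higher than a non-adaptive attack does.
# Each result list holds one boolean per test example:
# True means the defended model is still correct under that attack.

def robust_accuracy(still_correct):
    """Fraction of examples the model still classifies correctly."""
    return sum(1 for c in still_correct if c) / len(still_correct)

def worst_case_over_attacks(per_attack_results):
    """An example counts as robust only if it survives every attack."""
    return [all(results) for results in zip(*per_attack_results)]

def adaptive_is_at_least_as_strong(non_adaptive_results, adaptive_results):
    """False signals that the adaptive attack itself probably needs fixing."""
    return robust_accuracy(adaptive_results) <= robust_accuracy(non_adaptive_results)
```

Reporting `robust_accuracy(worst_case_over_attacks([...]))` across all attacks tried gives a per-example worst case, which can never exceed the robust accuracy under any single attack in the list.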

My Defense Got Broken. Now What?

My Defense Got Broken. Now What?
~40 white-box defenses that were publicly broken (that I know of)
• one paper was retracted before publication
• one paper was amended on arXiv
We should do better!
• Hard to navigate the field for newcomers
• Many ideas get re-used despite being broken

My Defense Got Broken. Now What?
Personal experience:
• Often referenced as an effective defense against black-box attacks
• Later work developed much stronger transfer attacks ☹
⇒ Please contact authors when you find an attack!
After intro, or in abstract, results, etc.

Conclusion
Evaluating adversarial example defenses is hard! How do we improve things?
Resisting attacks that broke prior defenses ≠ progress
Ideal: defense evaluation = 99% adaptive attacks
• Try breaking other defenses before attacking your own
• Strive for simple attacks (and defenses if possible)
• We need more independent re-evaluations
• If a defense is broken, acknowledge the attack, amend the paper, and keep going!
tramer@cs.stanford.edu
https://arxiv.org/abs/2002.08347
