On Adaptive Attacks to Adversarial Example Defenses, Florian Tramèr




SLIDE 1

On Adaptive Attacks to Adversarial Example Defenses

Florian Tramèr USENIX ScAINet August 10th, 2020

Joint work with Nicholas Carlini, Wieland Brendel and Aleksander Madry

SLIDE 2

What Are Adversarial Examples?

[Figure: an image classified as “Tabby Cat” (88%) next to an adversarial example classified as “Guacamole” (99%)]

Biggio et al., 2014; Szegedy et al., 2014; Goodfellow et al., 2015

SLIDE 3

Why Should We Care?

  • ML in security-critical applications: malware detection, ad-blocking, anti-phishing, content takedown
  • Understanding robustness under (standard) distribution shift (Recht et al., 2019)

SLIDE 4

Many Defenses Have Been Proposed...


https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html

SLIDE 5

...But Evaluating Them Properly Is Hard

We re-evaluated 13 defenses presented at [ICLR | ICML | NeurIPS] [2018 | 2019 | 2020].

All defenses claim to follow the best evaluation standards. Yet we circumvent all of them:

⇒ accuracy is reduced to the baseline (usually 0%) in the considered threat model

SLIDE 6

Isn’t This Old News?


  • Broke 10 (mainly unpublished) defenses in 2017 (Carlini & Wagner)
  • Broke 7 defenses published at ICLR 2018 (Athalye et al.)

SLIDE 7

Why We Hoped Things Might Have Changed


Consensus on what constitutes a good evaluation:

  • Clearly defined threat model
      1. White-box: the adversary has access to the defense parameters
      2. Small perturbations: find x’ such that x’ is misclassified and ∥x − x’∥_p ≤ ε
  • Adaptive: the adversary tailors the attack to the defense

Incomplete definition / Easy to formalize / Surprisingly hard

Carlini & Wagner, 2017; Athalye et al., 2018; Carlini et al., 2019; ...
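
The threat model above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the model is a toy linear classifier, the norm is taken to be L∞, and the weights, input, ε, step size, and the names `score` and `pgd_linf` are hypothetical and not from the talk. A white-box adversary who knows `w` and `b` searches for a misclassified point within distance ε of the input.

```python
# Toy white-box setting: a linear binary classifier sign(w.x + b).
# All numbers below are illustrative assumptions.

def score(w, b, x):
    """Signed score: positive means class +1, negative means class -1."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def pgd_linf(w, b, x, label, eps, step, iters):
    """Projected gradient descent inside the L-infinity ball of radius eps around x."""
    x_adv = list(x)
    for _ in range(iters):
        # For a linear model, the gradient of label * score w.r.t. x is label * w.
        grad = [label * wi for wi in w]
        # Signed step against the gradient pushes the score toward the wrong class.
        x_adv = [xa - step * (1 if g > 0 else -1 if g < 0 else 0)
                 for xa, g in zip(x_adv, grad)]
        # Project back onto the eps-ball around the original input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

w, b = [1.0, -2.0, 0.5], 0.1
x, label = [0.2, -0.1, 0.3], 1          # correctly classified: score(w, b, x) > 0
x_adv = pgd_linf(w, b, x, label, eps=0.2, step=0.05, iters=10)
linf_dist = max(abs(a - c) for a, c in zip(x_adv, x))
flipped = score(w, b, x_adv) < 0        # True when the perturbed point is misclassified
```

The white-box and small-perturbation parts take a few lines each. The adaptive part has no such recipe, since for a real defense the gradient step above has to be redesigned around whatever the defense does.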

SLIDE 8

Evaluation Standards Seem To Be Improving


Carlini & Wagner 2017 (10 defenses)

  • Some white-box
  • 0/10 adaptive

Athalye et al. 2018 (7 defenses)

  • All white-box
  • 2/7 adaptive

Tramèr et al. 2020 (13 defenses)

  • All white-box
  • 9/13 adaptive
  • 13/13 with code!

Authors (and reviewers) are aware of the importance of adaptive attacks in evaluations

SLIDE 9

Then Why Are Defenses Still Broken?


Many defenses are not evaluated against a strong adaptive attack

SLIDE 10

Our Work


13 case studies on how to design strong(er) adaptive attacks

Including:

  • Our hypotheses when reading each defense’s paper/code
  • Things we tried but that didn’t work
  • Some things we didn’t try but might also have worked
SLIDE 11

How (not) to build & evaluate defenses

SLIDE 12

Don’t Intentionally Obfuscate Gradients


If this wasn’t enough... this won’t be either

Breaking specific attack techniques is not the way forward

SLIDE 13

Don’t Blindly Re-use Prior (Adaptive) Attacks

Adaptive attack strategies are not universal!

Most popular “victims”: BPDA & EOT (Athalye et al. 2018)

  • Understand why an attack worked on other defenses before re-using it
  • Use BPDA as a last resort (try gradient-free / decision-based attacks first)
  • Before using EOT, build an attack that works for fixed randomness
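
The last point can be sketched in Python. This is a toy illustration, not the procedure from the paper: the "defense" is a hypothetical randomized preprocessing step (a random per-coordinate scaling) in front of a linear scorer, and the names `sample_mask`, `signed_step`, and all constants are assumptions. The attack is first checked against one frozen draw of the randomness, and only afterwards averaged over many draws (EOT).

```python
import random

W = [1.0, -2.0, 0.5]  # weights of a toy linear scorer (assumed for illustration)

def sample_mask(rng):
    """Randomized preprocessing: a fresh per-coordinate scaling on every call."""
    return [rng.uniform(-0.5, 1.5) for _ in W]

def score(x, mask):
    """Score of the defended model for one draw of the randomness."""
    return sum(w * m * xi for w, m, xi in zip(W, mask, x))

def grad(mask):
    """Gradient of the score with respect to x for one fixed mask."""
    return [w * m for w, m in zip(W, mask)]

def signed_step(x, eps, masks):
    """One signed-gradient step that lowers the score, using the gradient
    averaged over `masks`. One mask: fixed randomness. Many masks: EOT."""
    avg = [sum(g) / len(masks) for g in zip(*(grad(m) for m in masks))]
    return [xi - eps * (1 if a > 0 else -1) for xi, a in zip(x, avg)]

def mean_score(x, masks):
    """Average score over a set of randomness draws."""
    return sum(score(x, m) for m in masks) / len(masks)

rng = random.Random(0)
x = [0.3, -0.2, 0.4]

# Step 1: freeze the randomness and confirm the attack works on that one draw.
fixed = sample_mask(rng)
x_fixed = signed_step(x, 0.5, [fixed])
works_when_fixed = score(x_fixed, fixed) < score(x, fixed)

# Step 2: only then average gradients over many draws (EOT) and
# evaluate on fresh draws that the attack has not seen.
x_eot = signed_step(x, 0.5, [sample_mask(rng) for _ in range(2000)])
fresh = [sample_mask(rng) for _ in range(2000)]
mean_before = mean_score(x, fresh)
mean_after = mean_score(x_eot, fresh)
```

If step 1 fails, the problem lies in the attack itself (loss, gradient, step size), and averaging over randomness will not repair it.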

SLIDE 14

Don’t Complicate The Attack

Many proposed defenses are complicated: (randomized) preprocessing, multiple components, anomaly detector, (non-differentiable) ...

(for some reason, this is particularly true for AdvML papers in security conferences)

This is OK! Maybe the best defense has to be complex.

SLIDE 15

Don’t Complicate The Attack

This is OK! Maybe the best defense has to be complex. But attacks don’t have to be!

  • Optimizing over complex defenses can be hard (ℒ = λ₁ℒ₁ + λ₂ℒ₂ + λ₃ℒ₃ + …)
  • Evaluate each component individually; there is often a weak link
  • Combining broken components rarely works

SLIDE 16

Don’t Complicate The Attack

Use feature adversaries (Sabour et al. 2015) to break multiple components at once

[Figure: for both inputs shown, the classifier outputs “Guac” and the anomaly detector outputs “OK”]
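
A minimal Python sketch of the feature-adversary idea: perturb the input so that its internal features match those of a chosen target input, so that every component reading those features behaves as it does on the target. Everything here is a hypothetical toy: the linear feature extractor `A`, the `classify` head, the `detector_ok` rule, and the large ε are assumptions chosen so the example runs by hand, not details of any defense from the talk.

```python
A = [[1.0, 0.0, 0.5],
     [0.0, 1.0, -0.5]]   # toy linear feature extractor (assumed)

def features(x):
    """Shared features that both the classifier and the detector read."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def classify(feat):
    """Toy classifier head on the features."""
    return "guacamole" if feat[0] > feat[1] else "cat"

def detector_ok(feat):
    """Toy anomaly detector: accepts features inside a fixed box."""
    return all(abs(f) <= 2.0 for f in feat)

def feature_adversary(x, x_target, eps, step=0.1, iters=200):
    """Minimize ||f(x') - f(x_target)||^2 subject to ||x' - x||_inf <= eps."""
    target = features(x_target)
    x_adv = list(x)
    for _ in range(iters):
        diff = [f - t for f, t in zip(features(x_adv), target)]
        # Gradient of the squared feature distance is 2 * A^T diff.
        grad = [2 * sum(A[r][c] * diff[r] for r in range(len(A)))
                for c in range(len(x))]
        x_adv = [xa - step * g for xa, g in zip(x_adv, grad)]
        # Project back onto the eps-ball around the original input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

x = [0.0, 1.0, 0.0]          # classified as "cat"
x_target = [1.0, 0.0, 0.0]   # benign "guacamole" input that the detector accepts
x_adv = feature_adversary(x, x_target, eps=0.7)
result = (classify(features(x_adv)), detector_ok(features(x_adv)))
```

A single loss on the features replaces a weighted sum of one loss per component, so there are no weights to tune.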

SLIDE 17

Don’t Convince Reviewers, Convince Yourself!

Really try to break your defense (others probably will...)

  • An evaluation against 10 non-adaptive attacks isn’t broad
  • If offered $1M to break your defense, would you use a non-adaptive attack?
  • What assumptions/invariants does the defense rely on? Attack those!

Evaluation guidelines are great, but:

  • Not just a check-list to appease reviewers
  • They also apply to adaptive attacks

(e.g., adaptive attacks should never perform worse than non-adaptive ones)
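
The sanity check in the parenthetical can be automated. The sketch below is one possible way to do it in Python, with hypothetical per-example outcomes and helper names (`robust_accuracy`, `adaptive_at_least_as_strong`, `per_example_worst_case`) that are assumptions for illustration. It flags an adaptive attack that leaves accuracy higher than a non-adaptive baseline, and it also combines attacks per example, so that an input counts as broken if any attack breaks it.

```python
def robust_accuracy(success_flags):
    """Accuracy under attack, given per-example booleans (True = attack succeeded)."""
    return 1.0 - sum(success_flags) / len(success_flags)

def adaptive_at_least_as_strong(non_adaptive, adaptive):
    """True if the adaptive attack leaves accuracy no higher than the baseline attack."""
    return robust_accuracy(adaptive) <= robust_accuracy(non_adaptive)

def per_example_worst_case(*attacks):
    """An example counts as broken if any of the attacks breaks it."""
    return [any(flags) for flags in zip(*attacks)]

# Hypothetical outcomes on five test points.
pgd_baseline = [True, False, True, False, False]   # robust accuracy 0.6
adaptive = [False, True, False, False, False]      # robust accuracy 0.8

# The adaptive attack is weaker than the baseline here: a red flag that
# the adaptive attack is mis-specified, not evidence of robustness.
red_flag = not adaptive_at_least_as_strong(pgd_baseline, adaptive)

combined = per_example_worst_case(pgd_baseline, adaptive)
```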

SLIDE 18

My Defense Got Broken. Now What?

SLIDE 19

My Defense Got Broken. Now What?

~40 white-box defenses that were publicly broken (that I know of)

  • one paper was retracted before publication
  • one paper was amended on arXiv

We should do better!

  • Hard to navigate the field for newcomers
  • Many ideas get re-used despite being broken

SLIDE 20

My Defense Got Broken. Now What?

Personal experience:

  • Often referenced as an effective defense against black-box attacks
  • Later work developed much stronger transfer attacks ☹

⇒ Please contact authors when you find an attack!

After intro, or in abstract, results, etc.

SLIDE 21

Conclusion

Evaluating adversarial example defenses is hard! How do we improve things?

Resisting attacks that broke prior defenses ≠ progress

Ideal: defense evaluation = 99% adaptive attacks

  • Try breaking other defenses before attacking your own
  • Strive for simple attacks (and defenses if possible)
  • We need more independent re-evaluations
  • If a defense is broken, acknowledge the attack, amend the paper, and keep going!


https://arxiv.org/abs/2002.08347
tramer@cs.stanford.edu