

SLIDE 1

Adversarial Training and Robustness for Multiple Perturbations

Florian Tramèr & Dan Boneh NeurIPS 2019

Poster #87

SLIDE 2

Adversarial examples

[Image: a photo classified as "tabby cat" with 88% confidence is adversarially perturbed and classified as "guacamole" with 99% confidence]

Szegedy et al., 2014; Goodfellow et al., 2015; Athalye, 2017

SLIDE 3

Adversarial examples

  • ML models learn very different features than humans
  • This is a safety concern for deployed ML models
  • Classification in adversarial settings is hard


SLIDE 6

Adversarial training

Szegedy et al., 2014; Madry et al., 2017

1. Choose a set of allowed perturbations, e.g., noise of small ℓ∞ norm: S = {δ : ‖δ‖∞ ≤ ε}
2. For each example (x, y), find a worst-case adversarial example: x_adv = x + argmax_{δ∈S} L(f(x + δ), y)
3. Train the model on (x_adv, y)
4. Repeat until convergence
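For concreteness, the four steps above can be sketched for a linear classifier with logistic loss, using a simple PGD attack for the inner maximization. This is an illustrative sketch, not the authors' implementation; all names and hyperparameters here are assumptions.

```python
import numpy as np

# Sketch of the adversarial training loop for a linear classifier
# f(x) = sign(w.x) with logistic loss; hyperparameters are illustrative.

def logistic_loss(w, x, y):
    return np.log1p(np.exp(-y * np.dot(w, x)))

def pgd_linf(w, x, y, eps, steps=10):
    """Step 2: find a worst-case perturbation with ||delta||_inf <= eps."""
    alpha = 2.5 * eps / steps          # common step-size heuristic
    delta = np.zeros_like(x)
    for _ in range(steps):
        # gradient of the logistic loss w.r.t. the perturbation
        grad = -y * w / (1.0 + np.exp(y * np.dot(w, x + delta)))
        delta = np.clip(delta + alpha * np.sign(grad), -eps, eps)
    return x + delta

def adversarial_train(X, Y, eps, epochs=50, lr=0.5):
    """Steps 1-4: repeatedly train on worst-case examples."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, y in zip(X, Y):
            x_adv = pgd_linf(w, x, y, eps)     # step 2
            grad_w = -y * x_adv / (1.0 + np.exp(y * np.dot(w, x_adv)))
            w -= lr * grad_w                   # step 3: train on (x_adv, y)
    return w

# toy usage: two well-separated points, one per class
X = np.array([[2.0, 2.0], [-2.0, -2.0]])
Y = np.array([1.0, -1.0])
w = adversarial_train(X, Y, eps=0.5)
```

On this toy data the trained model classifies every point correctly even after the PGD attack perturbs it by up to ε in ℓ∞ norm.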


SLIDE 8

How well does it work?

[Chart: accuracy of a CIFAR10 model adversarially trained with ℓ∞ noise, evaluated against each perturbation type]
  • No noise: 96%
  • ℓ∞ noise: 70%
  • Rotation: 16%
  • ℓ1 noise: 9%

Engstrom et al., 2017; Sharma & Chen, 2018




SLIDE 12

How to prevent other adversarial examples?

S1 = {δ : ‖δ‖∞ ≤ ε∞}
S2 = {δ : ‖δ‖1 ≤ ε1}
S3 = {δ : δ is a «small rotation»}

  • S = S1 ⋃ S2 ⋃ S3
  • Adversary can choose a perturbation type for each input
  • Pick the worst-case adversarial example from S
  • Train the model on that example
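One way to read the "pick the worst-case example from S" step: run one attack per perturbation type and keep whichever candidate gives the highest loss. A minimal sketch, with stand-in attack and loss functions rather than real ℓp or rotation attacks:

```python
import numpy as np

# Illustrative sketch: per input, run one attack per perturbation type and
# keep the candidate with the highest loss ("max" over the union S).
# The attack and loss functions below are toy stand-ins, not the paper's code.

def worst_case_example(x, y, attacks, loss):
    """Return the highest-loss candidate among one attack per type."""
    candidates = [attack(x, y) for attack in attacks]
    return max(candidates, key=lambda x_adv: loss(x_adv, y))

# stand-in attacks: bounded l_inf-style noise vs. a crude "rotation" (pixel roll)
linf_attack = lambda x, y: x + 0.1 * np.ones_like(x)
roll_attack = lambda x, y: np.roll(x, 1)
loss = lambda x_adv, y: float(np.sum(x_adv))  # toy surrogate loss

x = np.arange(4.0)
x_adv = worst_case_example(x, 1, [linf_attack, roll_attack], loss)
```

Here the noise candidate has loss 6.4 versus 6.0 for the roll, so the noise candidate is the one the model would be trained on.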


SLIDE 18

Does this work?

  • A robustness tradeoff is provably inherent in some classification tasks
  • Increased robustness to one type of noise ⇒ decreased robustness to another
  • Empirically validated on CIFAR10 & MNIST
  • MNIST: for ℓ∞, ℓ1 and ℓ2 noise: ≈50% accuracy (gradient masking)


SLIDE 21

What if we combine perturbations?

[Images: natural image; rotation; ℓ∞ noise; ½ rotation + ½ ℓ∞ noise]

[Chart: accuracy]
  • No noise: 96%
  • One noise type: 70%
  • One of two noise types: 65%
  • Mixture of two noise types: 55%
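A mixed perturbation like "½ rotation + ½ ℓ∞ noise" can be sketched by composing two attacks, each run at half its budget. The helpers below are stand-ins (a pixel roll instead of a true rotation, random instead of gradient-based noise), not the authors' code:

```python
import numpy as np

# Illustrative sketch of a combined perturbation that spends half of each
# budget, as in the "1/2 rotation + 1/2 l_inf noise" example above.

def linf_noise(x, eps, rng):
    """Random sign noise of l_inf magnitude eps (a real attack uses gradients)."""
    return x + eps * rng.choice([-1.0, 1.0], size=x.shape)

def small_rotation(x, strength):
    """Crude stand-in for a small rotation: roll pixel columns."""
    return np.roll(x, shift=int(round(2 * strength)), axis=-1)

def combined(x, eps, rng):
    """Half-budget rotation followed by half-budget l_inf noise."""
    return linf_noise(small_rotation(x, strength=0.5), eps / 2.0, rng)

rng = np.random.default_rng(0)
x = np.zeros((4, 4))
x_adv = combined(x, eps=0.2, rng=rng)
```

Because each component uses only half its budget, the result stays within the union constraint while mixing both perturbation types on a single input.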


SLIDE 23

Conclusion

Adversarial training for multiple perturbation sets works, but...

  • Significant loss in robustness
  • Weak robustness to affine combinations of perturbations

Open questions:

  • Train a single MNIST model with high robustness to any ℓp noise
  • Better scaling of multi-perturbation adversarial training
  • Which perturbations do we care about?

https://arxiv.org/abs/1904.13000

Poster #87