Understanding and Mitigating the Tradeoff Between Robustness and Accuracy - PowerPoint PPT Presentation



SLIDE 1

Understanding and Mitigating the Tradeoff Between Robustness and Accuracy

Aditi Raghunathan* Sang Michael Xie* Fanny Yang John C. Duchi Percy Liang

Stanford University

SLIDE 2

Adversarial examples

[Goodfellow et al. 2015]

  • Standard training leads to models that are not robust
SLIDE 3

Adversarial examples

  • Adversarial training is a popular approach to improve robustness
  • It augments the training set on-the-fly with adversarial examples
SLIDE 4

Adversarial training increases standard error

Robust Accuracy: % of test examples correctly classified under an ℓ∞-bounded adversarial perturbation

Why is there a tradeoff between robustness and accuracy? We only augmented with more data!

CIFAR-10:

  Method                                           Robust Acc.   Standard Acc.
  Standard Training                                0%            95.2%
  TRADES Adversarial Training (Zhang et al. 2019)  55.4%         84.0%
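To make the metric concrete, here is a small illustrative check (not from the talk) for a linear classifier: against an ℓ∞-bounded attack of radius ε, the worst case is analytic, since a perturbation can reduce the margin y·(x⊤w) by at most ε‖w‖₁. The weights and data below are hypothetical.

```python
import numpy as np

# Toy illustration: standard vs. robust accuracy for a linear classifier
# sign(x @ w) under an l_inf attack of radius eps. For linear models the
# worst-case perturbation shrinks each margin by exactly eps * ||w||_1.
def standard_accuracy(X, y, w):
    return float(np.mean(y * (X @ w) > 0))

def robust_accuracy(X, y, w, eps):
    # a point stays correct iff its margin strictly exceeds eps * ||w||_1
    return float(np.mean(y * (X @ w) - eps * np.abs(w).sum() > 0))

w = np.array([1.0, 0.0])
X = np.array([[2.0, 0.0], [0.05, 0.0], [-3.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])
print(standard_accuracy(X, y, w))     # all three points correctly classified
print(robust_accuracy(X, y, w, 0.1))  # the small-margin point gets flipped
```

Robust accuracy is always at most standard accuracy, which is why the table above reports both columns.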

SLIDE 6

Prior hypotheses for the tradeoff

  • Optimal predictor not robust to adversarial perturbations [Tsipras et al. 2019]
  • But typical perturbations are imperceptible, so robustness should be possible
  • Hypothesis class not expressive enough [Nakkiran et al. 2019]
  • But neural networks are highly expressive, reaching 100% standard and robust training accuracy

These hypotheses suggest a tradeoff even in the infinite data limit…

More realistic settings: consistent perturbations, well-specified model family.
SLIDE 15

No tradeoff with infinite data

  • Observations (CIFAR-10):
  • The gap between robust and standard accuracies is large in the small-data regime
  • The gap decreases with labeled sample size
  • We ask: if we have consistent perturbations + a well-specified model family (no inherent tradeoff), why do we observe a tradeoff in practice?

SLIDE 17

Results overview

  • Characterize how training with consistent extra data can increase standard error even in well-specified noiseless linear regression
  • Analysis suggests robust self-training (RST) to mitigate the tradeoff [Carmon 2019, Najafi 2019, Uesato 2019]
  • Prove that robust self-training improves robust error without hurting standard error in the linear setting with unlabeled data
  • Empirically, RST improves robust and standard error across different adversarial training algorithms and adversarial perturbation types

SLIDE 20

Noiseless linear regression

  • Model: y = x⊤θ* (well-specified)
  • Standard data: X_std ∈ ℝ^(n×d), y_std = X_std θ*, with n ≪ d (overparameterized)
  • Extra data (adversarial examples): X_ext ∈ ℝ^(m×d), y_ext = X_ext θ* (consistent)
  • We study min-norm interpolants:
  • θ_std = argmin_θ { ‖θ‖₂ : X_std θ = y_std }
  • θ_aug = argmin_θ { ‖θ‖₂ : X_std θ = y_std, X_ext θ = y_ext }
  • Standard error: (θ − θ*)⊤ Σ (θ − θ*) for population covariance Σ
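As a sanity check, both interpolants can be computed with a pseudoinverse, which returns the minimum-ℓ₂-norm solution of an underdetermined linear system. The toy numbers below (d = 3, n = m = 1) are hypothetical, not from the talk.

```python
import numpy as np

# Hypothetical toy instance of the setup. np.linalg.pinv gives the
# minimum-norm solution when the system is underdetermined (n << d).
theta_star = np.array([1.0, 0.0, 1.0])      # true parameter
X_std = np.array([[0.0, 0.0, 1.0]])         # standard inputs (n x d)
y_std = X_std @ theta_star                  # noiseless labels
X_ext = np.array([[1.0, 1.0, 0.0]])         # consistent extra data
y_ext = X_ext @ theta_star

theta_std = np.linalg.pinv(X_std) @ y_std   # min-norm fit of standard data
X_aug = np.vstack([X_std, X_ext])
y_aug = np.concatenate([y_std, y_ext])
theta_aug = np.linalg.pinv(X_aug) @ y_aug   # min-norm fit of all data

def standard_error(theta, Sigma):
    # (theta - theta*)^T Sigma (theta - theta*)
    d = theta - theta_star
    return float(d @ Sigma @ d)
```

Both estimators interpolate their constraints exactly; they differ only in how they fill in the directions the data does not pin down.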

SLIDE 25

Example: when extra data hurts standard error

  • Min-norm interpolants + noiseless labels: recover θ* exactly on the span of the training data
  • Suppose the null space of X_std is spanned by {e₁, e₂}
  • θ_aug fits θ* in the x_ext direction and is 0 in the remaining null-space direction
  • If Σ has high weight on the e₂ direction, errors along e₂ are more costly ⇒ the augmented estimator has higher standard error
  • The paper has an exact characterization for the noiseless linear regression setting

(Figure: θ*, θ_std, θ_aug, and x_ext in the e₁–e₂ plane, comparing augmented vs. standard parameter error.)
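A minimal numeric instance of this geometry (assumed values chosen to mirror the slide's picture, not taken from the talk): X_std observes only e₃, the extra point lies in the null space along e₁ + e₂, and Σ puts high weight on e₂.

```python
import numpy as np

# Toy instance of the slide's geometry: the null space of X_std is
# span{e1, e2}; x_ext points along e1 + e2; Sigma weights e2 heavily.
theta_star = np.array([1.0, 0.0, 1.0])
X_std = np.array([[0.0, 0.0, 1.0]]); y_std = X_std @ theta_star
X_ext = np.array([[1.0, 1.0, 0.0]]); y_ext = X_ext @ theta_star
Sigma = np.diag([0.01, 10.0, 1.0])          # errors along e2 are costly

theta_std = np.linalg.pinv(X_std) @ y_std   # = (0, 0, 1)
X_aug = np.vstack([X_std, X_ext]); y_aug = np.concatenate([y_std, y_ext])
theta_aug = np.linalg.pinv(X_aug) @ y_aug   # = (0.5, 0.5, 1)

def standard_error(theta):
    d = theta - theta_star
    return float(d @ Sigma @ d)

# Fitting the extra point drags theta_aug along e2, where Sigma is large,
# so the augmented estimator has HIGHER standard error despite more data.
print(standard_error(theta_std))            # ≈ 0.01
print(standard_error(theta_aug))            # ≈ 2.5, much larger
```

The extra point is perfectly consistent with θ*, yet the min-norm inductive bias spends it in a direction the population cares about, which is exactly the slide's point.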

SLIDE 33

Mitigating the increase in error

  • Suppose we know the population covariance Σ has high weight on a null-space direction (say e₂)
  • To mitigate the error, regularize toward θ_std on that component: among the solutions that fit x_ext, pick the one that matches θ_std along e₂, giving the same error as θ_std there
  • Idea: use unlabeled data to estimate Σ

We show this is exactly Robust Self-Training!

(Figure: the space of solutions that fit x_ext, with the regularized estimator matching θ_std's error along e₂.)
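Continuing the toy instance from the earlier example (hypothetical numbers; in RST, Σ would be estimated from unlabeled data), one way to sketch the Σ-regularized fix is a change of variables that turns it into a plain min-norm problem.

```python
import numpy as np

# Sketch: instead of the min-l2-norm interpolant, pick the solution that
# fits ALL the data while staying closest to theta_std in the Sigma-norm:
#   min_theta (theta - theta_std)^T Sigma (theta - theta_std)
#   s.t.      X_aug theta = y_aug
theta_star = np.array([1.0, 0.0, 1.0])
X_std = np.array([[0.0, 0.0, 1.0]]); y_std = X_std @ theta_star
X_ext = np.array([[1.0, 1.0, 0.0]]); y_ext = X_ext @ theta_star
X_aug = np.vstack([X_std, X_ext]); y_aug = np.concatenate([y_std, y_ext])
Sigma = np.diag([0.01, 10.0, 1.0])

theta_std = np.linalg.pinv(X_std) @ y_std

# Whiten: with phi = Sigma^(1/2) (theta - theta_std), the objective becomes
# ||phi||_2, so the pseudoinverse again gives the minimizer.
A = np.sqrt(Sigma)                          # Sigma^(1/2), valid since diagonal
A_inv = np.diag(1.0 / np.diag(A))
phi = np.linalg.pinv(X_aug @ A_inv) @ (y_aug - X_aug @ theta_std)
theta_reg = theta_std + A_inv @ phi

def standard_error(theta):
    d = theta - theta_star
    return float(d @ Sigma @ d)
# theta_reg fits the extra data but moves along the cheap e1 direction,
# not the costly e2 direction, mitigating the increase in standard error.
```

Under this Σ, the regularized solution still interpolates everything yet its standard error stays below θ_std's, illustrating why estimating Σ from unlabeled data is worth it.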

SLIDE 40

Robust Self-Training (RST)

  • Recent semi-supervised algorithm that can be applied on top of existing adversarial training methods (Carmon et al., Najafi et al., Uesato et al.)
  • Labeled examples (x, y)
  • Unlabeled examples x̃ → standard predictor → pseudo-labels ỹ

Components of RST:

              Standard       Robust (extra adversarial data x_adv)
  Labeled     Fit (x, y)     Fit (x_adv, y)
  Unlabeled   Fit (x̃, ỹ)     Fit (x̃_adv, ỹ)

Theorem (informal): for noiseless linear regression, RST always improves both standard and robust errors
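The components above can be wired together in a minimal sketch. The synthetic data, the logistic model, and the FGSM-style linear attack below are stand-ins chosen for self-containedness, not the talk's actual models or datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_logreg(X, y, steps=300, lr=0.1):
    # plain gradient descent on the logistic loss, labels y in {-1, +1}
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        m = y * (X @ w)
        w -= lr * (-(y[:, None] * X) * (1.0 / (1.0 + np.exp(m)))[:, None]).mean(0)
    return w

def attack(X, y, w, eps):
    # worst-case l_inf perturbation of radius eps for a linear score:
    # shifts every margin down by exactly eps * ||w||_1
    return X - eps * y[:, None] * np.sign(w)[None, :]

def sample(n, d=20):
    X = rng.normal(size=(n, d))
    return X, np.sign(X[:, 0])        # label = sign of the first coordinate

X_lab, y_lab = sample(20)             # small labeled set
X_unl, _ = sample(500)                # large unlabeled set

# (1) standard predictor from labeled data
w_std = fit_logreg(X_lab, y_lab)
# (2) pseudo-label the unlabeled inputs
y_pseudo = np.sign(X_unl @ w_std)
# (3) adversarial training on labeled + pseudo-labeled data
X_all = np.vstack([X_lab, X_unl])
y_all = np.concatenate([y_lab, y_pseudo])
eps, w_rst = 0.1, np.zeros(X_all.shape[1])
for _ in range(10):                   # alternate attack / refit
    w_rst = fit_logreg(attack(X_all, y_all, w_rst, eps), y_all, steps=50)
```

The key structural point matches the table: the unlabeled rows reuse the standard predictor's pseudo-labels in both the standard and the robust terms.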

SLIDE 43

RST mitigates tradeoff in adversarial training

  • RST mitigates the tradeoff for adversarial training with both TRADES and PG-AT
  • Other semi-supervised approaches do not improve standard accuracy

CIFAR-10:

  Method                                            Robust Acc.   Standard Acc.
  Standard Training                                 0%            95.2%
  PG-AT (Madry et al. 2018)                         45.8%         87.3%
  TRADES (Zhang et al. 2019)                        55.4%         84.0%
  RST + PG-AT                                       58.5%         91.8%
  RST + TRADES                                      63.1%         89.7%
  Robust Consistency Training (Carmon et al. 2019)  56.5%         83.2%

SLIDE 46

RST mitigates tradeoff across perturbation types

  • Adversarial rotations + translations don't hurt standard error (Engstrom et al. 2019, Yang et al. 2019)
  • Even in this case, RST improves both standard and robust error

CIFAR-10:

  Method               Robust Acc.   Standard Acc.
  Standard Training    0.2%          94.6%
  Worst-of-10          73.9%         95.0%
  RST + Worst-of-10    75.1%         95.8%

SLIDE 47

Takeaways

  • We characterize the tradeoff in noiseless linear regression in the more realistic setting of no inherent tradeoff.
  • We show the effect of inductive bias in causing a tradeoff with finite data.
  • Using unlabeled data, we can mitigate the tradeoff via robust self-training (RST).

SLIDE 50

Thanks!

This work was funded by an Open Philanthropy Project Award and an NSF Frontier Award as part of the Center for Trustworthy Machine Learning (CTML). AR was supported by a Google Fellowship and an Open Philanthropy AI Fellowship. FY was supported by the Institute for Theoretical Studies ETH Zurich and the Dr. Max Rossler and the Walter Haefner Foundation. FY and JCD were supported by the Office of Naval Research Young Investigator Awards. SMX was supported by an NDSEG Fellowship.

We thank the following people for valuable comments and discussions: Tengyu Ma, Yair Carmon, Ananya Kumar, Pang Wei Koh, Fereshte Khani, Shiori Sagawa, Karan Goel.