
Calibrated Surrogate Losses for Adversarially Robust Classification

Han Bao (1,2), Clayton Scott (3), Masashi Sugiyama (2,1)
(1) The University of Tokyo, (2) RIKEN AIP, (3) University of Michigan
COLT 2020, Jul. 9th - 12th


  1. Title slide: Calibrated Surrogate Losses for Adversarially Robust Classification (Han Bao, Clayton Scott, Masashi Sugiyama; COLT 2020)

  2. Adversarial Attacks
     Adding imperceptibly small noise can fool classifiers! [Goodfellow+ 2015]
     [Figure: original data vs. perturbed data.]
     Goodfellow, I. J., Shlens, J., & Szegedy, C. (2015). Explaining and harnessing adversarial examples. In ICLR, 2015.

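  3. Penalize Vulnerable Prediction
     A prediction that lies too close to the decision boundary should be penalized.
     Usual classification (usual 0-1 loss):
     $$\ell_{01}(x, y, f) = \begin{cases} 1 & \text{if } y f(x) \le 0 \\ 0 & \text{otherwise} \end{cases}$$
     Robust classification (robust 0-1 loss):
     $$\ell_{\gamma}(x, y, f) = \begin{cases} 1 & \text{if } \exists \Delta \in \mathbb{B}_2(\gamma) .\ y f(x + \Delta) \le 0 \\ 0 & \text{otherwise} \end{cases}$$
     where $\mathbb{B}_2(\gamma) = \{ x \in \mathbb{R}^d \mid \|x\|_2 \le \gamma \}$ is the $\gamma$-ball.
     [Figure: two correctly classified points. Usual 0-1 loss: no penalty for either. Robust 0-1 loss: the point near the boundary is penalized, the other is not.]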

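  4. In Case of Linear Predictors
     Linear predictors: $\mathcal{F}_{\mathrm{lin}} = \{ x \mapsto \theta^\top x \mid \|\theta\|_2 = 1 \}$, with margin $\theta^\top x$.
     For linear predictors the robust 0-1 loss becomes a threshold on the margin:
     $$\ell_{\gamma}(x, y, f) = \begin{cases} 1 & \text{if } \exists \Delta \in \mathbb{B}_2(\gamma) .\ y f(x + \Delta) \le 0 \\ 0 & \text{otherwise} \end{cases} = \mathbf{1}\{ y f(x) \le \gamma \} := \phi_\gamma(y f(x))$$
     [Figure: points with $\theta^\top x > \gamma$ incur no penalty; points with $\theta^\top x \le \gamma$ are penalized.]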
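The reduction on this slide can be checked numerically: for a unit-norm $\theta$, the perturbation $\Delta = -\gamma y \theta$ lowers the margin by exactly $\gamma$, so evaluating the 0-1 loss at that worst case should agree with the threshold form. A minimal sketch in Python, assuming NumPy; the function names, the random toy data, and the value $\gamma = 0.3$ are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def robust_01_loss_linear(theta, x, y, gamma):
    """Closed form for unit-norm theta: 1{ y * theta^T x <= gamma }."""
    return float(y * (theta @ x) <= gamma)

def robust_01_loss_worst_case(theta, x, y, gamma):
    """0-1 loss at the worst perturbation Delta = -gamma * y * theta.

    For ||theta||_2 = 1 this Delta minimises y * theta^T (x + Delta)
    over the gamma-ball, giving y * theta^T x - gamma.
    """
    delta = -gamma * y * theta
    return float(y * (theta @ (x + delta)) <= 0.0)

# Hypothetical toy data (not from the slides).
rng = np.random.default_rng(0)
theta = rng.normal(size=5)
theta /= np.linalg.norm(theta)  # enforce ||theta||_2 = 1
gamma = 0.3
xs = rng.normal(size=(200, 5))
ys = rng.choice([-1.0, 1.0], size=200)

closed = np.array([robust_01_loss_linear(theta, x, y, gamma) for x, y in zip(xs, ys)])
worst = np.array([robust_01_loss_worst_case(theta, x, y, gamma) for x, y in zip(xs, ys)])
agree = bool(np.array_equal(closed, worst))
print("closed form matches worst-case evaluation:", agree)
```

The two evaluations can differ only when a margin lands within floating-point rounding of $\gamma$, which is why the comparison is made on random rather than hand-picked points.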

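  5. Formulation of Classification
     Usual classification: minimize the 0-1 risk
     $$R_{\phi_{01}}(f) = \mathbb{E}\left[ \phi_{01}(Y f(X)) \right], \qquad \phi_{01}(\alpha) = \mathbf{1}\{ \alpha \le 0 \} \quad \text{(0-1 loss)}$$
     Robust classification (restricted to linear predictors): minimize the $\gamma$-robust 0-1 risk
     $$R_{\phi_\gamma}(f) = \mathbb{E}\left[ \phi_\gamma(Y f(X)) \right], \qquad \phi_\gamma(\alpha) = \mathbf{1}\{ \alpha \le \gamma \} \quad \text{(robust 0-1 loss)}$$
     $\phi_{01}$ and $\phi_\gamma$ are not easy to optimize!
     [Figure: step plots of $\phi_{01}$ (regions: wrong, correct) and $\phi_\gamma$ (regions: wrong, non-robust, correct).]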
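Both risks on this slide are expectations of an indicator of the margin $Y f(X)$, so on a sample they differ only in the threshold. A minimal sketch in Python, assuming NumPy; the Gaussian toy data, the fixed predictor `theta`, and $\gamma = 0.5$ are hypothetical choices made only for illustration.

```python
import numpy as np

def empirical_risk(margins, threshold):
    """Sample average of 1{ margin <= threshold }."""
    return float(np.mean(margins <= threshold))

# Hypothetical toy data: class means at +/- e_1 with isotropic Gaussian noise.
rng = np.random.default_rng(0)
n, d, gamma = 1000, 2, 0.5
y = rng.choice([-1.0, 1.0], size=n)
x = y[:, None] * np.array([1.0, 0.0]) + 0.7 * rng.normal(size=(n, d))

theta = np.array([1.0, 0.0])   # unit-norm linear predictor
margins = y * (x @ theta)      # Y f(X) for f(x) = theta^T x

risk_01 = empirical_risk(margins, 0.0)        # estimate of the 0-1 risk
risk_robust = empirical_risk(margins, gamma)  # estimate of the gamma-robust 0-1 risk
print(f"0-1 risk: {risk_01:.3f}, robust 0-1 risk: {risk_robust:.3f}")
```

Since $\mathbf{1}\{\alpha \le 0\} \le \mathbf{1}\{\alpha \le \gamma\}$ for $\gamma \ge 0$, the robust estimate can never fall below the usual one on the same sample.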

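  6. What surrogate is desirable?
     Target loss $\psi$ (e.g. the 0-1 loss $\phi_{01}$): the final learning criterion, measured by the target risk $R_\psi(f)$.
     Surrogate loss $\phi$: easily optimizable, measured by the surrogate risk $R_\phi(f)$.
     Calibrated surrogate: for a sequence of predictors $f_m$, $R_\phi(f_m) \to R^*_\phi$ implies $R_\psi(f_m) \to R^*_\psi$ as $m \to \infty$.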

  7. What surrogate is calibrated?
     Usual classification (target: 0-1 loss $\phi_{01}$): a convex surrogate $\phi$ with $\phi'(0) < 0$ is calibrated [Bartlett+ 2006].
     Robust classification (target: robust 0-1 loss $\phi_\gamma$): which surrogate $\phi$ is calibrated?
     P. L. Bartlett, M. I. Jordan, & J. D. McAuliffe. (2006). Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473), 138-156.

  8. Short Course on Calibration Analysis: how to analyze the calibration property of a loss
     Ingo Steinwart. How to compare different loss functions and their risks. Constructive Approximation, 2007.

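  9. Conditional Risk and Calibration
     Conditional risk = risk at a single $x$. Starting from
     $$R_\phi(f) = \mathbb{E}_X\left[ \mathbb{P}(Y = +1 \mid X)\, \phi(f(X)) + \mathbb{P}(Y = -1 \mid X)\, \phi(-f(X)) \right],$$
     write $\eta := \mathbb{P}(Y = +1 \mid X)$ (class prob.) and $\alpha := f(X)$ (prediction), and define
     $$C_\phi(\alpha, \eta) := \eta\, \phi(\alpha) + (1 - \eta)\, \phi(-\alpha).$$
     Definition. A surrogate $\phi$ is $(\psi, \mathcal{F})$-calibrated for a target loss $\psi$ if for any $\varepsilon > 0$ there exists $\delta > 0$ such that for all $\alpha \in A_{\mathcal{F}}$ and $\eta \in [0,1]$,
     $$C_\phi(\alpha, \eta) - C^*_{\phi,\mathcal{F}}(\eta) < \delta \implies C_\psi(\alpha, \eta) - C^*_{\psi,\mathcal{F}}(\eta) < \varepsilon.$$
     The left-hand side is the surrogate excess conditional risk and the right-hand side is the target excess conditional risk, with $C^*_{\phi,\mathcal{F}}(\eta) := \inf_{\alpha \in A_{\mathcal{F}}} C_\phi(\alpha, \eta)$ and $A_{\mathcal{F}} := \{ f(x) \mid f \in \mathcal{F},\ x \in \mathcal{X} \}$.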
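The conditional risk $C_\phi(\alpha, \eta)$ is just a two-term weighted sum, so its minimizer can be inspected on a grid. A minimal sketch in Python, assuming NumPy; the hinge loss is used as the example surrogate, and the grid $[-2, 2]$ standing in for $A_{\mathcal{F}}$ and the value $\eta = 0.8$ are assumptions made for illustration.

```python
import numpy as np

def conditional_risk(phi, alpha, eta):
    """C_phi(alpha, eta) = eta * phi(alpha) + (1 - eta) * phi(-alpha)."""
    return eta * phi(alpha) + (1.0 - eta) * phi(-alpha)

def hinge(alpha):
    """Hinge loss [1 - alpha]_+ (example surrogate)."""
    return np.maximum(0.0, 1.0 - alpha)

eta = 0.8
alphas = np.linspace(-2.0, 2.0, 401)  # assumed stand-in for A_F
risks = conditional_risk(hinge, alphas, eta)

alpha_star = float(alphas[np.argmin(risks)])  # grid minimizer of the conditional risk
c_star = float(risks.min())                   # grid approximation of C*_phi(eta)
print(f"minimizer ~ {alpha_star:.2f}, minimal conditional risk ~ {c_star:.3f}")
```

For the hinge loss the minimal conditional risk is $2\min(\eta, 1-\eta)$, so with $\eta = 0.8$ the grid minimum should sit near $\alpha = 1$ with value close to $0.4$.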

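  10. Main Tool: Calibration Function
      Definition (calibration function).
      $$\delta(\varepsilon) = \inf_{\eta \in [0,1]}\ \inf_{\alpha \in A_{\mathcal{F}}}\ C_\phi(\alpha, \eta) - C^*_{\phi,\mathcal{F}}(\eta) \quad \text{s.t.} \quad C_\psi(\alpha, \eta) - C^*_{\psi,\mathcal{F}}(\eta) \ge \varepsilon$$
      That is, $\delta(\varepsilon)$ is the smallest surrogate excess conditional risk among all $(\alpha, \eta)$ whose target excess conditional risk is at least $\varepsilon$. Here $A_{\mathcal{F}} := \{ f(x) \mid f \in \mathcal{F},\ x \in \mathcal{X} \}$.
      ■ Provides iff condition: $\phi$ is $(\psi, \mathcal{F})$-calibrated $\iff$ $\delta(\varepsilon) > 0$ for all $\varepsilon > 0$.
      ■ Provides excess risk bound: $\phi$ is $(\psi, \mathcal{F})$-calibrated $\implies$ $R_\psi(f) - R^*_\psi \le (\delta^{**})^{-1}\left( R_\phi(f) - R^*_\phi \right)$, where $\delta^{**}$, the biconjugate of $\delta$, is monotonically increasing. The left side is the target excess risk and the argument on the right is the surrogate excess risk.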

  11. Example: Binary Classification ($\psi = \phi_{01}$, $\mathcal{F} = \mathcal{F}_{\mathrm{all}}$: all measurable functions)
      Theorem [Bartlett+ 2006]. If a surrogate $\phi$ is convex, it is $(\phi_{01}, \mathcal{F}_{\mathrm{all}})$-calibrated iff it is
      ▶ differentiable at 0, and
      ▶ $\phi'(0) < 0$.
      Examples:
      ▶ hinge loss: $\phi(\alpha) = [1 - \alpha]_+$, with $\delta(\varepsilon) = \varepsilon$
      ▶ squared loss: $\phi(\alpha) = (1 - \alpha)^2$, with $\delta(\varepsilon) = \varepsilon^2$
      [Figure: plots of $\delta$ against $\varepsilon$ on $[0,1]$ for both losses.]

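  12. Analysis of Robust Classification
      With the robust 0-1 loss $\phi_\gamma$ as the target and predictors restricted to linear ones, is any convex surrogate $\phi$ calibrated?
      [Figure: robust 0-1 loss with regions wrong, non-robust, correct, overlaid with a surrogate.]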

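  13. No convex calibrated surrogate
      Theorem. Any convex surrogate $\phi$ is not $(\phi_\gamma, \mathcal{F}_{\mathrm{lin}})$-calibrated.
      Proof sketch. The surrogate conditional risk is convex in $\alpha$, so for $\eta \approx 1/2$ its minimizer lies in the non-robust area $|\alpha| \le \gamma$. The calibration function
      $$\delta(\varepsilon) = \inf_{\eta \in [0,1]}\ \inf_{\alpha \in A_{\mathcal{F}}}\ C_\phi(\alpha, \eta) - C^*_{\phi,\mathcal{F}}(\eta) \quad \text{s.t.} \quad C_{\phi_\gamma}(\alpha, \eta) - C^*_{\phi_\gamma,\mathcal{F}}(\eta) \ge \varepsilon$$
      then takes the value 0 for some $\varepsilon > 0$, because a non-robust $\alpha$ attains the minimal surrogate conditional risk.
      [Figure: surrogate conditional risk plotted against $\alpha$ for $\eta \approx 1$, $\eta \approx 0$, and $\eta \approx 1/2$, with $-\gamma$ and $\gamma$ marked on the $\alpha$-axis separating the wrong, non-robust, and correct regions; for $\eta \approx 1/2$ the minimizer is non-robust.]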
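One way to make the proof sketch concrete is to work out the case $\eta = 1/2$ exactly. The derivation below is a sketch under two stated assumptions that are not spelled out on the slide: $0 \in A_{\mathcal{F}}$, and $A_{\mathcal{F}}$ contains some $\alpha$ with $|\alpha| > \gamma$.

```latex
% Target conditional risk at \eta = 1/2 for \phi_\gamma(\alpha) = 1\{\alpha \le \gamma\}:
C_{\phi_\gamma}(\alpha, \tfrac12)
  = \tfrac12 \mathbf{1}\{\alpha \le \gamma\} + \tfrac12 \mathbf{1}\{-\alpha \le \gamma\}
  = \begin{cases} 1 & \text{if } |\alpha| \le \gamma \\ \tfrac12 & \text{otherwise} \end{cases}
\quad\Longrightarrow\quad
C_{\phi_\gamma}(\alpha, \tfrac12) - C^*_{\phi_\gamma,\mathcal{F}}(\tfrac12) = \tfrac12
\ \text{ for } |\alpha| \le \gamma .

% Surrogate conditional risk at \eta = 1/2 is convex and even in \alpha:
C_\phi(\alpha, \tfrac12) = \tfrac12 \bigl( \phi(\alpha) + \phi(-\alpha) \bigr),
\qquad
C_\phi(0, \tfrac12)
  \le \tfrac12 C_\phi(\alpha, \tfrac12) + \tfrac12 C_\phi(-\alpha, \tfrac12)
  = C_\phi(\alpha, \tfrac12) .

% Hence \alpha = 0 is a minimizer with zero surrogate excess but target excess 1/2:
\delta(\varepsilon) = 0 \quad \text{for all } 0 < \varepsilon \le \tfrac12 .
```

The point $\alpha = 0$ satisfies the constraint in the calibration function while contributing zero surrogate excess, which is what rules out calibration for every convex $\phi$.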

  14. How to find calibrated surrogate?
      Idea. To keep the conditional risk from being minimized in the non-robust area, consider a surrogate $\phi$ whose conditional risk is quasiconcave (all superlevel sets are convex).
      [Figure: surrogate conditional risk plotted against $\alpha$ for $\eta \approx 1$, $\eta \approx 0$, and $\eta \approx 1/2$, with $-\gamma$ and $\gamma$ marked on the $\alpha$-axis separating the wrong, non-robust, and correct regions.]

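  15. Example: Shifted Ramp Loss
      Ramp loss: $\phi(\alpha) = \mathrm{clip}_{[0,1]}\left( \frac{1 - \alpha}{2} \right)$, with kinks at $\alpha = -1$ and $\alpha = 1$.
      Shifted ramp loss: $\phi_\beta(\alpha) = \mathrm{clip}_{[0,1]}\left( \frac{1 - \alpha + \beta}{2} \right)$, with kinks at $\alpha = -1 + \beta$ and $\alpha = 1 + \beta$; assume $0 < \beta < 1 - \gamma$.
      [Figure: conditional risk of the shifted ramp loss (for $\eta > 1/2$) and its calibration function.]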
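The contrast between a convex surrogate and the shifted ramp loss can be seen by approximating the calibration function on a grid, with the robust 0-1 loss $\phi_\gamma$ as the target. The Python sketch below assumes NumPy and is a brute-force approximation, not the closed-form analysis from the talk; the prediction range $A_{\mathcal{F}} = [-1, 1]$, the grid sizes, and the values $\gamma = 0.2$, $\beta = 0.5$, $\varepsilon = 0.1$ are all assumptions chosen for illustration.

```python
import numpy as np

def hinge(alpha):
    """Convex surrogate used for comparison: [1 - alpha]_+."""
    return np.maximum(0.0, 1.0 - alpha)

def shifted_ramp(alpha, beta):
    """Shifted ramp loss: clip_[0,1]((1 - alpha + beta) / 2)."""
    return np.clip((1.0 - alpha + beta) / 2.0, 0.0, 1.0)

def robust_01(alpha, gamma):
    """Robust 0-1 loss for linear predictors: 1{ alpha <= gamma }."""
    return (alpha <= gamma).astype(float)

def excess_conditional_risk(phi, alphas, etas):
    """Grid of C_phi(alpha, eta) - min_alpha C_phi(alpha, eta); rows are eta."""
    eta = etas[:, None]
    risk = eta * phi(alphas)[None, :] + (1.0 - eta) * phi(-alphas)[None, :]
    return risk - risk.min(axis=1, keepdims=True)

def calibration_function(surrogate, target, eps, alphas, etas):
    """Smallest surrogate excess among grid points with target excess >= eps."""
    sur = excess_conditional_risk(surrogate, alphas, etas)
    tgt = excess_conditional_risk(target, alphas, etas)
    feasible = tgt >= eps
    return float(sur[feasible].min()) if feasible.any() else float("inf")

gamma, beta, eps = 0.2, 0.5, 0.1          # illustrative; 0 < beta < 1 - gamma
alphas = np.linspace(-1.0, 1.0, 201)      # assumed A_F = [-1, 1]
etas = np.linspace(0.0, 1.0, 101)

target = lambda a: robust_01(a, gamma)
delta_hinge = calibration_function(hinge, target, eps, alphas, etas)
delta_ramp = calibration_function(lambda a: shifted_ramp(a, beta), target, eps, alphas, etas)
print(f"hinge: {delta_hinge:.3g}, shifted ramp: {delta_ramp:.3g}")
```

With these settings the hinge value comes out numerically zero, since at $\eta = 1/2$ its conditional risk is flat across the non-robust region, while the shifted ramp value stays strictly positive.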

  16. Summary: Calibrated Surrogate Losses for Adversarially Robust Classification
      ■ Robust classification = minimize robust 0-1 loss.
      ■ Calibrated surrogate loss: minimizing surrogate ⇒ minimizing target.
      ■ No convex calibrated surrogate under restriction to linear predictors, because the minimizer lies in the non-robust area.
      ■ Quasiconcavity is important. Example: shifted ramp loss.
      [Figure: conditional risk under linear predictors at $\mathbb{P}(Y = +1 \mid X) = 1/2$, with regions correct, non-robust, correct.]
