Minimax Statistical Learning with Wasserstein distances (Jaeho Lee & Maxim Raginsky)



SLIDE 1

Minimax Statistical Learning with Wasserstein distances

Jaeho Lee & Maxim Raginsky

NeurIPS 2018 Poster #86

SLIDE 5

“Minimax” learning

Goal: find the hypothesis minimizing the worst-case risk

  • domain drift (mismatch of training & test distribution)
  • adversarial attack (enhancing robustness of hypothesis)

The worst case is taken over an ambiguity set representing uncertainty, e.g. Γ(P, ϱ)

Approach: find the hypothesis minimizing the empirical risk

Question: what is the speed of convergence?

Focus on the 1-Wasserstein ambiguity ball! (We have results for p-Wasserstein balls, too! See Poster #86.)
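For reference, the quantities named on this slide can be written with the standard definitions. The notation below (the hypothesis class \mathcal{F}, the empirical distribution P_n, the metric d, the coupling set \Pi(P, Q), and the symbol R_{\varrho}) is an assumption chosen for illustration and is not taken from the slide itself:

```latex
\begin{align*}
% 1-Wasserstein distance: cheapest average transport cost over all couplings of P and Q
W_1(P, Q) &= \inf_{M \in \Pi(P, Q)} \mathbb{E}_{(Z, Z') \sim M}\bigl[ d(Z, Z') \bigr] \\
% Worst-case risk of hypothesis f over the ball of radius \varrho centered at P
R_{\varrho}(P, f) &= \sup_{Q \,:\, W_1(P, Q) \le \varrho} \mathbb{E}_{Z \sim Q}\bigl[ f(Z) \bigr] \\
% Empirical approach: minimize the worst-case risk around the empirical distribution P_n
\hat{f} &\in \operatorname*{arg\,min}_{f \in \mathcal{F}} R_{\varrho}(P_n, f)
\end{align*}
```

Under this reading, the convergence question asks how quickly R_{\varrho}(P, \hat{f}) approaches the best achievable worst-case risk over \mathcal{F} as the sample size grows.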

(Figure: ambiguity ball centered at P with radius ϱ)
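To make membership in a 1-Wasserstein ball concrete, here is a minimal sketch for the simplest possible setting: two one-dimensional samples of equal size with uniform weights, where the 1-Wasserstein distance reduces to the mean absolute difference between sorted samples. The function names `wasserstein_1` and `in_ambiguity_ball` are hypothetical and do not come from the talk.

```python
def wasserstein_1(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical distributions.

    Assumption: uniform weights and equal sample sizes, so the optimal
    transport plan simply matches the sorted samples in order.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def in_ambiguity_ball(center_samples, candidate_samples, rho):
    """Check whether the candidate lies within radius rho of the center."""
    return wasserstein_1(center_samples, candidate_samples) <= rho


train = [0.0, 1.0, 2.0]
shifted = [0.5, 1.5, 2.5]  # every point drifts by 0.5

print(wasserstein_1(train, shifted))               # 0.5
print(in_ambiguity_ball(train, shifted, rho=1.0))  # True
print(in_ambiguity_ball(train, shifted, rho=0.1))  # False
```

A drifted test distribution of this kind is covered by the worst-case guarantee only when the radius ϱ is at least as large as the drift.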

SLIDE 9

Taming the supremum

Main challenge is to handle the supremum.

Trick:

  (1) write down the dual form
  (2) empirical risk minimization is now joint minimization
  (3) gauge the complexity of the “set of all possible ”

With high probability,
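The dual-form trick can be sketched numerically. The sketch assumes the standard duality for 1-Wasserstein balls, in which the supremum over distributions becomes a one-dimensional minimization over a multiplier lam >= 0 of lam * rho + mean_i max_{z'} [f(z') - lam * |z_i - z'|]. The slide's own formula is not available, so this form, the finite candidate `support`, and the name `dual_worst_case_risk` are all assumptions made for illustration.

```python
def dual_worst_case_risk(samples, support, f, rho, lam_max=10.0, iters=200):
    """Approximate the worst-case risk over a 1-Wasserstein ball via its dual.

    Assumptions: 1-D data, adversarial points restricted to the finite
    `support`, and the optimal multiplier lying in [0, lam_max].
    Returns (dual value, minimizing lam).
    """
    def objective(lam):
        # Inner maximization: best adversarial move for each sample,
        # penalized by lam times the distance moved.
        total = 0.0
        for z in samples:
            total += max(f(zp) - lam * abs(z - zp) for zp in support)
        return lam * rho + total / len(samples)

    # The objective is convex in lam (a sum of maxima of affine functions),
    # so ternary search finds a minimizer.
    lo, hi = 0.0, lam_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    lam = 0.5 * (lo + hi)
    return objective(lam), lam


# Toy check: one sample at 0, loss f(z) = z, candidate points on [-2, 2].
grid = [i / 10 for i in range(-20, 21)]
value, lam = dual_worst_case_risk([0.0], grid, lambda z: z, rho=0.5)
print(value, lam)  # approximately 0.5, attained near lam = 1.0
```

In the toy check the answer matches the direct calculation: with a transport budget of 0.5 and a loss of slope 1, the adversary can raise the expected loss by at most 0.5. Minimizing over the hypothesis as well would turn this into the joint minimization over (hypothesis, lam) mentioned in step (2).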

SLIDE 10

Result

Theorem) Under mild assumptions, a bound holds with high probability.

  • The bound vanishes to 0 as the sample size grows.
  • It does not require Lipschitz-type assumptions on f.
  • A similar procedure could be applied to any ambiguity set with a suitable dual form.

Come to poster #86 for…

  • applications to domain adaptation
  • complementary generalization bound recovering classic bound as
  • results on p-Wasserstein balls