Minimax Statistical Learning with Wasserstein distances

Aug 03, 2023

SLIDE 1

Jaeho Lee & Maxim Raginsky

Minimax Statistical Learning

with Wasserstein distances

NeurIPS 2018 Poster #86

SLIDE 5

“Minimax” learning

Goal: find the hypothesis minimizing the worst-case risk

domain drift (mismatch of training & test distributions)
adversarial attack (enhancing robustness of the hypothesis)

… is an ambiguity set representing uncertainty, e.g. Γ(P, ϱ)

Approach: find the hypothesis minimizing the empirical risk

Question: what is the speed of convergence?

Focus on the 1-Wasserstein ambiguity ball! (We have results for p-Wasserstein balls, too! See Poster #86.)
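
For reference, the setup on this slide can be written out as below. This is a sketch under assumed notation, which may differ from the poster: Z is the instance space with metric d, F is the hypothesis class, P_n is the empirical distribution of n samples from P, and R_ϱ is a label introduced here for the worst-case risk over the ball Γ(P, ϱ).

```latex
% 1-Wasserstein distance (\Pi(P,Q) = couplings of P and Q) and ambiguity ball
W_1(P,Q) = \inf_{\pi \in \Pi(P,Q)} \mathbb{E}_{(Z,Z') \sim \pi}\big[ d(Z,Z') \big],
\qquad
\Gamma(P,\varrho) = \{\, Q : W_1(P,Q) \le \varrho \,\}

% Worst-case risk of a hypothesis f, and its empirical minimizer
R_\varrho(P,f) = \sup_{Q \in \Gamma(P,\varrho)} \mathbb{E}_Q\big[ f(Z) \big],
\qquad
\hat f \in \arg\min_{f \in \mathcal{F}} R_\varrho(P_n, f)

% The convergence question: how fast does this excess worst-case risk shrink with n?
R_\varrho(P,\hat f) - \inf_{f \in \mathcal{F}} R_\varrho(P,f)
```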

[Figure: ambiguity ball of radius ϱ centered at P]

SLIDE 9

Taming the supremum

The main challenge is to handle the supremum.

Trick: (1) write down the dual form
(2) empirical risk minimization is now joint minimization
(3) gauge the complexity of the “set of all possible ”
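
Steps (1) and (2) can be made concrete with the standard duality for Wasserstein balls, stated here for the 1-Wasserstein case as a sketch under regularity conditions that are not spelled out. The symbols λ (dual variable), φ (penalized loss), d (metric on the instance space Z) and P_n (empirical distribution of the samples Z_1, …, Z_n) are notation assumed for this illustration and may differ from the poster.

```latex
% (1) Dual form of the worst-case risk over a 1-Wasserstein ball of radius \varrho > 0
\sup_{Q :\, W_1(P,Q) \le \varrho} \mathbb{E}_Q\big[ f(Z) \big]
  = \min_{\lambda \ge 0} \Big\{ \lambda \varrho + \mathbb{E}_P\big[ \varphi_{\lambda,f}(Z) \big] \Big\},
\qquad
\varphi_{\lambda,f}(z) := \sup_{z' \in \mathcal{Z}} \big\{ f(z') - \lambda\, d(z,z') \big\}

% (2) Replacing P by P_n turns minimax ERM into a joint minimization over (f, \lambda)
\hat f \in \arg\min_{f \in \mathcal{F}} \; \min_{\lambda \ge 0}
  \Big\{ \lambda \varrho + \frac{1}{n} \sum_{i=1}^{n} \varphi_{\lambda,f}(Z_i) \Big\}
```

One natural reading of step (3), offered as an assumption, is that a generalization argument then has to control the complexity of the family of functions φ indexed by the pairs (λ, f) that can actually occur.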

With high probability,

SLIDE 10

Result

Theorem) Under mild assumptions, with high probability,

vanishes to 0 as the sample size grows.
does not require Lipschitz-type assumptions on f
a similar procedure could be applied to any ambiguity set with a suitable dual form
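
The dual-form procedure mentioned above can be tried numerically on a toy problem. The sketch below is not the authors' implementation: it assumes a bounded 1-D instance space [-3, 3], a squared loss (z - θ)², a finite candidate set standing in for the inner supremum, and plain grid search over both θ and the dual variable λ. All function names are hypothetical.

```python
import random


def phi(loss, z, lam, points):
    """Penalized loss: max over a finite set of z' of loss(z') - lam * |z - z'|."""
    return max(loss(c) - lam * abs(z - c) for c in points)


def dual_worst_case_risk(loss, samples, rho, lambdas, candidates):
    """Grid approximation of min_lam { lam * rho + mean_i phi(z_i) }, the dual form
    of the worst-case risk over a 1-Wasserstein ball around the empirical distribution."""
    # Include the samples themselves so that phi(z_i) >= loss(z_i) always holds.
    points = list(candidates) + list(samples)
    n = len(samples)
    return min(
        lam * rho + sum(phi(loss, z, lam, points) for z in samples) / n
        for lam in lambdas
    )


def minimax_erm(samples, thetas, rho, lambdas, candidates):
    """Joint minimization over (theta, lam) by exhaustive grid search."""
    def risk(theta):
        return dual_worst_case_risk(
            lambda z: (z - theta) ** 2, samples, rho, lambdas, candidates
        )
    best_theta = min(thetas, key=risk)
    return best_theta, risk(best_theta)


# Toy data: Gaussian samples clipped to the assumed bounded domain [-3, 3].
random.seed(0)
samples = [max(-3.0, min(3.0, random.gauss(0.5, 1.0))) for _ in range(30)]
candidates = [-3.0 + 0.1 * k for k in range(61)]  # grid over the domain
lambdas = [0.5 * k for k in range(21)]            # dual variable grid, 0 to 10
thetas = [-1.0 + 0.25 * k for k in range(9)]      # hypothesis grid, -1 to 1

theta_hat, worst_risk = minimax_erm(samples, thetas, 0.3, lambdas, candidates)
print(theta_hat, worst_risk)
```

With radius 0 the dual value collapses to the ordinary empirical risk, because the λ grid extends past the Lipschitz constant of the loss on this domain. Increasing the radius can only increase the reported worst-case risk.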

Come to poster #86 for…

applications to domain adaptation
a complementary generalization bound recovering the classic bound as
results on p-Wasserstein balls