On Minimax Optimality of GANs for Robust Mean Estimation


SLIDE 1

On Minimax Optimality of GANs for Robust Mean Estimation

Kaiwen Wu (1,2), with Gavin Weiguang Ding (3), Ruitong Huang (3) and Yaoliang Yu (1,2)
(1) University of Waterloo  (2) Vector Institute  (3) Borealis AI

Wu, Ding, Huang & Yu GANs for Robust Mean Estimation 1 / 13

SLIDE 2

The success of generative adversarial networks (GANs)

(Arjovsky et al. 2017; Goodfellow et al. 2014; Li et al. 2017; Miyato et al. 2018)

SLIDE 3

The success of generative adversarial networks (GANs)

(Arjovsky et al. 2017; Goodfellow et al. 2014; Li et al. 2017; Miyato et al. 2018)

But... what if the training data is noisy?

SLIDE 4

Huber's contamination model (Huber 1964): X₁, X₂, …, Xₙ ∼ (1 − ε) N(θ, I_p) + ε H

SLIDE 5

Huber's contamination model (Huber 1964): X₁, X₂, …, Xₙ ∼ (1 − ε) N(θ, I_p) + ε H

Goal: compute an estimator θ̂ ≈ θ
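As a concrete illustration (ours, not from the slides), drawing a contaminated sample from the Huber model takes a few lines; the point-mass contamination H below is an arbitrary illustrative choice:

```python
import numpy as np

def sample_huber(n, theta, eps, contaminate, rng):
    """Draw n points from the mixture (1 - eps) * N(theta, I_p) + eps * H."""
    p = theta.shape[0]
    X = theta + rng.standard_normal((n, p))    # inliers from N(theta, I_p)
    mask = rng.random(n) < eps                 # each point is contaminated w.p. eps
    X[mask] = contaminate(int(mask.sum()), p)  # outliers drawn from H
    return X

rng = np.random.default_rng(0)
theta = np.zeros(100)
point_mass = lambda m, p: np.full((m, p), 5.0)  # H: a point mass far from theta
X = sample_huber(2000, theta, eps=0.1, contaminate=point_mass, rng=rng)
```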

SLIDE 6

Goal: small worst-case error sup_H E‖θ̂ − θ‖ over contaminations H.

- Sample average: infinite error in the worst case
- Coordinate-wise median: √p ε error
- Tukey's median (Tukey 1975): optimal error ε, but NP-hard to compute
- Statistically optimal and computationally feasible estimators exist (Diakonikolas et al. 2016; Lai et al. 2016)
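A quick simulation (ours, not the authors') makes the gap concrete: a distant point mass drags the sample average far from θ, while the coordinate-wise median stays within roughly √p ε:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, eps = 100, 5000, 0.1
theta = np.zeros(p)

X = theta + rng.standard_normal((n, p))  # inliers from N(theta, I_p)
X[: int(eps * n)] = 10.0                 # 10% contamination at a distant point mass

mean_err = np.linalg.norm(X.mean(axis=0) - theta)          # dragged by outliers
median_err = np.linalg.norm(np.median(X, axis=0) - theta)  # roughly sqrt(p) * eps
print(mean_err, median_err)
```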

SLIDE 7

Robust Mean Estimation via GANs

θ̂ := argmin_η sup_{T ∈ T} E_data[T(X)] − E_{N(η, I_p)}[s(T(Y))]

N(η, I_p) is the generator; T is the discriminator, drawn from the function class T.

Which discriminator class T guarantees small estimation error?
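To see how the choice of T shapes the estimator, here is a toy example of ours (with s taken to be the identity): for the linear class {x ↦ vᵀx : ‖v‖₂ ≤ 1}, the inner sup has the closed form ‖mean(X) − η‖₂, so the GAN estimator degenerates to the sample average, exactly the non-robust baseline; richer discriminator classes are what make robustness possible.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((1000, 5)) + 1.0   # clean data from N(1, I_5)

def inner_sup(eta, X):
    # sup_{||v||<=1} E_data[v.T X] - E_{N(eta,I)}[v.T Y] = ||mean(X) - eta||
    return np.linalg.norm(X.mean(axis=0) - eta)

# Minimizing over eta by (sub)gradient descent recovers the sample average
eta = np.zeros(5)
for _ in range(500):
    diff = X.mean(axis=0) - eta
    eta += 0.05 * diff / (np.linalg.norm(diff) + 1e-12)
print(np.linalg.norm(eta - X.mean(axis=0)))
```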

SLIDE 8

f-GAN (Nowozin et al. 2016)

Discriminator is a one-hidden-layer network:
T = { Σ_{i=1}^l wᵢ σ(uᵢᵀx + bᵢ) : ‖w‖₁ ≤ κ }

Theorem (f-GAN). Under mild assumptions on the activations, with high probability
‖θ̂ₙ − θ‖ ≲ √(p/n) ∨ ε.

This generalizes the results of Gao et al. (2019) on TV-GAN and JS-GAN.
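A sketch of ours of this discriminator class (sigmoid is an illustrative activation choice, and the sort-based ℓ1 projection is a standard way to enforce ‖w‖₁ ≤ κ, not something the slides specify):

```python
import numpy as np

def discriminator(x, w, U, b):
    """T(x) = sum_i w_i * sigmoid(u_i.T x + b_i): one hidden layer."""
    return w @ (1.0 / (1.0 + np.exp(-(U @ x + b))))

def project_l1(w, kappa):
    """Euclidean projection of w onto the l1 ball of radius kappa."""
    if np.abs(w).sum() <= kappa:
        return w
    u = np.sort(np.abs(w))[::-1]               # magnitudes, descending
    css = np.cumsum(u) - kappa
    rho = np.nonzero(u * np.arange(1, w.size + 1) > css)[0][-1]
    tau = css[rho] / (rho + 1.0)
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

w = project_l1(np.array([3.0, -2.0, 0.5]), kappa=1.0)  # projects to [1, 0, 0]
```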

SLIDE 9

MMD-GAN (Dziugaite et al. 2015; Li et al. 2017)

Discriminator is the unit ball of an RKHS: T = { f ∈ H_k : ‖f‖_{H_k} ≤ 1 }
We focus on the Gaussian kernel: k(x, y) = exp(−‖x − y‖² / (2σ²))

Theorem. With appropriate tuning of the bandwidth (σ = √p),
‖θ̂ₙ − θ‖ ≲ √(p/n) ∨ √p ε.
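A small numerical sketch of ours (a crude line search stands in for the full optimization, and the point-mass contamination is an arbitrary choice): estimate θ by minimizing the empirical MMD between the data and a sample from N(η, I_p):

```python
import numpy as np

def mmd2(X, Y, sigma):
    """Biased empirical squared MMD with the Gaussian kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return k(X, X).mean() - 2.0 * k(X, Y).mean() + k(Y, Y).mean()

rng = np.random.default_rng(4)
p, n, eps = 4, 300, 0.1
sigma = np.sqrt(p)                          # bandwidth sigma = sqrt(p)
X = rng.standard_normal((n, p)) + 2.0       # inliers: theta = (2, ..., 2)
X[: int(eps * n)] = 8.0                     # contamination at a point mass
Z = rng.standard_normal((n, p))             # fixed noise: Y = eta + Z ~ N(eta, I_p)

# crude line search for eta along the all-ones direction
ts = np.linspace(0.0, 4.0, 81)
vals = [mmd2(X, t * np.ones(p) + Z, sigma) for t in ts]
eta_hat = ts[int(np.argmin(vals))] * np.ones(p)
print(np.linalg.norm(eta_hat - 2.0))
```

The Gaussian kernel downweights the distant outliers exponentially, so the minimizer lands near the true mean here; the theorem's √p ε term shows this protection degrades with dimension.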

SLIDE 10

MMD-GAN (Dziugaite et al. 2015; Li et al. 2017)

Discriminator is the unit ball of an RKHS: T = { f ∈ H_k : ‖f‖_{H_k} ≤ 1 }
We focus on the Gaussian kernel: k(x, y) = exp(−‖x − y‖² / (2σ²))

Theorem. With appropriate tuning of the bandwidth (σ = √p),
‖θ̂ₙ − θ‖ ≲ √(p/n) ∨ √p ε.

Theorem. For any bandwidth σ, there exists a contamination H such that
‖θ̂ − θ‖ ≳ √p ε.

SLIDE 11

Simulation

Figure: (a) estimation error ‖θ̂ − θ‖ against ‖θ̃ − θ‖/√p for bandwidths σ ∈ {5, 7.5, 10, 15, 20} and contamination locations θ̃, in dimension 100; (b) estimation error against √p (from 3 to 10) for different dimensions p, with σ = √p.
SLIDE 12

Wasserstein GAN (Arjovsky et al. 2017)

Discriminator is the class of 1-Lipschitz functions: T = { f : |f(x) − f(y)| ≤ ‖x − y‖, ∀x, y ∈ X }.

Theorem. In one dimension, the estimation error is bounded: |θ̂ − θ| ≍ ε.
In high dimensions, ‖θ̂ − θ‖ ≍ √p ε empirically.
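The one-dimensional claim can be checked numerically (a sketch of ours, with an arbitrary point-mass contamination): W₁ between the empirical measure and N(η, 1) is approximated by matching sorted samples, and the minimizer over η of Σᵢ |x₍ᵢ₎ − (η + g₍ᵢ₎)| is the median of the residuals x₍ᵢ₎ − g₍ᵢ₎, so the error stays O(ε) even though 10% of the data sits at 50:

```python
import numpy as np

rng = np.random.default_rng(5)
n, eps, theta = 2001, 0.1, 3.0
x = theta + rng.standard_normal(n)
x[: int(eps * n)] = 50.0             # 10% contamination at a distant point

g = np.sort(rng.standard_normal(n))  # sorted reference sample from N(0, 1)
# argmin_eta sum_i |x_(i) - (eta + g_(i))| is the median of the residuals
eta_hat = np.median(np.sort(x) - g)
print(abs(eta_hat - theta))
```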

SLIDE 13

Minimizing Wasserstein distance directly by Sinkhorn divergence

Figure: (a) WGAN in 1 dimension; (b) WGAN in p dimensions: estimation error ‖θ̂ − θ‖ against √p (from 2 to 10) for regularization λ ∈ {0.1, 0.05, 0.01}.
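A minimal Sinkhorn sketch of ours in one dimension (absolute-distance cost to mimic W₁; the regularization λ is larger than the values in the plot purely for numerical stability, since small λ would need log-domain iterations):

```python
import numpy as np

def sinkhorn_cost(x, y, lam, iters=200):
    """Entropic OT cost <P, C> between two 1-d samples, cost |x - y|."""
    n, m = x.size, y.size
    C = np.abs(x[:, None] - y[None, :])
    K = np.exp(-C / lam)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(iters):                           # Sinkhorn iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                  # transport plan
    return float((P * C).sum())

rng = np.random.default_rng(6)
n, eps, theta = 200, 0.1, 3.0
x = theta + rng.standard_normal(n)
x[: int(eps * n)] = 8.0                              # 10% contamination
z = rng.standard_normal(n)                           # Y = t + z ~ N(t, 1)

ts = np.linspace(2.0, 4.0, 21)                       # crude line search over t
costs = [sinkhorn_cost(x, t + z, lam=0.5) for t in ts]
t_hat = ts[int(np.argmin(costs))]
print(abs(t_hat - theta))
```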

SLIDE 14

Extension of f-GAN

- Unknown covariance
- Sparse mean estimation: θ has at most s nonzero entries; sparsity constraints on both the discriminator and the generator

Discriminator: T = { Σ_{i=1}^l wᵢ σ(uᵢᵀx + bᵢ) : ‖w‖₁ ≤ κ, ‖uᵢ‖₀ ≤ 2s }

Generator: { N(η, I_p) : ‖η‖₀ ≤ s }

Theorem. ‖θ̂ₙ − θ‖ ≍ √(s log(ep/s) / n) ∨ ε.
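The benefit of exploiting sparsity can be sanity-checked with a much simpler (non-GAN) stand-in of ours: a coordinate-wise median followed by hard thresholding to the s largest-magnitude coordinates:

```python
import numpy as np

def sparse_robust_mean(X, s):
    """Coordinate-wise median, hard-thresholded to its s largest-magnitude coordinates."""
    med = np.median(X, axis=0)
    keep = np.argsort(np.abs(med))[-s:]  # indices of the s largest coordinates
    out = np.zeros_like(med)
    out[keep] = med[keep]
    return out

rng = np.random.default_rng(7)
p, n, s, eps = 200, 1000, 5, 0.05
theta = np.zeros(p)
theta[:s] = 3.0                          # s-sparse true mean
X = theta + rng.standard_normal((n, p))
X[: int(eps * n)] = -4.0                 # 5% contamination at a point mass
theta_hat = sparse_robust_mean(X, s)
print(np.linalg.norm(theta_hat - theta))
```

Thresholding zeroes out the noise in the p − s null coordinates, which is where the √(s log(ep/s)/n) rate (rather than √(p/n)) comes from.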

SLIDE 15

Simulation

Figure: (a) varying sparsity s; (b) sparse vs. non-sparse estimation.

SLIDE 16

Summary

Characterize minimax optimality of several GAN formulations
- Complete characterization of the discriminator function class
- Computational complexity of GANs
