When does the Tukey Median work?
Banghua Zhu, with Jiantao Jiao and Jacob Steinhardt
Department of EECS and Statistics, University of California, Berkeley
ISIT 2020, June 21, 2020

Robust mean estimation - mean and median in 1D
Mean estimation in the presence of additive corruption (outliers) (Huber, 1973).

Median in high dimension?
Tukey depth:
$$D_{\mathrm{Tukey}}(\mu, p) = \inf_{v \in \mathbb{R}^d} p\left(v^\top (X - \mu) \ge 0\right).$$
Tukey median (Tukey, 1975): the point(s) with the largest Tukey depth:
$$T(p) = \operatorname*{argmax}_{\mu \in \mathbb{R}^d} D_{\mathrm{Tukey}}(\mu, p).$$
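A minimal sketch of the depth definition above for an empirical distribution in two dimensions. The function name `tukey_depth_2d` and the angle grid are illustrative assumptions, not part of the talk; scanning a finite set of directions can only overestimate the true infimum over all $v$.

```python
import math

def tukey_depth_2d(mu, points, n_angles=360):
    """Approximate empirical Tukey depth of `mu` for a list of 2-D points.

    Scans n_angles directions v = (cos a, sin a) and returns the smallest
    fraction of points x with v . (x - mu) >= 0. Because only finitely many
    directions are checked, the result is an upper bound on the exact depth.
    """
    n = len(points)
    best = 1.0
    for k in range(n_angles):
        a = 2.0 * math.pi * k / n_angles
        vx, vy = math.cos(a), math.sin(a)
        count = sum(
            1 for (x, y) in points
            if vx * (x - mu[0]) + vy * (y - mu[1]) >= 0
        )
        best = min(best, count / n)
    return best
```

A point far outside the data cloud has depth 0, while a central point of a roughly symmetric cloud has depth close to 1/2; an approximate Tukey median could be found by maximizing this function over candidate values of `mu`.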

Preliminaries - corruption model
Two corruption models: additive corruption and total variation (TV) corruption. TV corruption is the stronger of the two: the adversary may remove probability mass as well as add it.

Preliminaries - assumption on the true distribution $p^*$
Halfspace-symmetric distributions (Zuo and Serfling, 2000; Chen, Tyler, et al., 2002): there exists a point $\mu \in \mathbb{R}^d$ such that for $X \sim p^*$,
$$\forall v \in \mathbb{R}^d, \quad v^\top (X - \mu) \overset{d}{=} -v^\top (X - \mu).$$
Example: Gaussian.

Preliminaries - performance metric
Maximum bias for the Tukey median: the maximum distance between $T(p)$ and $T(p^*)$, where $p$ ranges over the set of all possible level-$\varepsilon$ corruptions:
$$b_{\mathrm{add}}(p^*, \varepsilon) = \sup_{p \in C_{\mathrm{add}}(p^*, \varepsilon),\ x \in T(p),\ y \in T(p^*)} \|x - y\|,$$
$$b_{\mathsf{TV}}(p^*, \varepsilon) = \sup_{\mathsf{TV}(p^*, p) \le \varepsilon,\ x \in T(p),\ y \in T(p^*)} \|x - y\|.$$

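Preliminaries - performance metric
Breakdown point: the minimum corruption level that can drive the maximum bias to infinity:
$$\varepsilon^*_{\mathrm{add}}(p^*) = \inf\{\varepsilon \mid b_{\mathrm{add}}(p^*, \varepsilon) = \infty\}, \qquad \varepsilon^*_{\mathsf{TV}}(p^*) = \inf\{\varepsilon \mid b_{\mathsf{TV}}(p^*, \varepsilon) = \infty\}.$$
Breakdown point for a family of distributions $\mathcal{G}$:
$$\varepsilon^*_{\mathrm{add}}(\mathcal{G}) = \inf_{q \in \mathcal{G}} \varepsilon^*_{\mathrm{add}}(q), \qquad \varepsilon^*_{\mathsf{TV}}(\mathcal{G}) = \inf_{q \in \mathcal{G}} \varepsilon^*_{\mathsf{TV}}(q).$$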
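To make the breakdown-point definition concrete, here is a hedged one-dimensional illustration, where the Tukey median reduces to the ordinary median. It assumes $p^* = N(0, 1)$ and one particular additive adversary that places all of its $\varepsilon$ mass far to the right; the function name is hypothetical. The resulting shift is a lower bound on $b_{\mathrm{add}}(p^*, \varepsilon)$ for this example, not a statement taken from the talk.

```python
import math
from statistics import NormalDist

def median_shift_point_mass(eps):
    """Median of (1 - eps) * N(0, 1) + eps * (point mass far to the right).

    The median m solves (1 - eps) * Phi(m) = 1/2, which has a finite
    solution only when eps < 1/2. Assumption: the outlier mass sits to the
    right of m, which holds for a sufficiently distant point mass.
    """
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    if eps >= 0.5:
        return math.inf  # the median can be dragged arbitrarily far
    return NormalDist().inv_cdf(0.5 / (1.0 - eps))
```

The shift grows smoothly with `eps` and becomes infinite at `eps = 0.5`, matching the breakdown point of $1/2$ stated for $d = 1$.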

Previous Results
Breakdown point under additive corruption (Donoho, 1982; Donoho and Gasko, 1992):
[Figure: breakdown point versus dimension (1 to 7). Curves: Tukey+additive+symmetric, Tukey+additive+general. Marked levels: 1/3, 1/(d+1).]

Our contribution
Breakdown point under TV corruption:
[Figure: breakdown point versus dimension (1 to 7). Curves: projection+TV+symmetric, Tukey+TV+symmetric, Tukey+additive+symmetric, Tukey+additive+general. Marked levels: 1/2, 1/3, 1/4, 1/(d+1).]
Characterization of the maximum bias in the population and finite-sample cases: both algorithms (the Tukey median and the projection algorithm) achieve near-optimal maximum bias $\Theta(\varepsilon)$ under TV corruption when $\varepsilon < 0.249$ for Gaussian distributions.

Main results - Breakdown point
Theorem (Breakdown point for Tukey median (Zhu, Jiao, and Steinhardt, 2020, Theorem 1)). Denote by $\mathcal{G}$ the set of all halfspace-symmetric distributions. Then the breakdown point for $\mathcal{G}$ is
$$\varepsilon^*_{\mathrm{add}}(\mathcal{G}) = \begin{cases} 1/2, & d = 1 \\ 1/3, & d \ge 2 \end{cases}, \qquad \varepsilon^*_{\mathsf{TV}}(\mathcal{G}) = \begin{cases} 1/2, & d = 1 \\ 1/3, & d = 2 \\ 1/4, & d \ge 3 \end{cases}.$$
Proof of upper bound via figures: [figures not included]

Main results - Maxbias
Theorem (Maximum bias under finite-sample TV corruption model (Zhu, Jiao, and Steinhardt, 2020, Theorem 3)). Assume $p^*$ is halfspace-symmetric centered at $\mu^*$ with decay function
$$h(t) = \sup_{v \in \mathbb{R}^d,\ \|v\|_* \le 1} p^*\left(v^\top (X - \mu^*) > t\right).$$
Denote by $\hat p_n$ the empirical distribution of $n$ samples from an $\varepsilon$-TV corrupted distribution $p$. When $d \ge 3$, there exists a universal constant $C > 0$ such that, with probability at least $1 - \delta$, for any $\hat\mu \in T(\hat p_n)$,
$$\|\hat\mu - \mu^*\| \le h^{-1}\left(1 - h(0) - 2\tilde\varepsilon\right) \tag{1}$$
when $2\tilde\varepsilon < 1 - h(0)$, where
$$\tilde\varepsilon = \varepsilon + C \cdot \sqrt{\frac{d + 1 + \log(1/\delta)}{n}}$$
and $h^{-1}$ is the generalized inverse function of $h$.
As $n \to \infty$, this recovers the population result.
The result can be generalized to other cases.
Since $h(0) \le 1/2$, the theorem implies a $1/4$ lower bound on the breakdown point.
For Gaussian $p^*$, $h(t) = 1/2 - \Theta(t)$ for small $t$, so the maximum bias is $O(\varepsilon)$ when $n = \Omega(d/\varepsilon^2)$.
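The Gaussian remark above can be checked numerically. The sketch below assumes a standard Gaussian $p^*$ with the Euclidean norm, so that $h(t) = 1 - \Phi(t)$, $h(0) = 1/2$, and the right-hand side of (1) becomes $\Phi^{-1}(1/2 + 2\tilde\varepsilon)$. The theorem does not give a value for the universal constant $C$; `C=1.0` is a placeholder assumption, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def eps_tilde(eps, n, d, delta, C=1.0):
    """Effective corruption level: eps + C * sqrt((d + 1 + log(1/delta)) / n).

    C is the unspecified universal constant of the theorem; 1.0 is only a
    placeholder for illustration.
    """
    return eps + C * math.sqrt((d + 1 + math.log(1.0 / delta)) / n)

def tukey_bound_gaussian(eps, n, d, delta, C=1.0):
    """Right-hand side of bound (1) for a standard Gaussian p*.

    With h(t) = 1 - Phi(t), the inverse satisfies
    h^{-1}(1/2 - 2 * e) = Phi^{-1}(1/2 + 2 * e), valid while 2 * e < 1/2.
    """
    e = eps_tilde(eps, n, d, delta, C)
    if 2.0 * e >= 0.5:
        return math.inf  # condition 2 * eps_tilde < 1 - h(0) fails
    return NormalDist().inv_cdf(0.5 + 2.0 * e)
```

For small `eps` and large `n` the returned value scales roughly linearly in `eps`, and it becomes infinite once the effective corruption level reaches $1/4$.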

Main results - Maxbias (proof sketch)
Proof sketch of the population case:
Lemma (Zhu, Jiao, and Steinhardt (2020, Lemma 1)). If $D_{\mathrm{Tukey}}(T(p), p^*) \ge \alpha$, we have
$$\|T(p) - \mu^*\| \le h^{-1}(\alpha). \tag{2}$$
For the TV corruption model, we have
$$D_{\mathrm{Tukey}}(T(p), p^*) \ge D_{\mathrm{Tukey}}(T(p), p) - \varepsilon \ge D_{\mathrm{Tukey}}(\mu^*, p) - \varepsilon \ge D_{\mathrm{Tukey}}(\mu^*, p^*) - 2\varepsilon = 1 - h(0) - 2\varepsilon.$$
For the finite-sample case, it suffices to lower bound $D_{\mathrm{Tukey}}(\hat\mu, p^*)$, $\hat\mu \in T(\hat p_n)$, using a standard concentration argument.
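The chain of inequalities in the proof sketch can be annotated step by step. The justifications below are a reading of the sketch under the definitions of $D_{\mathrm{Tukey}}$, $T$, and $h$ given earlier, not text taken from the slides.

```latex
\begin{align*}
D_{\mathrm{Tukey}}(T(p), p^*)
  &\ge D_{\mathrm{Tukey}}(T(p), p) - \varepsilon
  && \text{(i) } |p(A) - p^*(A)| \le \varepsilon \text{ for every halfspace } A \\
  &\ge D_{\mathrm{Tukey}}(\mu^*, p) - \varepsilon
  && \text{(ii) } T(p) \text{ maximizes } D_{\mathrm{Tukey}}(\cdot, p) \\
  &\ge D_{\mathrm{Tukey}}(\mu^*, p^*) - 2\varepsilon
  && \text{(iii) as in (i), applied at } \mu^* \\
  &= 1 - h(0) - 2\varepsilon
  && \text{(iv) shown next}
\end{align*}
% Step (iv): take complements, then replace v by -v.
\begin{align*}
D_{\mathrm{Tukey}}(\mu^*, p^*)
  &= \inf_{v} p^*\!\left(v^\top (X - \mu^*) \ge 0\right)
   = 1 - \sup_{v} p^*\!\left(v^\top (X - \mu^*) < 0\right) \\
  &= 1 - \sup_{v} p^*\!\left(v^\top (X - \mu^*) > 0\right)
   = 1 - h(0).
\end{align*}
```

The last equality uses the fact that the event $\{v^\top (X - \mu^*) > 0\}$ does not change when $v$ is rescaled, so restricting to $\|v\|_* \le 1$ leaves the supremum unchanged. Applying the lemma with $\alpha = 1 - h(0) - 2\varepsilon$ then gives the population version of bound (1).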

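Main results - Projection algorithm
Consider the halfspace metric defined in Donoho and Liu (1988) as
$$\widetilde{\mathsf{TV}}(p, q) = \sup_{v \in \mathbb{R}^d,\ t \in \mathbb{R}} \left| p(v^\top X \ge t) - q(v^\top X \ge t) \right|. \tag{3}$$
Let $\mathcal{G}(h)$ be the set of halfspace-symmetric distributions:
$$\mathcal{G}(h) = \left\{ p \;\middle|\; \exists \mu \in \mathbb{R}^d,\ X \sim p \text{ is halfspace-symmetric around } \mu \text{ and } \sup_{v \in \mathbb{R}^d,\ \|v\|_* \le 1} p\left(v^\top (X - \mu) > t\right) \le h(t) \right\}. \tag{4}$$
The projection algorithm outputs $\hat\mu(p) = T(q)$:
[Figure: the corrupted distribution $\hat p_n$, within TV distance $\varepsilon$ of $p^* \in \mathcal{G}$, is projected under $\widetilde{\mathsf{TV}}$ onto $\mathcal{G}$ to obtain $q \in \mathcal{G}$.]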
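As a rough illustration of the halfspace metric (3), the sketch below approximates it between two empirical distributions in two dimensions. The function name and the direction grid are assumptions for illustration; for each direction the inner maximum over $t$ is exact, but the finite set of directions means the result is a lower bound on the true supremum. This covers only the metric, not the projection onto $\mathcal{G}(h)$.

```python
import math
from bisect import bisect_left

def halfspace_distance_2d(P, Q, n_angles=180):
    """Approximate sup over (v, t) of |P(v.X >= t) - Q(v.X >= t)| for 2-D samples.

    P and Q are lists of (x, y) points. For each direction on a grid, both
    samples are projected onto v and the largest gap between the two
    empirical upper-tail fractions is found by checking every projected value.
    """
    best = 0.0
    for k in range(n_angles):
        a = 2.0 * math.pi * k / n_angles
        vx, vy = math.cos(a), math.sin(a)
        pp = sorted(vx * x + vy * y for (x, y) in P)
        qq = sorted(vx * x + vy * y for (x, y) in Q)
        for t in pp + qq:
            # bisect_left counts values strictly below t, so the remainder
            # is the number of projections that are >= t.
            frac_p = (len(pp) - bisect_left(pp, t)) / len(pp)
            frac_q = (len(qq) - bisect_left(qq, t)) / len(qq)
            best = max(best, abs(frac_p - frac_q))
    return best
```

Identical samples are at distance 0, and two samples that can be separated by a line are at distance 1.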

Main results - Projection algorithm
Theorem (Maximum bias and breakdown point for projection algorithm (Zhu, Jiao, and Steinhardt, 2020, Theorem 3)). Assume the true distribution $p^*$ is halfspace-symmetric centered at $\mu^*$ with decay function
$$h(t) = \sup_{v \in \mathbb{R}^d,\ \|v\|_* \le 1} p^*\left(v^\top (X - \mu^*) > t\right).$$
Then for any $p$ with $\mathsf{TV}(p^*, p) \le \varepsilon$, the projection estimator $\hat\mu(p)$ satisfies
$$\|\hat\mu - \mu^*\| \le 2 h^{-1}(1/2 - \varepsilon) \tag{5}$$
when $\varepsilon < 1/2$. Here $h^{-1}$ is the generalized inverse function of $h$.
This improves the breakdown point under TV corruption in high dimensions from $1/4$ for the Tukey median to $1/2$, which is optimal among all translation-equivariant estimators (Rousseeuw and Leroy, 2005, Equation 1.38).
The result can be extended to the finite-sample case using a similar argument.
The estimator achieves $O(\varepsilon)$ maximum bias for Gaussians.
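Bound (5) can be evaluated in closed form for a standard Gaussian $p^*$ with the Euclidean norm, an assumption made here only for illustration. In that case $h(t) = 1 - \Phi(t)$, so $h^{-1}(1/2 - \varepsilon) = \Phi^{-1}(1/2 + \varepsilon)$. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def projection_bound_gaussian(eps):
    """Right-hand side of bound (5), 2 * h^{-1}(1/2 - eps), for N(0, I).

    Uses h(t) = 1 - Phi(t), hence h^{-1}(1/2 - eps) = Phi^{-1}(1/2 + eps).
    The bound is finite only for eps < 1/2.
    """
    if eps < 0.0:
        raise ValueError("eps must be non-negative")
    if eps >= 0.5:
        return math.inf
    return 2.0 * NormalDist().inv_cdf(0.5 + eps)
```

The bound stays finite at corruption levels such as `eps = 0.3`, which lie beyond the $1/4$ breakdown point of the Tukey median, and it grows roughly linearly in `eps` for small `eps`.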

Main results - Projection algorithm
Intuition on improving the breakdown point: [figure not included]

Conclusion
Tukey median: affine-equivariant, breakdown point $1/4$ under TV corruption in high dimensions, good finite-sample error.
$\widetilde{\mathsf{TV}}$ projection algorithm: not affine-equivariant, breakdown point $1/2$, good finite-sample error.
Open problem: find an estimator that is affine-equivariant, with breakdown point $1/2$ and good finite-sample error.

References I
Huber, P. J. (1973). Robust regression: Asymptotics, conjectures and Monte Carlo. The Annals of Statistics, 1(5), 799–821.
Tukey, J. W. (1975). Mathematics and the picturing of data. In Proceedings of the International Congress of Mathematicians, Vancouver, 1975.
Donoho, D. L. (1982). Breakdown properties of multivariate location estimators (Technical report). Harvard University, Boston.
Donoho, D. L., & Liu, R. C. (1988). The "automatic" robustness of minimum distance functionals. The Annals of Statistics, 16(2), 552–586.
Donoho, D. L., & Gasko, M. (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. The Annals of Statistics, 20(4), 1803–1827.
Zuo, Y., & Serfling, R. (2000). General notions of statistical depth function. The Annals of Statistics, 461–482.
Chen, Z., Tyler, D. E., et al. (2002). The influence function and maximum bias of Tukey's median. The Annals of Statistics, 30(6), 1737–1759.
Rousseeuw, P. J., & Leroy, A. M. (2005). Robust regression and outlier detection (Vol. 589). John Wiley & Sons.

References II
Zhu, B., Jiao, J., & Steinhardt, J. (2020). When does the Tukey median work? arXiv preprint arXiv:2001.07805.
