Robustness Meets Algorithms
Ankur Moitra (MIT)
Robust Statistics Summer School
CLASSIC PARAMETER ESTIMATION
Given samples from an unknown distribution in some class, e.g. a 1-D Gaussian N(μ, σ²), can we accurately estimate its parameters? Yes!
empirical mean: μ̂ = (1/N) Σ_i X_i
empirical variance: σ̂² = (1/N) Σ_i (X_i − μ̂)²
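As a quick illustration (not from the slides), a minimal numpy sketch of these two estimators on clean samples; the ground-truth parameters are arbitrary choices:

```python
# Empirical mean and variance of clean 1-D Gaussian samples.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 3.0                        # illustrative ground truth
X = rng.normal(mu, sigma, size=10_000)      # no corruption yet

emp_mean = X.mean()                         # (1/N) * sum_i X_i
emp_var = ((X - emp_mean) ** 2).mean()      # (1/N) * sum_i (X_i - emp_mean)^2
print(emp_mean, emp_var)                    # close to (2.0, 9.0)
```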
The maximum likelihood estimator is asymptotically efficient (1910-1920)
What about errors in the model itself? (1960)
ROBUST PARAMETER ESTIMATION
Given corrupted samples from a 1-D Gaussian: can we accurately estimate its parameters?
[Figure: the ideal Gaussian, the observed model, and the noise.]
How do we constrain the noise?
L1-norm of noise at most O(ε); equivalently, arbitrarily corrupt an O(ε)-fraction of the samples.
This generalizes Huber's Contamination Model: an adversary can add an ε-fraction of samples.
Outliers: points the adversary has corrupted. Inliers: points it hasn't.
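A small sketch of how a corrupted sample of this kind might be generated; the inlier parameters and the adversary's outlier distribution below are illustrative choices, not anything prescribed by the talk:

```python
# Huber-style contamination: a (1 - eps) fraction of inliers from the ideal
# Gaussian plus an eps fraction of points placed arbitrarily by an adversary.
import numpy as np

rng = np.random.default_rng(1)
N, eps = 10_000, 0.1
mu, sigma = 0.0, 1.0

n_out = rng.binomial(N, eps)                      # number of corrupted points
inliers = rng.normal(mu, sigma, size=N - n_out)   # ideal model
outliers = rng.normal(8.0, 0.1, size=n_out)       # adversary's choice (illustrative)
X = np.concatenate([inliers, outliers])
rng.shuffle(X)                                    # the learner cannot tell which is which
```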
In what norm do we want the parameters to be close?
Definition: The total variation distance between two distributions with pdfs f(x) and g(x) is d_TV(f, g) = (1/2) ∫ |f(x) − g(x)| dx.
From the bound on the L1-norm of the noise, we have: d_TV(ideal, observed) ≤ O(ε).
Goal: Find a 1-D Gaussian (the estimate) that satisfies d_TV(estimate, ideal) ≤ O(ε). Equivalently (by the triangle inequality), find a 1-D Gaussian that satisfies d_TV(estimate, observed) ≤ O(ε).
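For concreteness, a short numerical check of this definition for two 1-D Gaussians (scipy handles the integration; the parameter values are arbitrary):

```python
# d_TV(f, g) = (1/2) * integral of |f(x) - g(x)| dx, evaluated numerically.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def tv_distance_1d(mu1, s1, mu2, s2):
    integrand = lambda x: abs(norm.pdf(x, mu1, s1) - norm.pdf(x, mu2, s2))
    total, _ = quad(integrand, -30, 30)      # densities are negligible outside this range
    return 0.5 * total

print(tv_distance_1d(0.0, 1.0, 0.1, 1.0))    # a small shift in mean gives a small TV distance (~0.04)
```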
Do the empirical mean and empirical variance work? No!
[Figure: the ideal Gaussian, the observed model, and the noise.]
But the median and median absolute deviation do work
Fact [Folklore]: Given samples from a distribution that is ε-close in total variation distance to a 1-D Gaussian N(μ, σ²), the median and MAD recover estimates that satisfy |μ̂ − μ| ≤ O(ε)·σ and |σ̂ − σ| ≤ O(ε)·σ (with high probability, for N large enough), where μ̂ is the sample median and σ̂ is the MAD rescaled by 1/Φ⁻¹(3/4).
Also called (properly) agnostically learning a 1-D Gaussian.
What about robust estimation in high-dimensions? e.g. microarrays with 10k genes
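Before moving to high dimensions, here is a minimal sketch of the two 1-D robust estimators just described; the rescaling constant 0.6745 ≈ Φ⁻¹(3/4) is the usual Gaussian consistency factor for the MAD, and the corrupted sample below is illustrative:

```python
# Median and (rescaled) median absolute deviation as robust 1-D estimators.
import numpy as np

def robust_gaussian_1d(X):
    mu_hat = np.median(X)
    mad = np.median(np.abs(X - mu_hat))
    sigma_hat = mad / 0.6745                # 0.6745 ~ Phi^{-1}(3/4), Gaussian consistency factor
    return mu_hat, sigma_hat

rng = np.random.default_rng(2)
X = np.concatenate([rng.normal(0, 1, 9_000),      # inliers from N(0, 1)
                    rng.normal(8, 0.1, 1_000)])   # 10% adversarial outliers

print(robust_gaussian_1d(X))                # close to (0, 1)
print(X.mean(), X.std())                    # empirical estimates are badly off (~0.8, ~2.6)
```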
OUTLINE
Part I: Introduction (Robust Estimation in One-dimension, Robustness vs. Hardness in High-dimensions, Our Results)
Part II: Agnostically Learning a Gaussian (Parameter Distance, Detecting When an Estimator is Compromised, A Win-Win Algorithm, Unknown Covariance)
Part III: Experiments
Main Problem: Given samples from a distribution that is ε-close in total variation distance to a d-dimensional Gaussian N(μ, Σ), give an efficient algorithm to find parameters μ̂, Σ̂ that satisfy d_TV(N(μ̂, Σ̂), N(μ, Σ)) ≤ Õ(ε).
Special Cases: (1) Unknown mean (2) Unknown covariance
A COMPENDIUM OF APPROACHES (Unknown Mean)

Estimator          Error Guarantee    Running Time
Tukey Median       O(ε)               NP-Hard
Geometric Median   O(ε√d)             poly(d, N)
Tournament         O(ε)               N^O(d)
Pruning            O(ε√d)             O(dN)
The Price of Robustness? All known estimators are hard to compute or lose polynomial factors in the dimension.
Equivalently: computationally efficient estimators can only handle an ε ≈ 1/√d fraction of errors if they are to give non-trivial (TV < 1) guarantees.
Is robust estimation algorithmically possible in high-dimensions?
OUR RESULTS
Theorem [Diakonikolas, Kamath, Kane, Li, Moitra, Stewart '16]: There is an algorithm that, given samples from a distribution that is ε-close in total variation distance to a d-dimensional Gaussian N(μ, Σ), finds parameters μ̂, Σ̂ that satisfy d_TV(N(μ̂, Σ̂), N(μ, Σ)) ≤ Õ(ε). Moreover, the algorithm runs in time poly(N, d).
Robust estimation in high-dimensions is algorithmically possible!
Extensions: Can weaken the assumptions to sub-Gaussian, or to bounded second moments (with weaker guarantees) for the mean.
Simultaneously, [Lai, Rao, Vempala '16] gave agnostic algorithms with their own error guarantee; when the covariance is bounded, it translates to a total variation guarantee.
Subsequently, many works have handled more errors via list decoding, given lower bounds against statistical query algorithms, weakened the distributional assumptions, exploited sparsity, and worked with more complex generative models.
A GENERAL RECIPE
Robust estimation in high-dimensions:
Step #1: Find an appropriate parameter distance
Step #2: Detect when the naïve estimator has been compromised
Step #3: Find good parameters, or make progress
Filtering: fast and practical. Convex programming: better sample complexity.
Let's see how this works for unknown mean...
PARAMETER DISTANCE
Step #1: Find an appropriate parameter distance for Gaussians
A Basic Fact: (1) d_TV(N(μ₁, I), N(μ₂, I)) ≤ (1/2)·||μ₁ − μ₂||₂
This can be proven using Pinsker's Inequality and the well-known formula for the KL-divergence between Gaussians.
Corollary: If our estimate (in the unknown mean case) satisfies ||μ̂ − μ||₂ ≤ O(ε), then d_TV(N(μ̂, I), N(μ, I)) ≤ O(ε).
Our new goal is to be close in Euclidean distance.
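Spelling out the two ingredients (a standard calculation, included here for completeness):

```latex
\mathrm{KL}\bigl(\mathcal{N}(\mu_1, I)\,\|\,\mathcal{N}(\mu_2, I)\bigr) = \tfrac{1}{2}\,\|\mu_1 - \mu_2\|_2^2,
\qquad
d_{TV} \;\le\; \sqrt{\tfrac{1}{2}\,\mathrm{KL}} \;=\; \tfrac{1}{2}\,\|\mu_1 - \mu_2\|_2 .
```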
DETECTING CORRUPTIONS
Step #2: Detect when the naïve estimator has been compromised
[Figure: uncorrupted and corrupted points; the corruptions create a direction of large (> 1) variance.]
Key Lemma (informal): If X₁, X₂, …, X_N come from a distribution that is ε-close to N(μ, I) and N is large enough, then with probability at least 1 − δ: whenever (1) no direction of the empirical covariance has variance much larger than 1, (2) the empirical mean is close to μ.
Take-away: An adversary needs to mess up the second moment in order to corrupt the first moment.
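A rough numpy illustration of the take-away (a toy example, not the lemma's exact quantities): when an adversary shifts the empirical mean, the empirical covariance picks up a direction of variance noticeably larger than 1.

```python
# If the samples were (mostly) from N(mu, I), every direction should have variance ~1.
# An adversarial cluster that moves the empirical mean also inflates the top eigenvalue.
import numpy as np

rng = np.random.default_rng(3)
d, N, eps = 50, 20_000, 0.1
mu = np.zeros(d)

inliers = rng.normal(size=(int((1 - eps) * N), d)) + mu
shift = 3.0 * np.ones(d) / np.sqrt(d)                   # adversary's direction (norm 3)
outliers = np.tile(shift, (int(eps * N), 1))
X = np.vstack([inliers, outliers])

top_eig = np.linalg.eigvalsh(np.cov(X, rowvar=False))[-1]
print(top_eig)                                 # clearly above 1 -> the naive mean is suspect
print(np.linalg.norm(X.mean(axis=0) - mu))     # and indeed the empirical mean has drifted
```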
A WIN-WIN ALGORITHM
Step #3: Either find good parameters, or remove many outliers
Filtering Approach: Suppose the empirical covariance has a direction v of variance noticeably larger than 1 (v is the direction of largest variance). Then we can throw out more corrupted than uncorrupted points: project onto v and remove the points whose projections exceed a threshold T, where T has an explicit formula.
If we continued like this for too long, we'd have no corrupted points left! So eventually we find (certifiably) good parameters.
Running Time: polynomial in N and d. Sample Complexity: near-optimal; the analysis uses concentration of LTFs.
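A simplified filtering sketch along these lines; the stopping rule and the hard threshold below are crude stand-ins for the actual certificate and the formula for T, so treat this as an illustration of the win-win structure rather than the algorithm itself:

```python
# Win-win filter for robust mean estimation (identity covariance assumed).
# Either the top variance is already small (output the mean), or project onto the
# worst direction and remove the extreme tail, which hits outliers preferentially.
import numpy as np

def filter_mean(X, tail_threshold=4.0, eig_threshold=1.2, max_iters=100):
    X = np.asarray(X, dtype=float).copy()
    for _ in range(max_iters):
        mu_hat = X.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
        top_val, v = eigvals[-1], eigvecs[:, -1]
        if top_val <= eig_threshold:              # certifiably good: stop and output
            return mu_hat
        proj = np.abs((X - mu_hat) @ v)           # scores along the worst direction
        keep = proj <= tail_threshold             # crude stand-in for the threshold T
        if keep.all():                            # nothing left to remove
            return mu_hat
        X = X[keep]
    return X.mean(axis=0)
```

In the actual algorithm the threshold is chosen from the data so that, whenever the variance along v is too large, more corrupted than uncorrupted points fall beyond it.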
A GENERAL RECIPE
Robust estimation in high-dimensions:
Step #1: Find an appropriate parameter distance
Step #2: Detect when the naïve estimator has been compromised
Step #3: Find good parameters, or make progress
Filtering: fast and practical. Convex programming: better sample complexity.
How about for unknown covariance?
PARAMETER DISTANCE
Step #1: Find an appropriate parameter distance for Gaussians
Another Basic Fact: (2) d_TV(N(0, Σ₁), N(0, Σ₂)) ≤ O(||Σ₁^(−1/2) Σ₂ Σ₁^(−1/2) − I||_F)
Again, proven using Pinsker's Inequality.
Our new goal is to find an estimate Σ̂ that satisfies ||Σ̂^(−1/2) Σ Σ̂^(−1/2) − I||_F ≤ O(ε).
This distance seems strange, but it's the right one to use to bound the total variation distance.
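A tiny numpy helper for evaluating this distance; the choice of which matrix sits inside the conjugation follows the goal stated above and should be treated as an assumption of this sketch:

```python
# Parameter distance between covariance matrices:
#   dist(S1, S2) = || S1^{-1/2} S2 S1^{-1/2} - I ||_F
import numpy as np

def inv_sqrt(S):
    w, V = np.linalg.eigh(S)                  # S symmetric positive definite
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def cov_distance(S1, S2):
    R = inv_sqrt(S1)
    return np.linalg.norm(R @ S2 @ R - np.eye(S1.shape[0]), ord="fro")
```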
UNKNOWN COVARIANCE
What if we are given samples from N(0, Σ)? How do we detect if the naïve estimator is compromised?
Key Fact: Let X ~ N(0, Σ) and Y = Σ^(−1/2) X. Then, restricted to flattenings of d × d symmetric matrices, the covariance of Y ⊗ Y is 2I. (In the detection step, the direction corresponding to the identity matrix needs to be projected out.)
Proof uses Isserlis's Theorem.
Key Idea: Transform the data, look for restricted large eigenvalues.
If Σ̂ were the true covariance, we would have Σ̂^(−1/2) Xᵢ ~ N(0, I) for the inliers, in which case the covariance of the flattened outer products would have small restricted eigenvalues.
Take-away: An adversary needs to mess up the (restricted) fourth moment in order to corrupt the second moment.
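A rough numpy sketch of the detection statistic this suggests (a simplification, not the paper's exact procedure): whiten with the current estimate, flatten the centered outer products, and look at the top eigenvalue of their covariance, which is about 2 for clean Gaussian data by the Key Fact above.

```python
# Detect a compromised covariance estimate via the (restricted) fourth moment.
# Intended for small d: the matrix below is d^2 x d^2.
import numpy as np

def inv_sqrt(S):
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def fourth_moment_score(X, sigma_hat):
    d = X.shape[1]
    Y = X @ inv_sqrt(sigma_hat)                     # if sigma_hat is right, Y ~ N(0, I) for inliers
    Z = np.einsum("ni,nj->nij", Y, Y) - np.eye(d)   # centered outer products Y Y^T - I
    Z = Z.reshape(len(X), d * d)                    # flattenings of d x d symmetric matrices
    return np.linalg.eigvalsh(np.cov(Z, rowvar=False))[-1]   # ~2 on clean data; larger when corrupted
```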
ASSEMBLING THE ALGORITHM
Given samples that are ε-close in total variation distance to a d-dimensional Gaussian N(μ, Σ):
Step #1: Doubling trick: pair up the samples and take differences, which are close to a mean-zero Gaussian with covariance 2Σ. Now use the algorithm for unknown covariance.
Step #2: (Agnostic) isotropic position: whiten the data with Σ̂^(−1/2). Now use the algorithm for unknown mean. (This yields the right parameter distance in the general case.)
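A schematic of how the two pieces compose; the robust subroutines are stubbed with naive estimators, and the pairing and whitening details are one reading of the "doubling trick" and "isotropic position" steps, so this sketches the control flow only:

```python
# Assembling the algorithm: unknown mean AND unknown covariance.
import numpy as np

def robust_cov(X):    # stand-in for the unknown-covariance (mean-zero) algorithm
    return np.cov(X, rowvar=False)

def robust_mean(X):   # stand-in for the unknown-mean (identity-covariance) algorithm
    return X.mean(axis=0)

def inv_sqrt(S):
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def robust_gaussian(X):
    # Step #1: doubling trick -- differences of disjoint pairs are mean-zero with
    # covariance 2*Sigma, so the unknown-covariance algorithm applies to them.
    half = (len(X) // 2) * 2
    diffs = X[0:half:2] - X[1:half:2]
    sigma_hat = robust_cov(diffs) / 2.0
    # Step #2: (agnostic) isotropic position -- whiten with sigma_hat, estimate the
    # mean there, and map the answer back to the original coordinates.
    W = inv_sqrt(sigma_hat)
    mu_hat = np.linalg.solve(W, robust_mean(X @ W))
    return mu_hat, sigma_hat
```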
SYNTHETIC EXPERIMENTS
Error rates on synthetic data (unknown mean): Gaussian samples + 10% noise.
[Figure: excess ℓ2 error vs. dimension (100 to 400) for Filtering, LRVMean, Sample mean w/ noise, Pruning, RANSAC, and Geometric Median; a second panel zooms in on the low-error range (0.04 to 0.14).]
SYNTHETIC EXPERIMENTS
Error rates on synthetic data (unknown covariance, isotropic): samples with covariance close to the identity + 10% noise.
[Figure: excess ℓ2 error vs. dimension (20 to 100) for Filtering, LRVCov, Sample covariance w/ noise, Pruning, and RANSAC; a second panel zooms in on the low-error range (0.1 to 0.4).]
SYNTHETIC EXPERIMENTS
Error rates on synthetic data (unknown covariance, anisotropic): samples with covariance far from the identity + 10% noise.
[Figure: excess ℓ2 error vs. dimension (20 to 100) for Filtering, LRVCov, Sample covariance w/ noise, Pruning, and RANSAC; a second panel zooms in on the low-error range (0.5 to 1).]
REAL DATA EXPERIMENTS
Famous study of [Novembre et al. '08]: take the top two singular vectors of the people × SNP matrix (POPRES).
[Figure: projection of the original data onto the top two singular vectors.]
"Genes Mirror Geography in Europe"
REAL DATA EXPERIMENTS
Can we find such patterns in the presence of noise? (10% noise added)
[Figure: Pruning projection, i.e. what PCA finds under 10% noise.]
[Figure: RANSAC projection, i.e. what RANSAC finds.]
[Figure: XCS projection, i.e. what robust PCA (via SDPs) finds.]
[Figure: Filter projection, i.e. what our methods find.]
The power of provably robust estimation:
[Figure: what our methods find under 10% noise, side by side with the original (no noise) data.]
LOOKING FORWARD
Can algorithms for agnostically learning a Gaussian help in exploratory data analysis in high-dimensions? Isn't this what we would have been doing with robust statistical estimators, if we had them all along?
Summary:
Nearly optimal algorithm for agnostically learning a high-dimensional Gaussian
General recipe using restricted eigenvalue problems
Further applications to other mixture models
Is practical, robust statistics within reach?