Robustness Meets Algorithms
Ankur Moitra (MIT)
Robust Statistics Summer School

CLASSIC PARAMETER ESTIMATION: Given samples from an unknown distribution in some class, e.g. a 1-D Gaussian, can we accurately estimate its parameters?
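As a quick illustration of the one-dimensional problem (my own example, not from the slides): an ε-fraction of far-away corruptions drags the empirical mean arbitrarily far off, while the median barely moves.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw samples from a 1-D Gaussian with true mean 5.0 and unit variance.
n, eps, true_mean = 10_000, 0.05, 5.0
samples = rng.normal(true_mean, 1.0, size=n)

# Adversarially replace an eps-fraction of the points with a far-away value.
k = int(eps * n)
samples[:k] = 1_000.0

# The empirical mean is pulled toward 1000; the median stays near 5.
print("empirical mean:", samples.mean())      # badly off (~55 here)
print("median:        ", np.median(samples))  # close to 5.0
```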


1-4. Simultaneously, [Lai, Rao, Vempala '16] gave agnostic estimation algorithms with comparable guarantees; when the covariance is bounded, their guarantee translates to a corresponding error bound. Subsequently, many works have handled more errors via list decoding, given lower bounds against statistical query algorithms, weakened the distributional assumptions, exploited sparsity, and worked with more complex generative models.

5-6. A GENERAL RECIPE. Robust estimation in high dimensions:
• Step #1: Find an appropriate parameter distance
• Step #2: Detect when the naïve estimator has been compromised
• Step #3: Find good parameters, or make progress
Filtering: fast and practical. Convex programming: better sample complexity.
Let's see how this works for unknown mean …

7-8. OUTLINE
Part I: Introduction • Robust Estimation in One Dimension • Robustness vs. Hardness in High Dimensions • Our Results
Part II: Agnostically Learning a Gaussian • Parameter Distance • Detecting When an Estimator is Compromised • A Win-Win Algorithm • Unknown Covariance
Part III: Experiments

9-14. PARAMETER DISTANCE. Step #1: Find an appropriate parameter distance for Gaussians.
A Basic Fact (1): the total variation distance between two identity-covariance Gaussians is bounded by the Euclidean distance between their means. This can be proven using Pinsker's inequality and the well-known formula for the KL divergence between Gaussians.
Corollary: if our estimate (in the unknown-mean case) is close to the true mean in Euclidean distance, then the corresponding Gaussians are close in total variation distance.
So our new goal is to be close in Euclidean distance.
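The statement of Fact (1) did not survive extraction; the following is a standard reconstruction of the intended bound (my constants, not necessarily the slide's) via Pinsker's inequality and the closed-form KL divergence between Gaussians.

```latex
% KL divergence between identity-covariance Gaussians:
%   KL( N(\hat{\mu}, I) \,\|\, N(\mu, I) ) = \tfrac{1}{2}\,\|\hat{\mu}-\mu\|_2^2 .
% Pinsker's inequality: d_{TV}(P, Q) \le \sqrt{ KL(P\|Q)/2 }. Combining the two:
\[
  d_{TV}\bigl(\mathcal{N}(\hat{\mu}, I), \mathcal{N}(\mu, I)\bigr)
  \;\le\; \sqrt{\tfrac{1}{2}\,KL\bigl(\mathcal{N}(\hat{\mu},I)\,\|\,\mathcal{N}(\mu,I)\bigr)}
  \;=\; \tfrac{1}{2}\,\|\hat{\mu}-\mu\|_2 .
\]
% Hence \|\hat{\mu}-\mu\|_2 \le O(\varepsilon) implies d_{TV} \le O(\varepsilon):
% closeness in Euclidean distance suffices, which is exactly the corollary.
```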

15-16. OUTLINE (recap); next up: Detecting When an Estimator is Compromised.

17-19. DETECTING CORRUPTIONS. Step #2: Detect when the naïve estimator has been compromised.
[Figure: a cloud of uncorrupted samples together with a cluster of corrupted points; the corruptions create a direction of large (> 1) variance.]

20-21. Key Lemma: If X1, X2, ..., XN come from a distribution that is ε-close to N(μ, I), and N is sufficiently large, then for the empirical mean and covariance, bounds (1) and (2) hold with probability at least 1-δ.
Take-away: an adversary needs to mess up the second moment in order to corrupt the first moment.
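To make the take-away concrete, here is a small numerical sketch (my own illustration, not from the slides): an adversary who shifts the empirical mean of N(0, I) by placing an ε-fraction of points far out in one direction necessarily creates a direction of variance well above 1, which is exactly what the detection step looks for.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, eps = 20_000, 50, 0.05

# Inliers from N(0, I).
X = rng.standard_normal((n, d))

# Adversary: move an eps-fraction of points a distance delta/eps along e_1,
# which shifts the empirical mean by roughly delta in that direction.
delta = 1.0
k = int(eps * n)
X[:k] = 0.0
X[:k, 0] = delta / eps

mu_hat = X.mean(axis=0)
Sigma_hat = np.cov(X, rowvar=False)
top_eig = np.linalg.eigvalsh(Sigma_hat)[-1]

print("shift of empirical mean:   ", np.linalg.norm(mu_hat))  # ~ delta = 1.0
print("top eigenvalue of cov:     ", top_eig)                 # ~ delta**2/eps (~20), far above 1
```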

22-23. OUTLINE (recap); next up: A Win-Win Algorithm.

24-33. A WIN-WIN ALGORITHM. Step #3: Either find good parameters, or remove many outliers.
Filtering approach: suppose the empirical covariance has a direction of large variance. Then we can throw out more corrupted than uncorrupted points: project onto v, the direction of largest variance, and discard points beyond a threshold T, where T has a closed-form formula.
Since only an ε-fraction of the points is corrupted, we cannot keep removing more corrupted than uncorrupted points forever; eventually the variance test passes and we find (certifiably) good parameters.
Running time and sample complexity: polynomial in all parameters; the sample-complexity analysis uses concentration of LTFs (linear threshold functions).
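Here is a minimal sketch of a filtering-style algorithm for the unknown-mean case, under the assumptions above. The function name, the fixed variance threshold, and the quantile-based cutoff are simplified placeholders of my own; the slides only say that the actual threshold T has a formula, and the real algorithm chooses it more carefully to guarantee that more corrupted than uncorrupted points are removed.

```python
import numpy as np

def filter_mean(X, eps, var_threshold=1.5, max_iters=100):
    """Iteratively estimate the mean of eps-corrupted samples from N(mu, I).

    Sketch of the filtering recipe: if the top eigenvalue of the empirical
    covariance is small, certify the empirical mean; otherwise project onto
    the top eigenvector and discard the points with the largest scores.
    """
    X = np.asarray(X, dtype=float)
    for _ in range(max_iters):
        mu_hat = X.mean(axis=0)
        Sigma_hat = np.cov(X, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(Sigma_hat)
        if eigvals[-1] <= var_threshold:
            # No direction of large variance: the empirical mean is certifiably good.
            return mu_hat
        v = eigvecs[:, -1]                     # direction of largest variance
        scores = np.abs((X - mu_hat) @ v)
        # Placeholder filter: drop the points with the most extreme projections.
        # (The real algorithm uses a data-dependent threshold T with a formula.)
        cutoff = np.quantile(scores, 1.0 - eps / 2)
        X = X[scores <= cutoff]
    return X.mean(axis=0)

# Usage: recover the mean from 5%-corrupted Gaussian data.
rng = np.random.default_rng(2)
n, d, eps = 5_000, 20, 0.05
true_mu = np.ones(d)
X = rng.standard_normal((n, d)) + true_mu
X[: int(eps * n)] = 20.0   # gross corruptions
print("error:", np.linalg.norm(filter_mean(X, eps) - true_mu))
```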

34-35. OUTLINE (recap); next up: Unknown Covariance.

36-37. A GENERAL RECIPE (recap): the same three steps apply: find a parameter distance, detect when the naïve estimator is compromised, and find good parameters or make progress. How about for unknown covariance?

38-42. PARAMETER DISTANCE. Step #1: Find an appropriate parameter distance for Gaussians.
Another Basic Fact (2): the total variation distance between two zero-mean Gaussians is bounded by a relative Frobenius-norm distance between their covariances. Again, this is proven using Pinsker's inequality.
Our new goal is to find an estimate whose covariance is close to the true covariance in this distance. The distance seems strange, but it is the right one to use to bound TV.
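As with Fact (1), the formula for Fact (2) did not survive extraction. The following reconstruction is my own, with constants that may differ from the slide; it again goes through Pinsker's inequality and explains why the "strange" relative Frobenius distance is the natural one for bounding TV.

```latex
% KL divergence between zero-mean Gaussians:
%   KL( N(0,\Sigma_1) \,\|\, N(0,\Sigma_2) )
%     = \tfrac{1}{2}\bigl( \operatorname{tr}(\Sigma_2^{-1}\Sigma_1) - d
%                          - \ln\det(\Sigma_2^{-1}\Sigma_1) \bigr).
% Writing \hat{\Sigma} = \Sigma^{1/2}(I+\Delta)\Sigma^{1/2} and expanding to
% second order in the symmetric matrix \Delta = \Sigma^{-1/2}\hat{\Sigma}\Sigma^{-1/2} - I:
%   KL( N(0,\hat{\Sigma}) \,\|\, N(0,\Sigma) ) \approx \tfrac{1}{4}\,\|\Delta\|_F^2 .
% With Pinsker's inequality this gives, up to constants and for small \Delta,
\[
  d_{TV}\bigl(\mathcal{N}(0,\hat{\Sigma}), \mathcal{N}(0,\Sigma)\bigr)
  \;\le\; O\!\bigl(\,\bigl\|\Sigma^{-1/2}\hat{\Sigma}\,\Sigma^{-1/2} - I\bigr\|_F\,\bigr),
\]
% so it suffices to make this relative Frobenius distance small.
```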

43-47. UNKNOWN COVARIANCE. What if we are given samples from N(0, Σ)? How do we detect if the naïve estimator is compromised?
Key Fact: a fourth-moment identity pins down the covariance of the (flattened) second-moment matrices, restricted to flattenings of d x d symmetric matrices. The proof uses Isserlis's theorem. One known direction still needs to be projected out.
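For reference, the fourth-moment identity behind the Key Fact is Isserlis's (Wick's) theorem for centered Gaussians:

```latex
% Isserlis's theorem for X ~ N(0, \Sigma):
\[
  \mathbb{E}\,[X_i X_j X_k X_l]
  \;=\; \Sigma_{ij}\Sigma_{kl} + \Sigma_{ik}\Sigma_{jl} + \Sigma_{il}\Sigma_{jk}.
\]
% This determines the covariance of the flattened matrices X X^T, which is
% what the Key Fact describes when restricted to flattenings of d x d
% symmetric matrices.
```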

48-50. Key Idea: Transform the data and look for restricted large eigenvalues. If the empirical covariance were the true covariance, the transformed inliers would look like samples from N(0, I).
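A minimal sketch of the detection idea, under my reading of the slides: whiten the data by the empirical covariance, flatten the second-moment matrices of the whitened points, project out the identity direction, and look for a large eigenvalue. The function name, the projection of only the identity direction, and the test values are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def restricted_top_eigenvalue(X):
    """Detection statistic for a corrupted covariance (illustrative sketch).

    Whiten by the empirical covariance, flatten the outer products y y^T of
    the whitened points, project out the flattened identity direction, and
    return the top eigenvalue of the covariance of these flattenings.
    """
    n, d = X.shape
    Sigma_hat = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(Sigma_hat)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T   # Sigma_hat^{-1/2}
    Y = X @ W   # if Sigma_hat were the true covariance, inliers would be ~ N(0, I)
    Z = np.einsum("ni,nj->nij", Y, Y).reshape(n, d * d)   # flattened y y^T
    id_flat = np.eye(d).reshape(d * d)
    id_flat /= np.linalg.norm(id_flat)
    Z = Z - np.outer(Z @ id_flat, id_flat)   # project out the identity direction
    C = np.cov(Z, rowvar=False)
    return np.linalg.eigvalsh(C)[-1]

# On clean Gaussian data the statistic stays moderate; corrupting the variance
# of a small fraction of points along one direction makes it spike.
rng = np.random.default_rng(3)
n, d = 20_000, 10
X = rng.standard_normal((n, d))
print("clean:    ", restricted_top_eigenvalue(X))
X[: n // 20, 0] *= 6.0
print("corrupted:", restricted_top_eigenvalue(X))
```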
