Tight Bounds for Learning a Mixture of Two Gaussians
Moritz Hardt Eric Price
Google Research UT Austin
2015-06-17
Moritz Hardt, Eric Price (Google/UT) Tight Bounds for Learning a Mixture of Two Gaussians 2015-06-17 1 / 27
◮ Male/female heights are very close to Gaussian distribution.
◮ “Method of moments”
◮ Royce ’58, Gridgeman ’70, Gupta-Huang ’80
◮ Clustering: Dasgupta ’99, DA ’00
◮ Spectral methods: VW ’04, AK ’05, KSV ’05, AM ’05, VW ’05
◮ Extended to general k mixtures: Moitra-Valiant ’10, Belkin-Sinha ’10
◮ Our result: tight upper and lower bounds for the sample complexity.
◮ For k = 2 mixtures, arbitrary d dimensions.
◮ Lower bound extends to larger k.
◮ Male/female average heights, std. deviations.
◮ Quite general: non-properly for any mixture of known unimodal
◮ Proper learning: [Daskalakis-Kamath ’14]
◮ But only in low dimensions.
◮ Generic high-d TV estimation algs use 1d parameter estimation.
◮ $\mu_i$ to $\pm\epsilon\sigma$
◮ $\sigma_i^2$ to $\pm\epsilon^2\sigma^2$
◮ Previously: $1/\epsilon^{\approx 300}$, no lower bound.
◮ Moreover: algorithm is almost the same as Pearson (1894).
◮ “$\sigma^2$” is max variance in any coordinate.
◮ Get each entry of covariance matrix to $\pm\epsilon^2\sigma^2$.
◮ Useful when covariance matrix is sparse.
◮ If components overlap, then parameter distance ≈ TV.
◮ If components don’t overlap, then clustering is trivial.
◮ Straightforwardly gives
◮ Best known, but not the
[Figure: plot with x-axis “Height (cm)”, ticks from 140 to 200]
◮ Convert to “central moments”
◮ $M'_2 = M_2 - M_1^2$ is independent of translation.
◮ $X_4 = M_4 - 3M_2^2$ is independent of adding $N(0, \sigma^2)$.
◮ “Excess kurtosis” coined by Pearson, appearing in every Wikipedia
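A minimal Python sketch of the excess-moment identity on this slide, assuming that M2 and M4 in the definition of X4 are central moments (an interpretation, not stated on the slide). The function names are hypothetical. It computes exact moments of a two-Gaussian mixture and compares X4 before and after a translation and after adding independent N(0, s^2) noise, which raises each component variance by s^2.

```python
def gaussian_raw_moments(mu, var, kmax):
    """Raw moments E[X^k] for k = 0..kmax of N(mu, var)."""
    m = [1.0, mu]
    for k in range(2, kmax + 1):
        # Recursion: E[X^k] = mu * E[X^(k-1)] + (k-1) * var * E[X^(k-2)]
        m.append(mu * m[k - 1] + (k - 1) * var * m[k - 2])
    return m[: kmax + 1]


def excess_moments(p, mu1, var1, mu2, var2):
    """Return (variance, X4) of the mixture p*N(mu1,var1) + (1-p)*N(mu2,var2)."""
    a = gaussian_raw_moments(mu1, var1, 4)
    b = gaussian_raw_moments(mu2, var2, 4)
    _, m1, m2, m3, m4 = [p * x + (1 - p) * y for x, y in zip(a, b)]
    c2 = m2 - m1 ** 2  # central second moment
    c4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4  # central fourth moment
    return c2, c4 - 3 * c2 ** 2  # X4 assumed built from central moments


base = excess_moments(0.5, -1.0, 1.0, 1.0, 1.0)
shifted = excess_moments(0.5, 4.0, 1.0, 6.0, 1.0)   # same mixture translated by +5
noised = excess_moments(0.5, -1.0, 10.0, 1.0, 10.0)  # same mixture plus N(0, 9)
```

For the symmetric example the variance is 2 and X4 is -2. Translation leaves both unchanged, and the added noise changes the variance to 11 but leaves X4 at -2.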
◮ Positive roots correspond to mixtures that match on five moments.
◮ Pearson’s proposal: choose root with closer 6th moment.
◮ Usually works well
◮ Not when there’s a double root.
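The tie-breaking rule above can be sketched in Python. This is an illustration only: it assumes the candidate mixtures (one per positive root) have already been computed by some other routine, and the names `mixture_raw_moment` and `pick_by_sixth_moment` are hypothetical. Solving Pearson's polynomial itself is not shown.

```python
def mixture_raw_moment(k, p, mu1, var1, mu2, var2):
    """Exact raw moment E[X^k] of p*N(mu1,var1) + (1-p)*N(mu2,var2)."""
    def gauss(mu, var):
        m = [1.0, mu]
        for j in range(2, k + 1):
            # E[X^j] = mu * E[X^(j-1)] + (j-1) * var * E[X^(j-2)]
            m.append(mu * m[j - 1] + (j - 1) * var * m[j - 2])
        return m[k]
    return p * gauss(mu1, var1) + (1 - p) * gauss(mu2, var2)


def pick_by_sixth_moment(candidates, observed_m6):
    """Choose the candidate (p, mu1, var1, mu2, var2) whose 6th moment is closest."""
    return min(candidates,
               key=lambda c: abs(mixture_raw_moment(6, *c) - observed_m6))


# Hypothetical candidates; a real run would take them from the positive roots.
truth = (0.5, -1.0, 1.0, 1.0, 1.0)
other = (0.3, -1.2, 0.8, 0.9, 1.4)
observed_m6 = mixture_raw_moment(6, *truth)
chosen = pick_by_sixth_moment([other, truth], observed_m6)
```

When two roots coincide or nearly coincide, the sixth moments of the candidates are close too, so this comparison needs a much more accurate estimate of the observed sixth moment.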
◮ Given approximations |
◮ Getting α lets us estimate means, variances.
◮ If components are $\Omega(1)$ standard deviations apart, $O(1/\epsilon^2)$ samples
◮ In general, $O(1/\epsilon^{12})$ samples suffice to get $\epsilon\sigma$ accuracy.
◮ Necessary to get sixth moment to $\pm(\epsilon\sigma)^6$.
◮ Constant means and variances.
◮ Add $N(0, \sigma^2)$ to each mixture for growing $\sigma$.
◮ $H^2(P, Q) := \frac{1}{2} \int \left(\sqrt{p(x)} - \sqrt{q(x)}\right)^2 dx$
◮ $H^2$ is subadditive on product measures:
⋆ $H^2((x_1, \ldots, x_m), (x'_1, \ldots, x'_m)) \le m H^2(x, x')$.
◮ Sample complexity is $\Omega(1/H^2(F, F'))$
◮ $H^2 \lesssim \mathrm{TV} \lesssim H$, but often $H \approx \mathrm{TV}$.
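A small Python sketch of the quantities on this slide, assuming the standard squared Hellinger distance with the 1/2 normalization, i.e. one half of the integral of (sqrt(p) - sqrt(q))^2. The function names and the integration grid are illustrative choices. It compares a numerical integral against the known closed form for two one-dimensional Gaussians and checks the product-measure bound using the identity 1 - H^2(P^m, Q^m) = (1 - H^2(P, Q))^m.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def hellinger_sq_numeric(mu1, s1, mu2, s2, lo=-40.0, hi=40.0, steps=40000):
    """H^2 = 1/2 * integral of (sqrt(p) - sqrt(q))^2, by a Riemann sum."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * dx
        diff = math.sqrt(normal_pdf(x, mu1, s1)) - math.sqrt(normal_pdf(x, mu2, s2))
        total += diff * diff
    return 0.5 * total * dx


def hellinger_sq_gaussians(mu1, s1, mu2, s2):
    """Closed form for N(mu1, s1^2) vs N(mu2, s2^2): one minus the Bhattacharyya coefficient."""
    v = s1 * s1 + s2 * s2
    bc = math.sqrt(2.0 * s1 * s2 / v) * math.exp(-((mu1 - mu2) ** 2) / (4.0 * v))
    return 1.0 - bc


def hellinger_sq_product(h2, m):
    """H^2 between m-fold product measures, given the single-sample H^2."""
    return 1.0 - (1.0 - h2) ** m


h_num = hellinger_sq_numeric(0.0, 1.0, 0.5, 1.5)
h_closed = hellinger_sq_gaussians(0.0, 1.0, 0.5, 1.5)
h_ten = hellinger_sq_product(h_closed, 10)
```

Since the m-sample distance is at most m times the single-sample distance, distinguishing F from F' with constant probability needs m on the order of 1/H^2(F, F'), which is the lower-bound route the slide describes.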
◮ Each set of $\epsilon^{-12}$ samples has a constant chance of giving no
◮ With $o(\epsilon^{-12} \log d)$ samples, some coordinate will be independent of
◮ (And covariance matrix)
◮ Does $\mu_i$ go with $\mu_j$ or $\mu'_j$?
◮ Project onto a random direction $e_i \sin\theta + e_j \cos\theta$.
◮ $(\mu_i, \mu_j)$ usually has a significantly different projection from $(\mu_i, \mu'_j)$.
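The matching step can be sketched in Python as follows. This is a hedged illustration, not the authors' procedure in detail: it assumes a one-dimensional estimator has already returned the two component means along the projected direction (`observed`), and the function names are hypothetical. It compares the two possible pairings of coordinate-i means with coordinate-j means against the observed projection.

```python
import math


def projected_means(comp_a, comp_b, theta):
    """Sorted projected means of two components onto e_i*sin(theta) + e_j*cos(theta).

    comp_a and comp_b are (i-coordinate, j-coordinate) mean pairs.
    """
    s, c = math.sin(theta), math.cos(theta)
    return sorted([comp_a[0] * s + comp_a[1] * c,
                   comp_b[0] * s + comp_b[1] * c])


def goes_with(mu_i, mu_i_alt, mu_j, mu_j_alt, observed, theta):
    """True if mu_i belongs with mu_j, False if it belongs with mu_j_alt."""
    obs = sorted(observed)
    same = projected_means((mu_i, mu_j), (mu_i_alt, mu_j_alt), theta)
    swap = projected_means((mu_i, mu_j_alt), (mu_i_alt, mu_j), theta)
    err_same = sum(abs(a - b) for a, b in zip(same, obs))
    err_swap = sum(abs(a - b) for a, b in zip(swap, obs))
    return err_same <= err_swap


# Hypothetical example: true component means are (1, 2) and (3, 5).
theta = 0.7
observed = projected_means((1.0, 2.0), (3.0, 5.0), theta)
correct_pairing = goes_with(1.0, 3.0, 2.0, 5.0, observed, theta)
wrong_pairing = goes_with(1.0, 3.0, 5.0, 2.0, observed, theta)
```

With estimated rather than exact means, the decision is reliable only when the two hypotheses differ by more than the estimation error, which is why the direction is chosen at random.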
◮ $\Theta(\epsilon^{-12} \log d)$ samples necessary and sufficient to estimate $\mu_i$ to $\pm\epsilon\sigma$, $\sigma_i^2$ to $\pm\epsilon^2\sigma^2$.
◮ If the means have $\alpha\sigma$ separation, just $O(\epsilon^{-2}\alpha^{-12})$ for $\epsilon\alpha\sigma$ accuracy.
◮ Lower bound extends, at least to $\Omega(\epsilon^{-6k-2})$.
◮ Do we really care about finding an $O(\epsilon^{-22})$ algorithm?
◮ Solving the system of equations gets nasty.
◮ [Next talk: Ge-Huang-Kakade avoid this for smoothed instances]
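To get a feel for the first two rates above, here is a rough Python sketch of how they scale. The constant `c` is a placeholder assumption (the slides give only asymptotic rates), the function names are hypothetical, and `log d` is clamped at `log 2` so that the one-dimensional case does not evaluate to zero.

```python
import math


def samples_general(eps, d, c=1.0):
    """Scaling of the general bound: c * eps^-12 * log d (constant c is assumed)."""
    return c * eps ** (-12) * math.log(max(d, 2))


def samples_separated(eps, alpha, c=1.0):
    """Scaling with alpha*sigma separation: c * eps^-2 * alpha^-12 (c is assumed)."""
    return c * eps ** (-2) * alpha ** (-12)


# Halving eps multiplies the general bound by 2^12 = 4096.
ratio = samples_general(0.25, 10) / samples_general(0.5, 10)
well_separated = samples_separated(0.1, 1.0)
```

These functions show growth rates only. They are not sample-size recommendations, since the true constants are not given in the talk.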