introduction to sum of squares

Introduction to Sum-of-Squares, Ankur Moitra (MIT), Robust Statistics Summer School

  1. Introduction to Sum-of-Squares. Ankur Moitra (MIT), Robust Statistics Summer School.

  6. A CLASSIC HARD PROBLEM: MAXCUT. Goal: given a graph $G = (V, E)$, find a cut $(U, V \setminus U)$ that maximizes the number of crossing edges. MAXCUT is NP-hard to solve exactly; it is one of the 21 problems of [Karp, '72]. How well can we approximate MAXCUT? Simple ½-approximation algorithm: choose U uniformly at random, so that each edge crosses the cut with probability ½. But can we do better?
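
The random-cut baseline can be sketched in a few lines of Python. This is only an illustration: the 5-cycle test graph, the function names, and the trial count are assumptions introduced here, not part of the slides. Each edge crosses a uniformly random cut with probability ½, so the empirical average should sit near half the number of edges.

```python
import random

def cut_value(edges, side):
    """Count edges whose endpoints lie on different sides of the cut."""
    return sum(1 for u, v in edges if side[u] != side[v])

def random_cut(n, rng):
    """Place each of the n vertices in U independently with probability 1/2."""
    return [rng.random() < 0.5 for _ in range(n)]

def average_random_cut(n, edges, trials=2000, seed=0):
    """Empirical mean cut value over independent random cuts."""
    rng = random.Random(seed)
    total = sum(cut_value(edges, random_cut(n, rng)) for _ in range(trials))
    return total / trials

# Hypothetical example graph: a 5-cycle (5 edges, maximum cut 4).
cycle5 = [(i, (i + 1) % 5) for i in range(5)]
print(average_random_cut(5, cycle5))  # close to 5 / 2
```

Since the maximum cut is at most the number of edges, an expected value of |E|/2 is what gives the ½ guarantee.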

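  11. MAXCUT AS A QUADRATIC PROGRAM. Alternatively we can write MAXCUT as $\max \sum_{(i,j) \in E} (x_i - x_j)^2$ subject to $x_i(x_i - 1) = 0$ for all $i$. The constraints force the $x_i$'s to be 0/1 valued, and the objective counts the number of edges crossing the cut. Now we can leverage the Sum-of-Squares (SOS) hierarchy. We will utilize an alternative view based on the notion of a pseudo-expectation.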
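
To see that the quadratic objective really counts crossing edges, here is a small brute-force sketch. It assumes the objective is the sum over edges of $(x_i - x_j)^2$ with 0/1 variables; the triangle graph and the helper names are hypothetical, and the enumeration is exponential, so it is only meant for tiny instances.

```python
from itertools import product

def qp_objective(edges, x):
    """Quadratic objective: sum over edges of (x_i - x_j)^2 for a 0/1 vector x."""
    return sum((x[i] - x[j]) ** 2 for i, j in edges)

def brute_force_maxcut(n, edges):
    """Maximize the objective over all 2^n assignments (exponential time)."""
    return max(qp_objective(edges, x) for x in product((0, 1), repeat=n))

# Hypothetical example: a triangle, whose maximum cut has 2 edges.
triangle = [(0, 1), (1, 2), (0, 2)]
print(brute_force_maxcut(3, triangle))  # 2
```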

  14. AN ALTERNATIVE VIEW OF SOS. Pseudo-expectation [informally]: an operator $\tilde{\mathbb{E}}$, mapping degree ≤ d polynomials in n variables to real numbers, that behaves like an expectation over a distribution on solutions. This formulation is the starting point for state-of-the-art algorithms for quantum separability, tensor completion, tensor PCA, finding a planted sparse vector in a subspace, the best separable state problem, … Let's see what it looks like for MAXCUT.

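  17. Degree d relaxation for MAXCUT: $\max \tilde{\mathbb{E}}\big[\sum_{(i,j) \in E} (x_i - x_j)^2\big]$ such that: (1) $\tilde{\mathbb{E}}[1] = 1$; (2) $\tilde{\mathbb{E}}[p^2] \ge 0$ for all deg(p) ≤ d/2; (3) $\tilde{\mathbb{E}}$ is linear; (4) $\tilde{\mathbb{E}}[x_i(x_i - 1)\,p] = 0$ for all $i$ and all deg(p) ≤ d-2. Constraints (1) – (3) are the usual ones that say $\tilde{\mathbb{E}}$ behaves like it is taking the expectation under some distribution on assignments to the variables. Constraint (4) is there because we want the distribution to be supported on 0/1 valued assignments.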
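
As a concrete illustration, the sketch below writes down one degree-2 pseudo-expectation on a triangle and checks it against the constraints, assuming they read $\tilde{\mathbb{E}}[1] = 1$, $\tilde{\mathbb{E}}[p^2] \ge 0$, linearity, and $\tilde{\mathbb{E}}[x_i^2] = \tilde{\mathbb{E}}[x_i]$. The specific pseudo-moments (1/2 and 1/8) are a hand-picked example, not taken from the slides, and the random test of constraint (2) is only a spot-check, not a proof of positive semidefiniteness.

```python
import random

# Degree-2 pseudo-moments on the triangle (a hypothetical example, chosen by hand):
# pE[1] = 1, pE[x_i] = 1/2, pE[x_i^2] = 1/2, pE[x_i x_j] = 1/8 for i != j.
n = 3
edges = [(0, 1), (1, 2), (0, 2)]

def moment_matrix():
    """Rows/columns are indexed by the monomials 1, x_0, x_1, x_2."""
    M = [[0.0] * (n + 1) for _ in range(n + 1)]
    M[0][0] = 1.0                      # constraint (1): pE[1] = 1
    for i in range(1, n + 1):
        M[0][i] = M[i][0] = 0.5        # pE[x_i]
        for j in range(1, n + 1):
            # constraint (4) with p = 1: pE[x_i^2] = pE[x_i] = 1/2
            M[i][j] = 0.5 if i == j else 0.125
    return M

def pe_square(M, c):
    """pE[p^2] for p = c[0] + sum_i c[i+1] * x_i, expanded by linearity (3)."""
    return sum(c[a] * M[a][b] * c[b] for a in range(len(c)) for b in range(len(c)))

def objective(M):
    """Sum over edges of pE[(x_i - x_j)^2] = pE[x_i^2] + pE[x_j^2] - 2 pE[x_i x_j]."""
    return sum(M[i + 1][i + 1] + M[j + 1][j + 1] - 2 * M[i + 1][j + 1] for i, j in edges)

M = moment_matrix()
rng = random.Random(0)
# Spot-check constraint (2) on random degree-1 polynomials (evidence, not a proof).
worst = min(pe_square(M, [rng.uniform(-1, 1) for _ in range(n + 1)]) for _ in range(10000))
print(objective(M), worst >= -1e-12)
```

The objective here is 9/4, which exceeds the true maximum cut of 2 for the triangle: the operator satisfies the degree-2 constraints without being the expectation of any distribution over cuts.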

  18. Degree d relaxation for MAXCUT (continued). But why is this a relaxation for MAXCUT?

  20. Degree d relaxation for MAXCUT (continued). Claim: If there is a cut that has at least k edges crossing, there is a feasible solution to (1) – (4) with objective value ≥ k. Proof: if $a_1, a_2, \ldots, a_n$ is the indicator vector of the cut U, set $\tilde{\mathbb{E}}[p] = p(a_1, a_2, \ldots, a_n)$ for every polynomial $p$ of degree ≤ d.
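
The proof can be spelled out constraint by constraint. The derivation below assumes the natural choice of operator, evaluation at the indicator vector $a = (a_1, \ldots, a_n)$, i.e. $\tilde{\mathbb{E}}[p] = p(a)$, and assumes the four constraints are normalization, positivity on squares, linearity, and the 0/1 constraint.

```latex
\begin{align*}
\text{(1)}\quad & \tilde{\mathbb{E}}[1] = 1, \\
\text{(2)}\quad & \tilde{\mathbb{E}}[p^2] = p(a)^2 \ge 0, \\
\text{(3)}\quad & \tilde{\mathbb{E}}[\alpha p + \beta q] = \alpha\, p(a) + \beta\, q(a)
                  = \alpha\, \tilde{\mathbb{E}}[p] + \beta\, \tilde{\mathbb{E}}[q], \\
\text{(4)}\quad & \tilde{\mathbb{E}}[x_i (x_i - 1)\, p] = a_i (a_i - 1)\, p(a) = 0
                  \quad \text{since } a_i \in \{0, 1\}, \\
\text{objective}\quad & \tilde{\mathbb{E}}\Big[\sum_{(i,j) \in E} (x_i - x_j)^2\Big]
                  = \sum_{(i,j) \in E} (a_i - a_j)^2
                  = |E(U, V \setminus U)| \ge k.
\end{align*}
```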

  24. Can we efficiently solve this relaxation? Theorem: There is an $n^{O(d)}$-time algorithm for finding such an operator, if it exists. It is a semidefinite program on an $n^{O(d)} \times n^{O(d)}$ matrix whose entries are the pseudo-expectation applied to monomials. How well does SOS approximate MAXCUT?
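
To get a feel for the $n^{O(d)}$ size, the sketch below counts the rows of the moment matrix under the usual convention that rows and columns are indexed by monomials of degree at most d/2. That indexing convention and the function name are assumptions made for this illustration; the count itself is the standard binomial coefficient for monomials of bounded degree.

```python
from math import comb

def moment_matrix_side(n, d):
    """Number of monomials of degree <= d // 2 in n variables: C(n + d//2, d//2)."""
    half = d // 2
    return comb(n + half, half)

# Side length of the matrix for n = 100 variables at a few even degrees.
for d in (2, 4, 6):
    print(d, moment_matrix_side(100, d))
```

For n = 100 this gives 101, 5151 and 176851 rows at degrees 2, 4 and 6, which is why low-degree relaxations are the practical ones.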

  26. APPROXIMATION ALGORITHMS FOR MAXCUT. Revolutionary work of [Goemans, Williamson]: Theorem: There is a 0.878-approximation algorithm for MAXCUT. We will give an alternate proof by rounding the degree two Sum-of-Squares relaxation.

  29. Main Question: How do you round a pseudo-expectation to find a cut? I.e., if I give you $\tilde{\mathbb{E}}$, how do you find a cut with at least $0.878 \cdot \tilde{\mathbb{E}}\big[\sum_{(i,j) \in E} (x_i - x_j)^2\big]$ edges crossing (in expectation)? Main Idea: Use a sample from a Gaussian distribution whose moments match the pseudo-moments. Aside: Rounding higher degree relaxations is much harder because you cannot necessarily find a random variable whose moments match the pseudo-moments.

  31. Claim: Without loss of generality, can assume $\tilde{\mathbb{E}}[x_i] = 1/2$ for all i. Intuition: You can always change U to V\U without changing the value of the cut, so WLOG each vertex i has probability 1/2 of being in U.
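
One way to make the intuition precise is to symmetrize the operator. The sketch below assumes the claim is $\tilde{\mathbb{E}}[x_i] = 1/2$ and introduces the averaged operator $\tilde{\mathbb{E}}'$, a name used only for this illustration; $\mathbf{1}$ denotes the all-ones vector.

```latex
\begin{align*}
\tilde{\mathbb{E}}'[p(x)] &:= \tfrac12\, \tilde{\mathbb{E}}[p(x)] + \tfrac12\, \tilde{\mathbb{E}}[p(\mathbf{1} - x)], \\
\tilde{\mathbb{E}}'[x_i] &= \tfrac12\, \tilde{\mathbb{E}}[x_i] + \tfrac12 \big(1 - \tilde{\mathbb{E}}[x_i]\big) = \tfrac12, \\
\tilde{\mathbb{E}}'\big[(x_i - x_j)^2\big] &= \tilde{\mathbb{E}}\big[(x_i - x_j)^2\big]
  \quad \text{since } \big((1 - x_i) - (1 - x_j)\big)^2 = (x_i - x_j)^2 .
\end{align*}
```

Feasibility is preserved because the substitution $x \mapsto \mathbf{1} - x$ keeps degrees, maps squares to squares, and maps $x_i(x_i - 1)$ to itself.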

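  34. GAUSSIAN ROUNDING. Let y be a Gaussian vector with mean $\mu$ and covariance $\Sigma$, for $\mu_i = \tilde{\mathbb{E}}[x_i] = 1/2$ and $\Sigma_{ij} = \tilde{\mathbb{E}}[(x_i - 1/2)(x_j - 1/2)]$. Now set $z_i = 1$ if $y_i \ge 1/2$ and $z_i = 0$ otherwise. We will show that for each edge (i, j) we have $\mathbb{E}[(z_i - z_j)^2] \ge 0.878 \cdot \tilde{\mathbb{E}}[(x_i - x_j)^2]$, which, by linearity of expectation, will complete the proof.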
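
A minimal sketch of the rounding step, using only the standard library. It assumes the Gaussian has mean 1/2 in every coordinate and that its covariance is supplied through a Gram factorization: unit vectors $v_i$ with $\langle v_i, v_j \rangle$ equal to four times the covariance of $y_i$ and $y_j$. The triangle instance and its pseudo-moments (1/2 and 1/8) are a hypothetical input chosen for illustration, not data from the slides.

```python
import math
import random

def gaussian_round(vectors, rng):
    """Sample y = 1/2 + (1/2) * V g with g standard Gaussian, then threshold at 1/2.

    `vectors[i]` is a unit vector v_i with <v_i, v_j> = 4 * Cov(y_i, y_j), i.e. an
    assumed factorization of the pseudo-covariance (any Gram factorization works).
    """
    k = len(vectors[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(k)]
    y = [0.5 + 0.5 * sum(vi * gi for vi, gi in zip(v, g)) for v in vectors]
    return [1 if yi >= 0.5 else 0 for yi in y]

def cut_value(edges, z):
    """Number of crossing edges for a 0/1 assignment z."""
    return sum((z[i] - z[j]) ** 2 for i, j in edges)

# Hypothetical input: triangle with pE[x_i] = 1/2 and pE[x_i x_j] = 1/8, so the
# covariance is 1/4 on the diagonal and -1/8 off it: unit vectors 120 degrees apart.
edges = [(0, 1), (1, 2), (0, 2)]
vectors = [(math.cos(2 * math.pi * t / 3), math.sin(2 * math.pi * t / 3)) for t in range(3)]

rng = random.Random(0)
samples = [cut_value(edges, gaussian_round(vectors, rng)) for _ in range(2000)]
print(sum(samples) / len(samples))
```

On this instance the relaxation value is 9/4, so the guarantee asks for an expected cut of at least about 0.878 · 2.25 ≈ 1.98; thresholding three vectors that sum to zero essentially always separates one vertex from the other two.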

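  42. For each edge (i, j), calculate its contribution to the objective value: $\tilde{\mathbb{E}}[(x_i - x_j)^2] = \tilde{\mathbb{E}}[x_i] + \tilde{\mathbb{E}}[x_j] - 2\,\tilde{\mathbb{E}}[x_i x_j] = (1 - \rho)/2$ for $\rho = 4\,\tilde{\mathbb{E}}[x_i x_j] - 1$. And its contribution to the expected number of edges crossing: $\mathbb{E}[(z_i - z_j)^2] = \Pr[z_i \ne z_j]$, where we can write $y_i - 1/2 = g_1/2$ and $y_j - 1/2 = (\rho\, g_1 + \sqrt{1 - \rho^2}\, g_2)/2$. Now we can compute: $\Pr[z_i \ne z_j] = \Pr[\mathrm{sign}(g_1) \ne \mathrm{sign}(\rho\, g_1 + \sqrt{1 - \rho^2}\, g_2)] = \arccos(\rho)/\pi$, with $g_1, g_2$ independent standard Gaussians.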
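
The sign-disagreement probability for two correlated Gaussians can be checked by simulation. The sketch below assumes the standard closed form $\arccos(\rho)/\pi$ for correlation $\rho$ and compares it to a Monte Carlo estimate; the function name, sample size, and the test value $\rho = -0.5$ are choices made for this illustration.

```python
import math
import random

def disagreement_rate(rho, trials=100000, seed=0):
    """Estimate Pr[sign(g1) != sign(rho*g1 + sqrt(1-rho^2)*g2)] by sampling."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(trials):
        g1, g2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # True counts as 1: the two thresholded values land on different sides.
        hits += (g1 >= 0) != (rho * g1 + s * g2 >= 0)
    return hits / trials

rho = -0.5
print(disagreement_rate(rho), math.acos(rho) / math.pi)
```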

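  43. Putting it all together, we have for every edge (i, j), writing $\rho = 4\,\tilde{\mathbb{E}}[x_i x_j] - 1$: $\mathbb{E}[(z_i - z_j)^2] = \arccos(\rho)/\pi \ge 0.878 \cdot (1 - \rho)/2 = 0.878 \cdot \tilde{\mathbb{E}}[(x_i - x_j)^2]$, which completes the proof.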
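
The constant in the final inequality can be recovered numerically. The sketch below assumes the per-edge comparison is between $\arccos(\rho)/\pi$ (rounded cut) and $(1 - \rho)/2$ (relaxation value) and does a plain grid search for the worst-case ratio over $\rho \in [-1, 1)$; the grid resolution and function names are arbitrary choices for illustration.

```python
import math

def ratio(rho):
    """(arccos(rho) / pi) divided by (1 - rho) / 2, for -1 <= rho < 1."""
    return (math.acos(rho) / math.pi) / ((1.0 - rho) / 2.0)

def worst_case_ratio(steps=200000):
    """Grid search for the minimum ratio over rho in [-1, 1)."""
    return min(ratio(-1.0 + 2.0 * t / steps) for t in range(steps))

print(round(worst_case_ratio(), 4))
```

The minimum comes out at roughly 0.878, attained for a negative correlation of about -0.69, which is where the Goemans-Williamson constant comes from.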
