SOS is not obviously automatizable, even approximately
Presentation of a paper by Ryan O'Donnell
November 21, 2017
Introduction
We only look at feasibility, not optimization. We use the Ellipsoid algorithm for solving SDPs.
◮ It needs a polynomial-time separation oracle for the constraints; the PSD-ness constraint has one.
◮ It needs technical assumptions on the solution space.
Technical assumptions for the Ellipsoid algorithm
Let $V$ be the feasible region of a given convex optimization problem. Parameters:
1. $R$: radius of an initial $\ell_2$-norm ball containing $V$.
2. $r$: a number such that $V \neq \emptyset \iff V$ contains some $\ell_2$-norm ball of radius $r$.
Ellipsoid algorithm
◮ Start with the ball of radius $R$ (initial ellipsoid).
◮ Repeatedly find a constraint violated by the center, and construct the next ellipsoid based on that.
Termination:
1. The center of the ellipsoid is feasible.
2. The volume gets too small, so there is no solution.
The Ellipsoid algorithm runs in time polynomial in $\log(R/r)$.
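The two steps above can be sketched in code. Below is a minimal, pure-Python central-cut ellipsoid method for a 2-D linear feasibility problem; the function name, the iteration cap, and the degeneracy threshold are our own illustrative choices, not from the paper:

```python
import math

def ellipsoid_feasibility(constraints, R=10.0, max_iter=500):
    """Toy 2-D ellipsoid method for feasibility of {x : a.x <= b for all (a, b)}.
    The ellipsoid is {x : (x - c)^T P^{-1} (x - c) <= 1}; we start with the
    ball of radius R and apply the standard central-cut update."""
    n = 2
    c = [0.0, 0.0]                       # center of the current ellipsoid
    P = [[R * R, 0.0], [0.0, R * R]]     # shape matrix of the ball of radius R
    for _ in range(max_iter):
        # Separation oracle: find a constraint violated by the center.
        viol = next(((a, b) for (a, b) in constraints
                     if a[0] * c[0] + a[1] * c[1] > b), None)
        if viol is None:
            return c                     # center is feasible
        a, _ = viol
        # Pa = P @ a and the scalar a^T P a.
        Pa = [P[0][0] * a[0] + P[0][1] * a[1],
              P[1][0] * a[0] + P[1][1] * a[1]]
        aPa = a[0] * Pa[0] + a[1] * Pa[1]
        if aPa <= 1e-30:
            return None                  # ellipsoid has degenerated: no solution
        s = math.sqrt(aPa)
        # Central-cut update in dimension n.
        c = [c[i] - Pa[i] / (s * (n + 1)) for i in range(n)]
        f = n * n / (n * n - 1.0)
        P = [[f * (P[i][j] - 2.0 / (n + 1) * Pa[i] * Pa[j] / aPa)
              for j in range(n)] for i in range(n)]
    return None

# Example: the triangle x1 + x2 <= 3, x1 >= 1, x2 >= 1.
point = ellipsoid_feasibility([([1.0, 1.0], 3.0),
                               ([-1.0, 0.0], -1.0),
                               ([0.0, -1.0], -1.0)])
```

Since the feasible region always stays inside the current ellipsoid, the loop must return a feasible center before the volume drops below the region's volume, which is what makes the $\log(R/r)$ bound work.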
Ellipsoid algorithm running time
For linear programming, let $L$ be the total number of bits in all coefficients together. We can always take, without loss of generality, $R = O(2^L)$. Usually, however, $r = 0$; we can modify the problem by relaxing each equation $a_i \cdot x = b_i$ to $-\epsilon \le a_i \cdot x - b_i \le \epsilon$ for small enough $\epsilon$. Therefore, the running time on LPs is polynomial in $L$.
Degree-$d$ SOS running time
Degree-$d$ SOS can typically be formulated in $n^{O(d)}$ bits, so $L = n^{O(d)}$. But what are $r$ and $R$? The paper gives an example where every SOS proof has very large coefficients: very large $= 2^{\Omega(2^n)}$. If we start with an ellipsoid centered at $0$, then $R = 2^{\Omega(2^n)}$, and it seems that $r = 0$ (?).
Preliminary observation
SDP solutions can need doubly exponential coefficients:
$x_1 = 2, \qquad x_{i+1} = x_i^2 \;\; \forall i$
Solution: $x_n = 2^{2^{n-1}}$.
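The recursion above can be checked directly with exact integer arithmetic; a quick sketch (the function name is ours):

```python
def chain_solution(n):
    """Unique solution of x_1 = 2, x_{i+1} = x_i^2: x_i = 2^(2^(i-1))."""
    xs = [2]
    for _ in range(n - 1):
        xs.append(xs[-1] ** 2)  # squaring doubles the exponent each step
    return xs

# chain_solution(4) -> [2, 4, 16, 256], i.e. 2^1, 2^2, 2^4, 2^8
```

Each squaring doubles the bit length, so writing down $x_n$ already takes $2^{n-1}$ bits.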
The example with large coefficients
Given the constraints
$2x_1y_1 = y_1,\;\; 2x_2y_2 = y_2,\;\; \ldots,\;\; 2x_ny_n = y_n$
$x_1^2 = x_1,\;\; x_2^2 = x_2,\;\; \ldots,\;\; x_n^2 = x_n$
$y_1^2 = y_2,\;\; y_2^2 = y_3,\;\; \ldots,\;\; y_n^2 = 0$
prove that $p_n(x,y) = x_1 + x_2 + \cdots + x_n - 2y_1 \ge 0$.
The example with large coefficients
Solution by hand:
◮ Solve the second row to get $x_i \in \{0,1\}$ for all $i$.
◮ Solve the third row (backwards from $y_n^2 = 0$) to get $y_i = 0$ for all $i$, so $p_n(x,y) = \sum_i x_i \ge 0$.
The example with large coefficients
How do we show that SOS needs large coefficients? We focus on degree-2 SOS here.
The example with large coefficients
Working "mod the ideal": solve
$p_n(x,y) \equiv \sum_j \ell_j(x,y)^2 \pmod{(K)}$
where $K$ is the set of equations above and $(K)$ the ideal they generate.
The example with large coefficients
Solution:
$p_n(x,y) \equiv \sum_i \left(x_i - 2^{2^{i-1}} y_i\right)^2 \pmod{(K)}$
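One way to sanity-check this identity is to expand each square and apply the rewrite rules $x_i^2 \to x_i$, $2x_iy_i \to y_i$, $y_i^2 \to y_{i+1}$ (with $y_n^2 \to 0$) using exact integers; a small sketch (names are ours):

```python
def reduce_sos_certificate(n):
    """Reduce sum_i (x_i - c_i y_i)^2 with c_i = 2^(2^(i-1)) modulo the
    rules x_i^2 -> x_i, 2 x_i y_i -> y_i, y_i^2 -> y_{i+1} (y_n^2 -> 0).
    Returns the coefficients of x_i and y_i in the reduced polynomial."""
    coeff_x = {i: 0 for i in range(1, n + 1)}
    coeff_y = {i: 0 for i in range(1, n + 1)}
    for i in range(1, n + 1):
        c = 2 ** (2 ** (i - 1))
        coeff_x[i] += 1               # x_i^2       -> x_i
        coeff_y[i] -= c               # -2c x_i y_i -> -c y_i
        if i < n:
            coeff_y[i + 1] += c * c   # c^2 y_i^2   -> c^2 y_{i+1}
    return coeff_x, coeff_y
```

The $-c$ contribution to $y_{i+1}$ from step $i+1$ exactly cancels the $c^2$ contribution from step $i$ because $c_{i+1} = c_i^2$; what remains is coefficient 1 on each $x_i$ and $-2$ on $y_1$ alone, i.e. exactly $p_n$.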
The example with large coefficients
$p_n(x,y) \equiv \sum_j \ell_j(x,y)^2 \pmod{(K)}$
1. We ignore the "cross terms" $x_ix_j$, $x_iy_j$ and $y_iy_j$ ($i \neq j$): they do not "mix" with the rest through the ideal.
2. Each $\ell_j$ must have zero constant term.
Proof: the constant term of the sum is of the form $\sum_j c_j^2$ and is not reduced by the ideal, while $p_n(x,y)$ has zero constant term.
The example with large coefficients
Therefore, if $\ell_j = \sum_i a_{ij} x_i + \sum_i b_{ij} y_i$, then
$\sum_j \ell_j(x,y)^2 \equiv \sum_i \left(A_i^2 x_i^2 + 2M_i x_iy_i + B_i^2 y_i^2\right) \pmod{(K, \text{cross terms})}$
where $A_i = \sqrt{\sum_j a_{ij}^2}$, $B_i = \sqrt{\sum_j b_{ij}^2}$ and $M_i = \sum_j a_{ij} b_{ij}$. Note that by Cauchy–Schwarz, $|M_i| \le A_i B_i$.
The example with large coefficients
Reducing by the constraints ($x_i^2 = x_i$, $2x_iy_i = y_i$, $y_i^2 = y_{i+1}$), we get
$\sum_i \left(A_i^2 x_i^2 + 2M_i x_iy_i + B_i^2 y_i^2\right) \equiv \sum_i \left(A_i^2 x_i + M_i y_i + B_i^2 y_{i+1}\right) \pmod{(K)}$
where $y_{n+1} = 0$.
The example with large coefficients
$\sum_i x_i - 2y_1 \equiv \sum_i \left(A_i^2 x_i + M_i y_i + B_i^2 y_{i+1}\right) \pmod{(K, \text{cross terms})}$
Both sides are now linear, so we can drop the "mod" and match coefficients:
1. $A_i = 1$ for all $i$.
2. $M_1 = -2$.
3. $M_{i+1} = -B_i^2$.
Combining this with $|M_i| \le A_iB_i$, we get $B_1 \ge 2$ and $B_{i+1} \ge B_i^2$. Therefore $B_n \ge 2^{2^{n-1}}$, so the largest coefficient is doubly exponential.
Part 2: Even approximately
Even degree-2 SOS proofs of the approximate version $p_n(x,y) \ge -o_n(1)$ need coefficients of size $2^{\Omega(2^n)}$. It turns out we can look at $p_n(x,y) \ge -0.01$.
Analysis of approximate case
We can still disregard cross terms $x_kx_{k'}$, $x_ky_{k'}$ and $y_ky_{k'}$ ($k \neq k'$), but the linear functions may now have nonzero constant terms. Therefore, if $\ell_j = \sum_i a_{ij}x_i + \sum_i b_{ij}y_i + c_j$, then $\sum_j \ell_j(x,y)^2$ becomes
$\sum_i \left(A_i^2x_i^2 + 2M_ix_iy_i + B_i^2y_i^2 + 2U_ix_i + 2V_iy_i\right) + C^2$
where $A_i$, $B_i$ and $M_i$ are as before, $U_i = \sum_j a_{ij}c_j$, $V_i = \sum_j b_{ij}c_j$, and $C = \sqrt{\sum_j c_j^2}$.
1. By Cauchy–Schwarz, $|U_i| \le A_iC$ and $|V_i| \le B_iC$.
2. Reducing modulo the ideal, we get
$\sum_i \left(A_i^2x_i^2 + 2M_ix_iy_i + B_i^2y_i^2 + 2U_ix_i + 2V_iy_i\right) + C^2 \equiv \sum_i \left((A_i^2 + 2U_i)x_i + (M_i + 2V_i)y_i + B_i^2y_{i+1}\right) + C^2$
Analysis of approximate case
$\sum_i \left((A_i^2 + 2U_i)x_i + (M_i + 2V_i)y_i + B_i^2y_{i+1}\right) + C^2 = \sum_i x_i - 2y_1 + 0.01$
Matching coefficients:
1. $C^2 = 0.01$.
2. $A_i^2 + 2U_i = 1 \;\;\forall i$.
3. $M_1 + 2V_1 = -2$.
4. $M_{i+1} + 2V_{i+1} = -B_i^2 \;\;\forall i$.
Analysis of approximate case
1. $C^2 = 0.01$.
2. $A_i^2 + 2U_i = 1 \;\;\forall i$.
3. $M_1 + 2V_1 = -2$.
4. $M_{i+1} + 2V_{i+1} = -B_i^2 \;\;\forall i$.
5. $C = 0.1$, so $|U_i| \le 0.1A_i$ and $|V_i| \le 0.1B_i$.
6. $A_i^2 - 0.2A_i \le 1 \implies A_i \le 1.2 \;\;\forall i$.
7. $|M_1| \ge 2 - 0.2B_1$.
8. $|M_{i+1}| \ge B_i^2 - 0.2B_{i+1} \;\;\forall i$.
Now combine with $|M_i| \le A_iB_i \le 1.2B_i$ to get
1. $1.2B_1 \ge 2 - 0.2B_1$.
2. $1.2B_{i+1} \ge B_i^2 - 0.2B_{i+1} \;\;\forall i$.
So $B_i \ge 1.4\,(2/1.4^2)^{2^{i-1}}$, which is doubly exponential since $2/1.4^2 > 1$.
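The final bound can be checked numerically by iterating the extremal case of the recursion ($B_1 = 2/1.4$ and $B_{i+1} = B_i^2/1.4$, the smallest values the inequalities allow) and comparing with the closed form; the helper below is our sketch:

```python
def approximate_lower_bounds(n):
    """Iterate B_1 = 2/1.4, B_{i+1} = B_i^2 / 1.4 (the extremal case of
    1.4 B_1 >= 2 - 0.2 B_1 and 1.4 B_{i+1} >= B_i^2) and compare with
    the closed form 1.4 * (2/1.4^2)^(2^(i-1))."""
    bounds = [2 / 1.4]
    for _ in range(n - 1):
        bounds.append(bounds[-1] ** 2 / 1.4)
    closed = [1.4 * (2 / 1.4 ** 2) ** (2 ** (i - 1))
              for i in range(1, n + 1)]
    return bounds, closed
```

Writing $B_i = 1.4\,b_i$ turns the recursion into $b_{i+1} = b_i^2$ with $b_1 = 2/1.4^2 \approx 1.02 > 1$, which is exactly the doubly exponential pattern from the preliminary observation.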
Analysis of approximate case: Archimedean constraints
The paper goes on to show that even after adding the constraints $x_i^2 \le 1$ and $y_i^2 \le 1$, doubly exponential coefficients are still needed.