approximation algorithms i

approximation algorithms I, David Steurer (Cornell), Cargese Workshop



  1. SUM-OF-SQUARES method and approximation algorithms I. David Steurer, Cornell. Cargese Workshop, 2014

  2. meta-task
     given: functions $f_1, \ldots, f_m \colon \{\pm 1\}^n \to \mathbb{R}$, each encoded as a low-degree polynomial in $\mathbb{R}[x]$
     find: solution $x \in \{\pm 1\}^n$ to $f_1 = 0, \ldots, f_m = 0$
     example: $f(x) = \sum_{i,j \in [n]} w_{ij} \cdot (x_i - x_j)^2$
     examples: combinatorial optimization problems on a graph $G$ with Laplacian $L_G = \frac{1}{|E(G)|} \sum_{ij \in E(G)} \frac{1}{4} (x_i - x_j)^2$
     MAX CUT: $L_G = 1 - \varepsilon$ over $\{\pm 1\}^n$, where $1 - \varepsilon$ is a guess for the optimum value
     MAX BISECTION: $L_G = 1 - \varepsilon$, $\sum_i x_i = 0$ over $\{\pm 1\}^n$
     goal: develop SDP-based algorithms with provable guarantees in terms of complexity and approximation ("on the edge of intractability" → need strongest possible relaxations)
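To make the MAX CUT encoding on slide 2 concrete, here is a minimal brute-force sketch. The triangle graph and the names `laplacian_value`, `triangle`, `best`, and `bisections` are illustrative assumptions, not part of the slides.

```python
from itertools import product

def laplacian_value(edges, x):
    """L_G(x) = (1/|E|) * sum over edges ij of (x_i - x_j)^2 / 4 = fraction of edges cut."""
    return sum((x[i] - x[j]) ** 2 / 4 for i, j in edges) / len(edges)

# Hypothetical toy instance: the triangle graph on vertices 0, 1, 2.
triangle = [(0, 1), (1, 2), (0, 2)]

# Brute-force MAX CUT: maximize L_G over all 2^n sign vectors.
best = max(laplacian_value(triangle, x) for x in product([-1, 1], repeat=3))

# MAX BISECTION adds the constraint sum_i x_i = 0, which has no solution when n is odd.
bisections = [x for x in product([-1, 1], repeat=3) if sum(x) == 0]
```

For the triangle, any cut separates at most two of the three edges, so `best` is 2/3 and the guess $1 - \varepsilon$ would have to be at most 2/3 for the system to be satisfiable.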

  3. meta-task
     given: functions $f_1, \ldots, f_m \colon \{\pm 1\}^n \to \mathbb{R}$
     find: solution $x \in \{\pm 1\}^n$ to $f_1 = 0, \ldots, f_m = 0$
     goal: develop SDP-based algorithms with provable guarantees in terms of complexity and approximation
     price of convexity: individual solutions → distributions over solutions
     price of tractability: can only enforce "efficiently checkable knowledge" about solutions
     [figure: individual solutions, contained in distributions over solutions, contained in "pseudo-distributions over solutions" (consistent with efficiently checkable knowledge)]

  4. distribution $D$ over $\{\pm 1\}^n$
     function $D \colon \{\pm 1\}^n \to \mathbb{R}$ (# function values is exponential → need careful representation)
     non-negativity: $D(x) \ge 0$ for all $x \in \{\pm 1\}^n$ (# independent inequalities is exponential → not efficiently checkable)
     normalization: $\sum_{x \in \{\pm 1\}^n} D(x) = 1$
     examples: uniform distribution, $D = 2^{-n}$; fixed 2-bit parity, $D(x) = (1 + x_1 x_2)/2^n$
     distribution $D$ satisfies $f_1 = 0, \ldots, f_m = 0$ for some $f_i \colon \{\pm 1\}^n \to \mathbb{R}$ if $\mathbb{E}_D (f_1^2 + \cdots + f_m^2) = 0$ (equivalently: $\mathbb{P}_D\{\exists i.\ f_i \ne 0\} = 0$)
     convex: $D, D'$ satisfy conditions → $(D + D')/2$ satisfies conditions
     examples: the fixed 2-bit parity distribution satisfies $x_1 x_2 = 1$; the uniform distribution does not satisfy $f = 0$ for any $f \ne 0$
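The two example distributions on slide 4 can be checked by direct enumeration for small $n$. The sketch below assumes $n = 3$; the helper names (`expectation`, `is_distribution`, `satisfies`, `constraint`) are hypothetical.

```python
from itertools import product

n = 3
cube = list(product([-1, 1], repeat=n))

# The two examples from the slide, stored as explicit tables of 2^n values.
uniform = {x: 2 ** -n for x in cube}
parity = {x: (1 + x[0] * x[1]) / 2 ** n for x in cube}

def expectation(D, f):
    """E_D f = sum_x D(x) f(x)."""
    return sum(D[x] * f(x) for x in cube)

def is_distribution(D):
    """Non-negativity and normalization, checked point by point (exponential in n)."""
    return all(p >= 0 for p in D.values()) and abs(sum(D.values()) - 1) < 1e-12

def satisfies(D, fs):
    """D satisfies f_1 = 0, ..., f_m = 0 iff E_D[f_1^2 + ... + f_m^2] = 0."""
    return abs(expectation(D, lambda x: sum(f(x) ** 2 for f in fs))) < 1e-12

constraint = lambda x: x[0] * x[1] - 1  # encodes x_1 x_2 = 1
```

The explicit tables have $2^n$ entries, which is the representation problem the slide points out.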

  5. deg.-$d$ pseudo-distribution $D$ over $\{\pm 1\}^n$
     function $D \colon \{\pm 1\}^n \to \mathbb{R}$
     non-negativity (replaces $D(x) \ge 0$ for all $x$): $\sum_{x \in \{\pm 1\}^n} D(x) f(x)^2 \ge 0$ for every deg.-$d/2$ polynomial $f$
     normalization: $\sum_{x \in \{\pm 1\}^n} D(x) = 1$
     convenient notation: $\mathbb{E}_D f \coloneqq \sum_x D(x) f(x)$, the "pseudo-expectation of $f$ under $D$"
     pseudo-distribution $D$ satisfies $f_1 = 0, \ldots, f_m = 0$ for some $f_i \colon \{\pm 1\}^n \to \mathbb{R}$ if $\mathbb{E}_D (f_1^2 + \cdots + f_m^2) = 0$ (for an actual distribution, equivalently: $\mathbb{P}_D\{\exists i.\ f_i \ne 0\} = 0$)
     deg.-$2n$ pseudo-distributions are actual distributions (point-indicators $\mathbf{1}_x$ have deg. $n$ → $D(x) = \mathbb{E}_D \mathbf{1}_x^2 \ge 0$)

  6. deg.-$d$ pseudo-distr. $D \colon \{\pm 1\}^n \to \mathbb{R}$
     notation: $\mathbb{E}_D f \coloneqq \sum_x D(x) f(x)$, the "pseudo-expectation of $f$ under $D$"
     non-negativity: $\mathbb{E}_D f^2 \ge 0$ for every deg.-$d/2$ poly. $f$
     normalization: $\mathbb{E}_D 1 = 1$
     pseudo-distr. $D$ satisfies $f_1 = 0, \ldots, f_m = 0$ for some $f_i \colon \{\pm 1\}^n \to \mathbb{R}$ if $\mathbb{E}_D (f_1^2 + \cdots + f_m^2) = 0$ (equivalently: $\mathbb{E}_D f_i \cdot g = 0$ whenever $\deg g \le d - \deg f_i$)
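The definition on slide 6 can be tested numerically for tiny $n$. The sketch below is an illustration under assumptions: it stores $D$ as an explicit table, checks $\mathbb{E}_D f^2 \ge 0$ through the matrix of pseudo-moments of monomials of degree at most $d/2$, and uses a hand-rolled positive-semidefiniteness test. The example $D$ and all function names are hypothetical, not taken from the slides.

```python
from itertools import combinations, product
from math import prod

def moment_matrix(D, n, d):
    """Matrix with entries E_D[chi_S * chi_T] over monomials chi_S of degree <= d/2."""
    monos = [S for r in range(d // 2 + 1) for S in combinations(range(n), r)]
    chi = lambda S, x: prod(x[i] for i in S)
    return [[sum(p * chi(S, x) * chi(T, x) for x, p in D.items()) for T in monos]
            for S in monos]

def is_psd(A, tol=1e-9):
    """PSD test by symmetric elimination (repeated Schur complements)."""
    A = [row[:] for row in A]
    size = len(A)
    for k in range(size):
        if A[k][k] < -tol:
            return False
        if A[k][k] <= tol:
            # A zero pivot is only allowed if the rest of its row vanishes.
            if any(abs(A[k][j]) > tol for j in range(k + 1, size)):
                return False
            continue
        for i in range(k + 1, size):
            r = A[i][k] / A[k][k]
            for j in range(k + 1, size):
                A[i][j] -= r * A[k][j]
    return True

def is_pseudo_distribution(D, n, d):
    normalized = abs(sum(D.values()) - 1) < 1e-9
    return normalized and is_psd(moment_matrix(D, n, d))

# Hypothetical example on n = 3: all pairwise pseudo-correlations equal -1/2.
cube = list(product([-1, 1], repeat=3))
D = {x: (1 - 0.5 * (x[0] * x[1] + x[0] * x[2] + x[1] * x[2])) / 8 for x in cube}
```

This `D` is negative at the two constant points, so it is not a distribution, yet it passes the degree-2 test and fails the degree-4 test. Its pseudo-expectation of the triangle graph's $L_G$ is 3/4, above the true MAX CUT value 2/3.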

  7. deg.-$d$ pseudo-distr. $D \colon \{\pm 1\}^n \to \mathbb{R}$ (notation and conditions as in slide 6)
     claim: can compute such $D$ in time $n^{O(d)}$ if it exists (otherwise, certify that no solution to the original problem exists) [Shor, Parrilo, Lasserre]
     (can assume $D$ is a deg.-$d$ polynomial → the separation problem $\min_f \mathbb{E}_D f^2$ is an $n^d$-dim. eigenvalue prob. → $n^{O(d)}$-time via grad. descent / ellipsoid method)
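One way to unpack the eigenvalue remark on slide 7, as a sketch: expand a polynomial of degree at most $d/2$ in the monomial basis and collect the pseudo-moments into a matrix. The symbols $c_S$, $\chi_S$, and $\mathcal{M}_D$ are notation introduced here for illustration, not taken from the slides.

```latex
f = \sum_{|S| \le d/2} c_S \, \chi_S, \qquad \chi_S(x) = \prod_{i \in S} x_i
\quad\Longrightarrow\quad
\mathbb{E}_D f^2 = \sum_{S,T} c_S \, c_T \, \mathbb{E}_D \chi_S \chi_T = c^{\top} \mathcal{M}_D \, c,
\qquad (\mathcal{M}_D)_{S,T} = \mathbb{E}_D \chi_S \chi_T .
```

Under this reading, the non-negativity condition holds for every such $f$ exactly when $\mathcal{M}_D$ is positive semidefinite, and a violated condition corresponds to an eigenvector of $\mathcal{M}_D$ with negative eigenvalue. The matrix has $n^{O(d)}$ entries.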

  8. deg.-$d$ pseudo-distr. $D \colon \{\pm 1\}^n \to \mathbb{R}$ (notation and conditions as in slide 6)
     surprising property: $\mathbb{E}_D f \ge 0$ for many* low-degree polynomials $f$ such that $f \ge 0$ follows from $f_1 = 0, \ldots, f_m = 0$ by "explicit proof"
     soon: examples of such properties and how to exploit them

  9. deg.-$d$ pseudo-distr. $D \colon \{\pm 1\}^n \to \mathbb{R}$ (notation and conditions as in slide 6)
     $n^{o(d)}$-time algorithms cannot* distinguish between deg.-$d$ pseudo-distributions and deg.-$d$ part of actual distr.'s
     surprising property: $\mathbb{E}_D f \ge 0$ for many* low-degree polynomials $f$ such that $f \ge 0$ follows from $f_1 = 0, \ldots, f_m = 0$ by "explicit proof"
     soon: examples of such properties and how to exploit them
     [figure: pseudo-distr. over optimal solutions (to original problem), standing in for the deg.-$d$ part of an actual distr. over optimal solutions → efficient algorithm → approximate solution]
     emerging algorithm-design paradigm: analyze the algorithm pretending that an underlying actual distribution exists; verify only afterwards that low-deg. pseudo-distr.'s satisfy the required properties

  10. dual view (sum-of-squares proof system)
      either ∃ deg.-$d$ pseudo-distribution $D$ over $\{\pm 1\}^n$ satisfying $f_1 = 0, \ldots, f_m = 0$
      or ∃ $g_1, \ldots, g_m$ and $h_1, \ldots, h_k$ such that $\sum_i f_i \cdot g_i + \sum_j h_j^2 = -1$ over $\{\pm 1\}^n$, with $\deg f_i + \deg g_i \le d$ and $\deg h_j \le d/2$ (a derivation of the unsatisfiable constraint $-1 \ge 0$ from $f_1 = 0, \ldots, f_m = 0$ over $\{\pm 1\}^n$)
      cone: $K_d = \{ f = \sum_i f_i \cdot g_i + \sum_j h_j^2 \}$, with the degree bounds above
      if $-1 \notin K_d$ then ∃ separating hyperplane $D$ with $\mathbb{E}_D (-1) = -1$ and $\mathbb{E}_D f \ge 0$ for all $f \in K_d$
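A small refutation in the sense of slide 10 can be verified by enumeration. The instance below, $x_1 + x_2 + x_3 = 0$ over $\{\pm 1\}^3$, and the choices of `g` and `h` are a hypothetical example constructed for illustration, not one from the slides.

```python
from itertools import product
from math import sqrt

# Constraint f = 0 with f = x_1 + x_2 + x_3; it has no solution on the cube (odd sum).
f = lambda x: x[0] + x[1] + x[2]

# Assumed certificate: multiplier g = -f (degree 1) and a single square h^2,
# where h = (f^2 - 1) / sqrt(8) has degree 2 once x_i^2 = 1 is applied.
g = lambda x: -f(x)
h = lambda x: (f(x) ** 2 - 1) / sqrt(8)

# Evaluate f*g + h^2 at every point of {+1,-1}^3; a valid refutation gives -1 everywhere.
certificate = [f(x) * g(x) + h(x) ** 2 for x in product([-1, 1], repeat=3)]
```

Here $\deg f + \deg g = 2 \le 4$ and $\deg h = 2 \le 4/2$, so this is a degree-4 certificate, and no degree-4 pseudo-distribution can satisfy $x_1 + x_2 + x_3 = 0$.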

  11. pseudo-distribution satisfies all local properties of $\{\pm 1\}^n$
      example: triangle inequalities over $\{\pm 1\}^n$: $\mathbb{E}_D \left[ (x_i - x_j)^2 + (x_j - x_k)^2 - (x_i - x_k)^2 \right] \ge 0$
      claim: suppose $f \ge 0$ is a $d/2$-junta over $\{\pm 1\}^n$ (depends on $\le d/2$ coordinates); then $\mathbb{E}_D f \ge 0$
      proof: $\sqrt{f}$ depends on the same $\le d/2$ coordinates, so it has degree $\le d/2$ → $\mathbb{E}_D f = \mathbb{E}_D (\sqrt{f})^2 \ge 0$
      corollary: for any set $S$ of $\le d/2$ coordinates, the marginal $D' = \{x_S\}_D$ is an actual distribution: $D'(x_S) = \sum_{x_{[n] \setminus S}} D(x_S, x_{[n] \setminus S}) = \mathbb{E}_D \mathbf{1}_{x_S} \ge 0$, since $\mathbf{1}_{x_S}$ is a non-negative $d/2$-junta
      (also captured by LP methods, e.g., Sherali–Adams hierarchies …)
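The locality statements on slide 11 can be spot-checked on three variables. In the sketch below, `D` is a hypothetical degree-2 pseudo-distribution (all pairwise pseudo-correlations equal to $-1/2$) chosen for illustration; `marginal` and `triangle_ok` are likewise assumed names.

```python
from itertools import product

cube = list(product([-1, 1], repeat=3))

# The triangle inequality holds pointwise on the cube; it depends on 3 coordinates only.
triangle_ok = all(
    (x[0] - x[1]) ** 2 + (x[1] - x[2]) ** 2 - (x[0] - x[2]) ** 2 >= 0 for x in cube
)

# Hypothetical degree-2 pseudo-distribution; it is negative at the two constant points.
D = {x: (1 - 0.5 * (x[0] * x[1] + x[0] * x[2] + x[1] * x[2])) / 8 for x in cube}

def marginal(D, S):
    """D'(x_S) = sum of D over all assignments to the coordinates outside S."""
    out = {}
    for x, p in D.items():
        key = tuple(x[i] for i in S)
        out[key] = out.get(key, 0.0) + p
    return out

one_coordinate = marginal(D, [0])          # |S| = 1 = d/2: guaranteed non-negative
all_coordinates = marginal(D, [0, 1, 2])   # |S| = 3 > d/2: no guarantee
```

The single-coordinate marginal is the uniform distribution on $\{\pm 1\}$, while the full table still contains negative entries, which matches the degree bound in the corollary.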

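  12. conditioning pseudo-distributions
      claim: $\forall i \in [n], \sigma \in \{\pm 1\}$: $D' = \{x \mid x_i = \sigma\}_D$ is a deg.-$(d - 2)$ pseudo-distr. (assuming $\mathbb{P}_D\{x_i = \sigma\} > 0$)
      proof: $D'(x) = \frac{1}{\mathbb{P}_D\{x_i = \sigma\}} D(x) \cdot \mathbf{1}_{x_i = \sigma}(x)$, so $\mathbb{E}_{D'} f^2 \propto \mathbb{E}_D \mathbf{1}_{x_i = \sigma} f^2 = \mathbb{E}_D (\mathbf{1}_{x_i = \sigma} f)^2 \ge 0$, because $\deg f \le (d - 2)/2$ → $\deg \mathbf{1}_{x_i = \sigma} f \le d/2$
      (also captured by LP methods, e.g., Sherali–Adams hierarchies …)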
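The reweighting in the proof on slide 12 is short enough to write out. This is a sketch on an explicit table; `condition`, `D`, and `D_cond` are hypothetical names, and `D` is an illustrative degree-2 pseudo-distribution on three variables rather than an example from the slides.

```python
from itertools import product

def condition(D, i, sigma):
    """Return D'(x) = D(x) * 1[x_i = sigma] / P_D{x_i = sigma}."""
    mass = sum(p for x, p in D.items() if x[i] == sigma)  # P_D{x_i = sigma}
    if mass <= 0:
        raise ValueError("conditioning event has non-positive pseudo-probability")
    return {x: (p / mass if x[i] == sigma else 0.0) for x, p in D.items()}

cube = list(product([-1, 1], repeat=3))
# Hypothetical degree-2 pseudo-distribution (pairwise pseudo-correlations -1/2).
D = {x: (1 - 0.5 * (x[0] * x[1] + x[0] * x[2] + x[1] * x[2])) / 8 for x in cube}

D_cond = condition(D, 0, 1)
mean_x0 = sum(p * x[0] for x, p in D_cond.items())  # x_0 is now fixed to +1
```

Starting from degree 2, the claim only guarantees a degree-0 object, that is, normalization; `D_cond` still has a negative entry at the all-ones point.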

  13. pseudo-covariances are covariances of distributions over $\mathbb{R}^n$
      claim: there exists a (Gaussian) distr. $\xi$ over $\mathbb{R}^n$ such that $\mathbb{E}_D x = \mathbb{E}\, \xi$ and $\mathbb{E}_D x x^T = \mathbb{E}\, \xi \xi^T$
      consequence: $\mathbb{E}_D q = \mathbb{E}\, q(\xi)$ for every $q$ of deg. $\le 2$
      proof: let $\mu = \mathbb{E}_D x$ and $M = \mathbb{E}_D (x - \mu)(x - \mu)^T$; choose $\xi$ to be Gaussian with mean $\mu$ and covariance $M$
      matrix $M$ is p.s.d. because $v^T M v = \mathbb{E}_D (v^T (x - \mu))^2 \ge 0$ for all $v \in \mathbb{R}^n$ (square of a linear form)
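The quantities $\mu$ and $M$ from the proof on slide 13 can be computed directly from a table. The sketch below reuses a hypothetical degree-2 pseudo-distribution on three variables and only spot-checks the quadratic form on random vectors; it is not a proof of positive semidefiniteness, and all names are assumptions.

```python
import random
from itertools import product

n = 3
cube = list(product([-1, 1], repeat=n))
# Hypothetical degree-2 pseudo-distribution (not an actual distribution).
D = {x: (1 - 0.5 * (x[0] * x[1] + x[0] * x[2] + x[1] * x[2])) / 8 for x in cube}

def E(f):
    """Pseudo-expectation E_D f = sum_x D(x) f(x)."""
    return sum(D[x] * f(x) for x in cube)

# Pseudo-mean mu = E_D x and pseudo-covariance M = E_D (x - mu)(x - mu)^T.
mu = [E(lambda x, i=i: x[i]) for i in range(n)]
M = [[E(lambda x, i=i, j=j: (x[i] - mu[i]) * (x[j] - mu[j])) for j in range(n)]
     for i in range(n)]

def quad_form(M, v):
    """v^T M v, which equals E_D (v^T (x - mu))^2."""
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

# Spot check on random directions (seeded for reproducibility).
rng = random.Random(0)
min_q = min(quad_form(M, [rng.gauss(0, 1) for _ in range(n)]) for _ in range(1000))
```

For this `D` the pseudo-mean is zero and `M` has ones on the diagonal and $-1/2$ elsewhere, so a Gaussian with covariance `M` matches all degree-2 pseudo-moments even though `D` itself has negative entries.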

  14. pseudo-distr.'s satisfy (compositions of) low-deg. univariate properties
      claim: for every univariate $p \ge 0$ over $\mathbb{R}$ and every $n$-variate polynomial $q$ with $\deg p \cdot \deg q \le d$, $\mathbb{E}_D\, p(q(x)) \ge 0$ (a useful class of non-local higher-deg. inequalities)
      enough to show: $p$ is a sum of squares
      proof by induction on $\deg p$: choose a minimizer $\alpha$ of $p$ over $\mathbb{R}$, so $p(\alpha) \ge 0$; then $p = p(\alpha) + (x - \alpha)^2 \cdot p'$ for some polynomial $p'$ with $\deg p' < \deg p$
      $p(\alpha)$ and $(x - \alpha)^2$ are squares; $p' \ge 0$ because $p - p(\alpha) \ge 0$ has a double root at $\alpha$, so $p'$ is a sum of squares by ind. hyp.
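A worked instance of the induction on slide 14 may help. The quartic below is a hypothetical example chosen so that the minimizers are easy to read off; it does not come from the slides.

```latex
p(x) = x^4 - 2x^3 + 2x^2 - 2x + 2
     = \underbrace{1}_{p(\alpha),\ \alpha = 1} + (x - 1)^2 \cdot \underbrace{(x^2 + 1)}_{p'(x)},
\qquad
p'(x) = \underbrace{1}_{p'(0)} + (x - 0)^2 \cdot 1
\quad\Longrightarrow\quad
p(x) = 1^2 + (x - 1)^2 + \bigl( x (x - 1) \bigr)^2 .
```

The first step uses the minimizer $\alpha = 1$ of $p$, the second the minimizer $0$ of $p'$. Substituting a polynomial $q$ for $x$ turns each term into the square of a polynomial of degree at most $2 \deg q$, which is at most $d/2$ when $\deg p \cdot \deg q = 4 \deg q \le d$.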
