SUM-OF-SQUARES method and approximation algorithms I
David Steurer, Cornell
Cargese Workshop, 2014
meta-task
given: functions $f_1, \ldots, f_m \colon \{\pm 1\}^n \to \mathbb{R}$
find: solution $x \in \{\pm 1\}^n$ to $f_1 = 0, \ldots, f_m = 0$
(each function is encoded as a low-degree polynomial in $\mathbb{R}[x]$)
example: $f(x) = \sum_{i,j \in [n]} w_{ij} \cdot (x_i - x_j)^2$
examples: combinatorial optimization problems on a graph $G$
MAX CUT: $L_G = 1 - \varepsilon$ over $\{\pm 1\}^n$
MAX BISECTION: $L_G = 1 - \varepsilon$, $\sum_i x_i = 0$ over $\{\pm 1\}^n$
where $1 - \varepsilon$ is a guess for the optimum value
Laplacian: $L_G = \frac{1}{|E(G)|} \sum_{ij \in E(G)} \frac{1}{4} (x_i - x_j)^2$
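The Laplacian encoding can be checked by hand on a small graph. The following is a minimal sketch, assuming a hypothetical 4-cycle as the input graph; the function name `cut_fraction` is chosen for illustration and does not come from the slides. For $x \in \{\pm 1\}^n$, each term $(x_i - x_j)^2/4$ is 1 if the edge is cut and 0 otherwise, so $L_G(x)$ is the fraction of cut edges.

```python
def cut_fraction(edges, x):
    """L_G(x) = (1/|E|) * sum over edges ij of (x_i - x_j)^2 / 4, for x in {+1,-1}^n."""
    return sum((x[i] - x[j]) ** 2 / 4 for i, j in edges) / len(edges)

# Hypothetical example graph: a 4-cycle on vertices 0..3.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
alternating = [1, -1, 1, -1]   # cuts every edge
one_side = [1, 1, 1, 1]        # cuts no edge

print(cut_fraction(edges, alternating))  # 1.0
print(cut_fraction(edges, one_side))     # 0.0
```

With this encoding, "MAX CUT has value $1 - \varepsilon$" becomes the single polynomial equation $L_G(x) = 1 - \varepsilon$ over $\{\pm 1\}^n$.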
goal: develop SDP-based algorithms with provable guarantees in terms of complexity and approximation (“on the edge of intractability” ⇒ need strongest possible relaxations)
price of convexity: individual solutions → distributions over solutions
price of tractability: can only enforce “efficiently checkable knowledge” about solutions
individual solutions → distributions over solutions → “pseudo-distributions over solutions” (consistent with efficiently checkable knowledge)
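distribution $D$ over $\{\pm 1\}^n$
function $D \colon \{\pm 1\}^n \to \mathbb{R}$
non-negativity: $D(x) \ge 0$ for all $x \in \{\pm 1\}^n$
normalization: $\sum_{x \in \{\pm 1\}^n} D(x) = 1$
distribution $D$ satisfies $f_1 = 0, \ldots, f_m = 0$ for some $f_i \colon \{\pm 1\}^n \to \mathbb{R}$ if $\mathbb{E}_D (f_1^2 + \cdots + f_m^2) = 0$
(equivalently: $\mathbb{P}_D\{\exists i.\ f_i \ne 0\} = 0$)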
examples
uniform distribution: $D = 2^{-n}$
fixed 2-bit parity: $D(x) = (1 + x_1 x_2)/2^n$
the fixed 2-bit parity distribution satisfies $x_1 x_2 = 1$
the uniform distribution does not satisfy $f = 0$ for any $f \ne 0$
convex: $D, D'$ satisfy conditions ⇒ $(D + D')/2$ satisfies conditions
number of function values is exponential ⇒ need careful representation
number of independent inequalities is exponential ⇒ not efficiently checkable
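The two example distributions can be verified by brute force for small $n$. This is a minimal sketch, assuming $n = 3$ and using helper names (`expectation`, `satisfies`) introduced only for illustration; it enumerates all $2^n$ points, which is exactly the exponential representation the slide warns about.

```python
from itertools import product

n = 3
cube = list(product([1, -1], repeat=n))

def uniform(x):
    return 2.0 ** (-n)

def parity(x):
    # fixed 2-bit parity: D(x) = (1 + x_1 x_2) / 2^n
    return (1 + x[0] * x[1]) / 2.0 ** n

def expectation(D, f):
    # E_D f = sum over x of D(x) f(x)
    return sum(D(x) * f(x) for x in cube)

def satisfies(D, constraints, tol=1e-12):
    # D satisfies f_1 = 0, ..., f_m = 0  iff  E_D (f_1^2 + ... + f_m^2) = 0
    return expectation(D, lambda x: sum(f(x) ** 2 for f in constraints)) <= tol

f = lambda x: x[0] * x[1] - 1   # encodes the constraint x_1 x_2 = 1

print(satisfies(parity, [f]))   # True
print(satisfies(uniform, [f]))  # False
```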
deg.-$d$ pseudo-distribution $D \colon \{\pm 1\}^n \to \mathbb{R}$
notation: $\mathbb{E}_D f := \sum_x D(x) f(x)$, “pseudo-expectation of $f$ under $D$”
non-negativity: $\mathbb{E}_D f^2 = \sum_{x \in \{\pm 1\}^n} D(x) f(x)^2 \ge 0$ for every deg.-$d/2$ polynomial $f$
normalization: $\mathbb{E}_D 1 = 1$
pseudo-distribution $D$ satisfies $f_1 = 0, \ldots, f_m = 0$ for some $f_i \colon \{\pm 1\}^n \to \mathbb{R}$ if $\mathbb{E}_D (f_1^2 + \cdots + f_m^2) = 0$
(equivalently: $\mathbb{E}_D f_i \cdot g = 0$ whenever $\deg g \le d - \deg f_i$)
deg.-$2n$ pseudo-distributions are actual distributions (point indicators $\mathbf{1}_{\{x\}}$ have deg. $n$ ⇒ $D(x) = \mathbb{E}_D \mathbf{1}_{\{x\}}^2 \ge 0$)
claim: can compute such $D$ in time $n^{O(d)}$ if it exists (otherwise, certify that no solution to the original problem exists)
(can assume $D$ is a deg.-$d$ polynomial ⇒ separation problem $\min_f \mathbb{E}_D f^2$ is an $n^d$-dim. eigenvalue problem ⇒ $n^{O(d)}$ time via gradient descent / ellipsoid method)
[Shor, Parrilo, Lasserre]
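For degree 2 the separation problem is a small eigenvalue computation on the matrix of pseudo-moments. The following is a minimal sketch with hypothetical pseudo-moments over $\{\pm 1\}^3$ (all pairwise correlations equal to $-1/2$); the function name `violated_constraint` is an assumption for illustration. The example passes the degree-2 test although no actual distribution has these moments, since $(x_1 + x_2 + x_3)^2 \ge 1$ at every point of the cube.

```python
import numpy as np
from itertools import product

# Degree-2 pseudo-moments over {+1,-1}^3, indexed by the monomials 1, x_1, x_2, x_3.
# Hypothetical values: E x_i = 0, E x_i^2 = 1, E x_i x_j = -1/2 for i != j.
M = np.array([
    [1.0,  0.0,  0.0,  0.0],
    [0.0,  1.0, -0.5, -0.5],
    [0.0, -0.5,  1.0, -0.5],
    [0.0, -0.5, -0.5,  1.0],
])

def violated_constraint(M, tol=1e-9):
    """Return coefficients of a linear f with E f^2 < 0, or None if M is p.s.d."""
    eigenvalues, eigenvectors = np.linalg.eigh(M)  # eigenvalues in ascending order
    if eigenvalues[0] >= -tol:
        return None
    return eigenvectors[:, 0]

print(violated_constraint(M))  # None: M is a valid deg.-2 pseudo-distribution

# Pseudo-expectation of (x_1 + x_2 + x_3)^2 versus its true minimum over the cube.
c = np.array([0.0, 1.0, 1.0, 1.0])
pseudo_value = c @ M @ c
true_min = min(sum(x) ** 2 for x in product([1, -1], repeat=3))
print(pseudo_value, true_min)
```

The gap between `pseudo_value` and `true_min` shows that degree 2 does not yet capture this constraint; higher degree adds more non-negativity conditions.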
surprising property: $\mathbb{E}_D f \ge 0$ for many* low-degree polynomials $f$ such that $f \ge 0$ follows from $f_1 = 0, \ldots, f_m = 0$ by “explicit proof”
soon: examples of such properties and how to exploit them
emerging algorithm-design paradigm: analyze the algorithm pretending that an underlying actual distribution exists; verify only afterwards that low-deg. pseudo-distributions satisfy the required properties
pseudo-distr. over optimal solutions (in place of deg.-$d$ part of actual distr. over optimal solutions) → efficient algorithm → approximate solution (to original problem)
$n^{O(d)}$-time algorithms cannot* distinguish between deg.-$d$ pseudo-distributions and deg.-$d$ parts of actual distributions
dual view (sum-of-squares proof system)
either $\exists$ deg.-$d$ pseudo-distribution $D$ over $\{\pm 1\}^n$ satisfying $f_1 = 0, \ldots, f_m = 0$
or $\exists\, g_1, \ldots, g_m$ and $h_1, \ldots, h_k$ such that $\sum_i f_i \cdot g_i + \sum_j h_j^2 = -1$ over $\{\pm 1\}^n$, with $\deg f_i + \deg g_i \le d$ and $\deg h_j \le d/2$
(derivation of the unsatisfiable constraint $-1 \ge 0$ from $f_1 = 0, \ldots, f_m = 0$ over $\{\pm 1\}^n$)
$K_d = \{ f = \sum_i f_i \cdot g_i + \sum_j h_j^2 \}$
if $-1 \notin K_d$ then $\exists$ separating hyperplane $D$ with $\mathbb{E}_D (-1) = -1$ and $\mathbb{E}_D f \ge 0$ for all $f \in K_d$
pseudo-distribution satisfies all local properties of $\{\pm 1\}^n$
claim: suppose $f \ge 0$ is a $d/2$-junta over $\{\pm 1\}^n$ (depends on $\le d/2$ coordinates); then $\mathbb{E}_D f \ge 0$
proof: $\sqrt{f}$ has degree $\le d/2$ ⇒ $\mathbb{E}_D f = \mathbb{E}_D (\sqrt{f})^2 \ge 0$
corollary: for any set $S$ of $\le d$ coordinates, the marginal $D' = \{x_S\}_D$ is an actual distribution
$D'(x_S) = \sum_{x_{[n] \setminus S}} D(x_S, x_{[n] \setminus S}) = \mathbb{E}_D \mathbf{1}_{\{x_S\}} \ge 0$ (the indicator $\mathbf{1}_{\{x_S\}}$ is a $d$-junta)
(also captured by LP methods, e.g., Sherali–Adams hierarchies …)
example: triangle inequalities over $\{\pm 1\}^n$: $\mathbb{E}_D \left[ (x_i - x_j)^2 + (x_j - x_k)^2 - (x_i - x_k)^2 \right] \ge 0$
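conditioning pseudo-distributions
claim: $\forall i \in [n], \sigma \in \{\pm 1\}$. $D' = \{x \mid x_i = \sigma\}_D$ is a deg.-$(d - 2)$ pseudo-distr.
proof: $D'(x) = \frac{1}{\mathbb{P}_D\{x_i = \sigma\}} \cdot D(x) \cdot \mathbf{1}_{\{x_i = \sigma\}}(x)$
⇒ $\mathbb{E}_{D'} f^2 \propto \mathbb{E}_D \mathbf{1}_{\{x_i = \sigma\}} f^2 = \mathbb{E}_D (\mathbf{1}_{\{x_i = \sigma\}} f)^2 \ge 0$ (since $\deg f \le (d - 2)/2$ gives $\deg \mathbf{1}_{\{x_i = \sigma\}} f \le d/2$)
(also captured by LP methods, e.g., Sherali–Adams hierarchies …)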
pseudo-covariances are covariances of distributions over $\mathbb{R}^n$
claim: there exists a (Gaussian) distr. $\{g\}$ over $\mathbb{R}^n$ such that $\mathbb{E}_D x = \mathbb{E}\, g$ and $\mathbb{E}_D x x^T = \mathbb{E}\, g g^T$
proof: let $\mu = \mathbb{E}_D x$ and $M = \mathbb{E}_D (x - \mu)(x - \mu)^T$; choose $g$ to be Gaussian with mean $\mu$ and covariance $M$
the matrix $M$ is p.s.d. because $v^T M v = \mathbb{E}_D (v^T (x - \mu))^2 \ge 0$ for all $v \in \mathbb{R}^n$ (square of a linear form)
consequence: $\mathbb{E}_D q = \mathbb{E}\, q(g)$ for every $q$ of deg. 2
claim: for every univariate $p \ge 0$ over $\mathbb{R}$ and every $n$-variate polynomial $q$ with $\deg p \cdot \deg q \le d$, $\mathbb{E}_D\, p(q(x)) \ge 0$
proof: enough to show that $p$ is a sum of squares; by induction on $\deg p$
choose: minimizer $\alpha$ of $p$, so $p(\alpha) \ge 0$
then: $p = p(\alpha) + (x - \alpha)^2 \cdot p'$ for some polynomial $p' \ge 0$ with $\deg p' < \deg p$
$p(\alpha)$ and $(x - \alpha)^2$ are squares; $p'$ is a sum of squares by ind. hyp.
⇒ pseudo-distributions satisfy (compositions of) low-deg. univariate properties: a useful class of non-local higher-deg. inequalities
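MAX CUT
given: deg.-$d$ pseudo-distr. $D$ over $\{\pm 1\}^n$, satisfies $L_G = 1 - \varepsilon$, where $L_G = \frac{1}{|E(G)|} \sum_{ij \in E(G)} \frac{1}{4} (x_i - x_j)^2$
goal: find $y \in \{\pm 1\}^n$ with $L_G(y) \ge 1 - O(\sqrt{\varepsilon})$
algorithm: sample from Gaussian distr. $\{g\}$ over $\mathbb{R}^n$ with $\mathbb{E}\, g g^T = \mathbb{E}_D x x^T$; output $y = \mathrm{sgn}(g)$
analysis claim: $\mathbb{P}_D\{x_i \ne x_j\} = 1 - \eta$ ⇒ $\mathbb{P}\{y_i \ne y_j\} \ge 1 - O(\sqrt{\eta})$
proof: $\{g_i, g_j\}$ satisfies $-\mathbb{E}\, g_i g_j = -\mathbb{E}_D x_i x_j = 1 - O(\eta)$ and $\mathbb{E}\, g_i^2 = \mathbb{E}\, g_j^2 = 1$
⇒ (tedious calculation) ⇒ $\mathbb{P}\{\mathrm{sgn}\, g_i \ne \mathrm{sgn}\, g_j\} \ge 1 - O(\sqrt{\eta})$
[Goemans-Williamson]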
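The rounding step can be sketched in a few lines of NumPy. This is an illustrative sketch, not the slides' implementation: the input is assumed to be the matrix of second pseudo-moments $\mathbb{E}_D x x^T$, the triangle example with correlations $-1/2$ is hypothetical, and the names `gaussian_rounding` and `cut_fraction` are chosen here. The Gaussian is sampled through an eigendecomposition because the moment matrix may be singular.

```python
import numpy as np

def gaussian_rounding(second_moments, rng):
    """Sample g ~ N(0, M) with M = E_D x x^T and return y = sgn(g)."""
    eigenvalues, eigenvectors = np.linalg.eigh(second_moments)
    # Clip tiny negative eigenvalues caused by floating-point error.
    factor = eigenvectors * np.sqrt(np.clip(eigenvalues, 0.0, None))
    g = factor @ rng.standard_normal(len(second_moments))
    return np.where(g >= 0, 1, -1)

def cut_fraction(edges, y):
    return sum((y[i] - y[j]) ** 2 / 4 for i, j in edges) / len(edges)

# Hypothetical input: triangle graph, pseudo-moments E x_i x_j = -1/2 for i != j.
edges = [(0, 1), (1, 2), (0, 2)]
M = np.array([[1.0, -0.5, -0.5],
              [-0.5, 1.0, -0.5],
              [-0.5, -0.5, 1.0]])

rng = np.random.default_rng(0)
samples = [cut_fraction(edges, gaussian_rounding(M, rng)) for _ in range(100)]
print(min(samples), max(samples))
```

In this example the pseudo-distribution has value $L_G = 3/4$, while every rounded assignment cuts 2 of the 3 edges, the true optimum for a triangle: the Gaussian satisfies $g_1 + g_2 + g_3 = 0$, so the three signs cannot all agree.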
low global correlation in (pseudo-)distributions
claim: $\forall r$. $\exists$ deg.-$(d - 2r)$ pseudo-distribution $D'$, obtained by conditioning $D$, with $\mathrm{Avg}_{i,j \in [n]} I_{D'}(x_i, x_j) \le 1/r$
[Barak-Raghavendra-S., Raghavendra-Tan]
proof: potential $\mathrm{Avg}_{i \in [n]} H(x_i)$; greedily condition on variables to maximize the potential decrease until global correlation is low
mutual information: $I(x, y) = H(x) - H(x \mid y)$
potential decrease $\ge \mathrm{Avg}_{i \in [n]} H(x_i) - \mathrm{Avg}_{j \in [n]} \mathrm{Avg}_{i \in [n]} H(x_i \mid x_j) = \mathrm{Avg}_{i,j \in [n]} I_{D'}(x_i, x_j)$
how often do we need to condition? The potential is at most 1 and drops by more than $1/r$ per step while the global correlation exceeds $1/r$ ⇒ only need to condition $\le r$ times
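MAX BISECTION
given: deg.-$d$ pseudo-distr. $D$ over $\{\pm 1\}^n$, satisfies $L_G = 1 - \varepsilon$, $\sum_i x_i = 0$ (with $d = 1/\varepsilon^{O(1)}$)
goal: find $y \in \{\pm 1\}^n$ with $L_G(y) \ge 1 - O(\sqrt{\varepsilon})$ and $\sum_i y_i = 0$
algorithm: let $D'$ be a conditioning of $D$ with global correlation $\le \varepsilon^{O(1)}$; sample Gaussian $g$ with the same deg.-2 moments as $D'$; output $y$ with $y_i = \mathrm{sgn}(g_i - t_i)$ (choose $t_i \in \mathbb{R}$ so that $\mathbb{E}\, y_i = \mathbb{E}_{D'} x_i$)
analysis: almost as before: $\mathbb{P}_{D'}\{x_i \ne x_j\} \ge 1 - \eta$ ⇒ $\mathbb{P}\{y_i \ne y_j\} \ge 1 - O(\sqrt{\eta})$ ($t_i = 0$ is the worst case ⇒ same analysis as MAX CUT)
new: $I(x_i, x_j) \le \varepsilon^{O(1)}$ ⇒ $\mathbb{E}\, y_i y_j = \mathbb{E}_{D'} x_i x_j \pm \varepsilon^{O(1)}$
⇒ $\mathbb{E} \left| \sum_i y_i \right| \le \left( \mathbb{E} (\sum_i y_i)^2 \right)^{1/2} = \left( \mathbb{E}_{D'} (\sum_i x_i)^2 \right)^{1/2} + \varepsilon^{O(1)} \cdot n = \varepsilon^{O(1)} \cdot n$, using $\mathbb{E}_{D'} (\sum_i x_i)^2 = 0$
⇒ get bisection $y'$ from $y$ by correcting an $\varepsilon^{O(1)}$ fraction of vertices
[Raghavendra-Tan]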
SUM-OF-SQUARES method and approximation algorithms II
David Steurer, Cornell
Cargese Workshop, 2014
sparse vector
given: linear subspace $V \subseteq \mathbb{R}^n$ (represented by some basis), parameter $k \in \mathbb{N}$
promise: $\exists v_0 \in V$ such that $v_0$ is $k$-sparse (and $v_0 \in \{0, \pm 1\}^n$)
goal: find $k$-sparse vector $v \in V$
an efficient approximation algorithm for $k = \Omega(n)$ would be a major step toward refuting Khot's Unique Games Conjecture and toward improved guarantees for MAX CUT, VERTEX COVER, …
planted / average-case version (benchmark for unsupervised learning tasks): subspace $V$ spanned by $d - 1$ random vectors and some $k$-sparse vector $v_0$
previous best algorithms only work for very sparse vectors: $k/n \le 1/\sqrt{d}$ [Spielman-Wang-Wright, Demanet-Hand]
here: deg.-4 pseudo-distributions work for $k = \Omega(n)$ up to $d \le O(\sqrt{n})$ [Barak-Kelner-S.]
analytical proxy for sparsity
if vector $v$ is $k$-sparse then $\frac{\|v\|_\infty}{\|v\|_1} \ge \frac{1}{k}$, $\frac{\|v\|_2^2}{\|v\|_1^2} \ge \frac{1}{k}$, and $\frac{\|v\|_4^4}{\|v\|_2^4} \ge \frac{1}{k}$ (tight if $v \in \{0, \pm 1\}^n$)
limitations of $\ell_\infty/\ell_1$ (previous best algorithm; exact via linear programming)
$v$ = sum of $d$ random $\pm 1$ vectors with the same first coordinate ⇒ $\|v\|_\infty \ge d$, $\|v\|_1 \le d + n\sqrt{d}$ ⇒ ratio $\approx \sqrt{d}/n$
⇒ $\ell_\infty/\ell_1$ algorithm fails for $k/n \ge 1/\sqrt{d}$
limitations of std. SDP relaxation for $\ell_2/\ell_1$ (best proxy for sparsity)
“ideal object”: distribution $D$ over the $\ell_2$ unit sphere of subspace $V$
$\ell_1$-constraint: $\mathbb{E}_D \|v\|_1^2 \le k$; not a low-deg. polynomial in $v$ ⇒ unclear how to represent (also NP-hard in the worst case)
tractable relaxation: $\sum_{i,j} |\mathbb{E}_D v_i v_j| \le k$ [d'Aspremont-El Ghaoui-Jordan-Lanckriet]
but: for the uniform distr. $D$ over the $\ell_2$ sphere of a $d$-dim. random subspace, $\sum_{i,j} |\mathbb{E}_D v_i v_j| \approx n/\sqrt{d}$ ⇒ same limitation as $\ell_\infty/\ell_1$
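The three sparsity proxies and the dense counterexample for $\ell_\infty/\ell_1$ can be checked numerically. This is a minimal sketch with assumed sizes ($n = 2000$, $k = 50$, $d = 16$) and a helper name, `sparsity_proxies`, that does not come from the slides.

```python
import numpy as np

def sparsity_proxies(v):
    """Return the three ratios that are >= 1/k for every k-sparse vector v."""
    l1 = np.abs(v).sum()
    l2_sq = (v ** 2).sum()
    return (np.abs(v).max() / l1, l2_sq / l1 ** 2, (v ** 4).sum() / l2_sq ** 2)

n, k, d = 2000, 50, 16   # assumed sizes for illustration
rng = np.random.default_rng(0)

# k-sparse vector with entries in {0, +1, -1}: all three ratios equal 1/k.
v0 = np.zeros(n)
v0[:k] = rng.choice([-1.0, 1.0], size=k)
print(sparsity_proxies(v0), 1 / k)

# Dense vector: sum of d random +-1 vectors that share the first coordinate.
signs = rng.choice([-1.0, 1.0], size=(d, n))
signs[:, 0] = 1.0
dense = signs.sum(axis=0)
linf_l1 = sparsity_proxies(dense)[0]
print(linf_l1, np.sqrt(d) / n)   # ratio is of order sqrt(d)/n
```

The dense vector reaches an $\ell_\infty/\ell_1$ ratio of order $\sqrt{d}/n$ without being sparse, so this proxy cannot separate it from a $k$-sparse vector once $1/k$ drops below that level.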
degree-$d$ SOS relaxation for $\ell_4/\ell_2$
deg.-$d$ pseudo-distr. $D \colon \{v \in V;\ \|v\|_2 = 1\} \to \mathbb{R}$ over the unit $\ell_2$-sphere of $V$
pseudo-distribution satisfies $\|v\|_4^4 = 1/k$
notation: $\mathbb{E}_D f := \int_{\{v \in V;\ \|v\| = 1\}} D \cdot f$ (only consider polynomials ⇒ easy to integrate)
normalization: $\mathbb{E}_D 1 = 1$
non-negativity: $\mathbb{E}_D h(v)^2 \ge 0$ for every $h$ of deg. $\le d/2$
orthogonality: $\mathbb{E}_D \left( \|v\|_4^4 - \frac{1}{k} \right) \cdot g(v) = 0$ for every $g$ of deg. $\le d - 4$
how to find pseudo-distributions?
set of deg.-$d$ pseudo-distributions = convex set with $n^{O(d)}$-time separation oracle
separation problem
given: function $D$ (represented as deg.-$d$ polynomial)
check: quadratic form $f \mapsto \mathbb{E}_D f^2$ is p.s.d., or output a violated constraint $\mathbb{E}_D f^2 < 0$
how to use pseudo-distributions?
rule of thumb: the set of deg.-$d$ pseudo-moments $\{\mathbb{E}_D f \mid \deg f \le d\}$ is difficult* to distinguish / separate from deg.-$d$ moments of an actual distr. of solutions
(* unless you invest $n^{\Omega(d)}$ time to distinguish)
also: the values $\{\mathbb{E}_D f \mid \deg f > d\}$ do not carry additional information ⇒ no need to look at them
dual view (SOS certificates)
$\left( \|v\|_4^4 - \frac{1}{k} \right) \cdot g + \sum_i h_i^2 = -1$ over $\{v \in V;\ \|v\|_2 = 1\}$ for some $g$ of deg. $\le d - 4$ and $\{h_i\}$ of deg. $\le d/2$
⇔ no deg.-$d$ pseudo-distr. exists (⇒ no solution exists)
for approximation algorithms: need a pseudo-distr. to extract an approx. solution (hard to exploit non-existence of an SOS certificate directly)
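general properties of pseudo-distributions
let $D = \{u, v\}$ be a deg.-4 pseudo-distribution over $\mathbb{R}^n \times \mathbb{R}^n$; the following inequalities hold as expected (same as for distributions)
Cauchy–Schwarz inequality: $\mathbb{E}_D \langle u, v \rangle \le \left( \mathbb{E}_D \|u\|^2 \right)^{1/2} \left( \mathbb{E}_D \|v\|^2 \right)^{1/2}$
Hölder's inequality: $\mathbb{E}_D \sum_i u_i^3 \cdot v_i \le \left( \mathbb{E}_D \|u\|_4^4 \right)^{3/4} \left( \mathbb{E}_D \|v\|_4^4 \right)^{1/4}$
$\ell_4$-triangle inequality: $\left( \mathbb{E}_D \|u + v\|_4^4 \right)^{1/4} \le \left( \mathbb{E}_D \|u\|_4^4 \right)^{1/4} + \left( \mathbb{E}_D \|v\|_4^4 \right)^{1/4}$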
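claim: let $V' \subseteq \mathbb{R}^n$ be a random $d$-dim. subspace with $d \ll \sqrt{n}$ and let $P'$ be the orthogonal projector into $V'$; then w.h.p., $\|P'v\|_4^4 = O\!\left(\frac{1}{n}\right) \|v\|_2^4 - \sum_i h_i(v)^2$ over $v \in \mathbb{R}^n$ for $h_i$'s with $h_i^2$ of deg. 4
[Barak-Brandao-Harrow-Kelner-S.-Zhou]
proof sketch (SOS certificate for the classical inequality $\|P'v\|_4^4 \le O\!\left(\frac{1}{n}\right) \|v\|_2^4$)
basis change: let $x = B^T v$ where $B$'s columns are an orthonormal basis of $V'$ (so that $P' = B B^T$)
⇒ $\|P'v\|_4^4 = \frac{1}{n^2} \sum_i \langle a_i, x \rangle^4$ with $a_1, \ldots, a_n$ close to i.i.d. standard Gaussian vectors (so that $\mathbb{E}_a \langle a, x \rangle^2 = \|x\|_2^2$ and $\mathbb{E}_a \langle a, x \rangle^4 = 3 \cdot \|x\|_2^4$)
enough to show: $\frac{1}{n} \sum_{i=1}^n \langle a_i, x \rangle^4 = O(1) \cdot \mathbb{E}_a \langle a, x \rangle^4 - \sum_j h'_j(x)^2$
reduce to deg. 2: $\frac{1}{n} \sum_{i=1}^n \langle a_i^{\otimes 2}, y \rangle^2 \le O(1) \cdot \mathbb{E}_a \langle a^{\otimes 2}, y \rangle^2$ (with $y = x^{\otimes 2}$)
⇒ use concentration inequalities for quadratic forms (aka matrices)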
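The classical inequality behind the certificate can be probed numerically. This is a sketch under assumed sizes ($n = 4000$, $d = 10$, $k = 100$); it only samples random directions inside $V'$, so it illustrates the $3/n$ scaling but neither bounds the maximum over all of $V'$ nor produces an SOS certificate. The helper name `l4_ratio` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4000, 10   # assumed sizes with d << sqrt(n)

# Columns of B form an orthonormal basis of a random d-dim. subspace V', so P' = B B^T.
B, _ = np.linalg.qr(rng.standard_normal((n, d)))

def l4_ratio(v):
    """||P'v||_4^4 / ||v||_2^4 for the projector P' = B B^T."""
    projected = B @ (B.T @ v)
    return (projected ** 4).sum() / (v ** 2).sum() ** 2

# Random vectors inside V' (where P'v = v) give ratios close to 3/n.
ratios = [l4_ratio(B @ rng.standard_normal(d)) for _ in range(200)]
print(n * min(ratios), n * max(ratios))

# For comparison: a k-sparse unit vector with equal entries has ||v0||_4^4 = 1/k.
k = 100
v0 = np.zeros(n)
v0[:k] = 1 / np.sqrt(k)
print((v0 ** 4).sum(), 3 / n)
```

The gap between $1/k$ for the sparse vector and roughly $3/n$ for vectors of the random subspace is what the $\ell_4/\ell_2$ relaxation exploits.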
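approximation algorithm for planted sparse vector
given: some basis of subspace $V = \mathrm{span}(V' \cup \{v_0\}) \subseteq \mathbb{R}^n$, where $V' \subseteq \mathbb{R}^n$ is a random $d$-dim. subspace, and $v_0 \in \mathbb{R}^n$ with $v_0 \perp V'$, $\|v_0\|_4^4 = \frac{1}{k}$, and $\|v_0\|_2^4 = 1$ (e.g., $k$-sparse)
goal: find unit vector $w$ with $\langle w, v_0 \rangle^2 \ge 1 - O(k/n)^{1/4}$
algorithm: compute deg.-4 pseudo-distr. $D = \{v\}$ over the unit ball of $V$ satisfying $\|v\|_4^4 = \frac{1}{k}$; sample Gaussian distr. $\{w\}$ with $\mathbb{E}\, w w^T = \mathbb{E}_D v v^T$ and renormalize
analysis claim: $\mathbb{E}_D \langle v, v_0 \rangle^2 \ge 1 - O(k/n)^{1/4}$ (⇒ Gaussian $w$ almost 1-dim.)
$\frac{1}{k^{1/4}} = \left( \mathbb{E}_D \|v\|_4^4 \right)^{1/4}$ ($D$ satisfies $\|v\|_4^4 = \frac{1}{k}$)
$= \left( \mathbb{E}_D \| \langle v, v_0 \rangle v_0 + P'v \|_4^4 \right)^{1/4}$ (same function)
$\le \left( \mathbb{E}_D \| \langle v, v_0 \rangle v_0 \|_4^4 \right)^{1/4} + \left( \mathbb{E}_D \|P'v\|_4^4 \right)^{1/4}$ ($\ell_4$-triangle inequ.)
$\le \left( \frac{1}{k} \right)^{1/4} \cdot \left( \mathbb{E}_D \langle v, v_0 \rangle^4 \right)^{1/4} + O\!\left( \frac{1}{n^{1/4}} \right)$ (SOS cert. for $V'$)
⇒ $\mathbb{E}_D \langle v, v_0 \rangle^4 \ge 1 - O(k/n)^{1/4}$ ⇒ $\mathbb{E}_D \langle v, v_0 \rangle^2 \ge 1 - O(k/n)^{1/4}$
(because $\langle v, v_0 \rangle^4 = \left( 1 - \|P'v\|_2^2 \right) \langle v, v_0 \rangle^2$)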
products of pseudo-distributions
claim: suppose $D, D' \colon \Omega \to \mathbb{R}$ are deg.-$d$ pseudo-distributions over $\Omega$; then $D \otimes D' \colon \Omega \times \Omega \to \mathbb{R}$ is a deg.-$d$ pseudo-distr. over $\Omega \times \Omega$
proof: tensor products of positive semidefinite matrices are positive semidefinite
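Cauchy–Schwarz inequality: $\mathbb{E}_D \langle u, v \rangle \le \left( \mathbb{E}_D \|u\|_2^2 \right)^{1/2} \left( \mathbb{E}_D \|v\|_2^2 \right)^{1/2}$
let $D = \{u, v\}$ be a deg.-2 pseudo-distribution over $\mathbb{R}^n \times \mathbb{R}^n$
proof:
$\left( \mathbb{E}_D \langle u, v \rangle \right)^2 = \mathbb{E}_{D \otimes D} \langle u, v \rangle \langle u', v' \rangle$ ($D \otimes D$ is a product pseudo-distr.)
$= \mathbb{E}_{D \otimes D} \sum_{ij} u_i v_i u'_j v'_j$
$\le \frac{1}{2} \mathbb{E}_{D \otimes D} \left[ \sum_{ij} u_i^2 (v'_j)^2 + \sum_{ij} (u'_j)^2 v_i^2 \right]$ (using $2ab = a^2 + b^2 - (a - b)^2$)
$= \frac{1}{2} \mathbb{E}_{D \otimes D} \left[ \|u\|_2^2 \|v'\|_2^2 + \|u'\|_2^2 \|v\|_2^2 \right]$
$= \mathbb{E}_D \|u\|_2^2 \cdot \mathbb{E}_D \|v\|_2^2$ ($D \otimes D$ is a product pseudo-distr.)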
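Hölder's inequality: $\mathbb{E}_D \sum_i u_i^3 \cdot v_i \le \left( \mathbb{E}_D \|u\|_4^4 \right)^{3/4} \left( \mathbb{E}_D \|v\|_4^4 \right)^{1/4}$
let $D = \{u, v\}$ be a deg.-4 pseudo-distribution over $\mathbb{R}^n \times \mathbb{R}^n$
proof:
$\mathbb{E}_D \sum_i u_i^3 \cdot v_i \le \left( \mathbb{E}_D \sum_i u_i^4 \right)^{1/2} \cdot \left( \mathbb{E}_D \sum_i u_i^2 \cdot v_i^2 \right)^{1/2}$ (Cauchy–Schwarz)
$\le \left( \mathbb{E}_D \sum_i u_i^4 \right)^{1/2} \cdot \left( \mathbb{E}_D \sum_i u_i^4 \cdot \mathbb{E}_D \sum_i v_i^4 \right)^{1/4}$ (Cauchy–Schwarz)
we also used: $\{u, v\}$ deg.-4 pseudo-distr. ⇒ $\{u \otimes u, u \otimes v\}$ deg.-2 pseudo-distr. (every deg.-2 poly. in $(u \otimes u, u \otimes v)$ is a deg.-4 poly. in $(u, v)$)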
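$\ell_4$-triangle inequality: $\left( \mathbb{E}_D \|u + v\|_4^4 \right)^{1/4} \le \left( \mathbb{E}_D \|u\|_4^4 \right)^{1/4} + \left( \mathbb{E}_D \|v\|_4^4 \right)^{1/4}$
let $D = \{u, v\}$ be a deg.-4 pseudo-distribution over $\mathbb{R}^n \times \mathbb{R}^n$
proof: expand $\|u + v\|_4^4$ in terms of $\sum_i u_i^4$, $\sum_i u_i^3 v_i$, $\sum_i u_i^2 v_i^2$, $\sum_i u_i v_i^3$, $\sum_i v_i^4$
bound the pseudo-expectations of the “mixed terms” using Cauchy–Schwarz / Hölder
check that the total is equal to the fourth power of the right-hand side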
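The expansion can be written out as follows. This is a worked version of the three proof steps, with the abbreviations $a$ and $b$ introduced here for readability; it uses only the Cauchy–Schwarz and Hölder inequalities for pseudo-distributions stated earlier.

```latex
\begin{align*}
&\text{Let } a = \bigl(\mathbb{E}_D \|u\|_4^4\bigr)^{1/4}, \qquad
            b = \bigl(\mathbb{E}_D \|v\|_4^4\bigr)^{1/4}. \\
\mathbb{E}_D \|u+v\|_4^4
  &= \mathbb{E}_D \sum_i u_i^4
   + 4\,\mathbb{E}_D \sum_i u_i^3 v_i
   + 6\,\mathbb{E}_D \sum_i u_i^2 v_i^2
   + 4\,\mathbb{E}_D \sum_i u_i v_i^3
   + \mathbb{E}_D \sum_i v_i^4 \\
  &\le a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4
   && \text{(H\"older: } a^3 b,\ ab^3; \text{ Cauchy--Schwarz: } a^2 b^2\text{)} \\
  &= (a+b)^4 .
\end{align*}
```

Taking fourth roots gives the stated inequality; the left-hand side is non-negative because $\|u + v\|_4^4$ is a sum of squares of deg.-2 polynomials.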
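tensor decomposition
given: tensor $T \approx \sum_i a_i^{\otimes 4}$ (in spectral norm) for nice $a_1, \ldots, a_m \in \mathbb{R}^n$
goal: find set of vectors $B \approx \{\pm a_1, \ldots, \pm a_m\}$
for simplicity: orthonormal and $m = n$
approach: show “uniqueness”: $\sum_i a_i^{\otimes 4} \approx \sum_i b_i^{\otimes 4}$ ⇒ $\{\pm a_1, \ldots, \pm a_n\} \approx \{\pm b_1, \ldots, \pm b_n\}$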