SLIDE 1
Lecture 14: Planted Sparse Vector
SLIDE 2 Lecture Outline
- Part I: Planted Sparse Vector and 2 to 4 Norm
- Part II: SOS and 2 to 4 Norm on Random Subspaces
- Part III: Warmup: Showing ||y|| ≈ 1
- Part IV: 4-Norm Analysis
- Part V: SOS-Symmetry to the Rescue
- Part VI: Observations and Loose Ends
- Part VII: Open Problems
SLIDE 3
Part I: Planted Sparse Vector and 2 to 4 Norm
SLIDE 4
- Planted Sparse Vector problem: Given the span of d − 1 random vectors in ℝ^n and one unit vector w ∈ ℝ^n of sparsity k, can we recover w?
- More precisely, let V be an n × d matrix where:
  1. d − 1 columns of V are vectors of length ≈ 1 chosen randomly from ℝ^n.
  2. One column of V is a unit vector w with ≤ k nonzero entries.
- Given VR where R is an arbitrary invertible d × d matrix, can we recover w?
Planted Sparse Vector
SLIDE 5
- Theorem 1.4 [BKS14]: There is a constant c > 0 and an algorithm based on constant degree SOS such that for every vector w_0 supported on at most cn · min{1, n/d²} coordinates, if w_1, …, w_d are chosen independently at random from the Gaussian distribution on ℝ^n, then given any basis for V = span{w_0, …, w_d}, the algorithm outputs an ε-approximation to w_0 in poly(n, log(1/ε)) time.
Theorem Statement
SLIDE 6
- Random Distribution: We choose each entry of V independently from N(0, 1/n), the normal distribution with mean 0 and standard deviation 1/√n.
- We then choose R to be a random d × d orthogonal/rotation matrix and take VR to be our input matrix.
Random Distribution
SLIDE 7
- Remark: If R is any d × d orthogonal/rotation matrix then VR can also be chosen by taking each entry of VR independently from N(0, 1/n).
- Idea: Each row of V comes from a multivariate normal distribution with covariance matrix (1/n)·Id_d, which is invariant under rotations.
Random Distribution
SLIDE 8
- Planted Distribution: We choose each entry of the first d − 1 columns of V independently from N(0, 1/n). The last column of V is our sparse unit vector w.
- We then choose R to be a random d × d orthogonal/rotation matrix and take VR to be our input matrix.
Planted Distribution
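The two input distributions above can be sketched in code. This is my own illustrative helper (not from the lecture); the function name and parameters are made up.

```python
# My own illustrative sketch: sampling the random distribution (all
# columns Gaussian) and the planted distribution (last column a
# k-sparse unit vector), then applying a random rotation R.
import numpy as np

def sample_input(n, d, k=None, seed=0):
    """Return VR: columns N(0, 1/n); if k is given, plant a k-sparse unit vector."""
    rng = np.random.default_rng(seed)
    V = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, d))
    if k is not None:
        w = np.zeros(n)
        support = rng.choice(n, size=k, replace=False)
        w[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
        V[:, -1] = w  # planted sparse unit vector
    # random d x d rotation R via QR of a Gaussian matrix
    R, _ = np.linalg.qr(rng.normal(size=(d, d)))
    return V @ R
```

The rotation R hides which column was planted; only the column span of the input is meaningful.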
SLIDE 9
- We ask for a y such that:
  1. ||VRy|| = 1
  2. VRy is k-sparse (i.e. at most k indices of VRy are nonzero).
- Hard to search for a y such that VRy is k-sparse, so we'll need to relax the problem.
Output
SLIDE 10
- Key idea: All unit vectors have the same 2-norm. However, sparse vectors will have a higher 4-norm.
- The 4-norm of a k-sparse unit vector in ℝ^n is at least (k · (1/k²))^{1/4} = 1/k^{1/4} (the minimum is attained by setting k coordinates to ±1/√k and the rest to 0).
- Relaxation Attempt #1: Search for a y such that:
  1. ||VRy|| = 1
  2. ||VRy||_4 ≥ 1/k^{1/4}
Distinguishing Sparse Vectors
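A quick numeric sanity check (my own addition, not from the slides) of the 4-norm bound for k-sparse unit vectors:

```python
# Sanity check: a k-sparse unit vector has ||x||_4 >= 1/k^(1/4),
# with equality for the flat vector with entries ±1/sqrt(k).
import numpy as np

n, k = 10_000, 25
x = np.zeros(n)
x[:k] = 1.0 / np.sqrt(k)                  # flat k-sparse unit vector
assert np.isclose(np.linalg.norm(x), 1.0)
four_norm = np.sum(x ** 4) ** 0.25
assert np.isclose(four_norm, k ** -0.25)  # attains the bound exactly

# a random k-sparse unit vector only has a larger 4-norm
rng = np.random.default_rng(0)
y = np.zeros(n)
y[:k] = rng.normal(size=k)
y /= np.linalg.norm(y)
assert np.sum(y ** 4) ** 0.25 >= k ** -0.25 - 1e-12
```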
SLIDE 11
- This is the 2 to 4 Norm Problem: Given a matrix B, find the vector y which maximizes ||By||_4 / ||By||_2.
2 to 4 Norm Problem
SLIDE 12
Part II: SOS and 2 to 4 Norm on Random Subspaces
SLIDE 13
- Unfortunately, the 2 to 4 norm problem is hard [BBH+12]:
  – NP-hard to obtain an approximation ratio of 1 + 1/polylog(n).
  – Assuming ETH (the exponential time hypothesis), it is hard to approximate to within a constant factor.
- Thus, we'll need to relax our problem further.
2 to 4 Norm Hardness
SLIDE 14
- SOS Relaxation: Search for a pseudo-expectation Ẽ which respects the following constraints:
  1. ||VRy||_2^2 = Σ_{i=1}^{n} (VRy)_i^2 = 1
  2. ||VRy||_4^4 = Σ_{i=1}^{n} (VRy)_i^4 ≥ 1/k
SOS Relaxation
SLIDE 15
1. ||VRy||_2^2 = Σ_{i=1}^{n} (VRy)_i^2 = 1
2. ||VRy||_4^4 = Σ_{i=1}^{n} (VRy)_i^4 ≥ 1/k
- To show that SOS distinguishes between the random and planted distributions, it is sufficient to show that in the random case there is no Ẽ which respects these constraints and has a PSD moment matrix M.
- Remark: Although the 2 to 4 Norm problem is hard in general, we just need to show that SOS can approximate it on random subspaces.
Showing a Distinguishing Algorithm
SLIDE 16
- Given a random subspace, what is the expected value of the largest 4-norm of a unit vector in the subspace?
- Trivial strategy: Any unit vector's 4-norm is at least 1/n^{1/4}.
2 to 4 Norm on Random Subspaces
SLIDE 17
- Another strategy: Take a basis for this space and take a linear combination which maximizes one coordinate (subject to having length 1).
- If we add together d random vectors with entries ≈ ±1/√n, w.h.p. the result will have norm Θ̃(√d). Dividing the resulting vector by Θ̃(√d), the maximized entry will have magnitude Θ̃(√(d/n)), while the other entries will have magnitude Õ(1/√n).
2 to 4 Norm on Random Subspaces
SLIDE 18
- Calling our final result x, w.h.p. the maximized entry of x contributes Θ̃(d²/n²) to ||x||_4^4 while the other entries contribute Θ̃(1/n).
- It turns out that this strategy is essentially optimal. Thus, with high probability the maximum 4-norm of a unit vector in a d-dimensional random subspace will be Θ̃(max{√(d/n), 1/n^{1/4}}).
2 to 4 Norm on Random Subspaces
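The coordinate-maximizing strategy above can be simulated. This sketch (my own, with illustrative sizes) projects e_1 onto a random d-dimensional subspace, which gives the unit vector in the subspace maximizing the first coordinate, and checks the predicted magnitudes:

```python
# The unit vector in a random d-dimensional subspace of R^n maximizing
# coordinate 1 is the normalized projection of e_1 onto the subspace;
# its top entry should have magnitude ~ sqrt(d/n).
import numpy as np

rng = np.random.default_rng(0)
n, d = 4000, 200
Q, _ = np.linalg.qr(rng.normal(size=(n, d)))  # orthonormal basis of the subspace
coeffs = Q[0, :]              # coordinates of (projection of e_1) in this basis
x = Q @ coeffs                # the projection P e_1
x /= np.linalg.norm(x)
top = abs(x[0])
# top entry concentrates around sqrt(d/n)
assert 0.5 * np.sqrt(d / n) < top < 2.0 * np.sqrt(d / n)
four4 = np.sum(x ** 4)
# the maximized entry alone contributes about d^2/n^2 to ||x||_4^4
assert four4 >= 0.5 * d ** 2 / n ** 2
```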
SLIDE 19
- Planted dist: max 4-norm ≥ 1/k^{1/4}
- Random dist: max 4-norm is Θ̃(max{√(d/n), 1/n^{1/4}})
- If SOS can certify the upper bound for a random subspace, this gives a distinguishing algorithm when max{√(d/n), 1/n^{1/4}} ≪ 1/k^{1/4} (which happens when d ≤ √n and k ≪ n, or when d ≥ √n and k ≪ n²/d²).
Algorithm Boundary
SLIDE 20
Part III: Warmup: Showing ||y|| ≈ 1
SLIDE 21
- Take x = VRy.
- We expect that ||x|| ≈ ||y||. Since we require that ||x|| = 1, this implies that we will have ||y|| ≈ 1.
- To see that ||x|| ≈ ||y||, observe that ||x||_2^2 = y^T R^T V^T V R y. Thus, it is sufficient to show that R^T V^T V R ≈ Id_d.
Showing ||y|| ≈ 1
SLIDE 22
- We have that R^T V^T V R ≈ Id_d because the columns of VR are d random unit vectors (where d ≪ n) and are thus approximately orthonormal.
- However, we will use graph matrices to analyze the 4-norm, so as a warm-up, let's check that R^T V^T V R ≈ Id_d using graph matrices.
Checking R^T V^T V R ≈ Id_d
SLIDE 23
- So far we have worked over {−1, +1}^n.
- How can we use graph matrices over N(0,1)^n?
- Key idea: Look at the Fourier characters over N(0,1).
Graph Matrices Over π(0,1)
SLIDE 24
- Inner product on N(0,1): f · g = E_{y∼N(0,1)}[f(y)g(y)]
- Fourier characters: Hermite polynomials
- The first few Hermite polynomials (up to normalization) are as follows:
  1. h_0 = 1
  2. h_1 = y
  3. h_2 = y² − 1
  4. h_3 = y³ − 3y
- To normalize, divide h_j by √(j!)
Fourier Analysis Over π(0,1)
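The orthogonality of these Hermite polynomials under N(0,1) can be checked numerically; this sketch (my own) uses the probabilists' Gauss–Hermite quadrature from numpy:

```python
# Check that h_0..h_3 are orthogonal under N(0,1) with E[h_j^2] = j!,
# so h_j / sqrt(j!) is an orthonormal family.
import math
import numpy as np

xs, ws = np.polynomial.hermite_e.hermegauss(20)  # weight exp(-y^2/2)
ws = ws / np.sqrt(2 * np.pi)                     # renormalize to the N(0,1) density

hs = [lambda y: np.ones_like(y),
      lambda y: y,
      lambda y: y ** 2 - 1,
      lambda y: y ** 3 - 3 * y]

for i in range(4):
    for j in range(4):
        inner = np.sum(ws * hs[i](xs) * hs[j](xs))  # E[h_i h_j]
        expected = math.factorial(i) if i == j else 0.0
        assert np.isclose(inner, expected, atol=1e-8)
```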
SLIDE 25
- Graph matrices over {−1,1}^n: 1 and y are a basis for functions over {−1,1}. We represent y by an edge and 1 by the absence of an edge.
- Graph matrices over N(0,1)^n: {h_j} are a basis for functions over N(0,1). We represent h_j by a multi-edge with multiplicity j.
Graph Matrices Over π(0,1)
SLIDE 26
- For convenience, take B = √n · VR and think of the entries of B as the input. Now each entry of B is chosen independently from N(0,1).
- B_ij is represented by an edge from node i to node j.
- In class challenge: What is R^T V^T V R in terms of graph matrices?
Graph Matrices for R^T V^T V R
[Figure: graph matrix diagram for R^T V^T V R = (1/n) B^T B]
SLIDE 27
- In class challenge answer:
Graph Matrices for R^T V^T V R
[Figure: R^T V^T V R written as an identity term plus graph matrix terms with coefficient 1/n]
SLIDE 28
- Here we have two different types of vertices, one for the rows of B (which have n possibilities) and one for the columns of B (which have d possibilities).
- Can generalize the rough norm bounds to handle multiple types of vertices (writing this up is on my to-do list).
Generalizing Rough Norm Bounds
SLIDE 29
- Generalized rough norm bounds:
- Each isolated vertex outside of U and V contributes a factor equal to the number of possibilities for that vertex.
- Each vertex in the minimum vertex separator (which minimizes the total number of possibilities for its vertices) contributes nothing.
- Each other vertex contributes a factor equal to the square root of the number of possibilities for that vertex.
Generalizing Rough Norm Bounds
SLIDE 30
Norm Bounds for R^T V^T V R
[Figure: the decomposition of R^T V^T V R with norm bounds Õ(√(d/n)) and Õ(1/√n) on the non-identity graph matrix terms, giving R^T V^T V R ≈ Id_d]
SLIDE 31
Part IV: 4-Norm Analysis
SLIDE 32
- Goal: Analyze ||VRy||_4^4 = (1/n²) ||By||_4^4.
- Take C to be the n × d² matrix with entries C_{i,(j1,j2)} = B_{i j1} B_{i j2}.
- Then ||VRy||_4^4 = (1/n²) (y ⊗ y)^T C^T C (y ⊗ y).
- Can try to bound ||C^T C||.
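The identity ||By||_4^4 = (y ⊗ y)^T C^T C (y ⊗ y) behind this step can be verified directly on a small instance (my own sketch):

```python
# Verify ||By||_4^4 = (y ⊗ y)^T C^T C (y ⊗ y) for the matrix C with
# entries C[i, (j1, j2)] = B[i, j1] * B[i, j2]: row i of C applied to
# y ⊗ y gives (B_i · y)^2, and summing the squares gives the 4-norm.
import numpy as np

rng = np.random.default_rng(0)
n, d = 30, 6
B = rng.normal(size=(n, d))
y = rng.normal(size=d)

C = np.stack([np.outer(B[i], B[i]).ravel() for i in range(n)])  # n x d^2
z = np.outer(y, y).ravel()        # y tensor y, length d^2
lhs = np.sum((B @ y) ** 4)        # ||By||_4^4
rhs = z @ (C.T @ C) @ z
assert np.isclose(lhs, rhs)
```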
SLIDE 33
- Picture for C^T C:
Picture for C^T C
[Figure: graph matrix diagrams for the terms of C^T C, split by which column indices coincide]
SLIDE 34
- If d ≤ √n, the target norm bound on C^T C is Õ(n), giving a bound of Õ(1/n) on ||VRy||_4^4.
- If d ≥ √n, the target norm bound on C^T C is Õ(d²), giving a bound of Õ(d²/n²) on ||VRy||_4^4.
Targets
SLIDE 35
Casework
[Figure: the term of C^T C with all four column indices j1, j2, j3, j4 distinct]
Norm Θ̃(d√n) if d ≤ √n, norm Θ̃(d²) if d ≥ √n
SLIDE 36
Casework
[Figure: the term of C^T C with one shared column index (rows (j1, j2), columns (j1, j4))]
Norm Θ̃(√(nd)). Note: 0 or 2 edges between i and j1
SLIDE 37
Casework
[Figure: the diagonal term of C^T C (rows and columns both (j1, j2))]
= n·Id + [terms of norm Θ̃(√n)]. Note: 0 or 2 edges between i and j1, and 0 or 2 edges between i and j2
SLIDE 38
Casework
[Figure: the term of C^T C with rows (j1, j1) and columns (j3, j4)]
Norm Θ̃(√(nd³)). Too large! Note: 0 or 2 edges between i and j1
SLIDE 39
Casework
[Figure: the term of C^T C with rows (j1, j1) and columns (j1, j4)]
Norm Θ̃(√(nd)). Note: 1 or 3 edges between i and j1 on one side, 0 or 2 edges between i and j1 on the other
SLIDE 40
Casework
[Figure: the term of C^T C with rows (j1, j1) and columns (j2, j2)]
Norm Θ̃(nd). Too large! Note: 0 or 2 edges between i and j1 and between i and j2
SLIDE 41
Casework
[Figure: the term of C^T C with rows and columns both (j1, j1)]
Turns out to be 3n·Id + [terms of norm Õ(√n)]. Note: 0 or 2 edges between i and j1 on each side, giving 0, 2, or 4 edges between i and j1 in the product
SLIDE 42
- Most cases have sufficiently small norm.
- Two cases have a norm which is too large, so
norm bounds alone are not enoughβ¦
Summary
SLIDE 43
Part V: SOS-Symmetry to the Rescue
SLIDE 44
- Instead of looking at max_{x: ||x||=1} x^T C^T C x, we only need to upper bound max_{y: ||y||=1} (y ⊗ y)^T C^T C (y ⊗ y).
- As far as (y ⊗ y)^T C^T C (y ⊗ y) is concerned, we can rearrange indices in pieces of C^T C.
Key Idea: Rearranging Indices
SLIDE 45
Rearranging Indices Case #1
[Figure: the piece with rows (j1, j1) and columns (j2, j2) is rearranged into the diagonal piece with rows and columns (j1, j2)]
SLIDE 46
Rearranging Indices Case #2
[Figure: the piece with rows (j1, j1) and columns (j3, j4) is rearranged into a piece with a shared column index across the rows and columns]
SLIDE 47
- For the two cases whose norm is too high, their norm can be reduced by rearranging indices.
- This proves the upper bound on max_{y: ||y||=1} (y ⊗ y)^T C^T C (y ⊗ y).
Effect of Rearranging Indices
SLIDE 48
Part VI: Observations and Loose Ends
SLIDE 49
- Note: This 4-norm analysis roughly corresponds to p. 33–37 of [BBH+12].
- In fact, with a slightly more careful analysis we can show that (y ⊗ y)^T C^T C (y ⊗ y) = (3 ± o(1)) n ||y||_2^4, matching the results in [BBH+12].
Observations: 4-Norm Analysis
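For a single fixed unit vector y, this value can be sanity-checked numerically (my own addition with illustrative sizes; this checks concentration for one y, not the uniform bound over all y):

```python
# For fixed unit y and B with N(0,1) entries, each (B_i · y) ~ N(0,1),
# so ||By||_4^4 = (y ⊗ y)^T C^T C (y ⊗ y) concentrates around 3n
# (since E[g^4] = 3 for a standard Gaussian g).
import numpy as np

rng = np.random.default_rng(0)
n, d = 200_000, 10
B = rng.normal(size=(n, d))
y = rng.normal(size=d)
y /= np.linalg.norm(y)
val = np.sum((B @ y) ** 4)   # equals (y ⊗ y)^T C^T C (y ⊗ y)
assert abs(val / (3 * n) - 1.0) < 0.05
```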
SLIDE 50
- How can we handle an arbitrary R rather than a random orthogonal R (i.e. any basis for the span of the vectors)?
- SOS handles it automatically!
- Idea: The SOS-symmetry and M ⪰ 0 constraints are invariant under linear transformations of the variables. Thus, having a different R merely applies a linear transformation to the pseudo-expectation values.
SLIDE 51
- We have only shown a distinguishing algorithm between the random and planted cases. How can we find the planted sparse vector w exactly?
- Can be done in two steps:
  1. The analysis shows that degree 4 SOS will output a vector w′ which is highly correlated with w (because the random part of the subspace has nothing with high 4-norm).
  2. Using w′ as a guide, find w. This can be done by minimizing the ℓ1 norm of a vector w in the subspace subject to ⟨w, w′⟩ = 1; see [BKS14] for details.
Loose Ends: Finding π€ Exactly
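Step 2 can be written as a linear program. The following is a hedged sketch with my own choice of instance parameters and solver (scipy's `linprog`), not the exact procedure from [BKS14]: minimize Σ t_i over (α, t) with −t ≤ Aα ≤ t and ⟨Aα, w′⟩ = 1, where A is a basis for the subspace.

```python
# Sketch of l1 recovery: given a basis A for the planted subspace and a
# guide vector w_guide correlated with the planted w, minimize ||A a||_1
# subject to <A a, w_guide> = 1, encoded as an LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, d, k = 400, 4, 5  # illustrative sizes (my own choices)

w = np.zeros(n)
supp = rng.choice(n, size=k, replace=False)
w[supp] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)   # planted sparse unit vector
V = np.column_stack([rng.normal(0, 1 / np.sqrt(n), size=(n, d - 1)), w])
A = V @ np.linalg.qr(rng.normal(size=(d, d)))[0]         # arbitrary basis

w_guide = w + 0.3 * rng.normal(size=n) / np.sqrt(n)      # noisy guide (stand-in for SOS output)

# LP variables: (alpha in R^d, t in R^n); minimize sum(t)
# subject to -t <= A alpha <= t and <A alpha, w_guide> = 1
c = np.concatenate([np.zeros(d), np.ones(n)])
A_ub = np.block([[A, -np.eye(n)], [-A, -np.eye(n)]])
b_ub = np.zeros(2 * n)
A_eq = np.concatenate([w_guide @ A, np.zeros(n)])[None, :]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
              bounds=[(None, None)] * d + [(0, None)] * n)
x = A @ res.x[:d]
corr = abs(x @ w) / np.linalg.norm(x)
assert corr > 0.9    # recovered vector is highly correlated with w
```

The ℓ1 objective penalizes the dense random directions far more than the sparse planted one, so the optimum lines up with w.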
SLIDE 52
Part VII: Open Problems
SLIDE 53
- What more can we say when d ≫ √n?
- More specifically, can we find a better algorithm using more than the 4-norm? Is there an SOS lower bound showing that k = n²/d² is tight?
Open Problems
SLIDE 54 References
- [BBH+12] B. Barak, F. G. S. L. Brandão, A. W. Harrow, J. A. Kelner, D. Steurer, and Y. Zhou. Hypercontractivity, sum-of-squares proofs, and their applications. STOC 2012, p. 307–326.
- [BKS14] B. Barak, J. A. Kelner, and D. Steurer. Rounding sum-of-squares relaxations. STOC 2014.