  1. Lecture 14: Planted Sparse Vector

  2. Lecture Outline • Part I: Planted Sparse Vector and 2 to 4 Norm • Part II: SOS and 2 to 4 Norm on Random Subspaces • Part III: Warmup: Showing ‖x‖ ≈ 1 • Part IV: 4-Norm Analysis • Part V: SOS-symmetry to the Rescue • Part VI: Observations and Loose Ends • Part VII: Open Problems

  3. Part I: Planted Sparse Vector and 2 to 4 Norm

  4. Planted Sparse Vector • Planted Sparse Vector problem: Given the span of d − 1 random vectors in ℝ^n and one unit vector v ∈ ℝ^n of sparsity k, can we recover v? • More precisely, let V be an n × d matrix where: 1. d − 1 columns of V are vectors of length ≈ 1 chosen randomly from ℝ^n 2. One column of V is a unit vector v with ≤ k nonzero entries. • Given VR, where R is an arbitrary invertible d × d matrix, can we recover v?

  5. Theorem Statement • Theorem 1.4 [BKS14]: There is a constant c > 0 and an algorithm based on constant-degree SOS such that for every vector v_0 supported on at most cn · min{1, n/d^2} coordinates, if v_1, …, v_d are chosen independently at random from the Gaussian distribution on ℝ^n, then given any basis for V = span{v_0, …, v_d}, the algorithm outputs an ε-approximation to v_0 in poly(n, log(1/ε)) time.

  6. Random Distribution • Random Distribution: We choose each entry of V independently from N(0, 1/n), the normal distribution with mean 0 and standard deviation 1/√n. • We then choose R to be a random d × d orthogonal/rotation matrix and take VR to be our input matrix.

  7. Random Distribution • Remark: If R is any d × d orthogonal/rotation matrix, then VR can also be obtained by choosing each entry of V independently from N(0, 1/n). • Idea: Each row of V comes from a multivariate normal distribution with covariance matrix (1/n)·Id_d, which is invariant under rotations.

  8. Planted Distribution • Planted Distribution: We choose each entry of the first d − 1 columns of V independently from N(0, 1/n). The last column of V is our sparse unit vector v. • We then choose R to be a random d × d orthogonal/rotation matrix and take VR to be our input matrix.
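To make the two distributions concrete, here is a minimal numpy sketch of both samplers. The function names and parameters (random_instance, planted_instance, n, d, k) are illustrative, not from the lecture; the QR factorization of a Gaussian matrix gives a random rotation (Haar-distributed up to column signs).

```python
import numpy as np

def random_instance(n, d, rng):
    """Random distribution: every entry of V is N(0, 1/n)."""
    V = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, d))
    R, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random rotation
    return V @ R

def planted_instance(n, d, k, rng):
    """Planted distribution: d - 1 Gaussian columns plus one k-sparse unit column."""
    V = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, d))
    v = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    v[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)  # k-sparse unit vector
    V[:, -1] = v
    R, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random rotation hides v
    return V @ R

rng = np.random.default_rng(0)
X = planted_instance(n=5000, d=50, k=50, rng=rng)  # the input basis VR
```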

  9. Output • We ask for an x such that: 1. ‖VRx‖ = 1 2. VRx is k-sparse (i.e. at most k entries of VRx are nonzero). • It is hard to search for an x such that VRx is k-sparse, so we’ll need to relax the problem.

  10. Distinguishing Sparse Vectors • Key idea: All unit vectors have the same 2-norm. However, sparse vectors have a higher 4-norm. • The 4-norm of a k-sparse unit vector in ℝ^n is at least (k · 1/k^2)^{1/4} = 1/k^{1/4} (obtained by setting k coordinates to ±1/√k and the rest to 0). • Relaxation Attempt #1: Search for an x such that: 1. ‖VRx‖_2 = 1 2. ‖VRx‖_4 ≥ 1/k^{1/4}
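A quick numerical sanity check of this gap (a sketch; the values of n and k are arbitrary): an evenly spread k-sparse unit vector has 4-norm exactly k^{-1/4}, while a typical dense unit vector has 4-norm about (3/n)^{1/4}.

```python
import numpy as np

n, k = 10_000, 100
rng = np.random.default_rng(1)

sparse = np.zeros(n)
sparse[:k] = 1.0 / np.sqrt(k)      # k coordinates set to 1/sqrt(k)
dense = rng.standard_normal(n)
dense /= np.linalg.norm(dense)     # typical unit vector

print(np.linalg.norm(sparse, 4))   # k**(-1/4) = 0.316...
print(np.linalg.norm(dense, 4))    # ~ (3/n)**(1/4) ~ 0.13, noticeably smaller
```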

  11. 2 to 4 Norm Problem • This is the 2 to 4 Norm Problem: Given a matrix A, find the unit vector x which maximizes ‖Ax‖_4.

  12. Part II: SOS and 2 to 4 Norm on Random Subspaces

  13. 2 to 4 Norm Hardness • Unfortunately, the 2 to 4 norm problem is hard [BBH+12]: – It is NP-hard to obtain an approximation ratio of 1 + 1/n^{polylog(n)}. – Assuming ETH (the exponential time hypothesis), it is hard to approximate to within a constant factor. • Thus, we’ll need to relax our problem further.

  14. SOS Relaxation • Relaxation: Find a pseudo-expectation Ẽ which respects the following constraints: 1. ‖VRx‖_2^2 = Σ_{j=1}^{n} (VRx)_j^2 = 1 2. ‖VRx‖_4^4 = Σ_{j=1}^{n} (VRx)_j^4 ≥ 1/k

  15. Showing a Distinguishing Algorithm • Constraints: 1. ‖VRx‖_2^2 = Σ_{j=1}^{n} (VRx)_j^2 = 1 2. ‖VRx‖_4^4 = Σ_{j=1}^{n} (VRx)_j^4 ≥ 1/k • To show that SOS distinguishes between the random and planted distributions, it is sufficient to show that in the random case there is no Ẽ which respects these constraints and has a PSD moment matrix M (in the planted case, the actual sparse vector gives such an Ẽ). • Remark: Although the 2 to 4 Norm problem is hard in general, we just need to show that SOS can approximate it on random subspaces.
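Written out, the pseudo-expectation program looks as follows (a standard degree-4 formulation, not copied verbatim from the slides):

```latex
\tilde{\mathbb{E}}[1] = 1, \qquad
\tilde{\mathbb{E}}\!\left[ p(x)\left( \lVert VRx \rVert_2^2 - 1 \right) \right] = 0
  \ \text{ for all } p \text{ with } \deg p \le 2,
```

```latex
\tilde{\mathbb{E}}\!\left[ \lVert VRx \rVert_4^4 \right] \ \ge\ \frac{1}{k},
\qquad
M_{p,q} = \tilde{\mathbb{E}}[p \cdot q] \succeq 0 .
```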

  16. 2 to 4 Norm on Random Subspaces • Given a random subspace, what is the expected value of the largest 4-norm of a unit vector in the subspace? • Trivial strategy: Any unit vector’s 4-norm is at least 1/n^{1/4}. • Can we do better?

  17. 2 to 4 Norm on Random Subspaces • Another strategy: Take a basis for this space and take a linear combination which maximizes one coordinate (subject to having length 1). • If we add together d random vectors with entries ≈ ±1/√n, w.h.p. the result will have norm Θ̃(√d). Dividing the resulting vector by Θ̃(√d), the maximized entry will have magnitude Θ̃(√(d/n)), while the other entries will have magnitude Õ(1/√n).

  18. 2 to 4 Norm on Random Subspaces • Calling our final result w, w.h.p. the maximized entry of w contributes Θ̃(d^2/n^2) to ‖w‖_4^4 while the other entries contribute Θ̃(1/n). • It turns out that this strategy is essentially optimal. Thus, with high probability the maximum 4-norm of a unit vector in a d-dimensional random subspace will be Θ̃(max{√(d/n), 1/n^{1/4}}).
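A numerical sketch of this strategy (parameters chosen so that d ≫ √n and the boosted coordinate dominates; all names are illustrative):

```python
import numpy as np

n, d = 10_000, 800
rng = np.random.default_rng(2)
A = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, d))  # d random vectors, entries ~ +-1/sqrt(n)

signs = np.sign(A[0, :])       # align all basis vectors at coordinate 0
w = A @ signs
w /= np.linalg.norm(w)         # norm was ~ sqrt(d) before normalizing

print(abs(w[0]), np.sqrt(d / n))                              # boosted entry ~ sqrt(d/n)
print(np.linalg.norm(w, 4), max(np.sqrt(d / n), n ** -0.25))  # 4-norm within a constant of the max
```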

  19. Algorithm Boundary • Planted dist: max 4-norm ≥ 1/k^{1/4} • Random dist: max 4-norm is Θ̃(max{√(d/n), 1/n^{1/4}}) • If SOS can certify this upper bound for a random subspace, we get a distinguishing algorithm when max{√(d/n), 1/n^{1/4}} ≪ 1/k^{1/4} (which happens when d ≤ √n and k ≪ n, or when d ≥ √n and k ≪ n^2/d^2).
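Unpacking the boundary condition (a worked step; it just inverts the fourth powers):

```latex
\sqrt{d/n} \ll k^{-1/4} \iff k \ll n^{2}/d^{2},
\qquad
n^{-1/4} \ll k^{-1/4} \iff k \ll n .
```

When d ≤ √n we have n^2/d^2 ≥ n, so k ≪ n is the binding condition; when d ≥ √n the condition k ≪ n^2/d^2 binds.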

  20. Part III: Warmup: Showing ‖x‖ ≈ 1

  21. Showing ‖x‖ ≈ 1 • Take w = VRx. • We expect that ‖w‖ ≈ ‖x‖. Since we require that ‖w‖ = 1, this implies that we will have ‖x‖ ≈ 1. • To check that ‖w‖ ≈ ‖x‖, observe that ‖w‖_2^2 = x^T R^T V^T V R x. Thus, it is sufficient to show that R^T V^T V R ≈ Id_d.

  22. Checking R^T V^T V R ≈ Id_d • We have that R^T V^T V R ≈ Id_d because the columns of VR are d random near-unit vectors (where d ≪ n) and are thus approximately orthonormal. • However, we will use graph matrices to analyze the 4-norm, so as a warm-up, let’s check that R^T V^T V R ≈ Id_d using graph matrices.
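Before the graph-matrix warm-up, a direct numerical check (a sketch; n and d are arbitrary with d ≪ n):

```python
import numpy as np

n, d = 20_000, 100
rng = np.random.default_rng(3)
V = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, d))
R, _ = np.linalg.qr(rng.standard_normal((d, d)))   # random rotation

M = (V @ R).T @ (V @ R)                            # = R^T V^T V R
print(np.linalg.norm(M - np.eye(d), 2))            # spectral norm ~ 2*sqrt(d/n) ~ 0.14
```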

  23. Graph Matrices Over N(0,1) • So far we have worked over {−1, +1}^n. • How can we use graph matrices over N(0,1)^n? • Key idea: Look at the Fourier characters over N(0,1).

  24. Fourier Analysis Over N(0,1) • Inner product on N(0,1): f · g = E_{x∼N(0,1)}[f(x)g(x)] • Fourier characters: Hermite polynomials • The first few Hermite polynomials (up to normalization) are as follows: 1. h_0 = 1 2. h_1 = x 3. h_2 = x^2 − 1 4. h_3 = x^3 − 3x • To normalize, divide h_k by √(k!).
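These h_k are the probabilists' Hermite polynomials, available in numpy's hermite_e module. A Monte Carlo sketch of the orthogonality relation E[h_j h_k] = k!·δ_{jk} (which is why dividing by √(k!) normalizes them):

```python
import numpy as np
from numpy.polynomial import hermite_e
from math import factorial

rng = np.random.default_rng(4)
x = rng.standard_normal(1_000_000)   # samples from N(0,1)

for j in range(4):
    for k in range(4):
        hj = hermite_e.hermeval(x, np.eye(4)[j])   # evaluate h_j at the samples
        hk = hermite_e.hermeval(x, np.eye(4)[k])
        # E[h_j * h_k] should be ~ k! when j == k and ~ 0 otherwise
        print(j, k, round(float(np.mean(hj * hk)), 2), factorial(k) if j == k else 0)
```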

  25. Graph Matrices Over N(0,1) • Graph matrices over {−1,1}^n: 1 and x are a basis for functions over {−1,1}. We represent x by an edge and 1 by the absence of an edge. • Graph matrices over N(0,1)^n: {h_k} are a basis for functions over N(0,1). We represent h_k by a multi-edge with multiplicity k.

  26. Graph Matrices for R^T V^T V R • For convenience, take A = √n·VR and think of the entries of A as the input. Now each entry of A is chosen independently from N(0,1). • A_{ij} is represented by an edge from node i to node j. • In-class challenge: What is R^T V^T V R in terms of graph matrices? [Figure: R^T V^T V R drawn as (1/n) × A^T A, with column indices j_1, j_2 (d possibilities each) and row index i (n possibilities)]

  27. Graph Matrices for R^T V^T V R • In-class challenge answer: [Figure: R^T V^T V R = (1/n)A^T A = Id_d + (1/n)·(graph matrix with a multiplicity-2 edge between j and i) + (1/n)·(graph matrix for the path j_1 - i - j_2), where the j-vertices range over d possibilities, i ranges over n possibilities, and U, V are the left and right index sets]

  28. Generalizing Rough Norm Bounds • Here we have two different types of vertices, one for the rows of A (which have n possibilities) and one for the columns of A (which have d possibilities). • We can generalize the rough norm bounds to handle multiple types of vertices (writing this up is on my to-do list).

  29. Generalizing Rough Norm Bounds • Generalized rough norm bounds: • Each isolated vertex outside of U and V contributes a factor equal to the number of possibilities for that vertex. • Each vertex in the minimum separator (the separator which minimizes the total number of possibilities for its vertices) contributes nothing. • Each other vertex contributes a factor equal to the square root of the number of possibilities for that vertex.

  30. Norm Bounds for R^T V^T V R • [Figure: applying the generalized norm bounds to R^T V^T V R = Id_d + (1/n)·(multiplicity-2 edge term) + (1/n)·(path term): the multiplicity-2 edge term has norm Õ(√n) and the path term has norm Õ(√(nd)), so after the 1/n scaling, R^T V^T V R = Id_d + Õ(1/√n) + Õ(√(d/n)) ≈ Id_d when d ≪ n]
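A quick empirical check of this conclusion (a sketch; the rough bounds predict the √(d/n) scaling but not the constant):

```python
import numpy as np

rng = np.random.default_rng(5)
for n, d in [(4_000, 50), (16_000, 50), (16_000, 200)]:
    A = rng.standard_normal((n, d))          # entries ~ N(0,1)
    err = np.linalg.norm(A.T @ A / n - np.eye(d), 2)
    print(n, d, round(err, 4), round(np.sqrt(d / n), 4))  # err tracks sqrt(d/n)
```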

  31. Part IV: 4-Norm Analysis

  32. 4-Norm Analysis • We want to bound ‖VRx‖_4^4 = (1/n^2)‖Ax‖_4^4. • Take B to be the matrix with entries B_{i,(j_1,j_2)} = A_{ij_1} A_{ij_2}. • Then (1/n^2)‖Ax‖_4^4 = (1/n^2)·(x ⊗ x)^T B^T B (x ⊗ x). • We can try to bound ‖B^T B‖.
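The identity behind this reduction can be checked numerically (a sketch with small, arbitrary n and d):

```python
import numpy as np

n, d = 200, 10
rng = np.random.default_rng(6)
A = rng.standard_normal((n, d))
x = rng.standard_normal(d)

# B_{i,(j1,j2)} = A_{i j1} * A_{i j2}, with the pair (j1, j2) flattened into one column index
B = np.einsum('ij,ik->ijk', A, A).reshape(n, d * d)

y = B @ np.kron(x, x)                      # y_i = (Ax)_i^2
print(np.linalg.norm(A @ x, 4) ** 4)       # ||Ax||_4^4 ...
print(y @ y)                               # ... equals (x tensor x)^T B^T B (x tensor x)
```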
