
Hardness of Certification for Constrained PCA

Alex Wein

Courant Institute, NYU
Joint work with: Afonso Bandeira (NYU) and Tim Kunisky (NYU)

1 / 19


Part I: Statistical-to-Computational Gaps and the “Low-Degree Method”

2 / 19

Statistical-to-Computational Gaps

◮ Planted clique: G(n, 1/2) ∪ {k-clique}
  ◮ n vertices
  ◮ Each of the (n choose 2) possible edges occurs independently with probability 1/2
  ◮ Planted clique on k vertices
  ◮ Goal: find the clique

3 / 19

Statistical-to-Computational Gaps

◮ Planted clique: G(n, 1/2) ∪ {k-clique}
  ◮ Statistically, can find a planted clique of size (2 + ε) log n
  ◮ In poly-time, can only find a clique of size Ω(√n)
◮ Sparse PCA
◮ Stochastic block model (community detection)
◮ Random constraint satisfaction problems (e.g. 3-SAT)
◮ Tensor PCA
◮ Tensor decomposition
◮ Synchronization / orbit recovery

Different from the theory of NP-completeness: these are average-case problems.

Q: What fundamentally makes a problem easy or hard?

4 / 19

How to Show that a Problem is Hard?

We don’t know how to prove that average-case problems are hard, but there are various forms of evidence:

◮ Reductions (e.g. from planted clique) [Berthet, Rigollet ’13]
◮ Failure of MCMC [Jerrum ’92]
◮ Shattering of the solution space [Achlioptas, Coja-Oghlan ’08]
◮ Failure of local algorithms [Gamarnik, Sudan ’13]
◮ Statistical physics, belief propagation (BP) [Decelle, Krzakala, Moore, Zdeborová ’11]
◮ Optimization landscape, Kac-Rice [Auffinger, Ben Arous, Černý ’10]
◮ Sum-of-squares lower bounds [Barak, Hopkins, Kelner, Kothari, Moitra, Potechin ’16]
◮ This talk: the “low-degree method”
  [Barak, Hopkins, Kelner, Kothari, Moitra, Potechin ’16; Hopkins, Steurer ’17; Hopkins, Kothari, Potechin, Raghavendra, Schramm, Steurer ’17; Hopkins ’18 (PhD thesis)]

5 / 19

The Low-Degree Method

Suppose we want to hypothesis test (with error probability o(1)) between two distributions:

◮ Null model Y ∼ Q_n,    e.g. G(n, 1/2)
◮ Planted model Y ∼ P_n,    e.g. G(n, 1/2) ∪ {k-clique}

Look for a degree-D multivariate polynomial f that distinguishes P from Q:

    max_{f ∈ R[Y]_D}  E_{Y∼P}[f(Y)] / √(E_{Y∼Q}[f(Y)²])

Want f(Y) to be big when Y ∼ P and small when Y ∼ Q

6 / 19
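To make the definition concrete, here is a one-dimensional worked example (my own illustration, not from the slides): take Q = N(0, 1), P = N(μ, 1), and optimize over polynomials of degree at most 1.

```latex
% Degree-1 example: Q = N(0,1), P = N(mu,1).
% The linear polynomial f(Y) = Y already gives advantage mu / 1 = mu,
% and allowing a constant term does slightly better:
\[
  f(Y) = 1 + \mu Y
  \quad\Longrightarrow\quad
  \frac{\mathbb{E}_{Y\sim P}[f(Y)]}{\sqrt{\mathbb{E}_{Y\sim Q}[f(Y)^2]}}
  = \frac{1+\mu^2}{\sqrt{1+\mu^2}} = \sqrt{1+\mu^2},
\]
% which is the optimum over degree-1 polynomials (Cauchy-Schwarz over the
% coefficients (a,b) of f = a + bY).  The ratio grows with the signal strength mu,
% matching the intuition that larger values mean easier distinguishing.
```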

The Low-Degree Method

    max_{f ∈ R[Y]_D}  E_{Y∼P}[f(Y)] / √(E_{Y∼Q}[f(Y)²])
      = max_{f ∈ R[Y]_D}  E_{Y∼Q}[L(Y) f(Y)] / √(E_{Y∼Q}[f(Y)²])
      = max_{f ∈ R[Y]_D}  ⟨L, f⟩ / ‖f‖
      = ‖L^{≤D}‖

where
  R[Y]_D: polynomials of degree ≤ D (a subspace)
  L(Y) = dP/dQ (Y), the likelihood ratio
  ⟨f, g⟩ = E_{Y∼Q}[f(Y) g(Y)],   ‖f‖ = √⟨f, f⟩

Maximizer: f = L^{≤D} := proj_{R[Y]_D} L, and the optimal value is the norm of this “low-degree likelihood ratio”.

7 / 19
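The last step above is the standard projection argument; a short worked version (spelled out here for completeness, it is not on the slides) is:

```latex
% Why the maximizer is the projection of L onto the degree-<=D subspace.
% For any f of degree <= D, the component of L orthogonal to R[Y]_D contributes
% nothing to <L, f>, so
\[
  \frac{\langle L, f\rangle}{\|f\|}
  \;=\; \frac{\langle L^{\le D}, f\rangle}{\|f\|}
  \;\le\; \|L^{\le D}\|
  \qquad\text{(Cauchy--Schwarz)},
\]
% with equality when f is proportional to L^{<=D}.  Hence the optimal value is
% exactly the norm of the low-degree likelihood ratio.
```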

The Low-Degree Method

Conclusion:   max_{f ∈ R[Y]_D}  E_{Y∼P}[f(Y)] / √(E_{Y∼Q}[f(Y)²])  =  ‖L^{≤D}‖

Heuristically:
  ‖L^{≤D}‖ = ω(1)  ⇒  some degree-D polynomial can distinguish Q, P
  ‖L^{≤D}‖ = O(1)  ⇒  degree-D polynomials fail

Degree-O(log n) polynomials ⇔ polynomial-time algorithms

◮ Spectral method: distinguish via the top eigenvalue of a matrix M = M(Y) whose entries are O(1)-degree polynomials in Y
◮ Log-degree distinguisher: f(Y) = Tr(M^q) with q = Θ(log n)
◮ Spectral methods ⇔ sum-of-squares [HKPRSS ’17]

Conjecture (informal variant of [Hopkins ’18])
For “nice” Q, P, if ‖L^{≤D}‖ = O(1) for D = log^{1+Ω(1)}(n), then no polynomial-time algorithm can distinguish Q, P with success probability 1 − o(1).

8 / 19
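As a concrete (and entirely illustrative) instance of the Tr(M^q) distinguisher, the following sketch builds a degree-1 matrix from the adjacency matrix of planted clique and compares Tr(M^q) under the null and planted models; the parameter choices n, k, q are mine, picked only so the separation is visible.

```python
import numpy as np

# Hedged sketch: a log-degree spectral distinguisher for planted clique.
# M has entries that are degree-1 polynomials in the observed adjacency matrix Y,
# so f(Y) = Tr(M^q) is a polynomial of degree q = Theta(log n) in Y.
rng = np.random.default_rng(0)

def centered_adjacency(n, k=0):
    """Return M = (2Y - 1)/sqrt(n) (zero diagonal) for G(n,1/2), optionally with a planted k-clique."""
    A = rng.random((n, n)) < 0.5
    A = np.triu(A, 1)
    A = A | A.T                                   # symmetric G(n, 1/2)
    if k > 0:
        S = rng.choice(n, size=k, replace=False)  # plant a clique on k random vertices
        A[np.ix_(S, S)] = True
    M = (2.0 * A.astype(float) - 1.0) / np.sqrt(n)
    np.fill_diagonal(M, 0.0)
    return M

n, k, q = 400, 60, 20                             # k ~ 3*sqrt(n); q plays the role of Theta(log n)
for label, kk in [("null   ", 0), ("planted", k)]:
    eigs = np.linalg.eigvalsh(centered_adjacency(n, kk))
    print(label, "lambda_max =", round(eigs[-1], 2), "  Tr(M^q) =", f"{np.sum(eigs**q):.3e}")

# Under the null the spectrum stays in roughly [-2, 2] and Tr(M^q) is moderate;
# with a planted clique of size k >> sqrt(n) an outlier eigenvalue near k/sqrt(n)
# dominates, so thresholding this single polynomial separates the two models.
```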

Advantages of the Low-Degree Method

◮ Can actually calculate/bound ‖L^{≤D}‖ for many problems
◮ And the predictions are correct! (i.e. they match widely-believed conjectures)
  ◮ Planted clique, sparse PCA, stochastic block model, tensor PCA, ...
◮ Heuristically, the low-degree prediction matches the performance of sum-of-squares
◮ But the low-degree calculation is much easier than proving SOS lower bounds
◮ By varying the degree D, can explore the power of subexponential-time algorithms:
  ◮ Degree-n^δ polynomials ⇔ time-2^{n^δ} algorithms, for δ ∈ (0, 1)

9 / 19

How to Compute ‖L^{≤D}‖

Additive Gaussian noise:   P : Y = X + Z   vs   Q : Y = Z
where X ∼ P, any distribution over R^N, and Z has i.i.d. N(0, 1) entries

L(Y) = dP/dQ (Y) = E_X [ exp(−½ ‖Y − X‖²) / exp(−½ ‖Y‖²) ] = E_X exp(⟨Y, X⟩ − ½ ‖X‖²)

Write L = Σ_α c_α h_α, where {h_α} are the Hermite polynomials (an orthonormal basis w.r.t. Q)

‖L^{≤D}‖² = Σ_{|α|≤D} c_α²,   where c_α = ⟨L, h_α⟩ = E_{Y∼Q}[L(Y) h_α(Y)]

· · ·

Result:   ‖L^{≤D}‖² = Σ_{d=0}^{D} (1/d!) E_{X,X′}[⟨X, X′⟩^d]   (X′ an independent copy of X)

10 / 19
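To illustrate how this formula gets used, here is a small self-contained sketch (my own toy instance, not from the talk): the planted signal is X = θ·xx⊤ for a hypercube direction x ∈ {±1/√n}^n, viewed as a vector in R^{n²} with i.i.d. N(0,1) noise on every entry (an asymmetric-noise stand-in for a spiked matrix model, with normalizations chosen by me so the formula above applies verbatim). Then ⟨X, X′⟩ = θ²⟨x, x′⟩², and the overlap moments can be computed exactly from a binomial distribution.

```python
import math

# Hedged sketch: evaluate ||L^{<=D}||^2 = sum_{d<=D} (1/d!) E[<X,X'>^d] exactly
# for the toy signal X = theta * x x^T with x uniform on {+-1/sqrt(n)}^n and
# i.i.d. N(0,1) noise on all n^2 entries.  Here <X,X'> = theta^2 * <x,x'>^2 and
# <x,x'> = (2B - n)/n with B ~ Binomial(n, 1/2), so everything is a finite sum.

def overlap_moment(n, power):
    """Exact E[<x,x'>^power] for independent uniform hypercube directions x, x'."""
    return sum(math.comb(n, b) * 0.5**n * ((2 * b - n) / n) ** power
               for b in range(n + 1))

def low_degree_norm_sq(n, theta, D):
    """sum_{d=0}^{D} (1/d!) * theta^{2d} * E[<x,x'>^{2d}]."""
    return sum(theta ** (2 * d) * overlap_moment(n, 2 * d) / math.factorial(d)
               for d in range(D + 1))

n = 200
for lam in (0.5, 0.7, 0.9):            # spike strength theta = lam * sqrt(n)
    theta = lam * math.sqrt(n)
    vals = [round(low_degree_norm_sq(n, theta, D), 2) for D in (2, 4, 8, 16)]
    print(f"lam = {lam}: ||L^(<=D)||^2 for D = 2,4,8,16 ->", vals)

# Heuristic reading (per the previous slide): values that stay O(1) as D grows suggest
# degree-D polynomials fail at that signal strength; values that blow up suggest a
# successful low-degree distinguisher.
```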


Part II: Hardness of Certification for Constrained PCA Problems

11 / 19

Constrained PCA

Let W ∼ GOE(n), the “Gaussian orthogonal ensemble”:

◮ n × n random symmetric matrix: Wij = Wji ∼ N(0, 1/n) for i ≠ j, and Wii ∼ N(0, 2/n)
◮ Eigenvalues follow the semicircle law on [−2, 2]

PCA:   max_{‖x‖=1} x⊤Wx = λmax(W) → 2   as n → ∞

Constrained PCA:   φ(W) := max_{x ∈ {±1/√n}^n} x⊤Wx

Statistical physics: the “Sherrington–Kirkpatrick spin glass model”

◮ φ(W) → 2P∗ ≈ 1.5264 as n → ∞ [Parisi ’80; Talagrand ’06]

12 / 19
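A tiny numerical sanity check of the gap between the two maxima (my own experiment; at such small n the asymptotic values 2 and 2P∗ ≈ 1.5264 are only loosely approached) can be done by brute force over the hypercube:

```python
import numpy as np

# Hedged sketch: compare lambda_max(W) with the hypercube maximum phi(W) for a small
# GOE matrix by brute force.  n is kept tiny so that all 2^(n-1) sign patterns can be
# enumerated; finite-size effects are large, so this only illustrates the gap.
rng = np.random.default_rng(0)

def sample_goe(n):
    A = rng.standard_normal((n, n))
    return (A + A.T) / np.sqrt(2 * n)        # off-diagonal variance 1/n, diagonal variance 2/n

def phi_bruteforce(W):
    n = W.shape[0]
    m = 1 << (n - 1)                          # fix the first coordinate to +1 (x and -x tie)
    bits = (np.arange(m)[:, None] >> np.arange(n - 1)) & 1
    S = np.hstack([np.ones((m, 1)), 2.0 * bits - 1.0]) / np.sqrt(n)
    return np.einsum("ij,jk,ik->i", S, W, S).max()

n = 16
W = sample_goe(n)
print("lambda_max(W) =", round(np.linalg.eigvalsh(W)[-1], 3))   # -> 2 as n -> infinity
print("phi(W)        =", round(phi_bruteforce(W), 3))           # -> 2*P_* ~ 1.5264 as n -> infinity
```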

Search vs Certification

φ(W) := max_{x ∈ {±1/√n}^n} x⊤Wx,    W ∼ GOE(n)

Two computational problems:

◮ Search: given W, find x ∈ {±1/√n}^n with large x⊤Wx
  ◮ Proves a lower bound on φ(W)
◮ Certification: given W, prove φ(W) ≤ B for some bound B
  ◮ Formally: an algorithm {fn} outputs fn(W) ∈ R such that:
    (i) φ(W) ≤ fn(W) for all W ∈ R^{n×n}
    (ii) if W ∼ GOE(n), then fn(W) ≤ B + o(1) with probability 1 − o(1)
  ◮ Note: cannot just output fn(W) = 2P∗ + ε, since condition (i) must hold for every W, not just typical W

13 / 19

Search vs Certification: Prior Work

Perfect search is possible in poly time:

◮ Can find x ∈ {±1/√n}^n such that x⊤Wx ≥ 2P∗ − ε [Montanari ’18]
◮ Optimization of full-RSB models [Subag ’18]

Trivial spectral certification:   φ(W) ≤ max_{‖x‖=1} x⊤Wx = λmax(W) → 2

Can we do better (in poly time)?

◮ Convex relaxation?
◮ Sum-of-squares?

Answer: no!

◮ In particular, any convex relaxation fails

14 / 19

Main Result

Theorem (informal)
Conditional on the low-degree method, for any ε > 0, no polynomial-time algorithm can certify an upper bound of 2 − ε on φ(W).

◮ In fact, certification requires essentially exponential time: 2^{n^{1−o(1)}}
◮ The result also holds for constraint sets other than {±1/√n}^n

Proof outline:
(i) Reduction from a hypothesis testing problem (negatively-spiked Wishart) to the certification problem
(ii) Use the low-degree method to show that the hypothesis testing problem is hard

15 / 19

Spiked Wishart Model

Q : observe N independent samples y1, . . . , yN where yi ∼ N(0, In)

P : planted vector x ∼ Unif({±1/√n}^n); observe y1, . . . , yN with yi ∼ N(0, In + βxx⊤)

Parameters: n/N → γ, β ∈ [−1, ∞)

Spectral threshold: if β² > γ, can distinguish Q, P using the top/bottom eigenvalue of the sample covariance matrix Y = (1/N) Σi yi yi⊤ [Baik, Ben Arous, Péché ’05]

Using the low-degree method, we show: if β² < γ, cannot distinguish Q, P (unless given exponential time)

16 / 19
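A quick numerical illustration of the spectral threshold (my own toy experiment; the parameter choices are mine): with γ = n/N = 0.5, a negative spike with β² > γ pushes an eigenvalue of the sample covariance below the Marchenko-Pastur bulk, while a weaker spike does not.

```python
import numpy as np

# Hedged sketch: the BBP-style spectral test for the (negatively) spiked Wishart model.
rng = np.random.default_rng(1)

def extreme_sample_cov_eigs(n, N, beta):
    """Smallest/largest eigenvalue of (1/N) sum_i y_i y_i^T with y_i ~ N(0, I + beta*x x^T)."""
    x = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)           # planted hypercube direction
    G = rng.standard_normal((N, n))                            # rows ~ N(0, I)
    Y = G + (np.sqrt(1.0 + beta) - 1.0) * np.outer(G @ x, x)   # apply covariance square root
    S = Y.T @ Y / N
    w = np.linalg.eigvalsh(S)
    return w[0], w[-1]

n, N = 400, 800                                # gamma = n/N = 0.5; MP bulk ~ [(1-sqrt(g))^2, (1+sqrt(g))^2]
for beta in (0.0, -0.5, -0.9):                 # beta^2 < gamma: undetectable; beta^2 > gamma: detectable
    lo, hi = extreme_sample_cov_eigs(n, N, beta)
    print(f"beta = {beta:+.1f}:  lambda_min = {lo:.3f},  lambda_max = {hi:.3f}")

# Expected picture: for beta = -0.9 (beta^2 > gamma) the smallest eigenvalue separates
# clearly below the lower bulk edge (1 - sqrt(0.5))^2 ~ 0.086; for beta = -0.5 it does not.
```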

Negatively-Spiked Wishart Model

Our case of interest: β = −1 (technically β > −1, β ≈ −1)

Q : observe N random vectors in R^n
P : observe N random vectors that are all orthogonal to a planted hypercube vector x ∈ {±1/√n}^n
  ◮ yi ∼ N(0, In − xx⊤)

Spectral threshold: if N ≥ n, can distinguish using rank(y1, . . . , yN)
  ◮ Q: rank n
  ◮ P: rank n − 1

Low-degree method: if N < n, cannot distinguish (unless given exponential time)
  ◮ But statistically possible

17 / 19
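For completeness, the N ≥ n rank test is a one-liner (again my own illustration): the sample matrix from P loses exactly one dimension.

```python
import numpy as np

# Hedged sketch: the rank-based distinguisher for the beta = -1 spiked Wishart model.
rng = np.random.default_rng(2)

n, N = 50, 80                                       # N >= n, so rank reveals the planted model
x = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)    # planted hypercube direction

Y_null = rng.standard_normal((N, n))                # Q: y_i ~ N(0, I_n)
G = rng.standard_normal((N, n))
Y_planted = G - np.outer(G @ x, x)                  # P: y_i ~ N(0, I_n - x x^T)

print("rank under Q:", np.linalg.matrix_rank(Y_null))      # n
print("rank under P:", np.linalg.matrix_rank(Y_planted))   # n - 1
```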

Reduction from Wishart to Certification

◮ Suppose you can certify φ(W) ≤ 2 − ε when W ∼ GOE(n)
  ◮ Recall φ(W) = max_{x∈{±1/√n}^n} x⊤Wx
◮ Then you can certify that the top δn-dimensional eigenspace of W does not contain a hypercube vector
  ◮ If a hypercube vector x were a linear combination of the top δn eigenvectors, it would satisfy x⊤Wx ≥ 2 − ε
◮ So you can certify that a random δn-dimensional subspace does not contain a hypercube vector
◮ So you can distinguish between a random δn-dimensional subspace and a δn-dimensional subspace containing a hypercube vector
◮ So you can distinguish between a random (1 − δ)n-dimensional subspace and a (1 − δ)n-dimensional subspace that is orthogonal to a hypercube vector
◮ But this is exactly the Wishart problem with β = −1 and N = (1 − δ)n, which is hard ⇒ contradiction

18 / 19

Summary

◮ Low-degree method: a systematic way to predict when hypothesis testing problems are computationally easy or hard
◮ But what about other types of average-case problems?
  ◮ Search
  ◮ Certification
  ◮ Recovery (e.g. tensor decomposition)
  ◮ Sampling
  ◮ Counting solutions
◮ For constrained PCA, we gave low-degree evidence that certification is hard, by reduction from a hypothesis testing problem (negatively-spiked Wishart)
◮ Future direction: how to systematically predict hardness for other types of certification/search/etc. problems?

Thanks!

19 / 19