Spectral Methods from Tensor Networks


  1. Spectral Methods from Tensor Networks. Alex Wein, Courant Institute, NYU. Joint work with Ankur Moitra (MIT).

  6. Outline
     ◮ Tensors
     ◮ Statistical problems involving tensors
     ◮ A general framework for designing algorithms for tensor problems: "spectral methods from tensor networks"
     ◮ Orbit recovery: a certain class of tensor problems
     ◮ Structured tensor decomposition
     ◮ Main result: first polynomial-time algorithm for a certain orbit recovery problem

  7. I. Tensors and Tensor Networks

  10. What is a Tensor?
      An order-$p$ tensor is an $n_1 \times n_2 \times \cdots \times n_p$ multi-array: $T = (T_{i_1, i_2, \ldots, i_p})$ with $i_j \in \{1, 2, \ldots, n_j\}$. An order-1 tensor is a vector. An order-2 tensor is a matrix.
      $T$ is symmetric if $n_1 = \cdots = n_p = n$ and $T_{i_1, \ldots, i_p} = T_{i_{\pi(1)}, \ldots, i_{\pi(p)}}$ for any permutation $\pi$.
      ◮ In this talk, all tensors will be symmetric.
      Given $p$ vectors $x_1, \ldots, x_p$, the rank-1 tensor $x_1 \otimes x_2 \otimes \cdots \otimes x_p$ has entries $(x_1 \otimes x_2 \otimes \cdots \otimes x_p)_{i_1, \ldots, i_p} = (x_1)_{i_1} (x_2)_{i_2} \cdots (x_p)_{i_p}$.
      ◮ Generalizes the rank-1 matrix $xy^\top$.
      ◮ Symmetric version: $x^{\otimes p} = x \otimes \cdots \otimes x$ ($p$ times).
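
To make these definitions concrete, here is a minimal NumPy sketch (mine, not from the slides; names and dimensions are illustrative) that builds the rank-1 tensor $x^{\otimes 3}$ and checks its permutation symmetry:

```python
import numpy as np

n = 4
x = np.random.randn(n)

# Rank-1 symmetric tensor x^{⊗3}: entries (x ⊗ x ⊗ x)_{ijk} = x_i x_j x_k.
T = np.einsum('i,j,k->ijk', x, x, x)

# Symmetry: T is invariant under any permutation of its three indices.
assert np.allclose(T, T.transpose(1, 0, 2))
assert np.allclose(T, T.transpose(2, 1, 0))

# For p = 2 the same construction is just the rank-1 matrix x x^T.
assert np.allclose(np.einsum('i,j->ij', x, x), np.outer(x, x))
```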

  13. Tensor Problems
      Some statistical problems involving tensors:
      ◮ Tensor PCA / Spiked Tensor Model [RM'14, HSS'15]: Observe $T = \lambda x^{\otimes p} + Z$ where
        ◮ $x \in \mathbb{R}^n$ is the planted "signal" (norm 1)
        ◮ $\lambda > 0$ is the signal-to-noise parameter
        ◮ $Z$ is "noise" (an i.i.d. Gaussian tensor)
        Goal: given $T$, recover $x$. "Recover a rank-1 tensor buried in noise."
      ◮ Tensor Decomposition [AGJ'14, BKS'15, GM'15, HSSS'16, MSS'16]: Observe $T = \sum_{i=1}^{r} x_i^{\otimes p}$ where the $\{x_i\}$ are random vectors:
        ◮ $x_i \sim \mathcal{N}(0, I_n)$
        Goal: given $T$, recover $\{x_1, \ldots, x_r\}$. "Recover the components of a rank-$r$ tensor."
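
As a concrete illustration of the spiked tensor model, here is a small instance generator. This is a sketch under my own conventions: the slides do not pin down a noise normalization, and one often symmetrizes $Z$, which I skip here.

```python
import numpy as np

def spiked_tensor(n, lam, rng):
    """Sample T = lam * x^{⊗3} + Z with a unit-norm planted signal x."""
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)                  # planted signal, norm 1
    Z = rng.standard_normal((n, n, n))      # i.i.d. Gaussian noise (not symmetrized)
    T = lam * np.einsum('i,j,k->ijk', x, x, x) + Z
    return T, x

rng = np.random.default_rng(0)
T, x = spiked_tensor(n=30, lam=10.0, rng=rng)
```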

  16. Tensor Network Notation
      A graphical representation for tensors (used in, e.g., quantum physics).
      An order-$p$ tensor has $p$ "legs," one for each index:
      [Diagram: a node T with three legs i, j, k, representing $T = (T_{i,j,k})$]
      Two (or more) tensors can be attached by contracting indices:
      [Diagram: nodes T and U joined along a shared leg i, with free legs a, b, c, d, representing $B = (B_{a,b,c,d})$ where $B_{a,b,c,d} = \sum_i T_{a,c,i} U_{b,d,i}$]
      Rule: sum over "fully connected" indices (in this case, i).
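
The contraction in the second diagram translates directly into einsum notation. A quick transcription (assuming, for simplicity, that all modes share the same dimension $n$):

```python
import numpy as np

n = 5
T = np.random.randn(n, n, n)
U = np.random.randn(n, n, n)

# B_{a,b,c,d} = sum_i T_{a,c,i} U_{b,d,i}: the shared index i is contracted
# away, and the four free legs a, b, c, d remain as the indices of B.
B = np.einsum('aci,bdi->abcd', T, U)
assert B.shape == (n, n, n, n)
```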

  19. More Examples
      A bigger example:
      [Diagram: three copies of T and a vector u, with free legs a, b, c, d and contracted indices i, j, k, representing $B = (B_{a,b,c,d})$ where $B_{a,b,c,d} = \sum_{i,j,k} T_{a,c,j} T_{b,d,k} T_{i,j,k} u_i$]
      This framework generalizes matrix/vector multiplication:
      [Diagram: the chain x − A − B − y, representing $x^\top A B y = \sum_{i,j,k} x_i A_{ij} B_{jk} y_k$]
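
Both diagrams on this slide are again one-line einsum contractions; a quick sketch (dimensions and variable names are mine, with the slide's second matrix renamed C to avoid clashing with the tensor B):

```python
import numpy as np

n = 5
T = np.random.randn(n, n, n)
u = np.random.randn(n)

# The bigger example: B_{a,b,c,d} = sum_{i,j,k} T_{a,c,j} T_{b,d,k} T_{i,j,k} u_i.
B = np.einsum('acj,bdk,ijk,i->abcd', T, T, T, u)

# The chain x - A - B - y is ordinary matrix/vector multiplication.
x, y = np.random.randn(n), np.random.randn(n)
A, C = np.random.randn(n, n), np.random.randn(n, n)
val = np.einsum('i,ij,jk,k->', x, A, C, y)
assert np.isclose(val, x @ A @ C @ y)
```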

  20. II. Spectral Methods from Tensor Networks

  21. Spectral Methods from Tensor Networks
      General framework for solving tensor problems:
      1. Given the input tensor T.
      2. Build a new tensor B by connecting copies of T in a tensor network.
      3. Flatten B to form a symmetric matrix M.
         ◮ E.g., the ({a, b}, {c, d})-flattening of $B = (B_{a,b,c,d})$ is the $n^2 \times n^2$ matrix $M_{(a,b),(c,d)} = B_{a,b,c,d}$.
      4. Compute the leading eigenvector of M.
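
Here is an end-to-end sketch of the four steps on the order-3 spiked tensor model. This is my own instantiation: the network (two copies of $T$ contracted along one leg) corresponds to the "tensor unfolding" method on the next slide, not to the talk's main algorithm, and the parameters are chosen so the signal dominates the noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam = 40, 100.0                      # lam large enough for this method
x = rng.standard_normal(n)
x /= np.linalg.norm(x)

# Step 1: the input tensor T = lam * x^{⊗3} + Z.
T = lam * np.einsum('i,j,k->ijk', x, x, x) + rng.standard_normal((n, n, n))

# Step 2: a new tensor B from a network of two copies of T sharing one leg:
# B_{a,b,c,d} = sum_i T_{a,b,i} T_{c,d,i}.
B = np.einsum('abi,cdi->abcd', T, T)

# Step 3: the ({a,b},{c,d})-flattening, an n^2 x n^2 symmetric matrix.
M = B.reshape(n * n, n * n)

# Step 4: the leading eigenvector of M approximates vec(x x^T) up to sign;
# reshape it to n x n and take that matrix's top eigenvector to estimate x.
_, vecs = np.linalg.eigh(M)
top = vecs[:, -1].reshape(n, n)
top = (top + top.T) / 2
w, v = np.linalg.eigh(top)
x_hat = v[:, -1] if abs(w[-1]) >= abs(w[0]) else v[:, 0]
print(abs(x_hat @ x))                   # correlation with the truth, near 1
```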

  22. Prior Work
      Prior work has (implicitly) used this framework:
      [Diagrams: the tensor networks underlying each of the methods listed below, built from copies of T and, in one case, a vector u]
      ◮ [Richard–Montanari'14, Hopkins–Shi–Steurer'15] "Tensor unfolding"
      ◮ [Hopkins–Shi–Steurer'15] "Spectral SoS"
      ◮ [Hopkins–Schramm–Shi–Steurer'16] "Spectral SoS with partial trace"
      ◮ [Hopkins–Schramm–Shi–Steurer'16] "Spectral tensor decomposition"
      Here u is a random vector (used to break symmetry).

  24. Our Contribution
      We give the first polynomial-time algorithm for a particular tensor problem: heterogeneous continuous multi-reference alignment. The algorithm is a spectral method based on this tensor network:
      [Diagram: the algorithm's tensor network, built from several copies of T and a vector u, with free legs a, b, c, d]
      Smaller tensor networks fail for this problem.

  27. General Analysis of Tensor Networks
      The main step of the analysis is to upper bound the largest eigenvalue of a matrix built from a tensor network.
      Trace moment method: for a symmetric matrix $M$ with eigenvalues $\{\lambda_i\}$ and $\lambda_{\max} = \max_i |\lambda_i|$,
      $$\mathrm{Tr}(M^{2k}) = \sum_i \lambda_i^{2k} \geq \lambda_{\max}^{2k},$$
      so compute $\mathbb{E}[\mathrm{Tr}(M^{2k})]$ and apply Markov's inequality:
      $$\Pr(\lambda_{\max} \geq t) = \Pr(\lambda_{\max}^{2k} \geq t^{2k}) \leq \frac{\mathbb{E}[\mathrm{Tr}(M^{2k})]}{t^{2k}}.$$
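
A quick numerical sanity check of the displayed inequality (my own, not from the talk): for any symmetric $M$, $\mathrm{Tr}(M^{2k})$ upper-bounds $\lambda_{\max}^{2k}$, and the bound sharpens as $k$ grows.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
A = rng.standard_normal((n, n))
M = (A + A.T) / 2                       # a symmetric test matrix

lam_max = np.max(np.abs(np.linalg.eigvalsh(M)))
for k in (1, 2, 4, 8):
    tr = np.trace(np.linalg.matrix_power(M, 2 * k))
    assert tr >= lam_max ** (2 * k)
    # Tr(M^{2k})^{1/(2k)} decreases toward lam_max as k grows.
    print(k, tr ** (1 / (2 * k)), lam_max)
```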

  29. Trace Method for Tensor Networks
      Example: T is an order-3 symmetric tensor with i.i.d. Rademacher (uniform ±1) entries, and we want to compute $\mathbb{E}[\mathrm{Tr}(M^6)]$, where M is the ({a, b}, {c, d})-flattening of this tensor network:
      [Diagram: two copies of T contracted along one shared leg, with free legs a, b, c, d]
      Note that $\mathrm{Tr}(M^6)$ is itself a tensor network: a cycle of six copies of M. So plug in the definition of M...
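
For intuition, $\mathbb{E}[\mathrm{Tr}(M^6)]$ can be estimated by simulation. A sketch using a fully i.i.d. (non-symmetrized) Rademacher $T$, which is simpler than the symmetric tensor on the slide and changes the constants:

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 6, 500
est = 0.0
for _ in range(trials):
    T = rng.choice([-1.0, 1.0], size=(n, n, n))   # i.i.d. Rademacher entries
    # B_{a,b,c,d} = sum_i T_{a,b,i} T_{c,d,i}, flattened to n^2 x n^2.
    M = np.einsum('abi,cdi->abcd', T, T).reshape(n * n, n * n)
    est += np.trace(np.linalg.matrix_power(M, 6))
print(est / trials)   # Monte Carlo estimate of E[Tr(M^6)]
```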

  32. Trace Method for Tensor Networks (Continued)
      [Diagram: the cycle of six M's with each M expanded into its two copies of T, giving a cycle of twelve T's; internal edges carry index labels such as i, j, k]
      So the computation of $\mathbb{E}[\mathrm{Tr}(M^6)]$ is reduced to a combinatorial question about this diagram. When T is i.i.d. Rademacher: $\mathbb{E}[\mathrm{Tr}(M^6)]$ is the number of ways to label the edges of the diagram with elements of $[n]$ such that each triple $\{i, j, k\}$ appears incident to an even number of T's.
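
The reduction from an expectation to a counting problem can be verified by brute force for tiny $n$. The sketch below is mine and uses a fully i.i.d. (rather than symmetric) Rademacher $T$, so the parity condition is over ordered index triples instead of unordered ones; it checks that the exact value of $\mathbb{E}[\mathrm{Tr}(M^6)]$ equals the number of even-parity edge labelings of the twelve-copy diagram.

```python
import numpy as np
from itertools import product
from collections import Counter

n = 2  # small enough to enumerate everything exactly

# Exact expectation: average Tr(M^6) over all 2^(n^3) sign tensors T.
total = 0.0
for signs in product([-1.0, 1.0], repeat=n ** 3):
    T = np.array(signs).reshape(n, n, n)
    M = np.einsum('abi,cdi->abcd', T, T).reshape(n * n, n * n)
    total += np.trace(np.linalg.matrix_power(M, 6))
expectation = total / 2 ** (n ** 3)

# Combinatorial count: the Tr(M^6) diagram is a cycle of twelve T's, with
# edge variables a_0..a_5, b_0..b_5 (free legs glued around the cycle) and
# s_0..s_5 (the internal edge inside each copy of M). Count the labelings
# in which every entry of T appears an even number of times.
count = 0
for lab in product(range(n), repeat=18):
    a, b, s = lab[0:6], lab[6:12], lab[12:18]
    triples = Counter()
    for j in range(6):
        triples[(a[j], b[j], s[j])] += 1
        triples[(a[(j + 1) % 6], b[(j + 1) % 6], s[j])] += 1
    if all(m % 2 == 0 for m in triples.values()):
        count += 1

print(expectation, count)  # the two numbers agree
```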

  33. III. Orbit Recovery Problems

  35. Image Alignment
      Given many noisy rotated copies of an image, recover the image. [Image credit: Bandeira, PhD thesis '15]
      Application: cryo-EM (cryo-electron microscopy)
      ◮ Given many noisy pictures of a molecule taken from different unknown angles, recover the 3D structure of the molecule.

  37. Orbit Recovery
      Orbit Recovery Problem [APS17, BRW17, PWBRS17, BBKPWW17, APS18]:
      ◮ Let $x \in \mathbb{R}^n$ be an unknown "signal" (e.g., the image)
