slide-1
SLIDE 1

Spectral Methods from Tensor Networks

Alex Wein

Courant Institute, NYU Joint work with Ankur Moitra (MIT)

1 / 19

slide-6
SLIDE 6

Outline

◮ Tensors
◮ Statistical problems involving tensors
◮ A general framework for designing algorithms for tensor problems: “spectral methods from tensor networks”
◮ Orbit recovery: a certain class of tensor problems
◮ Structured tensor decomposition
◮ Main result: first polynomial-time algorithm for a certain orbit recovery problem

2 / 19

slide-7
SLIDE 7
I. Tensors and Tensor Networks

3 / 19

slide-10
SLIDE 10

What is a Tensor?

An order-p tensor is an n1 × n2 × · · · × np multi-array: T = (Ti1,i2,...,ip) with ij ∈ {1, 2, . . . , nj}.
An order-1 tensor is a vector. An order-2 tensor is a matrix.

T is symmetric if n1 = · · · = np = n and Ti1,...,ip = Tiπ(1),...,iπ(p) for any permutation π.

◮ In this talk, all tensors will be symmetric.

Given p vectors x1, . . . , xp, the rank-1 tensor x1 ⊗ x2 ⊗ · · · ⊗ xp has entries (x1 ⊗ x2 ⊗ · · · ⊗ xp)i1,...,ip = (x1)i1 (x2)i2 · · · (xp)ip.

◮ Generalizes the rank-1 matrix xy⊤.
◮ Symmetric version: x⊗p = x ⊗ · · · ⊗ x (p times).

4 / 19
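The rank-1 construction above maps directly to code. As a minimal illustration (numpy, not part of the talk), building x1 ⊗ x2 ⊗ x3 and the symmetric x⊗3 with np.einsum:

```python
import numpy as np

rng = np.random.default_rng(0)
x1, x2, x3 = rng.standard_normal((3, 4))  # three vectors in R^4

# (x1 ⊗ x2 ⊗ x3)_{i,j,k} = (x1)_i (x2)_j (x3)_k
T = np.einsum("i,j,k->ijk", x1, x2, x3)
assert T.shape == (4, 4, 4)
assert np.isclose(T[1, 2, 3], x1[1] * x2[2] * x3[3])

# Symmetric version x^{⊗3}: invariant under any permutation of its indices
S = np.einsum("i,j,k->ijk", x1, x1, x1)
assert np.allclose(S, S.transpose(1, 0, 2))
assert np.allclose(S, S.transpose(2, 1, 0))
```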

slide-13
SLIDE 13

Tensor Problems

Some statistical problems involving tensors:

◮ Tensor PCA / Spiked Tensor Model [RM’14, HSS’15]:
  Observe T = λ x⊗p + Z, where
  ◮ x ∈ Rn is the planted “signal” (norm 1)
  ◮ λ > 0 is the signal-to-noise parameter
  ◮ Z is “noise” (an i.i.d. Gaussian tensor)
  Goal: given T, recover x. “Recover a rank-1 tensor buried in noise.”

◮ Tensor Decomposition [AGJ’14, BKS’15, GM’15, HSSS’16, MSS’16]:
  Observe T = x1⊗p + · · · + xr⊗p, where the {xi} are random vectors:
  ◮ xi ∼ N(0, In)
  Goal: given T, recover {x1, . . . , xr}. “Recover the components of a rank-r tensor.”

5 / 19
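A small simulation of the spiked tensor model (a numpy sketch, not from the talk). Recovery here uses the simple “tensor unfolding” idea that appears later in the deck, and λ is taken large enough that it clearly succeeds:

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam = 30, 200.0  # dimension and signal-to-noise (large so recovery succeeds)

x = rng.standard_normal(n)
x /= np.linalg.norm(x)                 # unit-norm planted signal
Z = rng.standard_normal((n, n, n))     # i.i.d. Gaussian noise tensor
T = lam * np.einsum("i,j,k->ijk", x, x, x) + Z

# Tensor unfolding: flatten T into an n x n^2 matrix; its top left singular
# vector is (up to sign) an estimate of x
M = T.reshape(n, n * n)
u_svd, s, vt = np.linalg.svd(M, full_matrices=False)
x_hat = u_svd[:, 0]

corr = abs(x_hat @ x)  # correlation with the truth; close to 1 here
assert corr > 0.9
```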

slide-16
SLIDE 16

Tensor Network Notation

A graphical representation for tensors (used, e.g., in quantum physics).

An order-p tensor has p “legs”, one for each index:

[diagram: node T with legs i, j, k] ⇔ T = (Ti,j,k)

Two (or more) tensors can be attached by contracting indices:

[diagram: nodes T and U joined along leg i, with free legs a, b, c, d]

B = (Ba,b,c,d), where Ba,b,c,d = Σi Ta,c,i Ub,d,i

Rule: sum over “fully connected” indices (in this case, i).

6 / 19

slide-19
SLIDE 19

More Examples

A bigger example:

[diagram: three copies of T and a vector u, contracted along legs i, j, k, with free legs a, b, c, d]

B = (Ba,b,c,d), where Ba,b,c,d = Σi,j,k Ta,c,j Tb,d,k Ti,j,k ui

This framework generalizes matrix/vector multiplication:

x − A − B − y ⇔ x⊤ABy = Σi,j,k xi Aij Bjk yk

7 / 19
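Both contractions above are one-liners with np.einsum (an illustrative sketch, not from the talk; the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A3 = rng.standard_normal((n, n, n))
# symmetrize so T matches the talk's convention of symmetric tensors
T = sum(A3.transpose(p) for p in
        [(0,1,2), (0,2,1), (1,0,2), (1,2,0), (2,0,1), (2,1,0)]) / 6
u = rng.standard_normal(n)

# B_{a,b,c,d} = sum_{i,j,k} T_{a,c,j} T_{b,d,k} T_{i,j,k} u_i
B = np.einsum("acj,bdk,ijk,i->abcd", T, T, T, u)

# x - A - B - y  <=>  x^T A B y = sum_{i,j,k} x_i A_{ij} B_{jk} y_k
x, y = rng.standard_normal((2, n))
A, Bm = rng.standard_normal((2, n, n))
val = np.einsum("i,ij,jk,k->", x, A, Bm, y)
assert np.isclose(val, x @ A @ Bm @ y)
```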

slide-20
SLIDE 20
II. Spectral Methods from Tensor Networks

8 / 19

slide-21
SLIDE 21

Spectral Methods from Tensor Networks

General framework for solving tensor problems:

1. Given the input tensor T.
2. Build a new tensor B by connecting copies of T in a tensor network.
3. Flatten B to form a symmetric matrix M.
   ◮ E.g., the ({a, b}, {c, d})-flattening of B = (Ba,b,c,d) is the n² × n² matrix M(a,b),(c,d) = Ba,b,c,d.
4. Compute the leading eigenvector of M.

9 / 19
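The four steps can be sketched end to end for the simplest network, two copies of T joined on one leg (a numpy illustration, not the talk's actual network):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6

# Step 1: the input tensor T (order 3, made symmetric)
A3 = rng.standard_normal((n, n, n))
T = sum(A3.transpose(p) for p in
        [(0,1,2), (0,2,1), (1,0,2), (1,2,0), (2,0,1), (2,1,0)]) / 6

# Step 2: connect two copies of T on one leg:
# B_{a,b,c,d} = sum_i T_{a,c,i} T_{b,d,i}
B = np.einsum("aci,bdi->abcd", T, T)

# Step 3: the ({a,b},{c,d})-flattening into an n^2 x n^2 matrix,
# which is symmetric because T is symmetric
M = B.reshape(n * n, n * n)
assert np.allclose(M, M.T)

# Step 4: leading eigenvector of M
w, V = np.linalg.eigh(M)
v = V[:, np.argmax(np.abs(w))]
```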

slide-22
SLIDE 22

Prior Work

Prior work has (implicitly) used this framework:

[diagrams: two tensor networks built from copies of T (one also contracting a vector u), with free legs a, b, c, d]

◮ [Richard–Montanari’14, Hopkins–Shi–Steurer’15] “Tensor unfolding”
◮ [Hopkins–Shi–Steurer’15] “Spectral SoS”
◮ [Hopkins–Schramm–Shi–Steurer’16] “Spectral SoS with partial trace”
◮ [Hopkins–Schramm–Shi–Steurer’16] “Spectral tensor decomposition”

u is a random vector (to break symmetry).

10 / 19

slide-24
SLIDE 24

Our Contribution

We give the first polynomial-time algorithm for a particular tensor problem: heterogeneous continuous multi-reference alignment.

The algorithm is a spectral method based on this tensor network:

[diagram: nine copies of T and a vector u contracted in a network, with free legs a, c, b, d]

Smaller tensor networks fail for this problem.

11 / 19

slide-27
SLIDE 27

General Analysis of Tensor Networks

The main step of the analysis is to upper bound the largest eigenvalue of a matrix built from a tensor network.

Trace moment method: for a symmetric matrix M with eigenvalues {λi} and λmax = maxi |λi|,

Tr(M^{2k}) = Σi λi^{2k} ≥ λmax^{2k},

so compute E[Tr(M^{2k})] and apply Markov’s inequality:

P(λmax ≥ t) = P(λmax^{2k} ≥ t^{2k}) ≤ E[Tr(M^{2k})] / t^{2k}.

12 / 19
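A quick numerical check of the trace moment method (numpy sketch, not from the talk): Tr(M^{2k})^{1/(2k)} always upper-bounds λmax, and since Tr(M^{2k}) ≤ n · λmax^{2k}, the bound is loose by at most a factor n^{1/(2k)}, which tends to 1 as k grows:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
A = rng.standard_normal((n, n))
M = (A + A.T) / np.sqrt(2)   # random symmetric matrix

lam_max = np.max(np.abs(np.linalg.eigvalsh(M)))

for k in (1, 2, 4, 8):
    # Tr(M^{2k}) = sum_i lambda_i^{2k} >= lam_max^{2k}
    bound = np.trace(np.linalg.matrix_power(M, 2 * k)) ** (1 / (2 * k))
    assert bound >= lam_max - 1e-8
    # Tr(M^{2k}) <= n * lam_max^{2k}, so the bound is off by <= n^{1/(2k)}
    assert bound <= lam_max * n ** (1 / (2 * k)) + 1e-8
```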

slide-29
SLIDE 29

Trace Method for Tensor Networks

Example: T is an order-3 symmetric tensor with i.i.d. Rademacher (uniform ±1) entries, and we want to compute E[Tr(M^6)] where M is the ({a, b}, {c, d})-flattening of this tensor network:

[diagram: two copies of T contracted along one leg, with free legs a, b, c, d]

Note that Tr(M^6) is the contraction of six copies of M in a cycle, so plug in the definition of M...

13 / 19

slide-32
SLIDE 32

Trace Method for Tensor Networks (Continued)

[diagram: Tr(M^6) drawn as twelve copies of T contracted in a cycle, with edge labels i, j, k]

So the computation of E[Tr(M^6)] is reduced to a combinatorial question about this diagram. When T is i.i.d. Rademacher: E[Tr(M^6)] is the number of ways to label the edges of the diagram with elements of [n] such that each triple {i, j, k} appears incident to an even number of T’s. (This is because a Rademacher entry satisfies E[T_{i,j,k}^m] = 1 for even m and 0 for odd m, so only labelings in which every entry occurs an even number of times survive the expectation.)

14 / 19

slide-33
SLIDE 33
III. Orbit Recovery Problems

15 / 19

slide-35
SLIDE 35

Image Alignment

Given many noisy rotated copies of an image, recover the image.

Image credit: [Bandeira, PhD thesis ’15]

Application: cryo-EM (cryo-electron microscopy)

◮ Given many noisy pictures of a molecule taken from different unknown angles, recover the 3D structure of the molecule.

16 / 19

slide-44
SLIDE 44

Orbit Recovery

Orbit Recovery Problem [APS17, BRW17, PWBRS17, BBKPWW17, APS18]:

◮ Let x ∈ Rn be an unknown “signal” (e.g. the image)
◮ Let G be a compact group acting on Rn (e.g. rotations SO(2))
◮ Observe samples yi = gi · x + zi where gi ∼ G, zi ∼ N(0, In)
◮ Goal: recover the orbit of x (we can’t distinguish x from g · x)
◮ Heterogeneous version: signals x1, . . . , xK; samples yi = gi · xki + zi

This paper: heterogeneous continuous multi-reference alignment

◮ Each signal xk is a random real-valued (band-limited) function on the unit circle
◮ G = SO(2), acting by rotation

17 / 19
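A discretized sketch of the heterogeneous observation model (numpy, not from the talk; the variable names and the stand-in of cyclic shifts for SO(2) on a length-n grid are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
n, K, m, sigma = 32, 2, 1000, 0.5   # grid size, #signals, #samples, noise level

# K unknown signals on a length-n grid (a discretization of the circle)
signals = rng.standard_normal((K, n))

samples = np.empty((m, n))
for i in range(m):
    k = rng.integers(K)    # which signal this sample came from (heterogeneity)
    g = rng.integers(n)    # unknown group element: here a cyclic shift
    samples[i] = np.roll(signals[k], g) + sigma * rng.standard_normal(n)

# Only the orbit of each signal is identifiable, so any useful statistic must
# be shift-invariant, e.g. the averaged power spectrum |FFT|^2
ps = np.mean(np.abs(np.fft.fft(samples, axis=1)) ** 2, axis=0)
```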

slide-47
SLIDE 47

Our algorithm

Method of moments: use the samples to estimate the 3rd moment tensor:

Ei[yi⊗3] ⇒ T = Σ_{k=1}^{K} E_{g∼SO(2)} [(g · xk)⊗3].

Plug T (and a random vector u) into the tensor network, and compute the leading eigenvector:

[diagram: nine copies of T and the vector u contracted in a network, with free legs a, c, b, d]

Our algorithm gives:

◮ optimal sample complexity
◮ heterogeneity K ≤ n^δ (the optimal should be n^{1/2})
◮ list recovery of {xk}
◮ the first solution to a heterogeneous problem over an infinite group

18 / 19
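The empirical 3rd moment tensor Ei[yi⊗3] can be sketched as a single einsum over the samples (numpy illustration; the sample generation is a hypothetical stand-in, and the debiasing of Gaussian-noise terms that a real estimator would need is omitted):

```python
import numpy as np

rng = np.random.default_rng(6)
n, m, sigma = 16, 500, 0.1
x = rng.standard_normal(n)

# samples y_i = g_i . x + z_i, with cyclic shifts standing in for SO(2)
Y = np.stack([np.roll(x, rng.integers(n)) + sigma * rng.standard_normal(n)
              for _ in range(m)])

# empirical 3rd moment tensor: T_hat = (1/m) sum_i y_i ⊗ y_i ⊗ y_i
T_hat = np.einsum("si,sj,sk->ijk", Y, Y, Y) / m
assert T_hat.shape == (n, n, n)
assert np.allclose(T_hat, T_hat.transpose(1, 0, 2))  # symmetric by construction
```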

slide-49
SLIDE 49

Summary

◮ A general framework for designing spectral algorithms for tensor problems
◮ Tensor network notation makes general analysis tractable
◮ First polynomial-time algorithm for a certain continuous tensor decomposition problem (heterogeneous continuous MRA)
◮ Orbit recovery problems are in need of further theoretical study
  ◮ All groups (especially infinite groups)
  ◮ Optimal heterogeneity

Thanks!

19 / 19