Tensors, Lek-Heng Lim, Statistics Department Retreat, October 27, 2012


SLIDE 1

Tensors

Lek-Heng Lim

Statistics Department Retreat

October 27, 2012

Thanks: NSF DMS 1209136 and DMS 1057064

L.-H. Lim (Stat Retreat) Tensors October 27, 2012 1 / 20

SLIDE 2

tensors on one foot

a tensor is a multilinear functional f : V1 × · · · × Vd → C

if we give f coordinates, we get a hypermatrix A = (aj1···jd) ∈ Cn1×···×nd, where n1 = dim V1, . . . , nd = dim Vd

a d-dimensional hypermatrix represents a d-tensor the same way a matrix represents a 2-tensor (i.e. linear operators, bilinear forms, bivectors)

for more info:

◮ P. McCullagh, Tensor Methods in Statistics, Chapman and Hall, London, 1987.

◮ plug: L.-H. Lim, “Tensors,” in L. Hogben (Ed.), Handbook of Linear Algebra, 2nd Ed., CRC Press, Boca Raton, FL, 2013.

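The tensor/hypermatrix correspondence is easy to play with numerically; a minimal numpy sketch (array names are mine, not from the slide), checking multilinearity in the first argument:

```python
import numpy as np

# a 3-tensor f: V1 x V2 x V3 -> C in coordinates: a hypermatrix A of shape
# (n1, n2, n3), with f(u, v, w) = sum_{i,j,k} a_ijk u_i v_j w_k
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4))

def f(u, v, w):
    return np.einsum('ijk,i,j,k->', A, u, v, w)

u, u2 = rng.standard_normal(2), rng.standard_normal(2)
v, w = rng.standard_normal(3), rng.standard_normal(4)

# multilinearity in the first slot: f(2u + 3u', v, w) = 2 f(u,v,w) + 3 f(u',v,w)
lhs = f(2.0 * u + 3.0 * u2, v, w)
rhs = 2.0 * f(u, v, w) + 3.0 * f(u2, v, w)
assert np.isclose(lhs, rhs)
```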

SLIDE 3

where do we find tensors?

higher-order derivatives: f(x) ∈ R, ∇f(x) ∈ Rn, ∇2f(x) ∈ Rn×n, ∇3f(x) ∈ Rn×n×n, ∇4f(x) ∈ Rn×n×n×n, . . .

multivariate moments and cumulants [Fisher-Wishart, 1932]:

log E(exp(i⟨t, x⟩)) = Σ_{|α|≥1} i^{|α|} κα(x) tα / α!

coefficients are symmetric tensors: (κα(x))_{|α|=1} ∈ Cp, (κα(x))_{|α|=2} ∈ Cp×p, (κα(x))_{|α|=3} ∈ Cp×p×p, (κα(x))_{|α|=4} ∈ Cp×p×p×p, . . .

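The low-order cumulant hypermatrices are simple to estimate from data; a sketch (the plain 1/n normalization and all names are my choices). After mean-centering, the third cumulant is just the third central moment, which vanishes for Gaussian samples:

```python
import numpy as np

# empirical cumulants of a p-variate sample X (n x p):
# kappa_1 = mean vector, kappa_2 = covariance matrix,
# kappa_3 = third central moment, a p x p x p symmetric hypermatrix
rng = np.random.default_rng(1)
n, p = 100_000, 3
X = rng.standard_normal((n, p))                       # Gaussian sample
Xc = X - X.mean(axis=0)

kappa2 = (Xc.T @ Xc) / n                              # p x p
kappa3 = np.einsum('ni,nj,nk->ijk', Xc, Xc, Xc) / n   # p x p x p

# symmetric under permutations of the indices
assert np.allclose(kappa3, kappa3.transpose(1, 0, 2))
# Gaussian data: all cumulants of order >= 3 vanish, up to sampling noise
assert np.abs(kappa3).max() < 0.1
```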

SLIDE 4

where do we find tensors?

quantum mechanics

◮ H1, . . . , Hk state spaces, state space of unified system is

H ⊆ H1 ⊗ · · · ⊗ Hk

◮ H contains factorizable states ψ1 ⊗ · · · ⊗ ψk but also mixed states

αψ1 ⊗ · · · ⊗ ψk + · · · + βϕ1 ⊗ · · · ⊗ ϕk

◮ Hj : Hj → Hj Hamiltonian of jth system and I identity operator

H1 ⊗ I ⊗ · · · ⊗ I + I ⊗ H2 ⊗ · · · ⊗ I + · · · + I ⊗ · · · ⊗ I ⊗ Hk Hamiltonian of unified system provided systems do not interact
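The non-interacting Hamiltonian above is a Kronecker sum; a small numpy sketch for k = 2 (random symmetric matrices stand in for actual Hamiltonians), verifying the standard fact that its spectrum consists of all sums λi(H1) + µj(H2):

```python
import numpy as np

# Hamiltonian of a non-interacting two-part system: H = H1 (x) I + I (x) H2
rng = np.random.default_rng(2)
H1 = rng.standard_normal((2, 2)); H1 = (H1 + H1.T) / 2
H2 = rng.standard_normal((3, 3)); H2 = (H2 + H2.T) / 2
I2, I3 = np.eye(2), np.eye(3)

H = np.kron(H1, I3) + np.kron(I2, H2)

# its eigenvalues are exactly the pairwise sums of the factors' eigenvalues
e1, e2 = np.linalg.eigvalsh(H1), np.linalg.eigvalsh(H2)
sums = np.sort((e1[:, None] + e2[None, :]).ravel())
assert np.allclose(np.sort(np.linalg.eigvalsh(H)), sums)
```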

self-concordance in convex optimization: ∇3f(x) ⊗ ∇3f(x) ⪯ 4 ∇2f(x) ⊗ ∇2f(x) ⊗ ∇2f(x)


SLIDE 5

what can we do with a single tensor?

rank
hyperdeterminant
various decompositions
systems of multilinear equations
multilinear programming
multilinear least squares
eigenvalues and eigenvectors
singular values and singular vectors
Gaussian elimination and QR factorization
nonnegative tensors and Perron-Frobenius theory
spectral, operator, Hölder, Schatten, Ky Fan norms
symmetric positive definite tensors and Cholesky decomposition
linear preservers of rank, hyperdeterminant, singular values, and eigenvalues


SLIDE 6

why study tensors?

a rich source of new problems

◮ hypermatrix analogues of matrix notions
◮ problems trivial for matrices become non-trivial

a rich source of tools for known applications

◮ quantum systems
◮ holographic algorithms
◮ algebraic complexity of matrix multiplication and inversion

a rich source of tools for new applications

◮ causal inference
◮ phylogenetic inference
◮ higher-order optimization theory
◮ principal components of higher-order moments and cumulants
◮ spectral hypergraph theory
◮ encoding NP-hard and #P-hard problems
◮ multiarray signal processing
◮ diffusion MRI imaging

caveat: there will be obstacles


SLIDE 7

tensor rank

rank of A ∈ Cl×m×n [Hitchcock, 1927] is

rank(A) := min{ r : A = Σ_{i=1}^r σi ui ⊗ vi ⊗ wi }

computational complexity: Strassen matrix multiplication/inversion,

inf{ ω : rank⊗( Σ_{i,j,k=1}^n ϕik ⊗ ϕkj ⊗ Eij ) = O(n^ω) } = 2?

quantum computing: algebraic measure of entanglement,

|GHZ⟩ = |0⟩ ⊗ |0⟩ ⊗ |0⟩ + |1⟩ ⊗ |1⟩ ⊗ |1⟩ ∈ C2×2×2

machine learning: naïve Bayes model,

Pr(x, y, z) = Σ_h Pr(h) Pr(x | h) Pr(y | h) Pr(z | h)

[Figure: graphical model with hidden node H and observed nodes X, Y, Z]
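A quick numerical illustration (a sketch of mine, not from the slide): build the GHZ tensor as a sum of two rank-one terms and check a standard necessary condition, that every flattening of a rank-r tensor has matrix rank at most r:

```python
import numpy as np

# GHZ tensor |000> + |111> in C^{2x2x2}: a sum of two rank-one terms
e0, e1 = np.eye(2)
GHZ = np.einsum('i,j,k->ijk', e0, e0, e0) + np.einsum('i,j,k->ijk', e1, e1, e1)

# every flattening of a rank-r tensor has matrix rank <= r;
# here all three 2 x 4 unfoldings have rank exactly 2
for axis in range(3):
    flat = np.moveaxis(GHZ, axis, 0).reshape(2, 4)
    assert np.linalg.matrix_rank(flat) == 2
```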

SLIDE 8

example: phylogenetic invariants

Markov model for evolution of a 3-taxon tree [Allman-Rhodes, 2006]

probability distribution given by a 4 × 4 × 4 table with model

P = πA ρA ⊗ σA ⊗ θA + πC ρC ⊗ σC ⊗ θC + πG ρG ⊗ σG ⊗ θG + πT ρT ⊗ σT ⊗ θT

i.e. for i, j, k ∈ {A, C, G, T},

pijk = πA ρAi σAj θAk + πC ρCi σCj θCk + πG ρGi σGj θGk + πT ρTi σTj θTk

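The Allman-Rhodes model is a sum of four rank-one probability tensors; a numpy sketch with random stochastic parameters (the helper `simplex` and all names are mine):

```python
import numpy as np

# P = sum_b pi_b * rho_b (x) sigma_b (x) theta_b over bases b in {A, C, G, T}
rng = np.random.default_rng(3)

def simplex(n):
    # a random probability vector of length n
    v = rng.random(n)
    return v / v.sum()

pi = simplex(4)
rho, sigma, theta = (np.array([simplex(4) for _ in range(4)]) for _ in range(3))

P = np.einsum('b,bi,bj,bk->ijk', pi, rho, sigma, theta)

# P is a joint distribution over {A,C,G,T}^3, a 4 x 4 x 4 tensor of rank <= 4
assert np.isclose(P.sum(), 1.0)
assert (P >= 0).all()
```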

SLIDE 9

multilinear systems and hyperdeterminants

hyperdeterminant of A = (aijk) ∈ R2×2×2 [Cayley, 1845]: writing the slices as M0 = [a000 a010; a001 a011] and M1 = [a100 a110; a101 a111],

Det2,2,2(A) = (1/4) [ det(M0 + M1) − det(M0 − M1) ]^2 − 4 det(M0) det(M1)

a result that parallels the matrix case: the system of bilinear equations

a000x0y0 + a010x0y1 + a100x1y0 + a110x1y1 = 0,
a001x0y0 + a011x0y1 + a101x1y0 + a111x1y1 = 0,
a000x0z0 + a001x0z1 + a100x1z0 + a101x1z1 = 0,
a010x0z0 + a011x0z1 + a110x1z0 + a111x1z1 = 0,
a000y0z0 + a001y0z1 + a010y1z0 + a011y1z1 = 0,
a100y0z0 + a101y0z1 + a110y1z0 + a111y1z1 = 0

has a non-trivial solution iff Det2,2,2(A) = 0

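Equivalently, Det2,2,2(A) is the discriminant of the quadratic x ↦ det(M0 + xM1) in the slices M0 = (a0jk), M1 = (a1jk); a numpy sketch (example tensors mine) checking that it vanishes on a degenerate rank-one tensor and is nonzero on |000⟩ + |111⟩:

```python
import numpy as np

# 2x2x2 hyperdeterminant as the discriminant of x -> det(M0 + x*M1)
def hyperdet222(A):
    M0, M1 = A[0], A[1]
    d = np.linalg.det
    return 0.25 * (d(M0 + M1) - d(M0 - M1)) ** 2 - 4.0 * d(M0) * d(M1)

# a rank-one tensor u (x) v (x) w is degenerate, so its hyperdeterminant is 0
rng = np.random.default_rng(4)
u, v, w = (rng.standard_normal(2) for _ in range(3))
A1 = np.einsum('i,j,k->ijk', u, v, w)
assert abs(hyperdet222(A1)) < 1e-10

# the GHZ tensor |000> + |111> is nondegenerate: its hyperdeterminant is 1
e0, e1 = np.eye(2)
GHZ = np.einsum('i,j,k->ijk', e0, e0, e0) + np.einsum('i,j,k->ijk', e1, e1, e1)
assert np.isclose(hyperdet222(GHZ), 1.0)
```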

SLIDE 10

eigenvalues and singular values of tensors

eigenvalues and singular values are Lagrange multipliers

eigenvalues/vectors of S = (sijk) ∈ S3(Cn): cubic Rayleigh quotient [LHL, 2005; Qi, 2005]

S(x, x, x) = Σ_{i,j,k=1}^n sijk xi xj xk

constrained to the unit ℓ3-sphere ‖x‖3 = 1

singular values/vectors of A = (aijk) ∈ Cl×m×n: trilinear Rayleigh quotient [LHL, 2005]

A(x, y, z) = Σ_{i,j,k=1}^{l,m,n} aijk xi yj zk

constrained to a product of unit ℓ3-spheres ‖x‖3 = ‖y‖3 = ‖z‖3 = 1

Perron-Frobenius theorem for nonnegative tensors [LHL, 2005], [Chang-Pearson-Zhang, 2010], [Friedland-Gaubert-Han, 2012]

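For intuition, the Lagrange condition for the cubic Rayleigh quotient reads Σ_{j,k} sijk xj xk = λ xi² on the unit ℓ3-sphere; a sketch verifying it in closed form for S = v ⊗ v ⊗ v with v > 0 (this example is mine, not from the slide):

```python
import numpy as np

# for S = v (x) v (x) v with v > 0, S(., x, x)_i = v_i * (v.x)^2, so
# x_i proportional to sqrt(v_i) solves  sum_{j,k} s_ijk x_j x_k = lambda * x_i^2
rng = np.random.default_rng(5)
v = rng.random(4) + 0.5
S = np.einsum('i,j,k->ijk', v, v, v)

x = np.sqrt(v)
x /= np.cbrt((x ** 3).sum())              # normalize to the unit l^3-sphere
lhs = np.einsum('ijk,j,k->i', S, x, x)    # S(., x, x)
lam = lhs[0] / x[0] ** 2                  # the eigenvalue (Lagrange multiplier)
assert np.allclose(lhs, lam * x ** 2)
assert np.isclose((np.abs(x) ** 3).sum(), 1.0)
```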

SLIDE 11

tensor norms

operator norm of A ∈ Cl×m×n:

‖A‖2,2,2 = max_{x,y,z≠0} |A(x, y, z)| / (‖x‖ ‖y‖ ‖z‖) = σmax(A),

i.e. it equals the largest singular value of A

Schatten and Ky Fan norms [LHL-Comon, 2012]:

‖A‖∗,p := inf{ ( Σ_{i=1}^r |λi|^p )^{1/p} : A = Σ_{i=1}^r λi ui ⊗ vi ⊗ wi, ‖ui‖ = ‖vi‖ = ‖wi‖ = 1, r ∈ N }

one interesting property [LHL-Comon, 2012]:

‖A‖∗,1 ≤ rank(A) ‖A‖∗,∞,

the analogue of ‖v‖1 ≤ ‖v‖0 ‖v‖∞ and ‖M‖∗ ≤ rank(M) ‖M‖2 for v ∈ Cn and M ∈ Cm×n

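A sketch of the operator norm for a rank-one tensor σ u ⊗ v ⊗ w with unit factors (example mine): ‖A‖2,2,2 = σ is attained at (u, v, w), and Cauchy-Schwarz applied factor by factor bounds every other unit triple:

```python
import numpy as np

rng = np.random.default_rng(6)
def unit(n):
    x = rng.standard_normal(n)
    return x / np.linalg.norm(x)

# A = sigma * u (x) v (x) w with unit vectors: operator norm is sigma
u, v, w, sigma = unit(3), unit(4), unit(5), 2.5
A = sigma * np.einsum('i,j,k->ijk', u, v, w)

val = np.einsum('ijk,i,j,k->', A, u, v, w)     # A(u, v, w) attains the max
assert np.isclose(val, sigma)

# any other unit triple gives |A(x,y,z)| = sigma |<u,x><v,y><w,z>| <= sigma
x, y, z = unit(3), unit(4), unit(5)
assert abs(np.einsum('ijk,i,j,k->', A, x, y, z)) <= sigma + 1e-12
```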

SLIDE 12

most tensor problems are NP-hard

[Figure: complexity classes P ⊆ NP, NP-complete, NP-hard, contrasting matrix problems with tensor problems]

some have no FPTAS
some are NP-hard even to approximate
some are #P-hard
some are undecidable

C.J. Hillar and L.-H. Lim, “Most tensor problems are NP-hard,” J. Assoc. Comput. Mach., to appear.


SLIDE 13

3-coloring encoded as tensor problem

[Figure: a graph on vertices 1, 2, 3, 4, shown twice]

3-colorings of the left graph can be encoded as nonzero real solutions to the following square system of n = 35 quadratic polynomials in 35 real unknowns ai, bi, ci, di (i = 1, . . . , 4), u, wi (i = 1, . . . , 18):

a1c1 − b1d1 − u^2,  b1c1 + a1d1,  c1u − a1^2 + b1^2,  d1u − 2a1b1,  a1u − c1^2 + d1^2,  b1u − 2d1c1,
a2c2 − b2d2 − u^2,  b2c2 + a2d2,  c2u − a2^2 + b2^2,  d2u − 2a2b2,  a2u − c2^2 + d2^2,  b2u − 2d2c2,
a3c3 − b3d3 − u^2,  b3c3 + a3d3,  c3u − a3^2 + b3^2,  d3u − 2a3b3,  a3u − c3^2 + d3^2,  b3u − 2d3c3,
a4c4 − b4d4 − u^2,  b4c4 + a4d4,  c4u − a4^2 + b4^2,  d4u − 2a4b4,  a4u − c4^2 + d4^2,  b4u − 2d4c4,
a1^2 − b1^2 + a1a3 − b1b3 + a3^2 − b3^2,  a1^2 − b1^2 + a1a4 − b1b4 + a4^2 − b4^2,  a1^2 − b1^2 + a1a2 − b1b2 + a2^2 − b2^2,
a2^2 − b2^2 + a2a3 − b2b3 + a3^2 − b3^2,  a3^2 − b3^2 + a3a4 − b3b4 + a4^2 − b4^2,  2a1b1 + a1b2 + a2b1 + 2a2b2,
2a2b2 + a2b3 + a3b2 + 2a3b3,  2a1b1 + a1b3 + a3b1 + 2a3b3,  2a1b1 + a1b4 + a4b1 + 2a4b4,  2a3b3 + a3b4 + a4b3 + 2a4b4,
w1^2 + w2^2 + · · · + w17^2 + w18^2

equivalent to checking whether a bilinear system has a non-trivial solution:

y^T Ak z = 0,  x^T Bk z = 0,  x^T Ck y = 0,  k = 1, . . . , n

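Behind the encoding is the classical cube-roots-of-unity trick: assign each vertex a color x with x³ = 1; then adjacent colors differ iff xi² + xixj + xj² = 0, since xi³ − xj³ = (xi − xj)(xi² + xixj + xj²). A small sketch (the edge list and coloring below are illustrative assumptions, not read from the slide):

```python
import numpy as np

# colors are cube roots of unity: 1, omega, omega^2
omega = np.exp(2j * np.pi / 3)
colors = {0: 1, 1: omega, 2: omega ** 2, 3: omega}   # a proper 3-coloring
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]     # a small example graph

for i, j in edges:
    xi, xj = colors[i], colors[j]
    assert abs(xi ** 3 - 1) < 1e-12                  # every color is a cube root
    # properly colored edge: xi != xj kills the second factor of xi^3 - xj^3
    assert abs(xi ** 2 + xi * xj + xj ** 2) < 1e-12
```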

SLIDE 14

spectral hypergraph theory

G = (V, E) is a 3-hypergraph: V vertices, E hyperedges

adjacency hypermatrix A ∈ Cn×n×n, with aijk = 1 if [i, j, k] ∈ E, and 0 otherwise

Lemma (L, 2007)

Let G be an m-regular 3-hypergraph and A its adjacency hypermatrix. Then
1. m is an eigenvalue of A;
2. if λ is an eigenvalue of A, then |λ| ≤ m;
3. m has multiplicity 1 if and only if G is connected.

Lemma (L, 2007)

Let G be a connected m-regular k-partite k-hypergraph on n vertices. Then
1. if k ≡ 1 (mod 4), every eigenvalue of A occurs with multiplicity a multiple of k;
2. if k ≡ 3 (mod 4), the spectrum of A is symmetric, i.e. λ is an eigenvalue iff −λ is.
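A sketch of the adjacency hypermatrix for a small regular 3-hypergraph (the complete 3-uniform hypergraph on 4 vertices, which is 3-regular), checking that the all-ones vector is an eigenvector; the exact eigenvalue normalization (m vs. 2m) depends on the convention for aijk, which this sketch leaves aside:

```python
import numpy as np
from itertools import combinations, permutations

# symmetric adjacency hypermatrix: a_ijk = 1 on every permutation of an edge
n = 4
E = list(combinations(range(n), 3))        # all 4 triples: complete 3-hypergraph
A = np.zeros((n, n, n))
for e in E:
    for i, j, k in permutations(e):
        A[i, j, k] = 1.0

ones = np.ones(n)
y = np.einsum('ijk,j,k->i', A, ones, ones)
# entry i counts ordered pairs (j, k) completing an edge at i, i.e. 2 * degree;
# constant for a regular hypergraph, so the all-ones vector is an eigenvector
assert np.allclose(y, y[0])
assert y[0] == 2 * 3                       # 2 * (degree m = 3)
```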

SLIDE 15

higher order optimization

first- and second-order conditions for a local minimum:

necessary: ∇f(x) = 0, ∇2f(x) ⪰ 0
sufficient: ∇f(x) = 0, ∇2f(x) ≻ 0

for a local minimum at x, wlog ∇2f(x) = diag(a11, . . . , app, 0, . . . , 0) with a11, . . . , app > 0

◮ A ∈ Rp×p: (1, 1)-block of ∇2f(x)
◮ B ∈ R(n−p)×(n−p)×(n−p): (2, 2, 2)-block of ∇3f(x)
◮ B′ ∈ Rp×(n−p)×(n−p): (1, 2, 2)-block of ∇3f(x)
◮ C ∈ R(n−p)×(n−p)×(n−p)×(n−p): (2, 2, 2, 2)-block of ∇4f(x)

third- and fourth-order conditions for a local minimum:

necessary: B = 0, 4C − ⟨A^−1, B′ ⊗ B′⟩ ⪰ 0
sufficient: B = 0, 4C − ⟨A^−1, B′ ⊗ B′⟩ ≻ 0

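A toy numerical check (example mine): at the degenerate minimum of f(x, y) = x² + y⁴ the Hessian's null direction is y, the B-block (the third derivative along y) vanishes as the necessary condition requires, and the fourth derivative along y is positive:

```python
import numpy as np

def f(x, y):
    return x ** 2 + y ** 4     # local min at the origin; Hessian diag(2, 0)

h = 1e-2
# central difference for d^3 f / dy^3 at (0, 0): the B-block must vanish
d3 = (f(0, 2*h) - 2*f(0, h) + 2*f(0, -h) - f(0, -2*h)) / (2 * h ** 3)
assert abs(d3) < 1e-8

# central difference for d^4 f / dy^4 at (0, 0): equals 24 > 0, certifying
# the minimum in the degenerate direction
d4 = (f(0, 2*h) - 4*f(0, h) + 6*f(0, 0) - 4*f(0, -h) + f(0, -2*h)) / h ** 4
assert np.isclose(d4, 24.0)
```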

SLIDE 16

mapping the connectome

identify neural fibers as accurately as possible from diffusion MRI data

[Figure: fODF maxima; Schultz-Seidel]


SLIDE 17

mapping the connectome

after preprocessing, may regard the signal as a function f : S2 → R

f is a homogeneous polynomial of even degree p,

f(x) = Σ_{j1,...,jp=1}^n aj1···jp xj1 · · · xjp ∈ R[x1, . . . , xn]p

coefficients form a hypermatrix A = (aj1···jp) ∈ Rn×n×···×n

the model mandates that f be a sum of powers of linear forms:

f(x) = Σ_{i=1}^r (vi^T x)^p

equivalently, A has a ‘Cholesky decomposition’:

A = Σ_{i=1}^r vi^⊗p

vi gives the direction of the ith fiber in a voxel [Schultz-Seidel, 2008], [Schultz-Fuster-Ghosh-Florack-Deriche-LHL, 2012], [LHL-Schultz, 2012]

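A numpy sketch of the ‘Cholesky decomposition’ for p = 4 (random directions stand in for fiber directions; names mine): the hypermatrix A = Σi vi^⊗4 and the polynomial Σi (vi^T x)^4 agree:

```python
import numpy as np

# symmetric 4th-order hypermatrix built from rank-one powers of r directions
rng = np.random.default_rng(7)
r, n = 2, 3
V = rng.standard_normal((r, n))          # rows are the directions v_i

A = np.einsum('ri,rj,rk,rl->ijkl', V, V, V, V)   # A = sum_i v_i^{(x)4}

# evaluating A as a quartic form equals the sum-of-powers representation
x = rng.standard_normal(n)
f_tensor = np.einsum('ijkl,i,j,k,l->', A, x, x, x, x)
f_powers = ((V @ x) ** 4).sum()
assert np.isclose(f_tensor, f_powers)
```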

SLIDE 18

principal components for higher-order cumulants

[Figure: two scatter plots over 50 components; left: PCA right singular vectors, Comp 1 vs Comp 2; right: principal kurtosis components, Comp 1 vs Comp 2]

Figure: 17 and 39 non-Gaussian; all others Gaussian; left: 1st vs 2nd principal components; right: 1st vs 2nd principal kurtosis components [LHL-Morton, 2012]


SLIDE 19

multiarray signal processing

ith sensor, i = 1, . . . , l, impinged upon by r narrowband waves transmitted by independent radiating sources through a linear stationary medium

assumption: arrays may overlap but differ only by translations

[Figure: array geometries (a), (b), (c)]

signal received by the ith sensor in the jth array, j = 1, . . . , m:

si,j(k) = Σ_{p=1}^r σp(tk) εi,j(θp)

the assumption implies that i and j decouple [LHL-Comon, 2010, 2012]:

εi,j(θp) = εi,1(θp) ϕ(j, p)

may identify individual signals using low-rank tensor approximation

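A sketch of the decoupling (all names and sizes mine): stacking the model into an l × m × K array S with S[i, j, k] = Σp σp(tk) εi,1(θp) ϕ(j, p) makes it a tensor of rank at most r, which its unfoldings confirm:

```python
import numpy as np

# separable sensor model: one rank-one term per radiating source
rng = np.random.default_rng(8)
l, m, K, r = 5, 4, 6, 2
eps1 = rng.standard_normal((l, r))       # eps_{i,1}(theta_p): sensor gains
phi = rng.standard_normal((m, r))        # phi(j, p): array translation factors
sig = rng.standard_normal((K, r))        # sigma_p(t_k): source signals

S = np.einsum('ip,jp,kp->ijk', eps1, phi, sig)

# every unfolding of a rank-r tensor has matrix rank <= r
for axis, dim in zip(range(3), (l, m, K)):
    flat = np.moveaxis(S, axis, 0).reshape(dim, -1)
    assert np.linalg.matrix_rank(flat) <= r
```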

SLIDE 20

more plugs

C.J. Hillar and L.-H. Lim, “Most tensor problems are NP-hard,” J. Assoc. Comput. Mach., to appear.

L.-H. Lim, “Tensors,” in L. Hogben (Ed.), Handbook of Linear Algebra, 2nd Ed., CRC Press, Boca Raton, FL, 2013.

L.-H. Lim and J. Morton, “Principal components of cumulants,” preprint, 2012.

L.-H. Lim and P. Comon, “Multisensor signal processing: tensor decomposition meets compressed sensing,” C. R. Acad. Sci. Paris, 338 (2010), no. 6, pp. 311–320.

L.-H. Lim and P. Comon, “Separable identification,” preprint, 2012.

T. Schultz, A. Fuster, A. Ghosh, L. Florack, R. Deriche, and L.-H. Lim, “Higher-order tensors in diffusion imaging,” in B. Burgeth, A.V. Bartroli, and C.-F. Westin (Eds.), Visualization and Processing of Tensors and Higher Order Descriptors for Multi-Valued Data, Springer Verlag, Berlin, 2013.
