SLIDE 1

Decomposing a Third-Order Tensor in Rank-(L,L,1) Terms by Means of Simultaneous Matrix Diagonalization

Dimitri Nion & Lieven De Lathauwer

K.U. Leuven, Kortrijk campus, Belgium

E-mails: Dimitri.Nion@kuleuven-kortrijk.be, Lieven.DeLathauwer@kuleuven-kortrijk.be

2009 SIAM Conference on Applied Linear Algebra, Session MS33 “Computational Methods for Tensors”

Monterey, USA, October 26-29, 2009

SLIDE 2

Roadmap

I. Introduction

Tensor decompositions: PARAFAC, Tucker, Block-Component Decompositions

II. Block-Component Decomposition in Rank-(L,L,1) Terms

Definition of the BCD-(L,L,1), Uniqueness bound, ALS Algorithm

III. Reformulation of BCD-(L,L,1) in terms of simultaneous matrix diagonalization

New algorithm, relaxed uniqueness bound

IV. An application of the BCD-(L,L,1): blind source separation in telecommunications

V. Conclusion and Future Research

SLIDE 3


Roadmap

SLIDE 4

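Tucker / HOSVD and PARAFAC

Tucker / HOSVD [Tucker, 1966] / [De Lathauwer, 2000]

$\mathcal{T} = \mathcal{H} \times_1 U \times_2 V \times_3 W$

where $\mathcal{T}$ is $I \times J \times K$, the core tensor $\mathcal{H}$ is $L \times M \times N$, U is $I \times L$, V is $J \times M$ and W is $K \times N$.

PARAFAC [Harshman, 1970]

$\mathcal{T} = a_1 \circ b_1 \circ c_1 + \dots + a_R \circ b_R \circ c_R$

Sum of R rank-1 tensors. Equivalently, $\mathcal{T} = \mathcal{H} \times_1 A \times_2 B \times_3 C$, where $\mathcal{H}$ is $R \times R \times R$ and diagonal (if i=j=k, $h_{ijk}=1$, else $h_{ijk}=0$), A is $I \times R$, B is $J \times R$ and C is $K \times R$.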

SLIDE 5

From PARAFAC/HOSVD to Block Components Decompositions (BCD) [De Lathauwer and Nion, SIMAX 2008]

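BCD in rank-$(L_r, L_r, 1)$ terms:

$\mathcal{T} = \sum_{r=1}^{R} (A_r B_r^T) \circ c_r$

where $\mathcal{T}$ is $I \times J \times K$, $A_r$ is $I \times L_r$, $B_r$ is $J \times L_r$ and $c_r$ is a vector of length K.

BCD in rank-$(L_r, M_r, \cdot)$ terms:

[Figure: $\mathcal{T}$ ($I \times J \times K$) drawn as a sum of R terms; term r is a core of size $L_r \times M_r \times K$ multiplied by $A_r$ ($I \times L_r$) and $B_r$ ($J \times M_r$).]

BCD in rank-$(L_r, M_r, N_r)$ terms:

[Figure: $\mathcal{T}$ ($I \times J \times K$) drawn as a sum of R terms; term r is a core of size $L_r \times M_r \times N_r$ multiplied by $A_r$ ($I \times L_r$), $B_r$ ($J \times M_r$) and $C_r$ ($K \times N_r$).]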

SLIDE 6

Roadmap


SLIDE 7

The BCD(L,L,1) as a generalization of PARAFAC.

Generalization of PARAFAC [De Lathauwer, de Baynast, 2003]: BCD-(1,1,1) = PARAFAC.

BCD-(L,L,1):

$\mathcal{T} = \sum_{r=1}^{R} (A_r B_r^T) \circ c_r$

where $\mathcal{T}$ is $I \times J \times K$, $A_r$ is $I \times L$, $B_r$ is $J \times L$ and $c_r$ is a vector of length K.

Unknown matrices:

$A = [A_1 \ \dots \ A_R]$ ($I \times LR$), $B = [B_1 \ \dots \ B_R]$ ($J \times LR$), $C = [c_1 \ \dots \ c_R]$ ($K \times R$)

The BCD-(L,L,1) is said to be essentially unique if the only remaining ambiguities are:

Arbitrary permutation of the blocks in A and B and of the columns of C

Rotational freedom of each block (block-wise subspace estimation) + scaling ambiguity on the columns of C
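The definition and the rotational ambiguity above can be checked numerically. The following is a minimal NumPy sketch, not taken from the slides: the helper name `bcd_ll1`, the sizes and the random test data are illustrative assumptions.

```python
import numpy as np

def bcd_ll1(A_blocks, B_blocks, C):
    """Sum over r of (A_r @ B_r.T) outer c_r, returned as an I x J x K array."""
    return sum(np.einsum('ij,k->ijk', A_r @ B_r.T, C[:, r])
               for r, (A_r, B_r) in enumerate(zip(A_blocks, B_blocks)))

rng = np.random.default_rng(0)
I, J, K, L, R = 5, 6, 4, 2, 3            # illustrative sizes only
A_blocks = [rng.standard_normal((I, L)) for _ in range(R)]
B_blocks = [rng.standard_normal((J, L)) for _ in range(R)]
C = rng.standard_normal((K, R))
T = bcd_ll1(A_blocks, B_blocks, C)

# Rotational freedom: A_r -> A_r F_r and B_r -> B_r F_r^{-T} leave A_r B_r^T unchanged.
F = [rng.standard_normal((L, L)) + 3.0 * np.eye(L) for _ in range(R)]
A_rot = [A_r @ F_r for A_r, F_r in zip(A_blocks, F)]
B_rot = [B_r @ np.linalg.inv(F_r).T for B_r, F_r in zip(B_blocks, F)]
T_rot = bcd_ll1(A_rot, B_rot, C)
```

Here `T` and `T_rot` agree up to rounding, which is why only the column space of each block can be estimated.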

SLIDE 8

The BCD(L,L,1) as a constrained Tucker model.

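[Figure: the BCD-(L,L,1) of $\mathcal{T}$ ($I \times J \times K$) drawn as a Tucker model with factor matrices $A = [A_1 \ \dots \ A_R]$ ($I \times LR$), $B = [B_1 \ \dots \ B_R]$ ($J \times LR$) and C ($K \times R$), and an $LR \times LR \times R$ core.]

The BCD-(L,L,1) can be seen as a particular case of the Tucker model, where the core tensor is "block-diagonal", with L by L blocks on its diagonal.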

SLIDE 9

BCD(L,L,1): existing results on algorithms and uniqueness

Several usual algorithms used to compute PARAFAC have been adapted to the BCD-(L,L,1).

Example 1: ALS algorithm (alternate between Least Squares updates of unknowns A, B and C).

Example 2: ALS with Enhanced Line Search to speed up convergence.

Example 3: Gauss-Newton based algorithms (Levenberg-Marquardt).

First result on essential uniqueness, in the generic sense [De Lathauwer, 2006]:

$LR \le IJ$ and $\min(\lfloor I/L \rfloor, R) + \min(\lfloor J/L \rfloor, R) + \min(K, R) \ge 2(R+1)$   (1)
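As an illustration of Example 1, one ALS sweep can be sketched as three linear least-squares solves, one per unknown. This is a plain, unoptimized sketch under assumed conventions (column r*L + l of A holds column l of block A_r, random initialization, noiseless data); the function names are hypothetical and it is not the authors' implementation.

```python
import numpy as np

def reconstruct(A, B, C, L):
    """Tensor sum_r (A_r B_r^T) outer c_r from A (I x LR), B (J x LR), C (K x R)."""
    R = C.shape[1]
    Ab = A.reshape(A.shape[0], R, L)     # column r*L + l of A -> Ab[:, r, l]
    Bb = B.reshape(B.shape[0], R, L)
    return np.einsum('irl,jrl,kr->ijk', Ab, Bb, C)

def als_sweep(T, A, B, C, L):
    """One sweep: least-squares update of A, then B, then C."""
    I, J, K = T.shape
    R = C.shape[1]
    # Update A: mode-1 unfolding T1 = A @ M.T with M[(j,k),(r,l)] = B_r[j,l] * C[k,r].
    M = np.einsum('jrl,kr->jkrl', B.reshape(J, R, L), C).reshape(J * K, R * L)
    A = np.linalg.lstsq(M, T.reshape(I, J * K).T, rcond=None)[0].T
    # Update B: same structure with the roles of A and B exchanged.
    M = np.einsum('irl,kr->ikrl', A.reshape(I, R, L), C).reshape(I * K, R * L)
    B = np.linalg.lstsq(M, T.transpose(1, 0, 2).reshape(J, I * K).T, rcond=None)[0].T
    # Update C: mode-3 unfolding T3 = C @ M.T with M[(i,j),r] = (A_r B_r^T)[i,j].
    M = np.einsum('irl,jrl->ijr', A.reshape(I, R, L), B.reshape(J, R, L)).reshape(I * J, R)
    C = np.linalg.lstsq(M, T.transpose(2, 0, 1).reshape(K, I * J).T, rcond=None)[0].T
    return A, B, C

rng = np.random.default_rng(1)
I, J, K, L, R = 6, 6, 5, 2, 2            # illustrative sizes only
T = reconstruct(rng.standard_normal((I, L * R)), rng.standard_normal((J, L * R)),
                rng.standard_normal((K, R)), L)
A, B, C = (rng.standard_normal(shape) for shape in ((I, L * R), (J, L * R), (K, R)))
errors = [np.linalg.norm(T - reconstruct(A, B, C, L))]
for _ in range(30):
    A, B, C = als_sweep(T, A, B, C, L)
    errors.append(np.linalg.norm(T - reconstruct(A, B, C, L)))
```

Each update solves its subproblem exactly, so the fitting error in `errors` cannot increase from one sweep to the next. How fast it decreases depends on the initialization, which is the motivation for the line-search and Gauss-Newton variants listed above.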

           

SLIDE 10

Starting point of this work

In 2005, De Lathauwer showed that, under certain assumptions on the dimensions, PARAFAC can be reformulated as a simultaneous diagonalization (SD) problem. This yields:

A very fast and accurate algorithm to compute PARAFAC

A new, relaxed, uniqueness bound

Is it possible to generalize these results to the BCD-(L,L,1)? If so, does it also yield a fast algorithm and a new uniqueness bound (more relaxed than the one on the previous slide)?

The answer is YES

SLIDE 11

Roadmap


SLIDE 12

Reformulation of BCD-(L,L,1) in terms of SD: overview (1)

$\mathcal{T} = \sum_{r=1}^{R} (A_r B_r^T) \circ c_r = \sum_{r=1}^{R} X_r \circ c_r$

where $\mathcal{T}$ is $I \times J \times K$ and each $X_r = A_r B_r^T$ is an $I \times J$ matrix of rank L.

Assumption: $R \le \min(IJ, K)$, i.e., K has to be a sufficiently long dimension.

Build Y, the JI by K matrix unfolding of $\mathcal{T}$.

BCD-(L,L,1) in matrix format:

$Y = [\mathrm{vec}(X_1) \ \dots \ \mathrm{vec}(X_R)] \cdot C^T = \tilde{X} \cdot C^T$   (1)

SVD of Y (generically rank-R):

$Y = U \cdot \Sigma \cdot V^H = E \cdot V^H$, with $E = U \cdot \Sigma$   (2)

Comparing (1) and (2):

$\exists\, W \in \mathbb{C}^{R \times R}: \quad \tilde{X} = E \cdot W, \quad C^T = W^{-1} \cdot V^H$

Goal: Find W, i.e., find the linear combinations of the columns of E that yield vectorized rank-L matrices.
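The unfolding and SVD steps can be sketched numerically as follows. This is an illustration with synthetic real-valued data, where the true rank-L matrices are known so that W can be computed directly; in practice W is the unknown. The sizes, the column-major vec convention and the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, L, R = 4, 5, 6, 2, 3            # illustrative sizes with R <= min(I*J, K)

# Ground-truth rank-L matrices X_r = A_r B_r^T and the K x R matrix C.
X = [rng.standard_normal((I, L)) @ rng.standard_normal((L, J)) for _ in range(R)]
C = rng.standard_normal((K, R))

# X_tilde stacks vec(X_r) (column-major, length JI) as columns.
X_tilde = np.column_stack([X_r.flatten(order='F') for X_r in X])
Y = X_tilde @ C.T                        # JI x K unfolding, matrix format of the model

# Truncated SVD of Y: keep the R dominant components, E = U * Sigma.
U, s, Vh = np.linalg.svd(Y, full_matrices=False)
E = U[:, :R] * s[:R]
V_h = Vh[:R, :]

# With the ground truth available, W solves E W = X_tilde in the least-squares sense.
W = np.linalg.lstsq(E, X_tilde, rcond=None)[0]
```

Here `E @ W` reproduces `X_tilde` and `np.linalg.inv(W) @ V_h` reproduces `C.T`, matching the two relations stated for W.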

SLIDE 13

Reformulation of BCD-(L,L,1) in terms of SD: overview (2)

$\tilde{X} = E \cdot W, \quad C^T = W^{-1} \cdot V^H$

Note 1: Once W is found, the unknown matrices A, B, C of the BCD-(L,L,1) follow:

$C = V^{*} \cdot W^{-T}$

$\tilde{X} = [\mathrm{vec}(X_1) \ \dots \ \mathrm{vec}(X_R)] = [\mathrm{vec}(A_1 B_1^T) \ \dots \ \mathrm{vec}(A_R B_R^T)]$

Matricize $\mathrm{vec}(X_1)$ and estimate $A_1$ and $B_1$ from the best rank-L approximation.

Matricize $\mathrm{vec}(X_R)$ and estimate $A_R$ and $B_R$ from the best rank-L approximation.

Note 2: For PARAFAC (i.e. L=1), we have

$\tilde{X} = [\mathrm{vec}(a_1 b_1^T), \dots, \mathrm{vec}(a_R b_R^T)] = [b_1 \otimes a_1, \dots, b_R \otimes a_R] = B \odot A$

where $\odot$ is the Khatri-Rao product.

$\tilde{X} = E \cdot W$ is a Khatri-Rao structure recovery problem, and can be solved by simultaneous diagonalization [De Lathauwer, 2005].
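Note 1 and Note 2 can be illustrated with a short NumPy sketch. The function name `blocks_from_vec` is hypothetical, the vec operator is assumed column-major, and the best rank-L approximation is taken from a truncated SVD; the recovered blocks match the true ones only up to the rotational freedom of each block.

```python
import numpy as np

def blocks_from_vec(x_vec, I, J, L):
    """Matricize vec(X_r) and return (A_r, B_r) such that A_r @ B_r.T is the
    best rank-L approximation of X_r (truncated SVD)."""
    X = x_vec.reshape((I, J), order='F')       # undo the column-major vec
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    return U[:, :L] * s[:L], Vh[:L, :].T

rng = np.random.default_rng(0)
I, J, L = 5, 4, 2                              # illustrative sizes only
X_true = rng.standard_normal((I, L)) @ rng.standard_normal((L, J))
A_est, B_est = blocks_from_vec(X_true.flatten(order='F'), I, J, L)

# Note 2 (L = 1): vec(a b^T) equals the Kronecker product of b and a.
a, b = rng.standard_normal(I), rng.standard_normal(J)
vec_ab = np.outer(a, b).flatten(order='F')
kron_ba = np.kron(b, a)
```

Here `A_est @ B_est.T` reproduces `X_true` up to rounding because `X_true` has exact rank L, and `vec_ab` equals `kron_ba`, which is the column-wise identity behind the Khatri-Rao structure.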

SLIDE 14

Reformulation of BCD-(L,L,1) in terms of SD: overview (3)

Remark: on typical matrix factorization problems in Signal Processing

Problem formulation: Given only an (M×N) rank-R observed matrix X, find the (M×R) and (R×N) matrices H and S s.t. X = H S.

But there is an infinite number of solutions, $X = (HF)(F^{-1}S)$ for any invertible (R×R) matrix F, so we need extra constraints. Examples:

ICA (Independent Component Analysis): find H that makes the R source signals in S as statistically independent as possible. Application: Blind Source Separation.

FIR filter estimation: H holds the impulse response of a FIR filter, and S is Toeplitz. Application: Blind Channel Estimation in telecommunications.

Source localization: H is Vandermonde and holds the individual response of the M antennas to the R source signals, each signal impinging with a Direction Of Arrival (DOA).

Non-negative matrix factorization.

Finite Alphabet projection: S holds numerical symbols.
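The non-uniqueness of the unconstrained factorization is easy to reproduce. A minimal sketch with random data; the sizes and the particular matrix F are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, R = 5, 7, 3                        # illustrative sizes only
H = rng.standard_normal((M, R))
S = rng.standard_normal((R, N))
X = H @ S

# An invertible R x R matrix F gives a second, equally valid factorization of X.
F = rng.standard_normal((R, R)) + 3.0 * np.eye(R)
H2 = H @ F
S2 = np.linalg.solve(F, S)               # F^{-1} S without forming the inverse
```

Both `H @ S` and `H2 @ S2` equal X, so without structure on H or S (independence, Toeplitz, Vandermonde, non-negativity, finite alphabet) the factors cannot be identified.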

SLIDE 15

Reformulation of BCD-(L,L,1) in terms of SD: overview (4)

$\tilde{X} = E \cdot W$:

$[\mathrm{vec}(X_1) \ \dots \ \mathrm{vec}(X_R)] = [\mathrm{vec}(E_1) \ \dots \ \mathrm{vec}(E_R)] \cdot W$

where both block matrices are $JI \times R$ and W is $R \times R$.

For r = 1, …, R:

$X_r = W_{1r} E_1 + \dots + W_{Rr} E_R$

where $X_r$ and $E_1, \dots, E_R$ are $I \times J$ matrices.

How to find the coefficients of the linear combinations of the $E_r$ that yield rank-L matrices?

Tool: mapping $\phi_L$ for rank-L detection. Let $X_r \in \mathbb{C}^{I \times J}$, then $\phi_L(X_r, X_r, \dots, X_r) = 0$ iff $X_r$ is at most rank-L.

After several algebraic manipulations, one can show that W is solution of a SD problem:

$Q_1 = W \cdot D_1 \cdot W^T$

$Q_2 = W \cdot D_2 \cdot W^T$

…

$Q_R = W \cdot D_R \cdot W^T$
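To make the SD structure concrete, the sketch below recovers W from only two matrices of the form $Q_r = W D_r W^T$ through an eigenvalue decomposition. This is a textbook two-matrix shortcut used for illustration, not the simultaneous diagonalization algorithm of the talk, which exploits all the matrices jointly. The function name and the synthetic data are assumptions.

```python
import numpy as np

def w_from_two_congruences(Q1, Q2):
    """Q1 Q2^{-1} = W (D1 D2^{-1}) W^{-1}, so its eigenvectors are the columns
    of W up to scaling and permutation (distinct eigenvalues assumed)."""
    _, W_est = np.linalg.eig(Q1 @ np.linalg.inv(Q2))
    return W_est

rng = np.random.default_rng(0)
R = 4
W = np.eye(R) + 0.1 * rng.standard_normal((R, R))   # well-conditioned test matrix
D1 = np.diag([1.0, 2.0, 3.0, 4.0])
D2 = np.eye(R)
Q1 = W @ D1 @ W.T
Q2 = W @ D2 @ W.T

W_est = w_from_two_congruences(Q1, Q2)
G = np.linalg.inv(W) @ W_est             # should be a scaled permutation matrix
```

The matrix `G` has a single significant entry per row and per column, which is exactly the scaling and permutation ambiguity that remains after the diagonalization.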

SLIDE 16

Reformulation of BCD-(2,2,1) in terms of SD: technical details

Trilinear mapping $\phi_2$ for rank-2 detection:

$\phi_2 : (X, Y, Z) \in \mathbb{C}^{I \times J} \times \mathbb{C}^{I \times J} \times \mathbb{C}^{I \times J} \rightarrow \phi_2(X, Y, Z) \in \mathbb{C}^{I \times I \times I \times J \times J \times J}$

$[\phi_2(X, Y, Z)]_{i_1 i_2 i_3 j_1 j_2 j_3} = \sum_{(M^{(1)}, M^{(2)}, M^{(3)})} \begin{vmatrix} m^{(1)}_{i_1 j_1} & m^{(1)}_{i_1 j_2} & m^{(1)}_{i_1 j_3} \\ m^{(2)}_{i_2 j_1} & m^{(2)}_{i_2 j_2} & m^{(2)}_{i_2 j_3} \\ m^{(3)}_{i_3 j_1} & m^{(3)}_{i_3 j_2} & m^{(3)}_{i_3 j_3} \end{vmatrix}$

where the sum of six determinants runs over the six orderings $(M^{(1)}, M^{(2)}, M^{(3)})$ of $(X, Y, Z)$.

Then we have $\phi_2(X, X, X) = 0$ iff X is at most rank 2.
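The rank-2 detection property can be checked numerically on small matrices. The sketch below assumes that each of the six 3×3 determinants takes its three rows from X, Y and Z in one of the six possible orders, with entry (a, b) indexed by $(i_a, j_b)$; this row-wise layout is an assumption made for the illustration. The function name `phi2` and the brute-force loops are likewise illustrative and only practical for tiny I and J.

```python
import itertools
import numpy as np

def phi2(X, Y, Z):
    """Sum, over the six orderings of (X, Y, Z), of 3x3 determinants whose
    row a holds entries (i_a, j_1), (i_a, j_2), (i_a, j_3) of one matrix."""
    I, J = X.shape
    out = np.zeros((I, I, I, J, J, J), dtype=np.result_type(X, Y, Z))
    for i in itertools.product(range(I), repeat=3):
        for j in itertools.product(range(J), repeat=3):
            cols = list(j)
            out[i + j] = sum(
                np.linalg.det(np.stack([M1[i[0], cols], M2[i[1], cols], M3[i[2], cols]]))
                for M1, M2, M3 in itertools.permutations((X, Y, Z))
            )
    return out

rng = np.random.default_rng(0)
I, J = 4, 3                                            # tiny sizes: I^3 * J^3 entries
X_rank2 = rng.standard_normal((I, 2)) @ rng.standard_normal((2, J))
X_rank3 = rng.standard_normal((I, J))
norm_rank2 = np.abs(phi2(X_rank2, X_rank2, X_rank2)).max()
norm_rank3 = np.abs(phi2(X_rank3, X_rank3, X_rank3)).max()
```

With three equal arguments every entry is six times a 3×3 minor, so `norm_rank2` is zero up to rounding while `norm_rank3` is not.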

SLIDE 17

Reformulation of BCD-(2,2,1) in terms of SD: technical details

For r = 1, …, R:

$E_r = (W^{-1})_{1r} X_1 + \dots + (W^{-1})_{Rr} X_R$

where $E_r$ and $X_1, \dots, X_R$ are $I \times J$ matrices.

Build the set of $R^3$ tensors $\phi_2(E_r, E_s, E_t)$, r=1,…,R, s=1,…,R, t=1,…,R.

Since $\phi_2$ is trilinear, we have:

$\phi_2(E_r, E_s, E_t) = \sum_{u,v,w=1}^{R} (W^{-1})_{ur} (W^{-1})_{vs} (W^{-1})_{wt} \, \phi_2(X_u, X_v, X_w)$

One can show that, if the $C_{R+2}^{3} - R$ tensors of the set Ω are linearly independent, with

$\Omega = \{\phi_2(X_u, X_v, X_w), \ 1 \le u \le v \le w \le R\} \setminus \{\phi_2(X_u, X_u, X_u), \ 1 \le u \le R\}$

then W is solution of

$\mathcal{Q} = \mathcal{D} \times_1 W \times_2 W \times_3 W$

where $\mathcal{D}$ is an arbitrary diagonal tensor and $\mathcal{Q}$ is a symmetric tensor satisfying

$\sum_{r,s,t=1}^{R} q_{rst} \, \phi_2(E_r, E_s, E_t) = 0$

SLIDE 18

Reformulation of BCD-(2,2,1) in terms of SD: a new uniqueness bound

Crucial assumption in the reformulation: « The $C_{R+2}^{3} - R$ tensors of the set Ω are linearly independent »

One can show that this is generically true if

$R \le \min(IJ, K)$ and $C_I^3 \, C_J^3 \ge C_{R+2}^{3} - R$, where $C_n^k = \frac{n!}{k!\,(n-k)!}$.

The generalization to any value of L yields that the BCD-(L,L,1) is unique if

$R \le \min(IJ, K)$ and $C_I^{L+1} \, C_J^{L+1} \ge C_{R+L}^{L+1} - R$.

To be compared to the old uniqueness bound:

$LR \le IJ$ and $\min(\lfloor I/L \rfloor, R) + \min(\lfloor J/L \rfloor, R) + \min(K, R) \ge 2(R+1)$
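The two bounds are simple to evaluate for given dimensions. The sketch below assumes the new bound reads R ≤ min(IJ, K) together with C(I, L+1)·C(J, L+1) ≥ C(R+L, L+1) − R, and the old bound reads LR ≤ IJ together with min(⌊I/L⌋, R) + min(⌊J/L⌋, R) + min(K, R) ≥ 2(R+1). The function names are hypothetical.

```python
from math import comb

def old_bound(I, J, K, L, R):
    """Earlier generic uniqueness condition (floor divisions on I/L and J/L)."""
    return (L * R <= I * J
            and min(I // L, R) + min(J // L, R) + min(K, R) >= 2 * (R + 1))

def new_bound(I, J, K, L, R):
    """Condition obtained from the simultaneous diagonalization reformulation."""
    return (R <= min(I * J, K)
            and comb(I, L + 1) * comb(J, L + 1) >= comb(R + L, L + 1) - R)

def max_terms(bound, I, J, K, L):
    """Largest R in 1..I*J for which the given bound holds (0 if none)."""
    return max((R for R in range(1, I * J + 1) if bound(I, J, K, L, R)), default=0)

# Example dimensions, chosen only for illustration.
I, J, K, L = 4, 4, 20, 2
R_old = max_terms(old_bound, I, J, K, L)
R_new = max_terms(new_bound, I, J, K, L)
```

For these example dimensions the new bound admits more terms than the old one, in line with the claim that it is more relaxed when K is the long dimension.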

SLIDE 19

Reformulation of BCD-(2,2,1) in terms of SD: uniqueness

[Figure: plot labels "New bound, L={2,3,4}" and "Old bound, L={2,3,4}".]

SLIDE 20

Roadmap


SLIDE 21

Data model: DS-CDMA system

$\mathcal{T} = \sum_{r=1}^{R} (H_r S_r^T) \circ a_r$

where $\mathcal{T}$ is $I \times J \times K$ and, for user r:

$H_r$ ($I \times L$): channel impulse response of user r (spans L symbol periods for each user)

$S_r$ ($J \times L$): symbols of user r, Toeplitz structure (convolution)

$a_r$ (length K): array steering vector (response of the K antennas)

Fast time: I = number of samples within a symbol period

Slow time: observation during J symbol periods

Spatial dimension: K receiving antennas

R users transmitting at the same time
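A synthetic tensor following this data model can be generated as below. This is a sketch under stated assumptions: BPSK symbols, random complex channels and steering vectors, and a Toeplitz matrix $S_r$ whose entry (j, l) is the symbol sent l periods before period j. None of these choices, nor the helper name `toeplitz_symbols`, come from the slides.

```python
import numpy as np

def toeplitz_symbols(s, J, L):
    """J x L Toeplitz matrix with S[j, l] = s[(L - 1) + j - l],
    built from a symbol sequence s of length J + L - 1."""
    idx = (L - 1) + np.arange(J)[:, None] - np.arange(L)[None, :]
    return s[idx]

rng = np.random.default_rng(0)
I, J, K, L, R = 8, 10, 4, 2, 3           # illustrative sizes only
T = np.zeros((I, J, K), dtype=complex)
S_list = []
for r in range(R):
    H_r = rng.standard_normal((I, L)) + 1j * rng.standard_normal((I, L))   # channel
    s_r = rng.choice([-1.0, 1.0], size=J + L - 1)                          # BPSK symbols
    S_r = toeplitz_symbols(s_r, J, L)
    a_r = rng.standard_normal(K) + 1j * rng.standard_normal(K)             # antenna response
    T += np.einsum('ij,k->ijk', H_r @ S_r.T, a_r)
    S_list.append(S_r)
```

Each user contributes one rank-(L,L,1) term, so `T` follows the BCD-(L,L,1) structure with R terms.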

SLIDE 22

Performance: comparison between ALS and SD algorithms

SLIDE 23

Conclusion

Reformulation of PARAFAC in terms of Simultaneous Diagonalization (SD) yields a fast and accurate algorithm, with improved identifiability results [De Lathauwer, 2005]. The starting point for this reformulation is that one dimension is long enough: $R \le \min(IJ, K)$, where I, J and K can be interchanged.

The BCD-(L,L,1), which is a generalization of PARAFAC, can also be formulated in terms of SD, which also yields a fast and accurate algorithm and an improved identifiability result. The starting point for this reformulation is that the third dimension (K) is long enough: $R \le \min(IJ, K)$. Here I, J and K cannot be interchanged.

When the long dimension is I or J, i.e., $R \le \min(IK, J)$ or $R \le \min(JK, I)$, we have recently shown (CAMSAP 2009) that the BCD-(L,L,1) can be reformulated as a Joint Block Diagonalization problem. This yields a new set of identifiability results.