Decomposing a Third-Order Tensor in Rank-(L,L,1) Terms by Means of Simultaneous Matrix Diagonalization
Dimitri Nion & Lieven De Lathauwer
K.U. Leuven, Kortrijk campus, Belgium
E-mails: Dimitri.Nion@kuleuven-kortrijk.be
Roadmap
I. Introduction
Tensor decompositions: PARAFAC, Tucker, Block-Component Decompositions
II. Block-Component Decomposition in Rank-(L,L,1) Terms
Definition of the BCD-(L,L,1), Uniqueness bound, ALS Algorithm
III. Reformulation of BCD-(L,L,1) in terms of simultaneous matrix diagonalization
New algorithm, relaxed uniqueness bound
IV. An application of the BCD-(L,L,1): blind source separation in telecommunications
V. Conclusion and Future Research
I. Introduction
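Tucker/HOSVD and PARAFAC
Tucker / HOSVD [Tucker, 1966] / [De Lathauwer, 2000]: the $I \times J \times K$ tensor $\mathcal{T}$ is an $L \times M \times N$ core tensor $\mathcal{H}$ multiplied in its three modes by the matrices $\mathbf{U}$ ($I \times L$), $\mathbf{V}$ ($J \times M$) and $\mathbf{W}$ ($K \times N$):
$$\mathcal{T} = \mathcal{H} \times_1 \mathbf{U} \times_2 \mathbf{V} \times_3 \mathbf{W}$$
PARAFAC [Harshman, 1970]: the $R \times R \times R$ core $\mathcal{H}$ is diagonal (if $i=j=k$, $h_{ijk}=1$, else $h_{ijk}=0$), so $\mathcal{T}$ is a sum of R rank-1 tensors:
$$\mathcal{T} = \mathcal{H} \times_1 \mathbf{A} \times_2 \mathbf{B} \times_3 \mathbf{C} = \mathbf{a}_1 \circ \mathbf{b}_1 \circ \mathbf{c}_1 + \dots + \mathbf{a}_R \circ \mathbf{b}_R \circ \mathbf{c}_R$$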
From PARAFAC/HOSVD to Block Components Decompositions (BCD) [De Lathauwer and Nion, SIMAX 2008]
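BCD in rank-$(L_r, L_r, 1)$ terms: each term is the outer product of a rank-$L_r$ matrix $\mathbf{A}_r \mathbf{B}_r^T$ ($\mathbf{A}_r$: $I \times L_r$, $\mathbf{B}_r$: $J \times L_r$) and a vector $\mathbf{c}_r$ of length K:
$$\mathcal{T} = (\mathbf{A}_1 \mathbf{B}_1^T) \circ \mathbf{c}_1 + \dots + (\mathbf{A}_R \mathbf{B}_R^T) \circ \mathbf{c}_R$$
BCD in rank-$(L_r, M_r, N_r)$ terms: $\mathcal{T}$ is a sum of R Tucker terms, the r-th term having an $L_r \times M_r \times N_r$ core and factor matrices $\mathbf{A}_r$, $\mathbf{B}_r$, $\mathbf{C}_r$.
BCD in rank-$(L_r, M_r, \cdot)$ terms: $\mathcal{T}$ is a sum of R terms, the r-th term having an $L_r \times M_r \times K$ core and factor matrices $\mathbf{A}_r$, $\mathbf{B}_r$ in the first two modes only.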
II. Block-Component Decomposition in Rank-(L,L,1) Terms
The BCD(L,L,1) as a generalization of PARAFAC.
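Generalization of PARAFAC [De Lathauwer, de Baynast, 2003]. BCD-(1,1,1) = PARAFAC.
BCD-(L,L,1) of the $I \times J \times K$ tensor $\mathcal{T}$:
$$\mathcal{T} = (\mathbf{A}_1 \mathbf{B}_1^T) \circ \mathbf{c}_1 + \dots + (\mathbf{A}_R \mathbf{B}_R^T) \circ \mathbf{c}_R$$
Unknown matrices:
$\mathbf{A} = [\mathbf{A}_1 \ \dots \ \mathbf{A}_R]$ ($I \times LR$, each block $I \times L$)
$\mathbf{B} = [\mathbf{B}_1 \ \dots \ \mathbf{B}_R]$ ($J \times LR$, each block $J \times L$)
$\mathbf{C} = [\mathbf{c}_1 \ \dots \ \mathbf{c}_R]$ ($K \times R$)
The BCD-(L,L,1) is said to be essentially unique if the only remaining ambiguities are:
- Arbitrary permutation of the blocks in A and B and of the columns of C
- Rotational freedom of each block (block-wise subspace estimation) + scaling ambiguity on the columns of C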
The BCD(L,L,1) as a constrained Tucker model.
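$$\mathcal{T} = (\mathbf{A}_1 \mathbf{B}_1^T) \circ \mathbf{c}_1 + \dots + (\mathbf{A}_R \mathbf{B}_R^T) \circ \mathbf{c}_R$$
Equivalently, $\mathcal{T}$ is the Tucker product of $\mathbf{A} = [\mathbf{A}_1 \ \dots \ \mathbf{A}_R]$ ($I \times LR$), $\mathbf{B} = [\mathbf{B}_1 \ \dots \ \mathbf{B}_R]$ ($J \times LR$) and $\mathbf{C}$ ($K \times R$) with an $LR \times LR \times R$ core tensor.
The BCD-(L,L,1) can be seen as a particular case of the Tucker model, where the core tensor is "block-diagonal", with L by L blocks on its diagonal.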
BCD(L,L,1): existing results on algorithms and uniqueness
Several usual algorithms used to compute PARAFAC have been adapted to the BCD-(L,L,1).
- Example 1: ALS algorithm (alternate between Least Squares updates of the unknowns A, B and C).
- Example 2: ALS with Enhanced Line Search to speed up convergence.
- Example 3: Gauss-Newton based algorithms (Levenberg-Marquardt).
First result on essential uniqueness, in the generic sense [De Lathauwer, 2006]:
$$LR \le IJ \quad \text{and} \quad \min\left(\left\lfloor \tfrac{I}{L} \right\rfloor, R\right) + \min\left(\left\lfloor \tfrac{J}{L} \right\rfloor, R\right) + \min(K, R) \ge 2(R+1) \qquad (1)$$
Starting point of this work
In 2005, De Lathauwer showed that, under certain assumptions on the dimensions, PARAFAC can be reformulated as a simultaneous diagonalization (SD) problem. This yields:
- A very fast and accurate algorithm to compute PARAFAC
- A new, relaxed, uniqueness bound
Is it possible to generalize these results to the BCD-(L,L,1)? If so, does it also yield a fast algorithm and a new uniqueness bound (more relaxed than the one on the previous slide)?
The answer is YES
III. Reformulation of BCD-(L,L,1) in terms of simultaneous matrix diagonalization
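Reformulation of BCD-(L,L,1) in terms of SD: overview (1)
$$\mathcal{T} = \sum_{r=1}^{R} (\mathbf{A}_r \mathbf{B}_r^T) \circ \mathbf{c}_r = \sum_{r=1}^{R} \mathbf{X}_r \circ \mathbf{c}_r$$
where each $\mathbf{X}_r = \mathbf{A}_r \mathbf{B}_r^T$ is an $I \times J$ matrix of rank L.
Assumption: $R \le \min(IJ, K)$, i.e., K has to be a sufficiently long dimension.
Build Y, the JI by K matrix unfolding of $\mathcal{T}$.
BCD-(L,L,1) in matrix format:
$$\mathbf{Y} = [\mathrm{vec}(\mathbf{X}_1) \ \dots \ \mathrm{vec}(\mathbf{X}_R)] \cdot \mathbf{C}^T = \tilde{\mathbf{X}} \cdot \mathbf{C}^T \qquad (1)$$
SVD of Y (generically rank-R):
$$\mathbf{Y} = \mathbf{U} \cdot \mathbf{\Sigma} \cdot \mathbf{V}^H = \mathbf{E} \cdot \mathbf{V}^H \qquad (2)$$
From (1) and (2):
$$\exists\, \mathbf{W} \in \mathbb{C}^{R \times R}: \quad \tilde{\mathbf{X}} = \mathbf{E} \cdot \mathbf{W}, \qquad \mathbf{C}^T = \mathbf{W}^{-1} \cdot \mathbf{V}^H$$
Goal: Find W, i.e., find the linear combinations of the columns of E that yield vectorized rank-L matrices.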
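The relations (1) and (2) can be checked numerically on synthetic data. The sketch below is only a consistency check, not the SD algorithm: it builds a random BCD-(L,L,1) tensor, so the true matrix of vectorized terms is known, and then verifies that a W with the stated properties exists. The sizes, the random seed and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, L, R = 4, 5, 6, 2, 3  # hypothetical sizes with R <= min(I*J, K)

# Random BCD-(L,L,1) factors: each X_r = A_r B_r^T has rank L.
A = [rng.standard_normal((I, L)) for _ in range(R)]
B = [rng.standard_normal((J, L)) for _ in range(R)]
C = rng.standard_normal((K, R))
X = [A[r] @ B[r].T for r in range(R)]

# T = sum_r X_r o c_r (outer product with the r-th column of C).
T = sum(np.einsum("ij,k->ijk", X[r], C[:, r]) for r in range(R))

# JI-by-K unfolding: column k is the column-major vec of T[:, :, k].
Y = T.reshape(I * J, K, order="F")
X_tilde = np.column_stack([X[r].reshape(-1, order="F") for r in range(R)])

# Equation (1): Y = X_tilde C^T.
eq1_ok = np.allclose(Y, X_tilde @ C.T)

# Equation (2): rank-R SVD, Y = E V^H with E = U Sigma.
U, s, Vh = np.linalg.svd(Y, full_matrices=False)
E, Vh = U[:, :R] * s[:R], Vh[:R]
eq2_ok = np.allclose(Y, E @ Vh)

# Because X_tilde is known here, W follows from E W = X_tilde.
W = np.linalg.lstsq(E, X_tilde, rcond=None)[0]
w_ok = np.allclose(E @ W, X_tilde) and np.allclose(C.T, np.linalg.inv(W) @ Vh)
```

In a real problem the matrix of vectorized terms is unknown, which is why W has to be found from the rank-L structure alone.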
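Reformulation of BCD-(L,L,1) in terms of SD: overview (2)
$$\tilde{\mathbf{X}} = \mathbf{E} \cdot \mathbf{W}, \qquad \mathbf{C}^T = \mathbf{W}^{-1} \cdot \mathbf{V}^H$$
Note 1: Once W is found, the unknown matrices A, B, C of the BCD-(L,L,1) follow:
$$\mathbf{C} = \mathbf{V}^{*} \cdot \mathbf{W}^{-T}$$
$$\tilde{\mathbf{X}} = [\mathrm{vec}(\mathbf{X}_1) \ \dots \ \mathrm{vec}(\mathbf{X}_R)] = [\mathrm{vec}(\mathbf{A}_1 \mathbf{B}_1^T) \ \dots \ \mathrm{vec}(\mathbf{A}_R \mathbf{B}_R^T)]$$
Matricize the first column and estimate $\mathbf{A}_1$ and $\mathbf{B}_1$ from the best rank-L approximation, and so on up to the R-th column, which gives $\mathbf{A}_R$ and $\mathbf{B}_R$.
Note 2: For PARAFAC (i.e. L=1), we have
$$\tilde{\mathbf{X}} = [\mathrm{vec}(\mathbf{a}_1 \mathbf{b}_1^T), \dots, \mathrm{vec}(\mathbf{a}_R \mathbf{b}_R^T)] = [\mathbf{b}_1 \otimes \mathbf{a}_1, \dots, \mathbf{b}_R \otimes \mathbf{a}_R] = \mathbf{B} \odot \mathbf{A}$$
where $\odot$ is the Khatri-Rao product. In that case $\tilde{\mathbf{X}} = \mathbf{E} \cdot \mathbf{W}$ is a Khatri-Rao structure recovery problem, and can be solved by simultaneous diagonalization [De Lathauwer, 2005].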
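Note 1 and Note 2 can be illustrated with a few lines of NumPy. The sketch below assumes a column-major vec, uses a truncated SVD for the best rank-L approximation, and splits the singular values into the first factor (an arbitrary choice, since each block is only determined up to an invertible L by L transform). The helper name `rank_L_factors` is hypothetical.

```python
import numpy as np

def rank_L_factors(x_vec, I, J, L):
    """Matricize a vectorized I-by-J term and return factors of its best
    rank-L approximation, A_r (I x L) and B_r (J x L), with A_r B_r^T ~ X_r."""
    X_r = x_vec.reshape(I, J, order="F")          # undo the column-major vec
    U, s, Vh = np.linalg.svd(X_r, full_matrices=False)
    A_r = U[:, :L] * s[:L]                        # singular values absorbed in A_r
    B_r = Vh[:L].T                                # so that A_r @ B_r.T = U S V^H
    return A_r, B_r

rng = np.random.default_rng(1)
I, J, L = 4, 5, 2                                 # hypothetical sizes
X_r = rng.standard_normal((I, L)) @ rng.standard_normal((L, J))
A_r, B_r = rank_L_factors(X_r.reshape(-1, order="F"), I, J, L)
recon_ok = np.allclose(A_r @ B_r.T, X_r)          # exact for a rank-L input

# Note 2 (L = 1): vec(a b^T) equals the Kronecker product b (x) a.
a, b = rng.standard_normal(I), rng.standard_normal(J)
kr_ok = np.allclose(np.outer(a, b).reshape(-1, order="F"), np.kron(b, a))
```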
Reformulation of BCD-(L,L,1) in terms of SD: overview (3)
Remark on typical matrix factorization problems in Signal Processing.
Problem formulation: Given only an (M×N) rank-R observed matrix X, find the (M×R) and (R×N) matrices H and S s.t. X = H S.
But there is an infinite number of solutions, $\mathbf{X} = (\mathbf{H}\mathbf{F})(\mathbf{F}^{-1}\mathbf{S})$ for any invertible F, so we need extra constraints. Examples:
- ICA (Independent Component Analysis): find H that makes the R source signals in S as statistically independent as possible. Application: Blind Source Separation.
- FIR filter estimation: H holds the impulse response of a FIR filter, and S is Toeplitz. Application: Blind Channel Estimation in telecommunications.
- Source localization: H is Vandermonde and holds the individual response of the M antennas to the R source signals, each signal impinging with a Direction Of Arrival (DOA).
- Non-negative matrix factorization.
- Finite Alphabet projection: S holds numerical symbols.
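Reformulation of BCD-(L,L,1) in terms of SD: overview (4)
$$\tilde{\mathbf{X}} = \mathbf{E} \cdot \mathbf{W}, \quad \text{i.e.} \quad [\mathrm{vec}(\mathbf{X}_1) \ \dots \ \mathrm{vec}(\mathbf{X}_R)] = [\mathrm{vec}(\mathbf{E}_1) \ \dots \ \mathrm{vec}(\mathbf{E}_R)] \cdot \mathbf{W}$$
For $r = 1, \dots, R$, with all $\mathbf{X}_r$ and $\mathbf{E}_r$ of size $I \times J$:
$$\mathbf{X}_r = W_{1r}\, \mathbf{E}_1 + \dots + W_{Rr}\, \mathbf{E}_R$$
How to find the coefficients of the linear combinations of the $\mathbf{E}_r$ that yield rank-L matrices?
Tool: mapping $\phi_L$ for rank-L detection. Let $\mathbf{X}_r \in \mathbb{C}^{I \times J}$, then $\phi_L(\mathbf{X}_r, \mathbf{X}_r, \dots, \mathbf{X}_r) = 0$ iff $\mathbf{X}_r$ is at most rank-L.
After several algebraic manipulations, one can show that W is the solution of an SD problem:
$$\mathbf{Q}_1 = \mathbf{W} \cdot \mathbf{D}_1 \cdot \mathbf{W}^T, \quad \mathbf{Q}_2 = \mathbf{W} \cdot \mathbf{D}_2 \cdot \mathbf{W}^T, \quad \dots, \quad \mathbf{Q}_R = \mathbf{W} \cdot \mathbf{D}_R \cdot \mathbf{W}^T$$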
Reformulation of BCD-(2,2,1) in terms of SD: technical details
Trilinear mapping $\phi_2$ for rank-2 detection:
$$\phi_2 : (\mathbf{X}, \mathbf{Y}, \mathbf{Z}) \in \mathbb{C}^{I \times J} \times \mathbb{C}^{I \times J} \times \mathbb{C}^{I \times J} \ \to \ \phi_2(\mathbf{X}, \mathbf{Y}, \mathbf{Z}) \in \mathbb{C}^{I \times I \times I \times J \times J \times J}$$
$$[\phi_2(\mathbf{X}, \mathbf{Y}, \mathbf{Z})]_{i_1 i_2 i_3 j_1 j_2 j_3} = \sum_{(\mathbf{P}, \mathbf{Q}, \mathbf{S})} \begin{vmatrix} p_{i_1 j_1} & p_{i_1 j_2} & p_{i_1 j_3} \\ q_{i_2 j_1} & q_{i_2 j_2} & q_{i_2 j_3} \\ s_{i_3 j_1} & s_{i_3 j_2} & s_{i_3 j_3} \end{vmatrix}$$
where the sum runs over the six orderings $(\mathbf{P}, \mathbf{Q}, \mathbf{S})$ of $(\mathbf{X}, \mathbf{Y}, \mathbf{Z})$, i.e., over the six determinants whose rows are taken from X, Y and Z in every possible order.
Then we have $\phi_2(\mathbf{X}, \mathbf{X}, \mathbf{X}) = 0$ iff $\mathbf{X}$ is at most rank-2.
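The rank-2 detection property can be tried out on small matrices. The following is a deliberately naive sketch, assuming the mapping is the sum of the six 3-by-3 determinants obtained by taking the rows from X, Y and Z in every order; it loops over all index combinations, so it is only usable for tiny I and J. The function name `phi2` and the test matrices are illustrative.

```python
import itertools
import numpy as np

def phi2(X, Y, Z):
    """Naive evaluation of the trilinear rank-2 detection mapping.

    Entry (i1, i2, i3, j1, j2, j3) is the sum, over the six orderings
    (P, Q, S) of (X, Y, Z), of the determinant whose rows are
    P[i1, cols], Q[i2, cols], S[i3, cols] with cols = (j1, j2, j3)."""
    I, J = X.shape
    dtype = np.result_type(X, Y, Z, np.float64)
    out = np.zeros((I, I, I, J, J, J), dtype=dtype)
    for P, Q, S in itertools.permutations((X, Y, Z)):
        for i1, i2, i3 in itertools.product(range(I), repeat=3):
            for j1, j2, j3 in itertools.product(range(J), repeat=3):
                cols = [j1, j2, j3]
                M = np.array([P[i1, cols], Q[i2, cols], S[i3, cols]])
                out[i1, i2, i3, j1, j2, j3] += np.linalg.det(M)
    return out

rng = np.random.default_rng(0)
X_rank2 = rng.standard_normal((3, 2)) @ rng.standard_normal((2, 3))  # rank 2
X_rank3 = rng.standard_normal((3, 3))                                # generically rank 3

max_rank2 = np.abs(phi2(X_rank2, X_rank2, X_rank2)).max()  # numerically zero
max_rank3 = np.abs(phi2(X_rank3, X_rank3, X_rank3)).max()  # clearly nonzero
```

With all three arguments equal, every entry reduces to six times a 3-by-3 minor of X (possibly with repeated rows or columns), which is why the result vanishes exactly when all 3-by-3 minors do.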
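Reformulation of BCD-(2,2,1) in terms of SD: technical details
For $r = 1, \dots, R$, with all $\mathbf{E}_r$ and $\mathbf{X}_r$ of size $I \times J$:
$$\mathbf{E}_r = (\mathbf{W}^{-1})_{1r}\, \mathbf{X}_1 + \dots + (\mathbf{W}^{-1})_{Rr}\, \mathbf{X}_R$$
Build the set of $R^3$ tensors
$$\mathcal{P}_{rst} = \phi_2(\mathbf{E}_r, \mathbf{E}_s, \mathbf{E}_t), \qquad r = 1, \dots, R, \ s = 1, \dots, R, \ t = 1, \dots, R$$
Since $\phi_2$ is trilinear, we have:
$$\mathcal{P}_{rst} = \sum_{u,v,w=1}^{R} (\mathbf{W}^{-1})_{ur}\, (\mathbf{W}^{-1})_{vs}\, (\mathbf{W}^{-1})_{wt}\ \phi_2(\mathbf{X}_u, \mathbf{X}_v, \mathbf{X}_w)$$
One can show that, if the $(C_{R+2}^{3} - R)$ tensors of the set
$$\Omega = \{\phi_2(\mathbf{X}_u, \mathbf{X}_v, \mathbf{X}_w),\ 1 \le u \le v \le w \le R\} \setminus \{\phi_2(\mathbf{X}_u, \mathbf{X}_u, \mathbf{X}_u),\ 1 \le u \le R\}$$
are linearly independent, then W is the solution of
$$\mathcal{Q} = \mathcal{D} \times_1 \mathbf{W} \times_2 \mathbf{W} \times_3 \mathbf{W}$$
where $\mathcal{D}$ is an arbitrary diagonal tensor and $\mathcal{Q}$ is a symmetric tensor satisfying
$$\sum_{r,s,t=1}^{R} q_{rst}\, \mathcal{P}_{rst} = 0$$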
Reformulation of BCD-(2,2,1) in terms of SD: a new uniqueness bound
Crucial assumption in the reformulation: "The $(C_{R+2}^{3} - R)$ tensors of the set $\Omega$ are linearly independent".
One can show that this is generically true if
$$R \le \min(IJ, K) \quad \text{and} \quad C_I^{3}\, C_J^{3} \ge C_{R+2}^{3} - R, \qquad \text{where } C_n^{k} = \frac{n!}{k!\,(n-k)!}$$
The generalization to any value of L yields that the BCD-(L,L,1) is unique if
$$R \le \min(IJ, K) \quad \text{and} \quad C_I^{L+1}\, C_J^{L+1} \ge C_{R+L}^{L+1} - R$$
To be compared to the old uniqueness bound
$$LR \le IJ \quad \text{and} \quad \min\left(\left\lfloor \tfrac{I}{L} \right\rfloor, R\right) + \min\left(\left\lfloor \tfrac{J}{L} \right\rfloor, R\right) + \min(K, R) \ge 2(R+1)$$
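The two bounds are easy to compare numerically. The sketch below encodes each bound as a predicate and scans R to find the largest number of terms for which the condition holds; the floors in the old bound, the function names and the example dimensions are assumptions made for illustration.

```python
from math import comb

def old_bound_holds(I, J, K, L, R):
    """Old condition: LR <= IJ and
    min(floor(I/L), R) + min(floor(J/L), R) + min(K, R) >= 2(R + 1)."""
    return (L * R <= I * J
            and min(I // L, R) + min(J // L, R) + min(K, R) >= 2 * (R + 1))

def new_bound_holds(I, J, K, L, R):
    """New condition: R <= min(IJ, K) and
    C(I, L+1) * C(J, L+1) >= C(R+L, L+1) - R."""
    return (R <= min(I * J, K)
            and comb(I, L + 1) * comb(J, L + 1) >= comb(R + L, L + 1) - R)

def max_terms(bound, I, J, K, L):
    """Largest R in 1..IJ for which the given bound holds (0 if none)."""
    valid = [R for R in range(1, I * J + 1) if bound(I, J, K, L, R)]
    return max(valid) if valid else 0

# Hypothetical example: I = J = 4, K = 20, L = 2.
r_old = max_terms(old_bound_holds, 4, 4, 20, 2)
r_new = max_terms(new_bound_holds, 4, 4, 20, 2)
print(r_old, r_new)  # prints "2 4"
```

For these dimensions the old bound guarantees uniqueness up to R = 2 terms, while the new bound reaches R = 4.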
Reformulation of BCD-(2,2,1) in terms of SD: uniqueness
[Figure: two panels comparing the uniqueness bounds. Panel 1: new bound, L={2,3,4}. Panel 2: old bound, L={2,3,4}.]
IV. An application of the BCD-(L,L,1): blind source separation in telecommunications
Data model: DS-CDMA system
$$\mathcal{T} = \sum_{r=1}^{R} (\mathbf{H}_r \mathbf{S}_r^T) \circ \mathbf{a}_r$$
- $\mathbf{H}_r$ ($I \times L$): channel impulse response of user r (spans L symbol periods for each user)
- $\mathbf{S}_r$ ($J \times L$): symbols of user r, with Toeplitz structure (convolution)
- $\mathbf{a}_r$ (length K): array steering vector (response of the K antennas)
Dimensions:
- Fast time: I = number of samples within a symbol period
- Slow time: observation during J symbol periods
- Spatial dimension: K receiving antennas
- R users transmitting at the same time
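A synthetic instance of this data model can be generated as follows. This is a rough sketch under several assumptions that are not taken from the slides: BPSK symbols, random complex channels and steering vectors, no noise, and a Toeplitz layout in which entry (j, l) of the symbol matrix holds the symbol sent l periods before period j. The helper name `toeplitz_symbols` is hypothetical.

```python
import numpy as np

def toeplitz_symbols(s, J, L):
    """J-by-L Toeplitz matrix with entry (j, l) = s[j - l + L - 1],
    built from a sequence of J + L - 1 symbols."""
    s = np.asarray(s)
    if s.shape[0] != J + L - 1:
        raise ValueError("expected J + L - 1 symbols")
    idx = np.arange(J)[:, None] - np.arange(L)[None, :] + (L - 1)
    return s[idx]

rng = np.random.default_rng(2)
I, J, K, L, R = 4, 8, 3, 2, 2  # hypothetical sizes

T = np.zeros((I, J, K), dtype=complex)
S_list = []
for r in range(R):
    # Channel of user r (assumed random complex Gaussian).
    H_r = rng.standard_normal((I, L)) + 1j * rng.standard_normal((I, L))
    # Symbols of user r (assumed BPSK) arranged in a Toeplitz matrix.
    s_r = rng.choice([-1.0, 1.0], size=J + L - 1)
    S_r = toeplitz_symbols(s_r, J, L)
    # Steering vector of user r (assumed random, no array geometry).
    a_r = rng.standard_normal(K) + 1j * rng.standard_normal(K)
    # Accumulate the rank-(L,L,1) term (H_r S_r^T) o a_r.
    T += np.einsum("ij,k->ijk", H_r @ S_r.T, a_r)
    S_list.append(S_r)
```

Each user contributes one rank-(L,L,1) term, so the JI by K unfolding of the noise-free tensor has rank R.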
Performance: comparison between ALS and SD algorithms
Conclusion
- Reformulation of PARAFAC in terms of Simultaneous Diagonalization (SD) yields a fast and accurate algorithm, with improved identifiability results [De Lathauwer, 2005]. The starting point for this reformulation is that one dimension is long enough, where I, J and K can be interchanged.
- The BCD-(L,L,1), which is a generalization of PARAFAC, can also be formulated in terms of SD, which also yields a fast and accurate algorithm and an improved identifiability result. The starting point for this reformulation is that the third dimension (K) is long enough, $R \le \min(IJ, K)$. I, J and K cannot be interchanged.
- When the long dimension is I or J, i.e.,