SLIDE 1

The Joint Block Diagonalization (JBD) problem: a tensor framework.

Dimitri Nion & Lieven De Lathauwer

K.U. Leuven, Kortrijk campus, Belgium

E-mails: Dimitri.Nion@kuleuven-kortrijk.be, Lieven.DeLathauwer@kuleuven-kortrijk.be

TDA 2010, Monopoli, Italy, September 13-17, 2010

SLIDE 2

Joint-Block-Diagonalization (JBD): model

[Figure: each observed I x I matrix X_k factors as X_k = A D_k A^T, where A = [A_1, …, A_R] is I x N with blocks A_r of size I x L_r, and D_k = blockdiag(D_1k, …, D_Rk) with blocks D_rk of size L_r x L_r.]

X_k = A D_k A^T (+ N_k)  or  X_k = A D_k A^H (+ N_k),   k = 1, …, K.

JBD is a generalization of JD (Joint Diagonalization)/INDSCAL: for L_1 = … = L_R = 1, each D_k is diagonal and every slice reduces to

X_k = c_1(k) a_1 a_1^T + … + c_R(k) a_R a_R^T.
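To make the model concrete, here is a minimal NumPy sketch (all dimensions and the random data are illustrative, not taken from the talk) that builds block-diagonal matrices D_k and the corresponding noiseless slices X_k = A D_k A^T:

```python
import numpy as np

rng = np.random.default_rng(0)
I, K = 6, 30
L = [2, 2, 2]             # block sizes L_1, ..., L_R
N = sum(L)                # A = [A_1, ..., A_R] is I x N

A = rng.standard_normal((I, N))

# Each D_k = blockdiag(D_1k, ..., D_Rk); slice X_k = A D_k A^T (exact model)
D = np.zeros((K, N, N))
X = np.empty((K, I, I))
for k in range(K):
    ofs = 0
    for Lr in L:
        D[k, ofs:ofs + Lr, ofs:ofs + Lr] = rng.standard_normal((Lr, Lr))
        ofs += Lr
    X[k] = A @ D[k] @ A.T
```

By construction the off-block-diagonal part of every D_k is zero, which is exactly the structure JBD tries to recover.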

SLIDE 3

JBD : ambiguities

[Figure: X_k = A D_k A^T = (A Z)(Z^{-1} D_k Z^{-T})(A Z)^T, so the pair Ã = A Z, D̃_k = Z^{-1} D_k Z^{-T} reproduces the same set of slices {X_k}_{k=1}^K.]

Observation: if you choose Z arbitrarily, you lose the JBD structure (the matrices {D̃_k}_{k=1}^K are no longer block-diagonal). Question: what is the structure of Z such that the JBD model is still valid?

SLIDE 4

JBD : essential uniqueness

[Figure: Ã = [Ã_1, Ã_2, Ã_3] equals [A_1, A_3, A_2] with each block multiplied by a nonsingular factor Λ_r: the estimated blocks match the true blocks up to scaling factors and a permutation of the blocks.]

Solving a JBD problem = estimation of {Span(A_r)}_{r=1,…,R}, in an arbitrary order. The JBD of {X_k}_{k=1}^K is said essentially unique if the only remaining ambiguity is Ã = A Λ Π, i.e., Z = Λ Π, where

  • Λ is an arbitrary block-diagonal matrix,
  • Π is an arbitrary block-wise permutation matrix.

SLIDE 5

JBD: State of the art

JBD is becoming a popular signal-processing tool in applications such as:

  • Blind Source Separation (BSS) of convolutive mixtures in the time domain,
  • Independent Subspace Analysis.

Two approaches in the literature:

  • Approach 1: Unitary-JBD [Abed-Meraim and Belouchrani, 2004]

« A is a square unitary matrix » (A^T A = I)

  • Approach 2: Non-Unitary JBD
  • Approach 2.1: [H. Ghennioui, N. Thirion-Moreau, E. Moreau, 2008, 2010]

« A is tall and full column-rank. » → In practice, their approach only works if A is a square non-unitary matrix.

  • Approach 2.2: this talk

« A can be a tall, square or fat non-unitary matrix »
→ JBD is a particular instance of Block-Component-Decompositions
→ Computation by a gradient-based algorithm
→ In the square case, better performance than 2.1

SLIDE 6
Joint-Block-Diagonalization : state of the art (1)

  • Approach 1: Unitary-JBD [Abed-Meraim and Belouchrani, 2004]
  • A is a square unitary matrix (A^T A = I)

X_k = A D_k A^T (+ N_k)   →   A^T X_k A = D_k + (A^T N_k A)

Criterion:

max_A Σ_{k=1}^K || bdiag(A^T X_k A) ||_F^2   ⇔   min_A Σ_{k=1}^K || offbdiag(A^T X_k A) ||_F^2

SLIDE 7
Joint-Block-Diagonalization : state of the art (2)

  • Approach 2: Non-Unitary-JBD [H. Ghennioui, N. Thirion-Moreau, E. Moreau, 2008, 2010]
  • A is tall and full column-rank (let B = A^†, then BA = I)

X_k = A D_k A^T (+ N_k)   →   B X_k B^T = D_k + (B N_k B^T)

Criterion:

max_B Σ_{k=1}^K || bdiag(B X_k B^T) ||_F^2   ⇔   min_B Σ_{k=1}^K || offbdiag(B X_k B^T) ||_F^2

SLIDE 8

Joint-Block-Diagonalization : state of the art (3)

  • Approach 2: Non-Unitary-JBD [H. Ghennioui, N. Thirion-Moreau, E. Moreau, 2008, 2010]
  • Two gradient-descent based algorithms, JBD-OG and JBD-ORG, to solve

min_B Σ_{k=1}^K || offbdiag(B X_k B^T) ||_F^2,   (1)

where B X_k B^T = (B A) D_k (B A)^T + B N_k B^T.

  • Drawbacks of this approach:
  • B = 0 is a trivial minimizer.
  • The under-determined case (A fat, I < N) is not handled, since it is assumed that BA = I.
  • In fact, the over-determined case (A tall, I > N) is not successfully handled either: if B = [B_1^T, B_2^T]^T is a solution of an exact JBD problem, then a matrix B̃ obtained by replacing one sub-block of B with rows orthogonal to the column space of A still satisfies offbdiag(B̃ X_k B̃^T) = 0, so B̃ is also a solution of (1); but B̃ is not a solution of the JBD problem, because B̃ A is not full rank.

SLIDE 9

Joint-Block-Diagonalization : our contributions

  • Starting point: the gradient-descent based algorithms JBD-OG and JBD-ORG of [H. Ghennioui, N. Thirion-Moreau, E. Moreau, 2008, 2010] for Non-Unitary JBD can only handle the case where A is square (I = N).
  • Main motivation: propose a novel technique to solve non-unitary JBD problems that can handle the exactly-, over- and under-determined cases (i.e., A may be square, tall or fat).
  • Main contributions:
  • Formulate JBD as a tensor decomposition fitting problem;
  • Build a Conjugate Gradient (CG) algorithm to compute the tensor decomposition;
  • In the over-determined case, build a good starting point for any JBD algorithm;
  • Application: blind source separation via Independent Subspace Analysis (ISA).
SLIDE 10

JBD in tensor format : link to BCD

Stack the matrices X_1, …, X_K as the frontal slices of an I x I x K tensor X, and the blocks D_r1, …, D_rK as the frontal slices of an L_r x L_r x K tensor D_r. Then

X = Σ_{r=1}^R D_r ×_1 A_r ×_2 A_r
(or X = Σ_{r=1}^R D_r ×_1 A_r ×_2 A_r^* in case of hermitian symmetry).

→ JD is a particular case of CANDECOMP/PARAFAC.
→ JBD is a particular case of the BCD-(L_r, M_r, .) (« Block-Component-Decomposition in rank-(L_r, M_r, .) terms »).

SLIDE 11

JBD in tensor format : conditions for essential uniqueness

  • BCD-(L_r, M_r, .):  X = Σ_{r=1}^R D_r ×_1 A_r ×_2 B_r,  where A = [A_1, …, A_R] is I by N and B = [B_1, …, B_R] is J by Q.
  • Theorem [De Lathauwer, 2008]: suppose that rank(A) = N, rank(B) = Q, K > 2 and that the tensors {D_r}_{r=1,…,R} are generic; then the BCD-(L_r, M_r, .) of X is essentially unique (sufficient condition).
  • JBD:  X = Σ_{r=1}^R D_r ×_1 A_r ×_2 A_r,  where A = [A_1, …, A_R] is I by N.
  • The same theorem can be invoked (the proof still holds with A instead of B).
  • In summary, it means that JBD is generically unique if K > 2 and rank(A) = N.
  • This is only a sufficient condition: uniqueness still holds, but is harder to prove, in several cases where the condition is not satisfied.
  • For instance, uniqueness may still hold when rank(A) = I (A fat, I < N).
SLIDE 12

JBD in tensor format : cost function

min_{A, {D_k}} φ_LS,   with   φ_LS = Σ_{k=1}^K || X_k − A D_k A^T ||_F^2 = || X − (A ⊙ A) D ||_F^2,

where:
→ A ⊙ A = [A_1 ⊗ A_1, …, A_R ⊗ A_R] is the block-wise Khatri-Rao product (block-wise Kronecker product),
→ X and N are the I^2 x K matrix unfoldings of the I x I x K tensors X and N, respectively,
→ D is the (Σ_r L_r^2) x K matrix whose k-th column stacks vec(D_1k), …, vec(D_Rk).
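The equality between the slice-wise cost and the unfolded cost can be checked numerically. The sketch below (illustrative sizes, random data) relies on the identity vec(A_r D_rk A_r^T) = (A_r ⊗ A_r) vec(D_rk), using column-major vec:

```python
import numpy as np

rng = np.random.default_rng(1)
I, K, Lsz = 6, 5, [2, 2, 2]
offs = np.cumsum([0] + Lsz)
N = offs[-1]
A = rng.standard_normal((I, N))

# Block-diagonal D_k (stored block-wise) and noisy slices X_k
Dblocks = [[rng.standard_normal((Lr, Lr)) for Lr in Lsz] for _ in range(K)]
X = np.empty((K, I, I))
for k in range(K):
    X[k] = 0.1 * rng.standard_normal((I, I))        # noise term N_k
    for r in range(len(Lsz)):
        Ar = A[:, offs[r]:offs[r + 1]]
        X[k] += Ar @ Dblocks[k][r] @ Ar.T

# Slice-wise cost: sum_k ||X_k - A D_k A^T||_F^2
phi_slices = 0.0
for k in range(K):
    M = X[k].copy()
    for r in range(len(Lsz)):
        Ar = A[:, offs[r]:offs[r + 1]]
        M -= Ar @ Dblocks[k][r] @ Ar.T
    phi_slices += np.linalg.norm(M) ** 2

# Unfolded cost: ||X - (A kr A) D||_F^2 with the block-wise Khatri-Rao product
KR = np.hstack([np.kron(A[:, offs[r]:offs[r + 1]], A[:, offs[r]:offs[r + 1]])
                for r in range(len(Lsz))])           # I^2 x sum(L_r^2)
Xmat = np.stack([X[k].flatten(order="F") for k in range(K)], axis=1)
Dmat = np.stack([np.concatenate([Dblocks[k][r].flatten(order="F")
                                 for r in range(len(Lsz))])
                 for k in range(K)], axis=1)
phi_unfold = np.linalg.norm(Xmat - KR @ Dmat) ** 2
```

The two values agree to machine precision, which confirms the unfolding used by the cost function.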

SLIDE 13

JBD in tensor format : derivation of the gradient

Let N_k denote the residual of slice k and N its I^2 x K unfolding.

Real-valued data: X_k = A D_k A^T + N_k

∇_A φ_LS = −2 Σ_{k=1}^K ( N_k A D_k^T + N_k^T A D_k ),   ∇_D φ_LS = −2 (A ⊙ A)^T N.

Complex-valued data, standard symmetry: X_k = A D_k A^T + N_k

∇_{A*} φ_LS = −Σ_{k=1}^K ( N_k A^* D_k^H + N_k^T A^* D_k^* ),   ∇_{D*} φ_LS = −(A ⊙ A)^H N.

Complex-valued data, hermitian symmetry: X_k = A D_k A^H + N_k

∇_{A*} φ_LS = −Σ_{k=1}^K ( N_k A D_k^H + N_k^H A D_k ),   ∇_{D*} φ_LS = −(A^* ⊙ A)^H N.
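The real-valued gradient with respect to A can be validated against finite differences; in this sketch the D_k are left unstructured (the block-diagonal constraint does not change the formula for ∇_A) and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
I, N, K = 5, 4, 6
A = rng.standard_normal((I, N))
D = rng.standard_normal((K, N, N))
X = np.stack([A @ D[k] @ A.T for k in range(K)]) \
    + 0.1 * rng.standard_normal((K, I, I))

def phi(A):
    return sum(np.linalg.norm(X[k] - A @ D[k] @ A.T) ** 2 for k in range(K))

def grad(A):
    # dphi/dA = -2 sum_k (N_k A D_k^T + N_k^T A D_k), with N_k the residual
    G = np.zeros_like(A)
    for k in range(K):
        Nk = X[k] - A @ D[k] @ A.T
        G += -2.0 * (Nk @ A @ D[k].T + Nk.T @ A @ D[k])
    return G

# Central finite-difference check on a few random entries
G = grad(A)
eps = 1e-6
for _ in range(5):
    i, j = rng.integers(I), rng.integers(N)
    Ap = A.copy(); Ap[i, j] += eps
    Am = A.copy(); Am[i, j] -= eps
    fd = (phi(Ap) - phi(Am)) / (2 * eps)
    assert abs(fd - G[i, j]) < 1e-4 * max(1.0, abs(G[i, j]))
```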

SLIDE 14

Conjugate Gradient algorithm for JBD: JBD-CG

While (stop criterion not satisfied):
  • 1. Compute ∇_A φ_LS and ∇_D φ_LS from A and D.
  • 2. Update search directions:
  • 2.1. Compute β (e.g., with the stabilized Polak-Ribière formula).
  • 2.2. New search directions: d_A ← −∇_A φ_LS + β d_A  and  d_D ← −∇_D φ_LS + β d_D.
  • 3. Joint Exact Line Search: (μ_A, μ_D) = argmin_{μ_A, μ_D} φ_LS(A + μ_A d_A, D + μ_D d_D).
  • 4. Update unknowns: A ← A + μ_A d_A  and  D ← D + μ_D d_D.
End while
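The loop above can be sketched in a few lines of NumPy. This is not the authors' implementation: it uses a plain Polak-Ribière+ formula and a simple backtracking line search in place of the stabilized β and the joint exact line search of the talk, and all sizes and data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
I, K, Lsz = 6, 10, [2, 2, 2]
offs = np.cumsum([0] + Lsz)
N = offs[-1]
mask = np.zeros((N, N), dtype=bool)          # block-diagonal support of the D_k
for r, Lr in enumerate(Lsz):
    mask[offs[r]:offs[r] + Lr, offs[r]:offs[r] + Lr] = True

A_true = rng.standard_normal((I, N))
D_true = np.stack([np.where(mask, rng.standard_normal((N, N)), 0.0)
                   for _ in range(K)])
X = np.stack([A_true @ D_true[k] @ A_true.T for k in range(K)])

def phi(A, D):
    return sum(np.linalg.norm(X[k] - A @ D[k] @ A.T) ** 2 for k in range(K))

def grads(A, D):
    GA, GD = np.zeros_like(A), np.zeros_like(D)
    for k in range(K):
        Nk = X[k] - A @ D[k] @ A.T
        GA += -2.0 * (Nk @ A @ D[k].T + Nk.T @ A @ D[k])
        GD[k] = np.where(mask, -2.0 * A.T @ Nk @ A, 0.0)  # keep D_k block-diagonal
    return GA, GD

A = A_true + 0.1 * rng.standard_normal((I, N))            # perturbed start
D = D_true.copy()
phi_start = phi(A, D)
GA, GD = grads(A, D)
dA, dD = -GA, -GD
for it in range(200):
    # backtracking line search; restart along steepest descent if it fails
    f0, mu = phi(A, D), 1e-2
    while mu > 1e-12 and phi(A + mu * dA, D + mu * dD) >= f0:
        mu *= 0.5
    if mu <= 1e-12:
        dA, dD = -GA, -GD
        continue
    A, D = A + mu * dA, D + mu * dD
    GA_new, GD_new = grads(A, D)
    beta = max(0.0, (np.sum(GA_new * (GA_new - GA))
                     + np.sum(GD_new * (GD_new - GD)))
                    / (np.sum(GA * GA) + np.sum(GD * GD)))  # Polak-Ribiere+
    dA, dD = -GA_new + beta * dA, -GD_new + beta * dD
    GA, GD = GA_new, GD_new
```

Starting close to the solution, the cost decreases by orders of magnitude, which is all this toy run is meant to show.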

SLIDE 15

Step 3 of JBD-CG : Joint Exact Line Search

→ Solve  min_{μ_A, μ_D} φ_LS(A + μ_A d_A, D + μ_D d_D), with

φ_LS(μ_A, μ_D) = || X − ((A + μ_A d_A) ⊙ (A + μ_A d_A)) (D + μ_D d_D) ||_F^2.   (1)

→ For a fixed μ_A, (1) is quadratic in μ_D, so the optimal μ_D is a ratio of two polynomials in μ_A:

μ_D = Q_1(μ_A) / Q_2(μ_A).   (2)

→ Substitute (2) in (1): φ_LS becomes a rational function of μ_A alone.   (3)

→ Minimize (3) w.r.t. μ_A (degree-7 polynomial rooting).
→ Given μ_A, μ_D is given by (2).

SLIDE 16

JBD in tensor format : a closed-form solution (when A is full column-rank)

  • Idea: when A is square or tall (full column-rank), the exact (noise-free) JBD problem can be solved by eigenvalue analysis. In the presence of noise, this solution can be used to initialize iterative algorithms.
  • Derivation in the square case (I = N). Take two matrices:

X_1 = A D_1 A^T,   X_2 = A D_2 A^T.

→ We have X_1 X_2^{-1} = A D_12 A^{-1}, where D_12 = D_1 D_2^{-1} is block-diagonal. So Span(A_r), of dimension L_r, is an invariant subspace of X_1 X_2^{-1}, r = 1, …, R.
→ EVD: X_1 X_2^{-1} = E Λ E^{-1}, where Λ is diagonal. We have A = E Π, where Π is a permutation matrix that groups the eigenvectors.
→ Find Π by checking the permuted block-diagonal structure of the matrices E^{-1} X_k E^{-T}, k = 1, …, K.
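The invariant-subspace property behind this closed form is easy to verify numerically. In the sketch below (random square case, illustrative sizes), every eigenvector of X_1 X_2^{-1} is checked to lie in one of the subspaces Span(A_r):

```python
import numpy as np

rng = np.random.default_rng(4)
Lsz = [2, 2, 2]
offs = np.cumsum([0] + Lsz)
N = offs[-1]
A = rng.standard_normal((N, N))            # square case I = N

def bdiag():
    D = np.zeros((N, N))
    for r, Lr in enumerate(Lsz):
        D[offs[r]:offs[r + 1], offs[r]:offs[r + 1]] = rng.standard_normal((Lr, Lr))
    return D

X1 = A @ bdiag() @ A.T
X2 = A @ bdiag() @ A.T

# EVD of X1 X2^{-1} = A (D1 D2^{-1}) A^{-1}
w, E = np.linalg.eig(X1 @ np.linalg.inv(X2))

# Distance from each eigenvector to each Span(A_r): the smallest should be ~0
min_dists = []
for j in range(N):
    v = E[:, j]
    dists = []
    for r in range(len(Lsz)):
        Ar = A[:, offs[r]:offs[r + 1]].astype(complex)
        proj = Ar @ np.linalg.lstsq(Ar, v, rcond=None)[0]
        dists.append(np.linalg.norm(v - proj))
    min_dists.append(min(dists))
```

Grouping the eigenvectors by the subspace they fall into recovers {Span(A_r)} up to the block-diagonal and block-permutation ambiguities.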

SLIDE 17

Performance index

  • We have: Â = A Λ Π + N, where
  • Λ is an unknown block-diagonal matrix,
  • Π is an unknown block-wise permutation matrix,
  • N is the estimation error.
  • Compute Λ and Π such that A Λ Π matches Â in the least-squares sense.
  • Compute the relative error:

ε_rel = || Â − A Λ Π ||_F / || A ||_F.
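A simple stand-in for this index (illustrative, not the authors' exact procedure): match each block of Â to its best block of A in the least-squares sense, greedily rather than via a joint fit over Λ and Π, and accumulate the residuals:

```python
import numpy as np

rng = np.random.default_rng(5)
Lsz = [2, 2]
offs = np.cumsum([0] + Lsz)
N = offs[-1]
A = rng.standard_normal((6, N))

# Simulated estimate: blocks rescaled (Lambda), swapped (Pi), plus small error
Lam = [rng.standard_normal((Lr, Lr)) for Lr in Lsz]
A_hat = np.hstack([A[:, 2:4] @ Lam[1], A[:, 0:2] @ Lam[0]])
A_hat += 1e-3 * rng.standard_normal(A_hat.shape)

def fit(Ar, Br):                        # residual of min_M ||Br - Ar M||_F
    M = np.linalg.lstsq(Ar, Br, rcond=None)[0]
    return np.linalg.norm(Br - Ar @ M)

resid = 0.0
for rb in range(len(Lsz)):              # each block of A_hat ...
    Br = A_hat[:, offs[rb]:offs[rb + 1]]
    # ... matched to the best-fitting block of A
    resid += min(fit(A[:, offs[r]:offs[r + 1]], Br)
                 for r in range(len(Lsz))) ** 2

eps_rel = np.sqrt(resid) / np.linalg.norm(A)
```

With a small simulated estimation error, eps_rel is small even though the blocks were rescaled and permuted, which is exactly the invariance the index is designed to have.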

SLIDE 18

Numerical experiments (1)

Exactly-determined case: I = N = 6, L1 = L2 = L3 = 2, R = 3, K = 30

Comparison of our JBD-CG technique with JBD-OG of [H. Ghennioui, N. Thirion-Moreau, E. Moreau, 2008, 2010]

SLIDE 19

Numerical experiments (2)

Under-determined case: I = 6, N = 8, L1 = L2 = L3 = L4 = 2, R = 4, K = 100

SLIDE 20

Application: Blind Subspace Separation

[Figure: each primary source s_r[p] is passed through L_r FIR filters h_r1, …, h_rL_r, yielding the secondary sources x_r1[p], …, x_rL_r[p]; the N secondary sources are stacked in x[p] and mixed as y[p] = A x[p].]

  • R primary sources (mutually uncorrelated).
  • Set of FIR filters.
  • N = Σ_{r=1}^R L_r secondary sources:
  • mutually correlated within the same subset,
  • mutually uncorrelated if not in the same subset.
  • Linear instantaneous mixing (M sensors), with observations y_1[p], …, y_M[p]. Mixing matrix: A is M by N.

SLIDE 21

Blind Subspace Separation via Second-Order-Statistics (SOS)

y[p] = A x[p]

  • Covariance matrix:

R_yy[τ] = E{ y[p + τ] y[p]^H } = A E{ x[p + τ] x[p]^H } A^H = A D[τ] A^H,

where D[τ] is block-diagonal because of the assumptions on the secondary sources.

  • Set of K covariance matrices:

R_yy[τ_1] = A D_1 A^H, …, R_yy[τ_K] = A D_K A^H   →   JBD problem!

  • Estimate A by the JBD-CG algorithm.
  • In the (over-)determined case (A square or tall), LS estimate of the secondary sources: x̂[p] = Â^† y[p].
  • Blind SIMO identification stage to recover the primary sources from the secondary ones; see, e.g., E. Moulines, P. Duhamel, J.-F. Cardoso, and S. Mayrargue, "Subspace Methods for the Blind Identification of Multichannel FIR Filters," IEEE Trans. Signal Proc., vol. 43, pp. 516–525, 1995.
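The block-diagonal structure of the lagged covariances can be seen on a toy simulation. Everything here is illustrative (white primary sources, random length-5 filters, two subsets of two secondary sources):

```python
import numpy as np

rng = np.random.default_rng(6)
P = 200_000
Lsz = [2, 2]                               # L_1 = L_2 = 2 filters per source

# R = 2 mutually uncorrelated (white) primary sources
s = rng.standard_normal((2, P))

# Secondary sources: each primary source through L_r random FIR filters
x = []
for r, Lr in enumerate(Lsz):
    for _ in range(Lr):
        h = rng.standard_normal(5)         # random length-5 FIR filter
        x.append(np.convolve(s[r], h, mode="same"))
x = np.array(x)                            # N = 4 secondary sources

def cov(x, tau):                           # empirical E{x[p+tau] x[p]^T}
    return x[:, tau:] @ x[:, :P - tau].T / (P - tau)

scale = np.linalg.norm(cov(x, 0)[:2, :2])
for tau in [0, 1, 2]:
    C = cov(x, tau)
    # cross-covariance between the two subsets is ~0: block-diagonal structure
    assert np.linalg.norm(C[:2, 2:]) < 0.05 * scale
```

Up to estimation noise, every lagged covariance is block-diagonal in the basis of the secondary sources, which is what turns the set {R_yy[τ_k]} into a JBD problem after mixing by A.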

SLIDE 22

Practical example: separation of speech signals

  • R = 3 primary sources.
  • Number of filters for each source: L1 = L2 = L3 = 4; each filter of length 50, generated randomly → N = 12 secondary sources.
  • M = 14 sensors (A is 14 by 12).
  • K = 21 covariance matrices.

[Figure: waveforms of the primary sources s1, s2, s3.]

SLIDE 23

Practical example: separation of speech signals

[Figure: waveforms of the M = 14 observed mixtures y1, y2, y3, …, y14.]

The JBD of the K covariance matrices of the observations yields Â and the block-diagonal matrices {D_k}_{k=1}^K; the secondary sources are then estimated as x̂[p] = Â^† y[p].

SLIDE 24

Practical example: separation of speech signals

[Figure: waveforms of the estimated primary sources ŝ1, ŝ2, ŝ3.]

→ Separation is successful.

SLIDE 25

Conclusion

  • JBD is a generalization of JD.
  • JD is a particular case of CANDECOMP/PARAFAC; JBD is a particular case of the Block-Component-Decomposition (BCD).
  • Uniqueness conditions for the BCD can be invoked.
  • In the exactly- and over-determined cases, we proposed an EVD-based technique useful for good initialization of JBD algorithms.
  • We proposed a JBD-CG algorithm that works in the exactly-, over- and under-determined cases.
  • Application: JBD-CG can accurately achieve blind subspace separation based on Second-Order Statistics.