SLIDE 1

Random Matrix Improved Covariance Estimation for a Large Class of Metrics

Malik TIOMOKO, Florent BOUCHARD, Guillaume GINOLHAC and Romain COUILLET

GSTATS IDEX DataScience Chair, GIPSA-lab, University Grenoble–Alpes, France. Laboratoire des Signaux et Systèmes (L2S), University Paris-Sud. LISTIC, University Savoie Mont-Blanc, France

June 10, 2019


SLIDE 2

Context

Observations:

◮ $X = [x_1, \ldots, x_n]$, $x_i \in \mathbb{R}^p$ with $E[x_i] = 0$ and $E[x_i x_i^T] = C$.

Objective:

◮ From the data xi, estimate C.

State of the Art:

◮ Sample Covariance Matrix (SCM):

$$\hat{C} = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^T = \frac{1}{n} X X^T.$$

$\rightarrow$ Often justified by the Law of Large Numbers: $n \to \infty$ (see the sketch after this slide).

◮ Numerical inversion of the asymptotic spectrum (QuEST):

1. Bai–Silverstein equation: expresses $\lambda(\hat{C})$ in terms of $\lambda(C)$ in the "large $p, n$" regime.
2. Recovering $\lambda(C)$ then requires a non-trivial inversion of this equation.

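For concreteness, a minimal numerical sketch of the SCM and its large-$n$ consistency (variable names are illustrative, not from the paper):

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance matrix C_hat = (1/n) X X^T.

    X is p x n: each column x_i is one zero-mean observation in R^p.
    """
    p, n = X.shape
    return (X @ X.T) / n

# With n >> p the SCM approaches the true C (law of large numbers);
# the "large p, n" regime discussed next is precisely where this breaks down.
rng = np.random.default_rng(0)
p, n = 4, 100_000
C = np.diag([1.0, 2.0, 3.0, 4.0])    # simple ground-truth covariance
X = np.linalg.cholesky(C) @ rng.standard_normal((p, n))
print(np.round(sample_covariance(X), 2))   # close to diag(1, 2, 3, 4)
```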

SLIDE 3

Key Idea

◮ Elementary idea: $C \equiv \operatorname{argmin}_{M \succ 0}\, \delta(M, C)$, where $\delta(M, C)$ can be the Fisher, Bhattacharyya, KL, or Rényi divergence.

◮ The divergence $\delta(M, C) = \int f(t)\, \nu_p(dt)$ is inaccessible, where $\nu_p \equiv \frac{1}{p} \sum_{i=1}^{p} \delta_{\lambda_i(M^{-1} C)}$ is the empirical spectral distribution of $M^{-1} C$ ($\delta_x$ the Dirac mass at $x$).

◮ Random-matrix-improved estimate $\hat{\delta}(M, X)$ of $\delta(M, C)$, built from the observable $\mu_p \equiv \frac{1}{p} \sum_{i=1}^{p} \delta_{\lambda_i(M^{-1} \hat{C})}$. Schematically, contour integrals of the Stieltjes transforms $m_{\nu_p}$ and $m_{\mu_p}$ link the inaccessible integral to an observable one:

$$\int f(t)\, \nu_p(dt) \;\longleftrightarrow\; \oint H(m_{\nu_p}(z))\, dz \;\longleftrightarrow\; \oint G(m_{\mu_p}(z))\, dz \;\longleftrightarrow\; \int h(t)\, \mu_p(dt).$$

(A naive plug-in version, without this correction, is sketched after this slide.)

◮ $\hat{\delta}(M, X) < 0$ with non-zero probability, hence the squared objective below.

◮ Proposed estimation: $\check{C} \equiv \operatorname{argmin}_{M \succ 0} h(M)$, with $h(M) = \hat{\delta}(M, X)^2$.

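To make these spectral measures concrete, here is a minimal sketch of the naive plug-in approach that $\hat{\delta}(M, X)$ improves upon: it evaluates $\int f(t)\, \mu_p(dt)$, i.e. it applies $f$ to the eigenvalues of $M^{-1} \hat{C}$ in place of those of $M^{-1} C$. The choice $f(t) = \log^2 t$ corresponds (up to a constant) to the squared Fisher distance; function names are illustrative, and the paper's bias correction is deliberately absent:

```python
import numpy as np
from scipy.linalg import eigvalsh

def plugin_divergence(M, C_hat, f=lambda t: np.log(t) ** 2):
    """Naive plug-in of mu_p for nu_p: (1/p) * sum_i f(lambda_i(M^{-1} C_hat)).

    The lambda_i(M^{-1} C_hat) are the generalized eigenvalues of the
    pencil (C_hat, M), computed stably by eigvalsh for symmetric M > 0.
    f(t) = log(t)^2 gives the (squared) Fisher-distance plug-in.
    """
    lam = eigvalsh(C_hat, M)   # eigenvalues of M^{-1} C_hat
    return np.mean(f(lam))

# When p and n are comparable, lambda(M^{-1} C_hat) spreads around
# lambda(M^{-1} C), biasing this estimate; that spread is exactly what the
# contour-integral (Stieltjes transform) construction corrects for.
```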

SLIDE 4

Algorithm

◮ Gradient descent over the positive definite manifold (a numerical sketch follows the algorithm).

Algorithm 1: Proposed estimation algorithm.
Require: $M_0 \in \mathcal{C}_n^{++}$ (the cone of positive definite matrices).
Repeat: $M \leftarrow M^{\frac{1}{2}} \exp\!\left(-t\, M^{-\frac{1}{2}} \nabla h_X(M)\, M^{-\frac{1}{2}}\right) M^{\frac{1}{2}}$.
Until convergence.
Return: $\check{C} = M$.
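A minimal sketch of this update, assuming a generic objective gradient `grad_h` standing in for the paper's $\nabla h_X$ (whose closed form is derived there); the helper names and the toy objective are illustrative only:

```python
import numpy as np
from scipy.linalg import expm

def spd_sqrt_and_inv_sqrt(M):
    """M^{1/2} and M^{-1/2} for a symmetric positive definite M, via eigh."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T, (V / np.sqrt(w)) @ V.T

def minimize_on_spd(grad_h, M0, t=1e-2, tol=1e-8, max_iter=500):
    """Gradient descent over the positive definite manifold.

    Implements the slide's update
        M <- M^{1/2} exp(-t M^{-1/2} grad_h(M) M^{-1/2}) M^{1/2},
    a step of size t along the manifold exponential map at M.
    """
    M = M0
    for _ in range(max_iter):
        M_half, M_half_inv = spd_sqrt_and_inv_sqrt(M)
        M_new = M_half @ expm(-t * M_half_inv @ grad_h(M) @ M_half_inv) @ M_half
        if np.linalg.norm(M_new - M, "fro") < tol:  # convergence criterion
            return M_new
        M = M_new
    return M  # plays the role of C_check

# Toy usage with h(M) = ||M - A||_F^2 / 2, gradient M - A (not the paper's h):
A = np.diag([1.0, 2.0, 3.0])
M_star = minimize_on_spd(lambda M: M - A, M0=np.eye(3), t=0.1)
print(np.round(M_star, 3))  # converges to A, which minimizes the toy h
```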


SLIDE 5

Experiments

◮ Two data classes: $x_1^{(1)}, \ldots, x_{n_1}^{(1)} \sim \mathcal{N}(\mu_1, C_1)$ and $x_1^{(2)}, \ldots, x_{n_2}^{(2)} \sim \mathcal{N}(\mu_2, C_2)$.

◮ Classify a point $x$ with Linear Discriminant Analysis, based on the sign of

$$\delta_x^{\mathrm{LDA}} = (\hat{\mu}_1 - \hat{\mu}_2)^T \check{C}^{-1} x + \tfrac{1}{2} \hat{\mu}_2^T \check{C}^{-1} \hat{\mu}_2 - \tfrac{1}{2} \hat{\mu}_1^T \check{C}^{-1} \hat{\mu}_1.$$

◮ Pooled estimate $\check{C} \equiv \frac{n_1}{n_1+n_2} \check{C}_1 + \frac{n_2}{n_1+n_2} \check{C}_2$ (a code sketch of the pipeline follows this list).
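A minimal sketch of this classification pipeline under the two-class Gaussian setup above, with the covariance estimator abstracted as an argument (plugging in the SCM gives the baseline; substituting the paper's $\check{C}$ estimator gives the proposed method; all names and the toy means are illustrative):

```python
import numpy as np
from scipy.linalg import toeplitz

def lda_score(x, mu1_hat, mu2_hat, C_inv):
    """The slide's statistic; the sign decides the class (positive -> class 1)."""
    return ((mu1_hat - mu2_hat) @ C_inv @ x
            + 0.5 * mu2_hat @ C_inv @ mu2_hat
            - 0.5 * mu1_hat @ C_inv @ mu1_hat)

def lda_classifier(X1, X2, cov_estimator):
    """Build the LDA rule from training samples (rows are observations)."""
    n1, n2 = len(X1), len(X2)
    mu1_hat, mu2_hat = X1.mean(axis=0), X2.mean(axis=0)
    C1_hat = cov_estimator(X1 - mu1_hat)
    C2_hat = cov_estimator(X2 - mu2_hat)
    C_pooled = (n1 * C1_hat + n2 * C2_hat) / (n1 + n2)  # slide's pooled estimate
    C_inv = np.linalg.inv(C_pooled)
    return lambda x: 1 if lda_score(x, mu1_hat, mu2_hat, C_inv) > 0 else 2

# Toy data mirroring the figure's left panel: Toeplitz-0.2 vs Toeplitz-0.4,
# i.e. C[i, j] = rho^|i - j|, with the SCM as the baseline estimator.
p, n1, n2 = 32, 64, 64
rng = np.random.default_rng(1)
X1 = rng.multivariate_normal(np.zeros(p), toeplitz(0.2 ** np.arange(p)), size=n1)
X2 = rng.multivariate_normal(0.5 * np.ones(p), toeplitz(0.4 ** np.arange(p)), size=n2)
scm = lambda Xc: Xc.T @ Xc / len(Xc)
classify = lda_classifier(X1, X2, scm)
print(classify(X1[0]))  # classifies one training point (likely 1)
```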

[Figure: two panels of mean accuracy. Left: accuracy (0.8 to 1) versus $(n_1+n_2)/p$ (2 to 6) for the synthetic setting. Right: accuracy (0.75 to 1) per EEG class pair B/E, A/E, B/D, A/D, B/C, A/C (Healthy/Epileptic). Curves: SCM, QuEST1, QuEST2, Proposed.]

Figure: Mean accuracy obtained over 10 realizations of LDA classification. (Left) $C_1$ and $C_2$ Toeplitz with coefficients 0.2 and 0.4; (Right) real EEG data.
