
SLIDE 1

Introduction Duality Statistical properties Limit laws for the sum Large deviation

Matrix-correlated random variables: A statistical physics and signal processing duet

Florian Angeletti
Work in collaboration with Hugo Touchette, Patrice Abry and Eric Bertin.
13 January 2015

SLIDE 2


Presentation

Thesis: "Sums and extremes in statistical physics and signal processing"
Advisors: Eric Bertin and Patrice Abry, Physics laboratory of ENS Lyon.
Postdoc: NITheP, Stellenbosch, South Africa, working with Hugo Touchette on large deviation theory.
Themes:
- Application of statistical physics to signal processing
- Extreme statistics
- Random vectors with matrix representation
- Large deviation functions

SLIDE 3


Out-of-equilibrium statistical physics

At equilibrium:
- Microcanonical ensemble: $p(x_1, \dots, x_n) = \mathrm{cst}$
- Canonical ensemble: $p(x_1, \dots, x_n) = e^{-\beta H(x_1, \dots, x_n)}$
Out-of-equilibrium:
- Constant flow of heat or particles
- Dynamic description
- Stationary distribution?

SLIDE 4


ASEP

A simple and iconic out-of-equilibrium system:
- Asymmetric: particles only move from left to right
- Exclusion: at most one particle per site
- Creation rate α
- Destruction rate β
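The rules above can be illustrated with a small Monte Carlo sketch. It assumes a random sequential update in discrete time, with particles created on the leftmost site with probability α and destroyed on the rightmost site with probability β. The update scheme, the function name `simulate_tasep` and the parameter values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def simulate_tasep(n_sites, alpha, beta, n_steps, rng):
    """Random sequential update of a totally asymmetric exclusion process.

    Assumed scheme: at each step one of the n_sites + 1 bonds is chosen
    uniformly (bond 0 = left reservoir, bond n_sites = right reservoir).
    """
    occ = np.zeros(n_sites, dtype=int)          # 0 = empty, 1 = occupied
    for _ in range(n_steps):
        bond = rng.integers(0, n_sites + 1)
        if bond == 0:
            # creation on the first site, only if it is empty
            if occ[0] == 0 and rng.random() < alpha:
                occ[0] = 1
        elif bond == n_sites:
            # destruction on the last site, only if it is occupied
            if occ[-1] == 1 and rng.random() < beta:
                occ[-1] = 0
        elif occ[bond - 1] == 1 and occ[bond] == 0:
            # hop to the right, allowed by the exclusion rule
            occ[bond - 1], occ[bond] = 0, 1
    return occ

occupation = simulate_tasep(n_sites=20, alpha=0.6, beta=0.4,
                            n_steps=20000, rng=np.random.default_rng(0))
```

The returned array is one configuration (x_1, ..., x_n) with x_k in {0, 1}; repeating the run gives samples whose stationary statistics are what the rest of the talk describes.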

SLIDE 5


Matrix-correlated random variable

How do we describe the stationary solution?
Matrix product ansatz (Derrida et al. 1993):
$p(x_1, \dots, x_n) = \frac{\langle W | R(x_1) \cdots R(x_n) | V \rangle}{\langle W | (R(0) + R(1))^n | V \rangle}$
- Matrix $R(x)$
- Long-range correlation
- Similar solution for 1D diffusion-reaction systems
- Formal similarity with DMRG

SLIDE 6


Objectives

Mathematical model: $p(x_1, \dots, x_n) \approx R(x_1) \cdots R(x_n)$
Study the statistical properties of these models:
- Hidden Markov model representation
- Signal processing application
- Topology induces correlation
- Large deviation functions
- Limit distributions for the sums
- Limit distributions for the extremes
Then go back to physical models

SLIDE 7


Matrix representation

$p(x_1, \dots, x_n) = \frac{L(R(x_1) \cdots R(x_n))}{L(E^n)}$
- Linear form $L$: $L(M) = \mathrm{tr}(A^T M)$
- $A$: $d \times d$ positive matrix
- $R(x)$: $d \times d$ positive matrix function
- Structure matrix: $E = \int_{\mathbb{R}} R(x)\,dx$
- Probability density function matrix: $R_{i,j}(x) = E_{i,j} P_{i,j}(x)$
- $d > 1$: Non-commutativity $\Rightarrow$ Correlation
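As a minimal sketch of the definition above, the joint density can be evaluated by multiplying the matrices R(x_k), each built entrywise as E_{i,j} P_{i,j}(x_k). The Gaussian choice for P_{i,j}, the numerical values of A, E and the means, and the names `gaussian_pdf` and `joint_density` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Gaussian density, evaluated entrywise when `mean` is a matrix."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def joint_density(xs, A, E, means, std=1.0):
    """p(x_1, ..., x_n) = L(R(x_1) ... R(x_n)) / L(E^n), with L(M) = tr(A^T M)."""
    d = E.shape[0]
    prod = np.eye(d)
    for x in xs:
        # R(x)_ij = E_ij * P_ij(x), an entrywise product
        prod = prod @ (E * gaussian_pdf(x, means, std))
    norm = np.trace(A.T @ np.linalg.matrix_power(E, len(xs)))
    return np.trace(A.T @ prod) / norm

# Assumed example with d = 2
A = np.ones((2, 2))
E = np.array([[0.8, 0.2], [0.3, 0.7]])
means = np.array([[0.0, 2.0], [-2.0, 0.0]])   # mean of each P_ij

p = joint_density([0.1, -0.4, 1.3], A, E, means)
```

Because the integral of R(x) over x is E, integrating this density over all coordinates returns L(E^n) / L(E^n) = 1, which gives a quick numerical sanity check.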

SLIDE 8


Correlation

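Product structure: $p(x_1, \dots, x_n) = \frac{L(R(x_1) \cdots R(x_n))}{L(E^n)}$

Moment matrix: $Q(q) = \int x^q R(x)\,dx$

$\langle X_k^p \rangle = \frac{L(E^{k-1} Q(p) E^{n-k})}{L(E^n)}$

$\langle X_k X_l \rangle = \frac{L(E^{k-1} Q(1) E^{l-k-1} Q(1) E^{n-l})}{L(E^n)}$

$\langle X_k X_l X_m \rangle = \frac{L(E^{k-1} Q(1) E^{l-k-1} Q(1) E^{m-l-1} Q(1) E^{n-m})}{L(E^n)}$

$\dots$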
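The moment formulas translate directly into matrix products. The sketch below assumes Gaussian densities P_{i,j} with mean mu_{i,j} and a common standard deviation sigma, so that Q(1) = E ∘ mu and Q(2) = E ∘ (mu² + sigma²) entrywise. The helper names and the numerical values are hypothetical.

```python
import numpy as np
from numpy.linalg import matrix_power as mpow

def linear_form(A, M):
    """L(M) = tr(A^T M)."""
    return np.trace(A.T @ M)

def moment(k, n, A, E, Qp):
    """<X_k^p> computed from the moment matrix Qp = Q(p), for 1 <= k <= n."""
    return linear_form(A, mpow(E, k - 1) @ Qp @ mpow(E, n - k)) / linear_form(A, mpow(E, n))

def pair_moment(k, l, n, A, E, Q1):
    """<X_k X_l> for 1 <= k < l <= n."""
    M = mpow(E, k - 1) @ Q1 @ mpow(E, l - k - 1) @ Q1 @ mpow(E, n - l)
    return linear_form(A, M) / linear_form(A, mpow(E, n))

# Assumed example: Gaussian P_ij with means mu_ij and standard deviation sigma
A = np.ones((2, 2))
E = np.array([[0.8, 0.2], [0.3, 0.7]])
mu = np.array([[0.0, 2.0], [-2.0, 0.0]])
sigma = 1.0
Q1, Q2 = E * mu, E * (mu ** 2 + sigma ** 2)

n = 10
mean_3 = moment(3, n, A, E, Q1)                      # <X_3>
second_3 = moment(3, n, A, E, Q2)                    # <X_3^2>
cov_3_7 = pair_moment(3, 7, n, A, E, Q1) - mean_3 * moment(7, n, A, E, Q1)
```

When all the means are equal the matrices Q(1) and E are proportional, and the covariance vanishes, which is the commutative (uncorrelated) case.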

SLIDE 9


Stationarity

Translation invariance: $p(X_{k_1} = x_1, \dots, X_{k_l} = x_l) = p(X_{c+k_1} = x_1, \dots, X_{c+k_l} = x_l)$

Sufficient condition: $[A^T, E] = A^T E - E A^T = 0$, equivalently $\forall M,\ L(ME) = L(EM)$

$p(X_k = x) = \frac{L(R(x) E^{n-1})}{L(E^n)}$

$p(X_k = x, X_l = y) = \frac{L(R(x) E^{l-k-1} R(y) E^{n-|l-k|-1})}{L(E^n)}$

$p(X_k = x, X_l = y, X_m = z) = \frac{L(R(x) E^{l-k-1} R(y) E^{m-l-1} R(z) E^{n-|m-k|-1})}{L(E^n)}$
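The effect of the commutation condition can be checked numerically on the mean ⟨X_k⟩ = L(E^{k-1} Q(1) E^{n-k}) / L(E^n). In the sketch below, the matrices, the Gaussian-mean parametrisation Q(1) = E ∘ means, and the function names are assumptions chosen for illustration: A = identity commutes with E, while the second choice of A does not.

```python
import numpy as np

def L(A, M):
    """Linear form L(M) = tr(A^T M)."""
    return np.trace(A.T @ M)

def mean_at(k, n, A, E, Q1):
    """<X_k> = L(E^{k-1} Q(1) E^{n-k}) / L(E^n)."""
    mp = np.linalg.matrix_power
    return L(A, mp(E, k - 1) @ Q1 @ mp(E, n - k)) / L(A, mp(E, n))

E = np.array([[0.8, 0.2], [0.3, 0.7]])
means = np.array([[0.0, 2.0], [-2.0, 0.0]])
Q1 = E * means
n = 6

A_comm = np.eye(2)                             # [A^T, E] = 0
A_non = np.array([[1.0, 0.0], [0.0, 0.0]])     # [A^T, E] != 0

means_comm = [mean_at(k, n, A_comm, E, Q1) for k in range(1, n + 1)]
means_non = [mean_at(k, n, A_non, E, Q1) for k in range(1, n + 1)]
```

With the commuting A the list `means_comm` is constant in k; with the non-commuting A the mean depends on the position k, so the vector is not translation invariant.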

SLIDE 10


Numerical generation

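$p(x_1, \dots, x_n) = \frac{L(R(x_1) \cdots R(x_n))}{L(E^n)}$

How do we generate a random vector X for a given triple (A, E, P)?

Expand the matrix product:

$p(x_1, \dots, x_n) = \frac{1}{L(E^n)} \sum_{\Gamma \in \{1, \dots, d\}^{n+1}} A_{\Gamma_1, \Gamma_{n+1}} E_{\Gamma_1, \Gamma_2} P_{\Gamma_1, \Gamma_2}(x_1) \cdots E_{\Gamma_n, \Gamma_{n+1}} P_{\Gamma_n, \Gamma_{n+1}}(x_n)$

$p(x_1, \dots, x_n) = \sum_{\Gamma} P(\Gamma) P(X | \Gamma)$

$\Gamma$: hidden Markov chain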

SLIDE 11


Hidden Markov Model

- Markov chain $\Gamma$
- Observable $X = (X_1, \dots, X_n)$
- $(X_k | \Gamma_k)$ is distributed according to the pdf $p(X_k | \Gamma_k)$

SLIDE 12


Hidden Markov Chain representation

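Hidden Markov chain: $p(\Gamma) = \frac{A_{\Gamma_1, \Gamma_{n+1}}}{L(E^n)} \prod_{k} E_{\Gamma_k, \Gamma_{k+1}}$

Conditional pdf $(X | \Gamma)$: $p(X_k = x | \Gamma) = P_{\Gamma_k, \Gamma_{k+1}}(x)$

E non-stochastic $\Rightarrow$ Non-homogeneous Markov chain

Specific non-homogeneous hidden Markov model:

[Diagram: Hidden Markov Model, Matrix representation]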
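One way to turn this representation into a sampler is sketched below. It draws the two endpoints of Γ jointly with weight A_{i,j} (E^n)_{i,j}, fills in the chain with bridge steps weighted by powers of E, then draws each X_k from P_{Γ_k, Γ_{k+1}}. The bridge-sampling scheme, the Gaussian choice for P_{i,j}, and the names `sample_hidden_chain` and `sample_vector` are assumptions made for illustration, not necessarily the procedure used by the author.

```python
import numpy as np

def sample_hidden_chain(n, A, E, rng):
    """Draw Gamma = (Gamma_1, ..., Gamma_{n+1}) with p(Gamma) ∝ A[G_1, G_{n+1}] * prod_k E[G_k, G_{k+1}]."""
    d = E.shape[0]
    powers = [np.eye(d)]                      # powers[m] = E^m
    for _ in range(n):
        powers.append(powers[-1] @ E)
    # Endpoints: P(G_1 = i, G_{n+1} = j) ∝ A_ij (E^n)_ij
    w = A * powers[n]
    flat = rng.choice(d * d, p=(w / w.sum()).ravel())
    start, end = divmod(flat, d)
    chain = [start]
    for k in range(1, n):
        # Bridge step: P(G_{k+1} = b | G_k = a, G_{n+1} = end) ∝ E_ab (E^{n-k})_{b,end}
        a = chain[-1]
        probs = E[a, :] * powers[n - k][:, end]
        chain.append(rng.choice(d, p=probs / probs.sum()))
    chain.append(end)
    return np.array(chain)

def sample_vector(n, A, E, means, std, rng):
    """Draw X_k | Gamma from a Gaussian P_ij with mean means[G_k, G_{k+1}] (assumed form)."""
    chain = sample_hidden_chain(n, A, E, rng)
    mu = means[chain[:-1], chain[1:]]
    return chain, rng.normal(mu, std)

rng = np.random.default_rng(0)
A = np.ones((2, 2))
E = np.array([[0.8, 0.2], [0.3, 0.7]])
means = np.array([[0.0, 2.0], [-2.0, 0.0]])
chain, x = sample_vector(100, A, E, means, 1.0, rng)
```

The hidden layer carries all the correlation; once Γ is fixed, the X_k are drawn independently, which is what makes the synthesis cheap.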

SLIDE 13


Stationary time series design

Generation of random vectors with prescribed correlation and marginal distribution:
- Matrix representation: choice of (A, E, P)
- Hidden Markov Model: numerical generation

Higher-order dependency structure: correlation of squares

[Figure panels: Realization, Marginal, Correlation, Square corr.; legend: Prescribed]

SLIDE 14


Dual representation

Matrix representation:
- Algebraic properties
- Statistical properties computation
Hidden Markov Model:
- 2-layer model: correlated layer + independent layer
- Efficient synthesis

SLIDE 15


Correlation and Jordan decomposition

$\langle X_k X_l \rangle = \frac{L(E^{k-1} Q(1) E^{l-k-1} Q(1) E^{n-l})}{L(E^n)}$

The dependency structure of X depends on the behavior of $E^n$.
- $\lambda_k$: eigenvalues of E ordered by their real parts, $\Re(\lambda_1) \geq \Re(\lambda_2) > \dots > \Re(\lambda_r)$
- $J_{k,l}$: Jordan block associated with eigenvalue $\lambda_k$

$E = B^{-1} \begin{pmatrix} J_{1,1} & & \\ & \ddots & \\ & & J_{k,l} \end{pmatrix} B, \quad J_{k,l} = \begin{pmatrix} \lambda_k & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_k \end{pmatrix}$
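To see how the spectrum of E controls the correlation, the sketch below compares the decay of the covariance with the eigenvalue ratio λ_2/λ_1 for a diagonalizable example. The matrices are assumed values, A is taken as the identity so that L reduces to the trace, and the function name `covariance` is hypothetical.

```python
import numpy as np

E = np.array([[0.8, 0.2], [0.3, 0.7]])
Q1 = E * np.array([[0.0, 2.0], [-2.0, 0.0]])   # Q(1) = E ∘ means (assumed Gaussian means)

# Eigenvalues ordered by decreasing real part
eigvals = sorted(np.linalg.eigvals(E).real, reverse=True)
decay = eigvals[1] / eigvals[0]                # expected geometric decay rate

def covariance(lag, n=60):
    """<X_1 X_{1+lag}> - <X_1><X_{1+lag}> with A = identity, i.e. L = trace."""
    mp = np.linalg.matrix_power
    norm = np.trace(mp(E, n))
    pair = np.trace(Q1 @ mp(E, lag - 1) @ Q1 @ mp(E, n - lag - 1)) / norm
    mean = np.trace(Q1 @ mp(E, n - 1)) / norm  # stationary mean, independent of position
    return pair - mean ** 2

# Ratio of covariances at successive lags, to be compared with `decay`
ratios = [covariance(m + 1) / covariance(m) for m in range(1, 6)]
```

For this E the eigenvalues are 1 and 0.5, and the successive covariance ratios match the ratio of the two eigenvalues, which is the exponential decay expected when all Jordan blocks have size one.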

SLIDE 16


Dependency structure

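Case 1: Short-range correlation
$\lambda_2$ exists: $\langle X_k X_l \rangle - \langle X_k \rangle \langle X_l \rangle \approx \sum_{m>1} \alpha_m \left( \frac{\lambda_m}{\lambda_1} \right)^{|k-l|}$

[Plot: Corr(1, k) as a function of k]

Case 2: Constant correlation
More than one block $J_{1,k}$: constant correlation term

Case 3: Long-range correlation
$J_{1,k}$ with size $p > 1$: $\langle X_k X_l \rangle - \langle X_k \rangle \langle X_l \rangle \approx P\left( \frac{k}{n}, \frac{k-l}{n}, \frac{l}{n} \right), \quad P \in \mathbb{R}[X, Y, Z]$

[Plot: correlation as a function of k and l]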

SLIDE 17


Short-range correlation: Ergodic chain

- E irreducible $\Leftrightarrow$ $\Gamma$ ergodic
- Irreducible matrix E $\Leftrightarrow$ $G(E)$ is strongly connected
- Short-range correlation

SLIDE 18


Constant correlation: Identity E

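Disconnected components: $E = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix}$

The chain $\Gamma$ is trapped inside its starting state.

Constant correlation (for A normalized so that $L(E) = 1$):

$\langle X_k X_l \rangle - \langle X_k \rangle \langle X_l \rangle = \frac{L(Q(1)^2) - L(Q(1))^2}{L(E)}$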
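A quick numerical check of the constant correlation is sketched below for d = 2. The diagonal of A plays the role of mixture weights and is normalized so that L(E) = 1; the weights, the means on the diagonal of Q(1), and the function names are assumed values chosen for illustration.

```python
import numpy as np

weights = np.array([0.25, 0.75])     # diagonal of A, normalized so that L(E) = 1
mu = np.array([0.0, 2.0])            # mean of P_ii for each trapped state (assumed)
A = np.diag(weights)
E = np.eye(2)
Q1 = np.diag(mu)                     # Q(1)_ii = E_ii * mu_i

def L(M):
    """Linear form L(M) = tr(A^T M)."""
    return np.trace(A.T @ M)

def covariance(k, l, n):
    """<X_k X_l> - <X_k><X_l> from the matrix moment formulas, for k < l."""
    mp = np.linalg.matrix_power
    norm = L(mp(E, n))
    pair = L(mp(E, k - 1) @ Q1 @ mp(E, l - k - 1) @ Q1 @ mp(E, n - l)) / norm
    mean_k = L(mp(E, k - 1) @ Q1 @ mp(E, n - k)) / norm
    mean_l = L(mp(E, l - 1) @ Q1 @ mp(E, n - l)) / norm
    return pair - mean_k * mean_l

covs = [covariance(1, l, 50) for l in range(2, 51)]
closed_form = (L(Q1 @ Q1) - L(Q1) ** 2) / L(E)
```

Every entry of `covs` equals `closed_form`, whatever the distance between the two positions: the covariance is the variance of the state-dependent mean under the mixture weights.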

SLIDE 19


Long-range correlation: Linear irreversible E

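Irreversible transitions:

[Diagram: linear chain of states labelled n, n-1, ..., 1]

$E = \begin{pmatrix} 1 & \epsilon & \\ & \ddots & \ddots \\ & & 1 \end{pmatrix}$

- The chain $\Gamma$ can only stay in its current state or jump to the next one.
- All chains with a non-zero probability and the same starting and ending points are equiprobable.

Polynomial correlation: $\langle X_k X_l \rangle \approx \sum_{r+s+t=d-1} c_{r,s,t} \left( \frac{k}{n} \right)^r \left( \frac{l-k}{n} \right)^s \left( 1 - \frac{l}{n} \right)^t$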

SLIDE 20


General shape of E

$E = \begin{pmatrix} I_1 & * & T_{k,l} \\ & \ddots & * \\ & & I_r \end{pmatrix}$

[Graph: example transition graph with 26 numbered states]

- Irreducible blocks $I_k$
- Irreversible transitions $T_{k,l}$
- Correlation: mixture of short-range, constant and long-range correlations

SLIDE 21


Summary

- Short-range correlation $\Rightarrow$ strongly connected component of size $s > 1$
- More than one weakly connected component $\Rightarrow$ constant correlation
- Polynomial correlation $\Rightarrow$ more than one strongly connected component

Necessary but not sufficient conditions

SLIDE 22


Random vector sum

Sum: $S(X) = \frac{1}{n} \sum_{i=1}^{n} X_i$

Correlated random variables:
- Law of large numbers?
- Central limit theorem?
- Large deviations?

Two paths:
- Hidden Markov chain representation
- Matrix representation

SLIDE 23


Hidden Markov path

$S(X|\Gamma)$: sum of sums of i.i.d. random variables:

$S(X|\Gamma) = \sum_{i,j} \sum_{k=1}^{n \nu_{i,j}} (X_k | i, j) \equiv S(X|\nu)$

$\nu_{i,j}$: fraction of $(i \to j)$ transitions: $\nu = \left( \frac{\mathrm{card}\{k : \Gamma_k = i, \Gamma_{k+1} = j\}}{n} \right)_{i,j}$

Standard convergence theorems (law of large numbers or central limit theorem)
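The regrouping of the sum by transition type can be sketched in a few lines. The toy chain, the observations and the function names `transition_fractions` and `grouped_sums` are illustrative assumptions.

```python
import numpy as np

def transition_fractions(chain, d):
    """nu_ij = card{k : Gamma_k = i, Gamma_{k+1} = j} / n for a chain of length n + 1."""
    chain = np.asarray(chain)
    counts = np.zeros((d, d))
    np.add.at(counts, (chain[:-1], chain[1:]), 1.0)   # accumulate one count per transition
    return counts / (len(chain) - 1)

def grouped_sums(x, chain, d):
    """Sum of the X_k grouped by the transition (i -> j) they were drawn from."""
    chain = np.asarray(chain)
    sums = np.zeros((d, d))
    np.add.at(sums, (chain[:-1], chain[1:]), x)
    return sums

# Toy example with d = 2 states (0-based labels) and n = 4 observations
chain = np.array([0, 0, 1, 1, 0])
x = np.array([0.3, 1.9, -0.2, -2.1])

nu = transition_fractions(chain, 2)
S = grouped_sums(x, chain, 2)
```

Here each of the four transition types occurs once, so every entry of `nu` is 1/4, and the entries of `S` add up to the plain sum of the observations: conditionally on Γ, the total only depends on the chain through ν.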

SLIDE 24


Layer combination

$p(S(X) = s) = \sum_{\nu} p(\nu)\, p(S(X|\nu) = s)$

Limit distribution for $S(X|\nu)$: $p(S(X|\nu) = s)$

+

Limit distribution for $\nu$: $p(\nu)$

⇓

Limit distribution for $S(X)$: $p(S(X) = s)$

SLIDE 25


Distribution of ν

How is $\nu$ distributed?

Difficulty: $\Gamma$ is a non-homogeneous Markov chain.

Three important subclasses:
- Irreducible E (short-range correlation): $\nu$ converges towards a Dirac distribution
- Identity E (constant correlation): $\nu$ converges towards a discrete mixture of Dirac distributions
- Linear irreversible E (long-range polynomial correlation): $\nu$ converges towards a uniform distribution on a d-simplex

SLIDE 26


Limit laws for core examples

- Irreducible E (short-range correlations): standard limit laws
- Identity E (constant correlation): discrete mixture of standard limit laws
- Linear irreversible E (long-range correlation): continuous mixture of limit laws

SLIDE 27


General case

[Graph: example transition graph with 26 numbered states]

Combinations of three core behaviors:
- Irreducible blocks: fast convergence to the stationary state: Dirac distribution
- Separated components: discrete mixture
- Irreversible transitions: continuous mixture

Limit laws:
- Discrete mixture of continuous mixtures of standard limit distributions

SLIDE 28


Large deviation function

Sum: $S(X) = \frac{1}{n} \sum_{i=1}^{n} X_i$

- Law of large numbers: existence of a concentration point
- Central limit theorem: fluctuations around this concentration point
- Large deviations: fluctuations far away from the concentration points

Large deviation principle: $P(S(X) = s) \approx e^{-n I(s)}$

Do we have a large deviation principle?

SLIDE 29


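Gärtner-Ellis theorem

Generating function: $g_n(w) = \langle e^{w S_n} \rangle$, with $S_n = \sum_{i=1}^{n} X_i$ the unnormalized sum

Scaled cumulant generating function $\Phi(w)$: $\Phi(w) = \lim_{n \to \infty} \frac{\ln g_n(w)}{n}$

Gärtner-Ellis theorem: if $\Phi(w)$ exists and is differentiable, then $I(s)$ exists and is given by

$I(s) = \sup_{w} \{ ws - \Phi(w) \}$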

SLIDE 30


Gärtner-Ellis theorem for i.i.d. random variables

$\Phi(w) = g(w)$, with $g(w)$ the cumulant generating function

Large deviation principle: $I(s) = \sup_{w} \{ ws - g(w) \}$
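The supremum can be approximated numerically by a maximum over a grid of w values. The sketch below does this for i.i.d. Gaussian variables, an assumed example for which the cumulant generating function g(w) = μw + σ²w²/2 and the rate function I(s) = (s − μ)²/(2σ²) are known in closed form. The function name `legendre_transform` and the grid bounds are hypothetical choices.

```python
import numpy as np

def legendre_transform(phi, s_values, w_grid):
    """Approximate I(s) = sup_w {w s - phi(w)} by a maximum over w_grid."""
    phi_w = phi(w_grid)
    # rows: values of s, columns: values of w
    return np.max(np.outer(s_values, w_grid) - phi_w, axis=1)

# Assumed example: i.i.d. Gaussian variables with mean mu and standard deviation sigma
mu, sigma = 1.0, 2.0

def g(w):
    """Cumulant generating function of a Gaussian variable."""
    return mu * w + 0.5 * sigma ** 2 * w ** 2

s_values = np.linspace(-3.0, 5.0, 9)
w_grid = np.linspace(-5.0, 5.0, 20001)

rate = legendre_transform(g, s_values, w_grid)
exact = (s_values - mu) ** 2 / (2.0 * sigma ** 2)
```

The grid must be wide enough to contain the maximizer w* = (s − μ)/σ² for every s of interest; with the values above, `rate` and `exact` agree to within the grid resolution.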

SLIDE 31


Gärtner-Ellis theorem for matrix-correlated random variables

Matrix generating function: $G(w) = \int_{\mathbb{R}} R(x) e^{wx}\,dx$

$\Phi(w) = \ln \lambda_1(w)$, with $\lambda_1(w)$ the dominant eigenvalue of $G(w)$

Large deviation principle for short-range correlation:
- Short-range correlation: $\lambda_1(w)$ is differentiable near 0
- $I(s) = \sup_{w} \{ ws - \ln \lambda_1(w) \}$
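For a concrete short-range example, the dominant eigenvalue and the rate function can be computed numerically. The sketch below takes G(w) as the integral of R(x) e^{wx}, the sign convention that matches g_n(w) = ⟨e^{w S_n}⟩, and assumes Gaussian P_{i,j}, for which G(w)_{i,j} = E_{i,j} exp(w mu_{i,j} + σ²w²/2). It also subtracts ln λ_1(0), a normalization that vanishes when E has spectral radius one. The function names and numerical values are hypothetical.

```python
import numpy as np

def scgf(w, E, means, std=1.0):
    """Phi(w) = ln lambda_1(G(w)) - ln lambda_1(E) for Gaussian P_ij (assumed form)."""
    G = E * np.exp(w * means + 0.5 * (std * w) ** 2)
    # G has non-negative entries, so its dominant eigenvalue is its spectral radius
    lam_w = np.max(np.abs(np.linalg.eigvals(G)))
    lam_0 = np.max(np.abs(np.linalg.eigvals(E)))
    return np.log(lam_w / lam_0)

def rate_function(s_values, E, means, w_grid, std=1.0):
    """I(s) = sup_w {w s - Phi(w)}, approximated by a maximum over w_grid."""
    phi = np.array([scgf(w, E, means, std) for w in w_grid])
    return np.max(np.outer(s_values, w_grid) - phi, axis=1)

# Assumed irreducible example (short-range correlation)
E = np.array([[0.8, 0.2], [0.3, 0.7]])
means = np.array([[0.0, 2.0], [-2.0, 0.0]])

w_grid = np.linspace(-3.0, 3.0, 601)
s_values = np.linspace(-1.0, 1.0, 5)
rate = rate_function(s_values, E, means, w_grid)
```

For this choice of E and means the stationary mean is zero, so the rate function vanishes at s = 0 and grows on both sides; when all means are equal, Φ(w) reduces to the Gaussian i.i.d. cumulant generating function.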

SLIDE 32


Explicit rate function for long-range correlation

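Long-range or constant correlation: $\Phi(w)$ is not differentiable at 0.

Constant correlation: non-convex rate function

[Plot: I(s) as a function of s, with μ1 and μ2 marked on the s axis]

Polynomial correlation: rate function with a flat branch

[Plot: I(s) as a function of s, with μ1 and μ2 marked on the s axis]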

SLIDE 33


Conclusion

Three kinds of correlation:
- Exponential short-range correlation
- Constant correlation
- Polynomial long-range correlation

Extension of the law of large numbers and the central limit theorems:
- Long-range correlation: continuous and discrete mixtures of standard limit laws

Large deviation principle:
- Long-range correlation: non-strictly convex rate function

Perspectives:
- Extreme statistics
- Physical model
- Infinite dimension, higher tensor order