

slide-1
SLIDE 1

Cleaning correlation matrices, Random Matrix Theory & HCIZ integrals

J.P. Bouchaud, with: M. Potters, L. Laloux, R. Allez, J. Bun, S. Majumdar

http://www.cfm.fr

slide-2
SLIDE 2

Portfolio theory: Basics

  • Portfolio weights $w_i$, asset returns $X_i^t$
  • If expected/predicted gains are $g_i$, then the expected gain of the portfolio is $G = \sum_i w_i g_i$
  • Let risk be defined as the variance of the portfolio returns (maybe not a good definition!): $R^2 = \sum_{ij} w_i \sigma_i C_{ij} \sigma_j w_j$, where $\sigma_i^2$ is the variance of asset $i$, and $C_{ij}$ is the correlation matrix.
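
The two definitions above can be evaluated directly. The following is a minimal sketch, assuming NumPy; the function name `portfolio_gain_and_risk` and the two-asset numbers are purely illustrative and not from the talk.

```python
import numpy as np

def portfolio_gain_and_risk(w, g, sigma, C):
    """Return (G, R2) with G = sum_i w_i g_i and
    R2 = sum_ij w_i sigma_i C_ij sigma_j w_j."""
    w, g, sigma = (np.asarray(a, dtype=float) for a in (w, g, sigma))
    C = np.asarray(C, dtype=float)
    G = float(w @ g)
    ws = w * sigma                 # volatility-scaled weights
    R2 = float(ws @ C @ ws)
    return G, R2

# Illustrative two-asset example (made-up numbers)
w = np.array([0.5, 0.5])
g = np.array([0.02, 0.04])
sigma = np.array([0.1, 0.2])
C = np.array([[1.0, 0.5], [0.5, 1.0]])
G, R2 = portfolio_gain_and_risk(w, g, sigma, C)
```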

J.-P. Bouchaud

slide-3
SLIDE 3

Markowitz Optimization

  • Find the portfolio with maximum expected return for a given risk or, equivalently, minimum risk for a given return ($G$)
  • In matrix notation: $w_C = G \, \frac{C^{-1} g}{g^T C^{-1} g}$, where all gains are measured with respect to the risk-free rate and $\sigma_i = 1$ (absorbed in $g_i$).
  • Note: in the presence of non-linear constraints, e.g. $\sum_i |w_i| \le A$, this becomes a “spin-glass” problem! (see [JPB, Galluccio, Potters])
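
As a minimal sketch of the unconstrained solution in matrix notation (assuming NumPy; `markowitz_weights` and the 2 × 2 example are hypothetical), one can solve the linear system rather than invert $C$ explicitly:

```python
import numpy as np

def markowitz_weights(C, g, G=1.0):
    """Weights w = G * C^{-1} g / (g^T C^{-1} g), with sigma_i absorbed in g."""
    C = np.asarray(C, dtype=float)
    g = np.asarray(g, dtype=float)
    Cinv_g = np.linalg.solve(C, g)      # C^{-1} g without forming the inverse
    return G * Cinv_g / (g @ Cinv_g)

# Illustrative example (made-up numbers)
C = np.array([[1.0, 0.3], [0.3, 1.0]])
g = np.array([1.0, 2.0])
w = markowitz_weights(C, g, G=1.0)
risk = w @ C @ w                        # equals 1 / (g^T C^{-1} g) when G = 1
```

The non-linear constraint mentioned in the last bullet is not handled by this closed form.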


slide-4
SLIDE 4

Markowitz Optimization

  • More explicitly: $w \propto \sum_\alpha \lambda_\alpha^{-1} (\Psi_\alpha \cdot g)\, \Psi_\alpha = g + \sum_\alpha (\lambda_\alpha^{-1} - 1) (\Psi_\alpha \cdot g)\, \Psi_\alpha$
  • Compared to the naive allocation $w \propto g$:
    – Eigenvectors with $\lambda \gg 1$ are projected out
    – Eigenvectors with $\lambda \ll 1$ are overallocated
  • Very important for “stat. arb.” strategies (for example)
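
The spectral form can be checked numerically against a direct solve. This is a sketch under the assumption that $\lambda_\alpha$, $\Psi_\alpha$ are the eigenvalues and eigenvectors of $C$; the random test matrix is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 20))
C = np.corrcoef(A)                 # 5 x 5 correlation matrix (rows = variables)
g = rng.standard_normal(5)

lam, Psi = np.linalg.eigh(C)       # columns of Psi are the eigenvectors
proj = Psi.T @ g                   # projections (Psi_alpha . g)

w_spectral = Psi @ (proj / lam)                 # sum_a lam_a^{-1} (Psi_a.g) Psi_a
w_shifted = g + Psi @ ((1.0 / lam - 1.0) * proj)  # g + sum_a (lam_a^{-1} - 1)(...)
w_direct = np.linalg.solve(C, g)                # C^{-1} g
```

The factor $\lambda_\alpha^{-1}$ makes the effect explicit: small eigenvalues, which are the most affected by measurement noise, receive the largest weights.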


slide-5
SLIDE 5

Empirical Correlation Matrix

  • Before inverting them, how should one estimate/clean correlation matrices?
  • Empirical Equal-Time Correlation Matrix $E$: $E_{ij} = \frac{1}{T} \sum_t \frac{X_i^t X_j^t}{\sigma_i \sigma_j}$

Order $N^2$ quantities estimated with $NT$ datapoints. When $T < N$, $E$ is not even invertible.
Typically: $N = 500$–$2000$; $T = 500$–$2500$ days (10 years – Beware of high frequencies) $\longrightarrow q := N/T = O(1)$
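
A minimal sketch of the estimator, assuming NumPy, returns stored as a `(T, N)` array that is already demeaned, and $\sigma_i$ estimated from the same sample (all of these are assumptions for illustration). It also shows the rank deficiency when $T < N$.

```python
import numpy as np

def empirical_correlation(X):
    """E_ij = (1/T) sum_t X_i^t X_j^t / (sigma_i sigma_j).
    X has shape (T, N) and is assumed to be demeaned."""
    X = np.asarray(X, dtype=float)
    T = X.shape[0]
    sigma = np.sqrt((X ** 2).mean(axis=0))   # in-sample volatilities
    Z = X / sigma
    return Z.T @ Z / T

rng = np.random.default_rng(1)
T, N = 50, 80                                # T < N on purpose
X = rng.standard_normal((T, N))
E = empirical_correlation(X)
q = N / T
rank = np.linalg.matrix_rank(E)              # at most T, so E is singular here
```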


slide-6
SLIDE 6

Risk of Optimized Portfolios

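  • “In-sample” risk (for $G = 1$): $R_{\mathrm{in}}^2 = w_E^T E\, w_E = \frac{1}{g^T E^{-1} g}$
  • True minimal risk: $R_{\mathrm{true}}^2 = w_C^T C\, w_C = \frac{1}{g^T C^{-1} g}$
  • “Out-of-sample” risk: $R_{\mathrm{out}}^2 = w_E^T C\, w_E = \frac{g^T E^{-1} C E^{-1} g}{(g^T E^{-1} g)^2}$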


slide-7
SLIDE 7

Risk of Optimized Portfolios

  • Let $E$ be a noisy, unbiased estimator of $C$. Using convexity arguments, and for large matrices: $R_{\mathrm{in}}^2 \le R_{\mathrm{true}}^2 \le R_{\mathrm{out}}^2$
  • In fact, using RMT: $R_{\mathrm{out}}^2 = R_{\mathrm{true}}^2 (1 - q)^{-1} = R_{\mathrm{in}}^2 (1 - q)^{-2}$, indep. of $C$! (For large $N$)
  • If $C$ has some time dependence (beyond observation noise), one expects an even worse underestimation
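
The three risks and the RMT relation between them can be illustrated with a small simulation. This is a sketch only, assuming NumPy, Gaussian returns, $C = I$, $g = (1, \dots, 1)$ and $N = 400$, $T = 1600$; none of these choices come from the talk, and the relation holds only approximately at finite $N$.

```python
import numpy as np

rng = np.random.default_rng(42)
N, T = 400, 1600
q = N / T

C = np.eye(N)                          # assumed "true" correlation matrix
X = rng.standard_normal((T, N))        # Gaussian returns with unit variance
E = X.T @ X / T                        # empirical estimate of C
g = np.ones(N)                         # assumed vector of predicted gains

Einv_g = np.linalg.solve(E, g)
Cinv_g = np.linalg.solve(C, g)

R2_in = 1.0 / (g @ Einv_g)                            # in-sample risk
R2_true = 1.0 / (g @ Cinv_g)                          # true minimal risk
R2_out = (Einv_g @ C @ Einv_g) / (g @ Einv_g) ** 2    # out-of-sample risk

# Ratios that should both be close to 1 for large N
ratio_out = R2_out * (1 - q) / R2_true
ratio_in = R2_in / ((1 - q) * R2_true)
```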


slide-8
SLIDE 8

In Sample vs. Out of Sample

[Figure: Return and Risk axes, with curves for Raw in-sample, Cleaned in-sample, Cleaned out-of-sample and Raw out-of-sample]


slide-9
SLIDE 9

Rotational invariance hypothesis (RIH)

  • In the absence of any cogent prior on the eigenvectors, one can assume that $C$ is a member of a Rotationally Invariant Ensemble – “RIH”
  • Surely not true for the “market mode” $v_1 \approx (1, 1, \dots, 1)/\sqrt{N}$, with $\lambda_1 \approx N\rho$, but OK in the bulk (see below). A more plausible assumption: factor model → hierarchical, block diagonal $C$’s (“Parisi matrices”)
  • “Cleaning” $E$ within RIH: keep the eigenvectors, play with eigenvalues → The simplest, classical scheme, shrinkage: $C = (1 - \alpha) E + \alpha I \;\to\; \lambda_C = (1 - \alpha) \lambda_E + \alpha$, $\alpha \in [0, 1]$
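
Linear shrinkage leaves the eigenvectors untouched and maps each eigenvalue affinely. A minimal sketch, assuming NumPy (the helper `linear_shrinkage`, the sample sizes and `alpha = 0.4` are illustrative):

```python
import numpy as np

def linear_shrinkage(E, alpha):
    """Return (1 - alpha) * E + alpha * I for alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    E = np.asarray(E, dtype=float)
    return (1.0 - alpha) * E + alpha * np.eye(E.shape[0])

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 30))          # T = 60 observations, N = 30 assets
E = np.corrcoef(X, rowvar=False)
alpha = 0.4

lam_E = np.linalg.eigvalsh(E)              # ascending order
lam_C = np.linalg.eigvalsh(linear_shrinkage(E, alpha))
```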


slide-10
SLIDE 10

RMT: from ρC(λ) to ρE(λ)

  • Solution using different techniques (replicas, diagrams, free matrices) gives the resolvent $G_E(z) = N^{-1} \mathrm{Tr}\,(zI - E)^{-1}$ as: $G_E(z) = \int d\lambda \, \rho_C(\lambda) \, \frac{1}{z - \lambda (1 - q + q z G_E(z))}$

Note: One should work from $\rho_C \longrightarrow G_E$

  • Example 1: $C = I$ (null hypothesis) → Marcenko-Pastur [67]: $\rho_E(\lambda) = \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{2\pi q \lambda}$, $\lambda \in [(1 - \sqrt{q})^2, (1 + \sqrt{q})^2]$
  • Suggests a second cleaning scheme (Eigenvalue clipping, [Laloux et al. 1997]): any eigenvalue beyond the Marcenko-Pastur edge can be trusted, the rest is noise.
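
The Marcenko-Pastur density of Example 1 is simple to evaluate. The sketch below assumes NumPy and $q < 1$ (so that there is no mass at $\lambda = 0$); the function names are illustrative.

```python
import numpy as np

def mp_edges(q):
    """Edges lambda_- and lambda_+ of the Marcenko-Pastur spectrum."""
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

def mp_density(lam, q):
    """rho_E(lambda) for C = I and q = N/T < 1; zero outside the edges."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lo, hi = mp_edges(q)
    rho = np.zeros_like(lam)
    inside = (lam > lo) & (lam < hi)
    x = lam[inside]
    rho[inside] = np.sqrt((hi - x) * (x - lo)) / (2 * np.pi * q * x)
    return rho

q = 0.25
lo, hi = mp_edges(q)
grid = np.linspace(lo, hi, 200001)
rho = mp_density(grid, q)
dx = grid[1] - grid[0]
mass = float(np.sum((rho[1:] + rho[:-1]) / 2) * dx)   # trapezoid rule
mean = float(np.sum((grid[1:] * rho[1:] + grid[:-1] * rho[:-1]) / 2) * dx)
```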


slide-11
SLIDE 11

Eigenvalue clipping

Eigenvalues $\lambda < \lambda_+$ are replaced by a unique one, so as to preserve $\mathrm{Tr}\, C = N$.
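
A minimal sketch of the clipping recipe, assuming NumPy and taking $\lambda_+ = (1 + \sqrt{q})^2$ as the Marcenko-Pastur edge. The helper `clip_eigenvalues` and the one-factor test data are hypothetical; the diagonal of the cleaned matrix is not renormalised to 1 here.

```python
import numpy as np

def clip_eigenvalues(E, q):
    """Replace all eigenvalues below the MP edge by a single value chosen
    so that the trace is preserved; eigenvectors are kept."""
    E = np.asarray(E, dtype=float)
    N = E.shape[0]
    lam, V = np.linalg.eigh(E)
    lam_plus = (1 + np.sqrt(q)) ** 2
    noise = lam < lam_plus
    cleaned = lam.copy()
    if noise.any():
        cleaned[noise] = (N - lam[~noise].sum()) / noise.sum()
    return (V * cleaned) @ V.T          # V diag(cleaned) V^T

rng = np.random.default_rng(3)
T, N = 400, 100
# Illustrative data: independent noise plus one common factor ("market mode")
X = rng.standard_normal((T, N)) + 0.5 * rng.standard_normal((T, 1))
E = np.corrcoef(X, rowvar=False)
C_clean = clip_eigenvalues(E, N / T)
```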


slide-12
SLIDE 12

RMT: from ρC(λ) to ρE(λ)

  • Solution using different techniques (replicas, diagrams, free matrices) gives the resolvent $G_E(z)$ as: $G_E(z) = \int d\lambda \, \rho_C(\lambda) \, \frac{1}{z - \lambda (1 - q + q z G_E(z))}$

Note: One should work from $\rho_C \longrightarrow G_E$

  • Example 2: Power-law spectrum (motivated by data): $\rho_C(\lambda) = \frac{\mu A}{(\lambda - \lambda_0)^{1+\mu}} \, \Theta(\lambda - \lambda_{\min})$
  • Suggests a third cleaning scheme (Eigenvalue substitution, Potters et al. 2009, El Karoui 2010): $\lambda_E$ is replaced by the theoretical $\lambda_C$ with the same rank $k$


slide-13
SLIDE 13

Empirical Correlation Matrix

[Figure: $\rho(\lambda)$ vs. $\lambda$, with curves for Data, Dressed power law ($\mu = 2$), Raw power law ($\mu = 2$) and Marcenko-Pastur; a second panel has axes labelled rank and $\kappa$]

MP and generalized MP fits of the spectrum


slide-14
SLIDE 14

Eigenvalue cleaning

[Figure: $R^2$ vs. $\alpha$ for Classical Shrinkage, Ledoit-Wolf Shrinkage, Power Law Substitution and Eigenvalue Clipping]

Out-of-sample risk for different 1-parameter cleaning schemes


slide-15
SLIDE 15

A RIH Bayesian approach

  • All the above schemes lack a rigorous framework and are at best ad-hoc recipes
  • A Bayesian framework: suppose $C$ belongs to a RIE, with $P(C)$, and assume Gaussian returns. Then one needs: $\langle C \rangle|_{X_i^t} = \int \mathcal{D}C \; C \, P(C | \{X_i^t\})$

with $P(C | \{X_i^t\}) = Z^{-1} \exp\left[ -N \, \mathrm{Tr}\, V(C, \{X_i^t\}) \right]$;

where (Bayes): $V(C, \{X_i^t\}) = \frac{1}{2q} \left[ \log C + E C^{-1} \right] + V_0(C)$


slide-16
SLIDE 16

A Bayesian approach: a fully soluble case

  • $V_0(C) = (1 + b) \ln C + b\, C^{-1}$, $b > 0$: “Inverse Wishart”
  • $\rho_C(\lambda) \propto \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{\lambda^2}$; $\lambda_\pm = \left( 1 + b \pm \sqrt{(1 + b)^2 - b^2/4} \right)/b$
  • In this case, the matrix integral can be done, leading exactly to the “Shrinkage” recipe, with $\alpha = f(b, q)$
  • Note that $b$ can be determined from the empirical spectrum of $E$, using the generalized MP formula


slide-17
SLIDE 17

The general case: HCIZ integrals

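  • A Coulomb gas approach: integrate over the orthogonal group $C = O \Lambda O^\dagger$, where $\Lambda$ is diagonal: $\int \mathcal{D}O \, \exp\left[ -\frac{N}{2q} \mathrm{Tr}\left( \log \Lambda + E\, O^\dagger \Lambda^{-1} O + 2q V_0(\Lambda) \right) \right]$
  • Can one obtain a large $N$ estimate of the HCIZ integral $F(\rho_A, \rho_B) = \lim_{N \to \infty} N^{-2} \ln \int \mathcal{D}O \, \exp\left[ \frac{N}{2q} \mathrm{Tr}\, A O^\dagger B O \right]$ in terms of the spectrum of $A$ and $B$?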


slide-18
SLIDE 18

The general case: HCIZ integrals

  • Can one obtain a large $N$ estimate of the HCIZ integral $F(\rho_A, \rho_B) = \lim_{N \to \infty} N^{-2} \ln \int \mathcal{D}O \, \exp\left[ \frac{N}{2q} \mathrm{Tr}\, A O^\dagger B O \right]$ in terms of the spectrum of $A$ and $B$?
  • When $A$ (or $B$) is of finite rank, such a formula exists in terms of the R-transform of $B$ [Marinari, Parisi & Ritort, 1995].
  • When the ranks of $A$ and $B$ are of order $N$, there is a formula due to Matytsin [94] (in the unitary case), later shown rigorously by Zeitouni & Guionnet, but its derivation is quite obscure...


slide-19
SLIDE 19

An instanton approach to large N HCIZ

  • Consider Dyson’s Brownian motion matrices. The eigenvalues obey: $dx_i = \sqrt{\frac{2}{\beta N}}\, dW + \frac{1}{N}\, dt \sum_{j \ne i} \frac{1}{x_i - x_j}$,
  • Constrain $x_i(t = 0) = \lambda_{A i}$ and $x_i(t = 1) = \lambda_{B i}$. The probability of such a path is given by a large deviation/instanton formula, with: $\frac{d^2 x_i}{dt^2} = -\frac{2}{N^2} \sum_{\ell \ne i} \frac{1}{(x_i - x_\ell)^3}$.


slide-20
SLIDE 20

An instanton approach to large N HCIZ

  • Constrain $x_i(t = 0) = \lambda_{A i}$ and $x_i(t = 1) = \lambda_{B i}$. The probability of such a path is given by a large deviation/instanton formula, with: $\frac{d^2 x_i}{dt^2} = -\frac{2}{N^2} \sum_{\ell \ne i} \frac{1}{(x_i - x_\ell)^3}$.
  • This can be interpreted as the motion of particles interacting through an attractive two-body potential $\phi(r) = -(Nr)^{-2}$. Using the virial formula, one finally gets Matytsin’s equations: $\partial_t \rho + \partial_x [\rho v] = 0$, $\partial_t v + v \partial_x v = \pi^2 \rho\, \partial_x \rho$.


slide-21
SLIDE 21

An instanton approach to large N HCIZ

  • Finally, the “action” associated to these trajectories is: $S \approx \frac{1}{2} \int dx\, \rho \left[ v^2 + \frac{\pi^2}{3} \rho^2 \right] - \frac{1}{2} \left[ \int dx\, dy\, \rho_Z(x) \rho_Z(y) \ln |x - y| \right]_{Z=A}^{Z=B}$
  • Now, the link with HCIZ comes from noticing that the propagator of the Brownian motion in matrix space is: $P(B|A) \propto \exp\left[ -\frac{N}{2} \mathrm{Tr}(A - B)^2 \right] = \exp\left[ -\frac{N}{2} \left( \mathrm{Tr} A^2 + \mathrm{Tr} B^2 - 2\, \mathrm{Tr}\, A O B O^\dagger \right) \right]$. Disregarding the eigenvectors of $B$ (i.e. integrating over $O$) leads to another expression for $P(\lambda_{B i} | \lambda_{A j})$ in terms of HCIZ that can be compared to the one using instantons
  • The final result for $F(\rho_A, \rho_B)$ is exactly Matytsin’s expression, up to details (!)


slide-22
SLIDE 22

Back to eigenvalue cleaning...

  • Estimating HCIZ at large $N$ is only the first step, but...
  • ...one still needs to apply it to $B = C^{-1}$, $A = E = X^\dagger C X$ and to compute also correlation functions such as $\langle O_{ij}^2 \rangle_{E \to C^{-1}}$ with the HCIZ weight
  • As we were working on this we discovered the work of Ledoit-Péché that solves the problem exactly using tools from RMT...


slide-23
SLIDE 23

The Ledoit-Péché “magic formula”

  • The Ledoit-Péché [2011] formula is a non-linear shrinkage, given by: $\lambda_C = \frac{\lambda_E}{\left| 1 - q + q \lambda_E \lim_{\epsilon \to 0} G_E(\lambda_E - i\epsilon) \right|^2}$.
  • Note 1: Independent of $C$: only $G_E$ is needed (and is observable)!
  • Note 2: When applied to the case where $C$ is inverse Wishart, this gives again the linear shrinkage
  • Note 3: Still to be done: reobtain these results using the HCIZ route (many interesting intermediate results to hope for!)
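
The formula can be tried on a finite sample by replacing the limit $\epsilon \to 0$ with a small finite width. The sketch below is one possible discretisation, not the authors' implementation: it assumes NumPy, the resolvent $G_E(z) = N^{-1} \sum_j 1/(z - \lambda_j)$ built from the sample eigenvalues, and a smoothing width `eta = N**-0.5`, which is an assumption of this example. The test case uses $C = I$, for which the cleaned eigenvalues should all be close to 1.

```python
import numpy as np

def ledoit_peche_eigenvalues(lam, q, eta=None):
    """Non-linear shrinkage of sample eigenvalues lam (1-D array).
    eta is a finite stand-in for epsilon -> 0 (assumed: N**-0.5)."""
    lam = np.asarray(lam, dtype=float)
    N = lam.size
    if eta is None:
        eta = N ** -0.5
    z = lam - 1j * eta
    # Empirical resolvent G_E(z_k) = (1/N) sum_j 1 / (z_k - lam_j)
    G = np.mean(1.0 / (z[:, None] - lam[None, :]), axis=1)
    return lam / np.abs(1 - q + q * lam * G) ** 2

rng = np.random.default_rng(11)
N, T = 400, 1600
X = rng.standard_normal((T, N))        # Gaussian returns with C = I (assumed)
E = X.T @ X / T
lam = np.linalg.eigvalsh(E)
xi = ledoit_peche_eigenvalues(lam, N / T)
```

Finite-size and edge effects remain visible near $\lambda_\pm$; the choice of `eta` trades bias against noise.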


slide-24
SLIDE 24

Eigenvalue cleaning: Ledoit-Péché

Fit of the empirical distribution with $V_0'(z) = a/z + b/z^2 + c/z^3$.

slide-25
SLIDE 25

What about eigenvectors?

  • Up to now, most results using RMT focus on eigenvalues
  • What about eigenvectors? What natural null-hypothesis beyond RIH?
  • Are eigen-values/eigen-directions stable in time?
  • Important source of risk for market/sector neutral portfolios: a sudden/gradual rotation of the top eigenvectors!
  • ..a little movie...


slide-26
SLIDE 26

What about eigenvectors?

  • Correlation matrices need a certain time $T$ to be measured
  • Even if the “true” $C$ is fixed, its empirical determination fluctuates: $E_t = C + \text{noise}$
  • What is the dynamics of the empirical eigenvectors induced by measurement noise?
  • Can one detect a genuine evolution of these eigenvectors beyond noise effects?


slide-27
SLIDE 27

What about eigenvectors?

  • More generally, can one say something about the eigenvectors of randomly perturbed matrices: $H = H_0 + \epsilon H_1$, where $H_0$ is deterministic or random (e.g. GOE) and $H_1$ random.


slide-28
SLIDE 28

Eigenvectors exchange

  • An issue: upon pseudo-collisions of eigenvalues, eigenvectors exchange
  • Example: $2 \times 2$ matrices: $H_{11} = a$, $H_{22} = a + \epsilon$, $H_{21} = H_{12} = c$ $\longrightarrow \lambda_\pm \approx_{\epsilon \to 0} a + \frac{\epsilon}{2} \pm \sqrt{c^2 + \frac{\epsilon^2}{4}}$
  • Let $c$ vary: quasi-crossing for $c \to 0$, with an exchange of the top eigenvector: $(1, -1) \to (1, 1)$
  • For large matrices, these exchanges are extremely numerous → labelling problem
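
The $2 \times 2$ example can be reproduced numerically. In this sketch (assuming NumPy; the values of $a$, $\epsilon$ and $c$ are illustrative) the top eigenvector is close to $(1, -1)/\sqrt{2}$ for $c < 0$ and close to $(1, 1)/\sqrt{2}$ for $c > 0$, up to an overall sign.

```python
import numpy as np

def eigen_2x2(a, eps, c):
    """Eigenvalues (ascending) and top eigenvector of [[a, c], [c, a + eps]]."""
    H = np.array([[a, c], [c, a + eps]])
    lam, V = np.linalg.eigh(H)
    return lam, V[:, -1]

a, eps = 1.0, 1e-3
lam_neg, v_neg = eigen_2x2(a, eps, -0.5)   # before the quasi-crossing
lam_pos, v_pos = eigen_2x2(a, eps, +0.5)   # after the quasi-crossing

# lambda_pm = a + eps/2 -/+ sqrt(c^2 + eps^2/4), same for c = -0.5 and +0.5
predicted = a + eps / 2 + np.array([-1.0, 1.0]) * np.sqrt(0.5 ** 2 + eps ** 2 / 4)
```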


slide-29
SLIDE 29

Subspace stability

  • An idea: follow the subspace spanned by $P$ eigenvectors: $|\psi_{k+1}\rangle, |\psi_{k+2}\rangle, \dots, |\psi_{k+P}\rangle \longrightarrow |\psi'_{k+1}\rangle, |\psi'_{k+2}\rangle, \dots, |\psi'_{k+P}\rangle$
  • Form the $P \times P$ matrix of scalar products: $G_{ij} = \langle \psi_{k+i} | \psi'_{k+j} \rangle$
  • The determinant of this matrix is insensitive to label permutations and is a measure of the overlap between the two $P$-dimensional subspaces
    – $D = -\frac{1}{P} \ln |\det G|$ is a measure of how well the first subspace can be approximated by the second
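
A minimal sketch of this overlap measure, assuming NumPy and eigenvectors stored as orthonormal columns. The helper name, the random symmetric matrices and the perturbation size 0.01 are illustrative choices.

```python
import numpy as np

def subspace_distance(Psi, Psi_prime):
    """D = -(1/P) ln|det G| with G_ij = <psi_i | psi'_j>.
    Psi and Psi_prime are (N, P) arrays with orthonormal columns."""
    G = Psi.T @ Psi_prime
    P = G.shape[0]
    _, logabsdet = np.linalg.slogdet(G)   # stable log|det G|
    return -logabsdet / P

rng = np.random.default_rng(0)
N, P = 50, 5
H0 = rng.standard_normal((N, N)); H0 = (H0 + H0.T) / 2
H1 = rng.standard_normal((N, N)); H1 = (H1 + H1.T) / 2

_, V0 = np.linalg.eigh(H0)
_, V1 = np.linalg.eigh(H0 + 0.01 * H1)
top0, top1 = V0[:, -P:], V1[:, -P:]       # top-P eigenvectors before/after

D = subspace_distance(top0, top1)
D_permuted = subspace_distance(top0, top1[:, ::-1])   # relabelled vectors
D_self = subspace_distance(top0, top0)
```

Permuting columns only changes the sign of $\det G$, which is why $D$ does not suffer from the labelling problem.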


slide-30
SLIDE 30

Intermezzo

  • Non equal time correlation matrices, with entries $ij$ given by $\frac{1}{T} \sum_t \frac{X_i^t X_j^{t+\tau}}{\sigma_i \sigma_j}$

$N \times N$ but not symmetrical: ‘leader-lagger’ relations

  • General rectangular correlation matrices: $G_{\alpha i} = \frac{1}{T} \sum_{t=1}^{T} Y_\alpha^t X_i^t$

$N$ ‘input’ factors $X$; $M$ ‘output’ factors $Y$
    – Example: $Y_\alpha^t = X_j^{t+\tau}$, $N = M$


slide-31
SLIDE 31

Intermezzo: Singular values

  • Singular values: square root of the non zero eigenvalues of $G G^T$ or $G^T G$, with associated eigenvectors $u_\alpha^k$ and $v_i^k$ → $1 \ge s_1 > s_2 > \dots > s_{(M,N)^-} \ge 0$
  • Interpretation: $k = 1$: best linear combination of input variables with weights $v_i^1$, to optimally predict the linear combination of output variables with weights $u_\alpha^1$, with a cross-correlation $= s_1$.
  • $s_1$: measure of the predictive power of the set of $X$s with respect to $Y$s
  • Other singular values: orthogonal, less predictive, linear combinations


slide-32
SLIDE 32

Intermezzo: Benchmark

  • Null hypothesis: No correlations between $X$s and $Y$s: $G_{\mathrm{true}} \equiv 0$
  • But arbitrary correlations among $X$s, $C_X$, and $Y$s, $C_Y$, are possible
  • Consider exact normalized principal components for the sample variables $X$s and $Y$s: $\hat{X}_i^t = \frac{1}{\sqrt{\lambda_i}} \sum_j U_{ij} X_j^t$; $\hat{Y}_\alpha^t = \dots$ and define $\hat{G} = \hat{Y} \hat{X}^T$.
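
The benchmark construction can be sketched as follows, assuming NumPy, data stored as `(variables, T)` arrays, independent Gaussian $X$ and $Y$ (the null hypothesis), and a $1/T$ normalisation of $\hat{G}$ as in the definition of $G_{\alpha i}$. The helper name and the sizes are illustrative.

```python
import numpy as np

def normalized_principal_components(X):
    """Rotate and rescale X (shape (n_vars, T)) so that
    X_hat @ X_hat.T / T is the identity."""
    T = X.shape[1]
    lam, U = np.linalg.eigh(X @ X.T / T)      # columns of U are eigenvectors
    return (U.T @ X) / np.sqrt(lam)[:, None]

rng = np.random.default_rng(7)
N, M, T = 20, 10, 200
X = rng.standard_normal((N, T))               # 'input' variables
Y = rng.standard_normal((M, T))               # 'output' variables, independent of X

X_hat = normalized_principal_components(X)
Y_hat = normalized_principal_components(Y)
G_hat = Y_hat @ X_hat.T / T                   # M x N, 1/T normalisation assumed
s = np.linalg.svd(G_hat, compute_uv=False)    # singular values, descending
```

Even with no true correlation, the singular values are not zero: they fill a band inside $[0, 1]$, which is the noise benchmark against which measured singular values should be compared.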


slide-33
SLIDE 33

Intermezzo: Random SVD

  • Final result ([Wachter] (1980); [Laloux, Miceli, Potters, JPB]): $\rho(s) = (m + n - 1)^+ \delta(s - 1) + \frac{\sqrt{(s^2 - \gamma_-)(\gamma_+ - s^2)}}{\pi s (1 - s^2)}$ with $\gamma_\pm = n + m - 2mn \pm 2\sqrt{mn(1 - n)(1 - m)}$, $0 \le \gamma_\pm \le 1$
  • Analogue of the Marcenko-Pastur result for rectangular correlation matrices
  • Many applications; finance, econometrics (‘large’ models), genomics, etc. and subspace stability!
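
The edges $\gamma_\pm$ and the continuous part of $\rho(s)$ can be evaluated as below. This is a sketch assuming NumPy; the slide does not define $n$ and $m$, so reading them as the ratios $N/T$ and $M/T$ is an assumption here, and the delta peak at $s = 1$ is left out.

```python
import numpy as np

def gamma_edges(n, m):
    """gamma_- and gamma_+ (edges of the spectrum in the variable s^2)."""
    centre = n + m - 2 * m * n
    half_width = 2 * np.sqrt(m * n * (1 - n) * (1 - m))
    return centre - half_width, centre + half_width

def continuous_density(s, n, m):
    """Continuous part of rho(s); zero outside sqrt(gamma_-) < s < sqrt(gamma_+)."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    g_lo, g_hi = gamma_edges(n, m)
    rho = np.zeros_like(s)
    inside = (s > 0) & (s ** 2 > g_lo) & (s ** 2 < g_hi)
    x = s[inside]
    rho[inside] = np.sqrt((x ** 2 - g_lo) * (g_hi - x ** 2)) / (np.pi * x * (1 - x ** 2))
    return rho

n, m = 0.2, 0.3                     # illustrative ratios
g_lo, g_hi = gamma_edges(n, m)
rho_values = continuous_density([0.0, 0.5, 2.0], n, m)
```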


slide-34
SLIDE 34

Back to eigenvectors

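  • Extend the target subspace to avoid edge effects: $|\psi_{k+1}\rangle, |\psi_{k+2}\rangle, \dots, |\psi_{k+P}\rangle \longrightarrow |\psi'_{k-Q+1}\rangle, |\psi'_{k+2}\rangle, \dots, |\psi'_{k+Q}\rangle$
  • Form the $P \times Q$ matrix of scalar products: $G_{ij} = \langle \psi_{k+i} | \psi'_{k+j} \rangle$
  • The singular values of $G$ indicate how well the $Q$ perturbed vectors approximate the initial ones: $D = -\frac{1}{P} \sum_i \ln s_i$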
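
With a rectangular overlap matrix the determinant is replaced by the singular values. A minimal sketch, assuming NumPy, orthonormal columns and $Q \ge P$; the random orthonormal bases below are only there to exercise the function.

```python
import numpy as np

def subspace_distance_svd(Psi, Psi_prime):
    """D = -(1/P) sum_i ln s_i, where s_i are the singular values of
    G = Psi^T Psi_prime. Psi is (N, P), Psi_prime is (N, Q), Q >= P."""
    G = Psi.T @ Psi_prime
    s = np.linalg.svd(G, compute_uv=False)    # P singular values when Q >= P
    return float(-np.mean(np.log(s)))

rng = np.random.default_rng(5)
N, P, Q = 40, 3, 8
basis, _ = np.linalg.qr(rng.standard_normal((N, N)))
initial = basis[:, :P]                         # initial P-dimensional subspace
containing = basis[:, :Q]                      # target that contains it exactly
perturbed, _ = np.linalg.qr(containing + 0.1 * rng.standard_normal((N, Q)))

D_contained = subspace_distance_svd(initial, containing)
D_perturbed = subspace_distance_svd(initial, perturbed)
```

When the target subspace contains the initial one, all $s_i = 1$ and $D = 0$; any leakage outside the target makes $D$ positive.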


slide-35
SLIDE 35

Null hypothesis

  • Note: if $P$ and $Q$ are large, $D$ can be “accidentally” small
  • One can compute $D$ exactly in the limit $P, Q \to \infty$, $N \to \infty$, with fixed $p = P/N$, $q = Q/N$:
  • Final result: (same problem as above!) $D = -\int_0^1 ds \, \ln s \, \rho(s)$ with: $\rho(s) = \frac{\sqrt{(s^2 - \gamma_-)(\gamma_+ - s^2)}}{\pi s (1 - s^2)}$ and $\gamma_\pm = p + q - 2pq \pm 2\sqrt{pq(1 - p)(1 - q)}$, $0 \le \gamma_\pm \le 1$


slide-36
SLIDE 36

Back to eigenvectors: perturbation theory

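  • Consider a randomly perturbed matrix: $H = H_0 + \epsilon H_1$
  • Perturbation theory to second order in $\epsilon$ yields: $D \approx \frac{\epsilon^2}{2P} \sum_{i \in \{k+1, \dots, k+P\}} \sum_{j \notin \{k-Q+1, \dots, k+Q\}} \left( \frac{\langle \psi_i | H_1 | \psi_j \rangle}{\lambda_i - \lambda_j} \right)^2$.
  • The full distribution of $s$ can again be computed exactly (in some limits) using free random matrix tools.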


slide-37
SLIDE 37

GOE: the full SV spectrum

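  • Initial eigenspace: spanned by $[a, b] \subset [-2, 2]$, $b - a = \Delta$

Target eigenspace: spanned by $[a - \delta, b + \delta] \subset [-2, 2]$

  • Two cases (set $s = \epsilon^2 \hat{s}$):
    – Weak fluctuations $\Delta \ll \delta \ll 1$ → $\rho(\hat{s})$ is a semi circle centered around $\sim \delta^{-1}$, of width $\sim \sqrt{\Delta}$
    – Strong fluctuations $\delta \ll \Delta \ll 1$ → $\rho(\hat{s}) \sim \hat{s}_{\min}/\hat{s}^2$, with $\hat{s}_{\min} \sim \Delta^{-1}$ and $\hat{s}_{\max} \sim \delta^{-1}$.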


slide-38
SLIDE 38

The case of correlation matrices

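  • Consider the empirical correlation matrix: $E = C + \eta$, $\eta = \frac{1}{T} \sum_{t=1}^{T} (X^t X^t - C)$
  • The noise $\eta$ is correlated as: $\langle \eta_{ij} \eta_{kl} \rangle = \frac{1}{T} (C_{ik} C_{jl} + C_{il} C_{jk})$
  • from which one derives: $D \approx \frac{1}{2TP} \left[ \sum_{i=1}^{P} \sum_{j=Q+1}^{N} \frac{\lambda_i \lambda_j}{(\lambda_i - \lambda_j)^2} \right]$.

(and a similar equation for eigenvalues)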
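
The double sum is straightforward to evaluate for a given spectrum. In this sketch (assuming NumPy) the eigenvalues are sorted in decreasing order so that $i = 1, \dots, P$ labels the top ones, which is an assumption about the slide's indexing; the five-eigenvalue spectrum is made up.

```python
import numpy as np

def noise_induced_distance(lam, P, Q, T):
    """D ~ 1/(2 T P) * sum_{i<=P} sum_{j>Q} lam_i lam_j / (lam_i - lam_j)^2.
    Eigenvalues are sorted in decreasing order (assumed labelling)."""
    lam = np.sort(np.asarray(lam, dtype=float))[::-1]
    li = lam[:P, None]          # eigenvalues of the initial subspace
    lj = lam[None, Q:]          # eigenvalues outside the target subspace
    return float(np.sum(li * lj / (li - lj) ** 2) / (2 * T * P))

lam = np.array([10.0, 5.0, 2.0, 1.0, 0.5])    # illustrative spectrum
D = noise_induced_distance(lam, P=1, Q=2, T=100)
```

The $1/T$ prefactor shows that this purely noise-induced distance shrinks as the measurement window grows, while near-degenerate pairs $(\lambda_i, \lambda_j)$ dominate the sum.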


slide-39
SLIDE 39

Stability of eigenvalues: Correlations

Eigenvalues clearly change: well known correlation crises


slide-40
SLIDE 40

Stability of eigenspaces: Correlations

D(τ) for a given T, P = 5, Q = 10


slide-41
SLIDE 41

Stability of eigenspaces: Correlations

D(τ = T) for P = 5, Q = 10


slide-42
SLIDE 42

Conclusion

  • Many RMT tools available to understand the eigenvalue spectrum and suggest cleaning schemes
  • The understanding of eigenvectors is comparatively poorer
  • The dynamics of the top eigenvector (aka market mode) is relatively well understood
  • A plausible, realistic model for the “true” evolution of $C$ is still lacking (many crazy attempts – Multivariate GARCH, BEKK, etc., but “second generation models” are on their way)


slide-43
SLIDE 43

Bibliography

  • J.P. Bouchaud, M. Potters, Financial Applications of Random Matrix Theory: a short review, in “The Oxford Handbook of Random Matrix Theory” (2011)
  • R. Allez and J.-P. Bouchaud, Eigenvectors dynamics: general theory & some applications, arXiv 1108.4258
  • P.-A. Reigneron, R. Allez and J.-P. Bouchaud, Principal regression analysis and the index leverage effect, Physica A, Volume 390 (2011) 3026-3035.
