SLIDE 1

Seeking Gold in Sand

Financial applications of Random Matrix Theory in stock market data

Mike S. Wang

Faculty of Mathematics // Department of Chemistry, University of Cambridge

October 2016

Mike S. Wang Seeking Gold in Sand with Random Matrix Theory

SLIDE 2

Preview of the talk

1 What is Random Matrix Theory?
2 Data Overview
3 Random Matrix Theory: The Marčenko-Pastur Law
4 Mode & Clustering Analyses
5 A Multi-layer Structured Correlation Model & Its Predictions
6 Effect of Layer Division & an Excellent Match
7 Summary & Further Developments
8 Q&A Time

SLIDE 3

1 What is Random Matrix Theory?

Random Matrix Theory (RMT) is the study of matrices whose entries are random variables (r.v.), e.g.

$$\begin{pmatrix} X_{11} & X_{12} & \cdots \\ X_{21} & X_{22} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}.$$

In particular, it concerns the emergent behaviours of random matrices in the asymptotic limit. Introduced by Wishart (1928), RMT gained prominence when Wigner (1950s) applied the theory to the spacing of energy levels in nuclear physics.

SLIDE 6

2 Data overview: S&P 500

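Procedure: price indices → logarithmic returns → de-meaned and normalised data.

Data dimensions: P stocks, T days.

$$\underbrace{\begin{pmatrix} 126.8 & 30.5 & \cdots \\ 126.3 & 30.7 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}}_{\text{raw data}} \;\to\; \underbrace{\begin{pmatrix} -0.9 & 1.7 & \cdots \\ 1.5 & 0.3 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}}_{\text{standardised data } X} \;\to\; \underbrace{\begin{pmatrix} 1.0 & 0.2 & \cdots \\ 0.2 & 1.0 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}}_{\text{correlation matrix } E}$$

The log return is

$$R_i = \log \frac{p_i}{p_{i-1}} \left( \approx \frac{p_i - p_{i-1}}{p_{i-1}} \right), \quad i > 1,$$

where $p_i$ is the price index on the $i$-th trading day. The empirical correlation matrix is $E = \frac{1}{T} X^t X$.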
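The procedure on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used for the talk: the function name `correlation_from_prices`, the synthetic price series and the choice of the population standard deviation (`ddof=0`) are all assumptions made here.

```python
import numpy as np

def correlation_from_prices(prices):
    """Turn a (T + 1, P) array of prices (rows: days, columns: stocks)
    into the standardised data X (T, P) and the correlation matrix E (P, P)."""
    prices = np.asarray(prices, dtype=float)
    # Log returns R_i = log(p_i / p_{i-1}), one row per pair of consecutive days
    log_returns = np.diff(np.log(prices), axis=0)
    # De-mean and normalise each stock (column) separately
    centred = log_returns - log_returns.mean(axis=0)
    X = centred / centred.std(axis=0)   # assumption: population std (ddof=0)
    T = X.shape[0]
    E = X.T @ X / T                     # E = (1/T) X^t X
    return X, E

# Synthetic geometric random-walk prices (hypothetical data, not the S&P 500 set)
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal((251, 5)), axis=0))
X, E = correlation_from_prices(prices)
print(E.shape, np.diag(E))
```

With this normalisation the diagonal of E is exactly 1 and E coincides with the usual sample correlation matrix of the log returns.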

SLIDE 7

3 Random Matrix Theory: the Marčenko-Pastur law

Covariance and correlation matrices are of fundamental importance to modern portfolio theory. They belong to a class of random matrices called the Wishart ensemble. An important universality law for this ensemble in RMT:

The Marčenko-Pastur law. If $X$ ($T \times P$) has independent, identically distributed (i.i.d.) r.v. entries with mean 0 and variance 1, then the limiting eigenvalue density function (e.d.f.) of the matrix $E = T^{-1} X^t X$ is

$$f(\lambda) = \frac{1}{2\pi} \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{r \lambda}$$

as $P, T \to \infty$ and $P/T \to r \in (0, 1)$, where $\lambda_\pm = (1 \pm \sqrt{r})^2$.
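As a quick numerical illustration of the law, the sketch below evaluates the density and compares its support with the eigenvalues of a simulated pure-noise matrix. The helper names `mp_edges` and `mp_density`, the Gaussian entries and the sizes T = 2000, P = 500 are choices made for this example only.

```python
import numpy as np

def mp_edges(r):
    """Lower and upper edges lambda_-, lambda_+ of the Marcenko-Pastur bulk."""
    return (1 - np.sqrt(r)) ** 2, (1 + np.sqrt(r)) ** 2

def mp_density(lam, r):
    """Marcenko-Pastur density f(lambda) for ratio r = P/T in (0, 1).
    Returns an array; the density is zero outside [lambda_-, lambda_+]."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lo, hi = mp_edges(r)
    inside = (lam > lo) & (lam < hi)
    f = np.zeros_like(lam)
    f[inside] = np.sqrt((hi - lam[inside]) * (lam[inside] - lo)) / (
        2 * np.pi * r * lam[inside]
    )
    return f

# Pure-noise data: i.i.d. entries with mean 0 and variance 1
rng = np.random.default_rng(1)
T, P = 2000, 500
X = rng.standard_normal((T, P))
eigvals = np.linalg.eigvalsh(X.T @ X / T)

lam_minus, lam_plus = mp_edges(P / T)
print(f"predicted bulk: [{lam_minus:.3f}, {lam_plus:.3f}]")
print(f"observed range: [{eigvals.min():.3f}, {eigvals.max():.3f}]")
```

At finite P and T the observed extremes sit close to, but not exactly on, the predicted edges.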

SLIDE 9

3 The Marčenko-Pastur law: a crude prediction

Underlying correlation matrix C ⇒ predicted e.d.f. of E:

  • C = I_P ⇒ the Marčenko-Pastur distribution, with $c = P/T$, $\lambda_\pm = (1 \pm \sqrt{c})^2$ and

$$f(x) = \frac{1}{2\pi} \frac{\sqrt{(\lambda_+ - x)(x - \lambda_-)}}{c\,x}.$$

All noise!!

  • C = ? ⇒ the observed eigenvalue histogram.

[Figure: the Marčenko-Pastur distribution f(x) against x (top); normalised eigenvalue density against eigenvalue, comparing the histogram of observed eigenvalues with the Marčenko-Pastur distribution, with the market mode and signals marked in the tail (bottom).]
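One way to read the comparison in the figure is to count the eigenvalues that fall above the upper edge λ+ of the noise bulk. The sketch below does this on synthetic data with a single common factor; the factor loading 0.5, the sizes and the use of λ+ as a hard threshold are assumptions of this example, not the procedure of the talk.

```python
import numpy as np

rng = np.random.default_rng(2)
T, P = 1000, 100

# Hypothetical one-factor returns: a shared "market" component plus noise
market = rng.standard_normal((T, 1))
returns = 0.5 * market + rng.standard_normal((T, P))

# Standardise each column and form the empirical correlation matrix
X = (returns - returns.mean(axis=0)) / returns.std(axis=0)
E = X.T @ X / T

eigvals = np.linalg.eigvalsh(E)             # ascending order
lam_plus = (1 + np.sqrt(P / T)) ** 2        # upper edge for pure noise
n_above = int(np.sum(eigvals > lam_plus))   # eigenvalues outside the bulk

print(f"upper edge {lam_plus:.2f}, largest eigenvalue {eigvals[-1]:.2f}, "
      f"count above edge {n_above}")
```

Because the largest eigenvalue of pure noise fluctuates around λ+ at finite size, eigenvalues only marginally above the edge should be treated with caution.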

SLIDE 10

4 Mode analysis

[Figure: eigenvector components against stock index, coloured by sector (Consumer Discretionary, Consumer Staples, Energy, Financials, Healthcare, Industrials, IT, Materials, Telecom, Utilities), in two panels.]

The market mode (L: IPR $3.98 \times 10^{-5}$) and the lowest mode (R: IPR 0.149).

Localisation: the inverse participation ratio is defined by

$$\mathrm{IPR}(v) = \sum_{i=1}^{P} |\tilde{v}_i|^4,$$

where $\tilde{v}$ is the vector $v$ de-meaned and normalised.
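A minimal sketch of the inverse participation ratio follows. The slide does not say which normalisation is applied after de-meaning; this example assumes unit Euclidean norm, so the values quoted on the slide may rest on a different convention. Under the assumption made here, the IPR lies between 1/n (weight spread evenly) and 1 (weight on a single entry).

```python
import numpy as np

def ipr(v):
    """Inverse participation ratio of v after de-meaning and normalising.
    Assumption: 'normalised' is taken to mean unit Euclidean norm."""
    v = np.asarray(v, dtype=float)
    centred = v - v.mean()
    unit = centred / np.linalg.norm(centred)
    return float(np.sum(unit ** 4))

# Delocalised vector: equal weight on all 100 entries (alternating signs)
spread = np.array([1.0, -1.0] * 50)
# Localised vector: all weight on a single entry
peaked = np.zeros(100)
peaked[0] = 1.0

print(ipr(spread), ipr(peaked))
```

A small IPR indicates a mode shared across many stocks, while a large IPR indicates a mode concentrated on a few.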

SLIDE 11

4 Clustering analysis

Market mode removal: $E' = E - \lambda_1 v_1 v_1^t$ ($\lambda_1, v_1$ the largest eigenvalue pair);

Dissimilarity distance: $d_{ij} = 1 - \mathrm{corr}(i, j)$;

Average linkage: $D_{IJ} = \frac{1}{|I||J|} \sum_{i \in I,\, j \in J} d_{ij}$.

[Figure: heatmap, annotated "strong clustering"; dendrogram (L) and minimum spanning tree (R) for the stocks CVS, HCBK, WAG, ANF, GPS, TJX, LTD, FDO, TGT, WMT, KSS, JCP, COST, BBY, HD, JWN, BBBY, RSH, ROST.]
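The three steps on this slide (market mode removal, dissimilarity distance, average linkage) can be sketched on synthetic data as follows. The two-sector factor model, the naive agglomerative loop `average_linkage`, and the choice to read corr(i, j) directly from the entries of E′ are all assumptions of this illustration.

```python
import numpy as np

def average_linkage(d, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the pair of clusters
    with the smallest average pairwise distance D_IJ."""
    clusters = [[i] for i in range(d.shape[0])]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # D_IJ = mean of d_ij over i in I, j in J
                D = d[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or D < best[0]:
                    best = (D, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical returns: one market factor plus two sector factors
rng = np.random.default_rng(3)
T, group_size = 500, 4
market = rng.standard_normal((T, 1))
sector_a = rng.standard_normal((T, 1))
sector_b = rng.standard_normal((T, 1))
returns = 0.5 * market + 0.5 * rng.standard_normal((T, 2 * group_size))
returns[:, :group_size] += 0.7 * sector_a
returns[:, group_size:] += 0.7 * sector_b
E = np.corrcoef(returns, rowvar=False)

# Market mode removal: E' = E - lambda_1 v_1 v_1^t
eigvals, eigvecs = np.linalg.eigh(E)
lam1, v1 = eigvals[-1], eigvecs[:, -1]
E_prime = E - lam1 * np.outer(v1, v1)

# Dissimilarity d_ij = 1 - corr(i, j), using E' entries as the correlations
d = 1.0 - E_prime
clusters = average_linkage(d, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

The double loop is quadratic in the number of clusters at every merge, which is fine for a demonstration but slow for several hundred stocks; a library implementation of hierarchical clustering would be the practical choice there.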

SLIDE 12

5 A multi-layered correlation model: preview visualisation

[Figure: heatmap of the model correlation matrix (L); normalised eigenvalue density against eigenvalue, comparing the histogram of observed eigenvalues, the Marčenko-Pastur distribution, the simulated density and the analytic prediction (R).]

Heatmap (L) and analytic prediction (R) for the empirical e.d.f. of a multi-layered model.

SLIDE 13

5 A multi-layered correlation model: analytic prediction

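Model: $\nu(\lambda) = P^{-1} \sum_{i=1}^{P} \delta(\lambda - \lambda_i)$, the e.d.f. of the underlying correlation matrix C.

Prediction: $f(\lambda)$, the limiting e.d.f. of the empirical correlation matrix E.

The Stieltjes transform pair

$$G(z) = \int_{-\infty}^{\infty} d\lambda\, \frac{f(\lambda)}{\lambda - z}, \qquad f(\lambda) = \lim_{\epsilon \to 0^+} \frac{1}{\pi} \operatorname{Im}\{G(\lambda + i\epsilon)\}$$

The Marčenko-Pastur equation

$$-\frac{1}{G(z)} = z - r \int_{-\infty}^{\infty} d\lambda\, \frac{\lambda\, \nu(\lambda)}{1 + \lambda G(z)}$$

A polynomial equation

$$[1 + zG(z)] \prod_{i=1}^{P} [1 + \lambda_i G(z)] = \frac{1}{T} \sum_{i=1}^{P} \lambda_i G(z) \prod_{j \neq i} [1 + \lambda_j G(z)].$$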
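The polynomial equation can be solved numerically at each real point x. The sketch below groups equal eigenvalues of C into distinct values `lams` with fractions `weights`, which keeps the polynomial degree low, and takes the root with positive imaginary part. One assumption is worth flagging: in the form written here the equation is treated as being solved by the transform of the T × T companion matrix $X X^t / T$, whose spectrum carries only a fraction r of its mass on the nonzero eigenvalues, so the sketch divides Im G / π by r. The function name and this normalisation are choices of this example.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

def predicted_density(x, lams, weights, r):
    """Predicted density of E at a real point x > 0.
    lams:    distinct eigenvalues of the underlying matrix C
    weights: fraction of eigenvalues of C equal to each entry of lams
    r:       ratio P/T
    Coefficient arrays are in increasing powers of G."""
    factors = [np.array([1.0, lam]) for lam in lams]       # 1 + lam_k G

    def product(polys):
        out = np.array([1.0])
        for p in polys:
            out = npoly.polymul(out, p)
        return out

    # Left side: (1 + xG) * prod_k (1 + lam_k G)
    lhs = npoly.polymul([1.0, x], product(factors))
    # Right side: r * sum_k w_k lam_k G * prod_{j != k} (1 + lam_j G)
    rhs = np.array([0.0])
    for k, (lam, w) in enumerate(zip(lams, weights)):
        others = product(factors[:k] + factors[k + 1:])
        rhs = npoly.polyadd(rhs, npoly.polymul([0.0, r * w * lam], others))

    roots = npoly.polyroots(npoly.polysub(lhs, rhs))
    im_g = max(roots.imag.max(), 0.0)    # root in the upper half-plane, if any
    return im_g / (np.pi * r)            # assumption: companion normalisation

# Sanity check with C = identity, where the closed-form law is known
density_at_one = predicted_density(1.0, [1.0], [1.0], 0.5)
print(density_at_one)
```

For C equal to the identity this reproduces the closed-form Marčenko-Pastur density, which gives some confidence in the normalisation chosen above.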

SLIDE 18

6 The layer division process: fundamental structures

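Layer division:

$$M := M_m(1, \alpha) \quad \text{(a single cluster)}$$

$$\Downarrow$$

$$M' := \begin{pmatrix} M_{m_1}(1, \alpha_1) & B \\ B^t & M_{m_2}(1, \alpha_2) \end{pmatrix} \quad \text{(two smaller clusters)}$$

where $m = m_1 + m_2$.

Fundamental structures:

$$M_n(x, y) \equiv \underbrace{\begin{pmatrix} x & y & \cdots & y \\ y & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & y \\ y & \cdots & y & x \end{pmatrix}}_{n}, \qquad B = \beta\, \underbrace{(1, \ldots, 1)^t}_{m_1}\, \underbrace{(1, 1, \ldots, 1)}_{m_2}.$$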

SLIDE 19

6 The layer division process: useful results

Useful techniques and results:

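  • elementary operations;
  • the identity

$$\begin{pmatrix} S & T \\ U & V \end{pmatrix} \begin{pmatrix} I & 0 \\ -V^{-1}U & I \end{pmatrix} = \begin{pmatrix} S - TV^{-1}U & T \\ 0 & V \end{pmatrix}$$

    where V is invertible;
  • the Sherman-Morrison formula:

$$(A + uv^{\mathsf{T}})^{-1} = A^{-1} - \frac{A^{-1} u v^{\mathsf{T}} A^{-1}}{1 + v^{\mathsf{T}} A^{-1} u}$$

    where A is invertible and $1 + v^{\mathsf{T}} A^{-1} u \neq 0$.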
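The Sherman-Morrison formula is easy to check numerically. As an illustration relevant to the cluster matrices of this section, a matrix with 1 on the diagonal and α elsewhere equals $(1 - \alpha) I + \alpha \mathbf{1}\mathbf{1}^{\mathsf{T}}$, a rank-one update of a multiple of the identity. The values n = 5 and α = 0.3 below are arbitrary choices for this sketch.

```python
import numpy as np

n, alpha = 5, 0.3
ones = np.ones((n, 1))

# Write the cluster matrix as A + u v^T with A = (1 - alpha) I
A = (1 - alpha) * np.eye(n)
u = alpha * ones
v = ones
M = A + u @ v.T                       # 1 on the diagonal, alpha elsewhere

# Sherman-Morrison: (A + u v^T)^{-1} = A^{-1} - A^{-1} u v^T A^{-1} / (1 + v^T A^{-1} u)
A_inv = np.eye(n) / (1 - alpha)
denom = 1.0 + (v.T @ A_inv @ u).item()
assert abs(denom) > 1e-12, "formula requires 1 + v^T A^{-1} u != 0"
updated_inv = A_inv - (A_inv @ u @ v.T @ A_inv) / denom

# Compare against a direct inversion
direct_inv = np.linalg.inv(M)
print(np.max(np.abs(updated_inv - direct_inv)))
```

The formula avoids a full inversion whenever the inverse of A is already known, which is the case here because A is a multiple of the identity.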

SLIDE 20

6 The layer division process: eigenvalue splitting

Findings:

  • M has an eigenvalue $\lambda_1 = 1 - \alpha$ of multiplicity $m - 1$ and an eigenvalue $\lambda_2 = 1 + (m - 1)\alpha$ of multiplicity 1.
  • M′ has eigenvalues $1 - \alpha_{1,2}$ of multiplicity $m_{1,2} - 1$, and the remaining eigenvalues $\lambda'_{1,2}$ are roots of a quadratic polynomial with $\lambda'_1 + \lambda'_2 = \lambda_1 + \lambda_2$.

The interaction correlation β perturbs these two eigenvalues, causing them to separate and repel.
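These findings can be checked numerically on a small example. The quadratic used below, $\lambda^2 - (a + b)\lambda + ab - m_1 m_2 \beta^2 = 0$ with $a = 1 + (m_1 - 1)\alpha_1$ and $b = 1 + (m_2 - 1)\alpha_2$, is derived for this sketch by restricting M′ to vectors that are constant on each cluster; the slide does not write it out. Its roots sum to a + b, the two cluster-level eigenvalues at β = 0, which is one reading of the sum rule above. All parameter values are arbitrary.

```python
import numpy as np

def cluster_block(n, x, y):
    """M_n(x, y): x on the diagonal, y everywhere else."""
    return (x - y) * np.eye(n) + y * np.ones((n, n))

# Single cluster M = M_m(1, alpha)
m, alpha = 7, 0.45
single = np.linalg.eigvalsh(cluster_block(m, 1.0, alpha))

# Divided cluster M' with inter-cluster correlation beta
m1, m2 = 3, 4
alpha1, alpha2, beta = 0.5, 0.4, 0.2
B = beta * np.ones((m1, m2))
M_prime = np.block([
    [cluster_block(m1, 1.0, alpha1), B],
    [B.T, cluster_block(m2, 1.0, alpha2)],
])
eigvals = np.linalg.eigvalsh(M_prime)

# Remaining two eigenvalues from the quadratic (derived for this sketch)
a = 1 + (m1 - 1) * alpha1
b = 1 + (m2 - 1) * alpha2
split = np.roots([1.0, -(a + b), a * b - m1 * m2 * beta ** 2])

print("single cluster:", np.round(single, 3))
print("two clusters:  ", np.round(eigvals, 3))
print("quadratic roots:", np.round(np.sort(split.real), 3))
```

Setting `beta = 0` returns the two roots to a and b, and increasing β pushes them further apart while their sum stays fixed.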

SLIDE 21

6 The layer division process: an excellent match

[Figure: two panels of normalised eigenvalue density against eigenvalue, each comparing the histogram of observed eigenvalues, the Marčenko-Pastur distribution, the simulated density and the analytic prediction.]

Comparison of the predicted empirical spectral density functions of a 10-layer correlation model (L) and a 148-layer one (R) with the market mode.

SLIDE 22

7 Summary & further developments

Summary. More considerations:

  • edge asymptotics with the Tracy-Widom law;
  • time evolution;
  • fine-tuning;
  • . . .

SLIDE 24

Q&A

Acknowledgements

Many thanks to Dr Lucy Colwell and her PhD student Chongli Qin at the Molecular Informatics Centre, Department of Chemistry. This project was generously supported by the Bridgwater Scheme.

More information about this project at http://people.ds.cam.ac.uk/sw664/SUROP%20Project%202016/SUROP.html
