SLIDE 1

Seeking Gold in Sand

Applying Random Matrix Theory to Separation of Signals from Noise in Stock Market Data

Mike S. Wang

Department of Chemistry, University of Cambridge

August, 2016

Mike S. Wang Seeking Gold in Sand with Random Matrix Theory

SLIDE 2

Preview of the talk

1. What is Random Matrix Theory?
2. Project investigations: Seeking Gold in Sand
3. Looking beyond Financial Applications
4. Q&A Time

SLIDE 3

What is Random Matrix Theory?

Random Matrix Theory (RMT) is the study of matrices with random variable entries, e.g.

$$\begin{pmatrix} X_{11} & X_{12} & \cdots \\ X_{21} & X_{22} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}.$$

In particular, we are interested in the emergent behaviours of random matrices in the asymptotic limit.

Introduced by Wishart (1928), RMT gained prominence when Wigner (1950) applied the theory in nuclear physics.

SLIDE 4

Project overview

We will

  • investigate the correlation matrix of stock market data (e.g. S&P 500);
  • given a ‘crude’ prediction, build improved analytic predictions for the purpose of signal/noise detection.

Improving a ‘crude’ prediction:

mode analysis + clustering analysis
⇓
new correlation matrix model
⇓ (RMT)
better analytic predictions for signal and noise separation
⇓
more profitable investment portfolios

SLIDE 5

Preview of data: obtaining correlation matrix

Price indices → logarithmic returns → de-meaned and normalised data:

  • raw data (rows: T days; columns: P stocks)

$$\begin{pmatrix} 126.8 & 30.5 & \cdots \\ 126.3 & 30.7 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

  • → standardised data X

$$\begin{pmatrix} -0.9 & 1.7 & \cdots \\ 1.5 & 0.3 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

  • → correlation matrix E

$$\begin{pmatrix} 1.0 & 0.2 & \cdots \\ 0.2 & 1.0 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

If X is the standardised T × P data matrix, the correlation matrix E is calculated as

$$E = \frac{1}{T} X^{t} X .$$
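The pipeline above (prices → logarithmic returns → standardised data → correlation matrix) can be sketched in plain Python. This is a minimal illustration, not the code used in the project: the function names are hypothetical, the price table is made up for the example (only its first two rows echo the slide), and each column is normalised by its population standard deviation so that E = XᵗX / T has a unit diagonal.

```python
import math

def log_returns(prices):
    """prices: T+1 rows (days) of P prices -> T rows of logarithmic returns."""
    return [[math.log(curr / prev) for prev, curr in zip(row_prev, row_curr)]
            for row_prev, row_curr in zip(prices, prices[1:])]

def standardise(returns):
    """De-mean each column (stock) and scale it to unit variance."""
    T, P = len(returns), len(returns[0])
    columns = list(zip(*returns))
    means = [sum(col) / T for col in columns]
    stds = [math.sqrt(sum((x - m) ** 2 for x in col) / T)
            for col, m in zip(columns, means)]
    return [[(row[j] - means[j]) / stds[j] for j in range(P)] for row in returns]

def correlation_matrix(X):
    """E = X^t X / T for a standardised T x P data matrix X."""
    T, P = len(X), len(X[0])
    return [[sum(X[t][i] * X[t][j] for t in range(T)) / T for j in range(P)]
            for i in range(P)]

# Illustrative prices only: 5 days x 2 stocks (assumed values).
prices = [
    [126.8, 30.5],
    [126.3, 30.7],
    [127.1, 30.6],
    [126.9, 30.9],
    [128.0, 31.0],
]
X = standardise(log_returns(prices))
E = correlation_matrix(X)
```

With this normalisation the diagonal entries of E are 1 and the off-diagonal entries are the usual Pearson correlations between the return series.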

SLIDE 6

Visualisation and features of the correlation matrix

[Figure (a): normalised distribution density of the eigenvalues, with an inset showing the market mode. Eigenvalues → different modes.]

[Figure (b): heatmap of the correlation matrix, showing sector clustering. Heatmap → suggests clustering.]

Study of the structure:

  • Mode analysis: localisation of modes.
  • Clustering analysis: hierarchical clustering structure.

SLIDE 7

A ‘crude’ prediction for signals and noise

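If the stock prices are independent and purely random, then, whatever the distribution, the underlying correlation matrix is the identity,

$$\begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix},$$

and the eigenvalues of the sample correlation matrix follow the Marčenko-Pastur (M-P) distribution:

$$\nu(x) = \frac{1}{2\pi} \frac{\sqrt{(\lambda_+ - x)(x - \lambda_-)}}{c x}, \qquad c = \frac{P}{T}, \qquad \lambda_\pm = \sigma^2 (1 \pm \sqrt{c})^2 ,$$

where σ² = 1 for standardised data. All noise!!

[Figure: the M-P density ν(x) plotted against x.]

In reality, the stock prices of course cannot be completely independent and purely random. The underlying correlation matrix is unknown (?), and we see that ⇒

[Figure: normalised distribution density histogram of the observed eigenvalues against the Marčenko-Pastur law, with labels marking the signals and the incorrect noise band.]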
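As a rough illustration of the ‘crude’ prediction, the sketch below evaluates the M-P band edges and density and flags eigenvalues that fall outside the band. The function names and the ratio c = 0.25 are assumptions made for the example (the slides do not give T). The density is written with the general prefactor 1/(2πσ²), which reduces to the slide's expression when σ² = 1.

```python
import math

def mp_edges(c, sigma2=1.0):
    """Band edges lambda_- and lambda_+ for aspect ratio c = P / T."""
    root = math.sqrt(c)
    return sigma2 * (1.0 - root) ** 2, sigma2 * (1.0 + root) ** 2

def mp_density(x, c, sigma2=1.0):
    """Marcenko-Pastur density nu(x); zero outside [lambda_-, lambda_+]."""
    lo, hi = mp_edges(c, sigma2)
    if x <= lo or x >= hi:
        return 0.0
    return math.sqrt((hi - x) * (x - lo)) / (2.0 * math.pi * sigma2 * c * x)

def outside_noise_band(eigenvalue, c, sigma2=1.0):
    """True if the eigenvalue lies outside the crude M-P noise band."""
    lo, hi = mp_edges(c, sigma2)
    return eigenvalue < lo or eigenvalue > hi

c = 0.25                      # assumed P / T, for illustration only
lam_minus, lam_plus = mp_edges(c)

# The market-mode eigenvalue quoted in the mode analysis is far above the band.
market_is_signal = outside_noise_band(99.1247, c)
```

Under this crude model every eigenvalue inside [λ₋, λ₊] is treated as noise. The talk's point is that this band is not the right one for real market data, so the sketch is a baseline rather than a recommended filter.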

SLIDE 8

Mode analysis

We have studied localisation of specific modes.

[Figure (a): components of the 1st mode across the stocks, grouped by sector (Cons. D, Cons. S, Energy, Financials, Healthcare, Industrials, IT, Materials, Telecom, Utilities); eigenvalue = 99.1247; IPR = 0.0029781. The market mode: uniform.]

[Figure (b): components of the 452nd mode across the stocks, grouped by the same sectors; eigenvalue = 0.059557; IPR = 0.14855. The lowest mode: localised.]
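The slides quote an IPR for each mode without defining it. A common definition of the inverse participation ratio, assumed here, is the sum of the fourth powers of the components of a unit-norm eigenvector. The sketch below (with a hypothetical function name) shows why a uniform mode has a small IPR and a localised mode a large one.

```python
def inverse_participation_ratio(v):
    """IPR = sum(v_i^4) / (sum(v_i^2))^2, i.e. sum of v_i^4 for a unit vector.

    Assumed definition: it equals 1/N for a vector spread evenly over
    N components and 1 for a vector concentrated on a single component.
    """
    norm_sq = sum(x * x for x in v)
    return sum(x ** 4 for x in v) / norm_sq ** 2

N = 452                                   # number of modes quoted on the slide
uniform_mode = [1.0] * N                  # evenly spread, like a market mode
localised_mode = [1.0] + [0.0] * (N - 1)  # all weight on one stock

ipr_uniform = inverse_participation_ratio(uniform_mode)
ipr_localised = inverse_participation_ratio(localised_mode)
```

For N = 452 a perfectly uniform vector gives 1/452 ≈ 0.0022, the same order as the value quoted for the market mode, while the much larger value quoted for the lowest mode indicates weight concentrated on only a few stocks.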

SLIDE 9

Sector analysis

Having removed the market mode, the internal structure of the stock market is

  • revealed by the hierarchical clustering method;
  • confirmed by the minimum spanning tree.

[Figure (a): a hierarchical dendrogram, with linkage distances from 0.3 to 1, over the stocks CVS, HCBK, WAG, ANF, GPS, TJX, LTD, COST, FDO, TGT, WMT, KSS, JCP, BBY, HD, JWN, BBBY, RSH, ROST, SPLS, TIF, AZO, ORLY, COH.]

[Figure (b): the minimum spanning tree.]
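One way to build a minimum spanning tree from a correlation matrix is sketched below. It assumes the commonly used distance d = √(2(1 − ρ)) and Prim's algorithm, neither of which is stated in the slides. The 4 × 4 correlation matrix is made up for the example and the function names are hypothetical.

```python
import math

def correlation_distance(rho):
    """Map a correlation in [-1, 1] to a distance (assumed metric)."""
    return math.sqrt(2.0 * (1.0 - rho))

def minimum_spanning_tree(corr):
    """Prim's algorithm; returns a list of edges (i, j, distance)."""
    n = len(corr)
    in_tree = {0}                      # grow the tree from stock 0
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                d = correlation_distance(corr[i][j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        edges.append((i, j, d))        # shortest edge leaving the tree
        in_tree.add(j)
    return edges

# Illustrative correlations: stocks 0-1 and 2-3 form two tight pairs.
corr = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
tree = minimum_spanning_tree(corr)
```

In this toy case the tree links each tight pair directly and joins the two pairs through their most correlated members, which is the kind of sector grouping the dendrogram is meant to confirm.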

SLIDE 10

A better prediction

A new model for the underlying correlation matrix structure.

What next?

a model for underlying correlation matrix
⇓
better analytic predictions for noise bands
⇓
more profitable investment portfolios!

SLIDE 11

Beyond financial applications

Many other mathematical and scientific fields:

  • statistics and numerical analysis;
  • number theory;
  • theoretical neuroscience;
  • optimisation and control;
  • ...

. . . and biochemistry!

SLIDE 12

Application of RMT in biochemistry

Covariance analysis of protein sequence alignments can be used to infer protein structure and function.

SLIDE 13

Q&A time

Thanks for listening!

SLIDE 14

Challenges ahead

We have made many simplifications in deriving a new analytic prediction. In particular, we could further investigate:

  • what is the optimal number of layers of hierarchy in the correlation matrix structure model?

  • how could we take time evolution of the market into account?
  • . . .

SLIDE 15

Effect of the market mode: a visualisation

[Figure (a): heatmap of the correlation matrix, market mode unremoved.]

[Figure (b): heatmap of the correlation matrix, market mode removed.]
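The slides do not say how the market mode was removed. One simple possibility, assumed here, is to subtract the contribution of the largest eigenvalue and its eigenvector from the correlation matrix. The sketch below finds that eigenpair by power iteration in plain Python; the function names and the 3 × 3 matrix are hypothetical.

```python
import math

def top_eigenpair(E, iters=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix E.

    Power iteration from a uniform start vector, which suits a market
    mode whose components all share the same sign.
    """
    n = len(E)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(E[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Ev = [sum(E[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * Ev[i] for i in range(n))   # Rayleigh quotient
    return lam, v

def remove_market_mode(E):
    """Return E - lam_1 * v_1 v_1^t (assumed removal scheme)."""
    lam, v = top_eigenpair(E)
    n = len(E)
    return [[E[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]

# Illustrative matrix: every pair of stocks has correlation 0.5.
E = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
lam_market, v_market = top_eigenpair(E)
residual = remove_market_mode(E)
```

Here the market mode has eigenvalue 2 with a uniform eigenvector, and the residual matrix keeps only what is left once that common movement is taken out. Regressing each return series on a market index would be an alternative scheme.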
