SLIDE 1

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations

Johannes Heiny

Ruhr University Bochum, Germany

Joint work with Thomas Mikosch (Copenhagen), Richard Davis (Columbia), and Jianfeng Yao (HKU).

KIAS, Random Matrices and Related Topics, May 9, 2019

  • J. Heiny

Heavy-tailed correlation and covariance matrices 1 / 37

SLIDE 2

Motivation: S&P 500 Index

Figure: Estimated tail indices of log-returns of 478 time series in the S&P 500 index. Axes: lower tail index and upper tail index, each ranging from 2.0 to 5.0.

SLIDE 3

Setup

Data matrix $X = X_p$: a $p \times n$ matrix with iid centered columns,
$$X = (X_{it})_{i=1,\ldots,p;\, t=1,\ldots,n}.$$

Sample covariance matrix: $S = \frac{1}{n} X X'$

Ordered eigenvalues of $S$: $\lambda_1(S) \ge \lambda_2(S) \ge \cdots \ge \lambda_p(S)$

Applications:

  • Principal Component Analysis
  • Linear Regression, ...
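The setup can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the talk: the function name `sample_covariance_spectrum`, the Gaussian entries, and the dimensions are assumptions chosen only to make the example runnable.

```python
import numpy as np

def sample_covariance_spectrum(X):
    """Return S = X X' / n and its eigenvalues in decreasing order.

    X is a p x n data matrix whose columns are the observations.
    """
    p, n = X.shape
    S = X @ X.T / n
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending
    # order, so reverse them to match lambda_1 >= ... >= lambda_p.
    eigenvalues = np.linalg.eigvalsh(S)[::-1]
    return S, eigenvalues

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 200))   # illustrative choice: p = 5, n = 200
S, lam = sample_covariance_spectrum(X)
```

The eigenvalues sum to the trace of S, which gives a quick sanity check on the output.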

SLIDE 4

Sample Correlation Matrix

Sample correlation matrix $R$ with entries
$$R_{ij} = \frac{S_{ij}}{\sqrt{S_{ii} S_{jj}}}, \quad i, j = 1, \ldots, p,$$
and eigenvalues $\lambda_1(R) \ge \lambda_2(R) \ge \cdots \ge \lambda_p(R)$.

SLIDE 6

The Model

Data structure: $X_p = A_p Z_p$, where $A_p$ is a deterministic $p \times p$ matrix such that $(\|A_p\|)$ is bounded and $Z_p = (Z_{it})_{i=1,\ldots,p;\, t=1,\ldots,n}$ has iid, centered entries with unit variance (if finite).

Population covariance matrix: $\Sigma = A A'$

Population correlation matrix: $\Gamma = (\operatorname{diag}(\Sigma))^{-1/2}\, \Sigma\, (\operatorname{diag}(\Sigma))^{-1/2}$

Note: $E[S] = \Sigma$ but $E[R_{ij}] = \Gamma_{ij} + O(n^{-1})$.

SLIDE 8

The Model

                      Sample    Population
Covariance matrix     $S$       $\Sigma$
Correlation matrix    $R$       $\Gamma$

Growth regime: $n = n_p \to \infty$ and $p / n_p \to \gamma \in [0, \infty)$ as $p \to \infty$.

High dimension: $\lim_{p \to \infty} p/n \in (0, \infty)$

Moderate dimension: $\lim_{p \to \infty} p/n = 0$

SLIDE 9

Main Result

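Approximation Under Finite Fourth Moment

Assume $X = AZ$ and $E[Z_{11}^4] < \infty$. Then we have, as $p \to \infty$,
$$\sqrt{n/p}\, \|\operatorname{diag}(S) - \operatorname{diag}(\Sigma)\| \xrightarrow{\text{a.s.}} 0.$$

Approximation Under Infinite Fourth Moment

Assume $X = Z$ and $E[Z_{11}^4] = \infty$. Then we have, as $p \to \infty$,
$$\underbrace{c_{np}}_{\to 0}\, \|S - \operatorname{diag}(S)\| \xrightarrow{P} 0.$$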

SLIDE 11

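Main result

Assume $X = AZ$ and $E[Z_{11}^4] < \infty$. Then we have, as $p \to \infty$,
$$\sqrt{n/p}\, \|\operatorname{diag}(S) - \operatorname{diag}(\Sigma)\| \xrightarrow{\text{a.s.}} 0$$
and
$$\sqrt{n/p}\, \|(\operatorname{diag}(S))^{-1/2} - (\operatorname{diag}(\Sigma))^{-1/2}\| \xrightarrow{\text{a.s.}} 0.$$

Relevance: Note that $R = (\operatorname{diag}(S))^{-1/2}\, S\, (\operatorname{diag}(S))^{-1/2}$.

$S = \frac{1}{n} X X'$ and $R = Y Y'$, where
$$Y = (Y_{ij})_{p \times n} = \left( \frac{X_{ij}}{\sqrt{\sum_{t=1}^n X_{it}^2}} \right)_{p \times n}.$$

In general, any two entries of $Y$ are dependent.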
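The two descriptions of $R$ can be checked against each other numerically. The sketch below is illustrative only: the helper name `self_normalize` and the Gaussian test data are assumptions, not part of the talk.

```python
import numpy as np

def self_normalize(X):
    """Divide each row of the p x n matrix X by its Euclidean norm.

    The result is the matrix Y with entries X_ij / sqrt(sum_t X_it^2).
    """
    row_norms = np.sqrt(np.sum(X**2, axis=1, keepdims=True))
    return X / row_norms

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 50))          # illustrative p = 4, n = 50
n = X.shape[1]

# Route 1: R = diag(S)^{-1/2} S diag(S)^{-1/2}
S = X @ X.T / n
d = 1.0 / np.sqrt(np.diag(S))
R_from_S = S * np.outer(d, d)

# Route 2: R = Y Y'
Y = self_normalize(X)
R_from_Y = Y @ Y.T
```

Both routes give the same matrix with unit diagonal; the factor 1/n in S cancels in the normalization.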

SLIDE 14

A Comparison Under Finite Fourth Moment

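Approximation of the sample correlation matrix

Assume $X = AZ$ and $E[Z_{11}^4] < \infty$. Then we have
$$\sqrt{n/p}\, \Bigl\| R - \underbrace{(\operatorname{diag}(\Sigma))^{-1/2}\, S\, (\operatorname{diag}(\Sigma))^{-1/2}}_{S_Q} \Bigr\| \xrightarrow{\text{a.s.}} 0.$$

Spectrum comparison

An application of Weyl's inequality yields
$$\sqrt{n/p}\, \max_{i=1,\ldots,p} |\lambda_i(R) - \lambda_i(S_Q)| \le \sqrt{n/p}\, \|R - S_Q\| \xrightarrow{\text{a.s.}} 0.$$

Operator norm consistent estimation
$$\|R - \Gamma\| = O(\sqrt{p/n}) \quad \text{a.s.}$$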

SLIDE 15

Notation

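Empirical spectral distribution of a $p \times p$ matrix $C$ with real eigenvalues $\lambda_1(C), \ldots, \lambda_p(C)$:
$$F_C(x) = \frac{1}{p} \sum_{i=1}^p 1_{\{\lambda_i(C) \le x\}}, \quad x \in \mathbb{R}.$$

Stieltjes transform:
$$s_C(z) = \int_{\mathbb{R}} \frac{1}{x - z}\, dF_C(x) = \frac{1}{p} \operatorname{tr}(C - zI)^{-1}, \quad z \in \mathbb{C}^+.$$

Limiting spectral distribution: weak convergence of $(F_{C_p})$ to a distribution function $F$ a.s.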
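Both definitions translate directly into code. The following sketch (function names are hypothetical, and the symmetric test matrix is an arbitrary illustration) evaluates the empirical spectral distribution and computes the Stieltjes transform in its two equivalent forms, once from the eigenvalues and once from the resolvent trace.

```python
import numpy as np

def esd(eigenvalues, x):
    """F_C(x): the fraction of eigenvalues that are <= x."""
    return np.mean(np.asarray(eigenvalues) <= x)

def stieltjes_from_eigenvalues(eigenvalues, z):
    """s_C(z) = (1/p) * sum_i 1 / (lambda_i - z)."""
    return np.mean(1.0 / (np.asarray(eigenvalues) - z))

def stieltjes_from_resolvent(C, z):
    """s_C(z) = (1/p) * tr (C - z I)^{-1}."""
    p = C.shape[0]
    return np.trace(np.linalg.inv(C - z * np.eye(p))) / p

rng = np.random.default_rng(2)
G = rng.standard_normal((6, 6))
C = (G + G.T) / 2                 # symmetric, hence real eigenvalues
lam = np.linalg.eigvalsh(C)
z = 0.3 + 1.0j                    # a point in the upper half-plane

s_eig = stieltjes_from_eigenvalues(lam, z)
s_res = stieltjes_from_resolvent(C, z)
```

For $z$ in the upper half-plane the transform has positive imaginary part, which is a convenient check when implementing fixed-point equations for limiting laws.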

SLIDE 16

Limiting Spectral Distribution of R

Assume $X = AZ$, $E[Z_{11}^4] < \infty$ and that $F_\Gamma$ converges to a probability distribution $H$.

1. If $p/n \to \gamma \in (0, \infty)$, then $F_R$ converges weakly to a distribution function $F_{\gamma,H}$, whose Stieltjes transform $s$ satisfies
$$s(z) = \int \frac{dH(t)}{t\,(1 - \gamma - \gamma z s(z)) - z}, \quad z \in \mathbb{C}^+.$$

2. If $p/n \to 0$, then $F_{\sqrt{n/p}\,(R - \Gamma)}$ converges weakly to a distribution function $F$, whose Stieltjes transform $s$ satisfies
$$s(z) = -\int \frac{dH(t)}{z + t\, \tilde{s}(z)}, \quad z \in \mathbb{C}^+,$$
where $\tilde{s}$ is the unique solution to
$$\tilde{s}(z) = -\int (z + t\, \tilde{s}(z))^{-1}\, t\, dH(t), \quad z \in \mathbb{C}^+.$$

SLIDE 17

Special Case A = I

Simplified assumptions:

1. iid, symmetric entries $X_{it} \overset{d}{=} X$

2. Growth regime: $\lim_{p \to \infty} p/n = \gamma \in [0, 1]$

SLIDE 19

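Marčenko–Pastur and Semicircle Law

Marčenko–Pastur law $F_\gamma$ has density
$$f_\gamma(x) = \begin{cases} \dfrac{1}{2\pi x \gamma} \sqrt{(b - x)(x - a)}, & \text{if } x \in [a, b], \\ 0, & \text{otherwise}, \end{cases}$$
where $a = (1 - \sqrt{\gamma})^2$ and $b = (1 + \sqrt{\gamma})^2$.

Semicircle law $SC$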
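The density is easy to evaluate numerically. The sketch below is an illustration under the slide's parameterization with $0 < \gamma \le 1$; the function name `mp_density` and the grid size are assumptions. It checks by a trapezoidal sum that the density integrates to one.

```python
import numpy as np

def mp_density(x, gamma):
    """Marchenko-Pastur density f_gamma on [a, b], zero elsewhere.

    Intended for 0 < gamma <= 1, where the law has no atom at zero.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    a = (1.0 - np.sqrt(gamma)) ** 2
    b = (1.0 + np.sqrt(gamma)) ** 2
    density = np.zeros_like(x)
    inside = (x >= a) & (x <= b) & (x > 0)
    xi = x[inside]
    density[inside] = np.sqrt((b - xi) * (xi - a)) / (2.0 * np.pi * gamma * xi)
    return density

gamma = 0.5
a = (1.0 - np.sqrt(gamma)) ** 2
b = (1.0 + np.sqrt(gamma)) ** 2
grid = np.linspace(a, b, 20001)
values = mp_density(grid, gamma)

# Trapezoidal rule written out by hand to stay independent of NumPy version.
widths = np.diff(grid)
total_mass = np.sum((values[1:] + values[:-1]) / 2.0 * widths)
first_moment = np.sum((grid[1:] * values[1:] + grid[:-1] * values[:-1]) / 2.0 * widths)
```

The first moment equals one as well, matching the unit-variance normalization of the entries.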

SLIDE 21

Extreme Eigenvalues

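Largest and smallest eigenvalues of R

If $p/n \to \gamma \in [0, 1]$ and $E[X^4] < \infty$, then
$$\sqrt{n/p}\, (\lambda_1(R) - 1) \xrightarrow{\text{a.s.}} 2 + \sqrt{\gamma}$$
and
$$\sqrt{n/p}\, (\lambda_p(R) - 1) \xrightarrow{\text{a.s.}} -2 + \sqrt{\gamma}.$$

Earlier: $\|R - \Gamma\| = O(\sqrt{p/n})$ a.s.

In this case:
$$\sqrt{n/p}\, \|R - \Gamma\| \xrightarrow{\text{a.s.}} 2 + \sqrt{\gamma}.$$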
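A small simulation can illustrate these limits at finite sample size. This is only a sketch: Gaussian entries, the dimensions, and the helper name `normalized_extremes` are assumptions, and finite-sample values fluctuate around the limits rather than matching them exactly.

```python
import numpy as np

def normalized_extremes(X):
    """Return sqrt(n/p) * (lambda_1(R) - 1) and sqrt(n/p) * (lambda_p(R) - 1).

    R = Y Y' is built from the row-normalized data matrix, as in the slides.
    """
    p, n = X.shape
    Y = X / np.sqrt(np.sum(X**2, axis=1, keepdims=True))
    eigenvalues = np.linalg.eigvalsh(Y @ Y.T)      # ascending order
    scale = np.sqrt(n / p)
    return scale * (eigenvalues[-1] - 1.0), scale * (eigenvalues[0] - 1.0)

rng = np.random.default_rng(3)
p, n = 300, 1200                                   # p/n = 0.25
X = rng.standard_normal((p, n))
top, bottom = normalized_extremes(X)

gamma = p / n
limit_top = 2.0 + np.sqrt(gamma)
limit_bottom = -2.0 + np.sqrt(gamma)
```

Comparing `top` with `limit_top` and `bottom` with `limit_bottom` for growing dimensions shows the normalized extremes settling near the stated limits.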

SLIDE 23

Limiting Spectral Distribution

Marčenko–Pastur Theorem

Assume $E[X^2] = 1$. Then $(F_S)$ converges weakly to $F_\gamma$. If $E[X^4] < \infty$ and $p/n \to 0$, then $(F_{\sqrt{n/p}\,(S - I)})$ converges weakly to $SC$.

JH (2018+)

Under the domain of attraction type-condition for the Gaussian law,
$$\lim_{p \to \infty} \sqrt{n/p}\; n\, E\bigl[Y_{11}^4\bigr] = 0,$$
the sequence $(F_R)$ converges weakly to $F_\gamma$. If in addition $p/n \to 0$, then $(F_{\sqrt{n/p}\,(R - I)})$ converges weakly to $SC$. Here
$$Y_{ij} = \frac{X_{ij}}{\sqrt{\sum_{t=1}^n X_{it}^2}}.$$

SLIDE 25

Simulation Study

Regular variation with index $\alpha > 0$:
$$P(|X| > x) = x^{-\alpha} L(x),$$
where $L$ is a slowly varying function. This implies $E[|X|^{\alpha + \varepsilon}] = \infty$ for any $\varepsilon > 0$.

Procedure:

1. Simulate $X$

2. Plot histograms of $(\lambda_i(R))$ and $(\lambda_i(S))$

3. Compare with Marčenko–Pastur density
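The three steps of the procedure can be sketched as follows. This is not the simulation code behind the talk: the symmetric Pareto-type entries with $L \equiv 1$, the smaller dimensions, the bin count, and all function names are assumptions made for illustration. Histogram heights are computed with NumPy so that any plotting library can be used afterwards.

```python
import numpy as np

def symmetric_pareto(alpha, size, rng):
    """iid symmetric entries with P(|X| > x) = x^{-alpha} for x >= 1."""
    u = 1.0 - rng.random(size)                    # uniform on (0, 1]
    signs = rng.choice([-1.0, 1.0], size=size)
    return signs * u ** (-1.0 / alpha)

def mp_density(x, gamma):
    """Marchenko-Pastur density for 0 < gamma <= 1."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    a, b = (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2
    out = np.zeros_like(x)
    inside = (x >= a) & (x <= b) & (x > 0)
    out[inside] = np.sqrt((b - x[inside]) * (x[inside] - a)) / (2 * np.pi * gamma * x[inside])
    return out

rng = np.random.default_rng(4)
alpha, p, n = 6.0, 200, 400

# Step 1: simulate X.
X = symmetric_pareto(alpha, (p, n), rng)

# Step 2: eigenvalues of S (entries rescaled to unit variance, which
# requires alpha > 2) and of R (self-normalized rows), plus histograms.
X_unit = X / np.sqrt(alpha / (alpha - 2.0))
lam_S = np.linalg.eigvalsh(X_unit @ X_unit.T / n)
Y = X / np.sqrt(np.sum(X**2, axis=1, keepdims=True))
lam_R = np.linalg.eigvalsh(Y @ Y.T)
heights_R, edges = np.histogram(lam_R, bins=30, density=True)
heights_S, _ = np.histogram(lam_S, bins=edges, density=True)

# Step 3: Marchenko-Pastur density at the bin centers for comparison.
centers = (edges[:-1] + edges[1:]) / 2.0
mp_values = mp_density(centers, p / n)
```

Plotting `heights_R` and `heights_S` against `centers` together with `mp_values` reproduces the kind of comparison described in the procedure; lowering `alpha` below 4 changes the picture for S, while R needs no variance rescaling at all.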

SLIDE 26

α = 6

α = 6, n = 2000, p = 1000

SLIDE 28

Infinite Fourth Moment

Regular variation with index $\alpha \in (0, 4)$

Normalizing sequence $(a_{np}^2)$ such that
$$np\, P(X^2 > a_{np}^2\, x) \to x^{-\alpha/2} \quad \text{as } n \to \infty, \text{ for } x > 0.$$
Then $a_{np} = (np)^{1/\alpha}\, \ell(np)$ for a slowly varying function $\ell$.
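As a worked instance of this normalization, assume (for illustration only) exact Pareto tails, $P(|X| > x) = x^{-\alpha}$ for $x \ge 1$. Then $a_{np} = (np)^{1/\alpha}$, that is $\ell \equiv 1$, satisfies the defining relation exactly as soon as $a_{np}\, x^{1/2} \ge 1$:

```latex
np\, P\bigl(X^2 > a_{np}^2\, x\bigr)
  = np\, P\bigl(|X| > a_{np}\, x^{1/2}\bigr)
  = np\, \bigl(a_{np}\, x^{1/2}\bigr)^{-\alpha}
  = np\, (np)^{-1}\, x^{-\alpha/2}
  = x^{-\alpha/2}.
```

For general regularly varying tails the slowly varying factor $L$ enters, which is what the function $\ell$ absorbs.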

SLIDE 29

Reduction to Diagonal

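Diagonal

$X$ with iid regularly varying entries, $\alpha \in (0, 4)$, and $p = n^\beta \ell(n)$ with $\beta \in [0, 1]$. We have
$$a_{np}^{-2}\, \|XX' - \operatorname{diag}(XX')\| \xrightarrow{P} 0,$$
where $\|\cdot\|$ denotes the spectral norm.

$$(XX')_{ij} = \sum_{t=1}^n X_{it} X_{jt}.$$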

SLIDE 30

Eigenvalues

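Weyl's inequality
$$\max_{i=1,\ldots,p} |\lambda_i(A + B) - \lambda_i(A)| \le \|B\|.$$

Choose $A + B = XX'$ and $A = \operatorname{diag}(XX')$ to obtain
$$a_{np}^{-2} \max_{i=1,\ldots,p} |\lambda_i(XX') - \lambda_i(\operatorname{diag}(XX'))| \xrightarrow{P} 0, \quad n \to \infty.$$

Note: Limit theory for $(\lambda_i(S))$ reduced to $(S_{ii})$.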
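Weyl's inequality with the choice $A = \operatorname{diag}(XX')$ can be verified numerically. In the sketch below the helper name `weyl_gap` and the Gaussian test matrix are assumptions; the inequality itself holds for any symmetric $A$ and $B$.

```python
import numpy as np

def weyl_gap(A, B):
    """Return (max_i |lambda_i(A + B) - lambda_i(A)|, spectral norm of B).

    A and B must be symmetric; eigenvalues are paired in sorted order.
    """
    lam_A = np.linalg.eigvalsh(A)
    lam_sum = np.linalg.eigvalsh(A + B)
    gap = np.max(np.abs(lam_sum - lam_A))
    return gap, np.linalg.norm(B, 2)   # ord=2 gives the largest singular value

rng = np.random.default_rng(5)
X = rng.standard_normal((20, 100))
gram = X @ X.T
A = np.diag(np.diag(gram))             # diag(XX')
B = gram - A                           # off-diagonal part
gap, norm_B = weyl_gap(A, B)
```

For light-tailed data such as this Gaussian example the off-diagonal part is not negligible; the point of the heavy-tailed result is that after dividing by $a_{np}^2$ it becomes so.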

SLIDE 31

Heavy-tailed case

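Theorem (Heiny and Mikosch, 2016)

$X$ with iid regularly varying entries, $\alpha \in (0, 4)$, and $p_n = n^\beta \ell(n)$ with $\beta \in [0, 1]$.

1. If $\beta \in [0, 1]$, then
$$a_{np}^{-2} \max_{i=1,\ldots,p} |\lambda_i(XX') - \lambda_i(\operatorname{diag}(XX'))| \xrightarrow{P} 0.$$

2. If $\beta \in ((\alpha/2 - 1)_+, 1]$, then
$$a_{np}^{-2} \max_{i=1,\ldots,p} |\lambda_i(XX') - X^2_{(i),np}| \xrightarrow{P} 0.$$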

SLIDE 32

Example: Eigenvalues

Figure: Smoothed histogram based on 20000 simulations of the approximation error for the normalized eigenvalue $a_{np}^{-2} \lambda_1(S)$ for entries $X_{it}$ with $\alpha = 1.6$, $\beta = 1$, $n = 1000$ and $p = 200$.

SLIDE 33

Eigenvectors

$v_k$: unit eigenvector of $S$ associated to $\lambda_k(S)$.

Unit eigenvectors of $\operatorname{diag}(S)$ are the canonical basis vectors $e_j$.

Eigenvectors

$X$ with iid regularly varying entries with index $\alpha \in (0, 4)$ and $p_n = n^\beta \ell(n)$ with $\beta \in [0, 1]$. Then for any fixed $k \ge 1$,
$$\|v_k - e_{L_k}\|_{\ell^2} \xrightarrow{P} 0, \quad n \to \infty.$$
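The contrast between a localized and a delocalized leading eigenvector can be explored with a short simulation. This is a sketch under assumptions: Pareto(0.8) is read here as $P(X > x) = x^{-0.8}$ for $x \ge 1$, the dimensions follow the figures in the talk, and the helper name `top_eigenvector` is hypothetical.

```python
import numpy as np

def top_eigenvector(X):
    """Unit eigenvector v_1 of S = X X' / n for the largest eigenvalue."""
    n = X.shape[1]
    _, vectors = np.linalg.eigh(X @ X.T / n)
    return vectors[:, -1]          # eigh sorts eigenvalues in ascending order

rng = np.random.default_rng(6)
p, n = 200, 1000

# Assumption: Pareto(0.8) means P(X > x) = x^{-0.8} for x >= 1.
X_pareto = (1.0 - rng.random((p, n))) ** (-1.0 / 0.8)
X_normal = rng.standard_normal((p, n))

v_pareto = top_eigenvector(X_pareto)
v_normal = top_eigenvector(X_normal)

# Size of the largest component: near 1 indicates localization on one
# coordinate, while a delocalized unit vector has all components small.
largest_pareto = np.max(np.abs(v_pareto))
largest_normal = np.max(np.abs(v_normal))
```

Inspecting `largest_pareto` and `largest_normal` over several seeds gives a feel for how strongly heavy tails concentrate the eigenvector mass.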

SLIDE 34

Localization vs. Delocalization

Figure: Components of eigenvector $v_1$ for $p = 200$, $n = 1000$ (size of components plotted against indices of components). First panel: Pareto data, $X \sim \text{Pareto}(0.8)$. Second panel: normal data, $X \sim N(0, 1)$.

SLIDE 35

Point Process of Normalized Eigenvalues

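Point process convergence
$$N_n = \sum_{i=1}^p \delta_{a_{np}^{-2} \lambda_i(XX')} \xrightarrow{d} \sum_{i=1}^\infty \delta_{\Gamma_i^{-2/\alpha}} = N.$$

The limit is a Poisson random measure (PRM) on $(0, \infty)$ with mean measure $\mu(x, \infty) = x^{-\alpha/2}$, $x > 0$, and
$$\Gamma_i = E_1 + \cdots + E_i, \quad (E_i) \text{ iid standard exponential}.$$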

SLIDE 37

Point Process of Normalized Eigenvalues

Limiting distribution: For $k \ge 1$,
$$\lim_{n \to \infty} P(a_{np}^{-2} \lambda_k \le x) = \lim_{n \to \infty} P(N_n(x, \infty) < k) = P(N(x, \infty) < k) = \sum_{s=0}^{k-1} \frac{(x^{-\alpha/2})^s}{s!}\, e^{-x^{-\alpha/2}}, \quad x > 0.$$

Largest eigenvalue
$$\frac{n}{a_{np}^2}\, \lambda_1(S) \xrightarrow{d} \Gamma_1^{-2/\alpha},$$
where the limit has a Fréchet distribution with parameter $\alpha/2$.

Soshnikov (2006), Auffinger et al. (2009), Auffinger and Tang (2016), Davis et al. (2014, 2016), JH and Mikosch (2016)
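The limiting distribution function is a Poisson probability and can be evaluated directly. The sketch below uses only the formula on this slide; the function names `limit_cdf` and `frechet_cdf` are hypothetical.

```python
import math

def limit_cdf(x, k, alpha):
    """P(N(x, inf) < k), where N(x, inf) is Poisson with mean x^{-alpha/2}.

    This is the limiting distribution function of the k-th largest
    normalized eigenvalue.
    """
    if x <= 0:
        return 0.0
    mean = x ** (-alpha / 2.0)
    return math.exp(-mean) * sum(mean**s / math.factorial(s) for s in range(k))

def frechet_cdf(x, alpha):
    """Frechet distribution function with parameter alpha/2: exp(-x^{-alpha/2})."""
    return math.exp(-x ** (-alpha / 2.0)) if x > 0 else 0.0

alpha = 1.6
largest_at_one = limit_cdf(1.0, 1, alpha)     # k = 1: the largest eigenvalue
second_at_one = limit_cdf(1.0, 2, alpha)      # k = 2: the second largest
```

For $k = 1$ the sum has a single term and the formula reduces to the Fréchet distribution function, in line with the statement about the largest eigenvalue.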

SLIDE 38

α = 3.99

α = 3.99, n = 2000, p = 1000

SLIDE 39

α = 3

α = 3, n = 2000, p = 1000

SLIDE 40

α = 2.1

α = 2.1, n = 10000, p = 1000

SLIDE 42

Heavy Tails and Dependence

$(Z_{it})$: iid field of regularly varying random variables.

Stochastic volatility model:
$$X = \bigl( Z_{it}\, \sigma_{it}^{(n)} \bigr)_{p \times n}$$

Generate deterministic covariance structure $A$:
$$X = A^{1/2} Z$$

Davis et al. (2014)

SLIDE 45

Heavy Tails and Dependence

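$(Z_{it})$: iid field of regularly varying random variables.

Dependence among rows and columns:
$$X_{it} = \sum_{l=0}^{\infty} \sum_{k=0}^{\infty} h_{kl}\, Z_{i-k,\, t-l}$$
with some constants $h_{kl}$. Davis et al. (2016)

Relation to iid case:
$$XX' = \sum_{l_1, l_2 = 0}^{\infty} \sum_{k_1, k_2 = 0}^{\infty} h_{k_1 l_1} h_{k_2 l_2}\, Z(k_1, l_1)\, Z'(k_2, l_2),$$
where $Z(k, l) = (Z_{i-k,\, t-l})_{i=1,\ldots,p;\, t=1,\ldots,n}$, $l, k \in \mathbb{Z}$.

Location of squares:
$$M_{ij} = \sum_{l \in \mathbb{Z}} h_{il} h_{jl}, \quad i, j \in \mathbb{Z}.$$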

SLIDE 46

Autocovariance Matrices

For $s \ge 0$,
$$X_n(s) = (X_{i,\, t+s})_{i=1,\ldots,p;\; t=1,\ldots,n}, \quad n \ge 1.$$
Then $X_n = X_n(0)$.

Autocovariance matrix for lag $s$: $X_n(0)\, X_n(s)'$

Limit theory for singular values of such matrices.
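The lagged matrices and the singular values of $X_n(0) X_n(s)'$ can be computed as follows. This is a minimal sketch: the function names are hypothetical, the series is stored as a $p \times T$ array with $T \ge n + s$, and Gaussian data are used only to make the example runnable.

```python
import numpy as np

def lagged_block(series, s, n):
    """X_n(s): the p x n block with entries X_{i, t+s}, t = 1, ..., n.

    With 0-based columns this is columns s, ..., s + n - 1 of the
    p x T array `series`, which must satisfy T >= n + s.
    """
    if series.shape[1] < n + s:
        raise ValueError("series is too short for this lag")
    return series[:, s:s + n]

def autocovariance_singular_values(series, s, n):
    """Singular values (decreasing) of the lag-s matrix X_n(0) X_n(s)'."""
    X0 = lagged_block(series, 0, n)
    Xs = lagged_block(series, s, n)
    return np.linalg.svd(X0 @ Xs.T, compute_uv=False)

rng = np.random.default_rng(7)
series = rng.standard_normal((3, 60))       # illustrative p = 3, T = 60
n = 50
sv_lag0 = autocovariance_singular_values(series, 0, n)
sv_lag2 = autocovariance_singular_values(series, 2, n)

# At lag 0 the matrix is X X', so its singular values are its eigenvalues.
X0 = lagged_block(series, 0, n)
eig_lag0 = np.linalg.eigvalsh(X0 @ X0.T)[::-1]
```

For $s > 0$ the matrix is no longer symmetric, which is why the limit theory is phrased in terms of singular values rather than eigenvalues.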

SLIDE 47

Localization

Figure: Components of eigenvector $v_1$ for Pareto data, $X \sim \text{Pareto}(0.8)$, with $p = 200$, $n = 1000$ (size of components plotted against indices of components).

SLIDE 49

Autocovariance eigenvectors

Figure: Eigenvectors of K(0,0). Five panels show the 1st to 5th eigenvectors of P(0,0); axes: number of coordinate and coordinates of eigenvector.

SLIDE 50

Thank you!
