Arthur Charpentier, Causality & (non-Gaussian) Time Series, P7

Causality with Non-Gaussian Time Series

Arthur Charpentier (Université de Rennes 1 & UQÀM)

Université Paris 7 Diderot, May 2016. http://freakonometrics.hypotheses.org
Motivation (Earthquakes)

[Figure: number of earthquakes (magnitude > 2) per 15 sec., from 15 days before to 15 days after a major earthquake (magnitude > 6.5), normalized so that the average before is 100; one curve for earthquakes on the same tectonic plate as the major one, one for earthquakes on a different tectonic plate.]

see Boudreault & C. (2011) on contagion among tectonic plates
@freakonometrics
Motivation (Onsite vs. Online)

- onsite: protestors, camped-out, arrests and injuries
- vs. online: #indignados, #occupy and #vinegar on Twitter & Facebook

see Bastos, Mercea & C. (2015)
Multivariate Stationary Time Series

Definition A time series $(X_t = (X_{1,t}, \cdots, X_{d,t}))_{t\in\mathbb{Z}}$ with values in $\mathbb{R}^d$ is called a VAR(1) process if
$$\begin{cases}
X_{1,t} = \phi_{1,1}X_{1,t-1} + \phi_{1,2}X_{2,t-1} + \cdots + \phi_{1,d}X_{d,t-1} + \varepsilon_{1,t}\\
X_{2,t} = \phi_{2,1}X_{1,t-1} + \phi_{2,2}X_{2,t-1} + \cdots + \phi_{2,d}X_{d,t-1} + \varepsilon_{2,t}\\
\qquad\vdots\\
X_{d,t} = \phi_{d,1}X_{1,t-1} + \phi_{d,2}X_{2,t-1} + \cdots + \phi_{d,d}X_{d,t-1} + \varepsilon_{d,t}
\end{cases}\qquad(1)$$
or equivalently
$$\underbrace{\begin{pmatrix}X_{1,t}\\ X_{2,t}\\ \vdots\\ X_{d,t}\end{pmatrix}}_{X_t} = \underbrace{\begin{pmatrix}\phi_{1,1}&\phi_{1,2}&\cdots&\phi_{1,d}\\ \phi_{2,1}&\phi_{2,2}&\cdots&\phi_{2,d}\\ \vdots&&\ddots&\vdots\\ \phi_{d,1}&\phi_{d,2}&\cdots&\phi_{d,d}\end{pmatrix}}_{\Phi} \underbrace{\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\\ \vdots\\ X_{d,t-1}\end{pmatrix}}_{X_{t-1}} + \underbrace{\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\\ \vdots\\ \varepsilon_{d,t}\end{pmatrix}}_{\varepsilon_t}$$
Multivariate Stationary Time Series

for some real-valued $d\times d$ matrix $\Phi$, and some i.i.d. random vectors $\varepsilon_t$ with values in $\mathbb{R}^d$. Assume that $\varepsilon_t$ is a Gaussian white noise $\mathcal{N}(\mathbf{0},\Sigma)$, with density
$$f(\varepsilon)=\frac{1}{\sqrt{(2\pi)^d\,|\det\Sigma|}}\exp\left(-\frac{\varepsilon^{\mathsf{T}}\Sigma^{-1}\varepsilon}{2}\right),\qquad\forall\varepsilon\in\mathbb{R}^d.$$
Assume also that $\varepsilon_t$ is independent of $\mathcal{X}_{t-1}=\sigma(\{X_{t-1},X_{t-2},\cdots\})$: $(\varepsilon_t)_{t\in\mathbb{Z}}$ is the innovation process.

Definition A time series $(X_t)_{t\in\mathbb{N}}$ is said to be (weakly) stationary if
- $\mathbb{E}(X_t)$ is independent of $t$ ($=:\mu$)
- $\mathrm{cov}(X_t,X_{t-h})$ is independent of $t$ ($=:\gamma(h)$), called the autocovariance matrix
Multivariate Stationary Time Series

Define the autocorrelation matrix $\rho(h):=\Delta^{-1}\gamma(h)\Delta^{-1}$, where $\Delta:=\mathrm{diag}(\gamma(0))^{1/2}$ is the diagonal matrix of standard deviations.

Let $(X_t)_{t\in\mathbb{N}}$ be a VAR(1) time series, $X_t=\Phi X_{t-1}+\varepsilon_t$.

Proposition $(X_t)_{t\in\mathbb{N}}$ is a stationary VAR(1) time series if and only if the $d$ eigenvalues of $\Phi$ all have modulus strictly less than 1.

Proposition If $(X_t)_{t\in\mathbb{N}}$ is a stationary VAR(1) time series, $\gamma(h)=\Phi^h\gamma(0)$, $h\in\mathbb{N}$.
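The eigenvalue condition can be checked numerically; a minimal Python sketch (the matrix $\Phi$ and the standard Gaussian noise below are illustrative, not from the slides):

```python
import math
import random

# Stationarity of a VAR(1): X_t = Phi X_{t-1} + eps_t is stationary iff all
# eigenvalues of Phi have modulus < 1. For a 2x2 Phi the eigenvalues solve
# z^2 - tr(Phi) z + det(Phi) = 0.
Phi = [[0.5, 0.2],
       [0.1, 0.3]]
tr = Phi[0][0] + Phi[1][1]
det = Phi[0][0] * Phi[1][1] - Phi[0][1] * Phi[1][0]
disc = tr * tr - 4 * det
if disc >= 0:
    moduli = [abs((tr + math.sqrt(disc)) / 2), abs((tr - math.sqrt(disc)) / 2)]
else:                       # complex conjugate pair, common modulus sqrt(det)
    moduli = [math.sqrt(det)] * 2
spectral_radius = max(moduli)

# Simulate the process; the sample mean should settle near the stationary mean 0.
random.seed(1)
x0 = x1 = 0.0
total0 = 0.0
n = 20_000
for _ in range(n):
    e0, e1 = random.gauss(0, 1), random.gauss(0, 1)
    x0, x1 = (Phi[0][0] * x0 + Phi[0][1] * x1 + e0,
              Phi[1][0] * x0 + Phi[1][1] * x1 + e1)
    total0 += x0
mean1 = total0 / n
```

Here the spectral radius is about 0.57, so the simulated path is stationary.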
Causality, in dimension 2

Two stationary time series $(X_t,Y_t)_{t\in\mathbb{Z}}$. Heuristics on independence:
$$f(x_t,y_t|\boldsymbol{X}_{t-1},\boldsymbol{Y}_{t-1}) = f(x_t|\boldsymbol{X}_{t-1})\cdot f(y_t|\boldsymbol{Y}_{t-1})$$
Write (with $\boldsymbol{X}$ for $\boldsymbol{X}_{t-1}$)
$$\underbrace{\frac{f(x_t,y_t|\boldsymbol{X},\boldsymbol{Y})}{f(x_t|\boldsymbol{X})\cdot f(y_t|\boldsymbol{Y})}}_{(X,Y)} = \underbrace{\frac{f(x_t|\boldsymbol{X},\boldsymbol{Y})}{f(x_t|\boldsymbol{X})}}_{X\to Y}\cdot \underbrace{\frac{f(y_t|\boldsymbol{X},\boldsymbol{Y})}{f(y_t|\boldsymbol{Y})}}_{X\leftarrow Y}\cdot \underbrace{\frac{f(x_t,y_t|\boldsymbol{X},\boldsymbol{Y})}{f(x_t|\boldsymbol{X},\boldsymbol{Y})\cdot f(y_t|\boldsymbol{X},\boldsymbol{Y})}}_{X\Leftrightarrow Y}$$
Gouriéroux, Monfort & Renault (1987) define the following Kullback causality measure
$$C(X,Y)=\mathbb{E}\left[\log\frac{f(X_t,Y_t|\boldsymbol{X},\boldsymbol{Y})}{f(X_t|\boldsymbol{X})\cdot f(Y_t|\boldsymbol{Y})}\right]$$
Causality, in dimension 2

$$C(X\to Y)=\mathbb{E}\left[\log\frac{f(X_t|\boldsymbol{X},\boldsymbol{Y})}{f(X_t|\boldsymbol{X})}\right]\qquad C(Y\to X)=\mathbb{E}\left[\log\frac{f(Y_t|\boldsymbol{X},\boldsymbol{Y})}{f(Y_t|\boldsymbol{Y})}\right]$$
$$C(X\Leftrightarrow Y)=\mathbb{E}\left[\log\frac{f(X_t,Y_t|\boldsymbol{X},\boldsymbol{Y})}{f(X_t|\boldsymbol{X},\boldsymbol{Y})\cdot f(Y_t|\boldsymbol{X},\boldsymbol{Y})}\right]$$
so that $C(X,Y)=C(X\to Y)+C(X\leftarrow Y)+C(X\Leftrightarrow Y)$.

From Granger (1969),
- $(X)$ causes $(Y)$ at time $t$ if $\mathcal{L}(y_t|\boldsymbol{X}_{t-1},\boldsymbol{Y}_{t-1})\neq\mathcal{L}(y_t|\boldsymbol{Y}_{t-1})$
- $(X)$ causes $(Y)$ instantaneously at time $t$ if $\mathcal{L}(y_t|\boldsymbol{X}_t,\boldsymbol{Y}_{t-1})\neq\mathcal{L}(y_t|\boldsymbol{X}_{t-1},\boldsymbol{Y}_{t-1})$
Causality, in dimension 2, for VAR(1) time series

$$\underbrace{\begin{pmatrix}X_t\\ Y_t\end{pmatrix}}_{X_t} = \underbrace{\begin{pmatrix}\phi_{1,1}&\phi_{1,2}\\ \phi_{2,1}&\phi_{2,2}\end{pmatrix}}_{\Phi}\underbrace{\begin{pmatrix}X_{t-1}\\ Y_{t-1}\end{pmatrix}}_{X_{t-1}} + \underbrace{\begin{pmatrix}u_t\\ v_t\end{pmatrix}}_{\varepsilon_t},\quad\text{with } \mathrm{Var}\begin{pmatrix}u_t\\ v_t\end{pmatrix}=\begin{pmatrix}\sigma_u^2&\sigma_{uv}\\ \sigma_{uv}&\sigma_v^2\end{pmatrix}$$
From Granger (1969) (see also Toda & Phillips (1994)),
- $(X)$ causes $(Y)$ at time $t$, $X\to Y$, if $\phi_{2,1}\neq 0$
- $(Y)$ causes $(X)$ at time $t$, $Y\to X$, if $\phi_{1,2}\neq 0$
- $(X)$ causes $(Y)$ instantaneously at time $t$, $X\Leftrightarrow Y$, if $\sigma_{u,v}\neq 0$
Testing Causality, in dimension d

For lagged causality, we test $H_0:\Phi\in\mathcal{P}$ against $H_1:\Phi\notin\mathcal{P}$, where $\mathcal{P}$ is a set of matrices with a constrained shape, e.g. $\mathcal{P}$ is the set of $d\times d$ diagonal matrices for lagged independence, or a set of block triangular matrices for lagged causality.

Proposition Let $\widehat{\Phi}$ denote the conditional maximum likelihood estimate of $\Phi$ in the non-constrained VAR(1) model, and $\widehat{\Phi}_c$ denote the conditional maximum likelihood estimate of $\Phi$ in the constrained model; then, under suitable conditions,
$$2[\log\mathcal{L}(\boldsymbol{X},\widehat{\Phi}|X_0)-\log\mathcal{L}(\boldsymbol{X},\widehat{\Phi}_c|X_0)]\overset{\mathcal{L}}{\to}\chi^2(d^2-\dim(\mathcal{P})),\text{ as }T\to\infty,\text{ under }H_0.$$

Example Testing $(X_{1,t})\leftarrow(X_{2,t})$ is testing whether $\phi_{1,2}=0$, or not.
Modeling Counts Processes

Steutel & van Harn (1979) defined a thinning operator as follows.

Definition Define operator $\circ$ as
$$p\circ N=\sum_{i=1}^N Y_i = Y_1+\cdots+Y_N \text{ if } N\neq 0,\text{ and }0\text{ otherwise},$$
where $N$ is a random variable with values in $\mathbb{N}$, $p\in[0,1]$, and $Y_1,Y_2,\cdots$ are i.i.d. Bernoulli variables, independent of $N$, with $\mathbb{P}(Y_i=1)=p=1-\mathbb{P}(Y_i=0)$.

Thus $p\circ N$ is a compound sum of i.i.d. Bernoulli variables: given $N$, $p\circ N$ has a binomial distribution $\mathcal{B}(N,p)$.

Note that $p\circ(q\circ N)\overset{\mathcal{L}}{=}[pq]\circ N$ for all $p,q\in[0,1]$. Further, $\mathbb{E}(p\circ N)=p\,\mathbb{E}(N)$ and $\mathrm{Var}(p\circ N)=p^2\mathrm{Var}(N)+p(1-p)\mathbb{E}(N)$.
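The moment formulas can be verified by simulation; a small sketch, assuming $N\sim\mathcal{P}(\lambda)$ so that $\mathbb{E}(N)=\mathrm{Var}(N)=\lambda$ and both moments of $p\circ N$ reduce to $p\lambda$ (thinning preserves the Poisson family):

```python
import math
import random

# Check E(p∘N) = p E(N) and Var(p∘N) = p^2 Var(N) + p(1-p) E(N)
# by Monte Carlo, for N ~ Poisson(lam); both reduce to p*lam here.
random.seed(42)
p, lam, n_sims = 0.4, 5.0, 200_000

def poisson(l):
    # Knuth's multiplicative method, fine for small l
    L, k, prod = math.exp(-l), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= L:
            return k
        k += 1

def thin(p, n):
    # p ∘ n: sum of n i.i.d. Bernoulli(p) variables
    return sum(1 for _ in range(n) if random.random() < p)

samples = [thin(p, poisson(lam)) for _ in range(n_sims)]
mean = sum(samples) / n_sims
var = sum((s - mean) ** 2 for s in samples) / n_sims
```

Both `mean` and `var` should be close to `p * lam = 2.0`.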
(Poisson) Integer AutoRegressive processes INAR(1)

Based on that thinning operator, Al-Osh & Alzaid (1987) and McKenzie (1985) defined the integer autoregressive process of order 1:

Definition A time series $(X_t)_{t\in\mathbb{N}}$ with values in $\mathbb{N}$ is called an INAR(1) process if
$$X_t=p\circ X_{t-1}+\varepsilon_t,\qquad(2)$$
where $(\varepsilon_t)$ is a sequence of i.i.d. integer-valued random variables, i.e.
$$X_t=\sum_{i=1}^{X_{t-1}}Y_i+\varepsilon_t,$$
where the $Y_i$'s are i.i.d. $\mathcal{B}(p)$.

Such a process can be related to Galton-Watson processes.
INAR(1) & Galton-Watson

$$X_{t+1}=\sum_{i=1}^{X_t}Y_i+\varepsilon_{t+1},\text{ where the } Y_i\text{'s are i.i.d. }\mathcal{B}(p)$$
Proposition
$$\mathbb{E}(X_t)=\frac{\mathbb{E}(\varepsilon_t)}{1-p},\quad \mathrm{Var}(X_t)=\gamma(0)=\frac{p\,\mathbb{E}(\varepsilon_t)+\mathrm{Var}(\varepsilon_t)}{1-p^2}\quad\text{and}\quad\gamma(h)=\mathrm{cov}(X_t,X_{t-h})=p^h\gamma(0).$$

It is common to assume that the $\varepsilon_t$ are independent variables with a Poisson distribution $\mathcal{P}(\lambda)$, with probability function
$$\mathbb{P}(\varepsilon_t=k)=e^{-\lambda}\frac{\lambda^k}{k!},\qquad k\in\mathbb{N}.$$

Proposition If the $(\varepsilon_t)$ are Poisson random variables, then $(X_t)$ will also be a sequence of Poisson random variables.

Note that we also assume that $\varepsilon_t$ is independent of $\mathcal{X}_{t-1}$, i.e. past observations $X_0,X_1,\cdots,X_{t-1}$. Thus, $(\varepsilon_t)_{t\in\mathbb{N}}$ is called the innovation process.

Proposition $(X_t)_{t\in\mathbb{N}}$ is a stationary INAR(1) time series if and only if $p\in[0,1)$.

Proposition If $(X_t)_{t\in\mathbb{N}}$ is a stationary INAR(1) time series, $(X_t)_{t\in\mathbb{N}}$ is a homogeneous Markov chain.
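A minimal simulation of a Poisson INAR(1) path, checking the stationary mean $\lambda/(1-p)$ and the lag-one autocorrelation $p$ (parameter values are illustrative):

```python
import math
import random

# Poisson INAR(1): X_t = p ∘ X_{t-1} + eps_t, eps_t ~ Poisson(lam).
# Stationary mean is lam / (1 - p); lag-1 autocorrelation is p.
random.seed(0)
p, lam, T = 0.6, 2.0, 100_000

def poisson(l):
    L, k, prod = math.exp(-l), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= L:
            return k
        k += 1

x = poisson(lam / (1 - p))        # start at the stationary distribution
xs = []
for _ in range(T):
    survivors = sum(1 for _ in range(x) if random.random() < p)  # p ∘ X_{t-1}
    x = survivors + poisson(lam)
    xs.append(x)

mean = sum(xs) / T                 # should be close to lam / (1 - p) = 5
num = sum((xs[t] - mean) * (xs[t - 1] - mean) for t in range(1, T))
den = sum((v - mean) ** 2 for v in xs)
rho1 = num / den                   # should be close to p = 0.6
```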
Markov Property of INAR(1) Time Series

$$\pi(x_t,x_{t-1})=\mathbb{P}(X_t=x_t|X_{t-1}=x_{t-1})=\sum_{k=0}^{x_t}\underbrace{\mathbb{P}\left(\sum_{i=1}^{x_{t-1}}Y_i=x_t-k\right)}_{\text{Binomial}}\cdot\underbrace{\mathbb{P}(\varepsilon=k)}_{\text{Poisson}}.$$
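The transition probability is a binomial-Poisson convolution, which can be computed exactly and checked against a Monte Carlo frequency; a sketch with illustrative $p$ and $\lambda$:

```python
import math
import random

# pi(x_t, x_{t-1}) = sum_{k=0}^{x_t} P(Binomial(x_{t-1}, p) = x_t - k) * P(Poisson(lam) = k)
p, lam = 0.6, 2.0

def transition(x_prev, x_next):
    total = 0.0
    for k in range(x_next + 1):
        j = x_next - k                      # survivors of the thinning
        if j <= x_prev:
            total += (math.comb(x_prev, j) * p**j * (1 - p)**(x_prev - j)
                      * math.exp(-lam) * lam**k / math.factorial(k))
    return total

def poisson(l):
    L, k, prod = math.exp(-l), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= L:
            return k
        k += 1

random.seed(7)
n_sims = 200_000
analytic = transition(4, 3)                 # P(X_t = 3 | X_{t-1} = 4)
hits = sum(1 for _ in range(n_sims)
           if sum(1 for _ in range(4) if random.random() < p) + poisson(lam) == 3)
freq = hits / n_sims
```

Both values should agree to Monte Carlo accuracy (the exact value is about 0.1865).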
Inference of INAR(1) Processes

Consider a Poisson INAR(1) process; the likelihood is then
$$\mathcal{L}(p,\lambda;X_0,\boldsymbol{X})=\left[\prod_{t=1}^n f_t(X_t)\right]\cdot\frac{\lambda^{X_0}}{(1-p)^{X_0}X_0!}\exp\left(-\frac{\lambda}{1-p}\right)$$
where
$$f_t(y)=e^{-\lambda}\sum_{i=0}^{\min\{X_t,X_{t-1}\}}\frac{\lambda^{y-i}}{(y-i)!}\binom{X_{t-1}}{i}p^i(1-p)^{X_{t-1}-i},\qquad t=1,\cdots,n.$$
Maximum likelihood estimators are $(\widehat{p},\widehat{\lambda})\in\operatorname{argmax}\{\log\mathcal{L}(p,\lambda;(X_0,\boldsymbol{X}))\}$.
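Conditional maximum likelihood can be sketched with a crude grid search over $(p,\lambda)$, summing the log transition probabilities (grids and simulated data below are illustrative; a real implementation would use a numerical optimizer):

```python
import math
import random
from collections import Counter

random.seed(3)
true_p, true_lam, T = 0.5, 1.5, 1000

def poisson(l):
    L, k, prod = math.exp(-l), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= L:
            return k
        k += 1

def transition(x_prev, x_next, p, lam):
    # Binomial-Poisson convolution pi(x_next | x_prev)
    total = 0.0
    for k in range(x_next + 1):
        j = x_next - k
        if j <= x_prev:
            total += (math.comb(x_prev, j) * p**j * (1 - p)**(x_prev - j)
                      * math.exp(-lam) * lam**k / math.factorial(k))
    return total

# Simulate a path and record transition-pair counts
x, pairs = 3, Counter()
for _ in range(T):
    x_new = sum(1 for _ in range(x) if random.random() < true_p) + poisson(true_lam)
    pairs[(x, x_new)] += 1
    x = x_new

def cond_loglik(p, lam):
    return sum(n * math.log(transition(a, b, p, lam))
               for (a, b), n in pairs.items())

p_grid = [i / 20 for i in range(1, 20)]      # 0.05 ... 0.95
lam_grid = [j / 10 for j in range(5, 31)]    # 0.5 ... 3.0
p_hat, lam_hat = max(((p, l) for p in p_grid for l in lam_grid),
                     key=lambda pl: cond_loglik(*pl))
```

The grid maximizer should land near the true values (0.5, 1.5) up to the grid resolution and sampling noise.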
Multivariate Integer Autoregressive processes MINAR(1)

Let $\boldsymbol{X}_t:=(X_{1,t},\cdots,X_{d,t})$ denote a multivariate vector of counts.

Definition Let $P:=[p_{i,j}]$ be a $d\times d$ matrix with entries in $[0,1]$. If $\boldsymbol{X}=(X_1,\cdots,X_d)$ is a random vector with values in $\mathbb{N}^d$, then $P\circ\boldsymbol{X}$ is a $d$-dimensional random vector, with $i$-th component
$$[P\circ\boldsymbol{X}]_i=\sum_{j=1}^d p_{i,j}\circ X_j,\qquad i=1,\cdots,d,$$
where all counting variates $Y$ in the $p_{i,j}\circ X_j$'s are assumed to be independent.

Note that $P\circ(Q\circ\boldsymbol{X})\overset{\mathcal{L}}{=}[PQ]\circ\boldsymbol{X}$. Further, $\mathbb{E}(P\circ\boldsymbol{X})=P\,\mathbb{E}(\boldsymbol{X})$, and
$$\mathbb{E}\left[(P\circ\boldsymbol{X})(P\circ\boldsymbol{X})^{\mathsf{T}}\right]=P\,\mathbb{E}(\boldsymbol{X}\boldsymbol{X}^{\mathsf{T}})P^{\mathsf{T}}+\Delta,$$
with $\Delta:=\mathrm{diag}(V\,\mathbb{E}(\boldsymbol{X}))$ where $V$ is the $d\times d$ matrix with entries $p_{i,j}(1-p_{i,j})$.
Multivariate Integer Autoregressive processes MINAR(1)

Definition A time series $(\boldsymbol{X}_t)$ with values in $\mathbb{N}^d$ is called a $d$-variate MINAR(1) process if
$$\boldsymbol{X}_t=P\circ\boldsymbol{X}_{t-1}+\boldsymbol{\varepsilon}_t\qquad(3)$$
for all $t$, for some $d\times d$ matrix $P$ with entries in $[0,1]$, and some i.i.d. random vectors $\boldsymbol{\varepsilon}_t$ with values in $\mathbb{N}^d$.

$(\boldsymbol{X}_t)$ is a Markov chain with states in $\mathbb{N}^d$, with transition probabilities
$$\pi(\boldsymbol{x}_t,\boldsymbol{x}_{t-1})=\mathbb{P}(\boldsymbol{X}_t=\boldsymbol{x}_t|\boldsymbol{X}_{t-1}=\boldsymbol{x}_{t-1})\qquad(4)$$
satisfying
$$\pi(\boldsymbol{x}_t,\boldsymbol{x}_{t-1})=\sum_{\boldsymbol{k}=\boldsymbol{0}}^{\boldsymbol{x}_t}\mathbb{P}(P\circ\boldsymbol{x}_{t-1}=\boldsymbol{x}_t-\boldsymbol{k})\cdot\mathbb{P}(\boldsymbol{\varepsilon}=\boldsymbol{k}).$$
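Matrix thinning and the MINAR(1) recursion are straightforward to simulate; the sketch below also checks the stationary mean $\boldsymbol{\mu}=(\mathbb{I}-P)^{-1}\boldsymbol{\lambda}$ ($P$ and $\boldsymbol{\lambda}$ are illustrative, with independent Poisson innovations):

```python
import math
import random

random.seed(5)
P = [[0.3, 0.2],
     [0.1, 0.4]]
lam = [1.0, 2.0]
T = 50_000

def poisson(l):
    L, k, prod = math.exp(-l), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= L:
            return k
        k += 1

def thin(p, n):
    # p ∘ n, a Binomial(n, p) draw
    return sum(1 for _ in range(n) if random.random() < p)

# [P ∘ X]_i = sum_j p_{i,j} ∘ X_j, with independent thinnings
x = [2, 3]
sums = [0.0, 0.0]
for _ in range(T):
    x = [thin(P[0][0], x[0]) + thin(P[0][1], x[1]) + poisson(lam[0]),
         thin(P[1][0], x[0]) + thin(P[1][1], x[1]) + poisson(lam[1])]
    sums[0] += x[0]
    sums[1] += x[1]
means = [s / T for s in sums]

# theoretical mu = (I - P)^{-1} lam, solved by hand in the 2x2 case
a, b, c, d = 1 - P[0][0], -P[0][1], -P[1][0], 1 - P[1][1]
det = a * d - b * c
mu = [( d * lam[0] - b * lam[1]) / det,
      (-c * lam[0] + a * lam[1]) / det]       # here mu = [2.5, 3.75]
```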
Inference for MINAR(1)

Proposition Let $(\boldsymbol{X}_t)$ be a $d$-variate MINAR(1) process satisfying stationarity conditions, as well as technical assumptions (called C1-C6 in Franke & Subba Rao (1993)); then the conditional maximum likelihood estimate $\widehat{\theta}$ of $\theta=(P,\Lambda)$ is asymptotically normal,
$$\sqrt{n}(\widehat{\theta}-\theta)\overset{\mathcal{L}}{\to}\mathcal{N}(\boldsymbol{0},\Sigma^{-1}(\theta)),\text{ as }n\to\infty.$$
Further,
$$2[\log\mathcal{L}(\boldsymbol{N},\widehat{\theta}|\boldsymbol{N}_0)-\log\mathcal{L}(\boldsymbol{N},\theta|\boldsymbol{N}_0)]\overset{\mathcal{L}}{\to}\chi^2(d^2+\dim(\lambda)),\text{ as }n\to\infty.$$
Granger causality with BINAR(1)

Consider the bivariate INAR(1) process
$$\underbrace{\begin{pmatrix}X_{1,t}\\ X_{2,t}\end{pmatrix}}_{\boldsymbol{X}_t}=\underbrace{\begin{pmatrix}p_{1,1}&p_{1,2}\\ p_{2,1}&p_{2,2}\end{pmatrix}}_{P}\circ\underbrace{\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\end{pmatrix}}_{\boldsymbol{X}_{t-1}}+\underbrace{\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix}}_{\boldsymbol{\varepsilon}_t},\quad\text{with }\mathrm{Var}\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix}=\begin{pmatrix}\lambda_1&\varphi\\ \varphi&\lambda_2\end{pmatrix}$$
$(X_{1,t})$ and $(X_{2,t})$ are instantaneously related if $\boldsymbol{\varepsilon}$ is a correlated noise.
Granger causality with BINAR(1)

1. $(X_1)$ and $(X_2)$ are instantaneously related if $\boldsymbol{\varepsilon}$ is a correlated noise ($\varphi\neq 0$),
$$\begin{pmatrix}X_{1,t}\\ X_{2,t}\end{pmatrix}=\begin{pmatrix}p_{1,1}&p_{1,2}\\ p_{2,1}&p_{2,2}\end{pmatrix}\circ\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\end{pmatrix}+\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix},\quad\text{with }\mathrm{Var}\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix}=\begin{pmatrix}\lambda_1&\star\\ \star&\lambda_2\end{pmatrix}$$
Granger causality with BINAR(1)

2. $(X_1)$ and $(X_2)$ are independent, $(X_1)\perp(X_2)$, if $P$ is diagonal, i.e. $p_{1,2}=p_{2,1}=0$, and $\varepsilon_1$ and $\varepsilon_2$ are independent,
$$\begin{pmatrix}X_{1,t}\\ X_{2,t}\end{pmatrix}=\begin{pmatrix}p_{1,1}&0\\ 0&p_{2,2}\end{pmatrix}\circ\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\end{pmatrix}+\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix},\quad\text{with }\mathrm{Var}\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix}=\begin{pmatrix}\lambda_1&0\\ 0&\lambda_2\end{pmatrix}$$
Granger causality with BINAR(1)

3. $(X_1)$ causes $(X_2)$ but $(X_2)$ does not cause $(X_1)$, $(X_1)\to(X_2)$, if $P$ is a lower triangular matrix, i.e. $p_{1,2}=0$ while $p_{2,1}\neq 0$,
$$\begin{pmatrix}X_{1,t}\\ X_{2,t}\end{pmatrix}=\begin{pmatrix}p_{1,1}&0\\ \star&p_{2,2}\end{pmatrix}\circ\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\end{pmatrix}+\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix},\quad\text{with }\mathrm{Var}\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix}=\begin{pmatrix}\lambda_1&\varphi\\ \varphi&\lambda_2\end{pmatrix}$$
Granger causality with BINAR(1)

4. $(X_2)$ causes $(X_1)$ but $(X_1)$ does not cause $(X_2)$, $(X_1)\leftarrow(X_2)$, if $P$ is an upper triangular matrix, i.e. $p_{2,1}=0$ while $p_{1,2}\neq 0$,
$$\begin{pmatrix}X_{1,t}\\ X_{2,t}\end{pmatrix}=\begin{pmatrix}p_{1,1}&\star\\ 0&p_{2,2}\end{pmatrix}\circ\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\end{pmatrix}+\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix},\quad\text{with }\mathrm{Var}\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix}=\begin{pmatrix}\lambda_1&\varphi\\ \varphi&\lambda_2\end{pmatrix}$$
Granger causality with BINAR(1)

5. $(X_1)$ causes $(X_2)$ and conversely, i.e. a feedback effect $(X_1)\leftrightarrow(X_2)$, if $P$ is a full matrix, i.e. $p_{1,2},p_{2,1}\neq 0$,
$$\begin{pmatrix}X_{1,t}\\ X_{2,t}\end{pmatrix}=\begin{pmatrix}p_{1,1}&\star\\ \star&p_{2,2}\end{pmatrix}\circ\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\end{pmatrix}+\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix},\quad\text{with }\mathrm{Var}\begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix}=\begin{pmatrix}\lambda_1&\varphi\\ \varphi&\lambda_2\end{pmatrix}$$
Bivariate Poisson BINAR(1)

A classical distribution for $\boldsymbol{\varepsilon}_t$ is the bivariate Poisson distribution, with one common shock, i.e.
$$\begin{cases}\varepsilon_{1,t}=M_{1,t}+M_{0,t}\\ \varepsilon_{2,t}=M_{2,t}+M_{0,t}\end{cases}$$
where $M_{1,t}$, $M_{2,t}$ and $M_{0,t}$ are independent Poisson variates, with parameters $\lambda_1-\varphi$, $\lambda_2-\varphi$ and $\varphi$, respectively. In that case, $\boldsymbol{\varepsilon}_t:=(\varepsilon_{1,t},\varepsilon_{2,t})$ has joint probability function
$$e^{-[\lambda_1+\lambda_2-\varphi]}\,\frac{(\lambda_1-\varphi)^{k_1}}{k_1!}\,\frac{(\lambda_2-\varphi)^{k_2}}{k_2!}\sum_{i=0}^{\min\{k_1,k_2\}}\binom{k_1}{i}\binom{k_2}{i}i!\left[\frac{\varphi}{[\lambda_1-\varphi][\lambda_2-\varphi]}\right]^i$$
with $\lambda_1,\lambda_2>0$ and $\varphi\in[0,\min\{\lambda_1,\lambda_2\}]$. Set
$$\boldsymbol{\lambda}=\begin{pmatrix}\lambda_1\\ \lambda_2\end{pmatrix}\quad\text{and}\quad\Lambda=\begin{pmatrix}\lambda_1&\varphi\\ \varphi&\lambda_2\end{pmatrix}$$
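The common-shock construction gives $\mathrm{cov}(\varepsilon_1,\varepsilon_2)=\varphi$; a quick Monte Carlo check (parameter values are illustrative):

```python
import math
import random

# eps1 = M1 + M0, eps2 = M2 + M0 with independent Poissons of means
# lam1 - phi, lam2 - phi and phi, so cov(eps1, eps2) = Var(M0) = phi.
random.seed(9)
lam1, lam2, phi, n = 2.0, 3.0, 0.8, 200_000

def poisson(l):
    L, k, prod = math.exp(-l), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= L:
            return k
        k += 1

pairs = []
for _ in range(n):
    m0 = poisson(phi)                       # common shock
    pairs.append((poisson(lam1 - phi) + m0, poisson(lam2 - phi) + m0))

m1 = sum(a for a, _ in pairs) / n           # should be close to lam1
m2 = sum(b for _, b in pairs) / n           # should be close to lam2
cov = sum((a - m1) * (b - m2) for a, b in pairs) / n   # close to phi
```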
Bivariate Poisson BINAR(1) and Granger causality

For instantaneous causality, we test $H_0:\varphi=0$ against $H_1:\varphi\neq 0$.

Proposition Let $\widehat{\boldsymbol{\lambda}}$ denote the conditional maximum likelihood estimate of $\boldsymbol{\lambda}=(\lambda_1,\lambda_2,\varphi)$ in the non-constrained MINAR(1) model, and $\widehat{\boldsymbol{\lambda}}^{\perp}$ denote the conditional maximum likelihood estimate of $\boldsymbol{\lambda}^{\perp}=(\lambda_1,\lambda_2,0)$ in the constrained model (when the innovation has independent margins); then, under suitable conditions,
$$2[\log\mathcal{L}(\boldsymbol{X},\widehat{\boldsymbol{\lambda}}|\boldsymbol{X}_0)-\log\mathcal{L}(\boldsymbol{X},\widehat{\boldsymbol{\lambda}}^{\perp}|\boldsymbol{X}_0)]\overset{\mathcal{L}}{\to}\chi^2(1),\text{ as }n\to\infty,\text{ under }H_0.$$
Bivariate Poisson BINAR(1) and Granger causality

For lagged causality, we test $H_0:P\in\mathcal{P}$ against $H_1:P\notin\mathcal{P}$, where $\mathcal{P}$ is a set of matrices with a constrained shape, e.g. $\mathcal{P}$ is the set of $d\times d$ diagonal matrices for lagged independence, or a set of block triangular matrices for lagged causality.

Proposition Let $\widehat{P}$ denote the conditional maximum likelihood estimate of $P$ in the non-constrained MINAR(1) model, and $\widehat{P}_c$ denote the conditional maximum likelihood estimate of $P$ in the constrained model; then, under suitable conditions,
$$2[\log\mathcal{L}(\boldsymbol{X},\widehat{P}|\boldsymbol{X}_0)-\log\mathcal{L}(\boldsymbol{X},\widehat{P}_c|\boldsymbol{X}_0)]\overset{\mathcal{L}}{\to}\chi^2(d^2-\dim(\mathcal{P})),\text{ as }n\to\infty,\text{ under }H_0.$$

Example Testing $(X_{1,t})\leftarrow(X_{2,t})$ is testing whether $p_{1,2}=0$, or not.
Autocorrelation of MINAR(1) processes

Proposition Consider a MINAR(1) process with representation $\boldsymbol{X}_t=P\circ\boldsymbol{X}_{t-1}+\boldsymbol{\varepsilon}_t$, where $(\boldsymbol{\varepsilon}_t)$ is the innovation process, with $\boldsymbol{\lambda}:=\mathbb{E}(\boldsymbol{\varepsilon}_t)$ and $\Lambda:=\mathrm{Var}(\boldsymbol{\varepsilon}_t)$. Let $\boldsymbol{\mu}:=\mathbb{E}(\boldsymbol{X}_t)$ and $\gamma(h):=\mathrm{cov}(\boldsymbol{X}_t,\boldsymbol{X}_{t-h})$. Then $\boldsymbol{\mu}=[\mathbb{I}-P]^{-1}\boldsymbol{\lambda}$ and, for all $h\in\mathbb{N}$, $\gamma(h)=P^h\gamma(0)$, with $\gamma(0)$ solution of
$$\gamma(0)=P\gamma(0)P^{\mathsf{T}}+(\Delta+\Lambda).$$
See Boudreault & C. (2011) for additional properties.
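The equation for $\gamma(0)$ is a discrete Lyapunov-type fixed point, which can be solved by iterating the map when the spectral radius of $P$ is below one; a sketch for an illustrative 2×2 example:

```python
# gamma(0) = P gamma(0) P^T + (Delta + Lambda), solved by fixed-point iteration.
# P, lam and Lambda below are illustrative.
P = [[0.3, 0.2],
     [0.1, 0.4]]
lam = [1.0, 2.0]
Lam = [[1.0, 0.3],       # Var(eps); off-diagonal = instantaneous dependence
       [0.3, 2.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

# mu = (I - P)^{-1} lam, by hand in the 2x2 case
a, b, c, d = 1 - P[0][0], -P[0][1], -P[1][0], 1 - P[1][1]
det = a * d - b * c
mu = [(d * lam[0] - b * lam[1]) / det, (-c * lam[0] + a * lam[1]) / det]

# Delta = diag(V mu) with V_{ij} = p_{ij}(1 - p_{ij})
V = [[p * (1 - p) for p in row] for row in P]
Delta = [[V[0][0] * mu[0] + V[0][1] * mu[1], 0.0],
         [0.0, V[1][0] * mu[0] + V[1][1] * mu[1]]]

G = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(200):                 # contraction: converges geometrically
    G = add(matmul(matmul(P, G), transpose(P)), add(Delta, Lam))
G_next = add(matmul(matmul(P, G), transpose(P)), add(Delta, Lam))
residual = max(abs(G_next[i][j] - G[i][j]) for i in range(2) for j in range(2))
```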
Granger causality $X_1\to X_2$ or $X_1\leftarrow X_2$

Plates: 1. North American, 2. Eurasian, 3. Okhotsk, 4. Pacific (East), 5. Pacific (West), 6. Amur, 7. Indo-Australian, 8. African, 9. Indo-Chinese, 10. Arabian, 11. Philippine, 12. Cocos, 13. Caribbean, 14. Somali, 15. South American, 16. Nazca, 17. Antarctic

[Figures: 17 × 17 matrices of pairwise Granger causality tests between the plates, at 3-hour and 6-hour time steps.]
Granger causality $X_1\to X_2$ or $X_1\leftarrow X_2$

[Figures: 17 × 17 matrices of pairwise Granger causality tests between the plates, at 12-hour and 24-hour time steps.]
Granger causality $X_1\to X_2$ or $X_1\leftarrow X_2$

[Figures: 17 × 17 matrices of pairwise Granger causality tests between the plates, at 36-hour and 48-hour time steps.]
Using Ranks for Time Series

Haugh (1976) suggested using ranks to test for independence. Let $R_t$ denote the rank of $X_t$ within $\{X_1,\cdots,X_T\}$, and set
$$U_t=\frac{R_t}{T}=\frac{1}{T}\sum_{s=1}^T \mathbf{1}_{X_s\leq X_t}=\widehat{F}_X(X_t)$$
and similarly
$$V_t=\frac{S_t}{T}=\frac{1}{T}\sum_{s=1}^T \mathbf{1}_{Y_s\leq Y_t}=\widehat{F}_Y(Y_t).$$
See also Dufour (1981) for rank tests for serial dependence.
Causality, in dimension 2

From Taamouti, Bouezmarni & El Ghouch (2014), consider some copula-based causality approach:
$$C(X\to Y)=\mathbb{E}\left[\log\frac{f(X_t|\boldsymbol{X},\boldsymbol{Y})}{f(X_t|\boldsymbol{X})}\right]$$
can be written, for Markov (order 1) processes,
$$C(X\to Y)=\mathbb{E}\left[\log\frac{f(X_t|X_{t-1},Y_{t-1})}{f(X_t|X_{t-1})}\right]=\mathbb{E}\left[\log\frac{f(X_t,X_{t-1},Y_{t-1})\cdot f(X_{t-1})}{f(X_t,X_{t-1})\cdot f(X_{t-1},Y_{t-1})}\right]$$
i.e.
$$C(X\to Y)=\mathbb{E}\left[\log\frac{c(F_X(X_t),F_X(X_{t-1}),F_Y(Y_{t-1}))}{c(F_X(X_t),F_X(X_{t-1}))\cdot c(F_X(X_{t-1}),F_Y(Y_{t-1}))}\right]$$
Using a Probit-type Transformation

Following Geenens, C. & Paindaveine (2014), consider some probit-type transformation, for stationary time series:
$$X_t^{\star}=\Phi^{-1}(\widehat{U}_t)=\Phi^{-1}(\widehat{F}_X(X_t))\qquad Y_t^{\star}=\Phi^{-1}(\widehat{V}_t)=\Phi^{-1}(\widehat{F}_Y(Y_t))$$
Application in Bastos, Mercea & C. (2015).
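A sketch of the rank-based probit transform (using $T+1$ in the denominator, a common convention assumed here, so that $\Phi^{-1}$ is applied strictly inside $(0,1)$; the exponential sample is illustrative):

```python
import random
from statistics import NormalDist

# Rank-based probit transform: U_t = R_t / (T + 1), then X*_t = Phi^{-1}(U_t).
random.seed(11)
T = 1000
xs = [random.expovariate(1.0) for _ in range(T)]   # any non-Gaussian series

order = sorted(range(T), key=lambda i: xs[i])
rank = [0] * T
for r, i in enumerate(order, start=1):
    rank[i] = r                                    # R_t, ranks 1..T

u = [rank[i] / (T + 1) for i in range(T)]          # empirical cdf values
z = [NormalDist().inv_cdf(ui) for ui in u]         # probit transform

mean_z = sum(z) / T          # ~0 by symmetry of the quantile grid
var_z = sum(v * v for v in z) / T                  # ~1
```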
Online vs. Onsite Causality

For #occupy and #indignados:

[Causality graphs linking Facebook (F) and Twitter (T) activity to onsite variables: Protestors (P), Injuries (I), Arrests (A) and Camped (C).]