SLIDE 1 FMS161/MASM18 Financial Statistics Lecture 2, Linear Time Series
Erik Lindström
SLIDES 2–5 Systems with discrete time
Linear systems
◮ Can be represented in polynomial or state space model form while being
  ◮ causal
  ◮ time-invariant
  ◮ stationary
◮ Stability
  ◮ Lyapunov stable: ∥x(t) − xe∥ < ϵ
  ◮ Asymptotically stable: limt→∞ ∥x(t) − xe∥ = 0
◮ Discrete time models are written as a difference equation
◮ Impulse Response - h(s, t) or h(τ)
◮ Transfer Function - H(z)
◮ Frequency Function - H(e^{i2πf})
Typical process: the SARIMAX process
SLIDE 6
Impulse response
◮ A causal, linear, stable system (Gaussian or
non-Gaussian) has a well-defined impulse response h(·).
◮ The impulse response is the output of a system if
we let the input be 1 at time zero and then zero for the rest of the time.
◮ The output for a general input u is given as
y(t) = ∑_{i=0}^{∞} h(i)u(t − i) = (h ∗ u)(t)
It is the convolution of the input u and the impulse response h (a numerical sketch follows below).
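A minimal numerical sketch (not from the slides) of evaluating the convolution sum with NumPy; the impulse response h and input u are arbitrary illustrative choices.

```python
import numpy as np

# Hypothetical impulse response of a stable causal system: h(i) = 0.5^i
h = 0.5 ** np.arange(20)                            # truncated after 20 lags
u = np.random.default_rng(0).standard_normal(200)   # arbitrary input signal

# y(t) = sum_{i>=0} h(i) u(t - i): discrete convolution, truncated to len(u)
y = np.convolve(u, h)[:len(u)]
```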
SLIDE 7
Difference equations
Difference equation representation for the ARX/ARMA structure:
yt + a1yt−1 + · · · + apyt−p = ut + b1ut−1 + · · · + bqut−q
Using the delay operator, z^{−1}yt = yt−1, this leads to the transfer function (which can be defined also for a system not following this linear difference equation)
yt = [(1 + b1z^{−1} + · · · + bqz^{−q}) / (1 + a1z^{−1} + · · · + apz^{−p})] ut (1)
   = [B(z)/A(z)] ut = H(z) ut
with (the latter equation with a Z-transform interpretation of the operations)
H(z) = ∑_{τ=0}^{∞} h(τ)z^{−τ};   Y(z) = H(z)U(z)
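A hedged sketch of reading off the impulse response from H(z) = B(z)/A(z): the coefficients below are made up for illustration, and scipy's lfilter convention a = [1, a1, ...], b = [1, b1, ...] is assumed.

```python
import numpy as np
from scipy.signal import lfilter

# Hypothetical polynomials: A(z) = 1 - 0.7 z^{-1}, B(z) = 1 + 0.4 z^{-1}
a = [1.0, -0.7]
b = [1.0, 0.4]

# Feed a unit impulse through H(z) = B(z)/A(z) to obtain h(0), h(1), ...
impulse = np.zeros(30)
impulse[0] = 1.0
h = lfilter(b, a, impulse)
```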
SLIDE 8
Frequency representation
The frequency function is defined from the transfer function as
H(e^{i2πf}) = H(f)
giving an amplitude and phase shift of an input trigonometric signal, e.g.
u(k) = cos(2πfk)   ⇒   y(k) = |H(f)| cos(2πfk + arg H(f)),   |f| ≤ 0.5
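A small sketch (assuming scipy is available; the B and A coefficients are again illustrative) of evaluating the amplitude |H(f)| and phase arg H(f) on a frequency grid:

```python
import numpy as np
from scipy.signal import freqz

b, a = [1.0, 0.4], [1.0, -0.7]   # illustrative B(z), A(z)

# Evaluate H(e^{i 2 pi f}) for f in [0, 0.5]; freqz expects radians/sample
f = np.linspace(0.0, 0.5, 256)
_, H = freqz(b, a, worN=2 * np.pi * f)
amplitude, phase = np.abs(H), np.angle(H)
```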
SLIDE 9 Spectrum
◮ If we filter standard white noise, i.e. a sequence
of i.i.d. zero-mean random variables with
variance one, through a linear system with frequency function H, then we get a signal with spectrum R(f) = |H(f)|². The spectrum at frequency f is the average energy in the output at frequency f.
◮ The spectrum is also the Fourier transform of the covariance function γ(k) = E[XnXn−k], with
R(f) = ∑_{k=−∞}^{∞} γ(k)e^{−i2πfk} = [γ(·) is symmetric] = ∑_{k=−∞}^{∞} γ(k) cos(2πfk). (2)
Note: The covariance is not symmetric for multivariate processes (think Granger causality). A numerical sketch of R(f) follows below.
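Combining the two views, the spectrum of filtered unit-variance white noise can be computed as |H(f)|²; a sketch with the same illustrative filter as above:

```python
import numpy as np
from scipy.signal import freqz

b, a = [1.0, 0.4], [1.0, -0.7]   # illustrative filter from before
f = np.linspace(0.0, 0.5, 513)
_, H = freqz(b, a, worN=2 * np.pi * f)
R = np.abs(H) ** 2               # R(f) = |H(f)|^2 for standard white-noise input
```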
SLIDE 10
Inverse filtering in discrete time
AIM: Reconstruct the input u from the output y signal.
◮ Assume that we have a filter g which, like h, is linear, stable, and time-invariant. It then follows that
w(k) = (g ∗ y)(k) = (g ∗ h ∗ u)(k) (3)
◮ We say that g is an inverse if w(k) = u(k) for all k, or equivalently that there exists a causal and stable g such that
(g ∗ h)(k) = δ(k)   ⇔   G(z)H(z) = 1
NOTE: Causality means that we reconstruct the signal from old values only, i.e. g(k) = 0 ∀ k < 0 (sketched below).
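A minimal sketch of inverse filtering, assuming scipy: an invertible MA(1) filter is undone by the corresponding AR(1) filter, since G(z) = 1/H(z) is causal and stable when the zero of H lies inside the unit circle.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
u = rng.standard_normal(500)

# Filter u through an invertible MA(1): H(z) = 1 - 0.5 z^{-1}
y = lfilter([1.0, -0.5], [1.0], u)

# Inverse filter G(z) = 1 / (1 - 0.5 z^{-1}): causal and stable since |0.5| < 1
w = lfilter([1.0], [1.0, -0.5], y)

assert np.allclose(w, u)   # w reconstructs the input
```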
SLIDE 11 ARMA(p,q)-filter
◮ The process is defined as
yt + a1yt−1 + · · · + apyt−p = xt + c1xt−1 + · · · + cqxt−q
◮ The corresponding transfer function is given by
H(z) = (1 + c1z^{−1} + · · · + cqz^{−q}) / (1 + a1z^{−1} + · · · + apz^{−p}) = C(z)/A(z)
◮ Properties (see the root check sketched below):
  ◮ Frequency function: H(e^{i2πf}), f ∈ (−0.5, 0.5]
  ◮ Stability: the poles, i.e. the solutions πi of A(z^{−1}) = 0, satisfy
|πi| < 1, i = 1, . . . , p
  ◮ Invertibility: the zeroes, i.e. the solutions ηi of C(z^{−1}) = 0, satisfy
|ηi| < 1, i = 1, . . . , q
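A sketch of checking stability and invertibility numerically (coefficients assumed for illustration): with A(z) = 1 + a1 z^{−1} + · · · + ap z^{−p}, the poles are the roots of z^p + a1 z^{p−1} + · · · + ap, which np.roots computes directly.

```python
import numpy as np

# Hypothetical ARMA(2,1) coefficients
a = [1.0, -1.5, 0.7]   # 1, a1, a2
c = [1.0, -0.5]        # 1, c1

poles = np.roots(a)    # roots of z^2 + a1 z + a2
zeros = np.roots(c)    # roots of z + c1

stable     = np.all(np.abs(poles) < 1)   # all poles inside the unit circle
invertible = np.all(np.abs(zeros) < 1)   # all zeros inside the unit circle
```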
SLIDE 12
ARMAX process
We can combine moving average (MA) and exogenous variables in the ARMAX process
yt + a1yt−1 + · · · + apyt−p = xt + b1xt−1 + · · · + bqxt−q + et + c1et−1 + · · · + cret−r (4)
or, in transfer function form,
yt = [B(z)/A(z)] xt + [C(z)/A(z)] et
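A simulation sketch of an ARMAX process under assumed low-order coefficients, applying scipy's lfilter to each transfer function term:

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(1)
n = 1000
x = rng.standard_normal(n)   # exogenous input (placeholder data)
e = rng.standard_normal(n)   # white noise

# y = B(z)/A(z) x + C(z)/A(z) e, with hypothetical coefficients
a = [1.0, -0.6]              # A(z)
b = [1.0, 0.3]               # B(z)
c = [1.0, 0.4]               # C(z)
y = lfilter(b, a, x) + lfilter(c, a, e)
```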
SLIDE 13
Autocorrelation and cross-correlation
◮ The autocovariance is defined as
γ(k) = E[YtYt+k] (5)
with corresponding autocorrelation function
ρ(k) = γ(k)/γ(0) (6)
◮ The cross-covariance is defined as
γXY(k) = E[XtYt+k] (7)
with corresponding cross-correlation function
ρXY(k) = γXY(k)/√(γX(0)γY(0)) (8)
Sample versions of these are sketched below.
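A minimal estimator sketch of mean-corrected sample versions of (5)–(8); the ACF is the special case x = y. Function name and data are illustrative.

```python
import numpy as np

def sample_ccf(x, y, max_lag):
    """Sample cross-correlation rho_xy(k) for k = 0..max_lag."""
    n = len(x)
    x = x - x.mean()
    y = y - y.mean()
    denom = np.sqrt(np.sum(x**2) * np.sum(y**2))   # sqrt(gamma_x(0) gamma_y(0)) * n
    return np.array([np.sum(x[: n - k] * y[k:]) / denom
                     for k in range(max_lag + 1)])

# The ACF is obtained as sample_ccf(data, data, 20)
```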
SLIDE 14
Autocovariance for ARMA
◮ Consider the ARMA(p,q) process
Yt + a1Yt−1 + · · · + apYt−p = et + c1et−1 + · · · + cqet−q (9)
◮ The autocovariance then satisfies
γ(k) + a1γ(k−1) + · · · + apγ(k−p) = ckγeY(0) + · · · + cqγeY(q−k) (10)
This is known as the Yule-Walker equation.
◮ Proof: Multiply by Yt−k and use that
E[et−lYt−k] = 0 for k > l (a solver for the AR case is sketched below).
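For a pure AR(p) process (q = 0) the right-hand side of (10) vanishes for k ≥ 1, so the a-coefficients can be solved from estimated autocovariances; a minimal sketch:

```python
import numpy as np

def yule_walker_ar(x, p):
    """Estimate AR(p) coefficients a1..ap from the sample autocovariance."""
    x = x - x.mean()
    n = len(x)
    gamma = np.array([np.sum(x[: n - k] * x[k:]) / n for k in range(p + 1)])
    # gamma(k) + a1 gamma(k-1) + ... + ap gamma(k-p) = 0 for k = 1..p
    G = np.array([[gamma[abs(k - j)] for j in range(1, p + 1)]
                  for k in range(1, p + 1)])
    return np.linalg.solve(G, -gamma[1 : p + 1])
```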
SLIDE 15
Cointegration
◮ It is rather common that financial time series
{X(t)} are non-stationary, often integrated.
◮ This means that ∇X(t) is typically stationary. We
then say that X(t) is an integrated process, X(t) ∼ I(1).
◮ Assume that the processes X(t) ∼ I(1) and
Y(t) ∼ I(1), but X(t) − βY(t) ∼ I(0). We then say that X(t) and Y(t) are cointegrated (see the sketch below).
◮ NOTE: The asymptotic theory for β is
non-standard.
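A hedged sketch of a two-step (Engle-Granger style) cointegration check, assuming statsmodels is available; x and y are placeholders for the two I(1) series. Note that, in line with the slide's remark, standard ADF critical values are not strictly valid for residuals with an estimated β.

```python
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

def engle_granger(x, y):
    """Step 1: estimate beta by OLS; step 2: ADF test on the residual."""
    beta = sm.OLS(x, sm.add_constant(y)).fit().params[1]
    resid = x - beta * y
    pvalue = adfuller(resid)[1]   # caveat: nominal p-value only, since beta
                                  # is estimated (non-standard asymptotics)
    return beta, pvalue
```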
SLIDE 16 Log-real money and Bond rates 1974-1985
[Four panels, 1975–1985: Money (levels and differences) and Bond rate (levels and differences).]
Figure: Log-real money and interest rates
SLIDE 17
Estimation
Two dominant approaches:
◮ Optimization-based estimation (LS, WLS, ML,
PEM, GMM)
◮ Matching properties (MM, GMM, EF, ML, IV)
We focus on the optimization-based estimators today.
SLIDE 18
General properties
◮ Denote the true parameter θ0
◮ Introduce an estimator θ̂ = T(X1, . . . , XN)
◮ Observation: The estimator is a function of data. Implications...
◮ Bias: b = E[T(X)] − θ0
◮ Consistency: T(X) →p θ0
◮ Efficiency: Var(T(X)) ≥ IN(θ0)^{−1}, where
IN(θ0)ij = −E[∂²ℓ(X1, . . . , XN, θ)/∂θi∂θj]|θ=θ0 = Cov[∂ℓ/∂θi, ∂ℓ/∂θj]|θ=θ0
and ℓ is the log-likelihood function.
SLIDE 19
Estimators
The maximum likelihood estimator is defined as
θ̂MLE = arg max ℓ(θ) = arg max ∑_{n=1}^{N} log pθ(xn | x1, . . . , xn−1) (11)
The asymptotics for the MLE are given by
√N (θ̂MLE − θ0) →d N(0, I_F^{−1}). (12)
Hint: MLmax will help you during the labs and project (a generic numerical sketch follows below).
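As a hedged illustration of how such an estimator can be computed numerically (this is not the course's MLmax routine), one can minimize the negative conditional Gaussian log-likelihood of an AR(1) model with scipy:

```python
import numpy as np
from scipy.optimize import minimize

def ar1_negloglik(theta, x):
    """Negative conditional log-likelihood of x_t = a x_{t-1} + e_t, e_t ~ N(0, sigma^2)."""
    a, log_sigma = theta
    sigma2 = np.exp(2 * log_sigma)        # parametrized via log sigma for positivity
    resid = x[1:] - a * x[:-1]
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + resid**2 / sigma2)

# x: observed series (placeholder); x0 is an arbitrary starting point
# res = minimize(ar1_negloglik, x0=[0.0, 0.0], args=(x,))
# a_hat, sigma_hat = res.x[0], np.exp(res.x[1])
```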
SLIDE 20
Estimators
The general so-called M-estimator (e.g. GMM) is defined as
θ̂ = arg min Q(θ) = arg min log Q(θ) (13)
The asymptotics for that estimator are given by
√N (θ̂ − θ0) →d N(0, J^{−1}IJ^{−1}) (14)
with
J = E[∇θ∇θ log Q] (15)
I = E[(∇θ log Q)(∇θ log Q)^T] (16)
SLIDE 21
ML methods for Gaussian processes
◮ Say that we have a sample
Y = {yt}, t = 1, 2, . . . , n, from a Gaussian process. We then have that Y ∈ N(µ(θ), Σ(θ)), where θ is a vector of parameters.
◮ The log-likelihood for Y can be written as
ℓ(Y, θ) = −(1/2) log(det(2πΣ(θ))) (17)
− (1/2) (Y − µ(θ))^T Σ(θ)^{−1} (Y − µ(θ)). (18)
If we can calculate the likelihood, then it follows that we can use a standard optimization routine to maximize ℓ(Y, θ) and thereby estimate θ (a numerical sketch follows below).
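A minimal sketch of evaluating (17)–(18) stably via a Cholesky factorization (function name and arguments are illustrative):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def gaussian_loglik(y, mu, Sigma):
    """log N(y; mu, Sigma), using log det(2 pi Sigma) = n log 2 pi + log det Sigma."""
    n = len(y)
    L, lower = cho_factor(Sigma, lower=True)
    r = y - mu
    quad = r @ cho_solve((L, lower), r)            # r^T Sigma^{-1} r
    logdet = 2 * np.sum(np.log(np.diag(L)))        # log det Sigma
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)
```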
SLIDE 22
Example: AR(2)
Y = (x3, . . . , xN)^T,   X = ( x2   x1
                                x3   x2
                                ...  ...
                                xN−1 xN−2 )
Then
θ̂ = (X^T X)^{−1}(X^T Y)
with
X^T X = ( ∑ x_{i−1}²        ∑ x_{i−1}x_{i−2}
          ∑ x_{i−1}x_{i−2}   ∑ x_{i−2}² )
and
X^T Y = ( ∑ x_i x_{i−1}
          ∑ x_i x_{i−2} )
Solve! Explanation on the blackboard (a least-squares sketch follows below).
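A minimal least-squares sketch mirroring the slide's X and Y; note that with these regressors the fitted coefficients estimate (−a1, −a2) under the convention xt + a1xt−1 + a2xt−2 = et.

```python
import numpy as np

def ar2_ls(x):
    """Least-squares fit of an AR(2) model, with X and Y built as on the slide."""
    Y = x[2:]                                   # x_3, ..., x_N
    X = np.column_stack([x[1:-1], x[:-2]])      # rows (x_{t-1}, x_{t-2})
    theta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return theta   # estimates (-a1, -a2)
```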
SLIDE 23 An ARMA example
ARMA(1,1) model: xt + 0.7xt−1 = et − 0.5et−1, {et}t=0,1,2,... i.i.d. ∈ N(0, 1).
[Three panels: Realisation (1000 samples), Covariance function, and Spectrum of the process.]
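The realisation can be reproduced in outline (a sketch; the seed is arbitrary) by filtering Gaussian noise through the slide's ARMA(1,1) filter:

```python
import numpy as np
from scipy.signal import lfilter

# x_t + 0.7 x_{t-1} = e_t - 0.5 e_{t-1}, e_t i.i.d. N(0, 1)
rng = np.random.default_rng(2)
e = rng.standard_normal(1000)
x = lfilter([1.0, -0.5], [1.0, 0.7], e)
```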
SLIDE 24
Explanation of the approximative methods on the blackboard:
◮ ML
◮ 2LS
SLIDE 25 Comparison, LS2 and MLE
ARMA(1,1) model: xt + 0.7xt−1 = et − 0.5et−1, {et}t=0,1,2,... i.i.d. ∈ N(0, 1).
[Four normal probability plots: LS2 and MLE estimates of a1 and c1.]
1000 estimations using 1000 observations each.
SLIDE 26 Extra material
Feel free to dig deeper into any of:
◮ Lindgren, G., Rootzén, H., & Sandsten, M. (2013).
Stationary Stochastic Processes for Scientists and Engineers. CRC Press.
◮ Jakobsson, A. (2015). An Introduction to Time
Series Modeling. Studentlitteratur AB.
◮ Madsen, H. (2007). Time Series Analysis. CRC
Press.
◮ (PhD level) Lindgren, G. (2012). Stationary
Stochastic Processes: Theory and Applications. CRC Press.