SLIDE 1

Introduction to State Space Methods

Siem Jan Koopman

s.j.koopman@feweb.vu.nl

Vrije Universiteit Amsterdam Tinbergen Institute

SLIDE 2

State Space Model

The linear Gaussian state space model is defined in three parts:

  • State equation: αt+1 = Ttαt + Rtζt, ζt ∼ NID(0, Qt);
  • Observation equation: yt = Ztαt + εt, εt ∼ NID(0, Gt);
  • Initial state distribution: α1 ∼ N(a1, P1).

Notice that

  • ζt and εs are independent for all t, s, and independent of α1;
  • the observation yt can be multivariate;
  • the state vector αt is unobserved;
  • the matrices Tt, Zt, Rt, Qt, Gt determine the structure of the model.
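As a minimal illustration of this general form, the Python sketch below simulates from the model; the function and parameter names are our own illustrative choices, with the local level model as the example instance.

```python
import numpy as np

def simulate_ssm(T, Z, R, Q, G, a1, P1, n, seed=0):
    """Draw y_1, ..., y_n from
    alpha_{t+1} = T alpha_t + R zeta_t,  zeta_t ~ NID(0, Q),
    y_t         = Z alpha_t + eps_t,     eps_t  ~ NID(0, G),
    with alpha_1 ~ N(a1, P1)."""
    rng = np.random.default_rng(seed)
    alpha = rng.multivariate_normal(a1, P1)        # draw the initial state
    y = np.empty((n, Z.shape[0]))
    for t in range(n):
        y[t] = Z @ alpha + rng.multivariate_normal(np.zeros(Z.shape[0]), G)
        alpha = T @ alpha + R @ rng.multivariate_normal(np.zeros(Q.shape[0]), Q)
    return y

# Local level model as a special case: all system matrices are 1 x 1.
y = simulate_ssm(np.eye(1), np.eye(1), np.eye(1), Q=0.1 * np.eye(1),
                 G=1.0 * np.eye(1), a1=np.zeros(1), P1=np.eye(1), n=100)
```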

SLIDE 3

State Space Model

  • the state space model is linear and Gaussian: therefore the properties and results of the multivariate normal distribution apply;
  • the state vector αt evolves as a VAR(1) process;
  • the system matrices usually contain unknown parameters;
  • estimation therefore has two aspects:
      • measuring the unobservable state (prediction, filtering and smoothing);
      • estimation of the unknown parameters (maximum likelihood estimation);
  • state space methods offer a unified approach to a wide range of models and techniques: dynamic regression, ARIMA, UC models, latent variable models, spline-fitting and many ad-hoc filters;
  • next, some well-known model specifications in state space form ...

SLIDE 4

Regression with Time Varying Coefficients

General state space model:

αt+1 = Ttαt + Rtζt, ζt ∼ NID(0, Qt),
yt = Ztαt + εt, εt ∼ NID(0, Gt).

Put the regressors in Zt and set Tt = I, Rt = I. The result is a regression model with the coefficient vector αt following a random walk.
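A sketch of how the system matrices can be filled in for this case (an illustrative helper; k regressors in x_t, scalar yt, and a common coefficient-innovation variance are our own assumptions):

```python
import numpy as np

def tvp_regression_matrices(x_t, sigma2_zeta, sigma2_eps):
    """System matrices at time t for y_t = x_t' alpha_t + eps_t,
    where the k coefficients follow random walks."""
    k = x_t.shape[0]
    Z_t = x_t.reshape(1, k)          # regressors enter Z_t
    T_t = np.eye(k)                  # T_t = I
    R_t = np.eye(k)                  # R_t = I
    Q_t = sigma2_zeta * np.eye(k)    # coefficient innovation variances
    G_t = np.array([[sigma2_eps]])   # observation noise variance
    return Z_t, T_t, R_t, Q_t, G_t
```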

SLIDE 5

ARMA in State Space Form

Example: AR(2) model yt+1 = φ1yt + φ2yt−1 + ζt, in state space:

αt+1 = Ttαt + Rtζt, ζt ∼ NID(0, Qt),
yt = Ztαt + εt, εt ∼ NID(0, Gt),

with 2 × 1 state vector αt and system matrices:

Zt = (1  0),   Gt = 0,   Tt = [φ1  1; φ2  0],   Rt = (1  0)′,   Qt = σ².

  • Zt and Gt = 0 imply that α1t = yt;
  • the first state equation implies yt+1 = φ1yt + α2t + ζt with ζt ∼ NID(0, σ²);
  • the second state equation implies α2,t+1 = φ2yt.
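These matrices translate directly; a small illustrative sketch (names are our own):

```python
import numpy as np

def ar2_matrices(phi1, phi2, sigma2):
    """System matrices for the AR(2) model in the slide's representation."""
    Z = np.array([[1.0, 0.0]])      # alpha_1t = y_t, observed without noise
    G = np.zeros((1, 1))            # G_t = 0
    T = np.array([[phi1, 1.0],
                  [phi2, 0.0]])
    R = np.array([[1.0],
                  [0.0]])
    Q = np.array([[sigma2]])        # var(zeta_t)
    return Z, G, T, R, Q
```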

SLIDE 6

ARMA in State Space Form

Example: MA(1) model yt+1 = ζt + θζt−1, in state space:

αt+1 = Ttαt + Rtζt, ζt ∼ NID(0, Qt),
yt = Ztαt + εt, εt ∼ NID(0, Gt),

with 2 × 1 state vector αt and system matrices:

Zt = (1  0),   Gt = 0,   Tt = [0  1; 0  0],   Rt = (1  θ)′,   Qt = σ².

  • Zt and Gt = 0 imply that α1t = yt;
  • the first state equation implies yt+1 = α2t + ζt with ζt ∼ NID(0, σ²);
  • the second state equation implies α2,t+1 = θζt.

SLIDE 7

ARMA in State Space Form

Example: ARMA(2,1) model yt = φ1yt−1 + φ2yt−2 + ζt + θζt−1, in state space form:

αt = ( yt ,  φ2yt−1 + θζt )′,

Zt = (1  0),   Gt = 0,   Tt = [φ1  1; φ2  0],   Rt = (1  θ)′,   Qt = σ².

All ARIMA(p, d, q) models have a (non-unique) state space representation.
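The three ARMA examples share the same companion structure; a sketch of a general constructor for this (non-unique) ARMA(p, q) representation, with state dimension m = max(p, q + 1), is below. It reproduces the AR(2), MA(1) and ARMA(2,1) matrices shown above; the function name is our own.

```python
import numpy as np

def arma_to_ss(phi, theta, sigma2):
    """One (non-unique) state space form for ARMA(p, q):
    AR coefficients down the first column of T, identity on the
    superdiagonal, MA coefficients stacked in R."""
    p, q = len(phi), len(theta)
    m = max(p, q + 1)                    # state dimension
    T = np.zeros((m, m))
    T[:p, 0] = phi                       # phi_1, ..., phi_p
    T[:-1, 1:] = np.eye(m - 1)           # shift structure
    R = np.zeros((m, 1))
    R[0, 0] = 1.0
    R[1:q + 1, 0] = theta                # theta_1, ..., theta_q
    Z = np.zeros((1, m)); Z[0, 0] = 1.0  # alpha_1t = y_t
    Q = np.array([[sigma2]])
    G = np.zeros((1, 1))                 # G_t = 0
    return Z, G, T, R, Q

# e.g. the slide's ARMA(2,1): arma_to_ss([phi1, phi2], [theta], sigma2)
```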

SLIDE 8

UC models in State Space Form

State space model: αt+1 = Ttαt + Rtζt, yt = Ztαt + εt.

LL model ∆µt+1 = ηt and yt = µt + εt:

αt = µt,   Tt = 1,   Rt = 1,   Qt = σ²η,   Zt = 1,   Gt = σ²ε.

LLT model ∆µt+1 = βt + ηt, ∆βt+1 = ξt and yt = µt + εt:

αt = (µt  βt)′,   Tt = [1  1; 0  1],   Rt = [1  0; 0  1],   Qt = diag(σ²η, σ²ξ),   Zt = (1  0),   Gt = σ²ε.
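The LLT matrices as an illustrative constructor (names are our own):

```python
import numpy as np

def llt_matrices(sig2_eta, sig2_xi, sig2_eps):
    """System matrices for the local linear trend model, state (mu_t, beta_t)."""
    T = np.array([[1.0, 1.0],        # mu_{t+1} = mu_t + beta_t + eta_t
                  [0.0, 1.0]])       # beta_{t+1} = beta_t + xi_t
    R = np.eye(2)
    Q = np.diag([sig2_eta, sig2_xi])
    Z = np.array([[1.0, 0.0]])       # y_t = mu_t + eps_t
    G = np.array([[sig2_eps]])
    return Z, G, T, R, Q
```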

SLIDE 9

UC models in State Space Form

State space model: αt+1 = Ttαt + Rtζt, yt = Ztαt + εt.

LLT model with a dummy seasonal (here with seasonal length s = 4): ∆µt+1 = βt + ηt, ∆βt+1 = ξt, S(L)γt+1 = ωt and yt = µt + γt + εt:

αt = (µt  βt  γt  γt−1  γt−2)′,

Tt = [1  1   0   0   0;
      0  1   0   0   0;
      0  0  −1  −1  −1;
      0  0   1   0   0;
      0  0   0   1   0],

Rt = [1  0  0;
      0  1  0;
      0  0  1;
      0  0  0;
      0  0  0],

Qt = diag(σ²η, σ²ξ, σ²ω),   Zt = (1  0  1  0  0),   Gt = σ²ε.
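A sketch of the same construction for a general seasonal length s (illustrative helper; s = 4 reproduces the matrices above):

```python
import numpy as np

def llt_seasonal_matrices(s, sig2_eta, sig2_xi, sig2_omega, sig2_eps):
    """LLT plus dummy seasonal of period s; state is
    (mu_t, beta_t, gamma_t, ..., gamma_{t-s+2})."""
    m = 2 + (s - 1)
    T = np.zeros((m, m))
    T[0, 0] = T[0, 1] = T[1, 1] = 1.0   # level and slope block
    T[2, 2:m] = -1.0                    # S(L) gamma_{t+1} = omega_t
    T[3:m, 2:m - 1] = np.eye(s - 2)     # shift the lagged seasonal terms
    R = np.zeros((m, 3))
    R[0, 0] = R[1, 1] = R[2, 2] = 1.0   # selects eta, xi, omega
    Q = np.diag([sig2_eta, sig2_xi, sig2_omega])
    Z = np.zeros((1, m)); Z[0, 0] = 1.0; Z[0, 2] = 1.0  # y = mu + gamma + eps
    G = np.array([[sig2_eps]])
    return Z, G, T, R, Q
```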

SLIDE 10

Kalman Filter

  • The Kalman filter calculates the mean and variance of the unobserved state, given the observations.
  • The state is Gaussian: the complete distribution is characterized by the mean and variance.
  • The filter is a recursive algorithm; the current best estimate is updated whenever a new observation is obtained.
  • To start the recursion, we need a1 and P1, which we assumed given.
  • There are various ways to initialize when a1 and P1 are unknown, which we will not discuss here.

SLIDE 11

Kalman Filter

The unobserved state αt can be estimated from the observations with the Kalman filter:

vt = yt − Ztat,
Ft = ZtPtZ′t + Gt,
Kt = TtPtZ′tFt⁻¹,
at+1 = Ttat + Ktvt,
Pt+1 = TtPtT′t + RtQtR′t − KtFtK′t,

for t = 1, . . . , n, starting with given values for a1 and P1.

  • Writing Yt = {y1, . . . , yt},

    at+1 = E(αt+1|Yt),   Pt+1 = var(αt+1|Yt).
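The recursions translate directly into code. A sketch for univariate yt (an illustrative implementation; names are our own):

```python
import numpy as np

def kalman_filter(y, Z, G, T, R, Q, a1, P1):
    """Kalman filter for univariate y_t, using the recursions above.
    Returns predicted states a_t, variances P_t, prediction errors v_t
    and their variances F_t, for t = 1, ..., n."""
    n, m = len(y), len(a1)
    a, P = a1.astype(float), P1.astype(float)
    a_out, P_out = np.empty((n, m)), np.empty((n, m, m))
    v, F = np.empty(n), np.empty(n)
    for t in range(n):
        a_out[t], P_out[t] = a, P
        v[t] = y[t] - (Z @ a)[0]                    # v_t = y_t - Z_t a_t
        F[t] = (Z @ P @ Z.T)[0, 0] + G[0, 0]        # F_t = Z_t P_t Z_t' + G_t
        K = (T @ P @ Z.T) / F[t]                    # K_t = T_t P_t Z_t' F_t^-1
        a = T @ a + K[:, 0] * v[t]                  # a_{t+1} = T_t a_t + K_t v_t
        P = T @ P @ T.T + R @ Q @ R.T - F[t] * (K @ K.T)  # P_{t+1}
    return a_out, P_out, v, F
```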

SLIDE 12

Kalman Filter

State space model: αt+1 = Ttαt + Rtζt, yt = Ztαt + εt.

  • Writing Yt = {y1, . . . , yt}, define

    at+1 = E(αt+1|Yt),   Pt+1 = var(αt+1|Yt);

  • The prediction error is

    vt = yt − E(yt|Yt−1) = yt − E(Ztαt + εt|Yt−1) = yt − Zt E(αt|Yt−1) = yt − Ztat;

  • It follows that vt = Zt(αt − at) + εt and E(vt) = 0;
  • The prediction error variance is Ft = var(vt) = ZtPtZ′t + Gt.

SLIDE 13

Lemma

The proof of the Kalman filter uses a lemma from multivariate Normal regression theory.

Lemma. Suppose x, y and z are jointly Normally distributed vectors with E(z) = 0 and Σyz = 0. Then

E(x|y, z) = E(x|y) + ΣxzΣzz⁻¹z,
var(x|y, z) = var(x|y) − ΣxzΣzz⁻¹Σ′xz.

SLIDE 14

Kalman Filter

State space model: αt+1 = Ttαt + Rtζt, yt = Ztαt + εt.

  • We have Yt = {Yt−1, yt} = {Yt−1, vt} and E(vty′t−j) = 0 for j = 1, . . . , t − 1;
  • Apply the lemma E(x|y, z) = E(x|y) + ΣxzΣzz⁻¹z with x = αt+1, y = Yt−1 and z = vt = Zt(αt − at) + εt;
  • It follows that E(αt+1|Yt−1) = Ttat;
  • Further, E(αt+1v′t) = Tt E(αtv′t) + Rt E(ζtv′t) = TtPtZ′t;
  • Carrying out the lemma, we obtain the state update

    at+1 = E(αt+1|Yt−1, yt) = Ttat + TtPtZ′tFt⁻¹vt = Ttat + Ktvt,   with Kt = TtPtZ′tFt⁻¹.
SLIDE 15

Kalman Filter

Our best prediction of yt is Ztat. When the actual observation arrives, we calculate the prediction error vt = yt − Ztat and its variance Ft = ZtPtZ′t + Gt. The new best estimate of the state mean is based on both the old estimate at and the new information vt:

at+1 = Ttat + Ktvt,

and similarly for the variance:

Pt+1 = TtPtT′t + RtQtR′t − KtFtK′t.

The Kalman gain Kt = TtPtZ′tFt⁻¹ is the optimal weighting matrix for the new evidence.

SLIDE 16

Kalman Filter Illustration

[Figure: four panels, 1880–1960 — observations with filtered level a_t; state variance P_t; prediction error v_t; prediction error variance F_t.]

SLIDE 17

Smoothing

  • The filter calculates the mean and variance conditional on Yt;
  • The Kalman smoother calculates the mean and variance conditional on the full set of observations Yn;
  • After the filtered estimates are calculated, the smoothing recursion starts at the last observation and runs back to the first.

α̂t = E(αt|Yn),   Vt = var(αt|Yn),   rt = weighted sum of innovations,   Nt = var(rt),   Lt = Tt − KtZt.

Starting with rn = 0 and Nn = 0, the smoothing recursions are given by (written here for a scalar state)

rt−1 = Ft⁻¹vt + Ltrt,   Nt−1 = Ft⁻¹ + Lt²Nt,
α̂t = at + Ptrt−1,   Vt = Pt − Pt²Nt−1.
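A sketch of the backward pass in the scalar-state form above (e.g. the local level model), consuming the output of the filter sketch from before with a_t and P_t treated as scalars:

```python
import numpy as np

def kalman_smoother(Z, G, T, a_out, P_out, v, F):
    """Backward smoothing recursions for a scalar state; Z, G, T are
    scalars, a_out/P_out/v/F come from the forward filter pass."""
    n = len(v)
    alpha_hat, V = np.empty(n), np.empty(n)
    r, N = 0.0, 0.0                      # r_n = 0, N_n = 0
    for t in range(n - 1, -1, -1):
        K = T * P_out[t] * Z / F[t]      # Kalman gain
        L = T - K * Z                    # L_t = T_t - K_t Z_t
        r = v[t] / F[t] + L * r          # r_{t-1} = F_t^-1 v_t + L_t r_t
        N = 1.0 / F[t] + L**2 * N        # N_{t-1} = F_t^-1 + L_t^2 N_t
        alpha_hat[t] = a_out[t] + P_out[t] * r   # smoothed state
        V[t] = P_out[t] - P_out[t]**2 * N        # smoothed variance
    return alpha_hat, V
```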

SLIDE 18

Smoothing Illustration

[Figure: four panels, 1880–1960 — observations with smoothed state; smoothed state variance V_t; weighted innovation sum r_t; its variance N_t.]

SLIDE 19

Filtering and Smoothing

[Figure: observations, 1870–1970, with the smoothed level and the filtered level overlaid.]

SLIDE 20

Missing Observations

Missing observations are very easy to handle in Kalman filtering:

  • suppose yj is missing;
  • put vj = 0, Kj = 0 and Fj = ∞ in the algorithm;
  • proceed with the further calculations as normal.

The filter algorithm extrapolates according to the state equation until a new observation arrives. The smoother interpolates between observations.
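In the filter sketch given earlier this amounts to a small change inside the loop; a sketch using NaN to mark missing values (the variant name is our own):

```python
import numpy as np

def kalman_filter_missing(y, Z, G, T, R, Q, a1, P1):
    """Filter variant that treats NaN entries of y as missing:
    v_j = 0, K_j = 0, F_j = infinity, then proceed as normal."""
    n, m = len(y), len(a1)
    a, P = a1.astype(float), P1.astype(float)
    a_out, P_out = np.empty((n, m)), np.empty((n, m, m))
    v, F = np.empty(n), np.empty(n)
    for t in range(n):
        a_out[t], P_out[t] = a, P
        if np.isnan(y[t]):                       # y_t missing
            v[t], F[t] = 0.0, np.inf             # v_j = 0, F_j = infinity
            a = T @ a                            # K_j = 0: prediction only
            P = T @ P @ T.T + R @ Q @ R.T        # correction term vanishes
        else:                                    # usual filtering step
            v[t] = y[t] - (Z @ a)[0]
            F[t] = (Z @ P @ Z.T)[0, 0] + G[0, 0]
            K = (T @ P @ Z.T) / F[t]
            a = T @ a + K[:, 0] * v[t]
            P = T @ P @ T.T + R @ Q @ R.T - F[t] * (K @ K.T)
    return a_out, P_out, v, F
```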

SLIDE 21

Missing Observations

[Figure: two panels, 1870–1970 — observations with filtered level a_t; state variance P_t.]

SLIDE 22

Missing Observations, Filter and Smoother

[Figure: four panels, 1880–1960 — filtered state; its variance P_t; smoothed state; smoothed state variance V_t.]

SLIDE 23

Forecasting

Forecasting requires no extra theory: simply treat the future observations as missing:

  • put vj = 0, Kj = 0 and Fj = ∞ for j = n + 1, . . . , n + k;
  • proceed with the further calculations as normal;
  • the forecast for yj is Zjaj.
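With the missing-value handling from the earlier sketch, this is a few lines (assuming kalman_filter_missing and the system matrices from before; k = 30 is an arbitrary horizon for illustration):

```python
import numpy as np

# Append k missing values and run the missing-value-aware filter; a_j for
# j > n is pure extrapolation, so Z a_j is the forecast of y_j.
k = 30
y_ext = np.concatenate([y, np.full(k, np.nan)])
a_out, P_out, v, F = kalman_filter_missing(y_ext, Z, G, T, R, Q, a1, P1)
y_hat = a_out[len(y):] @ Z.T            # forecasts of y_{n+1}, ..., y_{n+k}
```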

SLIDE 24

Forecasting

[Figure: two panels, 1870–2000 — observations with a_t, extended beyond the sample as forecasts; state variance P_t.]

SLIDE 25

Parameter Estimation

The system matrices in a state space model typically depend on a parameter vector ψ. The model is completely Gaussian; we estimate ψ by maximum likelihood. The loglikelihood of a time series is

log L = Σt=1,…,n log p(yt|Yt−1).

In the state space model, p(yt|Yt−1) is a Gaussian density with mean Ztat and variance Ft:

log L = −(n/2) log 2π − (1/2) Σt=1,…,n ( log Ft + Ft⁻¹vt² ),

with vt and Ft from the Kalman filter. This is called the prediction error decomposition of the likelihood. Estimation proceeds by numerically maximising log L.
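A sketch of this for the local level model, assuming the kalman_filter helper from before; the log-variance parametrisation and the large diffuse-style P1 are our own illustrative choices:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, y):
    """Negative loglikelihood via the prediction error decomposition;
    params are log-variances (log sig2_eta, log sig2_eps)."""
    sig2_eta, sig2_eps = np.exp(params)     # enforce positive variances
    one = np.ones((1, 1))
    _, _, v, F = kalman_filter(y, Z=one, G=sig2_eps * one, T=one, R=one,
                               Q=sig2_eta * one,
                               a1=np.array([y[0]]), P1=1e7 * one)
    return 0.5 * np.sum(np.log(2.0 * np.pi * F) + v**2 / F)

res = minimize(neg_loglik, x0=np.log([1e3, 1e4]), args=(y,))
q_hat = np.exp(res.x[0] - res.x[1])   # signal-to-noise ratio q = sig2_eta/sig2_eps
```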

SLIDE 26

ML Estimate of Nile Data

[Figure: Nile data, 1870–1970, with three level estimates: q = 1000, q = 0, and q = 0.0973 (the ML estimate).]

SLIDE 27

Diagnostics

  • Null hypothesis: the standardised residuals satisfy

    vt/√Ft ∼ NID(0, 1);

  • Apply standard tests for Normality, heteroskedasticity and serial correlation;
  • A recursive algorithm is available to calculate smoothed disturbances (auxiliary residuals), which can be used to detect breaks and outliers;
  • Model comparison and parameter restrictions: use likelihood-based procedures (LR test, AIC, BIC).
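A sketch of basic residual checks on the filter output v and F (the Jarque-Bera test from scipy plus a quick lag-1 autocorrelation; the auxiliary-residual recursion itself is not shown here):

```python
import numpy as np
from scipy import stats

# Standardised one-step prediction errors from the Kalman filter:
e = v / np.sqrt(F)

jb = stats.jarque_bera(e)                  # Normality test
r1 = np.corrcoef(e[:-1], e[1:])[0, 1]      # lag-1 autocorrelation as a
                                           # quick serial-correlation check
print(f"JB = {jb.statistic:.2f} (p = {jb.pvalue:.3f}), r1 = {r1:.3f}")
```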

SLIDE 28

Nile Data Residuals Diagnostics

[Figure: four diagnostic panels for the Nile residuals — standardised residuals over time (1880–1960); correlogram up to lag 20; histogram with fitted density N(s = 0.996); QQ plot against the normal distribution.]