

SLIDE 1

Lecture on Parameter Estimation for Stochastic Differential Equations

Erik Lindström

FMS161/MASM18 Financial Statistics

Erik Lindström Lecture on Parameter Estimation for Stochastic Differential Equations

SLIDE 2

Recap

◮ We are interested in the parameters θ in the stochastic integral equation

X(t) = X(0) + \int_0^t \mu_\theta(s, X(s))\,ds + \int_0^t \sigma_\theta(s, X(s))\,dW(s)   (1)

Why?

◮ Model validation
◮ Risk management
◮ Advanced hedging (Greeks, 9.2.2, and quadratic hedging, 9.2.2.1 (P/Q))
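As a concrete sketch, the integral equation (1) can be simulated on a time grid with the Euler-Maruyama scheme. The mean-reverting drift, constant diffusion, and all parameter values below are illustrative assumptions, standing in for a particular choice of μ_θ and σ_θ:

```python
import math
import random

def euler_maruyama(mu, sigma, x0, T, N, rng):
    """Simulate X(t) = X(0) + int mu dt + int sigma dW with N Euler steps of size T/N."""
    dt = T / N
    x = x0
    path = [x0]
    for n in range(N):
        t = n * dt
        dW = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment ~ N(0, dt)
        x = x + mu(t, x) * dt + sigma(t, x) * dW
        path.append(x)
    return path

# Hypothetical theta = (kappa, m, sigma) = (2.0, 0.5, 0.2), chosen only for illustration.
rng = random.Random(1)
path = euler_maruyama(mu=lambda t, x: 2.0 * (0.5 - x),
                      sigma=lambda t, x: 0.2,
                      x0=0.5, T=1.0, N=1000, rng=rng)
```

All the estimators discussed in the remainder of the lecture take discrete observations of such a path as their input.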


SLIDE 6

Some asymptotics

Consider the arithmetic Brownian motion

dX(t) = \mu\,dt + \sigma\,dW(t)   (2)

The drift is estimated by computing the mean of the increments and compensating for the sampling interval δ = t_{n+1} − t_n:

\hat\mu = \frac{1}{\delta N} \sum_{n=0}^{N-1} \left( X(t_{n+1}) - X(t_n) \right).   (3)

Expanding this expression reveals that the MLE is given by

\hat\mu = \frac{X(t_N) - X(t_0)}{t_N - t_0} = \mu + \sigma\, \frac{W(t_N) - W(t_0)}{t_N - t_0}.   (4)

The MLE for the diffusion (σ) parameter is given by

\hat\sigma^2 = \frac{1}{\delta(N-1)} \sum_{n=0}^{N-1} \left( X(t_{n+1}) - X(t_n) - \hat\mu\,\delta \right)^2 \xrightarrow{d} \sigma^2\, \frac{\chi^2(N-1)}{N-1}   (5)
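A minimal stdlib-only sketch of estimators (3) and (5), checked on exactly simulated ABM increments; the parameter values are made up for the illustration:

```python
import math
import random

def abm_mle(x, delta):
    """MLEs (3) and (5) for arithmetic Brownian motion dX = mu dt + sigma dW,
    observed at equidistant times with spacing delta."""
    N = len(x) - 1
    incr = [x[n + 1] - x[n] for n in range(N)]
    # The sum telescopes to (X(t_N) - X(t_0)) / (t_N - t_0), cf. eq. (4).
    mu_hat = sum(incr) / (delta * N)
    s2_hat = sum((d - mu_hat * delta) ** 2 for d in incr) / (delta * (N - 1))
    return mu_hat, s2_hat

# Exact simulation: ABM increments are i.i.d. N(mu*delta, sigma^2*delta).
rng = random.Random(7)
mu, sigma, delta, N = 1.0, 0.5, 0.01, 20000
x = [0.0]
for _ in range(N):
    x.append(x[-1] + mu * delta + sigma * math.sqrt(delta) * rng.gauss(0.0, 1.0))

mu_hat, s2_hat = abm_mle(x, delta)
```

Equation (4) makes the asymptotics visible in practice: the precision of the drift estimate depends only on the total time span t_N − t_0 (here 200), not on the sampling frequency, whereas the diffusion estimate improves with the number of increments.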


SLIDE 10

A simple method

Many data sets are sampled at high frequency, making the bias due to discretizing the SDE with one of the schemes in Chapter 12 acceptable. For the stochastic differential equation

dX(t) = \mu(t, X(t))\,dt + \sigma(t, X(t))\,dW(t)   (6)

the simplest discretization, the explicit Euler method, corresponds to the Discretized Maximum Likelihood (DML) estimator

\hat\theta_{DML} = \arg\max_{\theta \in \Theta} \sum_{n=1}^{N-1} \log \phi\!\left( X(t_{n+1});\ X(t_n) + \mu(t_n, X(t_n))\Delta,\ \Sigma(t_n, X(t_n))\Delta \right)   (7)

where φ(x; m, P) is the density of a multivariate normal distribution with argument x, mean m and covariance P, and

\Sigma(t, X(t)) = \sigma(t, X(t))\, \sigma(t, X(t))^T.   (8)
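The DML objective (7) in the scalar case can be sketched as below. The mean-reverting model, the fixed diffusion, the grid search (standing in for a proper numerical optimizer), and all parameter values are illustrative assumptions:

```python
import math
import random

def dml_loglik(x, dt, mu_fn, sigma2_fn):
    """Discretized log-likelihood, eq. (7), scalar case:
    X(t_{n+1}) | X(t_n) ~ N(X(t_n) + mu*dt, sigma^2*dt) under the Euler scheme."""
    ll = 0.0
    for n in range(len(x) - 1):
        m = x[n] + mu_fn(x[n]) * dt
        v = sigma2_fn(x[n]) * dt
        ll += -0.5 * math.log(2 * math.pi * v) - (x[n + 1] - m) ** 2 / (2 * v)
    return ll

# Hypothetical model: dX = -kappa*X dt + sigma dW, simulated by Euler,
# with kappa recovered by a crude grid search over the DML objective.
rng = random.Random(3)
kappa_true, sigma_true, dt = 1.0, 0.3, 0.01
x = [0.0]
for _ in range(20000):
    x.append(x[-1] - kappa_true * x[-1] * dt
             + sigma_true * math.sqrt(dt) * rng.gauss(0.0, 1.0))

grid = [k / 10 for k in range(1, 31)]   # candidate kappa values 0.1 .. 3.0
kappa_hat = max(grid, key=lambda k: dml_loglik(
    x, dt, mu_fn=lambda y, k=k: -k * y, sigma2_fn=lambda y: sigma_true ** 2))
```

Because the sampling here is fine (Δ = 0.01), the discretization bias is small; the next slide explains why this cannot be taken for granted at coarser sampling.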


SLIDE 12

Consistency

◮ The DMLE is generally NOT consistent.
◮ Approximate ML estimators (13.5) are, provided enough computational resources are allocated:
  ◮ simulation based estimators,
  ◮ Fokker-Planck based estimators,
  ◮ series expansions.
◮ GMM-type estimators (13.6) are consistent if the moments are correctly specified (which is a non-trivial problem!)


SLIDE 15

Simulation based estimators

◮ Discretely observed SDEs are Markov processes.
◮ It then follows that

p_\theta(x_t \mid x_s) = E_\theta\!\left[ p_\theta(x_t \mid x_\tau) \mid \mathcal{F}(s) \right], \quad t > \tau > s   (9)

This is the Pedersen algorithm.
◮ Improved by Durham-Gallant (2002) and Lindström (2012).
◮ Works very well for multivariate models!
◮ And is easily (...) extended to Lévy driven SDEs.
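A Monte Carlo sketch of the Pedersen idea in eq. (9): simulate Euler paths from x_s up to the penultimate sub-time τ, then average the one-step Gaussian transition density to x_t. It is checked on arithmetic Brownian motion, where the Euler scheme is exact and the true transition density is Gaussian; M, K and the parameter values are illustrative choices:

```python
import math
import random

def norm_pdf(x, m, v):
    """Density of N(m, v) evaluated at x."""
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def pedersen_density(xs, xt, t, mu, sigma, M, K, rng):
    """Pedersen estimate of p(x_t | x_s): simulate K-1 inner Euler steps per path,
    then average the closing one-step Gaussian density (Monte Carlo form of eq. (9))."""
    dt = t / K
    total = 0.0
    for _ in range(M):
        x = xs
        for _ in range(K - 1):
            x += mu(x) * dt + sigma(x) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += norm_pdf(xt, x + mu(x) * dt, sigma(x) ** 2 * dt)
    return total / M

# Check on ABM dX = mu dt + sigma dW, where p(x_t | x_s) is known exactly.
rng = random.Random(11)
mu_c, sig_c = 0.5, 0.4
p_hat = pedersen_density(0.0, 0.3, 1.0, lambda x: mu_c, lambda x: sig_c,
                         M=20000, K=10, rng=rng)
p_exact = norm_pdf(0.3, mu_c * 1.0, sig_c ** 2 * 1.0)
```

The raw estimator is noisy, which is exactly what the variance-reduction techniques on the next slide address.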


SLIDE 19

Some key points

◮ A naive implementation only provides a pointwise estimate - use CRNs or importance sampling.
◮ Variance reduction helps (antithetic variates, control variates).
◮ The near-optimal importance sampler is a bridge process, as it reduces variance AND improves the asymptotics.
◮ There is a version that is completely bias free, albeit somewhat restrictive in terms of the class of feasible models.
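The bridge-process sampler itself is more involved, but the variance-reduction point can be illustrated with antithetic variates on the smallest possible example: estimating the expected terminal value of a geometric Brownian motion (exactly simulable from a single Gaussian draw). The GBM model and all parameter values are illustrative assumptions:

```python
import math
import random
import statistics

def gbm_terminal(z, s0=1.0, mu=0.05, sigma=0.2, T=1.0):
    """Exact terminal value of GBM given W(T) = sqrt(T) * z."""
    return s0 * math.exp((mu - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)

rng = random.Random(5)
M = 10000
plain, anti = [], []
for _ in range(M):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    plain.append(0.5 * (gbm_terminal(z1) + gbm_terminal(z2)))   # two independent draws
    anti.append(0.5 * (gbm_terminal(z1) + gbm_terminal(-z1)))   # antithetic pair
var_plain = statistics.variance(plain)
var_anti = statistics.variance(anti)
```

Both estimators are unbiased for E[S(T)] = s0·e^{μT}, but the antithetic pairing cancels the leading (monotone) dependence on the Gaussian draw, so the same budget of random numbers yields a much smaller variance.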

SLIDE 20

Fokker-Planck

Consider the expectation

E[h(x(t)) \mid \mathcal{F}(0)] = \int h(x(t))\, p(x(t) \mid x(0))\, dx(t)   (10)

and then

\frac{\partial}{\partial t} E[h(x(t)) \mid \mathcal{F}(0)]   (11)

There are two possible ways to compute this: directly, and using the Itō formula. Equating these yields

\frac{\partial p}{\partial t}(x(t) \mid x(0)) = \mathcal{A}^\star p(x(t) \mid x(0))   (12)

where

\mathcal{A}^\star p(x(t)) = -\frac{\partial}{\partial x(t)}\!\left( \mu(\cdot)\, p(x(t)) \right) + \frac{1}{2} \frac{\partial^2}{\partial x^2(t)}\!\left( \sigma^2(\cdot)\, p(x(t)) \right).   (13)
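A rough explicit finite-difference sketch of eqs. (12)-(13) for a hypothetical Ornstein-Uhlenbeck model (μ(x) = −κx, σ constant), where the exact transition density is Gaussian and can serve as a check. Grid and step sizes are illustrative, and the explicit scheme is only conditionally stable; production solvers would use an implicit or Padé-type scheme as on the following slides:

```python
import math

def ou_exact(x, x0, t, kappa, sigma):
    """Exact OU transition density: N(x0*exp(-kappa*t), sigma^2*(1-exp(-2*kappa*t))/(2*kappa))."""
    m = x0 * math.exp(-kappa * t)
    v = sigma ** 2 * (1 - math.exp(-2 * kappa * t)) / (2 * kappa)
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

kappa, sigma, x0 = 1.0, 0.3, 0.5
L, J = 2.0, 201                          # domain [-2, 2] with J grid points
dx = 2 * L / (J - 1)
xs = [-L + j * dx for j in range(J)]
t0, T, dt = 0.1, 0.5, 0.0005             # start from the exact density at t0, march to T
p = [ou_exact(x, x0, t0, kappa, sigma) for x in xs]
t = t0
while t < T - 1e-12:
    q = p[:]
    for j in range(1, J - 1):
        # central differences for -d/dx(mu*p) and (sigma^2/2)*d2p/dx2, cf. eq. (13)
        drift = -((-kappa * xs[j + 1]) * q[j + 1] - (-kappa * xs[j - 1]) * q[j - 1]) / (2 * dx)
        diff = 0.5 * sigma ** 2 * (q[j + 1] - 2 * q[j] + q[j - 1]) / dx ** 2
        p[j] = q[j] + dt * (drift + diff)
    t += dt

err = max(abs(p[j] - ou_exact(xs[j], x0, T, kappa, sigma)) for j in range(J))
```

For likelihood evaluation one solves this PDE once per observation pair and reads off the transition density, which is why the approach is attractive in low dimensions.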


SLIDE 23

Example of the Fokker-Planck equation

From (Lindström, 2007)

Figure: Fokker-Planck equation computed for the CKLS process (surface plot of p(s, x_s; t, x_t) over the time and state grid)

SLIDE 24

Comments on the PDE approach

Generally better than the Monte Carlo method in low-dimensional problems.

Figure: Comparing Monte Carlo, 2nd order and 4th order numerical approximations of the Fokker-Planck equation (log-log plot of MAE versus computation time for the Durham-Gallant, Poulsen, Order2-Padé(1,1) and Order4-Padé(2,2) methods)

SLIDE 25

Discussion

◮ Fokker-Planck is the preferred method if the state space is non-trivial (see Pedersen et al., 2011).
◮ Successfully used in 1-d and 2-d problems,
◮ but the "curse of dimensionality" will eventually make the method infeasible.


SLIDE 28

Series expansion

◮ The solution to the Fokker-Planck equation when

dX(t) = \mu\,dt + \sigma\,dW(t)   (14)

is p(x(t) \mid x(0)) = N(x(t);\, x(0) + \mu t,\, \sigma^2 t).
◮ Hermite polynomials are the orthogonal polynomial basis when using a Gaussian weight function.
◮ This is used in the 'series expansion approach', see e.g. (Aït-Sahalia, 2002, 2008).


SLIDE 31

Key ideas

Transform from X → Y → Z, where Z is approximately standard Gaussian. We assume that

dX(t) = \mu(X(t))\,dt + \sigma(X(t))\,dW(t)   (15)

First step (the Lamperti transform):

Y(t) = \int^{X(t)} \frac{du}{\sigma(u)}   (16)

It then follows that

dY(t) = \mu_Y(Y(t))\,dt + dW(t)   (17)

Second step: standardize the increments,

Z(t_k) = \frac{Y(t_k) - Y(t_{k-1})}{\sqrt{t_k - t_{k-1}}}.   (18)
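The two transforms can be checked numerically on geometric Brownian motion, a convenient illustrative choice because the Lamperti transform has closed form: for σ(x) = σx, Y(t) = ∫^{X(t)} du/(σu) = log X(t)/σ, so dY = (μ/σ − σ/2) dt + dW and the standardized increments Z are (up to a small drift) standard Gaussian. Parameter values are made up:

```python
import math
import random
import statistics

# Hypothetical GBM dX = mu*X dt + sigma*X dW, simulated exactly.
rng = random.Random(2)
mu, sigma, dt, N = 0.1, 0.25, 0.01, 50000
x = [1.0]
for _ in range(N):
    x.append(x[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt
                              + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)))

# First step: Lamperti transform Y = log(X)/sigma, cf. eq. (16).
y = [math.log(v) / sigma for v in x]
# Second step: standardized increments Z, cf. eq. (18).
z = [(y[k] - y[k - 1]) / math.sqrt(dt) for k in range(1, N + 1)]
z_sd = statistics.stdev(z)
```

The unit-variance check on Z is exactly what makes the Gaussian weight function in the Hermite expansion on the next slides a good leading-order approximation.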


SLIDE 35

Expansion

A Hermite expansion for the density p_Z at order J is given by

p_Z^J(z \mid y(0), t_k - t_{k-1}) = \phi(z) \sum_{j=0}^{J} \eta_j(t_k - t_{k-1}, y_0)\, H_j(z)   (19)

where

H_j(z) = e^{z^2/2} \frac{d^j}{dz^j} e^{-z^2/2}.   (20)

The coefficients are computed by projecting the density onto the basis functions H_j(z) (recall Hilbert space theory):

\eta_j(t, y_0) = \frac{1}{j!} \int H_j(z)\, p_Z^J(z \mid y(0), t_k - t_{k-1})\, dz   (21)
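The polynomials in eq. (20) satisfy H_j(z) = (−1)^j He_j(z), where He_j are the probabilists' Hermite polynomials, so they can be generated by the standard recursion He_{n+1}(z) = z·He_n(z) − n·He_{n−1}(z). The sketch below does that and numerically verifies the orthogonality relation ∫ H_j H_k φ dz = j!·1{j=k} that underlies the projection formula (21); the quadrature settings are illustrative:

```python
import math

def hermite_H(j, z):
    """H_j(z) = exp(z^2/2) d^j/dz^j exp(-z^2/2), eq. (20),
    computed as (-1)^j * He_j(z) via the probabilists' recursion."""
    if j == 0:
        return 1.0
    he_prev, he = 1.0, z                    # He_0, He_1
    for n in range(1, j):                   # He_{n+1} = z*He_n - n*He_{n-1}
        he_prev, he = he, z * he - n * he_prev
    return (-1) ** j * he

def phi(z):
    """Standard Gaussian weight function."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def inner(j, k, a=-8.0, b=8.0, n=4000):
    """Trapezoid approximation of int H_j(z) H_k(z) phi(z) dz."""
    h = (b - a) / n
    s = 0.0
    for i in range(n + 1):
        z = a + i * h
        w = 0.5 if i in (0, n) else 1.0
        s += w * hermite_H(j, z) * hermite_H(k, z) * phi(z)
    return s * h
```

The 1/j! in eq. (21) is precisely the normalization ∫ H_j² φ dz = j! appearing in this check.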


SLIDE 37

Practical concerns

◮ The series expansion can be extremely accurate.
◮ The standard approach is to compute the η_j by Taylor expansion up to order (t_k − t_{k−1})^K = (Δt)^K.
◮ Some restrictions (so-called 'reducible diffusions') apply when using the method for multivariate diffusions.

SLIDE 38

Other alternatives - GMM/EF

What about non-likelihood methods?

◮ The model is governed by some p-dimensional parameter.
◮ Suppose some set of features is important, h_l(x), l = 1, ..., q ≥ p.
◮ Compute

f(x(t); \theta) = \begin{pmatrix} h_1(x(t)) - E_\theta[h_1(X_t)] \\ \vdots \\ h_q(x(t)) - E_\theta[h_q(X_t)] \end{pmatrix}   (22)

and form

J_N(\theta) = \left( \frac{1}{N} \sum_{n=1}^{N} f(x(n); \theta) \right)^T W \left( \frac{1}{N} \sum_{n=1}^{N} f(x(n); \theta) \right)   (23)
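A minimal instance of eqs. (22)-(23): for ABM increments d = X(t_{n+1}) − X(t_n), take the illustrative moment functions h_1(d) = d and h_2(d) = d², whose model-implied expectations are μδ and σ²δ + (μδ)², with identity weighting W. All names and parameter values below are assumptions for the sketch:

```python
import math
import random

def gmm_objective(incr, delta, theta, W=None):
    """J_N(theta) of eq. (23) with moment functions h_1(d)=d, h_2(d)=d^2
    applied to ABM increments, and identity weighting by default."""
    mu, s2 = theta
    m1 = mu * delta                          # E_theta[d]
    m2 = s2 * delta + (mu * delta) ** 2      # E_theta[d^2]
    N = len(incr)
    g1 = sum(d - m1 for d in incr) / N       # sample moment conditions, eq. (22)
    g2 = sum(d * d - m2 for d in incr) / N
    g = (g1, g2)
    if W is None:
        W = ((1.0, 0.0), (0.0, 1.0))
    return sum(g[i] * W[i][j] * g[j] for i in range(2) for j in range(2))

rng = random.Random(9)
mu0, s20, delta = 1.0, 0.09, 0.01
incr = [mu0 * delta + math.sqrt(s20 * delta) * rng.gauss(0.0, 1.0)
        for _ in range(50000)]
j_true = gmm_objective(incr, delta, (mu0, s20))   # near zero at the true parameter
j_off = gmm_objective(incr, delta, (5.0, 0.5))    # large at a wrong parameter
```

The "correctly specified moments" caveat from the consistency slide shows up here: if m1 or m2 were derived from the wrong transition moments, J_N would be minimized at a biased parameter.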


SLIDE 41

GMM

The Generalized Method of Moments (GMM) estimator is then given by

\hat\theta = \arg\min J_N(\theta)   (24)

It can be shown that

\sqrt{N} \left( \hat\theta_N - \theta_0 \right) \to N(0, \Sigma)   (25)

where

\Sigma = \left( \Gamma_N^T\, \Omega_N^{-1}\, \Gamma_N \right)^{-1}.   (26)

and Γ_N and Ω_N are estimates of

\Gamma = E\!\left[ \frac{\partial f(x, \theta)}{\partial \theta^T} \right], \quad \Omega = \mathrm{Var}[f(x, \theta)].   (27)
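The minimization in eq. (24) can be sketched end to end for ABM increments with the moment functions h_1(d) = d and h_2(d) = d²; a crude grid search stands in for a proper numerical optimizer, and all parameter values are made up for the illustration:

```python
import math
import random

rng = random.Random(4)
mu0, s20, delta, N = 1.0, 0.04, 0.01, 40000
incr = [mu0 * delta + math.sqrt(s20 * delta) * rng.gauss(0.0, 1.0)
        for _ in range(N)]

# Precompute the sample moments once; J_N then only depends on theta.
m1 = sum(incr) / N
m2 = sum(d * d for d in incr) / N

def jN(mu, s2):
    """Identity-weighted GMM objective, eqs. (23)-(24), for ABM increments."""
    g1 = m1 - mu * delta
    g2 = m2 - (s2 * delta + (mu * delta) ** 2)
    return g1 * g1 + g2 * g2

# theta_hat = argmin J_N(theta) over a coarse grid, eq. (24).
mu_grid = [m / 10 for m in range(-20, 41)]       # mu in -2.0 .. 4.0
s2_grid = [s / 1000 for s in range(10, 101)]     # sigma^2 in 0.010 .. 0.100
mu_hat, s2_hat = min(((m, s) for m in mu_grid for s in s2_grid),
                     key=lambda t: jN(t[0], t[1]))
```

With q = p = 2 the model is exactly identified and the choice of W is immaterial; with q > p the efficient choice W = Ω^{-1} yields the asymptotic covariance in eq. (26).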
