SLIDE 1 Lecture on Stochastic Differential Equations
Erik Lindström
SLIDE 6
Motivation
◮ Continuous time models are more 'interpretable'
than discrete time models, at least if you have a background in science or engineering.
◮ It is often argued that continuous time models
need fewer parameters compared to discrete time models, as the parameters often can be given an interpretation.
◮ Consistent with option valuation due to path
wise properties.
◮ Integration between time scales (e.g. irregularly
sampled data)
◮ Heteroscedasticity is easily integrated into the
models.
SLIDE 9
ODEs in physics
Physics is often modelled as (a system of) ordinary differential equations
dX(t)/dt = µ(X(t)) (1)
Similar models are found in finance
Bond
dB(t)/dt = rB(t)
Stock
dS(t)/dt = (µ + "noise"(t))S(t), cf. RCAR
CAPM
dS(t)/dt = (r + βσ + σ"noise"(t))S(t)
SLIDE 15
Noise processes
The noise process should ideally be the time derivative of a random walk. Examples of continuous time processes (see Chapter 7.5)
◮ Brownian motion W(t)
◮ Poisson process N(t), or (N(t) − λt)
◮ Compound Poisson process S(t) = ∑_{n=1}^{N(t)} Y_n
◮ Note that N(t) = ∑_{n=1}^{N(t)} 1
◮ Lévy process L(t)
SLIDE 16
Wiener process aka Standard Brownian Motion
A process satisfying the following conditions is a Standard Brownian Motion
◮ W(0) = 0 with probability 1.
◮ The increments W(u) − W(t), W(s) − W(0) with u > t ≥ s > 0 are independent.
◮ The increment W(t) − W(s) ∼ N(0, t − s)
◮ The process has continuous trajectories.
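A minimal simulation sketch (my own illustration, assuming Python with NumPy): the defining properties translate directly into independent N(0, ∆t) increments, cumulatively summed from W(0) = 0.

import numpy as np

def simulate_brownian_motion(T=1.0, n=1000, rng=None):
    """Simulate a standard Brownian motion on [0, T] at n equidistant steps."""
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n
    dW = rng.normal(0.0, np.sqrt(dt), size=n)     # independent increments ~ N(0, dt)
    W = np.concatenate(([0.0], np.cumsum(dW)))    # W(0) = 0 with probability 1
    t = np.linspace(0.0, T, n + 1)
    return t, W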
SLIDE 17
Time derivative of the Wiener process
Study the object
ξ_h = (W(t + h) − W(t))/h (2)
(Think dW(t)/dt = lim_{h→0} ξ_h). Compute
◮ E[ξ_h] = 0
◮ Var[ξ_h] = (t + h − t)/h² = 1/h → ∞ as h → 0
The limit does not converge in mean square sense!
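A short numerical illustration (Python/NumPy sketch): the sample variance of ξ_h grows like 1/h as h shrinks, which is exactly why the mean-square limit fails.

import numpy as np

rng = np.random.default_rng(1)
n_samples = 100_000
for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    # xi_h = (W(t + h) - W(t)) / h, and the increment W(t + h) - W(t) ~ N(0, h)
    xi = rng.normal(0.0, np.sqrt(h), n_samples) / h
    print(f"h = {h:.0e}: sample var = {xi.var():.1f}, 1/h = {1 / h:.1f}")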
SLIDE 21 Re-interpreting ODEs
In physics,
dX(t)/dt = µ(X(t)) (3)
really means
dX(t) = µ(X(t))dt (4)
∫_0^t dX(s) = X(t) − X(0) = ∫_0^t µ(X(s))ds (5)
NOTE: No derivatives needed!
SLIDE 23
Stochastic differential equations
Interpret
dX/dt = (µ(X(t)) + "noise"(t)) (6)
as
X(t) − X(0) ≈ ∫_0^t (µ(X(s)) + "noise"(s)) ds (7)
≈ ∫_0^t µ(X(s))ds + ∫_0^t σ(X(s)) (dv/ds) ds (8)
The mathematically correct approach is to define Stochastic Differential Equations as
X(t) − X(0) = ∫ µ(X(s))ds + ∫ σ(X(s))dW(s) (9)
SLIDE 25 Integrals
The
∫ µ(X(s))ds (10)
integral is an ordinary Riemann integral, whereas the
∫ σ(X(s))dW(s) (11)
integral is an Itô integral.
SLIDE 27 The Itô integral
The Itô integral is defined (for a piece-wise constant integrand σ(s, ω)) as
∫_a^b σ(s, ω)dW(s) = ∑_{k=0}^{n−1} σ(t_k, ω)(W(t_{k+1}) − W(t_k)). (12)
General functions are approximated by piece-wise constant functions, while letting the discretization tend to zero. The limit is computed in L2(P) sense.
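A minimal sketch (Python/NumPy; the function name is my own) of the left-endpoint construction in (12). As an informal check, ∫_0^T W dW computed this way should be close to (W(T)² − T)/2.

import numpy as np

def ito_integral(sigma_path, W):
    """Left-point approximation of an Ito integral on a grid.

    sigma_path[k] is the (adapted) integrand at t_k, W[k] the Brownian path at t_k.
    """
    dW = np.diff(W)                      # increments W(t_{k+1}) - W(t_k)
    return np.sum(sigma_path[:-1] * dW)  # integrand frozen at the left endpoint

# Informal check with integrand sigma(s, omega) = W(s):
rng = np.random.default_rng(0)
T, n = 1.0, 100_000
W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(T / n), n))))
print(ito_integral(W, W), (W[-1] ** 2 - T) / 2)   # the two numbers should be close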
SLIDE 29
Properties
Stochastic integrals are martingales. Definition: A stochastic process {X(t), t ≥ 0} is called a martingale with respect to a filtration {F(t)}t≥0 if
◮ X(t) is F(t)-measurable for all t ◮ E [|X(t)|] < ∞ for all t, and ◮ E [X(t)|F(s)] = X(s) for all s ≤ t.
Proof:
E[X(t)|F(s)] = E[X(s) + (X(t) − X(s))|F(s)] (13)
= X(s) + E[∫ σ(u, ω)dW(u)|F(s)] (14)
= X(s) + E[E[∑_{k=0}^{n−1} σ(t_k, ω)(W(t_{k+1}) − W(t_k))|F(t_k)]|F(s)] = X(s) (15)
SLIDE 30 Other properties (Theorem 7.1)
◮ Stochastic integrals are linear operators ◮ The unconditional expectation of a stochastic
integral is zero
◮ Stochastic integrals are measurable wrt the
Filtration of the driving Brownian motion
◮ The Itô isometry is useful when computing the covariance
E[(∫ σ(s)dW(s))²] = ∫ E[σ²(s)] ds (16)
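A Monte Carlo sketch (Python/NumPy; my own illustration) of the isometry (16) for the deterministic integrand σ(s) = s on [0, T], where the right-hand side is ∫_0^T s² ds = T³/3.

import numpy as np

rng = np.random.default_rng(2)
T, n, n_paths = 1.0, 500, 20_000
dt = T / n
t_left = np.arange(n) * dt                     # left endpoints t_0, ..., t_{n-1}
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
integrals = dW @ t_left                        # left-point Ito sums, one per path
print(np.mean(integrals ** 2), T ** 3 / 3)     # both should be close to 1/3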
SLIDE 32
Interpretation
It can be shown that
◮ The drift is given by
µ(t, X(t)) = lim_{h→0} (1/h) E[X(t + h) − X(t)] (17)
◮ While the squared diffusion is given by
σσ^T(t, X(t)) = lim_{h→0} (1/h) Var[X(t + h) − X(t)] (18)
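A small simulation sketch (Python/NumPy; the constant-coefficient model and parameter values are my own choice) illustrating (17)-(18): for dX = µ dt + σ dW, the scaled mean and variance of the increments recover µ and σ².

import numpy as np

rng = np.random.default_rng(3)
mu, sigma, h, n = 0.5, 0.8, 1e-3, 1_000_000
dX = mu * h + sigma * rng.normal(0.0, np.sqrt(h), n)   # increments X(t + h) - X(t)
print(dX.mean() / h, mu)          # (1/h) E[X(t+h) - X(t)] should be close to the drift
print(dX.var() / h, sigma ** 2)   # (1/h) Var[X(t+h) - X(t)] should be close to sigma^2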
SLIDE 33
Simple Monte Carlo simulation
The system
X(t) − X(0) = ∫ µ(X(s))ds + ∫ σ(X(s))dW(s) (19)
can be simulated through the Euler-Maruyama scheme, see Chap 12 in the book. The scheme is given by
X((n + 1)h) = X(nh) + µ(X(nh))h + σ(X(nh))(W((n + 1)h) − W(nh)). (20)
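A minimal sketch of the Euler-Maruyama scheme (20) in Python/NumPy; the geometric-Brownian-motion example and its parameter values are illustrative choices.

import numpy as np

def euler_maruyama(mu, sigma, x0, T, n, rng=None):
    """Simulate X on [0, T] with n Euler-Maruyama steps of size h = T / n."""
    rng = np.random.default_rng() if rng is None else rng
    h = T / n
    X = np.empty(n + 1)
    X[0] = x0
    for k in range(n):
        dW = rng.normal(0.0, np.sqrt(h))                    # W((k+1)h) - W(kh) ~ N(0, h)
        X[k + 1] = X[k] + mu(X[k]) * h + sigma(X[k]) * dW   # scheme (20)
    return X

# Example: dS = 0.05 S dt + 0.2 S dW, one year of daily steps
path = euler_maruyama(lambda x: 0.05 * x, lambda x: 0.2 * x, x0=100.0, T=1.0, n=252)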
SLIDE 34
Continuous time volatility
◮ We can compute the volatility in a continuous
time model.
◮ Advantage: A continuous time model can use
data from any time scale, and does not assume that data is equidistantly sampled.
◮ Can derive a limit theory when data is sampled
at high frequency.
◮ This is based on the general theory on quadratic
variation.
SLIDE 35
Quadratic variation
◮ Let {S} be a general semimartingale.
◮ Let π_N = {0 = τ_0 < τ_1 < . . . < τ_N = T} be a partition of [0, T], and denote ∆ = τ_n − τ_{n−1}, where ∆ = T/N.
◮ Define
Q_N = ∑_{n=1}^{N} (S(τ_n) − S(τ_{n−1}))².
◮ What are the properties of Q_N?
◮ Q_N converges to the quadratic variation.
SLIDE 37
Quadratic variation, cont
Let S_t = σW_t.
◮ Then
Q_N = ∑_{n=1}^{N} (S(τ_n) − S(τ_{n−1}))².
◮ Note that (S(τ_n) − S(τ_{n−1}))² ∼ σ²∆χ²(1).
◮ Remember E[χ²(p)] = p, V[χ²(p)] = 2p.
◮ What are the properties of Q_N?
◮ E[Q_N] = σ²∆E[χ²(N)] = σ²∆N = σ²T.
◮ V[Q_N] = (σ²∆)² V[χ²(N)] = (σ⁴T²/N²) 2N → 0
◮ Chebyshev's inequality then states that Q_N → σ²T in probability.
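A quick numerical check (Python/NumPy sketch; σ and T are illustrative values) that Q_N concentrates around σ²T as N grows when S_t = σW_t.

import numpy as np

rng = np.random.default_rng(4)
sigma, T = 0.2, 1.0
for N in [10, 100, 1_000, 10_000]:
    dS = sigma * rng.normal(0.0, np.sqrt(T / N), N)   # increments S(tau_n) - S(tau_{n-1})
    QN = np.sum(dS ** 2)
    print(f"N = {N:>5}: Q_N = {QN:.5f}   (sigma^2 T = {sigma ** 2 * T:.5f})")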
SLIDE 38 Quadratic variation of daily log returns for the Black-Scholes model
[Figure: quadratic variation plot]
SLIDE 40
Quadratic variation, cont
◮ For a diffusion process
dX_t = µ(t, X_t)dt + σ(t, X_t)dW_t, the quadratic variation converges to Q_N → ∫ σ²(s, X_s)ds.
◮ For a jump diffusion
dX_t = µ(t, X_t)dt + σ(t, X_t)dW_t + dZ_t, where {Z} is a compound Poisson process with counting process N_t and random jump sizes J_i, the quadratic variation yields
Q_N → ∫ σ²(s, X_s)ds + ∑_{i=1}^{N_t} J_i².
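A small simulation sketch (Python/NumPy; the parameter values and Gaussian jump sizes are my own choices) showing the split of Q_N into integrated squared diffusion plus the sum of squared jumps.

import numpy as np

rng = np.random.default_rng(6)
sigma, lam, T, N = 0.2, 5.0, 1.0, 100_000
dt = T / N
dX = sigma * rng.normal(0.0, np.sqrt(dt), N)              # diffusive increments
n_jumps = rng.poisson(lam * T)                            # number of jumps N_t
jumps = rng.normal(0.0, 0.1, n_jumps)                     # jump sizes J_i
dX[rng.choice(N, size=n_jumps, replace=False)] += jumps   # each jump lands in one increment
print(np.sum(dX ** 2), sigma ** 2 * T + np.sum(jumps ** 2))   # these should be close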
SLIDE 41
Realized variation
◮ The quadratic (realized) variation is estimated as
QV_N = ∑_{n=1}^{N} (S(τ_n) − S(τ_{n−1}))².
◮ The Bipower variation is estimated as
BPV_N = (π/2) ∑_{n=1}^{N} |S(τ_{n+1}) − S(τ_n)||S(τ_n) − S(τ_{n−1})|.
◮ It can be shown that the Bipower variation converges to BPV_N → ∫ σ²(s, X_s)ds, for a jump diffusion process (and even for a general semimartingale).
◮ The difference between the realized variation and Bipower variation is used to estimate the size of the jump component.
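A hedged sketch (Python/NumPy; names, parameter values and the artificial jumps are my own) of the two estimators applied to a vector of increments, with QV − BPV picking up the jump contribution.

import numpy as np

def realized_variation(r):
    """QV_N: sum of squared increments."""
    return np.sum(r ** 2)

def bipower_variation(r):
    """BPV_N: (pi / 2) times the sum of products of consecutive absolute increments."""
    return (np.pi / 2) * np.sum(np.abs(r[1:]) * np.abs(r[:-1]))

# Illustration: simulated Black-Scholes-style daily log returns with three added jumps
rng = np.random.default_rng(5)
r = 0.01 * rng.normal(size=500)                       # diffusive part
r[rng.choice(500, size=3, replace=False)] += 0.05     # jumps
qv, bpv = realized_variation(r), bipower_variation(r)
print(qv, bpv, qv - bpv)   # QV - BPV roughly estimates the sum of squared jumps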
SLIDE 42 Example: Realised variation for daily log return of Black-Scholes
[Figure: QV, BPV, and QV − BPV (jumps?) for the simulated daily log returns]
SLIDE 43 Example: Realised variation for daily log return of OMXS30
[Figure: QV, BPV, and QV − BPV (jumps?) for OMXS30 daily log returns, 1995-2010]
SLIDE 46
Practical considerations
◮ Theory suggests that ∆ → 0 would be a good thing.
◮ Practice suggests otherwise, cf. stylized facts.
◮ The problem is market microstructure noise.
◮ There are several strategies for correcting for this.
SLIDE 51 Solving SDEs
Generally rather difficult... Use the definitions if possible. The Itô formula states that if
dX(t) = µ(X(t))dt + σ(X(t))dW(t) (21)
Y(t) = F(t, X(t)), F ∈ C^{1,2} (22)
then the Itô formula applies
dY(t) = (F_t + µF_X + (1/2)σσ^T F_XX) dt + σF_X dW(t) (23)
where the dependence on X(t) is suppressed and F_t = ∂F/∂t, F_X = ∂F/∂X, . . .
"Proof": Essentially Taylor expansions, and using that X and hence Y is continuous.
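Worked example (added here, not from the slides): for geometric Brownian motion dS(t) = µS(t)dt + σS(t)dW(t), take F(t, x) = ln x. Then F_t = 0, F_X = 1/S(t), F_XX = −1/S(t)², and (23) gives
d ln S(t) = (µ − (1/2)σ²)dt + σdW(t),
so that S(t) = S(0) exp((µ − (1/2)σ²)t + σW(t)).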