Estimating Models Based on Markov Jump Processes Given Fragmented Observation Series - PowerPoint PPT Presentation



SLIDE 1

Estimating Models Based on Markov Jump Processes Given Fragmented Observation Series

Markus Hahn

Johann Radon Institute for Computational and Applied Mathematics (RICAM), Linz, Austrian Academy of Sciences
Joint work with S. Frühwirth-Schnatter (JKU Linz) and J. Sass (TU Kaiserslautern)

Linz, December 2, 2008

Work funded by FWF project P17947

SLIDE 2

Introduction

Problem

◮ Estimation from a set of observed series
◮ Independent series
◮ Data with breaks
◮ Each single series is (based on) a Markov process
◮ Same generator for all series
◮ How to estimate the common generator?
◮ How to cope with short observation series?

2 / 37

SLIDE 3

Introduction

Outline

Introduction
Markov Jump Processes
Merged Markov Jump Processes
Inference for Merged Markov Jump Processes
Generalization: Markov Switching Models
Conclusion

SLIDE 4

Markov Jump Processes

Finite state Markov jump process (MJP)

◮ Y = (Y_t)_{t∈[0,T)} is a continuous-time Markov process
◮ Finite state space {1, …, d}
◮ Y is time homogeneous
◮ Jumps of Y are governed by the rate matrix Q ∈ R^{d×d}
◮ Exponential rate of leaving state k:

  λ_k = −Q_kk = Σ_{l≠k} Q_kl < ∞,

  i.e. the average waiting time for leaving k is 1/λ_k
◮ Conditional transition probability:

  P(Y_t = l | Y_{t−} = k, Y_t ≠ Y_{t−}) = Q_kl / λ_k

SLIDE 5

Markov Jump Processes

Inference about rate matrix

◮ O_k: occupation time of state k
◮ N_kl: number of jumps from k to l
◮ Maximum likelihood estimation:

  Q̂_kl = N_kl / O_k

◮ Observing a path (Y_t)_{t∈[0,T)}, the O_k and N_kl are sufficient for estimating Q_kl
◮ Unbiased? No:

  E(Q̂_kl | Q) = ∞,  E(Q̂_kl | Q, O_k > 0) = ∞

◮ Q̂ is consistent, i.e. lim_{T→∞} P(|Q̂_kl − Q_kl| > ε) = 0
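These statistics are straightforward to compute from a fully observed path. A minimal sketch (illustrative code, not from the talk): simulate one MJP path by drawing exponential holding times, then form the MLE Q̂_kl = N_kl/O_k. States are 0-based here.

```python
import numpy as np

def simulate_mjp(Q, T, y0, rng):
    """Simulate one path of a finite-state MJP with rate matrix Q on [0, T).

    Returns the jump times (starting at 0.0) and the states entered.
    """
    d = Q.shape[0]
    times, states = [0.0], [y0]
    t, y = 0.0, y0
    while True:
        lam = -Q[y, y]                       # rate of leaving state y
        t += rng.exponential(1.0 / lam)      # exponential holding time
        if t >= T:
            break
        p = Q[y].copy()
        p[y] = 0.0
        y = int(rng.choice(d, p=p / lam))    # next state with prob Q[y, l] / lambda_y
        times.append(t)
        states.append(y)
    return times, states

def mle_rate_matrix(times, states, T, d):
    """MLE Q_hat[k, l] = N_kl / O_k from a fully observed path."""
    O = np.zeros(d)                          # occupation times O_k
    N = np.zeros((d, d))                     # jump counts N_kl
    for i, (t, y) in enumerate(zip(times, states)):
        t_next = times[i + 1] if i + 1 < len(times) else T
        O[y] += t_next - t
        if i + 1 < len(times):
            N[y, states[i + 1]] += 1
    Qhat = N / O[:, None]
    np.fill_diagonal(Qhat, 0.0)
    np.fill_diagonal(Qhat, -Qhat.sum(axis=1))
    return Qhat, N, O

rng = np.random.default_rng(0)
Q = np.array([[-50.0, 30.0, 20.0],
              [20.0, -40.0, 20.0],
              [30.0, 40.0, -70.0]])
times, states = simulate_mjp(Q, T=50.0, y0=0, rng=rng)
Qhat, N, O = mle_rate_matrix(times, states, T=50.0, d=3)
```

With a long observation window (T = 50 against rates of order 50) the MLE is close to the true Q; the slides' point is precisely that this breaks down when T is small.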

SLIDE 6

Markov Jump Processes

Some Bayesian inference

◮ Using the uninformative prior Q_kl ∼ Ga(1, 0):

  Q_kl | Y ∼ Ga(N_kl + 1, O_k)

◮ Mean: (N_kl + 1)/O_k
◮ Variance: (N_kl + 1)/O_k², hence decreasing like 1/T
◮ Mode: N_kl/O_k

SLIDE 7

Markov Jump Processes

Remarks on inference

◮ Q̂ is well known and looks nice
◮ In fact, estimation of Q is not so easy
◮ We need some observations for each transition k to l
◮ Basically, T needs to be as large as possible

SLIDE 8

Merged Markov Jump Processes

Introduction
Markov Jump Processes
Merged Markov Jump Processes
Inference for Merged Markov Jump Processes
Generalization: Markov Switching Models
Conclusion

SLIDE 9

Merged Markov Jump Processes

Observing a number of series of MJPs

◮ M series of MJPs are observed
◮ All Y^(m) are characterized by the same rate matrix Q
◮ Series may be independent or come from data with breaks

[Figure: Processes Y^(1), …, Y^(M)]

SLIDE 10

Merged Markov Jump Processes

Observing merged MJPs

◮ The merged process of M single MJPs is observed
◮ The single MJPs are characterized by the same rate matrix Q

[Figure: Merged process Ȳ]

SLIDE 11

Merged Markov Jump Processes

Merged MJPs

◮ Given: observation process Ȳ = (Ȳ_t)_{t∈[0,T̄)}
◮ Ȳ is the concatenation of the MJPs Y^(1), …, Y^(M)
◮ In detail, Y^(m) = (Y^(m)_t)_{t∈[0,T)}, T̄ = M T, and

  Ȳ_t = Y^(1)_t if 0 ≤ t < T,
  Ȳ_t = Y^(2)_{t−T} if T ≤ t < 2T,
  …,
  Ȳ_t = Y^(m)_{t−(m−1)T} if (m−1)T ≤ t < mT

◮ All Y^(m) are characterized by the same rate matrix Q
◮ NB: Ȳ itself is not Markov!
◮ The assumption of equal length is for notational convenience only
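The concatenation can be sketched in code. Here each single path is stored as a (times, states) pair with times[0] = 0 (this representation is my choice, not from the slides):

```python
def merge_paths(paths, T):
    """Concatenate M single-MJP paths into the merged path on [0, M*T).

    `paths` is a list of (times, states) pairs with times[0] == 0.0; the
    m-th path is shifted by (m - 1)*T.  If a series starts in the state
    the previous one ended in, no jump is visible at the seam.
    """
    merged_t, merged_y = [], []
    for m, (times, states) in enumerate(paths):
        for t, y in zip(times, states):
            if merged_y and t == 0.0 and merged_y[-1] == y:
                continue                     # same state across the seam
            merged_t.append(m * T + t)
            merged_y.append(y)
    return merged_t, merged_y

# Two toy series on [0, 0.05): the seam 2 -> 2 is invisible in the merged path
t_bar, y_bar = merge_paths([([0.0, 0.02], [1, 2]), ([0.0, 0.03], [2, 3])], T=0.05)
```

Note that whenever the second series starts in a different state, the seam appears in the merged path as an extra jump that does not follow the rate matrix Q; this is exactly why Ȳ is not Markov.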

SLIDE 12

Merged Markov Jump Processes

Merged MJPs – Example

M = 5, d = 3, T = 0.05, T̄ = 0.25,

  Q = ( −50  30  20
         20 −40  20
         30  40 −70 )

[Figure: Merged process Ȳ and single processes Y^(1), …, Y^(M)]

SLIDE 13

Inference for Merged Markov Jump Processes

Introduction
Markov Jump Processes
Merged Markov Jump Processes
Inference for Merged Markov Jump Processes
Generalization: Markov Switching Models
Conclusion

SLIDE 14

Inference for Merged Markov Jump Processes

Splitting merged process

◮ Split Ȳ into Y^(1), …, Y^(M)
◮ Pooling:

  Q̂_kl = (1/M) Σ_{m=1}^M Q̂^(m)_kl

◮ Problem: the MLE does not exist if some O^(m)_k = 0; short occupation times lead to unstable results

SLIDE 15

Inference for Merged Markov Jump Processes

Estimating directly from merged process

◮ Consider Ȳ
◮ Number of jumps and occupation time:

  N̄_kl = Σ_{m=1}^M N^(m)_kl + N⁺_kl,

  where N⁺_kl = Σ_{m=1}^{M−1} 1{Y^(m)_{T−} = k, Y^(m+1)_0 = l},

  Ō_k = Σ_{m=1}^M O^(m)_k
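These counts can be sketched directly from the single series (same (times, states) representation as before, states 0-based; illustrative code):

```python
import numpy as np

def merged_statistics(paths, T, d):
    """Jump counts and occupation times of the merged process.

    Returns N_bar = sum_m N^(m) + N_plus, the seam counts N_plus, and
    O_bar[k] = sum_m O^(m)_k.  States are 0-based here.
    """
    N_bar = np.zeros((d, d))
    N_plus = np.zeros((d, d))
    O_bar = np.zeros(d)
    for m, (times, states) in enumerate(paths):
        for i, (t, y) in enumerate(zip(times, states)):
            t_next = times[i + 1] if i + 1 < len(times) else T
            O_bar[y] += t_next - t               # occupation time in state y
            if i + 1 < len(times):
                N_bar[y, states[i + 1]] += 1     # genuine jump within series m
        if m + 1 < len(paths):                   # seam between series m and m+1
            k, l = states[-1], paths[m + 1][1][0]
            if k != l:
                N_plus[k, l] += 1                # "artificial" jump at the seam
    return N_bar + N_plus, N_plus, O_bar
```

In the estimation problem below, only the sum N̄ = Σ N^(m) + N⁺ is observable; the decomposition into genuine and artificial jumps is what the correction has to reconstruct.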

SLIDE 16

Inference for Merged Markov Jump Processes

Estimating directly from merged process – first try

◮ First attempt: Q̃_kl = N̄_kl / Ō_k
◮ Problem: a bias of N⁺_kl / Ō_k caused by the “artificial” jumps
◮ If N⁺_kl were observed explicitly, things would be easy
◮ We assume N⁺_kl cannot be observed:
◮ Location of the splitting points unknown
◮ Process not directly observed

SLIDE 17

Inference for Merged Markov Jump Processes

Estimating directly from merged process – bias

◮ Assume Y^(m)_0 ∼ π, where π is the stationary distribution
◮ The extra jumps N⁺ affect neither the distribution of the occupation times nor the stationary distribution
◮ Hence π = π̄, and π̃ = π̃(Q̃) is an “unbiased” estimate of π
◮ Joining two independent stationary processes generates a jump from k to l with probability π_k π_l:

  N⁺_kl | π ∼ Bin(M − 1, π_k π_l)
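π can be computed from Q as the normalized left null vector; a small sketch (the least-squares formulation is one standard way to solve πQ = 0 with Σ_k π_k = 1; Q and M are taken from the earlier merged-MJP example):

```python
import numpy as np

def stationary_distribution(Q):
    """Stationary distribution: solve pi @ Q = 0 with pi summing to 1."""
    d = Q.shape[0]
    A = np.vstack([Q.T, np.ones((1, d))])    # stack the normalization onto pi Q = 0
    b = np.zeros(d + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

Q = np.array([[-50.0, 30.0, 20.0],
              [20.0, -40.0, 20.0],
              [30.0, 40.0, -70.0]])
pi = stationary_distribution(Q)
M = 5
# expected "artificial" jumps from k to l (k != l) over the M - 1 seams
E_Nplus = (M - 1) * np.outer(pi, pi)
```

Only the off-diagonal entries of E_Nplus correspond to visible seam jumps; a seam with equal states on both sides produces no jump.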

SLIDE 18

Inference for Merged Markov Jump Processes

Estimating directly from merged process – bias (2)

◮ N⁺_kl | π ∼ Bin(M − 1, π_k π_l) is justified if
◮ the Y^(m) are independent series, or
◮ data with breaks: the length τ of the break is such that ρ_τ(Y) is close to zero, where

  ρ_t(Y) = [ Σ_{k=1}^d π_k (X_kk(t) − π_k) ] / [ Σ_{k=1}^d π_k (1 − π_k) ]

  and X_kl(t) = P(Y_t = l | Y_0 = k) = (exp(Q t))_kl,

  i.e. X_kk(τ) should be close to π_k
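ρ_τ(Y) is easy to evaluate numerically; a sketch computing exp(Qt) via an eigendecomposition (this assumes Q is diagonalizable — scipy.linalg.expm would be the more robust choice; Q and its stationary distribution are from the earlier example):

```python
import numpy as np

def transition_matrix(Q, t):
    """X(t) = exp(Q t) via eigendecomposition (assumes Q diagonalizable)."""
    w, V = np.linalg.eig(Q * t)
    return (V @ np.diag(np.exp(w)) @ np.linalg.inv(V)).real

def rho(Q, pi, t):
    """Mixing coefficient rho_t(Y); values near 0 mean a break of length t
    effectively decouples the two sides of the seam."""
    X = transition_matrix(Q, t)
    num = np.sum(pi * (np.diag(X) - pi))
    den = np.sum(pi * (1.0 - pi))
    return num / den

Q = np.array([[-50.0, 30.0, 20.0],
              [20.0, -40.0, 20.0],
              [30.0, 40.0, -70.0]])
pi = np.array([20.0, 29.0, 14.0]) / 63.0     # stationary distribution of this Q
```

ρ_0(Y) = 1 (no mixing at all), and with jump rates of order 50 the coefficient is already negligible for breaks of length 1.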

SLIDE 19

Inference for Merged Markov Jump Processes

Estimating directly from merged process – bias (3)

◮ Note: if π is given, N⁺_kl and Ō_k are independent
◮ Recall: N⁺_kl | π ∼ Bin(M − 1, π_k π_l)
◮ As T̄/Ō_k → 1/π_k in probability as T̄ → ∞, we have

  N⁺_kl / Ō_k ≈ (M − 1) π_k π_l / (π_k T̄) = π_l (M − 1) / T̄

◮ As π̃ is an unbiased estimate of π, these quantities can be estimated knowing Q̃

SLIDE 20

Inference for Merged Markov Jump Processes

Estimating directly from merged process – correction

◮ Two-step construction of the corrected estimate:

  1) Q̃_kl = N̄_kl / Ō_k
  2) Q̄_kl = Q̃_kl − (M − 1) π̃_l / T̄

◮ “Merging”
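A sketch of the two steps in code (the stationary distribution of Q̃ is computed inline; I correct the off-diagonal entries and reset the diagonal so that rows sum to zero, which matches the corrected matrices shown in the numerical examples):

```python
import numpy as np

def merging_estimate(N_bar, O_bar, M, T_bar):
    """Two-step Merging estimator for the rate matrix.

    1) Q_tilde[k, l] = N_bar[k, l] / O_bar[k]
    2) Q_bar[k, l]   = Q_tilde[k, l] - (M - 1) * pi_tilde[l] / T_bar
    where pi_tilde is the stationary distribution of Q_tilde.
    """
    d = O_bar.shape[0]
    Q_tilde = N_bar / O_bar[:, None]
    np.fill_diagonal(Q_tilde, 0.0)
    np.fill_diagonal(Q_tilde, -Q_tilde.sum(axis=1))
    # stationary distribution of Q_tilde: pi @ Q_tilde = 0, sum(pi) = 1
    A = np.vstack([Q_tilde.T, np.ones((1, d))])
    b = np.zeros(d + 1)
    b[-1] = 1.0
    pi_tilde, *_ = np.linalg.lstsq(A, b, rcond=None)
    # subtract the artificial-jump bias from the off-diagonal entries
    Q_bar = Q_tilde - (M - 1) * pi_tilde[None, :] / T_bar
    np.fill_diagonal(Q_bar, 0.0)
    np.fill_diagonal(Q_bar, -Q_bar.sum(axis=1))
    return Q_tilde, Q_bar
```

With the values of the first numerical example (M = 100, T̄ = 25), the subtracted columns come out at roughly 1.1, 1.8, and 1.0, in line with the correction matrix shown on that slide.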

SLIDE 21

Inference for Merged Markov Jump Processes

Comparison: Splitting vs. Merging

◮ Variance for Splitting with the Ga(1, 0) prior:

  Var( (1/M) Σ_{m=1}^M Q^(m)_kl | Y^(1), …, Y^(M) ) = (1/M²) Σ_{m=1}^M (N^(m)_kl + 1) / (O^(m)_k)²

◮ Variance for Merging with the Ga(1, 0) prior:

  Var( Q_kl | Ȳ ) = ( Σ_{m=1}^M N^(m)_kl + 1 ) / ( Σ_{m=1}^M O^(m)_k )²

◮ For rather short single observation times T we expect Merging to give more reliable results

SLIDE 22

Inference for Merged Markov Jump Processes

Numerical example

◮ M = 100, d = 3, T = 0.25, T̄ = 25,

  Q = ( −100  60  40
          40 −70  30
          40  60 −100 )

◮ On average, about 22 jumps in each single process Y^(m)
◮ Simulate merged data and apply Splitting and Merging
◮ Repeat 100 000 times and consider the sampling distributions of Q̂_kl and Q̄_kl
◮ π = (0.29, 0.46, 0.25) and

  Q̄ = Q̃ − ( −2.8  1.8  1.0
               1.1 −2.1  1.0
               1.1  1.8 −2.9 )
SLIDE 23

Inference for Merged Markov Jump Processes

Numerical example – results

[Figure: sampling distributions of the estimates]
Green circles: true Q_kl, blue: Merging Q̄_kl, red: Splitting Q̂_kl

SLIDE 24

Inference for Merged Markov Jump Processes

Numerical example – remarks

◮ Q̄_kl outperforms Q̂_kl w.r.t. both location and dispersion
◮ Q̂_kl is skewed towards over-estimation: if some O^(m)_k is very small, Q̂^(m)_kl heavily over-estimates Q_kl, and then Q̂_kl is also too high
◮ This effect is the more severe the smaller T is

SLIDE 25

Inference for Merged Markov Jump Processes

Numerical example (2)

◮ M = 62, d = 3, T = 1/M, T̄ = 1,

  Q = ( −800  450  350
         100 −200  100
         300  400 −700 )

◮ On average, about 6 jumps in each single process Y^(m)
◮ The probability that a state is visited never or only for a very short time is high!
◮ Simulate merged data and apply Splitting and Merging
◮ Repeat 100 000 times and consider the sampling distributions of Q̂_kl and Q̄_kl
◮ π = (0.15, 0.68, 0.17) and

  Q̄ = Q̃ − ( −51.8  41.4  10.4
                9.1 −19.6  10.5
                9.1  41.4 −50.5 )
SLIDE 26

Inference for Merged Markov Jump Processes

Numerical example (2) – results

[Figure: sampling distributions of the estimates]
Green circles: true Q_kl, blue: Merging Q̄_kl, red: Splitting Q̂_kl

SLIDE 27

Generalization: Markov Switching Models

Introduction
Markov Jump Processes
Merged Markov Jump Processes
Inference for Merged Markov Jump Processes
Generalization: Markov Switching Models
Conclusion

SLIDE 28

Generalization: Markov Switching Models

A continuous-time Markov switching model

◮ Observation process R = (R_t)_{t∈[0,T]} (e.g. stock returns) with dynamics

  dR_t = μ_t dt + σ_t dW_t,  i.e.  R_t = ∫_0^t μ_s ds + ∫_0^t σ_s dW_s

◮ W is a standard Brownian motion
◮ Drift and volatility jump between d levels:

  μ_t = μ(Y_t),  σ_t = σ(Y_t)

◮ The state process Y is an MJP with state space {1, …, d}, Y ⊥ W
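A sketch of simulating discretely observed returns from this model (illustrative code; the hidden chain is stepped on the grid with the first-order transition matrix I + QΔt, which is only valid while all λ_k Δt < 1; the parameter values are borrowed from the later numerical example):

```python
import numpy as np

def simulate_msm(Q, mu, sigma, dt, H, rng, y0=0):
    """Simulate H discrete returns V_i from the Markov switching model.

    The hidden MJP is approximated on the grid with the one-step
    transition matrix P = I + Q*dt (requires lambda_k * dt < 1); within a
    step, drift and volatility are frozen at the current state.
    """
    d = Q.shape[0]
    P = np.eye(d) + Q * dt
    assert np.all(P >= 0.0), "grid too coarse: need lambda_k * dt < 1"
    y = y0
    V = np.empty(H)
    Y = np.empty(H, dtype=int)
    for i in range(H):
        Y[i] = y
        V[i] = mu[y] * dt + sigma[y] * np.sqrt(dt) * rng.standard_normal()
        y = rng.choice(d, p=P[y])
    return V, Y

rng = np.random.default_rng(1)
Q = np.array([[-60.0, 60.0], [40.0, -40.0]])
V, Y = simulate_msm(Q, mu=(2.0, -1.0), sigma=(0.10, 0.05), dt=1 / 1000, H=10_000, rng=rng)
```

The simulated V plays the role of the observed return series; Y is hidden from the estimation procedure.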

SLIDE 29

Generalization: Markov Switching Models

Example

[Figure panels: state process, drift process, volatility process, daily stock returns, price process]

∆t = 1/250, µ = (3, 0, −2), σ = (0.20, 0.12, 0.15),

  Q = ( −70  40  30
         20 −40  20
         30  50 −80 )
SLIDE 30

Generalization: Markov Switching Models

MSMs in Finance

◮ Short rate models: Elliott/Hunter/Jamieson (2001)
◮ Investment problems: Zhang (2001), Guo (2005)
◮ Risk measures for derivatives: Elliott/Siu/Chan (2008)
◮ Portfolio optimization: Honda (2003), Zhou/Yin (2003), Sass/Haussmann (2004), Bäuerle/Rieder (2005), …
◮ Option pricing: Guo (2001), Buffington/Elliott (2002), Chan/Elliott/Siu (2005), Liu/Zhang/Yin (2006), Yao/Zhang/Zhou (2006), …
◮ …

SLIDE 31

Generalization: Markov Switching Models

Estimation from discretely observed data

◮ The return process R is observable at times t = i ∆t:

  V_i = ∆R_i = ∫_{(i−1)∆t}^{i∆t} μ_s ds + ∫_{(i−1)∆t}^{i∆t} σ_s dW_s,  i = 1, …, H

◮ V_i: daily stock returns
◮ The state process Y is independent of W and not observable (hidden)
◮ Wanted: μ(k), σ(k), Q
◮ Problems:
◮ ∆t given and fixed
◮ Noise high compared to signal
◮ High-frequency switching of states, i.e. λ_k ∆t high
◮ Number of observations low (say, less than 5000)

SLIDE 32

Generalization: Markov Switching Models

Data with breaks / merged processes

◮ Often we encounter data with breaks
◮ Weekends for daily data, nights for intra-day data, …
◮ Observable discrete-time process: V̄ = (V̄_i)_{i=1,…,H̄} is the concatenation of V^(1), …, V^(M), where V^(m) = (V^(m)_i)_{i=1,…,H} and H̄ = M H
◮ Hidden continuous-time state process: Ȳ is the concatenation of Y^(1), …, Y^(M), where Y^(m) = (Y^(m)_t)_{t∈[0,T)} and T̄ = M T

SLIDE 33

Generalization: Markov Switching Models

Estimation from data with breaks

◮ Proceed similarly as for MJPs ◮ Use merged data ¯

V for estimation

◮ Employ arbitrary method to obtain estimates ˜

µ, ˜ σ, and ˜ Q

◮ µ and σ are not affected by merging, hence ¯

µ = ˜ µ, ¯ σ = ˜ σ

◮ Correction for Q: As described for MJPs

◮ Point estimates: Correct ˜

Q to obtain ¯ Q

◮ Simulation based: Correct each sample ˜

Qj to obtain samples ¯ Qj

33 / 37

SLIDE 34

Generalization: Markov Switching Models

Numerical example

◮ T̄ = 10, H̄ = 10 000, ∆t = 1/1000
◮ d = 2,

  Q = ( −60  60
         40 −40 ),

  µ = (2, −1), σ = (0.10, 0.05), i.e. µ ∆t = (0.002, −0.001), σ √∆t = (0.0032, 0.0016)

◮ M, T, H varying such that T̄ = M T and H̄ = M H
◮ M = 1 corresponds to one coherent series; M = 400 corresponds to 400 series with H = 25 observations each
◮ Simulate merged data
◮ Perform method-of-moments-type estimation on the merged data
◮ Repeat 1 000 times

SLIDE 35

Generalization: Markov Switching Models

Numerical example – results

                    µ(1)   µ(2)   σ(1)   σ(2)    λ̃1    λ̃2    λ̄1    λ̄2
true                2.00  −1.00  0.100  0.050   60.0   40.0   60.0   40.0
M = 1, τ = 10       2.02  −0.97  0.100  0.052   58.6   37.5   58.6   37.5
                    0.11   0.05  0.002  0.002    9.8    6.5    9.8    6.5
M = 50, τ = 0.2     2.02  −0.97  0.100  0.052   61.5   39.5   58.6   37.7
                    0.12   0.05  0.001  0.003   10.7    6.3   10.7    6.7
M = 100, τ = 0.1    2.03  −0.97  0.100  0.052   64.7   41.2   58.6   37.3
                    0.11   0.05  0.001  0.003   11.6    6.4   10.5    6.8
M = 200, τ = 0.05   2.03  −0.96  0.100  0.052   71.2   45.1   59.0   37.4
                    0.12   0.06  0.001  0.003   15.6    8.2   10.7    6.7
M = 400, τ = 0.025  2.06  −0.96  0.100  0.053   85.3   52.9   60.7   37.6
                    0.13   0.06  0.002  0.003   28.4   15.1   12.6    8.0

Table: Results for the MSM (T̄ = 10, H̄ = 10 000): mean (top row), RMSE (bottom row)

SLIDE 36

Generalization: Markov Switching Models

Numerical example – remarks

◮ Estimates of µ(k), σ(k) are not affected by merging
◮ The quality of the corrected estimates Q̄_kl is (nearly) independent of M
◮ Ignoring the breaks can lead to considerable bias in the estimate of Q
◮ Method-of-moments-type estimation requires a lot of observations, so Splitting is not applicable for M > 2; ML or Bayesian methods could be applied for M ≤ 10, but are computationally much more costly

SLIDE 37

Conclusion

Conclusion

◮ Estimation for a set of short (independent) observation series or data containing (long) breaks
◮ Applicable to processes based on MJPs
◮ First, estimate the parameters from the merged series; second, correct the rates for the bias afterwards
◮ Post-processing correction
◮ Works with an arbitrary estimation approach for coherent series
◮ Single series need not be of the same length / splitting times need not be known – only the number of breaking points is required
◮ Works similarly for discrete-time processes