Model-based measurement of volatility using high-frequency data

SLIDE 1

Model-based measurement of volatility using high-frequency data

Borus Jungbacker & Siem Jan Koopman

bjungbacker@feweb.vu.nl s.j.koopman@feweb.vu.nl

Vrije Universiteit Amsterdam Tinbergen Institute

Econometric Society World Congress, London, 2005

Model-based measurement of volatilityusing high-frequency data – p. 1

SLIDE 2

The paper in fast mode

  • Aim of the paper: to measure volatility within a model-based framework using high-frequency data.
  • Background: microstructure error exists in price data when observed at high frequency; for measuring realised volatility, nonparametric methods have been considered to deal with microstructure error.
  • Problem: how to correct for microstructure error in a model-based framework for volatility?
  • Solution: the discretised pricing model can be represented as a dynamic latent process (random walk) with observation error.
  • Econometrics: how to estimate parameters from a non-linear state space model? How to measure the latent volatility?
  • Illustration: IBM tick-by-tick prices.

SLIDE 3

Overview

The contents of this presentation are:

  • Introduction
  • Models for high-frequency prices
  • Estimation methods
  • Three months of IBM prices
  • Some more recent developments
  • Conclusion

SLIDE 4

Introduction and notation

The price of a financial asset is denoted by $P_t$. A common assumption is that the log of $P_t$ can be represented by the SDE

$$ d \log P_t = \mu_t(\psi)\,dt + \sigma_t(\psi)\,dB_t, \qquad t > 0, \qquad (1) $$

where $\mu_t(\psi)$ is the drift (expected return), $\sigma_t(\psi)$ is a stochastic process (spot volatility), $B_t$ is standard Brownian motion and $\psi$ is a vector of unknown parameters; see Campbell, Lo and MacKinlay (1997) for more background. The variability in prices (integrated volatility) is

$$ \sigma^{*2}(0, t) = \int_0^t \sigma_s^2(\psi)\,ds. \qquad (2) $$

Actual volatility for the interval $[t_1, t_2]$ is

$$ \sigma^{*2}(t_1, t_2) = \sigma^{*2}(0, t_2) - \sigma^{*2}(0, t_1). \qquad (3) $$

SLIDE 5

Introduction and notation

For the high-frequency data, we use the following notation:

  • the observed log price at discrete time point $t_n$ (in seconds) is denoted by $Y_n = \log P_{t_n}$;
  • the number of time points (seconds) in trading day $d$ is $N_d$;
  • the time series $Y_1, Y_2, \ldots, Y_{N_d}$ (frequency in seconds) are the log prices of trades on a particular day $d$;
  • the value $Y_n$ is not available when no trade has taken place at time $t_n$ and is treated as missing;
  • the number of trades is denoted by $N \le N_d$, so that we have $N_d - N$ missing values in day $d$.

SLIDE 6

Realised volatility

A natural estimator of actual volatility is realised volatility (RV), given by

$$ \tilde\sigma^{*2}(t_0, t_{N_d}) = \sum_{j=2}^{N_d/m} \left( Y_{mj} - Y_{mj-m} \right)^2, \qquad (4) $$

where $m$ is the sampling frequency, expressed as the number of seconds between sampled prices. For example, when the sampling frequency is 5 minutes, $m$ equals 300. When no transaction has taken place at time $n$, so that $Y_n$ is missing in (4), it can be approximated via interpolation methods; see Malliavin and Mancino (2002) and Hansen and Lunde (2004). An alternative semi-parametric approach for computing RV can be based on the estimated smoothness parameter for a trend (spline) model.
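As a rough illustration of (4), the sketch below computes RV from a per-second series of log prices with missing values coded as NaN. The function name `realised_volatility` and the previous-tick fill rule are assumptions made for this example; the slide only refers to "interpolation methods" without fixing one.

```python
import numpy as np

def realised_volatility(y, m):
    """Sum of squared m-second log-price changes, as in (4).

    y : per-second log prices Y_1, ..., Y_Nd, with NaN where no trade occurred.
    m : number of seconds between sampled prices (300 for 5 minutes).
    """
    filled = np.asarray(y, dtype=float).copy()
    # Assumption: a missing second takes the last observed log price
    # (previous-tick rule); other interpolation schemes are possible.
    last = np.nan
    for i, value in enumerate(filled):
        if np.isnan(value):
            filled[i] = last
        else:
            last = value
    # Y_m, Y_2m, ... in the 1-based notation of (4) are 0-based m-1, 2m-1, ...
    sampled = filled[m - 1::m]
    # Drop sampled points that precede the first trade of the day.
    sampled = sampled[~np.isnan(sampled)]
    return float(np.sum(np.diff(sampled) ** 2))
```

For a one-day series `y` recorded in seconds, `realised_volatility(y, 300)` would give the 5-minute RV under these assumptions.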

SLIDE 7

Model for prices observed with micro-structure noise

We consider model (1),

$$ d \log P_t = \mu_t(\psi)\,dt + \sigma_t(\psi)\,dB_t, \qquad t > 0, $$

and assume that the drift term equals zero, so that $\mathrm{var}(P_{t+\tau} \mid P_t)$ depends only on the diffusion term $\sigma_t(\psi)$. For the volatility $\sigma_t(\psi)$ we consider various specifications.

  • First we take volatility as constant through time: $\sigma_t(\psi) = \sigma(\psi)$.
  • This leads to the following model for the efficient price process,

$$ d \log P_t = \sigma(\psi)\,dB_t, \qquad (5) $$

where $B_t$ is standard Brownian motion.

  • Trade prices $Y_n$ are observed with micro-structure noise $U_n$.

SLIDE 8

Basic model for observed prices

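The discrete time model then becomes

$$ Y_n = p_n + \sigma_U U_n, \qquad U_n \sim \mathrm{IID}(0, 1), \qquad (6) $$

$$ p_{n+1} = p_n + \sigma_\varepsilon \varepsilon_n, \qquad \varepsilon_n \sim \mathrm{NID}(0, 1), \qquad (7) $$

where $p_n = \log P_t$ is the unobserved price at time $t = t_n$ for $n = 1, \ldots, N_d$;

  • this leads to a simple expression for actual volatility:

$$ \sigma^{*2}(t_n, t_{n+1}) = (t_{n+1} - t_n)\,\sigma_\varepsilon^2; $$

  • it further implies that the observed return

$$ R_n = \Delta Y_{n+1} = \Delta p_{n+1} + \sigma_U \Delta U_{n+1} = \sigma_\varepsilon \varepsilon_n + \sigma_U U_{n+1} - \sigma_U U_n $$

follows an MA(1) process, $R_n \sim \mathrm{MA}(1)$; see Harvey (1989).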
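The MA(1) property can be checked numerically. From the expression for $R_n$, the variance is $\sigma_\varepsilon^2 + 2\sigma_U^2$ and the lag-one autocovariance is $-\sigma_U^2$, so the first-order autocorrelation is $-\sigma_U^2 / (\sigma_\varepsilon^2 + 2\sigma_U^2)$. The sketch below simulates (6) and (7) and compares the sample value with this expression. The function names are hypothetical, and Gaussian noise $U_n$ is an assumption (the model only requires IID(0, 1)).

```python
import numpy as np

def simulate_returns(n, sigma_eps, sigma_u, seed=0):
    """Simulate n observed returns R_n from the local level model (6)-(7)."""
    rng = np.random.default_rng(seed)
    # Efficient log price: random walk with innovation s.d. sigma_eps, eq. (7).
    p = np.cumsum(sigma_eps * rng.standard_normal(n + 1))
    # Observed log price: efficient price plus noise (assumed Gaussian), eq. (6).
    y = p + sigma_u * rng.standard_normal(n + 1)
    return np.diff(y)

def lag1_autocorr(r):
    """Sample first-order autocorrelation."""
    r = r - r.mean()
    return float(np.sum(r[1:] * r[:-1]) / np.sum(r * r))

def theoretical_lag1(sigma_eps, sigma_u):
    """First-order autocorrelation implied by the MA(1) representation."""
    return -sigma_u**2 / (sigma_eps**2 + 2.0 * sigma_u**2)
```

With equal standard deviations the implied autocorrelation is $-1/3$; without noise ($\sigma_U = 0$) it is zero, which is the pure random-walk case.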

SLIDE 9

Basic model for observed prices

For the discrete time model

$$ Y_n = p_n + \sigma_U U_n, \qquad U_n \sim \mathrm{IID}(0, 1), $$

$$ p_{n+1} = p_n + \sigma_\varepsilon \varepsilon_n, \qquad \varepsilon_n \sim \mathrm{NID}(0, 1), $$

the assumption of constant volatility is too strong, but it allows us to obtain a preliminary estimate of daily volatility:

  • standard Kalman filter methods can be used;
  • further, it justifies (to some extent) the nonparametric estimates based on splines.

Results based on this model will be presented later.
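A minimal sketch of the Kalman filter for this local level model is given below, with missing seconds handled by skipping the update step. It is an illustration only, not the Ox/SsfPack implementation used for the results: the function name, the treatment of the noise as Gaussian, and the large initial variance `p1_var` standing in for a diffuse prior are all assumptions.

```python
import numpy as np

def local_level_loglik(y, sigma_eps2, sigma_u2, p1_var=1e7):
    """Gaussian log-likelihood and filtered prices for the local level model.

    y          : log prices per second, NaN where no trade occurred.
    sigma_eps2 : variance of the random-walk innovation.
    sigma_u2   : variance of the micro-structure noise.
    p1_var     : large initial state variance (approximate diffuse prior).
    """
    a, p = 0.0, p1_var            # predicted state mean and variance
    loglik = 0.0
    filtered = np.empty(len(y))
    for n, obs in enumerate(y):
        if not np.isnan(obs):
            f = p + sigma_u2      # prediction-error variance
            v = obs - a           # prediction error
            loglik += -0.5 * (np.log(2.0 * np.pi * f) + v * v / f)
            k = p / f             # Kalman gain
            a, p = a + k * v, p * (1.0 - k)
        # Missing observation: no update, the prediction is carried forward.
        filtered[n] = a
        p = p + sigma_eps2        # random-walk prediction step
    return loglik, filtered
```

Maximising the returned log-likelihood over the two variances (for example with a generic numerical optimiser) would give preliminary estimates; the actual volatility over one second is then the estimated innovation variance.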

SLIDE 10

Some considerations

Aït-Sahalia, Mykland and Zhang (2004) also consider the local level model and observe that returns therefore follow an MA(1) process. They argue that the distributional properties of $U_n$ do not matter asymptotically:

  • “modelling the noise explicitly restores the first order statistical effect that sampling as often as possible is optimal”;
  • “this remains the case if one misspecifies the assumed distribution of the noise term”.

These quotes also endorse a modelling approach. They further discuss possible extensions:

  • the modelling of $U_n$ as a stationary autoregressive process; this can easily be done within the state space framework;
  • allowing for contemporaneous correlation between $U_n$ and $\varepsilon_n$; this correlation, however, is not identified in the likelihood when both variances are unrestricted, see Harvey and Koopman (2000); it can be imposed nonetheless.

SLIDE 11

Weight or Kernel functions for price extraction

Weights are computed as in Koopman and Harvey (JEDC, 2003).

[Figure: four panels of weight functions, horizontal axis from −20 to 20: (i) q = 0.3 and ρ = 0; (ii) q = 0.3 and ρ = 0.2; (iii) q = 0.3 and ρ = 0.5; (iv) q = 0.3 and ρ = 1.]

SLIDE 12

Intra-day seasonal patterns in volatility

At the opening and closure of financial markets, price changes are more volatile than at other times during the trading session; see Dacorogna et al. (1993) and Andersen and Bollerslev (1997). Therefore we replace the spot volatility $\sigma^2$ in (5) by an intra-daily seasonal function

$$ \sigma_t^2 = \sigma^2 \exp g(t), \qquad \text{or} \qquad \log \sigma_t^2 = \log \sigma^2 + g(t), $$

where $g(t)$ is a deterministic (spline) function with a diurnal pattern. The integrated volatility becomes

$$ \sigma^{*2}(0, t) = \int_0^t \sigma_s^2\,ds = \sigma^2 \int_0^t \exp g(s)\,ds. \qquad (8) $$

SLIDE 13

Intra-day seasonal patterns in volatility

Further we observe that:

  • Actual volatility can be derived analytically or approximated by

$$ \sigma^{*2}(t_n, t_{n+1}) \approx \sigma^2 \sum_{s=t_n}^{t_{n+1}} \exp g(s), $$

with the index step length very small.

  • The function $g(t) = g(t; \psi)$ depends on parameters collected in $\psi$, together with the variances $\sigma_\varepsilon^2$ and $\sigma_U^2$.
  • Standard time-varying Kalman filter methods can be used for maximum likelihood estimation of $\psi$ and the measurement of actual volatility.

SLIDE 14

Stochastic volatility

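To be more realistic, the constant $\sigma$ is replaced by a stochastic process:

$$ d \log P_t = \sigma_t\,dB_t^{(1)}, \qquad \log \sigma_t^2 = \log \sigma_t^{\prime 2} + \xi, \qquad (9) $$

$$ d \log \sigma_t^{\prime 2} = -\lambda \log \sigma_t^{\prime 2}\,dt + \sigma_\eta\,dB_t^{(2)}, $$

where $B_t^{(1)}$ and $B_t^{(2)}$ are independent Brownian motions, while $\log \sigma_t^{\prime 2}$ represents an Ornstein-Uhlenbeck process.

Using the Euler-Maruyama method, a discrete representation is

$$ \log P_{t_{n+1}} = \log P_{t_n} + \sigma_n \varepsilon_n, \qquad \varepsilon_n \sim \mathrm{NID}(0, 1), $$

$$ \log \sigma_n^2 = \log \sigma_n^{\prime 2} + \xi, \qquad (10) $$

$$ \log \sigma_{n+1}^{\prime 2} = (1 - \lambda) \log \sigma_n^{\prime 2} + \sigma_\eta \eta_n, \qquad \eta_n \sim \mathrm{NID}(0, 1). $$

Note that $\lambda = \sigma_\eta = 0$ implies constant volatility with $\log \sigma_n^2 = \xi$.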
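The discrete representation (10) is straightforward to simulate. The sketch below is one way to do so; the function name `simulate_sv`, the starting value of zero for $\log \sigma_n^{\prime 2}$ and the starting log price of zero are assumptions for illustration.

```python
import numpy as np

def simulate_sv(n, lam, sigma_eta, xi, seed=0):
    """Simulate n returns from the discretised SV model (10).

    Returns the log-price path (length n + 1) and log sigma_n^2 (length n).
    """
    rng = np.random.default_rng(seed)
    # log sigma'^2_n: AR(1) with coefficient 1 - lambda, started at 0 (assumption).
    h_prime = np.zeros(n)
    for i in range(n - 1):
        h_prime[i + 1] = (1.0 - lam) * h_prime[i] + sigma_eta * rng.standard_normal()
    log_sigma2 = h_prime + xi
    sigma = np.exp(0.5 * log_sigma2)
    # log P_{t_{n+1}} - log P_{t_n} = sigma_n * eps_n
    returns = sigma * rng.standard_normal(n)
    log_price = np.concatenate(([0.0], np.cumsum(returns)))
    return log_price, log_sigma2
```

Setting `lam = 0` and `sigma_eta = 0` reproduces the constant-volatility case noted above, with `log_sigma2` equal to `xi` throughout.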

SLIDE 15

Stochastic volatility

  • Actual volatility is then approximated by

$$ \sigma^{*2}(t_n, t_{n+1}) \equiv \int_{t_n}^{t_{n+1}} \sigma_s^2\,ds \approx (t_{n+1} - t_n)\,\sigma_n^2; $$

  • a model in terms of returns $\log(P_{t_{n+1}} / P_{t_n})$ can also be considered; with micro-structure noise, the observed returns $R_n$ follow an MA(1) process;
  • with stochastic volatility, we obtain

$$ R_n = \sigma_n \varepsilon_n + \sigma_U W_n, \qquad (11) $$

where $\log \sigma_n^2$ is modelled as in (10) and $W_n = U_{n+1} - U_n$, such that $W_n \sim \mathrm{MA}(1)$;

  • model (11) with $\sigma_U = 0$ is the basic SV model.

SLIDE 16

Final model

The final model is based on the system of SDEs

$$ d \log P_t = \sigma_t\,dB_t^{(1)}, \qquad \log \sigma_t^2 = \log \sigma_t^{\prime 2} + \xi + g(t), $$

$$ d \log \sigma_t^{\prime 2} = -\lambda \log \sigma_t^{\prime 2}\,dt + \sigma_\eta\,dB_t^{(2)}. $$

The flexible function $g(t)$ is incorporated in the SV specification in the same way as described earlier. In particular, $\log \sigma_n^2$ in (10) is replaced by

$$ \log \sigma_n^2 = \log \sigma_n^{\prime 2} + \xi + g(t_n). $$

SLIDE 17

Estimation

The basic pricing model with SV is given by

$$ Y_n = p_n + \sigma_U U_n, \qquad U_n \sim \mathrm{IID}(0, 1), $$

$$ p_{n+1} = p_n + \sigma_n \varepsilon_n, \qquad \varepsilon_n \sim \mathrm{NID}(0, 1), $$

$$ \log \sigma_n^2 = \log \sigma_n^{\prime 2} + \xi + g(t_n). $$

  • estimating this model with SV $\sigma_n$ is known to be intricate;
  • without micro-structure noise, we have the standard SV model and various methods can be employed for estimation (Bayesian, classical); see the collection of articles in Shephard (2005);
  • next we consider the model for returns and discuss feasible methods for the estimation of the model with SV and noise;
  • we limit ourselves to maximum likelihood methods.

SLIDE 18

Estimation of returns model

The model for returns with stochastic volatility, intra-daily seasonality and micro-structure noise is represented as the nonlinear state space model

$$ R_n = \exp\left(\tfrac{1}{2} h_n\right) \varepsilon_n + \sigma_U W_n, \qquad (12) $$

$$ h_n = \xi(t_n) + \phi \left\{ h_{n-1} - \xi(t_{n-1}) \right\} + \sigma_\eta \eta_n, \qquad (13) $$

where $h_n = h_n' + \xi(t_n)$, $h_n' = \log \sigma_n^{\prime 2}$, $\xi(t_n) = \xi + g(t_n)$ and $\phi = 1 - \lambda$.

  • the log-volatility $h_n'$ follows an AR process;
  • the micro-structure noise $W_n$ follows an MA process;
  • extended Kalman filter techniques were considered but did not work; importance sampling (IS) techniques are adopted instead.

SLIDE 19

Estimation

The model with observation $y$ and unobserved signal $\theta$ is nonlinear, so analytical expressions for estimation are scarce. GMM and related methods can be considered, but we aim for maximum likelihood (subject to Monte Carlo error).

Importance sampling methods are employed, for which we require a device for simulation. We simulate from a joint Gaussian density with location equal to the mode of $p(\theta|y)$ and scale equal to the curvature of $p(\theta|y)$ at the mode. We refer to this as the importance function $f(\theta; y)$.

To obtain the mode of $p(\theta|y)$, we employ a Newton-Raphson (NR) method that maximises $p(\theta|y)$ with respect to $\theta$. The necessary computations at each NR step can be carried out by the Kalman filter and smoothing (KFS) algorithm, provided that the second derivative with respect to $\theta$, that is $\ddot p(y|\theta)$, is negative definite. Jungbacker and Koopman (2005) show that the KFS can still be adopted for any $\ddot p(y|\theta)$.

SLIDE 20

Constructing the importance function

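Since

$$ \log p(\theta|y) = \log p(y|\theta) + \log p_G(\theta) - \log p(y), $$

we have

$$ \dot p(\theta|y) = \dot p(y|\theta) + \dot p_G(\theta) = \dot p(y|\theta) - \Psi^{-1}(\theta - \mu), $$

and

$$ \ddot p(\theta|y) = \ddot p(y|\theta) + \ddot p_G(\theta) = \ddot p(y|\theta) - \Psi^{-1}. $$

The Newton-Raphson step for maximising $p(\theta|y)$ with respect to $\theta$ requires the computation of

$$ (\Psi^{-1} + A^{-1})^{-1} (A^{-1} x + \Psi^{-1} \mu), $$

where

$$ A = -\ddot p(y|\theta = g)^{-1}, \qquad x = g + A\,\dot p(y|\theta = g), $$

for a current guess $g$ of the mode of $\theta$. This is done by the KFS.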
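The Newton-Raphson step can be illustrated on a small problem with dense linear algebra in place of the KFS; this substitution is purely for exposition and would not scale to a full day of per-second data. The sketch assumes returns $y_n \sim N(0, \exp(h_n) + \sigma_U^2)$ given the signal (the SV model with IID Gaussian noise) and a stationary Gaussian AR(1) prior for the signal. All function names are hypothetical.

```python
import numpy as np

def loglik_derivs(y, h, sigma_u2):
    """First and second derivatives of log p(y_n | h_n) with respect to h_n,
    assuming y_n ~ N(0, exp(h_n) + sigma_U^2)."""
    a = np.exp(h) + sigma_u2
    b = np.exp(h) / a
    p_dot = 0.5 * b * (y**2 / a - 1.0)
    p_ddot = (0.5 - b) * b * y**2 / a - 0.5 * (b - b**2)
    return p_dot, p_ddot

def ar1_precision(n, phi, sigma_eta2):
    """Psi^{-1} for a stationary Gaussian AR(1) signal of length n >= 2."""
    q = np.zeros((n, n))
    idx = np.arange(n)
    q[idx, idx] = 1.0 + phi**2
    q[0, 0] = q[-1, -1] = 1.0
    q[idx[:-1], idx[:-1] + 1] = -phi
    q[idx[:-1] + 1, idx[:-1]] = -phi
    return q / sigma_eta2

def nr_step(g, y, mu, psi_inv, sigma_u2):
    """One step (Psi^{-1} + A^{-1})^{-1} (A^{-1} x + Psi^{-1} mu)."""
    p_dot, p_ddot = loglik_derivs(y, g, sigma_u2)
    a_inv = -p_ddot                  # diagonal of A^{-1}
    a_inv_x = a_inv * g + p_dot      # A^{-1} x, because x = g + A p_dot
    # Dense solve stands in for the Kalman filter and smoother.
    return np.linalg.solve(psi_inv + np.diag(a_inv), a_inv_x + psi_inv @ mu)
```

Repeating `g = nr_step(g, y, mu, psi_inv, sigma_u2)` until `g` stops changing would give the mode, assuming the iteration converges. Working with $A^{-1}$ rather than $A$ avoids dividing by second derivatives that are close to zero.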

SLIDE 21

Estimation, IS continued

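In the case of an SV model with micro-structure IID noise, the first and second derivatives of $\log p(y_n|\theta)$ with respect to $h_n$ are

$$ \dot p(y_n|\theta) = \frac{1}{2} b_n \left( \frac{y_n^2}{a_n} - 1 \right), $$

and

$$ \ddot p(y_n|\theta) = \left( \frac{1}{2} - b_n \right) \frac{b_n y_n^2}{a_n} - \frac{1}{2} \left( b_n - b_n^2 \right), $$

where

$$ a_n = \exp(h_n) + \sigma_U^2, \qquad b_n = \exp(h_n) / a_n. $$

We note that $a_n > 0$ and $0 < b_n \le 1$.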

SLIDE 22

Estimation, IS continued

After the construction of the importance function $f(\theta; y)$, we continue as follows:

  • simulations from $f(\theta; y)$ can be obtained using simulation smoothing algorithms, see de Jong and Shephard (Biometrika, 1995) and Durbin and Koopman (Biometrika, 2002); the simulation smoothing derivation for the case $\ddot p(y|\theta) > 0$ is considered by Jungbacker and Koopman (2005);
  • the resulting simulations are $\theta^{(i)} \sim f(\theta; y)$;
  • the importance sampling estimator of the likelihood is based on

$$ L(\psi) = p(y; \psi) = \int p(y, \theta)\,d\theta = \int \frac{p(y, \theta)}{f(\theta; y)}\,f(\theta; y)\,d\theta; $$

  • this expression can be simplified in standard cases; otherwise the value of $f(\theta; y)$ for a given realisation of $\theta$ is computed by the simulation smoother.

SLIDE 23

Estimation, IS continued

The IS estimator of the likelihood function $L(\psi)$ is then

$$ \hat L(\psi) = L_g(\psi) \sum_{i=1}^{M} \frac{p(y, \theta^{(i)})}{f(\theta^{(i)}; y)}, $$

where $\theta^{(i)} \sim f(\theta; y)$ for $i = 1, \ldots, M$. On the basis of the importance sampling weights $p(y, \theta^{(i)}) / f(\theta^{(i)}; y)$, diagnostics can be carried out on the effectiveness of importance sampling (does the CLT apply to the likelihood estimator? see Koopman and Shephard, 2004) and on the estimation of state vectors (signal extraction, volatility measurement).
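The idea of importance sampling with a Gaussian importance density located at the mode can be shown on a one-observation toy problem. The sketch assumes $y | h \sim N(0, \exp h)$ and $h \sim N(0, s^2)$, draws directly from a univariate normal instead of using a simulation smoother, and estimates the likelihood by the Monte Carlo average of the weights $p(y, h^{(i)}) / f(h^{(i)}; y)$. The toy model and all function names are assumptions for illustration; a grid sum is included only as a check.

```python
import numpy as np

def log_joint(h, y, s2):
    """log p(y, h) for y | h ~ N(0, exp(h)) and h ~ N(0, s2)."""
    log_lik = -0.5 * (np.log(2.0 * np.pi) + h + y**2 * np.exp(-h))
    log_prior = -0.5 * (np.log(2.0 * np.pi * s2) + h**2 / s2)
    return log_lik + log_prior

def gaussian_importance_density(y, s2, iters=50):
    """Mode of p(h | y) by Newton-Raphson, and variance from the curvature."""
    h = 0.0
    for _ in range(iters):
        grad = -0.5 + 0.5 * y**2 * np.exp(-h) - h / s2
        hess = -0.5 * y**2 * np.exp(-h) - 1.0 / s2
        h -= grad / hess
    hess = -0.5 * y**2 * np.exp(-h) - 1.0 / s2
    return h, -1.0 / hess

def is_likelihood(y, s2, m=20000, seed=0):
    """Importance sampling estimate of p(y) and the individual weights."""
    mode, var = gaussian_importance_density(y, s2)
    rng = np.random.default_rng(seed)
    draws = mode + np.sqrt(var) * rng.standard_normal(m)
    log_f = -0.5 * (np.log(2.0 * np.pi * var) + (draws - mode) ** 2 / var)
    weights = np.exp(log_joint(draws, y, s2) - log_f)
    return float(weights.mean()), weights

def grid_likelihood(y, s2):
    """Reference value of p(y) from a fine Riemann sum over h."""
    grid = np.linspace(-15.0, 15.0, 60001)
    return float(np.sum(np.exp(log_joint(grid, y, s2))) * (grid[1] - grid[0]))
```

Inspecting the spread of the returned weights is a simple first diagnostic: a few very large weights would suggest that the importance density is a poor match for the target.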

SLIDE 24

Empirical results for three months of IBM prices

A small subset of the TAQ database for NYSE has been made available to us. We took IBM equity transactions reported on the Consolidated Tape. The NYSE market opens at 9:30 AM and closes at 4 PM. The database consists of prices and times (measured in seconds) of transactions realised in the three months of November 2002, December 2002 and January 2003.

No manipulations have been carried out on this dataset.

Prices for each trading day are considered as a time series in seconds, with possibly many missing observations. For example, when no trade has taken place in the last two minutes, there are 120 consecutive missing values.

SLIDE 25

Data for one day

Prices and returns of IBM, November 1, 2002, against seconds.

[Figure: two panels for IBM stock, 11/1/2002, plotted against time of day (10:00 to 16:00): price (vertical axis 78.5 to 80.5) and log returns (vertical axis −0.25 to 0.50).]

SLIDE 26

Data for one day

Prices and returns of IBM, November 1, 2002, against trades.

[Figure: three panels for IBM stock, 11/1/2002, plotted against trade number (250 to 3250): price (vertical axis 78.5 to 80.5), log returns (vertical axis −0.25 to 0.50) and squared log returns (vertical axis 0.05 to 0.15).]

SLIDE 27

Estimating model for one day

We first consider the model for prices (5) with constant $\sigma$ and intra-daily pattern $g(t)$ for spot volatility. For the function $g(t)$ we take a cubic spline function with three knots. The standard Kalman filter can be used for the estimation of the coefficients of this model (Ox/SsfPack is used). The estimation results are

  • $\log \sigma = -5.112$,
  • $\gamma_2 = -1.747$,
  • $\gamma_3 = -1.135$.

These results give some initial indication of the results that can be obtained from a high-frequency dataset.

SLIDE 28

Estimating a model with SV for one day

More interesting from theoretical and empirical perspectives are the results for the returns model with SV and intra-daily seasonality. The parameter estimates are as follows.

For the model without micro-structure noise:

  • $\phi = 0.961$,
  • $\sigma_\eta^2 = 0.0619$,
  • $\log \sigma = -7.977$,
  • $\gamma_2 = -1.654$,
  • $\gamma_3 = -1.135$.

For the model with micro-structure noise:

  • $\sigma_U^2 = 0.00003985$,
  • $\sigma_U = 0.00631$,
  • $\phi = 0.955$,
  • $\sigma_\eta^2 = 0.0821$,
  • $\log \sigma = -8.033$,
  • $\gamma_2 = -1.629$,
  • $\gamma_3 = -1.065$.

Although the micro-structure noise appears low, it has a big impact on the estimate of $\sigma_\eta$.

SLIDE 29

Estimated volatility for one day

(i) log-volatility, (ii) intra-daily effect, (iii) integrated volatility.

[Figure: three panels plotted against seconds of the trading day (1800 to 23400).]

SLIDE 30

Estimated volatility for one day

Estimates of $\log \sigma_t'$ for an interval of 30 minutes.

[Figure: estimates plotted against time of day (10:00 to 10:30), vertical axis from −1.5 to 1.5.]

SLIDE 31

Estimated volatility for one day

Estimates of $\log \sigma_t'$ for intervals of 5 minutes.

[Figure: four panels plotted against time of day (13:00 to 13:05, 13:05 to 13:10, 13:10 to 13:15, 13:15 to 13:20), each with vertical axis from −2 to 2.]

SLIDE 32

Multiple day analysis

  • The estimation procedure is repeated for 61 days of high-frequency data.
  • For each day, a new model is estimated with a new AR(1) process for log-volatility and a new seasonal “diurnal” pattern.
  • The integrated volatility is considered and an estimate of the actual volatility for one trading day is constructed.
  • The 61 estimates are collected and they form a daily time series.
  • These can be compared with realised volatility measures.

SLIDE 33

Estimates of daily volatility for 61 days

(i) RV, (ii) LL model; (iii) with seasonal; (iv) with seasonal & SV.

[Figure: four panels, (i) to (iv), each plotting daily volatility estimates against day (horizontal axis 20 to 60, vertical axis 2 to 10).]

SLIDE 34

Estimates of daily volatility for 61 days

ACF: (i) RV, (ii) LL model; (iii) with seasonal; (iv) with seas & SV.

[Figure: four panels, (i) to (iv), each showing an autocorrelation function with horizontal axis ticks at 5 and 10.]

SLIDE 35

Discussion

  • we measure volatility from high-frequency prices using a model-based framework;
  • a basic model is considered that captures the salient features of prices and volatilities in financial markets;
  • it accounts for micro-structure noise, an intra-daily volatility pattern and stochastic volatility;
  • feasible estimation methods have been implemented;
  • no information is lost, as opposed to realised volatility, for which prices are sampled at a low frequency, say 1 or 5 minutes;
  • nonparametric methods can also be considered, but they need to address end-of-sample problems explicitly; this does not apply to model-based treatments;
  • we have presented a first attempt to analyse ultra high-frequency prices using a model;
  • apart from technicalities in estimation, the approach is just a standard example of model-based signal extraction.

SLIDE 36

Further research

The basic pricing model with SV is given by

$$ Y_n = p_n + \sigma_U U_n, \qquad U_n \sim \mathrm{IID}(0, 1), $$

$$ p_{n+1} = p_n + \sigma_n \varepsilon_n, \qquad \varepsilon_n \sim \mathrm{NID}(0, 1). $$

  • can we develop ML estimation procedures for this model directly?
  • methods must be fast given the huge number of observations;
  • we have implemented a procedure using particle filtering, but it is slow and parameter estimation is not straightforward;
  • Bayesian methods can be considered too, but they are also slow and depend on techniques similar to those presented here;
  • therefore, we are still working on importance sampling techniques . . .
