Modeling Financial Durations Using Estimating Functions Yaohua Zhang - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Modeling Financial Durations Using Estimating Functions

Yaohua Zhang (1), Jian Zou (2), Nalini Ravishanker (1), Aerambamoorthy Thavaneswaran (3)

(1) Department of Statistics, University of Connecticut
(2) Department of Statistics, Worcester Polytechnic Institute
(3) Department of Statistics, University of Manitoba

QPRC, June 15, 2017

slide-2
SLIDE 2

Outline

◮ Introduction
◮ Estimating Functions Approach for LogACD Models
◮ Simulation Study
◮ Application on Real Stock Prices
◮ Summary

slide-3
SLIDE 3

Background

◮ Investigators are interested in studying the behavior of the exchange rate process
◮ High-frequency price quote data inherently arrive at irregularly spaced times, so the duration between consecutive data points is not uniform
◮ Traditional discrete-time models, which bin the data into equally spaced time intervals, are inadequate: bins that are too small are mostly empty (zero), and bins that are too large smooth the data

slide-4
SLIDE 4

Why Do We Care?

◮ Information is important! (How long will it be until prices change?)
◮ Rothschild family
◮ Knowing the time interval matters because it could influence the speed with which a trader places an order
◮ In an active market, a price may last much less than a minute or even a second
◮ If an automated trading system is used, opportunities may be eliminated

slide-5
SLIDE 5

Literature Review

◮ Engle & Russell (1998) proposed a nonlinear model for irregularly spaced inter-event durations, called the Autoregressive Conditional Duration (ACD) model
◮ The authors treat the arrival times of the data as a point process with an intensity defined conditional on past activity
◮ Several generalizations have been discussed in the literature (Thavaneswaran et al. 2014)
◮ Developing fast and accurate methods for fitting models to long time series of durations under the least restrictive assumptions is an ongoing research problem

slide-6
SLIDE 6

A Review of Duration Models

Let $x_i = t_i - t_{i-1}$, where $i = 1, 2, \ldots$, denote a time series of durations, and let $\mathcal{F}^x_{i-1}$ denote the information associated with previous durations. The ACD(p, q) model (Engle & Russell, 1998) is defined as

$$x_i = \psi_i \varepsilon_i / \mu_\varepsilon, \quad \text{where} \quad \psi_i = \omega + \sum_{j=1}^{p} \alpha_j x_{i-j} + \sum_{j=1}^{q} \beta_j \psi_{i-j}.$$

The conditions $\omega > 0$, $\alpha_j \ge 0$ for $j = 1, \ldots, p$, $\beta_j \ge 0$ for $j = 1, \ldots, q$, and $\sum_{j=1}^{p} \alpha_j + \sum_{j=1}^{q} \beta_j < 1$ ensure that the duration process is non-negative and weakly stationary.
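A minimal simulation sketch may make the ACD recursion concrete. It assumes order (1, 1) and Exp(1) innovations (so that µε = 1); the function name `simulate_acd11`, the burn-in length, and the parameter values are illustrative choices, not taken from the slides.

```python
import random


def simulate_acd11(n, omega, alpha, beta, seed=0, burn_in=500):
    """Simulate n durations from an ACD(1, 1) model.

    Assumes eps_i ~ Exp(1), so mu_eps = 1 and x_i = psi_i * eps_i.
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    rng = random.Random(seed)
    psi = omega / (1.0 - alpha - beta)  # start at the unconditional mean
    x = psi
    durations = []
    for i in range(n + burn_in):
        psi = omega + alpha * x + beta * psi  # conditional mean recursion
        x = psi * rng.expovariate(1.0)        # multiplicative innovation
        if i >= burn_in:                      # drop the warm-up draws
            durations.append(x)
    return durations


durations = simulate_acd11(20000, omega=0.1, alpha=0.1, beta=0.7)
sample_mean = sum(durations) / len(durations)
stationary_mean = 0.1 / (1.0 - 0.1 - 0.7)  # omega / (1 - alpha - beta)
print(f"sample mean {sample_mean:.3f} vs stationary mean {stationary_mean:.3f}")
```

Under these assumptions the sample mean of a long simulated series should sit close to ω / (1 − α − β), which gives a quick sanity check on the stationarity condition.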

slide-7
SLIDE 7

A Review of Duration Models Cont’d

The Log ACD(p, q) model (Bauwens 2000, Pacurar 2008) relaxes the parameter restrictions that the ACD(p, q) model needs to ensure non-negative durations, and thus provides greater flexibility:

$$x_i = \exp(\psi_i)\, \varepsilon_i / \mu_\varepsilon, \quad \text{where} \quad \psi_i = \omega + \sum_{j=1}^{p} \alpha_j \log x_{i-j} + \sum_{j=1}^{q} \beta_j \psi_{i-j},$$

where the condition $\sum_{j=1}^{\max(p,q)} (\alpha_j + \beta_j) < 1$ ensures weak stationarity.
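The following sketch illustrates why the Log ACD form needs no sign restrictions: durations stay positive even with a negative ω and a negative α. It assumes order (1, 1) and Exp(1) innovations (µε = 1); the function name `simulate_log_acd11` and the parameter values are hypothetical.

```python
import math
import random


def simulate_log_acd11(n, omega, alpha, beta, seed=0, burn_in=500):
    """Simulate n durations from a Log ACD(1, 1) model.

    Assumes eps_i ~ Exp(1), so mu_eps = 1 and x_i = exp(psi_i) * eps_i.
    """
    if not alpha + beta < 1:
        raise ValueError("need alpha + beta < 1 for weak stationarity")
    rng = random.Random(seed)
    psi, log_x = 0.0, 0.0  # arbitrary starting state, washed out by burn-in
    durations = []
    for i in range(n + burn_in):
        psi = omega + alpha * log_x + beta * psi
        eps = rng.expovariate(1.0)
        while eps == 0.0:  # guard against log(0); practically never triggers
            eps = rng.expovariate(1.0)
        log_x = psi + math.log(eps)
        if i >= burn_in:
            durations.append(math.exp(log_x))
    return durations


# A negative omega and a negative alpha are both allowed here.
durations = simulate_log_acd11(5000, omega=-0.2, alpha=-0.05, beta=0.7)
print(min(durations) > 0)
```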

slide-8
SLIDE 8

The Problem

Suppose durations data $\{x_i\}_{i=1}^{n}$ that follow the Log ACD(p, q) model are available. Let $g = \max(p, q)$. Then the maximum likelihood estimates (MLEs) $\hat{\theta}$ may be obtained by maximizing the conditional likelihood function (Tsay 2009):

$$L(\theta \mid \mathbf{x}_n) = \prod_{i=g+1}^{n} f(x_i \mid \mathbf{x}_{i-1}, \theta).$$

◮ In practice, the true $f_\varepsilon(\cdot)$ is usually unknown
◮ In some cases, the ML or QML approach may not be feasible (Thavaneswaran, Ravishanker & Liang, 2014)
◮ The model orders (p, q) are unknown
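To see what the conditional likelihood requires, here is a sketch of the log-likelihood for a Log ACD(1, 1) model. It has to assume a specific innovation density, here Exp(1) with µε = 1, which is exactly the kind of assumption the first bullet warns about. The function name and the choice of initial value for ψ are illustrative assumptions.

```python
import math


def log_acd11_exp_loglik(theta, x):
    """Conditional log-likelihood of a Log ACD(1, 1) model.

    Assumes eps_i ~ Exp(1), so x_i given the past is exponential with
    mean exp(psi_i) and log f = -psi_i - x_i * exp(-psi_i).
    """
    omega, alpha, beta = theta
    psi = math.log(sum(x) / len(x))  # assumed initial value for psi_1
    loglik = 0.0
    for i in range(1, len(x)):       # i = g + 1, ..., n with g = 1 (0-based)
        psi = omega + alpha * math.log(x[i - 1]) + beta * psi
        loglik += -psi - x[i] * math.exp(-psi)
    return loglik


x = [2.0, 2.0, 2.0, 2.0]
good = log_acd11_exp_loglik((math.log(2.0), 0.0, 0.0), x)  # mean matches data
poor = log_acd11_exp_loglik((0.0, 0.0, 0.0), x)            # mean fixed at 1
print(good > poor)
```

A numerical optimizer applied to this function would give the MLE under the exponential assumption; if the true density differs, the result is at best a quasi-likelihood estimate.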

slide-9
SLIDE 9

General Framework

◮ We propose a semi-parametric estimation approach which is based on combined martingale estimating functions
◮ It only requires the specification of the first four conditional moments of the duration process
◮ Our method can be easily extended by adding a penalty term

slide-10
SLIDE 10

General Framework Cont’d

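Suppose $x_i$ is a realization of a duration process and let $\mathcal{F}^x_{i-1}$ denote the information associated with $\{x_1, \ldots, x_{i-1}\}$. Suppose the first four conditional moments of $\{x_i\}$ given $\mathcal{F}^x_{i-1}$ are $\mu_i(\theta)$, $\sigma_i^2(\theta)$, $\gamma_i(\theta)$, and $\kappa_i(\theta)$. Define the linear and quadratic martingale differences by $m_i(\theta) = x_i - \mu_i(\theta)$ and $M_i(\theta) = m_i^2(\theta) - \sigma_i^2(\theta)$. Their quadratic variations and covariation are

$$\langle m \rangle_i = E[m_i^2(\theta) \mid \mathcal{F}^x_{i-1}] = \sigma_i^2(\theta),$$

$$\langle M \rangle_i = E[m_i^4(\theta) \mid \mathcal{F}^x_{i-1}] - \left( E[m_i^2(\theta) \mid \mathcal{F}^x_{i-1}] \right)^2 = \kappa_i(\theta) - \sigma_i^4(\theta),$$

$$\langle m, M \rangle_i = E[m_i^3(\theta) \mid \mathcal{F}^x_{i-1}] = \gamma_i(\theta).$$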
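As a worked instance of these definitions, the sketch below evaluates the martingale differences and their variations for one observation. It reads γi and κi as the third and fourth conditional central moments and, purely for illustration, takes the conditional distribution to be exponential with mean µi, for which the variance is µi², the third central moment is 2µi³, and the fourth is 9µi⁴. Both function names are hypothetical.

```python
def exponential_conditional_moments(mu_i):
    """Mean, variance, third and fourth central moments of an exponential
    distribution with mean mu_i (an assumed law, used only for illustration)."""
    return mu_i, mu_i ** 2, 2.0 * mu_i ** 3, 9.0 * mu_i ** 4


def martingale_terms(x_i, mu_i, sigma2_i, gamma_i, kappa_i):
    """Linear and quadratic martingale differences with their variations."""
    m = x_i - mu_i            # linear martingale difference
    M = m * m - sigma2_i      # quadratic martingale difference
    return {
        "m": m,
        "M": M,
        "<m>": sigma2_i,                 # quadratic variation of m
        "<M>": kappa_i - sigma2_i ** 2,  # quadratic variation of M
        "<m,M>": gamma_i,                # covariation of m and M
    }


# One observation x_i = 3 with conditional mean mu_i = 2.
terms = martingale_terms(3.0, *exponential_conditional_moments(2.0))
print(terms)
```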

slide-11
SLIDE 11

General Framework Cont’d

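Consider the class $\mathcal{M}$ of zero-mean, square integrable p-dimensional martingale estimating functions,

$$\mathcal{M} = \left\{ g_n(\theta) : g_n(\theta) = \sum_{i=1}^{n} \left( a_{i-1}(\theta)\, m_i(\theta) + b_{i-1}(\theta)\, M_i(\theta) \right) \right\},$$

where $a_{i-1}(\theta)$ and $b_{i-1}(\theta)$ are $p \times q$ matrices that are functions of $\theta$ and $x_1, \ldots, x_{i-1}$, $1 \le i \le n$.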
slide-12
SLIDE 12

Three Approaches

◮ Nonlinear Equation Solver Estimation (NESE): solve the system of nonlinear equations $g^*_C(\theta) = 0$ for $\theta$
◮ Approximate Vector Recursive Estimation (AVRE): estimate $\theta$ via recursive formulas
◮ Approximate Iterated Scalar Recursive Estimation (AISRE): estimate $\theta$ through a sequence of scalar recursions, one for each component, iterated to convergence

slide-13
SLIDE 13

Starting Values for the Recursion

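Suppose $\{x_i\}$ follows the Log ACD(p, q) model, and let $y_i = \log x_i$ be the natural logarithm of $x_i$. Then

$$
\begin{aligned}
y_i &= \psi_i + \log \varepsilon_i - \log \mu_\varepsilon \\
    &= \omega + \sum_{j=1}^{p} \alpha_j y_{i-j} + \sum_{j=1}^{q} \beta_j \psi_{i-j} + \log \varepsilon_i - \log \mu_\varepsilon \\
    &= \omega + \sum_{j=1}^{p} \alpha_j y_{i-j} + \sum_{j=1}^{q} \beta_j \left( y_{i-j} - \log \varepsilon_{i-j} + \log \mu_\varepsilon \right) + \log \varepsilon_i - \log \mu_\varepsilon \\
    &= \omega^\star + \sum_{j=1}^{p} \alpha_j y_{i-j} + \sum_{j=1}^{q} \beta_j y_{i-j} - \sum_{j=1}^{q} \beta_j \nu_{i-j} + \nu_i,
\end{aligned}
$$

from which it follows that $y_i = \log x_i$ follows an ARMA(max(p, q), q) model with non-normal errors, i.e.,

$$\left( 1 - \sum_{j=1}^{\max(p,q)} (\alpha_j + \beta_j) B^j \right) y_i = \omega^\star + \left( 1 - \sum_{j=1}^{q} \beta_j B^j \right) \nu_i.$$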
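The ARMA representation can be checked numerically. The sketch below simulates a Log ACD(1, 1) path and confirms that y_i − (α + β) y_{i−1} equals ω⋆ + ν_i − β ν_{i−1} at every step. The slide does not spell out ν_i or ω⋆, so the code assumes ν_i = log ε_i − E[log ε_i] and ω⋆ = ω + (E[log ε_i] − log µε)(1 − β); it also assumes Exp(1) innovations, for which µε = 1 and E[log ε_i] is minus the Euler–Mascheroni constant. The function name is hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # E[log eps] = -EULER_GAMMA when eps ~ Exp(1)


def max_arma_identity_gap(n=2000, omega=0.04, alpha=0.05, beta=0.75, seed=1):
    """Largest absolute gap between the two sides of the ARMA(1, 1) identity."""
    rng = random.Random(seed)

    def log_eps():
        e = rng.expovariate(1.0)
        while e == 0.0:  # guard against log(0)
            e = rng.expovariate(1.0)
        return math.log(e)

    c = -EULER_GAMMA                       # assumed centering constant
    omega_star = omega + c * (1.0 - beta)  # log(mu_eps) = 0 since mu_eps = 1
    psi = 0.0
    le = log_eps()
    y_prev, nu_prev = psi + le, le - c
    gap = 0.0
    for _ in range(n):
        psi = omega + alpha * y_prev + beta * psi  # Log ACD(1, 1) recursion
        le = log_eps()
        y, nu = psi + le, le - c
        lhs = y - (alpha + beta) * y_prev
        rhs = omega_star + nu - beta * nu_prev
        gap = max(gap, abs(lhs - rhs))
        y_prev, nu_prev = y, nu
    return gap


print(max_arma_identity_gap() < 1e-9)
```

Because the identity holds exactly, fitting an ARMA model to the log durations is one plausible way to obtain starting values for the recursions.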

slide-14
SLIDE 14

Simulation Study

Table: Percentiles of parameter estimates for the Log ACD(p, q) models for L = 250 simulated durations of length n = 7500.

| fε(·) | Param | True | NESE 5th | NESE 50th | NESE 95th | AVRE 5th | AVRE 50th | AVRE 95th | AISRE 5th | AISRE 50th | AISRE 95th |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Gamma(0.6, 0.7) | ω | 0.25 | 0.23 | 0.25 | 0.26 | 0.23 | 0.25 | 0.26 | 0.24 | 0.25 | 0.27 |
| | α | 0.06 | 0.04 | 0.06 | 0.08 | 0.04 | 0.06 | 0.08 | 0.05 | 0.06 | 0.08 |
| Exp(1) | ω | 0.04 | 0.02 | 0.04 | 0.07 | 0.02 | 0.04 | 0.18 | 0.03 | 0.04 | 0.06 |
| | α | 0.05 | 0.03 | 0.05 | 0.07 | 0.02 | 0.05 | 0.24 | 0.04 | 0.05 | 0.07 |
| | β | 0.75 | 0.42 | 0.74 | 0.89 | 0.48 | 0.73 | 0.83 | 0.62 | 0.73 | 0.83 |
| Weibull(0.4, 0.5) | ω | 1.00 | 0.37 | 1.12 | 3.65 | 0.63 | 1.06 | 1.83 | 0.63 | 1.08 | 1.90 |
| | α | 0.05 | 0.01 | 0.05 | 0.08 | −0.03 | 0.05 | 0.26 | 0.04 | 0.05 | 0.07 |
| | β | 0.60 | −0.45 | 0.55 | 0.85 | 0.32 | 0.58 | 0.75 | 0.29 | 0.57 | 0.75 |
| Weibull(0.9, 0.9) | ω | 0.50 | 0.37 | 0.51 | 0.68 | 0.42 | 0.51 | 0.62 | 0.42 | 0.51 | 0.62 |
| | α1 | 0.05 | 0.03 | 0.05 | 0.07 | 0.03 | 0.05 | 0.07 | 0.03 | 0.05 | 0.07 |
| | α2 | 0.10 | 0.07 | 0.10 | 0.13 | 0.07 | 0.10 | 0.13 | 0.08 | 0.10 | 0.12 |
| | β | 0.60 | 0.47 | 0.59 | 0.69 | 0.52 | 0.59 | 0.65 | 0.52 | 0.59 | 0.65 |
| Gamma(0.5, 0.8) | ω | 0.15 | 0.07 | 0.18 | 0.63 | −0.15 | 0.20 | 0.55 | 0.09 | 0.19 | 1.61 |
| | α1 | 0.10 | 0.08 | 0.10 | 0.11 | −0.02 | 0.10 | 0.22 | 0.08 | 0.10 | 0.12 |
| | α2 | −0.05 | −0.07 | −0.04 | 0.01 | −0.55 | −0.04 | 0.19 | −0.07 | −0.04 | −0.01 |
| | β1 | 0.05 | −0.54 | 0.02 | 0.15 | −0.34 | −0.01 | 0.37 | −0.26 | 0.01 | 0.14 |
| | β2 | 0.70 | 0.28 | 0.68 | 0.78 | 0.11 | 0.66 | 0.78 | 0.45 | 0.67 | 0.76 |

slide-15
SLIDE 15

Penalized Estimating Equations

◮ Penalized methods are usually used in regression settings
◮ However, the literature on variable selection in estimating equations is sparse
◮ Wang et al. (2012) and the references therein discuss penalized generalized estimating equations in the longitudinal setting

slide-16
SLIDE 16

Penalized Estimating Equations Cont’d

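◮ Recap

$$g_n(\theta) = \sum_{i=1}^{n} \left( a_{i-1}(\theta)\, m_i(\theta) + b_{i-1}(\theta)\, M_i(\theta) \right)$$

◮ Now

$$g^*_C(\theta) - n\, p'_\lambda(|\theta|)$$

where $p'_\lambda(|\theta|)$ is the first derivative of the Smoothly Clipped Absolute Deviation (SCAD) penalty (Fan et al. 2001), defined as

$$p'_\lambda(|\theta|) = \lambda \left\{ I(|\theta| \le \lambda) + \frac{(a\lambda - |\theta|)_+}{(a - 1)\lambda}\, I(|\theta| > \lambda) \right\}$$

◮ Remark: SCAD can achieve unbiasedness (unlike LASSO), sparsity, and continuity.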
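The SCAD derivative is simple to code directly from its definition. The sketch below assumes λ > 0 and a > 2, and uses a = 3.7 as the default, a value commonly suggested in the SCAD literature; the function name is hypothetical.

```python
def scad_derivative(theta, lam, a=3.7):
    """First derivative p'_lambda(|theta|) of the SCAD penalty.

    Equals lam for |theta| <= lam, decays linearly to zero on
    (lam, a * lam], and is zero beyond a * lam.
    """
    if lam <= 0 or a <= 2:
        raise ValueError("need lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        return lam
    # lam * (a*lam - t)_+ / ((a - 1) * lam) simplifies to the line below
    return max(a * lam - t, 0.0) / (a - 1.0)


for value in (0.5, 2.0, 4.0):
    print(value, scad_derivative(value, lam=1.0))
```

Small coefficients receive the full penalty rate λ, which encourages sparsity, while coefficients larger than aλ receive no penalty at all, which is the source of the unbiasedness property noted in the remark.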

slide-17
SLIDE 17

Illustrative Simulation Study

Table: Percentiles of parameter estimates for the Log ACD(p, 0) models for L = 500 durations of length n = 7500.

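| fε(·) | Param | True | EF with penalty: 5th | EF with penalty: 50th | EF with penalty: 95th |
|---|---|---|---|---|---|
| Gamma(0.5, 0.6) | ω | 0.25 | 0.22 | 0.25 | 0.27 |
| | α1 | 0.20 | 0.19 | 0.20 | 0.20 |
| | α2 | 0.10 | 0.09 | 0.10 | 0.10 |
| Gamma(0.5, 0.6) | ω | 0.10 | 0.06 | 0.09 | 0.14 |
| | α1 | 0.10 | 0.09 | 0.10 | 0.10 |
| | α2 | 0.05 | 0.04 | 0.05 | 0.05 |
| | α3 | 0.05 | 0.04 | 0.05 | 0.05 |
| | α4 | 0.10 | 0.09 | 0.10 | 0.10 |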

slide-18
SLIDE 18

Illustrative Simulation Study Cont’d

Figure: Solution path of the estimates θ = (ω, α1, ..., α20) against λ for the simulated LogACD(2, 0) model. The vertical bar represents the optimal λ.

slide-19
SLIDE 19

Application on Real Stock Prices

◮ Stock price data from a few trading days in June 2013 for four assets (BAC, GE, IBM, and MMM)
◮ The data set is obtained from the Trade and Quotes (TAQ) database at Wharton Research Data Services (WRDS) from the Wharton School at the University of Pennsylvania
◮ An event occurs when the change in log return between two successive transactions exceeds a given threshold, ϖ. We then define a duration as the elapsed time between two successive occurrences of this event
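The event-based definition of a duration can be sketched as follows. The code takes one possible reading of the rule, flagging an event whenever the absolute log return between two successive transactions exceeds the threshold; the function name, the toy prices, and the threshold value are all hypothetical.

```python
import math


def event_durations(times, prices, threshold):
    """Elapsed times between successive threshold-crossing events.

    An event is flagged at transaction k when |log(p_k / p_{k-1})|
    exceeds the threshold (an assumed reading of the event rule).
    """
    event_times = [
        times[k]
        for k in range(1, len(prices))
        if abs(math.log(prices[k] / prices[k - 1])) > threshold
    ]
    # Durations are gaps between consecutive event times.
    return [t1 - t0 for t0, t1 in zip(event_times, event_times[1:])]


times = [0, 1, 3, 6, 10, 15]                       # seconds since open
prices = [100.0, 100.5, 100.5, 101.2, 101.2, 100.0]
print(event_durations(times, prices, threshold=0.004))
```

In this toy series, events occur at times 1, 6, and 15, so the resulting durations are 5 and 9 seconds.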

slide-20
SLIDE 20

Diurnal Effect

Following Tsay (2005), we adjust the raw durations to remove the time-of-day effect:

$$X_i = \frac{x_i}{\phi(t_i)}.$$

[Figure: Time of Day effect. Mean of durations (about 2.5 to 4.5) plotted against time in seconds (1000 to 7000).]
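A rough sketch of a diurnal adjustment is given below, reading the adjustment as dividing each raw duration by a time-of-day factor φ(t_i). Here φ is estimated by the mean duration within fixed-width time bins, which is a simplification chosen for illustration and not necessarily the smoother used in the slides. The function name and the toy data are hypothetical, and durations are assumed to be strictly positive.

```python
def diurnal_adjust(times, durations, bin_width):
    """Divide each duration by the mean duration of its time-of-day bin."""
    sums, counts = {}, {}
    for t, x in zip(times, durations):
        b = int(t // bin_width)            # index of the time-of-day bin
        sums[b] = sums.get(b, 0.0) + x
        counts[b] = counts.get(b, 0) + 1
    phi = {b: sums[b] / counts[b] for b in sums}  # bin-mean estimate of phi
    return [x / phi[int(t // bin_width)] for t, x in zip(times, durations)]


times = [0, 10, 20, 1000, 1010]
raw = [2.0, 4.0, 6.0, 1.0, 3.0]
print(diurnal_adjust(times, raw, bin_width=500))
```

After the adjustment, every bin has mean duration one, so the remaining variation reflects dynamics other than the time of day.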

slide-21
SLIDE 21

Model fitting for BAC

Table: Parameter estimates for adjusted BAC durations in June 2013.

Date ω̂ α̂1 α̂2 α̂3 β̂1 β̂2 β̂3 d
20130603 0.227 0.047 0.001
20130604 0.319 0.103 0.005
20130605 0.258 0.074 0.001
20130606 0.289 0.093 0.006
20130607 0.261 0.034 0.000
20130610 0.299 0.011 0.005
20130611 −0.009 −1.146 0.054
20130612 0.330 0.051 0.001
20130613 0.293 0.025 0.001
20130614 0.137 0.046 0.048 −0.057 0.753 0.111
20130617 −0.015 0.555 0.305
20130618 0.045 0.014 0.041 −0.009 0.167 0.394 0.341 0.162
20130619 −0.501 −0.625 0.767
20130620 0.240 0.066 0.002
20130621 0.213 0.059 0.002

slide-22
SLIDE 22

Model Fitting Using Penalized EF

Table: Parameter estimates for adjusted durations in June 2013 using penalized estimating functions.

| Date | BAC λ | BAC ω̂ | BAC d | GE λ | GE ω̂ | GE d | IBM λ | IBM ω̂ | IBM d | MMM λ | MMM ω̂ | MMM d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20130603 | 0.7 | 0.220 | 0.010 | 0.5 | 0.275 | 0.009 | 10 | 0.425 | 0.023 | 14 | 0.428 | 0.025 |
| 20130604 | 0.7 | 0.306 | 0.018 | 0.7 | 0.358 | 0.012 | 10 | 0.421 | 0.029 | 20 | 0.447 | 0.029 |
| 20130605 | 0.5 | 0.251 | 0.012 | 1.0 | 0.364 | 0.026 | 10 | 0.405 | 0.041 | 41 | 0.463 | 0.022 |
| 20130606 | 0.5 | 0.255 | 0.007 | 1.0 | 0.341 | 0.021 | 10 | 0.426 | 0.031 | 50 | 0.437 | 0.055 |
| 20130607 | 1.0 | 0.289 | 0.020 | 1.0 | 0.335 | 0.019 | 10 | 0.442 | 0.030 | 34 | 0.467 | 0.024 |
| 20130610 | 0.7 | 0.315 | 0.019 | 1.0 | 0.344 | 0.016 | 10 | 0.437 | 0.036 | 24 | 0.430 | 0.027 |
| 20130611 | 0.5 | 0.324 | 0.009 | 1.0 | 0.369 | 0.013 | 10 | 0.437 | 0.032 | 50 | 0.475 | 0.037 |
| 20130612 | 0.5 | 0.285 | 0.010 | 1.0 | 0.397 | 0.010 | 10 | 0.417 | 0.037 | 2 | 0.458 | 0.025 |
| 20130613 | 0.7 | 0.316 | 0.019 | 1.0 | 0.360 | 0.027 | 10 | 0.427 | 0.021 | 8 | 0.419 | 0.048 |
| 20130614 | 0.7 | 0.341 | 0.013 | 1.0 | 0.386 | 0.035 | 10 | 0.451 | 0.032 | 88 | 0.505 | 0.034 |
| 20130617 | 0.7 | 0.316 | 0.019 | 0.7 | 0.332 | 0.008 | 10 | 0.443 | 0.028 | 11 | 0.401 | 0.020 |
| 20130618 | 0.7 | 0.341 | 0.013 | 1.0 | 0.318 | 0.035 | 10 | 0.464 | 0.040 | 24 | 0.445 | 0.016 |
| 20130619 | 1.0 | 0.339 | 0.040 | 0.7 | 0.285 | 0.013 | 40 | 0.454 | 0.072 | 90 | 0.499 | 0.085 |
| 20130620 | 1.0 | 0.231 | 0.013 | 0.5 | 0.271 | 0.011 | 1.0 | 0.359 | 0.010 | 11 | 0.401 | 0.020 |
| 20130621 | 1.0 | 0.203 | 0.015 | 0.5 | 0.269 | 0.007 | 1.0 | 0.323 | 0.025 | 24 | 0.445 | 0.016 |

slide-23
SLIDE 23

Summary

◮ Three different estimation approaches for modeling durations using Log ACD(p, q)
◮ Good starting values are important
◮ Our method is naturally appealing for a wide range of financial modeling problems
◮ The penalized approach may encounter overfitting problems

slide-24
SLIDE 24

Questions?

Thank You!