Higher-Order Conditional Moment Tests: A Simple Robust Approach

Yi-Ting Chen
Institute of Economics, Academia Sinica
(Incomplete draft)

Abstract

In this paper, we propose a simple approach to purge the estimation uncertainty effect on the higher-order conditional moment (CM) tests for checking the standardized error distributions of GARCH-type models. Our approach differs from the Newey (1985)-Tauchen (1985) and Wooldridge (1990) approaches in both ideas and results. By utilizing the sample mean and variance of the standardized residuals, we establish a class of higher-order CM tests that are free of the estimation uncertainty and robust to the $\sqrt{T}$-consistent estimators of the partially specified (conditional mean and variance) models. Importantly, our test statistics do not depend on the conditional mean and variance derivatives, and hence they are model-invariant. This is a very attractive property in practical applications. As demonstrative examples, we apply our approach to generate some simple tests for the predictability of asymmetry and for the symmetry, normality, and some non-Gaussian distributions of standardized errors. Our approach encompasses the skewness-kurtosis-based normality test of Kiefer and Salmon (1983).

Keywords: higher-order conditional moment test, estimation uncertainty, normality, symmetry, standardized error distribution.
JEL Classifications: C12, G19
†Correspondence to: Yi-Ting Chen, Institute of Economics, Academia Sinica, Taipei 115, Taiwan. Tel:+886-2-27822791-622; Fax:+886-2-27853946; E-mail address: ytchen@gate.sinica.edu.tw ‡ This study is supported by the research grant NSC 94-2415-H-001-00.
1 Introduction

GARCH-type models have been widely used in modelling financial volatilities. Given their conditional mean specifications, such models interpret the return volatilities by using their conditional variance specifications. For a variety of well-documented economic and statistical reasons, such as Value-at-Risk evaluation, estimation efficiency, and density forecasting, it is important to extend such a model from a partially specified (conditional mean and variance) model to a fully specified (conditional distribution) model. It is therefore useful to have some simple and valid tests that can help us explore the unknown standardized error distribution in this bottom-up model-building process.

In the spirit of the conditional moment (CM) testing method of Newey (1985), Tauchen (1985), and Wooldridge (1990), we may establish such tests by using the higher-order moments, such as but not restricted to the skewness and kurtosis, of the standardized residuals of the estimated model, provided that the problem of estimation uncertainty is properly dealt with. This problem, also known as "Durbin's problem", arises because parameter estimation generally makes the asymptotic null distribution of a standardized-residuals-based test statistic different from that of its standardized-errors-based counterpart. Moreover, this difference tends to be model-specific and estimation-method-specific. Therefore, we cannot establish asymptotically valid higher-order CM tests without properly accounting for this effect. This also reminds us that existing tests designed for a specific model (estimation method) may not always be valid for general models (estimation methods).

Nonetheless, this point is often overlooked in practice. For example, the Jarque-Bera (1980, JB) normality test is routinely applied to the standardized residuals of GARCH-type models, even though this test was originally designed for the linear regression with an intercept and conditionally homoskedastic errors, estimated by the ordinary least squares (OLS) method. Theoretically, it is the Kiefer-Salmon (1983, KS) test, a variant of the JB test, rather than the conventional JB test, that is ensured to be valid in the presence of estimation uncertainty; see Bontemps and Meddahi (2005a). The classical Kolmogorov-type test in such applications suffers from a similar problem; see Bai (2003).

The Newey (1985)-Tauchen (1985) approach and the Wooldridge (1990) approach adopt different strategies to deal with this problem. On the basis of the maximum likelihood (ML) method, the former constructs asymptotically valid tests by calculating the estimation uncertainty. By contrast, the latter establishes asymptotically valid tests by purging this undesirable effect through a regression-type "orthogonal transformation". Recently, Bontemps and Meddahi (2005b) applied a method similar to the Wooldridge approach to establish their moment-based tests for the standardized error distribution.

Compared to the Newey-Tauchen approach, the Wooldridge approach has two appealing properties. First, it is robust to the $\sqrt{T}$-consistent estimators of the partially specified model, with $T$ denoting the sample size. Second, it can check the predictability of the standardized error distribution without assuming a fully specified model. These two properties are both due to the fact that this approach purges, rather than calculates, the estimation uncertainty. Nonetheless, both approaches depend on the conditional mean and variance derivatives in their practical applications, so that the resulting higher-order CM tests are not model-invariant. Indeed, GARCH-type models typically imply certain recursive types of conditional mean and variance derivatives that may not be simple to compute, especially when the models are complicated. It is therefore very desirable to have other simple tests that are free of this inconvenience (model-variant non-robustness) in practical applications.

In this paper, we propose a simple approach for this purpose. It should be noted that, unlike the Newey-Tauchen and Wooldridge approaches, which may also be applied to establish conditional mean and variance tests, our approach is focused on the higher-order CM tests. Our idea is quite different from these two existing approaches. Intuitively, given the partially specified model, the sample mean and variance of standardized residuals should contain no information about the distributional shape of standardized errors, but they may contribute very important information about the estimation uncertainty. Consequently, we may utilize these two lower-order moments to purge this undesirable effect on the higher-order CM tests. Following this idea, we establish our tests for checking the (un-)predictability of the standardized error distribution and the adequacy of the postulated unconditional standardized error distribution. Testing these two hypotheses has important implications for building the fully specified GARCH-type model, as we will discuss. Interestingly, the proposed tests share the same appealing properties as the higher-order CM tests generated by the Wooldridge approach. However, our tests are free of the conditional mean and variance derivatives, and hence they are model-invariant. This is a very attractive convenience (model-invariant robustness) not shared by the tests derived from the Newey-Tauchen and Wooldridge approaches.

The remainder of this paper is organized as follows. In Section 2, we discuss the problem of estimation uncertainty and the existing approaches. In Section 3, we demonstrate our approach and the proposed tests. In Section 4, we illustrate the applicability of our approach in generating some simple and valid tests for the predictability of asymmetry and for the symmetry, normality, and non-normal distributions of standardized errors. As we will see, this includes the KS test as a particular case. (Section 5, the simulation study, is to be added in future versions.) Finally, we conclude this paper in Section 6. Some mathematical proofs are provided in the Appendix.
2 The existing approaches
Let $y_t$ be a univariate dependent variable, $X_t$ be the vector of the explanatory variables available at time $t$, and $x_t$ be a sub-vector of $X_t$. We consider the following partially specified model:

$$y_t = \nu_t(x_t, \alpha) + u_t, \qquad u_t = v_t\, h_t(x_t, \alpha)^{1/2}, \tag{1}$$

where $\nu_t := \nu_t(x_t, \alpha)$ and $h_t := h_t(x_t, \alpha)$ are, respectively, the conditional mean and variance specifications of $y_t|X_t$; $\alpha \in A \subset \mathbb{R}^p$ denotes a $p \times 1$ parameter vector for some $p \in \mathbb{N}$, and $A$ is a compact subset of $\mathbb{R}^p$; $u_t = u_t(\alpha)$ denotes the regression error; $x_t = x_t(\alpha)$ if it includes $u_{t-k}$ for some $k > 0$; $v_t = v_t(\alpha)$ is the standardized error with $\mathbb{E}[v_t] = 0$ and $\mathbb{E}[v_t^2] = 1$. This framework includes a variety of GARCH-type models that have various $h_t$'s. It also encompasses different static or dynamic, linear or nonlinear, regression models that may have conditionally homoskedastic or heteroskedastic regression errors. Therefore, the tests established under this framework are applicable not only to GARCH-type models but also to many other particular models.

Suppose that, after certain efforts of model-building and hypothesis-testing, we believe, or assume, that model (1) has the correct conditional mean and variance specifications for $y_t|X_t$ in the sense that there exists a unique true parameter $\alpha_o \in \mathrm{int}(A)$ such that

$$\mathbb{E}[v_{ot}|X_t] = 0 \tag{2}$$

and

$$\mathbb{E}[v_{ot}^2|X_t] = 1, \tag{3}$$

where $v_{ot} := v_t(\alpha_o)$. Given this maintained assumption, denoted as assumption [A] hereafter, we are interested in further studying the distribution of $v_{ot}$. This scenario is consistent with the bottom-up model-building process, see Wooldridge (1991), and is widely adopted by the empirical studies of GARCH-type models. Most tests for the standardized error distribution also require this assumption; see, e.g., Jarque and Bera (1980), Bai and Ng (2001), and Bai (2003). Indeed, if [A] is not satisfied, then we should modify the conditional mean and variance specifications before studying a sensible standardized error distribution; otherwise, the distribution being tested is unlikely to be correct even under the null hypothesis.
Let $F(\cdot|X_t)$ be the conditional distribution of $v_{ot}|X_t$ and $F_o(\cdot)$ be the unconditional distribution of $v_{ot}$. If the standardized error distribution is unpredictable in the sense that

$$F(\cdot|X_t) = F_o(\cdot), \tag{4}$$

then we may approximate the unknown $F_o$ by using a postulated distribution $G$ that has the probability density function $g$. Note that $G$ may include an $r \times 1$ parameter vector $\beta \in B \subset \mathbb{R}^r$ for some $r \in \mathbb{N}$; $G$ could also be free of unknown parameters in certain cases. Given (4), the postulated distribution is adequate if

$$F_o(\cdot) = G(\cdot), \quad \text{or} \quad F_o(\cdot) = G(\cdot, \beta_o) \ \text{for some } \beta_o \in \mathrm{int}(B) \text{ when } G \text{ includes } \beta. \tag{5}$$

In empirical studies, researchers often base the fully specified GARCH-type models on these two conditions; see, e.g., Engle (1982) and Bollerslev (1986) for the standard normal distribution, Bollerslev (1987) for the standardized t distribution, Nelson (1991) for the generalized error distribution, Wang et al. (2001) for the exponential generalized beta type II distribution, Premaratne and Bera (2001) for the Pearson type IV distribution, and Bera and Park (2005) for the maximum entropy distribution, among many others. In this conventional modelling approach, it is important to assess the adequacy of $G$ for a variety of economic and statistical reasons, as mentioned before. This has motivated some recent studies to propose tests for condition (5); see, e.g., Bai and Ng (2001) for testing symmetry, Bai (2003) for a corrected Kolmogorov-type test, Bontemps and Meddahi (2005a) for testing normality, Bontemps and Meddahi (2005b) for some moment-based distribution tests, and the references therein. These tests commonly treat (4) as another maintained assumption in addition to [A].

Testing condition (4) is of course also important in the model-building process, especially for evaluating the validity of the conventional modelling approach. If this condition is not satisfied, then the standardized error distribution is predictable. In this case, it may be essential to consider certain types of "autoregressive conditional density" models, such as those of Hansen (1994), Harvey and Siddique (1999, 2000), and Rockinger and Jondeau (2002), to interpret the conditional distribution of standardized errors. In the following discussion, we establish the higher-order CM tests for these two conditions in the spirit of Newey (1985), Tauchen (1985), and Wooldridge (1990).
2.1 The CM testing method
Let $\varphi$ be a one-dimensional function of $v_t$ (and $\beta$ if the hypothesis being tested is (5) and $G$ includes $\beta$) that is continuously differentiable with respect to $\alpha$ (and $\beta$). Denote $\varphi_t := \varphi(v_t, \beta)$ and $\varphi_{ot} := \varphi(v_{ot}, \beta_o)$. Note that $\varphi_t$ does not depend on $\beta$ in testing hypothesis (4), or in testing hypothesis (5) when $G$ does not include an unknown parameter. Hypothesis (4) typically implies certain testable higher-order CM restrictions:

$$\mathbb{E}[\varphi_{ot}|X_t] = 0 \tag{6}$$

such that $\varphi_t$ is not a linear combination of $v_t$ and $v_t^2 - 1$. This permits us to check this hypothesis by testing whether $\varphi_{ot}$ is correlated with an $X_t$-based testing indicator. Let $\zeta_t := \zeta_t(\alpha, \gamma)$ be such a $q$-dimensional testing indicator, that is, $\mathbb{E}[\zeta_t|X_t] = \zeta_t$, for some $q \in \mathbb{N}$, which is continuously differentiable with respect to $\alpha$ and a finite-dimensional nuisance parameter vector $\gamma$. We assume that, under hypothesis (4), the nuisance parameter estimator, denoted as $\hat\gamma_T$, has the property

$$\sqrt{T}(\hat\gamma_T - \gamma_{oT}) = O_p(1) \tag{7}$$

for some non-stochastic $\gamma_{oT}$, as in Wooldridge (1990, p. 23). Denote $\zeta_{ot} := \zeta_t(\alpha_o, \gamma_{oT})$. We base the tests for hypothesis (4) on the moment condition

$$\mathbb{E}[\varphi_{ot}\zeta_{ot}] = 0 \tag{8}$$

and the tests for (5) on the moment condition

$$\mathbb{E}[\varphi_{ot}] = 0. \tag{9}$$

The latter is a special case of the former corresponding to $\zeta_t = 1$. (Practical examples will be shown in Section 4.)
2.2 The estimation uncertainty
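Let $\nabla_\alpha \nu_t$ and $\nabla_\alpha h_t$ be, respectively, the $p \times 1$ vectors of the conditional mean and variance derivatives taken with respect to $\alpha$. (Other derivatives are similarly defined.) We also define the re-scaled conditional mean and variance derivatives $w_t := (\nabla_\alpha \nu_t) h_t^{-1/2}$ and $z_t := (\nabla_\alpha h_t) h_t^{-1}$, the gradients $s_{\alpha t} := \nabla_\alpha \ln g(v_t, \beta)$ and $s_{\beta t} := \nabla_\beta \ln g(v_t, \beta)$, and the partial derivatives $\phi_{vt} := \frac{\partial}{\partial v_t}\varphi_t$, $\phi_{\beta t} := \nabla_\beta \varphi_t$, $\xi_{\alpha t} := \nabla_\alpha \zeta_t$, $\xi_{\gamma t} := \nabla_\gamma \zeta_t$, and $a_{\beta t} := \frac{\partial}{\partial v_t} s_{\beta t}$.

In addition, we let $\hat\alpha_T$ and $\hat\beta_T$ be the estimators for $\alpha_o$ and $\beta_o$, respectively, and denote $\nu_t$, $h_t$, $w_t$, $z_t$, $s_{\alpha t}$, $s_{\beta t}$, $\phi_{vt}$, $\phi_{\beta t}$, $\xi_{\alpha t}$, $\xi_{\gamma t}$, and $a_{\beta t}$ as $\hat\nu_t$, $\hat h_t$, $\hat w_t$, $\hat z_t$, $\hat s_{\alpha t}$, $\hat s_{\beta t}$, $\hat\phi_{vt}$, $\hat\phi_{\beta t}$, $\hat\xi_{\alpha t}$, $\hat\xi_{\gamma t}$, and $\hat a_{\beta t}$, respectively, when $(\alpha^\top, \beta^\top, \gamma^\top)^\top = (\hat\alpha_T^\top, \hat\beta_T^\top, \hat\gamma_T^\top)^\top$, and as $\nu_{ot}$, $h_{ot}$, $w_{ot}$, $z_{ot}$, $s^o_{\alpha t}$, $s^o_{\beta t}$, $\phi^o_{vt}$, $\phi^o_{\beta t}$, $\xi^o_{\alpha t}$, $\xi^o_{\gamma t}$, and $a^o_{\beta t}$, respectively, when $(\alpha^\top, \beta^\top, \gamma^\top)^\top = (\alpha_o^\top, \beta_o^\top, \gamma_{oT}^\top)^\top$. Given the standardized residual

$$\hat v_t := \frac{y_t - \hat\nu_t}{\hat h_t^{1/2}} = -\frac{\hat\nu_t - \nu_{ot}}{\hat h_t^{1/2}} + v_{ot}\left(\frac{h_{ot}}{\hat h_t}\right)^{1/2},$$

we can define a feasible statistic

$$M_T := \frac{1}{T}\sum_{t=1}^{T} \hat\varphi_t \hat\zeta_t,$$

where $\hat\varphi_t := \varphi(\hat v_t, \hat\beta_T)$ and $\hat\zeta_t := \zeta_t(\hat\alpha_T, \hat\gamma_T)$. The higher-order CM test checks the significance of this statistic or its proper variants. To establish the asymptotics of our tests, we make the following assumptions:

[B1] $\sqrt{T}$-consistency: $\sqrt{T}(\hat\alpha_T - \alpha_o) = O_p(1)$ under assumption [A], $\sqrt{T}(\hat\beta_T - \beta_o) = O_p(1)$ under [A], (4), and (5), and $\sqrt{T}(\hat\gamma_T - \gamma_{oT}) = O_p(1)$ as noted in (7).

[B2] uniform weak law of large numbers (UWLLN) and stationary ergodicity: the sequences
(i) $\{\phi_{vt}\zeta_t w_t\}$, $\{\phi_{vt} v_t \zeta_t z_t\}$, $\{\varphi_t \xi_{\alpha t}\}$, $\{\zeta_t \phi_{\beta t}\}$, $\{\varphi_t \xi_{\gamma t}\}$,
(ii) $\{\zeta_t w_t\}$, $\{v_t \zeta_t z_t\}$, $\{v_t \xi_{\alpha t}\}$, $\{v_t \xi_{\gamma t}\}$, $\{v_t \zeta_t w_t\}$, $\{v_t^2 \zeta_t z_t\}$, $\{(v_t^2 - 1)\xi_{\alpha t}\}$, $\{(v_t^2 - 1)\xi_{\gamma t}\}$,
(iii) $\{\nabla_{\beta^\top} s_{\beta t}\}$, $\{a_{\beta t} w_t\}$, and $\{a_{\beta t} v_t w_t\}$
satisfy the UWLLN, and they are stationary-ergodic with elements that have finite absolute expectations when $(\alpha^\top, \beta^\top, \gamma^\top)^\top = (\alpha_o^\top, \beta_o^\top, \gamma_{oT}^\top)^\top$.

Bollerslev and Wooldridge (1992) showed the $\sqrt{T}$-consistency of $\hat\alpha_T$ for $\alpha_o$ under [A] and certain regularity conditions when $\hat\alpha_T$ is the Gaussian quasi-MLE (QMLE). In addition, given [A], (4), and (5), we have a correct fully specified model:

$$F(y_t|X_t) = G\left(\frac{y_t - \nu_{ot}}{h_{ot}^{1/2}}, \beta_o\right). \tag{10}$$

In practice, several studies estimate $\beta_o$ (and $\alpha_o$) by using the ML method based on assumption (10). Assumption [B2] is convenient for presenting our first-order asymptotic results, and it is standard in the literature. In the Appendix, we show the following result:

Lemma 1 Given [A], [B1], and [B2] (i), under condition (4), we have

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\varphi_t\hat\zeta_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\varphi_{ot}\zeta_{ot} - \begin{bmatrix}\mathbb{E}[\phi^o_{vt}] I_q & \tfrac{1}{2}\mathbb{E}[\phi^o_{vt}v_{ot}] I_q\end{bmatrix}\begin{bmatrix}\mathbb{E}[\zeta_{ot}w_{ot}^\top] \\ \mathbb{E}[\zeta_{ot}z_{ot}^\top]\end{bmatrix}\sqrt{T}(\hat\alpha_T - \alpha_o) + \mathbb{E}[\zeta_{ot}]\,\mathbb{E}[\phi^{o\top}_{\beta t}]\sqrt{T}(\hat\beta_T - \beta_o) + o_p(1), \tag{11}$$

where $I_q$ denotes the $q \times q$ identity matrix.

This result indicates that, because of the estimation uncertainty, the standardized-residuals-based statistic $T^{-1/2}\sum_{t=1}^{T}\hat\varphi_t\hat\zeta_t$ is not asymptotically equivalent to the standardized-errors-based statistic $T^{-1/2}\sum_{t=1}^{T}\varphi_{ot}\zeta_{ot}$ in general. Specifically, the asymptotic null distribution of $\sqrt{T}M_T$ is free of $\sqrt{T}(\hat\gamma_T - \gamma_{oT})$, but it depends on $\sqrt{T}(\hat\alpha_T - \alpha_o)$, unless

$$\mathbb{E}[\phi^o_{vt}]\,\mathbb{E}[\zeta_{ot}w_{ot}^\top] + \tfrac{1}{2}\mathbb{E}[\phi^o_{vt}v_{ot}]\,\mathbb{E}[\zeta_{ot}z_{ot}^\top] = 0, \tag{12}$$

and on $\sqrt{T}(\hat\beta_T - \beta_o)$, unless

$$\mathbb{E}[\zeta_{ot}]\,\mathbb{E}[\phi^{o\top}_{\beta t}] = 0. \tag{13}$$

If (12) and (13) do not hold, then the effect of estimation uncertainty tends to be model-specific and estimation-method-specific. Ignoring this effect may make the $M_T$-based test asymptotically invalid. This problem was addressed by Durbin (1973) and Khmaladze (1981), among others, in discussing the asymptotic validity of the estimates-based Kolmogorov-type test. Newey (1985), Tauchen (1985), and Wooldridge (1990) contributed important approaches to this problem for establishing the CM tests.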
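The size of this effect can be seen in a small Monte Carlo sketch, which is not part of the paper. It assumes the simplest special case of model (1): a constant mean and variance estimated by the sample mean and standard deviation, i.i.d. standard normal errors, $\varphi_t = v_t^3$, and $\zeta_t = 1$. The function names, sample size, and number of replications are illustrative choices. Under these assumptions the errors-based statistic has variance $\mathbb{E}[v^6] = 15$, whereas (11) implies that the residuals-based statistic behaves like $T^{-1/2}\sum(v_{ot}^3 - 3v_{ot})$, whose variance is 6.

```python
import random
import statistics

def third_moment_stats(T, rng):
    """Return sqrt(T)*mean(v^3) computed from true errors and from standardized residuals."""
    # Assumed special case of model (1): constant mean 0, variance 1, N(0,1) errors.
    v = [rng.gauss(0.0, 1.0) for _ in range(T)]
    mean = sum(v) / T
    sd = (sum((x - mean) ** 2 for x in v) / T) ** 0.5
    v_hat = [(x - mean) / sd for x in v]  # standardized residuals
    root_T = T ** 0.5
    stat_err = root_T * sum(x ** 3 for x in v) / T
    stat_res = root_T * sum(x ** 3 for x in v_hat) / T
    return stat_err, stat_res

def simulate(reps=1000, T=200, seed=0):
    """Monte Carlo variances of the errors-based and residuals-based statistics."""
    rng = random.Random(seed)
    draws = [third_moment_stats(T, rng) for _ in range(reps)]
    var_err = statistics.pvariance([d[0] for d in draws])
    var_res = statistics.pvariance([d[1] for d in draws])
    return var_err, var_res

var_err, var_res = simulate()
print(f"errors-based variance:    {var_err:.2f}  (asymptotic theory: 15)")
print(f"residuals-based variance: {var_res:.2f}  (asymptotic theory: 6)")
```

A test that standardizes the residuals-based third moment with the errors-based variance would therefore be badly undersized in this setting, which is the sense in which ignoring the correction term in (11) invalidates the test.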
2.3 Two existing approaches
In the face of this problem, we may adopt two different strategies. The first is to calculate the effect of estimation uncertainty on the asymptotic null distribution of $\sqrt{T}M_T$ in accordance with (11). The second is to purge this effect by designing some type of "transformation" that makes conditions (12) and (13) satisfied. The Newey-Tauchen approach and the Wooldridge approach adopt these two strategies, respectively.

From Lemma 1, it should be clear that the first strategy is estimation-method-specific in general. The Newey-Tauchen approach is based on the ML method. On the basis of (10), we may obtain the MLEs by maximizing the log-likelihood function

$$L_T = -\frac{1}{2T}\sum_{t=1}^{T}\ln h_t + \frac{1}{T}\sum_{t=1}^{T}\ln g\left(\frac{y_t - \nu_t}{h_t^{1/2}}, \beta\right).$$

Denote

$$S_{oT} := \left[T^{-1}\sum_{t=1}^{T} s^{o\top}_{\alpha t},\; T^{-1}\sum_{t=1}^{T} s^{o\top}_{\beta t}\right]^\top$$

and let $B_{oT}$ be the associated information matrix evaluated at $(\alpha^\top, \beta^\top)^\top = (\alpha_o^\top, \beta_o^\top)^\top$. If assumption (10) is correct, then $S_{oT}$ is a sample average of martingale differences, and we can utilize the information matrix equality to show the result

$$\begin{bmatrix}\sqrt{T}(\hat\alpha_T - \alpha_o) \\ \sqrt{T}(\hat\beta_T - \beta_o)\end{bmatrix} = B_{oT}^{-1}\sqrt{T}S_{oT} + o_p(1); \tag{14}$$

see, e.g., White (1994). By introducing (14) into (11) and using the martingale-difference central limit theorem and the Cramér-Wold device, see, e.g., White (2001), we can establish the asymptotic null distribution of $\sqrt{T}M_T$ that takes into account the effect of estimation uncertainty and construct the asymptotically valid test accordingly; see also Davidson and MacKinnon (1993, Section 16.8) for the auxiliary-regressions-based test statistics. However, as addressed by Wooldridge (1990), this approach may have two deficiencies. First, it may not be valid for other $\sqrt{T}$-consistent estimators of $\alpha_o$; see Bera and Bilias (2002) for a survey of various estimators. Second, this approach may not be able to test hypothesis (4) in a robust way, because it may reject the null hypothesis simply because (5), rather than (4), is invalid. (This deficiency may, however, be circumvented by using the Gaussian QML method in place of the ML method.)

Motivated by these problems, Wooldridge (1990) proposed another approach to deal with the estimation uncertainty effect. His approach can be immediately applied to testing hypothesis (4). Because $\varphi_t$ is free of $\beta$ in testing this hypothesis, we can simplify (11) as

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\varphi_t\hat\zeta_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\varphi_{ot}\zeta_{ot} - \begin{bmatrix}\mathbb{E}[\phi^o_{vt}] I_q & \tfrac{1}{2}\mathbb{E}[\phi^o_{vt}v_{ot}] I_q\end{bmatrix}\begin{bmatrix}\mathbb{E}[\zeta_{ot}w_{ot}^\top] \\ \mathbb{E}[\zeta_{ot}z_{ot}^\top]\end{bmatrix}\sqrt{T}(\hat\alpha_T - \alpha_o) + o_p(1). \tag{15}$$

This reminds us that we may purge the estimation uncertainty by using a transformation $\zeta^*_t$ such that

$$\frac{1}{T}\sum_{t=1}^{T}\zeta^*_t w_t^\top = 0 \quad\text{and}\quad \frac{1}{T}\sum_{t=1}^{T}\zeta^*_t z_t^\top = 0 \tag{16}$$

in place of $\zeta_t$. This replacement ensures that condition (12) is asymptotically valid. Interestingly, we can easily satisfy condition (16) by setting $\zeta^*_t$ as the OLS residuals of the auxiliary regression of $\zeta_t$ on $w_t$ and $z_t$. After this replacement, (15) becomes

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\varphi_t\hat\zeta^*_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\varphi_{ot}\zeta^*_{ot} + o_p(1), \tag{17}$$

in which $\hat\zeta^*_t = \zeta^*_t(\hat\alpha_T, \hat\gamma_T)$ and $\zeta^*_{ot} = \zeta^*_t(\alpha_o, \gamma_{oT})$. The standardized-residuals-based statistic $T^{-1/2}\sum_{t=1}^{T}\hat\varphi_t\hat\zeta^*_t$ is asymptotically equivalent to the standardized-errors-based statistic $T^{-1/2}\sum_{t=1}^{T}\varphi_{ot}\zeta^*_{ot}$. Consequently, the resulting test is free of the estimation uncertainty. Under hypothesis (4), $\{\varphi_{ot}\zeta^*_{ot}\}$ is a martingale-difference sequence, so we can establish the asymptotic normality of $T^{-1/2}\sum_{t=1}^{T}\hat\varphi_t\hat\zeta^*_t$ and the resulting test. Importantly, unlike the Newey-Tauchen approach, the validity of (17) holds for all $\sqrt{T}$-consistent estimators of $\alpha_o$, regardless of whether condition (5) is satisfied. This is a clever approach, as recommended by White (1994, p. 228). Wooldridge (1990) also showed that this approach is quite useful in generating conditional mean and variance tests that are robust to the unknown (or misspecified) conditional distribution.

However, as can be seen from Wooldridge (1990, pp. 27-28), this transformation is not applicable when $\zeta_t = 1$ and $z_t$ (or $w_t$) includes unity, because $\zeta^*_t = 0$ in this situation. This case occurs in testing hypothesis (5) when model (1) has conditionally homoskedastic errors (and an intercept in its conditional mean specification). Nonetheless, this problem can be easily solved by a method similar to that of Bontemps and Meddahi (2005b). Specifically, by setting $\zeta_t = 1$, (11) becomes

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\varphi_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\varphi_{ot} + \mathbb{E}[\nabla_\alpha\varphi_t]^\top\Big|_{(\alpha,\beta)=(\alpha_o,\beta_o)}\sqrt{T}(\hat\alpha_T - \alpha_o) + \mathbb{E}[\nabla_\beta\varphi_t]^\top\Big|_{(\alpha,\beta)=(\alpha_o,\beta_o)}\sqrt{T}(\hat\beta_T - \beta_o) + o_p(1). \tag{18}$$

Under condition (10), we have

$$\mathbb{E}[\varphi_t|X_t] = \int \varphi\left(\frac{y_t - \nu_t}{h_t^{1/2}}, \beta\right) g\left(\frac{y_t - \nu_t}{h_t^{1/2}}, \beta\right) dy_t.$$

If the differentiation and integration are interchangeable, then we have the results $\nabla_\alpha\mathbb{E}[\varphi_t|X_t] = \mathbb{E}[\nabla_\alpha\varphi_t] + \mathbb{E}[\varphi_t s_{\alpha t}] = 0$ and $\nabla_\beta\mathbb{E}[\varphi_t|X_t] = \mathbb{E}[\nabla_\beta\varphi_t] + \mathbb{E}[\varphi_t s_{\beta t}] = 0$. This implies that $\mathbb{E}[\nabla_\alpha\varphi_t] = -\mathbb{E}[\varphi_t s_{\alpha t}]$ and $\mathbb{E}[\nabla_\beta\varphi_t] = -\mathbb{E}[\varphi_t s_{\beta t}]$. Therefore, we can purge the estimation uncertainty effect by using the OLS residuals of the auxiliary regression of $\varphi_t$ on $s_{\alpha t}$ and $s_{\beta t}$ in place of the original $\varphi_t$.

In fact, the ideas of such estimation-uncertainty-purging approaches are in spirit very close to that of Neyman's (1959) $C(\alpha)$ test; see Bera and Bilias (2001, pp. 24-25) for the latter. These approaches are all based on certain regression-type orthogonal transformations. From the above discussion, we can also see that the Newey-Tauchen and Wooldridge approaches both depend on the conditional mean and variance derivatives $\nabla_\alpha\nu_t$ and $\nabla_\alpha h_t$ ($w_t$ and $z_t$) in their practical applications. Therefore, the resulting tests are not model-invariant. As noted before, it is not necessarily simple to compute these derivatives (and hence their test statistics), especially for some complicated GARCH-type models. This motivates us to propose another simple approach that preserves the advantages of the Wooldridge approach but gets rid of this disadvantage in establishing the higher-order CM tests.
3 Our approach
Our approach is based on the fact that, given assumption [A], conditions (2) and (3) imply the following two cross-moment restrictions:

$$\mathbb{E}[v_{ot}\zeta_{ot}] = 0 \tag{19}$$

and

$$\mathbb{E}[(v_{ot}^2 - 1)\zeta_{ot}] = 0, \tag{20}$$

respectively. These two restrictions are redundant for testing hypotheses (4) and (5), but their sample counterparts may contribute very valuable information about the estimation uncertainty and hence can be utilized to purge this undesirable effect. This idea is quite simple, and it is clearly different from those of the above-mentioned approaches.
3.1 The estimation-uncertainty-purging device
To formalize our idea, we define the lower-order sample moments

$$N_{1T} := \frac{1}{T}\sum_{t=1}^{T}\hat v_t\hat\zeta_t \quad\text{and}\quad N_{2T} := \frac{1}{T}\sum_{t=1}^{T}(\hat v_t^2 - 1)\hat\zeta_t.$$

In the Appendix, we show the following result:

Lemma 2 Given [A], [B1], [B2] (ii), we have

$$\begin{bmatrix}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat v_t\hat\zeta_t \\ \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(\hat v_t^2 - 1)\hat\zeta_t\end{bmatrix} = \begin{bmatrix}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}v_{ot}\zeta_{ot} \\ \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(v_{ot}^2 - 1)\zeta_{ot}\end{bmatrix} - \begin{bmatrix}\mathbb{E}[\zeta_{ot}w_{ot}^\top] \\ \mathbb{E}[\zeta_{ot}z_{ot}^\top]\end{bmatrix}\sqrt{T}(\hat\alpha_T - \alpha_o) + o_p(1). \tag{21}$$

The statistics $N_{1T}$ and $N_{2T}$ and Lemma 2 are, respectively, the counterparts of $M_T$ and Lemma 1. From (15) and (21), we can observe that the effect of estimation uncertainty on $\sqrt{T}N_{1T}$ and $\sqrt{T}N_{2T}$ is quite similar to that on $\sqrt{T}M_T$, if $\varphi_t$ is free of $\beta$. Indeed, by introducing (21) into (15), we can further show the result:

Proposition 1 Given the assumptions of Lemmas 1 and 2, under hypothesis (4), we have

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\psi_t\hat\zeta_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\psi_{ot}\zeta_{ot} + o_p(1), \tag{22}$$

in which

$$\psi_{ot} := \varphi_{ot} - \mathbb{E}[\phi^o_{vt}]\,v_{ot} - \tfrac{1}{2}\mathbb{E}[\phi^o_{vt}v_{ot}]\,(v_{ot}^2 - 1) \tag{23}$$

and $\hat\psi_t$ is the sample counterpart of $\psi_{ot}$.

This proposition shows that, similar to (17), the statistic $T^{-1/2}\sum_{t=1}^{T}\hat\psi_t\hat\zeta_t$ is also free of the estimation uncertainty, and it also holds for all $\sqrt{T}$-consistent estimators of $\alpha_o$, regardless of whether condition (5) is satisfied. Importantly, unlike (17), (22) does not include the model-specific derivatives $\nabla_\alpha\nu_t$ and $\nabla_\alpha h_t$, so that it can be utilized to construct model-invariant tests for hypothesis (4). In testing this hypothesis, the expectations $\mathbb{E}[\phi^o_{vt}]$ and $\mathbb{E}[\phi^o_{vt}v_{ot}]$ are typically unknown if we do not assume (5). In this case, we may estimate these two expectations by using their sample counterparts and hence have

$$\hat\psi_t = \hat\varphi_t - \left(\frac{1}{T}\sum_{t=1}^{T}\hat\phi_{vt}\right)\hat v_t - \frac{1}{2}\left(\frac{1}{T}\sum_{t=1}^{T}\hat\phi_{vt}\hat v_t\right)(\hat v_t^2 - 1). \tag{24}$$

In testing hypothesis (5), we treat condition (4) as another maintained assumption, similar to the standardized error distribution tests mentioned in Section 2. Therefore, Proposition 1 is also applicable to testing (5), if $G$ does not include $\beta$. Given $\zeta_t = 1$, we can re-express (22) as

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\psi_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\psi_{ot} + o_p(1). \tag{25}$$

Accordingly, we can establish a number of simple tests for standardized error distributions without unknown parameters. If $G$ implies certain closed forms of $\mathbb{E}[\phi^o_{vt}]$ and $\mathbb{E}[\phi^o_{vt}v_{ot}]$, then we may also write $\hat\psi_t$ as

$$\hat\psi_t = \hat\varphi_t - \mathbb{E}[\phi^o_{vt}]\,\hat v_t - \tfrac{1}{2}\mathbb{E}[\phi^o_{vt}v_{ot}]\,(\hat v_t^2 - 1). \tag{26}$$

In other cases, (24) remains applicable.
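As a minimal sketch of how (24) might be computed, the function below takes the standardized residuals together with the moment function and its derivative with respect to $v$. The function name `psi_hat` and its argument names are hypothetical, and the example assumes a moment function that does not depend on $\beta$.

```python
def psi_hat(v_hat, phi, dphi):
    """Purged moment function of (24).

    v_hat : list of standardized residuals
    phi   : moment function of v (free of beta by assumption)
    dphi  : derivative of phi with respect to v
    """
    T = len(v_hat)
    d_vals = [dphi(v) for v in v_hat]
    mean_d = sum(d_vals) / T                                  # sample analogue of E[d phi / d v]
    mean_dv = sum(d * v for d, v in zip(d_vals, v_hat)) / T   # sample analogue of E[(d phi / d v) v]
    return [phi(v) - mean_d * v - 0.5 * mean_dv * (v * v - 1.0) for v in v_hat]

# Third-moment example: phi(v) = v^3, so d phi / d v = 3 v^2.
psi = psi_hat([-1.0, 0.0, 1.0], phi=lambda v: v ** 3, dphi=lambda v: 3 * v ** 2)
print(psi)
```

Only the standardized residuals enter the computation; no derivative of the conditional mean or variance is needed, which is the model-invariance property emphasized above.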
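In testing hypothesis (5) when $G$ includes unknown parameters, (25) becomes inapplicable. We need an extension of our idea in this situation. To preserve the robustness of our approach to the $\sqrt{T}$-consistent estimators of $\alpha_o$ and its independence of the conditional mean and variance derivatives $\nabla_\alpha\nu_t$ and $\nabla_\alpha h_t$, we suggest estimating $\beta_o$ by using the two-stage ML (2SML) estimator (2SMLE):

$$\tilde\beta_T := \operatorname*{argmax}_{\beta\in B}\ \frac{1}{T}\sum_{t=1}^{T}\ln g(\hat v_t, \beta), \tag{27}$$

because we can express $\sqrt{T}(\tilde\beta_T - \beta_o)$ as an asymptotic linear transformation of $\sqrt{T}(\hat\alpha_T - \alpha_o)$ under the null of (5). This will permit us to extend the applicability of our approach. Moreover, the 2SML method is much easier to implement than the ML method, especially when model (10) is complicated. In the Appendix, by using the fact that $\hat\beta_T = \tilde\beta_T$ solves the estimating equation

$$\frac{1}{T}\sum_{t=1}^{T}\hat s_{\beta t} = 0, \tag{28}$$

we show the asymptotic linearity between $\sqrt{T}(\tilde\beta_T - \beta_o)$ and $\sqrt{T}(\hat\alpha_T - \alpha_o)$:

Lemma 3 Given $\hat\beta_T = \tilde\beta_T$, [A] and (4), [B1], [B2] (iii), under hypothesis (5), we have

$$\sqrt{T}(\tilde\beta_T - \beta_o) = \mathbb{E}[s^o_{\beta t}s^{o\top}_{\beta t}]^{-1}\left\{\frac{1}{\sqrt{T}}\sum_{t=1}^{T}s^o_{\beta t} - \begin{bmatrix}\mathbb{E}[a^o_{\beta t}] & \tfrac{1}{2}\mathbb{E}[a^o_{\beta t}v_{ot}]\end{bmatrix}\begin{bmatrix}\mathbb{E}[w_{ot}^\top] \\ \mathbb{E}[z_{ot}^\top]\end{bmatrix}\sqrt{T}(\hat\alpha_T - \alpha_o)\right\} + o_p(1). \tag{29}$$

Then, we further apply Lemmas 1, 2, and 3 to show the following result:

Proposition 2 Given the assumptions of Lemmas 1, 2, and 3, under hypothesis (5), we have

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\eta_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(\eta_{ot} + \kappa_o s^o_{\beta t}) + o_p(1), \tag{30}$$

where $\kappa_o := \mathbb{E}[\phi^{o\top}_{\beta t}]\,\mathbb{E}[s^o_{\beta t}s^{o\top}_{\beta t}]^{-1}$,

$$\eta_{ot} := \psi_{ot} - \kappa_o\left\{\mathbb{E}[a^o_{\beta t}]\,v_{ot} + \tfrac{1}{2}\mathbb{E}[a^o_{\beta t}v_{ot}]\,(v_{ot}^2 - 1)\right\},$$

and $\hat\eta_t$ is the sample counterpart of $\eta_{ot}$.

Similar to (25), the result shown in (30) also holds for all $\sqrt{T}$-consistent estimators of $\alpha_o$, and it is free of the conditional mean and variance derivatives. However, unlike (25), $T^{-1/2}\sum_{t=1}^{T}\hat\eta_t$ is in general not asymptotically equivalent to $T^{-1/2}\sum_{t=1}^{T}\eta_{ot}$ unless $\kappa_o = 0$ (or $\mathbb{E}[\phi^o_{\beta t}] = 0$). This is due to the fact that we calculate the $\tilde\beta_T$-associated effect before purging the $\hat\alpha_T$-associated effect. In practical applications, we can express $\hat\eta_t$ as

$$\hat\eta_t = \hat\psi_t - \hat\kappa_T\left\{\left(\frac{1}{T}\sum_{t=1}^{T}\hat a_{\beta t}\right)\hat v_t + \frac{1}{2}\left(\frac{1}{T}\sum_{t=1}^{T}\hat a_{\beta t}\hat v_t\right)(\hat v_t^2 - 1)\right\}, \tag{31}$$

where

$$\hat\kappa_T = \left(\frac{1}{T}\sum_{t=1}^{T}\hat\phi^\top_{\beta t}\right)\left(\frac{1}{T}\sum_{t=1}^{T}\hat s_{\beta t}\hat s^\top_{\beta t}\right)^{-1}.$$

The sample moments shown in this formula may also be replaced by their population expectations, if the closed forms of these expectations are available.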
3.2 The proposed tests
Recall that $\zeta_t$ is a $q$-dimensional testing indicator with the property $\mathbb{E}[\zeta_t|X_t] = \zeta_t$. Hypothesis (4) implies that $\{\psi_{ot}\zeta_{ot}\}$ is a martingale-difference sequence such that $\mathbb{E}[\psi_{ot}\zeta_{ot}|X_t] = \mathbb{E}[\psi_{ot}|X_t]\zeta_{ot} = 0$. Therefore, by using the martingale-difference central limit theorem and the Cramér-Wold device, we can apply Proposition 1 to show that

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\psi_t\hat\zeta_t \xrightarrow{d} N(0, \Omega_o), \tag{32}$$

in which the asymptotic variance-covariance matrix

$$\Omega_o := \mathbb{E}[\psi_{ot}^2]\,\mathbb{E}[\zeta_{ot}\zeta_{ot}^\top]$$

is assumed to be finite and positive definite. Let $\hat\Omega_T$ be a consistent estimator for $\Omega_o$, such as

$$\hat\Omega_T = \left(\frac{1}{T}\sum_{t=1}^{T}\hat\psi_t^2\right)\left(\frac{1}{T}\sum_{t=1}^{T}\hat\zeta_t\hat\zeta_t^\top\right),$$

which is assumed to be uniformly positive definite. In accordance with (32), we can define the test statistic

$$C_T = \left(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\psi_t\hat\zeta_t\right)^\top\hat\Omega_T^{-1}\left(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\psi_t\hat\zeta_t\right)$$

that has the asymptotic null distribution $C_T \xrightarrow{d} \chi^2(q)$. In what follows, we will refer to this test as the C test. This test is applicable to checking hypothesis (4). Note that, by setting $\zeta_t = 1$ ($q = 1$), the C test can also be applied to checking hypothesis (5) if $G$ does not include $\beta$.

Similarly, given [A] and (4), hypothesis (5) implies that $\{\eta_{ot} + \kappa_o s^o_{\beta t}\}$ is a martingale-difference sequence such that

$$\mathbb{E}[\eta_{ot} + \kappa_o s^o_{\beta t}|X_t] = \mathbb{E}[\eta_{ot}|X_t] + \kappa_o\mathbb{E}[s^o_{\beta t}|X_t] = 0.$$

Therefore, by using the martingale-difference central limit theorem, we can apply Proposition 2 to show the result

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\eta_t \xrightarrow{d} N(0, \sigma^2_{\eta o}), \tag{33}$$

in which the asymptotic variance

$$\sigma^2_{\eta o} = \mathbb{E}[(\eta_{ot} + \kappa_o s^o_{\beta t})^2]$$

is assumed to be finite and positive. Let $\hat\sigma^2_{\eta T}$ be a consistent estimator for $\sigma^2_{\eta o}$. We can define the test statistic

$$D_T = \left(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\eta_t\right)^2\hat\sigma^{-2}_{\eta T}$$

that has the asymptotic null distribution $D_T \xrightarrow{d} \chi^2(1)$. We will refer to this test as the D test. This test is applicable to testing hypothesis (5) when $G$ includes $\beta$.

It should be noted that although the C and D tests are so far based on a one-dimensional $\varphi_t$, they can be straightforwardly extended to finite-dimensional moment restrictions. This is because Propositions 1 and 2 hold for an arbitrary $\varphi_t$. We will show some examples in the next section.
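The C statistic can be assembled from the purged moments and the testing indicators with elementary linear algebra. The sketch below is an illustration only: the function names are hypothetical, the input numbers are placeholders, the variance estimator is the product-form estimator given above, and the p-value helper covers only the $q = 1$ case through the identity $P(\chi^2(1) > x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (A assumed nonsingular)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def c_statistic(psi, zeta):
    """C_T = g' Omega^{-1} g, with g = T^{-1/2} sum_t psi_t zeta_t and
    Omega = mean(psi_t^2) * mean(zeta_t zeta_t'); zeta is a list of q-vectors."""
    T = len(psi)
    q = len(zeta[0])
    g = [sum(p * z[i] for p, z in zip(psi, zeta)) / math.sqrt(T) for i in range(q)]
    s2 = sum(p * p for p in psi) / T
    omega = [[s2 * sum(z[i] * z[j] for z in zeta) / T for j in range(q)] for i in range(q)]
    return sum(gi * xi for gi, xi in zip(g, solve(omega, g)))

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square(1) variable."""
    return math.erfc(math.sqrt(x / 2.0))

# Placeholder purged moments and a two-dimensional indicator (constant plus one regressor).
psi = [0.3, -1.1, 0.8, 0.2, -0.6, 0.4]
zeta = [[1.0, 0.5], [1.0, -0.2], [1.0, 1.3], [1.0, 0.1], [1.0, -0.9], [1.0, 0.4]]
print(c_statistic(psi, zeta))  # compare with a chi-square(2) critical value
```

With a scalar indicator the same function reduces to the squared, studentized sample moment, and `chi2_sf_1df` then gives the asymptotic p-value directly.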
4 Applications: Illustrative examples

Given the mean and variance, researchers are accustomed to characterizing the distribution by using the skewness and kurtosis (coefficients). The skewness and kurtosis of $v_{ot}$ are, respectively, the third and fourth moments $\mathbb{E}[v_{ot}^3]$ and $\mathbb{E}[v_{ot}^4]$. The former characterizes the symmetry (or asymmetry) of $v_{ot}$, and the latter describes the dispersion (or tail shape) of $v_{ot}$. The C and D tests can be applied to these arithmetic moments and any other higher-order moments that are sensible for characterizing the distributional shape. The main purpose of this section is to demonstrate the applicability of our approach in testing several important hypotheses, such as the predictability of asymmetry, the symmetry of $v_{ot}$, the normality of $v_{ot}$, and some non-Gaussian distributions of $v_{ot}$. The choice of moment conditions is beyond the scope of this study and requires further research.
4.1 Predictability of asymmetry
The asymmetry of returns has important implications for portfolio optimization and other financial problems. Several economic hypotheses have been proposed to interpret the predictability of return asymmetry, such as the stochastic bubble hypothesis of Blanchard and Watson (1982) and the investor heterogeneity hypothesis of Hong and Stein (2003). Nonetheless, whether the asymmetry of returns is predictable remains an empirical question. Recently, some studies have investigated this question by using the autoregressive conditional density model of Hansen (1994) or other similar models; see, e.g., Harvey and Siddique (1999, 2000) and Hueng and McDonald (2005). This modelling approach extends the partially specified GARCH-type model by approximating the conditional standardized error distribution as a skewed t distribution (or other parametric distributions) with some asymmetry parameter that follows a predictable law of motion.

The C test may also be applicable to this question for the following reasons. First, economic theories may have certain implications for the predictability of return asymmetry, but they typically imply no specific form of $F(\cdot|X_t)$. Unlike Hansen's (1994) approach, the C test can check the predictability of asymmetry without assuming any specific $F(\cdot|X_t)$. This may avoid the misspecification of $F(\cdot|X_t)$ and the resulting problems. Second, to perform the C test, we need only estimate the partially specified model. This should be easier than estimating a fully specified autoregressive conditional density model. This may also permit researchers to explore more possible laws of motion before concluding that asymmetry is predictable or unpredictable.

Note that if the asymmetry of $F(\cdot|X_t)$ is predictable, then $\mathbb{E}[\varphi_{ot}|X_t]$ should be a non-zero function of $w_t$ for some odd-symmetric $\varphi$'s such that $\varphi(v) = -\varphi(-v)$ for all $v \in \mathbb{R}$. If the asymmetry of $F(\cdot|X_t)$ is characterized by the conditional skewness $\mathbb{E}[v_{ot}^3|X_t]$, we have $\varphi_t = v_t^3$. (Other possible choices will be shown in the next sub-section.) Once the testing indicator $\zeta_t$ is set, the C test becomes immediately applicable. The setting of $\zeta_t$ may be determined by statistical or economic considerations. For example, following Engle's (1982) idea, Hansen (1994, pp. 710-717) suggested an autoregressive-type law of motion for the conditional asymmetry. Accordingly, we may specify $\zeta_t$ as a vector of lagged regression errors $u_{t-k}$'s. Alternatively, we may also choose $\zeta_t$ as a vector of economic variables. For example, Chen et al. (2001) and Hueng and McDonald (2005) suggest checking the predictability of stock return asymmetry by using the stock turnover ratio because of the investor heterogeneity hypothesis.
4.2 Symmetry of $v_{ot}$
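By setting $\zeta_t = 1$ and choosing $\phi$ to be an odd-symmetric function, the C test is also applicable to testing the unconditional symmetry of $v_{ot}$. Under the symmetry of $v_{ot}$, we have $\mathbb{E}[\phi_{ot}] = 0$ and $\mathbb{E}[\varphi^o_{vt} v_{ot}] = 0$ (because $\phi'(v)v$ is an odd-symmetric function of $v$). Accordingly, we can simplify (25) as
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\left[\hat\phi_t - \left(\frac{1}{T}\sum_{s=1}^{T}\hat\varphi_{vs}\right)\hat v_t\right] = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\left\{\phi_{ot} - \mathbb{E}[\varphi^o_{vt}]\,v_{ot}\right\} + o_p(1). \tag{34}$$
This generates the following C test statistic:
$$C_S := \frac{T\left[\dfrac{1}{T}\sum_{t=1}^{T}\hat\phi_t - \left(\dfrac{1}{T}\sum_{t=1}^{T}\hat\varphi_{vt}\right)\left(\dfrac{1}{T}\sum_{t=1}^{T}\hat v_t\right)\right]^2}{\dfrac{1}{T}\sum_{t=1}^{T}\hat\phi_t^2 - 2\left(\dfrac{1}{T}\sum_{t=1}^{T}\hat\varphi_{vt}\right)\left(\dfrac{1}{T}\sum_{t=1}^{T}\hat\phi_t\hat v_t\right) + \left(\dfrac{1}{T}\sum_{t=1}^{T}\hat\varphi_{vt}\right)^2\left(\dfrac{1}{T}\sum_{t=1}^{T}\hat v_t^2\right)},$$
which has the asymptotic null distribution $\chi^2(1)$. This statistic depends only on simple model-invariant sample moments of the $\hat v_t$'s, so it is very simple to compute.
Let $\mu_k := \mathbb{E}[v_{ot}^k]$ and $\hat m_k := T^{-1}\sum_{t=1}^{T}\hat v_t^k$ be, respectively, the $k$-th population moment of $v_{ot}$ (if it exists) and the $k$-th sample moment of $\hat v_t$, for a positive integer $k$. If we characterize distributional asymmetry by the skewness, then we have $\phi_t = v_t^3$. In this case, (34) is asymptotically equivalent to
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(\hat v_t^3 - 3\hat v_t) = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(v_{ot}^3 - 3v_{ot}) + o_p(1), \tag{35}$$
and consequently
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(\hat v_t^3 - 3\hat v_t) \xrightarrow{d} N(0,\ \mu_6 - 6\mu_4 + 9). \tag{36}$$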
This yields a particular case of $C_S$ (up to asymptotic equivalence):
$$C_{S3} = \frac{T(\hat m_3 - 3\hat m_1)^2}{\hat m_6 - 6\hat m_4 + 9}.$$
This statistic closely resembles
$$C'_{S3} = \frac{T\hat m_3^2}{\hat m_6 - 6\hat m_2\hat m_4 + 9\hat m_2^3},$$
which appears in Gupta (1967). The $C'_{S3}$ test was originally considered for testing the symmetry of i.i.d. random variables $y_t$ with unknown mean and variance. (Correspondingly, the sample moment $\hat m_k$ should be based on $\hat v_t = (y_t - \bar y_T)/\hat\sigma_y$, with $\bar y_T$ and $\hat\sigma_y^2$ denoting the sample mean and variance of the $y_t$'s, respectively.) Because $\hat m_2 \xrightarrow{p} 1$, the test statistics $C_{S3}$ and $C'_{S3}$ are asymptotically equivalent if $\hat m_1 = 0$ (or $\sum_{t=1}^{T}\hat v_t = 0$). This condition holds if we estimate, by least squares, a regression with conditionally homoskedastic errors and an intercept in its conditional mean specification. If the condition $\hat m_1 = 0$ is not satisfied, then it is the $C_{S3}$ test, rather than the $C'_{S3}$ test, that is generally valid.
Clearly, the $C_{S3}$ test requires a finite sixth moment, $\mu_6 < \infty$. This restriction precludes several important heavy-tailed distributions, such as the standardized t distribution with six or fewer degrees of freedom. To avoid this problem, we may also base the $C_S$ test on certain bounded odd-symmetric $\phi$'s, such as those used in Premaratne and Bera (2001):
$$\phi(v) = \arctan(v), \tag{37}$$
$$\phi(v) = \frac{v}{1 + v^2}. \tag{38}$$
The former is a side condition of the maximum entropy density that corresponds to the Pearson type IV distribution; see Bera and Park (2005). The latter is based on the imaginary part of the characteristic function defined on $\mathbb{R}^+$ and weighted by an exponential density; see Chen et al. (2000). Other examples can be found in Bera and Park (2005). The $C_S$ test with these bounded odd-symmetric $\phi$'s can check the null of symmetry without assuming $\mu_6 < \infty$.
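As a minimal numerical sketch of the symmetry tests in this subsection (an illustration, not part of the derivation), the following Python code computes $C_S$ for a user-supplied odd-symmetric $\phi$ and the skewness-based $C_{S3}$ from an array of standardized residuals. The function names, the NumPy dependency, and the reading of the $C_S$ denominator as the sample second moment of $\hat\phi_t - \bar\varphi_v\hat v_t$ (with $\bar\varphi_v$ the sample mean of $\phi'(\hat v_t)$) are assumptions made for illustration.

```python
import numpy as np

def cs_statistic(v_hat, phi, phi_prime):
    """C_S symmetry statistic for an odd-symmetric phi (illustrative sketch).

    v_hat     : 1-D array of standardized residuals
    phi       : odd-symmetric function, e.g. np.arctan
    phi_prime : derivative of phi with respect to v
    """
    v = np.asarray(v_hat, dtype=float)
    T = v.size
    p = phi(v)
    d_bar = np.mean(phi_prime(v))            # (1/T) * sum of phi'(v_t)
    numerator = (np.mean(p) - d_bar * np.mean(v)) ** 2
    # Expanded sample second moment of phi_t - d_bar * v_t.
    denominator = (np.mean(p ** 2)
                   - 2.0 * d_bar * np.mean(p * v)
                   + d_bar ** 2 * np.mean(v ** 2))
    return T * numerator / denominator

def cs3_statistic(v_hat):
    """Skewness-based C_S3 = T (m3 - 3 m1)^2 / (m6 - 6 m4 + 9)."""
    v = np.asarray(v_hat, dtype=float)
    m1, m3, m4, m6 = (np.mean(v ** k) for k in (1, 3, 4, 6))
    return v.size * (m3 - 3.0 * m1) ** 2 / (m6 - 6.0 * m4 + 9.0)

# Bounded choices (37) and (38), each paired with its derivative.
BOUNDED_PHIS = {
    "arctan": (np.arctan, lambda v: 1.0 / (1.0 + v ** 2)),
    "ratio": (lambda v: v / (1.0 + v ** 2),
              lambda v: (1.0 - v ** 2) / (1.0 + v ** 2) ** 2),
}
```

Under the null of symmetry, either statistic would be compared with a $\chi^2(1)$ critical value (about 3.84 at the 5% level). The bounded choices avoid the sixth-moment requirement of `cs3_statistic`, at the cost of no longer measuring skewness directly.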
4.3 Normality of $v_{ot}$
Normality may be the most widely used distributional assumption in statistics and econometrics, and checking it is particularly important in financial applications. It is known that GARCH-type models with normal standardized errors may be unable to account for the excess kurtosis of financial returns; see, e.g., Bollerslev (1987), Engle and González-Rivera (1991), and Bai et al. (2003). As noted before, practitioners often check the normality of $v_{ot}$ by applying the conventional JB test statistic, $T\left(\hat m_3^2/6 + (\hat m_4 - 3)^2/24\right)$, to the standardized residuals. This statistic has the asymptotic null distribution $\chi^2(2)$ for a linear regression with an intercept and conditionally homoskedastic errors; see also Fisher (1930), D'Agostino and Pearson (1973), and Bowman and Shenton (1975) for the case of i.i.d. random variables with unknown mean and variance. Our approach implies a variant of this popular test.
Note that the standard normal distribution has all odd moments equal to zero (by symmetry) and the even moments
$$\mu_{2k} = \prod_{i=1}^{k}(2i - 1), \quad k = 1, 2, 3, 4, \ldots; \tag{39}$$
see, e.g., Davidson and MacKinnon (1993, p.804). Therefore, we base the skewness-kurtosis normality test on $\phi_t = v_t^3$ and $\phi_t = v_t^4 - 3$. Given these two $\phi_t$'s, we can apply (25) and the symmetry to show that (35) and
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(\hat v_t^4 - 6\hat v_t^2 + 3) = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(v_{ot}^4 - 6v_{ot}^2 + 3) + o_p(1) \tag{40}$$
hold simultaneously under the null of normality. As a consequence, we have
$$\begin{bmatrix} T^{-1/2}\sum_{t=1}^{T}(\hat v_t^3 - 3\hat v_t) \\ T^{-1/2}\sum_{t=1}^{T}(\hat v_t^4 - 6\hat v_t^2 + 3) \end{bmatrix} \xrightarrow{d} N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 6 & 0 \\ 0 & 24 \end{bmatrix}\right). \tag{41}$$
(This example demonstrates that it is easy to extend our tests to finite-dimensional moment conditions, as noted in Section 3.2.) The resulting omnibus test statistic
$$C_N := T\left[\frac{(\hat m_3 - 3\hat m_1)^2}{6} + \frac{(\hat m_4 - 6\hat m_2 + 3)^2}{24}\right] \tag{42}$$
has the asymptotic null distribution $\chi^2(2)$ for all the models encompassed by (1). This test statistic may be viewed as a variant of the conventional JB test statistic.
Note that the statistic $C_N$ is exactly the same as the KS test statistic based on the sample averages of the third and fourth Hermite polynomials of the $\hat v_t$'s. Bontemps and Meddahi (2005a) showed the validity of the KS test in the context of GARCH-type models. Note also that this statistic coincides with the above-mentioned JB test statistic if $\hat m_1 = 0$ and $\hat m_2 = 1$. These two conditions hold for the specific model considered by Jarque and Bera (1980) but not necessarily for the general model (1). Therefore, it is the KS test statistic, rather than the conventional JB test statistic, that is generally valid; see also Fiorentini et al. (2004).
The KS test is robust to the estimation uncertainty and to the choice of $\sqrt{T}$-consistent estimator, and it has a very simple model-invariant test statistic. In view of our approach, this test possesses these appealing properties because it purges the effect of estimation uncertainty on the sample skewness and kurtosis of the standardized residuals by using the sample mean and variance of the standardized residuals. Importantly, our approach indicates that such attractive properties are also available for many other hypothesis testing problems. We show some examples in the rest of this section.
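To make the relation between the two normality statistics concrete, the following Python sketch computes the conventional JB statistic and the variant based on the third and fourth Hermite polynomials, $T[(\hat m_3 - 3\hat m_1)^2/6 + (\hat m_4 - 6\hat m_2 + 3)^2/24]$, from an array of standardized residuals. The function names and the NumPy dependency are assumptions made for illustration.

```python
import numpy as np

def jb_statistic(v_hat):
    """Conventional JB statistic T (m3^2/6 + (m4 - 3)^2/24)."""
    v = np.asarray(v_hat, dtype=float)
    m3, m4 = np.mean(v ** 3), np.mean(v ** 4)
    return v.size * (m3 ** 2 / 6.0 + (m4 - 3.0) ** 2 / 24.0)

def cn_statistic(v_hat):
    """C_N statistic: sample means of the 3rd and 4th Hermite polynomials."""
    v = np.asarray(v_hat, dtype=float)
    h3 = np.mean(v ** 3 - 3.0 * v)               # equals m3 - 3 m1
    h4 = np.mean(v ** 4 - 6.0 * v ** 2 + 3.0)    # equals m4 - 6 m2 + 3
    return v.size * (h3 ** 2 / 6.0 + h4 ** 2 / 24.0)
```

When the residuals satisfy $\hat m_1 = 0$ and $\hat m_2 = 1$ exactly, the two functions return the same value; when the residuals have a non-zero sample mean or a sample variance different from one, they generally differ, which is the case the text identifies as problematic for the conventional JB statistic.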
4.4 Non-Gaussian distributions of $v_{ot}$
Suppose that $G$ is a symmetric distribution with a known $\beta$ (or without $\beta$) and moments $\mu_k$ such that $\mu_8 < \infty$. We may base the associated skewness-kurtosis test on $\phi_t = v_t^3$ and $\phi_t = v_t^4 - \mu_4$. By utilizing (25) and the symmetry, we have (35) and
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(\hat v_t^4 - 2\mu_4\hat v_t^2 + \mu_4) = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(v_{ot}^4 - 2\mu_4 v_{ot}^2 + \mu_4) + o_p(1),$$
and hence
$$\begin{bmatrix} T^{-1/2}\sum_{t=1}^{T}(\hat v_t^3 - 3\hat v_t) \\ T^{-1/2}\sum_{t=1}^{T}(\hat v_t^4 - 2\mu_4\hat v_t^2 + \mu_4) \end{bmatrix} \xrightarrow{d} N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_S^2 & 0 \\ 0 & \sigma_K^2 \end{bmatrix}\right), \tag{43}$$
where $\sigma_S^2 := \mu_6 - 6\mu_4 + 9$ and $\sigma_K^2 := \mu_8 - 4\mu_6\mu_4 + 4\mu_4^3 - \mu_4^2$, under the null of $F_o = G$. Accordingly, we can define the test statistic
$$C_{SK} := T\left[\frac{(\hat m_3 - 3\hat m_1)^2}{\sigma_S^2} + \frac{(\hat m_4 - 2\mu_4\hat m_2 + \mu_4)^2}{\sigma_K^2}\right], \tag{44}$$
which has the asymptotic null distribution $\chi^2(2)$.
Note that the $C_{SK}$ test includes the $C_N$ test as a particular case. By using (39), we can easily check that (44) reduces to (42) when $G$ is the standard normal distribution. The $C_{SK}$ test is also applicable to many other symmetric $G$'s, such as the standardized logistic distribution:
$$g(v) = \frac{\pi}{\sqrt{3}}\exp\left(-\frac{\pi v}{\sqrt{3}}\right)\left[1 + \exp\left(-\frac{\pi v}{\sqrt{3}}\right)\right]^{-2}, \quad \forall\, v \in \mathbb{R}, \tag{45}$$
and the standardized t distribution with (known) degrees of freedom $\beta$:
$$g(v, \beta) = \frac{\Gamma((\beta + 1)/2)}{\Gamma(\beta/2)\sqrt{\pi(\beta - 2)}}\,[1 + c(v, \beta)]^{-\frac{\beta+1}{2}}, \quad c(v, \beta) := \frac{v^2}{\beta - 2}, \quad \beta > 2, \tag{46}$$
where $\Gamma(\cdot)$ denotes the Gamma function, when $\beta > 8$. It is known that the standardized logistic distribution has the even moments
$$\mu_{2k} = 2\left(\frac{3}{\pi^2}\right)^{k}(2k)!\left(1 - 2^{1-2k}\right)\zeta(2k), \quad k = 1, 2, 3, 4, \ldots, \tag{47}$$
where $\zeta(k) := \sum_{\tau=1}^{\infty}\tau^{-k}$ is the Riemann zeta function with $\zeta(2) = \pi^2/6$, $\zeta(4) = \pi^4/90$, $\zeta(6) = \pi^6/945$, and $\zeta(8) = \pi^8/9450$; see Balakrishnan and Nevzorov (2003, pp.199-200). The standardized t distribution has the even moments
$$\mu_{2k} = (\beta - 2)^k\,\frac{\Gamma(k + 1/2)\,\Gamma(\beta/2 - k)}{\Gamma(1/2)\,\Gamma(\beta/2)}, \quad \beta > 2k, \quad k = 1, 2, 3, 4, \ldots; \tag{48}$$
see Stuart and Ord (1994, p.548). Given (47) and (48), we can easily apply the $C_{SK}$ test to checking these two non-Gaussian distributions.
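The ingredients of the $C_{SK}$ test can be sketched numerically as follows. The code evaluates the even moments of the standardized logistic and standardized t distributions and plugs user-supplied $\mu_4$, $\mu_6$, $\mu_8$ into $\sigma_S^2$, $\sigma_K^2$, and the statistic. It assumes that the logistic moment formula (47) reads $\mu_{2k} = 2(3/\pi^2)^k (2k)!\,(1 - 2^{1-2k})\,\zeta(2k)$, which reproduces $\mu_2 = 1$ and the familiar kurtosis $\mu_4 = 4.2$; the function names and the NumPy dependency are likewise assumptions made for illustration.

```python
import math
import numpy as np

# Closed-form zeta values quoted in the text for k = 1, ..., 4.
ZETA_EVEN = {1: math.pi ** 2 / 6, 2: math.pi ** 4 / 90,
             3: math.pi ** 6 / 945, 4: math.pi ** 8 / 9450}

def logistic_even_moment(k):
    """mu_{2k} of the standardized logistic distribution, k = 1, ..., 4."""
    return (2.0 * (3.0 / math.pi ** 2) ** k * math.factorial(2 * k)
            * (1.0 - 2.0 ** (1 - 2 * k)) * ZETA_EVEN[k])

def student_t_even_moment(k, beta):
    """mu_{2k} of the standardized t distribution; requires beta > 2k."""
    if beta <= 2 * k:
        raise ValueError("moment of order 2k requires beta > 2k")
    return ((beta - 2.0) ** k * math.gamma(k + 0.5) * math.gamma(beta / 2.0 - k)
            / (math.gamma(0.5) * math.gamma(beta / 2.0)))

def csk_statistic(v_hat, mu4, mu6, mu8):
    """C_SK statistic for a symmetric null G with known moments mu4, mu6, mu8."""
    v = np.asarray(v_hat, dtype=float)
    m1, m2, m3, m4 = (np.mean(v ** k) for k in (1, 2, 3, 4))
    var_s = mu6 - 6.0 * mu4 + 9.0                                   # sigma_S^2
    var_k = mu8 - 4.0 * mu6 * mu4 + 4.0 * mu4 ** 3 - mu4 ** 2       # sigma_K^2
    return v.size * ((m3 - 3.0 * m1) ** 2 / var_s
                     + (m4 - 2.0 * mu4 * m2 + mu4) ** 2 / var_k)
```

For a logistic null one would call `csk_statistic(v_hat, *(logistic_even_moment(k) for k in (2, 3, 4)))`; for a t null with known $\beta > 8$, `student_t_even_moment(k, beta)` plays the same role. With the normal moments (3, 15, 105) the variances become 6 and 24, so the function returns the normality statistic of the previous subsection.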
4.5 Distributions with unknown $\beta_o$
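To improve the flexibility of data-fitting, $G$ is often specified to include an unknown parameter vector. For testing such a $G$, we should replace the C test with the D test. As a demonstrative example, we consider the skewness-kurtosis test for the standardized t distribution with unknown degrees of freedom $\beta$. It is easy to check that this distribution implies
$$s_{\beta t} = \frac{1}{2}\left[\psi\left(\frac{\beta + 1}{2}\right) - \psi\left(\frac{\beta}{2}\right) - \frac{1}{\beta - 2}\right] - \frac{1}{2}\ln(1 + c_t) + \frac{1}{2}\,\frac{\beta + 1}{\beta - 2}\,\frac{c_t}{1 + c_t} \tag{49}$$
and
$$a_{\beta t} = \frac{v_t}{\beta - 2}\,\frac{1}{1 + c_t}\left[\frac{\beta + 1}{\beta - 2}\,\frac{1}{1 + c_t} - 1\right], \tag{50}$$
where $c_t := c(v_t, \beta)$ and $\psi(\cdot) := \Gamma'(\cdot)/\Gamma(\cdot)$ denotes the digamma function. These two functions are, respectively, symmetric and odd-symmetric about $v_t = 0$. In the Appendix, we show that, in this case, the skewness-kurtosis D test statistic is of the form
$$D_{SK} = T\left[\frac{(\hat m_3 - 3\hat m_1)^2}{\tilde\sigma_S^2} + \frac{\left\{\hat m_4 - \tilde\mu_4 - \left(2\tilde\mu_4 + \frac{1}{2}\hat\kappa_T\frac{1}{T}\sum_{t=1}^{T}\hat a_{\beta t}\hat v_t\right)(\hat m_2 - 1)\right\}^2}{\hat\varsigma_K^2}\right], \tag{51}$$
where $\tilde\sigma_S^2 := \tilde\mu_6 - 6\tilde\mu_4 + 9$, $\tilde\mu_4$ and $\tilde\mu_6$ follow (48) with $\beta = \tilde\beta_T$,
$$\hat\kappa_T = -\frac{6}{(\tilde\beta_T - 4)^2}\left(\frac{1}{T}\sum_{t=1}^{T}\hat s_{\beta t}^2\right)^{-1},$$
and
$$\hat\varsigma_K^2 = \frac{1}{T}\sum_{t=1}^{T}\left[\hat v_t^4 - \tilde\mu_4 - \left(2\tilde\mu_4 + \frac{1}{2}\hat\kappa_T\frac{1}{T}\sum_{s=1}^{T}\hat a_{\beta s}\hat v_s\right)(\hat v_t^2 - 1) + \hat\kappa_T\hat s_{\beta t}\right]^2.$$
This statistic has the asymptotic null distribution $\chi^2(2)$.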
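As a worked check on the two functions used in this subsection, the derivation below starts from the log-density of the standardized t distribution and differentiates it. It is a sketch under two assumptions that are not stated in this section: that $s_{\beta t}$ is the score with respect to $\beta$, and that $a_{\beta t}$ is the derivative of that score with respect to $v_t$. Here $\psi(\cdot)$ denotes the digamma function $\Gamma'(\cdot)/\Gamma(\cdot)$.

```latex
\begin{align*}
\ln g(v_t,\beta) &= \ln\Gamma\!\left(\tfrac{\beta+1}{2}\right) - \ln\Gamma\!\left(\tfrac{\beta}{2}\right)
   - \tfrac{1}{2}\ln\!\big(\pi(\beta-2)\big) - \tfrac{\beta+1}{2}\ln(1+c_t),
   \qquad c_t = \frac{v_t^2}{\beta-2},\\[4pt]
\frac{\partial c_t}{\partial \beta} &= -\frac{c_t}{\beta-2},
   \qquad \frac{\partial c_t}{\partial v_t} = \frac{2v_t}{\beta-2},\\[4pt]
s_{\beta t} = \frac{\partial \ln g}{\partial \beta}
   &= \frac{1}{2}\left[\psi\!\left(\tfrac{\beta+1}{2}\right) - \psi\!\left(\tfrac{\beta}{2}\right)
      - \frac{1}{\beta-2} - \ln(1+c_t) + \frac{\beta+1}{\beta-2}\,\frac{c_t}{1+c_t}\right],\\[4pt]
a_{\beta t} = \frac{\partial s_{\beta t}}{\partial v_t}
   &= \frac{v_t}{\beta-2}\,\frac{1}{1+c_t}\left[\frac{\beta+1}{\beta-2}\,\frac{1}{1+c_t} - 1\right].
\end{align*}
```

Because $c_t$ depends on $v_t$ only through $v_t^2$, the score is an even function of $v_t$ and its derivative in $v_t$ is odd, which is the symmetry property invoked when the cross terms are set to zero.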
5 Monte Carlo simulation
(To be added.)
6 Conclusions
For a variety of economic and statistical reasons, it is important to extend the GARCH-type model from a partially specified model to a fully specified model. In this model-building process, it is essential to have suitable tests for exploring the unknown standardized error distribution. We establish a class of higher-order CM tests for this purpose. Importantly, our approach differs from the Newey-Tauchen and Wooldridge approaches in both the ideas and the resulting tests. We purge the effect of estimation uncertainty by utilizing the information about this undesirable effect contained in the sample mean and variance of the standardized residuals. Interestingly, our tests share the appealing properties of the higher-order CM tests generated by the Wooldridge approach. However, unlike the latter, our test statistics are free of the conditional mean and variance derivatives and hence are model-invariant. This advantage should be particularly appealing in practical applications. We also demonstrate the applicability of our approach by considering several practically useful examples. The Kiefer-Salmon normality test is included as a particular case.
References
Andrews, D. W. K. (1997). A conditional Kolmogorov test, Econometrica, 65, 1097–1128.
Bai, J. (2003). Testing parametric conditional distributions of dynamic models, Review of Economics and Statistics, 85, 531–549.
Bai, J. and S. Ng (2001). A test for conditional symmetry in time series models, Journal of Econometrics, 103, 225–258.
Bai, X., J. R. Russell, and G. C. Tiao (2003). Kurtosis of GARCH and stochastic volatility models with non-normal innovations, Journal of Econometrics, 114, 349–360.
Balakrishnan, N. and V. B. Nevzorov (2003). A Primer on Statistical Distributions, New York: John Wiley.
Bera, A. K. and Y. Bilias (2001). Rao's score, Neyman's C(α), and Silvey's LM tests: An essay on historical developments and some new results, Journal of Statistical Planning and Inference, 97, 9–44.
Bera, A. K. and Y. Bilias (2002). The MM, ME, ML, EL, EF, and GMM approaches to estimation: A synthesis, Journal of Econometrics, 107, 51–86.
Bera, A. K. and S. Y. Park (2005). Maximum entropy autoregressive conditional heteroskedasticity model, Working paper, University of Illinois at Urbana-Champaign.
Blanchard, O. J. and M. Watson (1982). Bubbles, rational expectations, and financial markets. In: Wachtel, P. (Ed.), Crises in Economic and Financial Structure, Lexington: Lexington Books.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity, Journal of Econometrics, 31, 307–327.
Bollerslev, T. (1987). A conditional heteroskedastic time series model for speculative prices and rates of return, Review of Economics and Statistics, 69, 542–547.
Bollerslev, T. and J. M. Wooldridge (1992). Quasi-maximum likelihood estimation and inference in dynamic models with time-varying covariances, Econometric Reviews, 11, 143–172.
Bontemps, C. and N. Meddahi (2005a). Testing normality: A GMM approach, Journal of Econometrics, 124, 149–186.
Bontemps, C. and N. Meddahi (2005b). Testing distributional assumptions: A GMM approach, Working paper, CIRANO.
Bowman, K. O. and L. R. Shenton (1975). On omnibus contours for departures from normality based on √b1 and b2, Biometrika, 62, 243–250.
Chen, Y.-T. (2003). Standardized-residuals-based symmetry tests, Working paper, Academia Sinica.
Chen, J., H. Hong, and J. C. Stein (2001). Forecasting crashes: Trading volume, past returns and conditional skewness in stock prices, Journal of Financial Economics, 61, 345–381.
Chen, Y.-T., R. Y. Chou, and C.-M. Kuan (2000). Testing time reversibility without moment restrictions, Journal of Econometrics, 95, 199–218.
D'Agostino, R. B. and E. S. Pearson (1973). Tests for departure from normality: Empirical results for the distributions of b2 and √b1, Biometrika, 60, 613–622.
Davidson, R. and J. G. MacKinnon (1993). Estimation and Inference in Econometrics, Oxford: Oxford University Press.
Durbin, J. (1973). Weak convergence of sample distribution functions when parameters are estimated, Annals of Statistics, 1, 279–290.
Engle, R. F. (1982). Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation, Econometrica, 50, 987–1006.
Engle, R. F. (2002). New frontiers for ARCH models, Journal of Applied Econometrics, 17, 425–446.
Engle, R. F. and G. González-Rivera (1991). Semiparametric ARCH models, Journal of Business and Economic Statistics, 9, 345–359.
Fiorentini, G., E. Sentana, and G. Calzolari (2004). On the validity of the Jarque-Bera normality test in conditionally heteroskedastic dynamic regression models, Economics Letters, 83, 307–312.
Fisher, R. A. (1930). The moments of the distribution for normal samples of measures of departure from normality, The Proceedings of the Royal Society of London, A, 130, 16–28.
Glosten, L. R., R. Jagannathan, and D. E. Runkle (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks, Journal of Finance, 48, 1779–1801.
Gupta, M. K. (1967). An asymptotically nonparametric test of symmetry, Annals of Mathematical Statistics, 38, 849–866.
Hansen, B. E. (1994). Autoregressive conditional density estimation, International Economic Review, 35, 705–730.
Harvey, C. R. and A. Siddique (1999). Autoregressive conditional skewness, Journal of Financial and Quantitative Analysis, 34, 465–487.
Harvey, C. R. and A. Siddique (2000). Conditional skewness in asset pricing tests, Journal of Finance, 55, 1263–1295.
Hentschel, L. (1995). All in the family: Nesting symmetric and asymmetric GARCH models, Journal of Financial Economics, 39, 71–104.
Hong, H. and J. C. Stein (2003). Differences of opinion, short-sales constraints and market crashes, Review of Financial Studies, 16, 487–525.
Hueng, C. J. and J. B. McDonald (2005). Forecasting asymmetries in aggregate stock market returns: Evidence from conditional skewness, Journal of Empirical Finance, 12, 666–685.
Jarque, C. M. and A. K. Bera (1980). Efficient tests for normality, homoscedasticity and serial independence of regression residuals, Economics Letters, 6, 255–259.
Jondeau, E. and M. Rockinger (2000). Conditional volatility, skewness, and kurtosis: Existence and persistence, Working paper, Banque de France.
Khmaladze, E. V. (1981). Martingale approach in the theory of goodness-of-fit tests, Theory of Probability and Its Applications, 26, 240–257.
Kiefer, N. M. and M. Salmon (1983). Testing normality in econometric models, Economics Letters, 11, 123–127.
Nelson, D. (1991). Conditional heteroskedasticity in asset returns: A new approach, Econometrica, 59, 347–370.
Newey, W. K. (1985). Maximum likelihood specification testing and conditional moment tests, Econometrica, 53, 1047–1070.
Neyman, J. (1959). Optimal asymptotic tests of composite statistical hypotheses. In: U. Grenander (Ed.), Probability and Statistics: The Harald Cramér Volume, New York: Wiley, 213–234.
Premaratne, G. and A. K. Bera (2001). Modelling asymmetry and excess kurtosis in stock return data, Working paper, University of Illinois at Urbana-Champaign.
Rockinger, M. and E. Jondeau (2002). Entropy densities with an application to autoregressive conditional skewness and kurtosis, Journal of Econometrics, 106, 119–142.
Stuart, A. and J. K. Ord (1994). Kendall's Advanced Theory of Statistics. Vol I: Distribution Theory, London: Edward Arnold.
Tauchen, G. (1985). Diagnostic testing and evaluation of maximum likelihood models, Journal of Econometrics, 30, 415–443.
Wang, K.-L., C. Fawson, C. B. Barrett, and J. B. McDonald (2001). A flexible parametric GARCH model with an application to exchange rates, Journal of Applied Econometrics, 16, 521–536.
White, H. (1994). Estimation, Inference and Specification Analysis, New York: Cambridge University Press.
White, H. (2001). Asymptotic Theory for Econometricians, San Diego: Academic Press.
Wooldridge, J. M. (1990). A unified approach to robust, regression-based specification tests, Econometric Theory, 6, 17–43.
Wooldridge, J. M. (1991). On the application of robust, regression-based diagnostics to models of conditional means and conditional variances, Journal of Econometrics, 47, 5–46.
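Appendix
Proof of Lemma 1
By the mean value theorem, we have
$$\begin{aligned}
\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\phi_t\hat\zeta_t ={}& \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\phi_{ot}\zeta_{ot} - \left(\frac{1}{T}\sum_{t=1}^{T}\varphi^*_{vt}\zeta_{*t}w_{*t}^{\top} + \frac{1}{2T}\sum_{t=1}^{T}\varphi^*_{vt}v_{*t}\zeta_{*t}z_{*t}^{\top} - \frac{1}{T}\sum_{t=1}^{T}\phi_{*t}\xi^{*\top}_{\alpha t}\right)\sqrt{T}(\hat\alpha_T - \alpha_o) \\
&+ \left(\frac{1}{T}\sum_{t=1}^{T}\zeta_{*t}\varphi^{*\top}_{\beta t}\right)\sqrt{T}(\hat\beta_T - \beta_o) + \left(\frac{1}{T}\sum_{t=1}^{T}\phi_{*t}\xi^{*\top}_{\gamma t}\right)\sqrt{T}(\hat\gamma_T - \gamma_{oT}),
\end{aligned} \tag{A1}$$
where $v_{*t}$, $\phi_{*t}$, $\zeta_{*t}$, $w_{*t}$, $z_{*t}$, $\varphi^*_{vt}$, $\varphi^*_{\beta t}$, $\xi^*_{\alpha t}$, and $\xi^*_{\gamma t}$ are, respectively, $v_t$, $\phi_t$, $\zeta_t$, $w_t$, $z_t$, $\varphi_{vt}$, $\varphi_{\beta t}$, $\xi_{\alpha t}$, and $\xi_{\gamma t}$ evaluated at $(\alpha^{\top}, \beta^{\top}, \gamma^{\top})^{\top} = (\alpha_*^{\top}, \beta_*^{\top}, \gamma_*^{\top})^{\top}$ for some $\alpha_*$, $\beta_*$, and $\gamma_*$ that lie, respectively, on the segments connecting $\hat\alpha_T$ and $\alpha_o$, $\hat\beta_T$ and $\beta_o$, and $\hat\gamma_T$ and $\gamma_{oT}$. Given [B1] and [B2] (i), we have
$$\frac{1}{T}\sum_{t=1}^{T}\varphi^*_{vt}\zeta_{*t}w_{*t}^{\top} = \mathbb{E}[\varphi^o_{vt}\zeta_{ot}w_{ot}^{\top}] + o_p(1),$$
$$\frac{1}{T}\sum_{t=1}^{T}\varphi^*_{vt}v_{*t}\zeta_{*t}z_{*t}^{\top} = \mathbb{E}[\varphi^o_{vt}v_{ot}\zeta_{ot}z_{ot}^{\top}] + o_p(1),$$
$$\frac{1}{T}\sum_{t=1}^{T}\zeta_{*t}\varphi^{*\top}_{\beta t} = \mathbb{E}[\zeta_{ot}\varphi^{o\top}_{\beta t}] + o_p(1),$$
$$\frac{1}{T}\sum_{t=1}^{T}\phi_{*t}\xi^{*\top}_{\alpha t} = \mathbb{E}[\phi_{ot}\xi^{o\top}_{\alpha t}] + o_p(1),$$
and
$$\frac{1}{T}\sum_{t=1}^{T}\phi_{*t}\xi^{*\top}_{\gamma t} = \mathbb{E}[\phi_{ot}\xi^{o\top}_{\gamma t}] + o_p(1).$$
Note that condition (4) implies that $v_{ot}$ is independent of $w_t$. Therefore, by the law of iterated expectations, we can further show that $\mathbb{E}[\varphi^o_{vt}\zeta_{ot}w_{ot}^{\top}] = \mathbb{E}[\varphi^o_{vt}]\,\mathbb{E}[\zeta_{ot}w_{ot}^{\top}]$, $\mathbb{E}[\varphi^o_{vt}v_{ot}\zeta_{ot}z_{ot}^{\top}] = \mathbb{E}[\varphi^o_{vt}v_{ot}]\,\mathbb{E}[\zeta_{ot}z_{ot}^{\top}]$, $\mathbb{E}[\zeta_{ot}\varphi^{o\top}_{\beta t}] = \mathbb{E}[\zeta_{ot}]\,\mathbb{E}[\varphi^{o\top}_{\beta t}]$, $\mathbb{E}[\phi_{ot}\xi^{o\top}_{\alpha t}] = \mathbb{E}[\mathbb{E}[\phi_{ot}|X_t]\,\xi^{o\top}_{\alpha t}] = 0$, and $\mathbb{E}[\phi_{ot}\xi^{o\top}_{\gamma t}] = \mathbb{E}[\mathbb{E}[\phi_{ot}|X_t]\,\xi^{o\top}_{\gamma t}] = 0$. From (A1) and these results, we have Lemma 1. □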
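Proof of Lemma 2
Similar to the proof of Lemma 1, we can apply the mean value theorem to $\sqrt{T}N_{1T}$ and $\sqrt{T}N_{2T}$, and use [B1] and [B2] (ii), the law of iterated expectations, and [A] to show the following results:
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat v_t\hat\zeta_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}v_{ot}\zeta_{ot} - \mathbb{E}[\zeta_{ot}w_{ot}^{\top}]\sqrt{T}(\hat\alpha_T - \alpha_o) + o_p(1) \tag{A2}$$
and
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(\hat v_t^2 - 1)\hat\zeta_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(v_{ot}^2 - 1)\zeta_{ot} - \mathbb{E}[\zeta_{ot}z_{ot}^{\top}]\sqrt{T}(\hat\alpha_T - \alpha_o) + o_p(1). \tag{A3}$$
Lemma 2 is obtained from (A2) and (A3). □
Proof of Lemma 3
By applying the mean value theorem to (28), we have
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}s^o_{\beta t} + \left(\frac{1}{T}\sum_{t=1}^{T}\nabla_{\beta^{\top}}s_{\beta t}\right)\sqrt{T}(\hat\beta_T - \beta_o) - \left(\frac{1}{T}\sum_{t=1}^{T}a^*_{\beta t}w_{*t}^{\top} + \frac{1}{2T}\sum_{t=1}^{T}a^*_{\beta t}v_{*t}z_{*t}^{\top}\right)\sqrt{T}(\hat\alpha_T - \alpha_o) = 0,$$
where $v_{*t}$, $w_{*t}$, $z_{*t}$, and $a^*_{\beta t}$ are, respectively, $v_t$, $w_t$, $z_t$, and $a_{\beta t}$ evaluated at $\alpha = \alpha_*$ and $\beta = \beta_*$ for some $\alpha_*$ and $\beta_*$ that lie, respectively, on the segments connecting $\hat\alpha_T$ and $\alpha_o$, and $\hat\beta_T$ and $\beta_o$. By utilizing [B1] and [B2] (iii), we can re-express this expansion as
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}s^o_{\beta t} + \mathbb{E}[\nabla_{\beta^{\top}}s_{\beta t}]_{(\alpha,\beta)=(\alpha_o,\beta_o)}\sqrt{T}(\hat\beta_T - \beta_o) - \left(\mathbb{E}[a^o_{\beta t}w_{ot}^{\top}] + \frac{1}{2}\mathbb{E}[a^o_{\beta t}v_{ot}z_{ot}^{\top}]\right)\sqrt{T}(\hat\alpha_T - \alpha_o) + o_p(1) = 0. \tag{A4}$$
The information matrix equality shows that $\mathbb{E}[\nabla_{\beta^{\top}}s_{\beta t}]_{(\alpha,\beta)=(\alpha_o,\beta_o)} = -\mathbb{E}[s^o_{\beta t}s^{o\top}_{\beta t}]$. Moreover, by applying the law of iterated expectations and condition (4), we have
$$\mathbb{E}[a^o_{\beta t}w_{ot}^{\top}] + \frac{1}{2}\mathbb{E}[a^o_{\beta t}v_{ot}z_{ot}^{\top}] = \mathbb{E}[a^o_{\beta t}]\,\mathbb{E}[w_{ot}^{\top}] + \frac{1}{2}\mathbb{E}[a^o_{\beta t}v_{ot}]\,\mathbb{E}[z_{ot}^{\top}].$$
Lemma 3 is proved by substituting these results into (A4). □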
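Proof of Proposition 2
Given $\hat\beta_T = \tilde\beta_T$ and $\zeta_t = 1$ ($q = 1$), Lemma 1 and Lemma 2 imply
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\phi_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\phi_{ot} - \begin{bmatrix}\mathbb{E}[\varphi^o_{vt}] & \frac{1}{2}\mathbb{E}[\varphi^o_{vt}v_{ot}]\end{bmatrix}\begin{bmatrix}\mathbb{E}[w_{ot}^{\top}] \\ \mathbb{E}[z_{ot}^{\top}]\end{bmatrix}\sqrt{T}(\hat\alpha_T - \alpha_o) + \mathbb{E}[\varphi^{o\top}_{\beta t}]\sqrt{T}(\tilde\beta_T - \beta_o) + o_p(1) \tag{A5}$$
and
$$\begin{bmatrix}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat v_t \\ \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(\hat v_t^2 - 1)\end{bmatrix} = \begin{bmatrix}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}v_{ot} \\ \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(v_{ot}^2 - 1)\end{bmatrix} - \begin{bmatrix}\mathbb{E}[w_{ot}^{\top}] \\ \mathbb{E}[z_{ot}^{\top}]\end{bmatrix}\sqrt{T}(\hat\alpha_T - \alpha_o) + o_p(1), \tag{A6}$$
respectively. From (A5) and Lemma 3, we have the result:
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat\phi_t = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}(\phi_{ot} + \kappa_o s^o_{\beta t}) - \begin{bmatrix}\lambda_{1o} & \lambda_{2o}\end{bmatrix}\begin{bmatrix}\mathbb{E}[w_{ot}^{\top}] \\ \mathbb{E}[z_{ot}^{\top}]\end{bmatrix}\sqrt{T}(\hat\alpha_T - \alpha_o) + o_p(1), \tag{A7}$$
where
$$\lambda_{1o} := \mathbb{E}[\varphi^o_{vt}] + \kappa_o\mathbb{E}[a^o_{\beta t}] \quad \text{and} \quad \lambda_{2o} := \frac{1}{2}\left(\mathbb{E}[\varphi^o_{vt}v_{ot}] + \kappa_o\mathbb{E}[a^o_{\beta t}v_{ot}]\right).$$
Proposition 2 is proved by substituting (A6) into (A7). □
Derivation of (51)
To derive this test, note that the skewness-based choice $\phi_t = v_t^3$ implies $\varphi^o_{vt} = 3v_{ot}^2$, $\varphi^o_{vt}v_{ot} = 3v_{ot}^3$, and $\varphi_{\beta t} = 0$; consequently, $\mathbb{E}[\varphi^o_{vt}] = 3$, $\mathbb{E}[\varphi^o_{vt}v_{ot}] = 0$ (by the symmetry of the standardized t distribution), and $\kappa_o = 0$. Therefore, we have $\eta_{ot} = \psi_{ot} = v_{ot}^3 - 3v_{ot}$ and, by using Proposition 2, we have (35). Moreover, the kurtosis-based choice $\phi_t = v_t^4 - \mu_4$ with $\mu_4 = 3(\beta - 2)/(\beta - 4)$ implies $\varphi^o_{vt} = 4v_{ot}^3$, $\varphi^o_{vt}v_{ot} = 4v_{ot}^4$, and $\varphi_{\beta t} = 6/(\beta - 4)^2$; therefore, $\mathbb{E}[\varphi^o_{vt}] = 0$, $\mathbb{E}[\varphi^o_{vt}v_{ot}] = 4\mu_4$, and $\kappa_o = -\frac{6}{(\beta - 4)^2}\mathbb{E}[s^{o\,2}_{\beta t}]^{-1}$. Note that the symmetry of the standardized t distribution implies $\mathbb{E}[a^o_{\beta t}] = 0$. Therefore, we have $\psi_{ot} = v_{ot}^4 - \mu_4 - 2\mu_4(v_{ot}^2 - 1)$ and
$$\eta_{ot} = v_{ot}^4 - \mu_4 - \left(2\mu_4 + \frac{1}{2}\kappa_o\mathbb{E}[a^o_{\beta t}v_{ot}]\right)(v_{ot}^2 - 1) + \kappa_o s^o_{\beta t}.$$
In accordance with Proposition 2, we have
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\left[\hat v_t^4 - \tilde\mu_4 - \left(2\tilde\mu_4 + \frac{1}{2}\hat\kappa_T\frac{1}{T}\sum_{s=1}^{T}\hat a_{\beta s}\hat v_s\right)(\hat v_t^2 - 1)\right] = \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\left[v_{ot}^4 - \mu_4 - \left(2\mu_4 + \frac{1}{2}\kappa_o\mathbb{E}[a^o_{\beta t}v_{ot}]\right)(v_{ot}^2 - 1) + \kappa_o s^o_{\beta t}\right] + o_p(1). \tag{A8}$$
Combining (35) and (A8), we have
$$\begin{bmatrix}\sqrt{T}(\hat m_3 - 3\hat m_1) \\ \sqrt{T}\left\{\hat m_4 - \tilde\mu_4 - \left(2\tilde\mu_4 + \frac{1}{2}\hat\kappa_T\frac{1}{T}\sum_{t=1}^{T}\hat a_{\beta t}\hat v_t\right)(\hat m_2 - 1)\right\}\end{bmatrix} \xrightarrow{d} N\left(\begin{bmatrix}0 \\ 0\end{bmatrix}, \begin{bmatrix}\sigma_S^2 & 0 \\ 0 & \varsigma_K^2\end{bmatrix}\right),$$
where
$$\varsigma_K^2 = \mathbb{E}\left[\left\{v_{ot}^4 - \mu_4 - \left(2\mu_4 + \frac{1}{2}\kappa_o\mathbb{E}[a^o_{\beta t}v_{ot}]\right)(v_{ot}^2 - 1) + \kappa_o s^o_{\beta t}\right\}^2\right]$$
and the asymptotic variance-covariance matrix is diagonal because the symmetry of the standardized t distribution implies
$$\mathbb{E}\left[(v_{ot}^3 - 3v_{ot})\left\{v_{ot}^4 - \mu_4 - \left(2\mu_4 + \frac{1}{2}\kappa_o\mathbb{E}[a^o_{\beta t}v_{ot}]\right)(v_{ot}^2 - 1) + \kappa_o s^o_{\beta t}\right\}\right] = 0.$$
This generates the $D_{SK}$ test. □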