Sharp Adaptive Estimation of the Trend Coefficient of an Ergodic Diffusion
Arnak S. Dalalyan
Humboldt-Universität zu Berlin
Sharp Adaptive Trend Estimation 2
The Model
Let $X = (X_t,\ t \ge 0)$ be a random process defined by the SDE
$$dX_t = S(X_t)\,dt + dW_t, \tag{1}$$
where $W_t$ is a standard Wiener process and $X_0 = \xi$ is a random variable independent of $W$. Let $\Sigma_0$ be the set of all functions $S(\cdot) \in C^1$ such that
$$\lim_{|x|\to\infty} \operatorname{sgn}(x)\,S(x) < 0, \qquad |S(x)| \le C(1+|x|)^{\nu}, \tag{2}$$
for some positive constants $C$ and $\nu$. Then the SDE (1) has a unique solution; in addition, this solution is ergodic.
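A minimal simulation sketch of model (1) may help fix ideas. It assumes an Euler-Maruyama discretisation with a hypothetical step `dt`, the illustrative drift S(x) = -x (an Ornstein-Uhlenbeck process, which satisfies condition (2)) and a deterministic start at 0 rather than the invariant law. The name `simulate_path` and all numerical values are assumptions made for this example, not part of the talk.

```python
import numpy as np

def simulate_path(drift, T=200.0, dt=0.01, x0=0.0, seed=0):
    """Euler-Maruyama approximation of dX_t = S(X_t) dt + dW_t on [0, T]."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    x = np.empty(n + 1)
    x[0] = x0
    # Independent Wiener increments W_{t+dt} - W_t ~ N(0, dt).
    dw = rng.normal(0.0, np.sqrt(dt), size=n)
    for i in range(n):
        x[i + 1] = x[i] + drift(x[i]) * dt + dw[i]
    return x

# Illustrative drift S(x) = -x (an assumption for this sketch).
path = simulate_path(lambda x: -x)
print(path.mean(), path.var())
```

For this drift the invariant law is Gaussian with variance 1/2, so the printed empirical variance should be of that order for large T.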
For simplicity, we suppose that $X_0 = \xi$ follows the invariant law; the probability density of this law is given by
$$f_S(y) = \frac{1}{G(S)} \exp\Big\{ 2\int_0^y S(v)\,dv \Big\},$$
where $G(S)$ is the normalising constant.
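As a quick sanity check of the invariant density formula, consider the illustrative drift $S(x) = -x$. This choice is an assumption made for the example; it is not singled out in the talk.

```latex
\begin{align*}
f_S(y) &= \frac{1}{G(S)} \exp\Big\{ 2\int_0^y (-v)\,dv \Big\}
        = \frac{1}{G(S)}\, e^{-y^2}, \\
G(S)   &= \int_{\mathbb{R}} e^{-y^2}\,dy = \sqrt{\pi}.
\end{align*}
```

In this case the invariant law is the centred Gaussian law with variance 1/2.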
The statistical problem we are interested in:
- The observation is a continuous path $x^T$ of $X$ over $[0, T]$.
- The unknown function is the trend coefficient $S(\cdot)$.
- The function of interest is $S(\cdot)$.
- We are interested in the behavior of the estimators as $T \to \infty$.
- The quality of estimation is measured by the $L^2(\mathbb{R}, f_S^2)$-risk:
$$R_T(\bar S_T, S) = \mathbf{E}_S \int_{\mathbb{R}} \big(\bar S_T(x) - S(x)\big)^2 f_S^2(x)\,dx.$$
Historical Background
- Pham, T. D. (1981), Prakasa Rao, B. L. S. (1990) and Van Zanten, J. H. (2001) studied the rate of convergence of a kernel estimator and its asymptotic normality. If $S(\cdot) \in H^{\beta}$, then the (optimal) rate of convergence is proved to be $T^{2\beta/(2\beta+1)}$.
- Spokoiny, V. G. (2000) constructed an adaptive, almost rate-optimal estimator of the trend coefficient via locally linear approximation of the log-likelihood.
- Galtchouk L. and Pergamenshchikov S. (2001a, 2001b) considered
the problem of trend estimation, when the diffusion is observed up to a stopping time.
Local Minimax Risk
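1. The parameter space. Let $\Sigma(k) = \Sigma_0 \cap C^k$, for any $k \in \mathbb{N}$, be the set of all $k$ times differentiable trend coefficients satisfying condition (2). We fix an $S_0 \in \Sigma(k)$, $k \in \mathbb{N}^*$, $R > 0$ and define
$$\Sigma_\delta = \Big\{ S \in V_\delta(S_0) : \big\| (S^{(k)} - S_0^{(k)})\, f_S \big\|_2^2 \le R \Big\}.$$
Thus $\Sigma_\delta = \Sigma_\delta(S_0, k, R)$.
2. The local risk. Let $\bar S_T(\cdot) = \bar S_T(\cdot, x^T)$ be an estimator of the trend coefficient $S(\cdot)$; then
$$R_T(\bar S_T, \Sigma_\delta) = \sup_{S \in \Sigma_\delta} \mathbf{E}_S \big\| (\bar S_T - S)\, f_S \big\|_2^2.$$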
3. The minimax approach. We study the minimax risk
$$r_T(\Sigma_\delta) = \inf_{\bar S_T} \sup_{S \in \Sigma_\delta} R_T(\bar S_T, S).$$
The asymptotic behavior of this quantity is described by the following theorem.
Theorem 1 (Dalalyan, A. S. & Kutoyants, Yu. A. (2001)). Let $k \ge 1$ and let the order of smoothness of the density $f_0$ corresponding to the central function $S_0$ be greater than $k + 1$. Then
$$\lim_{\delta\to 0}\lim_{T\to\infty} T^{\frac{2k}{2k+1}}\, r_T(\Sigma_\delta) = P(k, R),$$
where $P(k, R)$ is the Pinsker constant (Pinsker, M. S. (1980)):
$$P(k, R) = (2k+1) \Big( \frac{k}{\pi (k+1)(2k+1)} \Big)^{\frac{2k}{2k+1}} R^{\frac{1}{2k+1}}.$$
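The Pinsker constant is easy to evaluate numerically. The sketch below simply transcribes the formula for P(k, R) stated above; the function name `pinsker_constant` and the sample values of k and R are assumptions made for illustration.

```python
import math

def pinsker_constant(k: int, R: float) -> float:
    """P(k, R) = (2k+1) * (k / (pi (k+1)(2k+1)))^(2k/(2k+1)) * R^(1/(2k+1))."""
    if k < 1 or R <= 0:
        raise ValueError("need k >= 1 and R > 0")
    base = k / (math.pi * (k + 1) * (2 * k + 1))
    return (2 * k + 1) * base ** (2 * k / (2 * k + 1)) * R ** (1 / (2 * k + 1))

# Hypothetical smoothness orders and radius, for illustration only.
for k in (1, 2, 3):
    print(k, pinsker_constant(k, R=1.0))
```

The dependence on R is the power R^(1/(2k+1)), so multiplying R by 8 doubles the constant when k = 1.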
Construction of Estimator
Let $K_T$ be a smooth approximation of the Dirac measure at zero $\delta_0$ (e.g. $K_T(x) = h_T^{-1} K(x/h_T)$).
- A natural estimator of the distribution function is
$$\hat F_T(x) = \frac{1}{T} \int_0^T \mathbf{1}_{\{X_t \le x\}}\,dt.$$
- A natural estimator of the invariant density is the convolution
$$f_{K,T}(x) = (K_T * \hat F_T)(x) = \frac{1}{T} \int_0^T K_T(x - X_t)\,dt.$$
- A natural estimator of $f_S'$ is the convolution
$$f^{(1)}_{K,T}(x) = (K_T' * \hat F_T)(x) = \frac{1}{T} \int_0^T K_T'(x - X_t)\,dt.$$
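The two convolution estimators above can be sketched on a discretised path. The example assumes a Gaussian kernel K, a hypothetical bandwidth `h`, and a test path generated from the illustrative drift S(x) = -x by exact Ornstein-Uhlenbeck transitions; time integrals are replaced by sample means. None of these choices come from the talk.

```python
import numpy as np

def ou_path(T, dt, seed=0):
    """Exact discretisation of dX = -X dt + dW, started from N(0, 1/2)."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    a = np.exp(-dt)
    sd = np.sqrt((1.0 - a**2) / 2.0)
    z = rng.normal(size=n)
    x = np.empty(n + 1)
    x[0] = rng.normal(0.0, np.sqrt(0.5))
    for i in range(n):
        x[i + 1] = a * x[i] + sd * z[i]
    return x

def gauss_kernel(u, h):
    """K_T(u) = h^{-1} K(u / h) with K the standard Gaussian density."""
    return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

def density_estimates(grid, path, h):
    """Return (f_{K,T}, f^{(1)}_{K,T}) evaluated on grid."""
    u = grid[:, None] - path[None, :]
    k = gauss_kernel(u, h)
    f_hat = k.mean(axis=1)
    # For the Gaussian kernel, K_T'(u) = -(u / h^2) K_T(u).
    f1_hat = (-(u / h**2) * k).mean(axis=1)
    return f_hat, f1_hat

path = ou_path(T=400.0, dt=0.05)
grid = np.linspace(-4.0, 4.0, 81)
f_hat, f1_hat = density_estimates(grid, path, h=0.3)
```

A fixed bandwidth is used here only for simplicity; the talk is concerned with how the kernel and its bandwidth should be chosen.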
Using the explicit form of the invariant density, we get
$$S(x) = \frac{f_S'(x)}{2 f_S(x)}.$$
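The identity follows by differentiating the invariant density. A short derivation, assuming only the exponential form of $f_S$ with a fixed lower integration limit (taken to be 0 here):

```latex
\begin{align*}
f_S(x) = \frac{1}{G(S)} \exp\Big\{ 2\int_0^x S(v)\,dv \Big\}
\;\Longrightarrow\;
f_S'(x) = 2\,S(x)\,f_S(x)
\;\Longrightarrow\;
S(x) = \frac{f_S'(x)}{2 f_S(x)} .
\end{align*}
```

The last step uses that $f_S(x) > 0$ for every $x$, which holds because $f_S$ is an exponential.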
It provides a natural way of constructing an estimator:
$$\bar S_T(x) = \frac{\text{estimator of } f_S'(x)}{2 \times \text{estimator of } f_S(x)} = \frac{f^{(1)}_{K,T}(x)}{2 f_{K,T}(x)}.$$
The problem with this estimator is that at some points the denominator can be equal to zero while the numerator is $\neq 0$. We avoid this by using the following modified estimator
$$\bar S_{K,T}(x) = \frac{f^{(1)}_{K,T}(x)}{2 f_{K,T}(x) + \nu_T(x)},$$
where $\nu_T(x) = \varepsilon_T\, e^{-l_T |x|}$ with $l_T = \dfrac{1}{\log T}$ and $\varepsilon_T = T^{\frac{1}{\sqrt{\log T}} - \frac{1}{2}}$.
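The effect of the correction term ν_T can be illustrated with a small sketch. It assumes T > 1 and feeds the estimator with toy inputs, namely the exact invariant density of the illustrative drift S(x) = -x and its derivative, instead of kernel estimates. The function names and the value of T are assumptions made for the example.

```python
import numpy as np

def nu_T(x, T):
    """nu_T(x) = eps_T * exp(-l_T |x|), l_T = 1/log T, eps_T = T^(1/sqrt(log T) - 1/2)."""
    l_T = 1.0 / np.log(T)
    eps_T = T ** (1.0 / np.sqrt(np.log(T)) - 0.5)
    return eps_T * np.exp(-l_T * np.abs(x))

def regularised_drift(x, f1_hat, f_hat, T):
    """Modified estimator f1_hat / (2 f_hat + nu_T); the denominator stays positive."""
    return f1_hat / (2.0 * f_hat + nu_T(x, T))

# Toy inputs (assumption): exact density exp(-x^2)/sqrt(pi) and its derivative.
x = np.array([-40.0, -1.0, 0.5, 40.0])
f = np.exp(-x**2) / np.sqrt(np.pi)   # underflows to exactly 0.0 at |x| = 40
f1 = -2.0 * x * f
S_bar = regularised_drift(x, f1, f, T=1.0e4)
print(S_bar)
```

At |x| = 40 the density is numerically zero, so the unmodified ratio would be undefined, while the modified estimator remains finite. The price is a shrinkage towards zero at moderate T, since ε_T decreases slowly.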
Theorem 2 (Dalalyan & Kutoyants, 2001). For any symmetric non-negative smooth function $K(\cdot)$ satisfying $\int_{\mathbb{R}} K(u)\,du = 1$ and for any $S \in \Sigma_\delta$, we have
$$R_T(\bar S_{K,T}, S) \sim \frac{1}{4}\, \mathbf{E}_S \int_{\mathbb{R}} \big( f^{(1)}_{K,T}(x) - f_S'(x) \big)^2 dx, \quad \text{as } T \to \infty.$$
For any function $h \in L^2(\mathbb{R})$, we define the Fourier transform
$$\hat h(\lambda) = \int_{\mathbb{R}} e^{i\lambda x} h(x)\,dx.$$
Thus $\hat K_T$ and $\hat f_S$ are the Fourier transforms of the kernel $K_T$ and the invariant density $f_S$. We also set
$$\hat\varphi_T(\lambda) = \frac{1}{T} \int_0^T e^{i\lambda X_t}\,dt.$$
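The empirical characteristic function is straightforward to compute once the time integral is replaced by a sample mean. In the sketch below, i.i.d. draws from N(0, 1/2) stand in for a stationary path; this is an assumption made to keep the example short, and a real path would be correlated in time. Under the invariant law the expectation of exp(iλX) is the integral of exp(iλx) f_S(x) dx, which equals exp(-λ²/4) for this Gaussian law.

```python
import numpy as np

def empirical_cf(lam, samples):
    """Sample-mean version of phi_hat_T(lam) = (1/T) * int_0^T exp(i lam X_t) dt."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    return np.exp(1j * lam[:, None] * samples[None, :]).mean(axis=1)

rng = np.random.default_rng(0)
# Stand-in for the observations (assumption): i.i.d. draws from N(0, 1/2).
samples = rng.normal(0.0, np.sqrt(0.5), size=20000)
lam = np.array([-1.0, 0.0, 1.0])
phi = empirical_cf(lam, samples)
print(phi, np.exp(-lam**2 / 4.0))
```

The function always returns 1 at λ = 0 and satisfies the conjugate symmetry between λ and -λ.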
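The choice of the minimax kernel. The Parseval identity yields
$$R_T(\bar S_{K,T}, S) \sim \frac{1}{8\pi}\, \mathbf{E}_S \int_{\mathbb{R}} \lambda^2 \big| \hat K_T(\lambda)\, \hat\varphi_T(\lambda) - \hat f_S(\lambda) \big|^2 d\lambda.$$
Using the fact that $\hat\varphi_T$ is an unbiased estimate of $\hat f_S$ and the relation $|\lambda|^2 \operatorname{Var}_S[\hat\varphi_T(\lambda)] \sim 4T^{-1}$, we obtain that the risk $R_T(\bar S_{K,T}, S)$ is equivalent to
$$\Delta_T(\hat K_T, |\hat f_S|^2) = \frac{1}{8\pi} \int_{\mathbb{R}} \lambda^2 \big| \hat K_T(\lambda) - 1 \big|^2 \big| \hat f_S(\lambda) \big|^2 d\lambda + \frac{1}{2\pi T} \int_{\mathbb{R}} \big| \hat K_T(\lambda) \big|^2 d\lambda.$$
At the same time, our conditions imply that
$$\int_{\mathbb{R}} |\lambda|^{2k+2} \big| \hat f_S(\lambda) - \hat f_0(\lambda) \big|^2 d\lambda \le 8\pi R.$$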
The functional $\Delta_T$ has a saddle point, which provides the optimal (minimax) kernel
$$\hat K_T^*(\lambda) = \big( 1 - |\lambda \alpha_T^*|^k \big)_+$$
with optimal bandwidth
$$\alpha_T^* = T^{-\frac{1}{2k+1}} \Big( \frac{4k}{\pi R (k+1)(2k+1)} \Big)^{\frac{1}{2k+1}}.$$
The estimator of the trend coefficient $S$ constructed via this kernel $K_T^*$ is asymptotically optimal, but it cannot be realised if we do not know the smoothness order of the unknown function $S$. Our aim is now to construct an estimator that is adaptive with respect to the parameters $k$ and $R$. This will be done using the method developed by Golubev, G. (1992) and recently used in Cavalier L., Golubev G., Picard D. and Tsybakov A. (2002).
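For known k and R, the minimax kernel is explicit in the frequency domain. The sketch below transcribes the two formulas as stated on the slide; the function names and the values of T, k and R are assumptions made for illustration.

```python
import math
import numpy as np

def optimal_bandwidth(T, k, R):
    """alpha*_T = T^(-1/(2k+1)) * (4k / (pi R (k+1)(2k+1)))^(1/(2k+1))."""
    c = 4.0 * k / (math.pi * R * (k + 1) * (2 * k + 1))
    return T ** (-1.0 / (2 * k + 1)) * c ** (1.0 / (2 * k + 1))

def minimax_kernel_ft(lam, alpha, k):
    """Fourier transform of the minimax kernel: (1 - |lam * alpha|^k)_+."""
    return np.maximum(0.0, 1.0 - np.abs(np.asarray(lam) * alpha) ** k)

# Hypothetical values of T, k and R.
alpha = optimal_bandwidth(T=1000.0, k=2, R=1.0)
lam = np.array([0.0, 0.5 / alpha, 1.0 / alpha, 2.0 / alpha])
weights = minimax_kernel_ft(lam, alpha, k=2)
print(alpha, weights)
```

The weights equal 1 at frequency zero and vanish for |λ| ≥ 1/α, so the filter keeps only frequencies below 1/α and damps them progressively.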
Sharp Adaptive Estimator
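The main idea is to replace the function
$$\hat K_T^*(\lambda) = \big( 1 - |\lambda \alpha_T^*|^k \big)_+$$
by a random function
$$\tilde K_T(\lambda) = \big( 1 - |\lambda \tilde\alpha|^{\tilde\beta} \big)_+,$$
where $\tilde\alpha$ and $\tilde\beta$ are data driven (depend on the observation $X^T$). In order to do this, we define
$$h_{\alpha,\beta}(\lambda) = \big( 1 - |\lambda\alpha|^{\beta} \big)_+, \qquad H_T = \big\{ h_{\alpha,\beta} : \alpha \in [T^{-1}, (\log T)^{-1}];\ \beta \ge 0.5 \big\}.$$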
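Recall that $R_T(\bar S_{K,T}, S) \sim \Delta_T\big(\hat K_T, |\hat f_S|^2\big)$, where
$$\Delta_T(h, |\hat f|^2) = \frac{1}{8\pi} \int_{\mathbb{R}} |\lambda|^2 \big| 1 - h(\lambda) \big|^2 \big| \hat f(\lambda) \big|^2 d\lambda + \frac{1}{2\pi T} \int_{\mathbb{R}} h^2(\lambda)\,d\lambda.$$
In order to choose the values $\alpha$ and $\beta$ in an adaptive way, we should
- define a good estimator $l_T(h)$ of the functional $\Delta_T(h, |\hat f_S|^2)$,
- minimise the (random) functional $l_T(h)$ over a suitably chosen finite subset of $H_T$.
Since $\Delta_T$ is a quadratic functional of $\hat f$, its estimation by the plug-in method is not good: $\mathbf{E}_S |\hat\varphi_T(\lambda)|^2 = |\hat f_S(\lambda)|^2 + \operatorname{Var}_S[\hat\varphi_T(\lambda)]$, and the variance term is asymptotically $4/(T|\lambda|^2)$. That is why we subtract this term and define
$$l_T(h) = \Delta_T\Big( h,\ |\hat\varphi_T(\lambda)|^2 - \frac{4}{T|\lambda|^2} \Big).$$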
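The selection step can be sketched numerically as follows. The example approximates l_T(h) by a Riemann sum on a truncated, uniform frequency grid that avoids λ = 0, and minimises it over a small candidate set of pairs (α, β). The truncation, the candidate grid, the test path (exact Ornstein-Uhlenbeck transitions for the illustrative drift S(x) = -x) and all function names are assumptions made for this sketch; in particular, the candidate set is not the finite subset of H_T used in the talk.

```python
import numpy as np

def h_family(lam, alpha, beta):
    """h_{alpha,beta}(lam) = (1 - |lam * alpha|^beta)_+."""
    return np.maximum(0.0, 1.0 - np.abs(lam * alpha) ** beta)

def l_T(h_vals, phi_abs2, lam, T):
    """Riemann-sum approximation of l_T(h) on a uniform grid lam with lam != 0."""
    dlam = lam[1] - lam[0]
    f2_est = phi_abs2 - 4.0 / (T * lam**2)      # bias-corrected estimate of |f_hat|^2
    bias = np.sum(lam**2 * (1.0 - h_vals) ** 2 * f2_est) * dlam / (8.0 * np.pi)
    var = np.sum(h_vals**2) * dlam / (2.0 * np.pi * T)
    return bias + var

def select_filter(samples, T, lam, alphas, betas):
    """Return the (alpha, beta) pair minimising l_T, together with all scores."""
    phi = np.exp(1j * lam[:, None] * samples[None, :]).mean(axis=1)
    phi_abs2 = np.abs(phi) ** 2
    scores = {(a, b): l_T(h_family(lam, a, b), phi_abs2, lam, T)
              for a in alphas for b in betas}
    return min(scores, key=scores.get), scores

# Test path (assumption): exact OU transitions for dX = -X dt + dW.
T, dt = 200.0, 0.05
rng = np.random.default_rng(0)
n = int(round(T / dt))
a, sd = np.exp(-dt), np.sqrt((1.0 - np.exp(-2.0 * dt)) / 2.0)
z = rng.normal(size=n)
x = np.empty(n + 1)
x[0] = rng.normal(0.0, np.sqrt(0.5))
for i in range(n):
    x[i + 1] = a * x[i] + sd * z[i]

lam = (np.arange(-100, 100) + 0.5) * 0.2             # midpoints in (-20, 20)
alphas = np.geomspace(0.05, 1.0 / np.log(T), 6)      # supports stay inside the grid
betas = [0.5, 1.0, 2.0, 4.0]
best, scores = select_filter(x, T, lam, alphas, betas)
print(best)
```

Because every candidate filter vanishes outside the grid, truncating the frequency range changes all scores by the same amount and does not affect the minimiser.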
Theorem 3. Let the function $S_0$ be such that at an instant $t_0$ the transition density $p_{t_0}(x, y)$ is bounded in both variables. Then there exists a subset $H_T'$ of $H_T$ containing $[\log T]$ elements, such that the estimator
$$\tilde f_T^{(1)}(x) = \frac{1}{2\pi T} \int_0^T \int_{\mathbb{R}} \lambda \sin\big( \lambda (x - X_t) \big)\, \tilde h_T(\lambda)\,d\lambda\,dt,$$
where $\tilde h_T = \arg\min_{h \in H_T'} l_T(h)$, is minimax sharp adaptive in the problem of estimating the derivative $f_S'$. Therefore,
$$\tilde S_T(x) = \frac{\tilde f_T^{(1)}(x)}{2 f_{K,T}(x) + \nu_T(x)}$$
is a sharp adaptive estimator of the trend coefficient, i.e.
$$\lim_{\delta\to 0}\lim_{T\to\infty} T^{\frac{2k}{2k+1}} R_T(\tilde S_T, \Sigma_\delta) = P(k, R).$$
Concluding Remarks
1°. If the function $S$ is Hölder continuous and satisfies condition (2), then the transition density is bounded at any instant $t$ (cf. Veretennikov, A. (1999)).
2°. In the case where the diffusion coefficient is not identically one, it holds
$$S(x) = \frac{\big( \sigma^2(x)\, f_S(x) \big)'}{2 f_S(x)}.$$
That is why the extension of the described method to this case is straightforward.
3°. This result can be easily globalised, provided that the conditions are satisfied uniformly on the parameter set.
αi = (1 + log−1 T)i, βi =
- 1 −