
Sharp Adaptive Estimation of the Trend Coefficient of an Ergodic Diffusion
Arnak S. Dalalyan
Humboldt Universität zu Berlin

Sharp Adaptive Trend Estimation

The Model

Let $X = (X_t, t \ge 0)$ be a random process defined by the SDE
$$dX_t = S(X_t)\,dt + dW_t, \qquad (1)$$
where $W_t$ is a standard Wiener process and $X_0 = \xi$ is a random variable independent of $W$.

Let $\Sigma_0$ be the set of all functions $S(\cdot) \in C^1$ such that
$$\lim_{|x|\to\infty} \operatorname{sgn}(x)\,S(x) < 0, \qquad |S(x)| \le C(1+|x|)^{\nu}, \qquad (2)$$
for some positive constants $C$ and $\nu$. Then the SDE (1) has a unique solution; in addition, this solution is ergodic.
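To make the model concrete, here is a minimal simulation sketch of SDE (1). It assumes an Euler-Maruyama discretisation, which is not part of the slides, and uses the illustrative drift $S(x) = -x$, which satisfies condition (2). The names `simulate_path`, `ou_drift`, `dt` and `seed` are hypothetical.

```python
import math
import random

def simulate_path(drift, x0, horizon, dt, seed=0):
    """Euler-Maruyama approximation of dX_t = S(X_t) dt + dW_t on [0, horizon]."""
    rng = random.Random(seed)
    n_steps = int(round(horizon / dt))
    sqrt_dt = math.sqrt(dt)
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        # drift increment plus a Gaussian increment of variance dt
        path.append(x + drift(x) * dt + sqrt_dt * rng.gauss(0.0, 1.0))
    return path

def ou_drift(x):
    # illustrative choice S(x) = -x (an assumption, not taken from the slides)
    return -x

path = simulate_path(ou_drift, x0=0.0, horizon=200.0, dt=0.01)
time_mean = sum(path) / len(path)
time_var = sum((x - time_mean) ** 2 for x in path) / len(path)
print(time_mean, time_var)
```

For this drift the invariant law is Gaussian with mean 0 and variance 1/2, so by ergodicity the time averages printed above should be close to those values for a long horizon.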

For simplicity, we suppose that $X_0 = \xi$ follows the invariant law; the probability density of this law is given by
$$f_S(y) = \frac{1}{G(S)} \exp\Big\{ 2\int_0^y S(v)\,dv \Big\}.$$

The statistical problem we are interested in:
• The observation is a continuous path $x^T$ of $X$ over $[0, T]$.
• The unknown function is the trend coefficient $S(\cdot)$.
• The function of interest is $S(\cdot)$.
• We are interested in the behavior of the estimators as $T \to \infty$.
• The quality of estimation is measured by the $L^2(\mathbb{R}, f_S^2)$-risk:
$$R_T(\bar S_T, S) = \mathbf{E}_S \int_{\mathbb{R}} \big(\bar S_T(x) - S(x)\big)^2 f_S^2(x)\,dx.$$
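The invariant density formula can be evaluated numerically for a given drift. The sketch below assumes trapezoid quadrature on a truncated interval and the illustrative drift $S(x) = -x$, for which $G(S) = \sqrt{\pi}$ in closed form. All function names are hypothetical.

```python
import math

def log_unnormalised_density(drift, y, m=400):
    """2 * int_0^y S(v) dv by the trapezoid rule (signed step, so y < 0 works too)."""
    step = y / m
    total = 0.5 * (drift(0.0) + drift(y)) + sum(drift(i * step) for i in range(1, m))
    return 2.0 * step * total

def normalising_constant(drift, lower=-8.0, upper=8.0, n=2000):
    """G(S) = int exp(2 int_0^y S(v) dv) dy, truncated to [lower, upper]."""
    step = (upper - lower) / n
    values = [math.exp(log_unnormalised_density(drift, lower + i * step))
              for i in range(n + 1)]
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def ou_drift(x):
    # illustrative drift S(x) = -x (an assumption, not taken from the slides)
    return -x

G = normalising_constant(ou_drift)
f_at_zero = math.exp(log_unnormalised_density(ou_drift, 0.0)) / G
print(G, f_at_zero)
```

The truncation interval and grid sizes are arbitrary choices that are adequate for this light-tailed example; a drift with heavier-tailed invariant law would need a wider interval.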

Historical Background

• Pham, T. D. (1981), Prakasa Rao, B. L. S. (1990) and Van Zanten, J. H. (2001) studied the rate of convergence of a kernel estimator and its asymptotic normality. If $S(\cdot) \in H^{\beta}$, then the (optimal) rate of convergence is proved to be $T^{\frac{2\beta}{2\beta+1}}$.
• Spokoiny, V. G. (2000) constructed an adaptive, almost rate optimal estimator of the trend coefficient via locally linear approximation of the log-likelihood.
• Galtchouk, L. and Pergamenshchikov, S. (2001a, 2001b) considered the problem of trend estimation when the diffusion is observed up to a stopping time.

Local Minimax Risk

1. The parameter space. Let $\Sigma(k) = \Sigma_0 \cap C^k$, for any $k \in \mathbb{N}$, be the set of all $k$ times differentiable trend coefficients satisfying condition (2). We fix an $S_0 \in \Sigma(k)$, $k \in \mathbb{N}^*$, $R > 0$ and define
$$\Sigma_\delta = \Big\{ S \in V_\delta(S_0) : \int_{\mathbb{R}} \big(S^{(k)} - S_0^{(k)}\big)^2 f_S^2 \le R \Big\}.$$
Thus $\Sigma_\delta = \Sigma_\delta(S_0, k, R)$.

2. The local risk. Let $\bar S_T(\cdot) = \bar S_T(\cdot, x^T)$ be an estimator of the trend coefficient $S(\cdot)$; then
$$R_T(\bar S_T, \Sigma_\delta) = \sup_{S \in \Sigma_\delta} \mathbf{E}_S \int_{\mathbb{R}} (\bar S_T - S)^2 f_S^2.$$

3. The minimax approach. We study the minimax risk
$$r_T(\Sigma_\delta) = \inf_{\bar S_T} \sup_{S \in \Sigma_\delta} R_T(\bar S_T, S).$$
The asymptotic behavior of this quantity is described by the following theorem.

Theorem 1 (Dalalyan, A. S. & Kutoyants, Yu. A. (2001)). Let $k \ge 1$ and let the order of smoothness of the density $f_0$ corresponding to the central function $S_0$ be $> k + 1$. Then
$$\lim_{\delta \to 0} \lim_{T \to \infty} T^{\frac{2k}{2k+1}}\, r_T(\Sigma_\delta) = P(k, R),$$
where $P(k, R)$ is the Pinsker constant (Pinsker, M. S. (1980)):
$$P(k, R) = (2k+1)\, R^{\frac{1}{2k+1}} \Big( \frac{k}{\pi (k+1)(2k+1)} \Big)^{\frac{2k}{2k+1}}.$$
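As a quick numerical aid, the Pinsker constant of Theorem 1 and the resulting first-order approximation $P(k,R)\,T^{-2k/(2k+1)}$ of the minimax risk can be computed directly. The function names are hypothetical, and the approximation is only the leading asymptotic term suggested by the theorem.

```python
import math

def pinsker_constant(k, R):
    """P(k, R) = (2k+1) R^{1/(2k+1)} (k / (pi (k+1)(2k+1)))^{2k/(2k+1)}."""
    exponent = 2.0 * k / (2.0 * k + 1.0)
    base = k / (math.pi * (k + 1) * (2 * k + 1))
    return (2 * k + 1) * R ** (1.0 / (2 * k + 1)) * base ** exponent

def minimax_risk_approx(k, R, T):
    """Leading-order term P(k, R) * T^{-2k/(2k+1)} suggested by Theorem 1."""
    return pinsker_constant(k, R) * T ** (-2.0 * k / (2.0 * k + 1.0))

for k in (1, 2, 3):
    print(k, pinsker_constant(k, 1.0), minimax_risk_approx(k, 1.0, 1000.0))
```

The same constant can be written as $(R(2k+1))^{1/(2k+1)} \big(k/(\pi(k+1))\big)^{2k/(2k+1)}$, which is the form usually quoted for Pinsker's constant.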

Construction of Estimator

Let $K_T$ be a smooth approximation of the Dirac measure at zero $\delta_0$ (e.g. $K_T(x) = h_T^{-1} K(x/h_T)$).

• A natural estimator of the distribution function is
$$\hat F_T(x) = \frac{1}{T} \int_0^T \mathbb{1}_{\{X_t \le x\}}\,dt.$$
• A natural estimator of the invariant density is the convolution
$$f_{K,T}(x) = (K_T * \hat F_T)(x) = \frac{1}{T} \int_0^T K_T(x - X_t)\,dt.$$
• A natural estimator of $f_S'$ is the convolution
$$f^{(1)}_{K,T}(x) = (K_T' * \hat F_T)(x) = \frac{1}{T} \int_0^T K_T'(x - X_t)\,dt.$$
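The two convolution estimators above can be sketched for a path sampled on an equally spaced time grid, where the time integral $\frac{1}{T}\int_0^T$ is replaced by an average over the samples. The Gaussian kernel and all names below are assumptions made for illustration; the slides only require $K_T$ to approximate the Dirac measure.

```python
import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def gaussian_kernel_derivative(u):
    return -u * gaussian_kernel(u)

def density_estimate(path, x, bandwidth):
    """f_{K,T}(x): average of K_T(x - X_t), with K_T(u) = K(u / h) / h."""
    total = sum(gaussian_kernel((x - xt) / bandwidth) for xt in path)
    return total / (bandwidth * len(path))

def density_derivative_estimate(path, x, bandwidth):
    """f^{(1)}_{K,T}(x): average of K_T'(x - X_t), with K_T'(u) = K'(u / h) / h^2."""
    total = sum(gaussian_kernel_derivative((x - xt) / bandwidth) for xt in path)
    return total / (bandwidth ** 2 * len(path))

# toy sampled path (hypothetical values, only to show the call pattern)
toy_path = [0.1, -0.3, 0.2, 0.0, -0.1, 0.4, -0.2]
print(density_estimate(toy_path, 0.0, 0.5))
print(density_derivative_estimate(toy_path, 0.0, 0.5))
```

For a path that stays at a single point, both estimators reduce to the rescaled kernel and its derivative centred at that point, which gives an easy sanity check.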

Using the explicit form of the invariant density, we get
$$S(x) = \frac{f_S'(x)}{2 f_S(x)}.$$
This provides a natural way of constructing an estimator:
$$\bar S_T(x) = \frac{\text{estimator of } f_S'(x)}{2 \times \text{estimator of } f_S(x)} = \frac{f^{(1)}_{K,T}(x)}{2 f_{K,T}(x)}.$$
The problem with this estimator is that at some points the denominator can be equal to zero while the numerator is $\neq 0$. We avoid this by using the following modified estimator:
$$\bar S_{K,T}(x) = \frac{f^{(1)}_{K,T}(x)}{2 f_{K,T}(x) + \nu_T(x)},$$
where $\nu_T(x) = \varepsilon_T e^{-l_T |x|}$. The sequences $l_T$ and $\varepsilon_T$ depend on $T$ through $\log T$ [their exact expressions are illegible in the source: "l T = 2 . 1 (log T ) and ε T = T", with stray superscripts "√ log T", "− 1", "1"].
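The identity $S = f_S'/(2 f_S)$ and the effect of the correction $\nu_T$ can be checked on a closed-form example. The sketch assumes the drift $S(x) = -x$, whose invariant density is $e^{-x^2}/\sqrt{\pi}$; the sequences $\varepsilon_T$ and $l_T$ are passed in as plain numbers (`eps`, `ell`) with arbitrarily chosen values, and all names are hypothetical.

```python
import math

def regularised_trend(f1_hat, f_hat, x, eps, ell):
    """S_bar(x) = f1_hat / (2 f_hat + nu(x)), with nu(x) = eps * exp(-ell * |x|)."""
    nu = eps * math.exp(-ell * abs(x))
    return f1_hat / (2.0 * f_hat + nu)

def ou_density(x):
    # invariant density for the illustrative drift S(x) = -x
    return math.exp(-x * x) / math.sqrt(math.pi)

def ou_density_derivative(x):
    return -2.0 * x * ou_density(x)

x = 0.7
# with eps = 0 the ratio reproduces S(x) = -x
exact_ratio = regularised_trend(ou_density_derivative(x), ou_density(x), x, eps=0.0, ell=1.0)
# a small eps changes the value only slightly where the density is not small
damped_ratio = regularised_trend(ou_density_derivative(x), ou_density(x), x, eps=1e-3, ell=1.0)
# a vanishing density estimate no longer causes a division by zero
safe_value = regularised_trend(0.2, 0.0, x, eps=1e-3, ell=1.0)
print(exact_ratio, damped_ratio, safe_value)
```

The last call shows the purpose of the correction: the ratio stays finite, although it can be large where the density estimate vanishes.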

Theorem 2 (Dalalyan & Kutoyants, 2001). For any symmetric non-negative smooth function $K(\cdot)$ satisfying $\int_{\mathbb{R}} K(u)\,du = 1$ and for any $S \in \Sigma_\delta$, we have
$$R_T(\bar S_{K,T}, S) \sim \frac{1}{4}\, \mathbf{E}_S \int_{\mathbb{R}} \big( f^{(1)}_{K,T}(x) - f_S'(x) \big)^2 dx$$
as $T \to \infty$.

For any function $h \in L^2(\mathbb{R})$, we define the Fourier transform
$$\hat h(\lambda) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{i\lambda x} h(x)\,dx.$$
Thus $\hat K_T$ and $\hat f_S$ are the Fourier transforms of the kernel $K_T$ and the invariant density $f_S$. We also set
$$\hat\varphi_T(\lambda) = \frac{1}{T} \int_0^T e^{i\lambda X_t}\,dt.$$

The choice of the minimax kernel. The Parseval identity yields
$$R_T(\bar S_{K,T}, S) \sim \frac{1}{8\pi}\, \mathbf{E}_S \int_{\mathbb{R}} \lambda^2 \big| \hat K_T(\lambda)\, \hat\varphi_T(\lambda) - \hat f_S(\lambda) \big|^2 d\lambda.$$
Using the fact that $\hat\varphi_T$ is an unbiased estimate of $\hat f_S$ and the relation $|\lambda|^2\, \mathrm{Var}_S[\hat\varphi_T(\lambda)] \sim 4 T^{-1}$, we obtain that the risk $R_T(\bar S_{K,T}, S)$ is equivalent to
$$\Delta_T(\hat K, |\hat f|) = \frac{1}{8\pi} \int_{\mathbb{R}} \lambda^2 \big| \hat K_T(\lambda) - 1 \big|^2 \big| \hat f_S(\lambda) \big|^2 d\lambda + \frac{1}{2\pi T} \int_{\mathbb{R}} \big| \hat K_T(\lambda) \big|^2 d\lambda.$$
At the same time, our conditions imply that
$$\int_{\mathbb{R}} |\lambda|^{2k+2} \big| \hat f(\lambda) - \hat f_0(\lambda) \big|^2 d\lambda \le 8\pi R.$$

The functional $\Delta_T$ has a saddle point, which provides the optimal (minimax) kernel
$$\hat K_T^*(\lambda) = \big( 1 - |\lambda \alpha_T^*|^k \big)_+$$
with optimal bandwidth
$$\alpha_T^* = T^{-\frac{1}{2k+1}} \Big( \frac{4k}{\pi R (k+1)(2k+1)} \Big)^{\frac{1}{2k+1}}.$$
The estimator of the trend coefficient $S$ constructed via this kernel $K_T^*$ is asymptotically optimal, but it cannot be computed if we do not know the smoothness order of the unknown function $S$.

Our aim is now to construct an estimator that is adaptive with respect to the parameters $k$ and $R$. This will be done using the method developed by Golubev, G. (1992) and recently used in Cavalier L., Golubev G., Picard D. and Tsybakov A. (2002).
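The optimal kernel is specified through its Fourier transform, so it is straightforward to evaluate. The sketch below simply transcribes the two formulas of this slide; the function names are hypothetical, and no claim is made beyond reproducing the stated expressions.

```python
import math

def optimal_bandwidth(k, R, T):
    """alpha*_T = T^{-1/(2k+1)} * (4k / (pi R (k+1)(2k+1)))^{1/(2k+1)}."""
    power = 1.0 / (2 * k + 1)
    return T ** (-power) * (4.0 * k / (math.pi * R * (k + 1) * (2 * k + 1))) ** power

def optimal_kernel_ft(lam, k, alpha):
    """Fourier-domain weight (1 - |lam * alpha|^k)_+ ."""
    return max(0.0, 1.0 - abs(lam * alpha) ** k)

alpha = optimal_bandwidth(k=2, R=1.0, T=100.0)
for lam in (0.0, 0.5 / alpha, 1.0 / alpha, 2.0 / alpha):
    print(lam, optimal_kernel_ft(lam, 2, alpha))
```

The weight equals 1 at frequency zero, decreases to 0 at $|\lambda| = 1/\alpha_T^*$ and vanishes beyond it, so frequencies above $1/\alpha_T^*$ are discarded entirely. Both $k$ and $R$ enter the bandwidth, which is why the kernel is unavailable when they are unknown.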

Sharp Adaptive Estimator

The main idea is to replace the function
$$\hat K_T^*(\lambda) = \big( 1 - |\lambda \alpha_T^*|^k \big)_+$$
by a random function
$$\tilde K_T(\lambda) = \big( 1 - |\lambda \tilde\alpha|^{\tilde\beta} \big)_+,$$
where $\tilde\alpha$ and $\tilde\beta$ are data driven (they depend on the observation $X^T$). To this end, we define
$$h_{\alpha,\beta}(\lambda) = \big( 1 - |\lambda \alpha|^{\beta} \big)_+, \qquad \mathcal{H}_T = \big\{ h_{\alpha,\beta} : \alpha \in [T^{-1}, (\log T)^{-1}];\ \beta \ge 0.5 \big\}.$$

Recall that
$$R_T(\bar S_{K,T}, S) \sim \Delta_T\big( \hat K_T, |\hat f_S| \big),$$
where
$$\Delta_T(h, |\hat f|^2) = \frac{1}{8\pi} \int_{\mathbb{R}} |\lambda|^2 \big| 1 - h(\lambda) \big|^2 \big| \hat f(\lambda) \big|^2 d\lambda + \frac{1}{2\pi T} \int_{\mathbb{R}} h^2(\lambda)\,d\lambda.$$
In order to choose the values $\alpha$ and $\beta$ in an adaptive way, we should
- define a good estimator $l_T(h)$ of the functional $\Delta_T(h, \hat f)$,
- minimise the (random) functional $l_T(h)$ over a suitably chosen finite subset of $\mathcal{H}_T$.

Since $\Delta_T$ is a quadratic functional, a plain plug-in estimate of it is not suitable. That is why we define
$$l_T(h) = \Delta_T\Big( h,\ \big| \hat\varphi_T(\lambda) \big|^2 - \frac{4}{T |\lambda|^2} \Big).$$
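The selection step can be sketched numerically. Two assumptions are made here that are not in the slides. First, since $(1-h)^2 = 1 + (h^2 - 2h)$, the part of $l_T(h)$ that does not depend on $h$ is dropped, which leaves the minimiser unchanged and restricts the integral to the support of $h$. Second, the integral is approximated by a midpoint rule and $\hat\varphi_T$ by an average over an equally spaced sample of the path. The candidate grid and all names are hypothetical.

```python
import cmath
import math

def empirical_cf_sq(path, lam):
    """|phi_hat_T(lam)|^2, with phi_hat_T(lam) ~ average of exp(i lam X_t)."""
    phi = sum(cmath.exp(1j * lam * xt) for xt in path) / len(path)
    return abs(phi) ** 2

def h_weight(lam, alpha, beta):
    """h_{alpha,beta}(lam) = (1 - |lam * alpha|^beta)_+ ."""
    return max(0.0, 1.0 - abs(lam * alpha) ** beta)

def criterion(path, T, alpha, beta, n_grid=200):
    """l_T(h_{alpha,beta}) minus its h-independent part, midpoint rule on [0, 1/alpha]."""
    step = (1.0 / alpha) / n_grid
    total = 0.0
    for j in range(n_grid):
        lam = (j + 0.5) * step
        h = h_weight(lam, alpha, beta)
        # lam^2 * (|phi_hat|^2 - 4 / (T lam^2)), written without dividing by lam
        signal = lam * lam * empirical_cf_sq(path, lam) - 4.0 / T
        total += (h * h - 2.0 * h) * signal / (8.0 * math.pi)
        total += h * h / (2.0 * math.pi * T)
    return 2.0 * step * total  # the integrand is even in lam

def select_weight(path, T, candidates):
    """Return the (alpha, beta) pair with the smallest criterion value."""
    return min(candidates, key=lambda ab: criterion(path, T, ab[0], ab[1]))

# toy deterministic stand-in for a sampled path (not a simulated diffusion)
path = [math.sin(0.37 * i) for i in range(400)]
candidates = [(a, b) for a in (0.1, 0.2, 0.5) for b in (1.0, 2.0)]
best = select_weight(path, T=40.0, candidates=candidates)
print(best)
```

The cost is proportional to the number of path samples times the number of quadrature nodes times the number of candidates, which is why a finite candidate set is needed in practice.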

Theorem 3. Let the function $S_0$ be such that at an instant $t_0$ the transition density $p_{t_0}(x, y)$ is bounded in both variables. Then there exists a subset $\mathcal{H}_T'$ of $\mathcal{H}_T$ containing $[\log T]$ elements, such that the estimator
$$\tilde f^{(1)}_T(x) = \frac{1}{2\pi T} \int_0^T \int_{\mathbb{R}} \lambda \sin\big( \lambda (x - X_t) \big)\, \tilde h_T(\lambda)\,d\lambda\,dt,$$
where
$$\tilde h_T = \arg\min_{h \in \mathcal{H}_T'} l_T(h),$$
is minimax sharp adaptive in the problem of estimating the derivative $f_S'$. Therefore,
$$\tilde S_T(x) = \frac{\tilde f^{(1)}_T(x)}{2 f_{K,T}(x) + \nu_T(x)}$$
is a sharp adaptive estimator of the trend coefficient, i.e.
$$\lim_{\delta \to 0} \lim_{T \to \infty} T^{\frac{2k}{2k+1}}\, R_T(\tilde S_T, \Sigma_\delta) = P(k, R).$$

Concluding Remarks

1°. If the function $S$ is Hölder continuous and satisfies condition (2), then the transition density is bounded at any instant $t$ (cf. Veretennikov, A. (1999)).

2°. In the case where the diffusion coefficient is not identically one, it holds that
$$S(x) = \frac{\big( \sigma^2(x) f_S(x) \big)'}{2 f_S(x)}.$$
That is why the extension of the described method to this case is straightforward.

3°. This result can be easily globalised, provided that the conditions are satisfied uniformly on the parameter set.

$\alpha_i = (1 + \log^{-1} T)^{i}$, and $\beta_i$ is a power, involving $i$, of $1 - \frac{1}{\log T}$ [exact exponent illegible in the source].
