Three different approaches to frontier estimation
  1. Three different approaches to frontier estimation
     Stéphane Girard, INRIA Grenoble Rhône-Alpes & LJK
     http://mistis.inrialpes.fr/people/girard/
     March 2014
     Joint work with A. Guillou (Université de Strasbourg, France), A. Iouditski (Université de Grenoble, France), L. Menneteau (Université de Montpellier, France), A. Nazin (Institute of Control Sciences, Moscow, Russia) and G. Stupfler (Université Aix-Marseille, France).

  2. Outline
     Very brief overview of the literature:
     - First frontier estimator: Geffroy (ISUP, 1964)
     - Piecewise polynomial estimators: Härdle, Park, Tsybakov (JMVA, 1995)
     Three contributions:
     - Extreme-value estimators
     - Linear programming estimators
     - High order moments estimators

  3. Framework
     Let (X_i, Y_i), 1 ≤ i ≤ n, be n independent copies of a random pair (X, Y) whose common distribution has support
         S := { (x, y) ∈ Ω × R : 0 ≤ y ≤ g(x) },
     where
     - X has a density f_X on the compact subset Ω ⊂ R^d,
     - Y | X = x has a density f(· | x) on [0, g(x)],
     - g is a positive function, g(x) = sup{ Y | X = x }.
     We address the problem of estimating g, called the frontier of S.
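To fix ideas, the sampling model above is easy to simulate. The sketch below (pure Python; the function name and frontier are illustrative choices, not from the talk) draws X uniformly on Ω = [0, 1] and Y | X = x uniformly on [0, g(x)], which is a sharp-boundary case (α = 0):

```python
import random

def sample_under_frontier(n, g, seed=0):
    """Draw n points (x, y) with x ~ U[0, 1] and y | x ~ U[0, g(x)]."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.random()
        y = rng.random() * g(x)  # uniform on [0, g(x)]: sharp boundary (alpha = 0)
        pairs.append((x, y))
    return pairs
```

Every simulated point then lies in S, i.e. below the chosen frontier g.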

  4. Illustration: Ω = [0, 1] (figure omitted)

  5. Sharp/non-sharp boundaries
     Härdle, Park, Tsybakov (JMVA, 1995) assumed that, for all (x, y) ∈ S,
         f_X(x) ≥ f_min > 0,   f(y | x) ≥ c (g(x) − y)^α,
     where c > 0 and α ≥ 0. Two cases arise:
     - If α = 0, then f(y | x) ≥ c > 0 for all y ∈ [0, g(x)]: the situation of a "sharp boundary".
     - If α > 0, then we may have f(y | x) → 0 as y → g(x): the situation of a "non-sharp boundary".

  6. Geffroy's estimator
     First frontier estimator, Geffroy (ISUP, 1964), based on the extreme values of the sample:
     - Partition Ω = [0, 1] into k_n equidistant intervals I_{n,r}, r = 1, ..., k_n.
     - Maxima on each bin: Y*_{n,r} = max{ Y_i : X_i ∈ I_{n,r} }.
     - Piecewise constant estimator:
           ĝ_n(x) = Σ_{r=1}^{k_n} I{x ∈ I_{n,r}} Y*_{n,r}.
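The three steps above fit in a few lines of Python. This is a minimal sketch for Ω = [0, 1] with equidistant bins; the function name and the convention that ĝ_n = 0 on empty bins are ours, not from the talk:

```python
def geffroy_estimator(pairs, k):
    """Binwise maxima: returns a function x -> max{Y_i : X_i in the bin of x}."""
    maxima = [0.0] * k                      # estimate set to 0 on empty bins
    for x, y in pairs:
        r = min(int(x * k), k - 1)          # index of the bin I_{n,r} containing x
        maxima[r] = max(maxima[r], y)
    return lambda x: maxima[min(int(x * k), k - 1)]
```

For instance, with k = 2 bins, every x in the same half of [0, 1] receives the same value, namely the largest Y observed in that half.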

  7. Illustration: Geffroy's estimator (figure omitted)

  8. Geffroy's estimator
     Asymptotic behaviour of the L1-distance Δ_n := ∫_0^1 |ĝ_n(x) − g(x)| dx.
     Theorem. Assume that g is γ-Lipschitzian, γ ∈ (0, 1], and that α = 0 (sharp boundary). If some conditions on (k_n) hold, then (n/k_n)(Δ_n − β_n) converges in distribution to a Gumbel random variable with c.d.f. ψ(z) = exp(−exp(−θz)), where
         θ = inf_{x ∈ [0,1]} f_X(x) f(g(x) | x),
     and β_n is the solution of the equation
         ∫_0^1 exp( log k_n − (n β_n / k_n) f_X(x) f(g(x) | x) ) dx = 1.
     The rate of convergence (n/k_n) is, up to a logarithmic factor, n^{γ/(1+γ)}.

  9. Piecewise polynomial estimators
     Proposed in Härdle, Park, Tsybakov (JMVA, 1995) to deal with:
     - sharp or non-sharp boundaries (α ≥ 0),
     - smoother frontiers: for γ > 0, it is assumed that the ⌊γ⌋-th derivative of the frontier g is (γ − ⌊γ⌋)-Lipschitzian.
     The estimator requires a partition I_{n,r}, r = 1, ..., k_n, of Ω = [0, 1]. On the r-th bin, the estimator is defined as the polynomial of degree ⌊γ⌋ covering all the points and with smallest surface:
         ĝ_n(x) = Σ_{r=1}^{k_n} I{x ∈ I_{n,r}} P_{n,r}(x; θ_{n,r}),
         θ_{n,r} = arg min_θ ∫_{I_{n,r}} P_{n,r}(x; θ) dx   s.t.   P_{n,r}(X_i; θ) ≥ Y_i for all X_i ∈ I_{n,r}.
     Note that if γ ∈ (0, 1], then ⌊γ⌋ = 0 and we recover Geffroy's estimator.
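For ⌊γ⌋ = 1 (piecewise linear case), the per-bin optimisation is a linear program in the two coefficients of the line a + bx. The sketch below solves it exactly by enumerating vertices of the feasible region, i.e. pairs of active point-constraints; it is an illustrative reconstruction, not the authors' implementation, and it assumes the LP is bounded, which requires at least two sample points with distinct abscissae in the bin:

```python
def best_line_on_bin(points, x_lo, x_hi):
    """Exact solution of: min_{a,b} integral of (a + b x) over [x_lo, x_hi]
    subject to a + b X_i >= Y_i for every (X_i, Y_i) in points.
    With two decision variables, a bounded optimum is attained at a vertex
    where two point-constraints are active, so we enumerate pairs."""
    mid = 0.5 * (x_lo + x_hi)
    best = None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (xi, yi), (xj, yj) = points[i], points[j]
            if xi == xj:
                continue
            b = (yj - yi) / (xj - xi)            # line through points i and j
            a = yi - b * xi
            if all(a + b * x >= y - 1e-12 for x, y in points):
                surface = (a + b * mid) * (x_hi - x_lo)   # integral of a + b x
                if best is None or surface < best[0]:
                    best = (surface, a, b)
    return best  # None if fewer than two usable points (LP unbounded)
```

The O(n²) pair enumeration is only meant to make the geometry of the LP explicit; a real implementation would call a linear-programming solver per bin.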

  10. Piecewise polynomial estimators
      Theorem. Under the above assumptions, and for a well-chosen partition, piecewise polynomial estimators achieve the optimal rate of convergence for the L1-error, namely n^{γ/(1+(α+1)γ)}.
      In the case α = 0 (sharp boundary) and γ ∈ (0, 1], Geffroy's estimator achieves the optimal rate of convergence.
      In practice, the estimators are biased downward and discontinuous. The choice of the partition (k_n) is also an issue.

  11. Illustration: Piecewise linear estimator (figure omitted)

  12. Contributions
      - Extreme-value estimator (smoothed, bias correction, sharp boundary, pointwise asymptotic normality)
      - Linear programming estimator (smoothed, no partition of Ω, sharp boundary, strong L1-consistency)
      - High order moments estimator (smoothed, no partition of Ω, non-sharp boundary, pointwise asymptotic normality, strong L∞-consistency)

  13. 1. Extreme-value estimator
      Support S = { (x, y) ∈ Ω × R : 0 ≤ y ≤ g(x) } with Ω ⊂ R^d.
      Geffroy's estimator:
          ĝ_n^(0)(x) = Σ_{r=1}^{k_n} I{x ∈ I_{n,r}} Y*_{n,r},
      where { I_{n,r}, r = 1, ..., k_n } is a partition of Ω and Y*_{n,r} = max{ Y_i : X_i ∈ I_{n,r} }.
      Bias correction. Assume that Y | X = x is uniformly distributed on [0, g(x)] (sharp boundary):
          ĝ_n^(1)(x) = Σ_{r=1}^{k_n} I{x ∈ I_{n,r}} Y*_{n,r} (1 + N_{n,r}^{−1}),
      where N_{n,r} is the number of X_i ∈ I_{n,r}.
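The (1 + N⁻¹) correction can be motivated directly: if Y_1, ..., Y_N are i.i.d. uniform on [0, g], then E[max_i Y_i] = g N/(N + 1), so multiplying the bin maximum by 1 + 1/N removes the downward bias exactly. A hypothetical pure-Python version for Ω = [0, 1] with equidistant bins (names are ours, not from the talk):

```python
def corrected_maxima(pairs, k):
    """Binwise maxima with the (1 + 1/N) correction: for N i.i.d. uniforms
    on [0, g], E[max] = g * N / (N + 1), so max * (1 + 1/N) is unbiased."""
    maxima = [0.0] * k
    counts = [0] * k
    for x, y in pairs:
        r = min(int(x * k), k - 1)          # bin containing x
        counts[r] += 1
        maxima[r] = max(maxima[r], y)
    return [m * (1.0 + 1.0 / c) if c > 0 else 0.0
            for m, c in zip(maxima, counts)]
```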

  14. Extreme-value estimator
      Smoothing:
          ĝ_n^(2)(x) = ∫_{R^d} K_{h_n}(x − t) ĝ_n^(1)(t) dt,
      where K_{h_n}(u) = h_n^{−d} K(u/h_n), K is a d-dimensional density with compact support and h_n is a smoothing parameter. This yields a nonparametric regression over the extreme values of the sample:
          ĝ_n^(2)(x) = Σ_{r=1}^{k_n} ( ∫_{I_{n,r}} K_{h_n}(x − t) dt ) Y*_{n,r} (1 + N_{n,r}^{−1}).
      Girard & Menneteau (JSPI, 2005), Menneteau (ESAIM, 2008).
      Theorem. Assume that g is γ-Lipschitzian, γ ∈ (0, 1]. Under some conditions on the (h_n) and (k_n) sequences, for all (x_1, ..., x_p) ⊂ Ω, the random vector
          ( n h_n^{d/2} k_n^{−1/2} ( ĝ_n^(2)(x_j) − g(x_j) ) : 1 ≤ j ≤ p )
      is asymptotically centred Gaussian with diagonal covariance matrix.
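In dimension d = 1, the smoothed estimator is just a weighted average of the corrected bin values, with weights ∫_{I_{n,r}} K_{h_n}(x − t) dt. A rough sketch, assuming a triangular kernel and a midpoint rule for the integral over [0, 1] (all names illustrative, not from the talk):

```python
def triangular_kernel(u):
    """Compactly supported density K(u) = max(0, 1 - |u|)."""
    return max(0.0, 1.0 - abs(u))

def smoothed_estimate(x, bin_values, h, steps=400):
    """ghat(x) = sum_r (integral of K_h(x - t) over I_r) * bin_values[r],
    computed by a midpoint rule over [0, 1] with equidistant bins (d = 1)."""
    k = len(bin_values)
    total = 0.0
    for s in range(steps):
        t = (s + 0.5) / steps                    # midpoint of the s-th sub-interval
        r = min(int(t * k), k - 1)               # bin I_{n,r} containing t
        weight = triangular_kernel((x - t) / h) / h   # K_{h}(x - t)
        total += weight * bin_values[r] / steps
    return total
```

For x in the interior of [0, 1] and small h, the weights sum to about 1, so the result interpolates smoothly between neighbouring bin values.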

  15. Extreme-value estimator
      Choosing h_n ≍ n^{−1/(γ+d)} and k_n ≍ n^{d/(γ+d)}, the rate of convergence is n^{γ/(d+γ)}, up to logarithmic factors.
      This is the optimal L1-rate of convergence for sharp boundaries (α = 0) and γ-Lipschitzian frontiers, γ ∈ (0, 1].
      The rate of this extreme-value estimator is no longer optimal for smoother frontier functions (γ > 1): the approximation of g(x) by a constant value Y*_{n,r} for x ∈ I_{n,r} is not precise enough.

  16. Illustration: Extreme-value estimator (figure omitted)

  17. Contributions
      - Extreme-value estimator (smoothed, bias correction, sharp boundary, pointwise asymptotic normality)
      - Linear programming estimator (smoothed, no partition of Ω, sharp boundary, strong L1-consistency)
      - High order moments estimator (smoothed, no partition of Ω, non-sharp boundary, pointwise asymptotic normality, strong L∞-consistency)

  18. 2. Linear programming estimator
      Support S = { (x, y) ∈ [0, 1] × R : 0 ≤ y ≤ g(x) }, where g is γ-Lipschitzian, γ ∈ (0, 1].
      The estimator is a linear combination of kernel functions:
          ĝ_n(x) = Σ_{i=1}^n α_i K_{h_n}(x − X_i).
      The coefficients (α_i)_{i=1,...,n} are obtained by minimizing the surface of the estimated support:
          min ∫_R ĝ_n(x) dx = min Σ_{i=1}^n α_i,
      under the following constraints, for all i = 1, ..., n:
      - ĝ_n(X_i) ≥ Y_i (the sample is below the estimated frontier),
      - α_i ≥ 0 (the estimated frontier function is positive),
      - |ĝ_n′(X_i)| ≤ c_0 h_n^{γ−1} (Lipschitz constraint).
      This is a Linear Programming (LP) problem.
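The LP above can be handed to any linear-programming solver. The sketch below uses scipy.optimize.linprog with a triangular kernel, and omits the Lipschitz constraint |ĝ_n′(X_i)| ≤ c_0 h^{γ−1} for brevity; it is an illustrative reconstruction under those simplifications, not the authors' code:

```python
import numpy as np
from scipy.optimize import linprog

def lp_frontier(xs, ys, h):
    """Sketch of the LP estimator (Lipschitz constraint omitted):
    minimize sum(alpha) subject to
      ghat(X_i) = sum_j alpha_j K_h(X_i - X_j) >= Y_i  and  alpha >= 0,
    with the triangular kernel K(u) = max(0, 1 - |u|)."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    n = len(xs)
    # Kernel matrix K[i, j] = K_h(X_i - X_j); since K is a density,
    # the surface of the estimated support is exactly sum(alpha).
    K = np.maximum(0.0, 1.0 - np.abs((xs[:, None] - xs[None, :]) / h)) / h
    # linprog solves min c @ alpha s.t. A_ub @ alpha <= b_ub, so flip signs
    res = linprog(c=np.ones(n), A_ub=-K, b_ub=-ys, bounds=[(0, None)] * n)
    alpha = res.x
    return lambda x: float(np.sum(
        alpha * np.maximum(0.0, 1.0 - np.abs((x - xs) / h)) / h))
```

By construction the returned frontier estimate lies on or above every sample point, up to solver tolerance.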

  19. Linear programming estimator
      Assume that Y | X = x is uniformly distributed on [0, g(x)].
      Remark 1. Joint distribution of the sample Σ_n = (X_i, Y_i)_{i=1,...,n}:
          P(Σ_n | g) = ∏_{i=1}^n [ g(X_i)/C_g ] · [ 1/g(X_i) ] · I{0 ≤ Y_i ≤ g(X_i)},
      with C_g = ∫_R g(x) dx.
      Log-likelihood. Since C_{ĝ_n} = Σ_{i=1}^n α_i, we have
          L(α) = log P(Σ_n | ĝ_n) = −n log( Σ_{i=1}^n α_i ) + Σ_{i=1}^n log I{ Y_i ≤ ĝ_n(X_i) }.
      The LP problem can thus be read as the maximization of the log-likelihood under the additional constraints |ĝ_n′(X_i)| ≤ c_0 h_n^{γ−1}, i = 1, ..., n.
