
Conditional density estimation in a censored single-index regression model



  1. Conditional density estimation in a censored single-index regression model
     Olivier Bouaziz (1) and Olivier Lopez (2)
     (1) Laboratoire de Statistique Théorique et Appliquée
     (2) Crest-Ensai, Irmar, and Weierstrass Institute (Berlin)
     International Workshop on Applied Probability, Compiègne, 10-07-08
     O. Bouaziz and O. Lopez, Estimation in a SIM with censored data, IWAP 10-07-08

  2. Introduction
     Stanford Heart Transplant Data:
       $Y_i$, the response variable: survival time of patient $i$.
       $X_i$, the covariate vector (age and square of age).
     Censored data: for some patients, $Y_i$ is not observed. Possible causes:
       administrative censoring;
       the patient died of causes independent of the heart transplant;
       ...
     Regression models on these data: Miller and Halpern (1982), Wei et al. (1990), Stute et al. (2000)...

  3. Semiparametric model
     Conditional density estimation of $Y$ given $X = x$: $f(y \mid x)$.
     Problem: the "curse of dimensionality".
     Remedy: a semiparametric model for dimension reduction.
     S.I.M. (single-index model) assumption: there exists $\theta_0 \in \Theta \subset \mathbb{R}^d$ such that
       $f(y \mid x) = f_{\theta_0}(y, x'\theta_0)$,
     where $f_\theta(y, u)$ denotes the conditional density of $Y$ given $X'\theta = u$, evaluated at $Y = y$.

  4. Censored data
     The variables of interest $Y_1, \ldots, Y_n$ are not directly observed; $C_1, \ldots, C_n$ are censoring random variables.
     Observations, for $1 \le i \le n$:
       $Z_i = Y_i \wedge C_i$,
       $\delta_i = \mathbf{1}_{Y_i \le C_i}$,
       $X_i \in \chi \subset \mathbb{R}^d$.
     Assumptions of Koul et al. (1981), Stute (1996), Stute (1999), Stute et al. (2000), Sellero et al. (2005)... For $i = 1, \ldots, n$:
       $P(Y_i = C_i) = 0$,
       $Y_i \perp\!\!\!\perp C_i$,
       $P(Y_i \le C_i \mid X_i, Y_i) = P(Y_i \le C_i \mid Y_i)$.
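To make the observation scheme concrete, here is a minimal simulation sketch in Python. The distributions (uniform covariate, exponential survival and censoring times) and the name `simulate_censored` are illustrative assumptions, not taken from the slides; only the construction of $Z_i$ and $\delta_i$ follows the definitions above.

```python
import random

def simulate_censored(n, seed=0):
    """Draw n triples (Z_i, delta_i, X_i) with Z = min(Y, C), delta = 1{Y <= C}."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)      # covariate X_i (one-dimensional here, an assumption)
        y = rng.expovariate(1.0 + x)   # latent survival time Y_i, depends on X_i
        c = rng.expovariate(0.5)       # censoring time C_i, drawn independently of (X_i, Y_i)
        z = min(y, c)                  # observed time Z_i = Y_i ^ C_i
        delta = 1 if y <= c else 0     # delta_i = 1 when Y_i is actually observed
        sample.append((z, delta, x))
    return sample

sample = simulate_censored(200)
# Fraction of patients whose survival time is hidden by censoring.
censoring_rate = 1.0 - sum(d for _, d, _ in sample) / len(sample)
```

Because $C_i$ is drawn independently of $(X_i, Y_i)$ from a continuous distribution, this toy design satisfies the three assumptions listed on the slide.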

  5. Outline
     1 Estimation procedure
     2 Asymptotic results for $\hat\theta$
     3 Key ingredients of proof
     4 Simulation study and analysis on real data

  6. Estimation procedure
     Assume we know $f_\theta$ and define, for any function $J \ge 0$,
       $L(\theta, J) = E\left[\log f_\theta(Y, \theta' X)\, J(X)\right] = \int \log f_\theta(y, \theta' x)\, J(x)\, dF_{X,Y}(x, y)$,
     where $F_{X,Y}(x, y) = P(X \le x, Y \le y)$. Then
       $\theta_0 = \arg\max_{\theta \in \Theta} L(\theta, J)$.
     Problems:
       estimation of $F_{X,Y}(x, y)$;
       estimation of $f_\theta$.

  8. Estimation of $F_{X,Y}$
     Estimator of $F_{X,Y}$ proposed by Stute (1993):
       $\hat F(x, y) = \sum_{i=1}^n \delta_i W_{in} \mathbf{1}_{Z_i \le y,\, X_i \le x}$,
     where $W_{in} = \frac{1}{n\,(1 - \hat G(Z_i-))}$ and $\hat G$ is the Kaplan-Meier estimator of $G(\cdot) = P(C \le \cdot)$.
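The weights $\delta_i W_{in}$ can be computed in one pass over the ordered observations. The following Python sketch assumes there are no ties among the $Z_i$ (consistent with $P(Y_i = C_i) = 0$ for continuous data); the function names `stute_weights` and `joint_cdf_hat` are hypothetical, chosen for illustration.

```python
def stute_weights(z, delta):
    """Return delta_i * W_in, with W_in = 1 / (n * (1 - G_hat(Z_i-))).

    G_hat is the Kaplan-Meier estimator of the censoring distribution,
    so its jumps occur at the censored observations (delta_i = 0).
    Assumes no ties among the z values.
    """
    n = len(z)
    order = sorted(range(n), key=lambda i: z[i])
    weights = [0.0] * n
    surv_c = 1.0  # 1 - G_hat(t-) just before the current ordered observation
    for rank, i in enumerate(order):
        if delta[i] == 1:
            weights[i] = 1.0 / (n * surv_c)
        else:
            # A censored point is an "event" for G_hat; n - rank subjects are at risk.
            surv_c *= 1.0 - 1.0 / (n - rank)
    return weights

def joint_cdf_hat(x, y, z, X, weights):
    """F_hat(x, y): sum of the weights with Z_i <= y and X_i <= x componentwise."""
    return sum(
        w
        for zi, xi, w in zip(z, X, weights)
        if zi <= y and all(a <= b for a, b in zip(xi, x))
    )

# Tiny example: the second observation is censored.
z = [1.0, 2.0, 3.0]
delta = [1, 0, 1]
X = [(0.2,), (0.5,), (0.9,)]
weights = stute_weights(z, delta)
```

In this example the mass of the censored observation is redistributed to the later uncensored one. Without any censoring, every weight reduces to $1/n$ and $\hat F$ is the usual empirical distribution function.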

  9. Estimation of $f_\theta$
     We use a nonparametric kernel smoothing estimator. Let $K$ be a kernel and $h$ a bandwidth satisfying the classical hypotheses.
     Estimator of $f_\theta$:
       $\hat f^h_\theta(z, \theta' x) = \dfrac{\int K_h(\theta' x - \theta' u)\, K_h(z - y)\, d\hat F(u, y)}{\int K_h(\theta' x - \theta' u)\, d\hat F_X(u)}$,
     where $K_h(\cdot) = h^{-1} K(\cdot / h)$ and $\hat F_X$ is the empirical estimator of $F_X$.
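Since $\hat F$ and $\hat F_X$ are discrete, both integrals in the estimator reduce to finite sums. A minimal Python sketch follows, assuming a Gaussian kernel (the slides only require a kernel with "classical hypotheses") and taking the weights $\delta_i W_{in}$ as an input; the names `k_h`, `dot` and `f_hat` are illustrative.

```python
import math

def k_h(t, h):
    """Rescaled Gaussian kernel K_h(t) = K(t / h) / h (kernel choice is an assumption)."""
    return math.exp(-0.5 * (t / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def dot(a, b):
    """Inner product theta'x for two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def f_hat(zq, xq, theta, h, Z, X, weights):
    """Kernel estimate of f_theta(zq, theta'xq).

    weights[i] plays the role of delta_i * W_in (the jump of F_hat at
    observation i); the denominator uses the empirical measure 1/n.
    """
    n = len(Z)
    uq = dot(theta, xq)
    num = 0.0
    den = 0.0
    for zi, xi, wi in zip(Z, X, weights):
        k_index = k_h(uq - dot(theta, xi), h)
        num += wi * k_index * k_h(zq - zi, h)
        den += k_index / n
    return num / den

# Uncensored toy data: with no censoring every weight equals 1/n.
Z = [0.5, 1.0, 1.5, 2.0]
X = [(0.1, 0.4), (0.3, 0.2), (0.6, 0.9), (0.8, 0.5)]
weights = [1.0 / len(Z)] * len(Z)
theta = (1.0, 0.5)
value = f_hat(1.2, (0.5, 0.5), theta, 0.3, Z, X, weights)
```

In the uncensored case the numerator integrates over `zq` to exactly the denominator, so the estimate is a proper density in its first argument. With censoring, the total weight can fall below one when the largest observation is censored.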

  11. First estimator of θ
      We use the following pseudo-likelihood:
        $L_n(\theta, \hat f^h_\theta, J) = \int \log \hat f^h_\theta(y, \theta' x)\, J(x)\, d\hat F_{X,Y}(x, y) = \sum_{i=1}^n \delta_i W_{in} \log \hat f^h_\theta(Z_i, \theta' X_i)\, J(X_i)$.
      We derive the following estimator of $\theta$:
        $\hat\theta(h) = \arg\max_{\theta \in \Theta} L_n(\theta, \hat f^h_\theta, J)$.

  12. First estimator of θ (with the bandwidth $\hat h$)
      With the same pseudo-likelihood $L_n$, the bandwidth $h$ is replaced by $\hat h$:
        $\hat\theta(\hat h) = \arg\max_{\theta \in \Theta} L_n(\theta, \hat f^{\hat h}_\theta, J)$.
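The maximization of the pseudo-likelihood can be sketched end to end in Python for a fixed bandwidth. Several choices below are assumptions made for illustration and are not specified in the slides: a Gaussian kernel, $J \equiv 1$, $d = 2$ with $\theta = (\cos a, \sin a)$ searched over a grid of angles, and a simulated data set. The names `stute_weights`, `pseudo_log_likelihood` and `fit_theta` are hypothetical.

```python
import math
import random

def k_h(t, h):
    """Rescaled Gaussian kernel K_h(t) = K(t / h) / h."""
    return math.exp(-0.5 * (t / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def stute_weights(z, delta):
    """delta_i * W_in with W_in = 1 / (n * (1 - G_hat(Z_i-))); assumes no ties."""
    n = len(z)
    weights = [0.0] * n
    surv_c = 1.0
    for rank, i in enumerate(sorted(range(n), key=lambda i: z[i])):
        if delta[i] == 1:
            weights[i] = 1.0 / (n * surv_c)
        else:
            surv_c *= 1.0 - 1.0 / (n - rank)
    return weights

def pseudo_log_likelihood(theta, h, Z, X, weights):
    """L_n(theta, f_hat^h, J) with J identically 1 (an assumption of this sketch)."""
    n = len(Z)
    index = [sum(t * x for t, x in zip(theta, xi)) for xi in X]
    total = 0.0
    for i in range(n):
        if weights[i] == 0.0:  # censored points carry no weight in the sum
            continue
        num = den = 0.0
        for j in range(n):
            k_idx = k_h(index[i] - index[j], h)
            num += weights[j] * k_idx * k_h(Z[i] - Z[j], h)
            den += k_idx / n
        total += weights[i] * math.log(num / den)
    return total

def fit_theta(h, Z, delta, X, n_grid=30):
    """Grid search over theta = (cos a, sin a), a in [0, pi); returns (a_hat, scores)."""
    weights = stute_weights(Z, delta)
    angles = [math.pi * k / n_grid for k in range(n_grid)]
    scores = [
        pseudo_log_likelihood((math.cos(a), math.sin(a)), h, Z, X, weights)
        for a in angles
    ]
    best = max(range(n_grid), key=scores.__getitem__)
    return angles[best], scores

# Simulated single-index data with independent censoring (illustrative only).
rng = random.Random(1)
theta0 = (math.cos(math.pi / 4), math.sin(math.pi / 4))
X, Z, delta = [], [], []
for _ in range(80):
    x = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
    y = 2.0 + x[0] * theta0[0] + x[1] * theta0[1] + 0.3 * rng.gauss(0.0, 1.0)
    c = rng.uniform(1.0, 6.0)
    X.append(x)
    Z.append(min(y, c))
    delta.append(1 if y <= c else 0)

angle_hat, scores = fit_theta(0.3, Z, delta, X)
```

A grid search is used only because it is easy to read; any numerical optimizer over $\Theta$ could take its place. The sketch keeps the bandwidth fixed and does not implement the selection of $\hat h$.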

  17. Adaptive choice of τ
      The Kaplan-Meier estimator does not behave well in the tail of the distribution.
      Truncation bound: we only keep observations lower than $\tau$.
      SIM assumption: for any $\tau$,
        $\mathcal{L}(Y \mid X, Y \le \tau) = \mathcal{L}(Y \mid X'\theta_0, Y \le \tau)$.
      How can we choose $\tau$ from the data? Asymptotic criterion:
        $E^2(\tau) := \lim_n E\left[\|\hat\theta^\tau(\hat h^\tau) - \theta_0\|^2\right]$.
