Simulation-based robust IV inference for lifetime data (Anand Acharya)


  1. Simulation-based robust IV inference for lifetime data
     Anand Acharya (1), Lynda Khalaf (1), Marcel Voia (1), Myra Yazbeck (2), David Wensley (3)
     (1) Department of Economics, Carleton University
     (2) Department of Economics, University of Ottawa
     (3) Department of Pediatrics, University of British Columbia
     June 9, 2017
     Stata Users Group - Bank of Canada 9 June 2017 Simulation-based robust IV inference for lifetime data

  2. Research question, model and complications
     ◮ Research question ⇒ What is the relationship between a patient's length of stay in the pediatric intensive care unit and their illness severity score at the time of admission?
     ◮ Duration model ⇒ Accelerated failure time (AFT).
     ◮ Complications ⇒ (i) Unmeasured confounding, or endogeneity, arising from an omitted variable (unobserved heterogeneity or frailty). (ii) Censoring.
     ◮ Methods ⇒ Robust instrumental variables (IV): the generalized Anderson-Rubin (GAR) statistic and the generalized Andrews-Marmer (GAM) statistic.

  3. Accelerated life model
     The underlying assumption is that covariates "accelerate" or "decelerate" observed time by a constant factor, $\exp(Y\beta + X_1\delta)$. Expressed as a transformation model:
     $$y = \delta_\iota + Y\beta + X_1\delta + \sigma\epsilon. \quad (1)$$
     ◮ $y \equiv \ln(t)$: transformed, possibly right-censored, $(n \times 1)$ durations,
     ◮ $Y$: confounded observed $(n \times 1)$ risk scores,
     ◮ $X_1$: observed $(n \times k_1)$ covariates,
     ◮ $\epsilon$: unobserved $(n \times 1)$ random disturbance.
     We also observe other $(n \times 1)$ instrumental variables $X_2$.

  4. Parametric survival models
     ◮ Lognormal$(\exp(\delta_\iota), \sigma^2)$ → $\epsilon \overset{iid}{\sim}$ Normal$(0, 1)$,
     ◮ Loglogistic$(\exp(\delta_\iota), \sigma)$ → $\epsilon \overset{iid}{\sim}$ Logistic$(0, 1)$,
     ◮ Weibull$(\exp(\delta_\iota), 1/\sigma)$ → $\epsilon \overset{iid}{\sim}$ Gumbel$(0, 1)$,
     where the Lognormal location, Loglogistic location, and Weibull scale parameters are respectively captured in the transformed regression intercept, $\delta_\iota$.

  5. Assumptions
     ◮ Assumption $A_2$: $X_1$, $X_2$ predetermined, or
     ◮ Assumption $A_3$: $X_2$, $\epsilon$ pairwise stochastically independent.
     ◮ Assumption $A_4$: $(X_1, \epsilon)$ independently distributed.
     ◮ Assumption $D_1$: distribution of $\epsilon$ unspecified.
     ◮ Assumptions $D_2$, $D_3$, $D_4$: $\epsilon \overset{iid}{\sim}$ Normal$(0, 1)$, Logistic$(0, 1)$ or Gumbel$(0, 1)$.
     ◮ Assumption $C_3$: $t^* = \min(\tau, t)$ and $d$ is the censoring indicator.

  6. Weak Instruments and Identification Robustness
     ◮ Explicitly make no assumptions on the data generating process that links $Y$ and $X_2$, or on the functional form of the first-stage regression.
     ◮ Anderson and Rubin (1949) proposed inverting a least squares test that assesses the exclusion of the instruments in an auxiliary regression.
     ◮ Auxiliary (least squares) regression:
       $$y - Y\beta_o = X_{1\iota}\lambda + X_2\gamma + \omega, \quad (2)$$
       where $\omega$ is an $(n \times 1)$ random disturbance and $X_{1\iota} = [\iota, X_1]$.

  7. Least Squares Statistic
     ◮ Generalize the Anderson and Rubin (1949) test statistic for $H_o: \beta = \beta_o \Rightarrow \gamma = 0$:
       $$GAR(\beta_o) = \frac{(y - Y\beta_o)'(M_1 - M)(y - Y\beta_o)/k_2}{(y - Y\beta_o)'M(y - Y\beta_o)/(n - k)}, \quad (3)$$
       where $M = I - X(X'X)^{-1}X'$, in which $X = [X_{1\iota}, X_2]$, and $M_1 = I - X_{1\iota}(X_{1\iota}'X_{1\iota})^{-1}X_{1\iota}'$.
     ◮ Pivotal statistic ⇒ exact null distribution:
       $$GAR(\beta_o) = \frac{\epsilon'(M_1 - M)\epsilon/k_2}{\epsilon'M\epsilon/(n - k)} \Rightarrow gar_{calc}(\alpha). \quad (4)$$
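
As a rough illustration of equation (3), the following Python sketch computes the GAR statistic from two residual sums of squares: $r'M_1r$ is the residual sum of squares from regressing $r = y - Y\beta_o$ on $X_{1\iota}$, and $r'Mr$ is the one from regressing $r$ on $[X_{1\iota}, X_2]$. The function names (`solve`, `rss`, `gar`), the list-of-rows data layout, and the scalar `beta0` are assumptions made for this example; the talk itself used Stata/Mata, and censoring is ignored here.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def rss(v, X):
    """Residual sum of squares from least squares of v on the columns of X."""
    k = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xtv = [sum(row[a] * vi for row, vi in zip(X, v)) for a in range(k)]
    coef = solve(xtx, xtv)
    return sum((vi - sum(c * x for c, x in zip(coef, row))) ** 2
               for row, vi in zip(X, v))


def gar(y, Y, X1, X2, beta0):
    """GAR(beta0) as in equation (3); X1 and X2 are lists of rows (hypothetical layout)."""
    n = len(y)
    k2 = len(X2[0])
    r = [yi - Yi * beta0 for yi, Yi in zip(y, Y)]       # y - Y beta0
    X1i = [[1.0] + list(row) for row in X1]             # X_{1 iota} = [iota, X1]
    X = [a + list(b) for a, b in zip(X1i, X2)]          # X = [X_{1 iota}, X2]
    k = len(X[0])
    rss_restricted = rss(r, X1i)                        # r' M1 r
    rss_full = rss(r, X)                                # r' M r
    return ((rss_restricted - rss_full) / k2) / (rss_full / (n - k))


# Tiny hand-checkable example: no covariates besides the intercept, one instrument.
y = [3.0, 2.0, 5.0, 6.0]
Y = [1.0, 0.0, 1.0, 0.0]
X1 = [[], [], [], []]
X2 = [[0.0], [1.0], [2.0], [3.0]]
print(round(gar(y, Y, X1, X2, beta0=2.0), 4))  # 21.3333
```

Because numerator and denominator are both quadratic forms in the same residual vector, rescaling `y` and `Y` by a common constant leaves the statistic unchanged, which is the scale invariance behind the pivotal representation in (4).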

  8. Robust inference
     To construct a confidence set on $\beta_o$, we invert [1] a generalized Anderson-Rubin (GAR) statistic derived from an auxiliary regression:
     $$C_\beta(\alpha) = \{\beta_o : GAR(\beta_o) < gar_{calc}(\alpha)\}. \quad (5)$$
     The solution permits sets that are closed, open, empty, or the union of two or more disjoint intervals. [2]
     [1] Dufour and Taamouti (2005). [2] Dufour (1997).

  9. $$C_\beta(\alpha) = \{\beta_o : \beta_o' A \beta_o + b'\beta_o + c \le 0\},$$
     ◮ An $(n \times 1)$ vector $u_j$ is drawn from the uniform [0,1] distribution.
     ◮ Compute the $j$th realization of the GAR statistic.
     ◮ Repeat for $j = 1, \dots, J$.
     ◮ Construct the simulated exact null distribution.
     ◮ Take the appropriate $\alpha$-level cut-off → confidence set construction.
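
The simulation steps on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the uniform draws $u_j$ are mapped to disturbances through an inverse CDF (the slide does not spell out this step), the model has an intercept only ($X_{1\iota} = \iota$) and a single instrument ($k_2 = 1$), and there is no censoring. The names `gar_null_draw` and `simulated_critical_value` are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gar_null_draw(x2, eps):
    """GAR under H0 for the special case X_{1 iota} = iota and one instrument x2."""
    n = len(eps)
    xbar = sum(x2) / n
    ebar = sum(eps) / n
    xc = [v - xbar for v in x2]
    ec = [e - ebar for e in eps]
    sxx = sum(v * v for v in xc)
    sxe = sum(v * e for v, e in zip(xc, ec))
    explained = sxe * sxe / sxx                       # eps'(M1 - M)eps, with k2 = 1
    residual = sum(e * e for e in ec) - explained     # eps' M eps
    return explained / (residual / (n - 2))           # k = 2 (intercept + instrument)


def simulated_critical_value(x2, inv_cdf, alpha=0.05, J=2000, seed=12345):
    """Empirical (1 - alpha) quantile of J simulated null realizations of GAR."""
    rng = random.Random(seed)
    n = len(x2)
    draws = []
    for _ in range(J):
        # u_j ~ Uniform[0, 1], clamped away from 0 and 1 so the quantile is finite
        u = [min(max(rng.random(), 1e-12), 1.0 - 1e-12) for _ in range(n)]
        eps = [inv_cdf(ui) for ui in u]
        draws.append(gar_null_draw(x2, eps))
    draws.sort()
    idx = min(J - 1, math.ceil((1.0 - alpha) * J) - 1)
    return draws[idx]


x2 = [float(i) for i in range(30)]
# Normal disturbances; for logistic ones pass lambda p: math.log(p / (1 - p)) instead.
print(simulated_critical_value(x2, STD_NORMAL.inv_cdf, J=500))
```

With normal disturbances the simulated cut-off should sit near the corresponding F critical value, which gives a quick sanity check; swapping the inverse CDF changes the assumed disturbance distribution without touching the rest of the procedure.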

  10. Aligned linear rank statistic. [3]
     ◮ Generalize the Andrews and Marmer (2008) test statistic for $H_o: \beta = \beta_o \Rightarrow \gamma = 0$:
       $$\mathrm{rank}(y - Y\beta_o - x_1\hat{\delta}(\beta_o)) = x_2\gamma + \omega. \quad (6)$$
     ◮ Test statistic:
       $$GAM(\beta_o) = c_{(i)}'\,(p_2)\,c_{(i)}, \quad (7)$$
       where $p_2 = x_2(x_2'x_2)^{-1}x_2'$.
     ◮ $c$ is a score vector of $(i) = \mathrm{rank}(y - Y\beta_o - x_1\hat{\delta})$.
     [3] Andrews and Marmer (2008).
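
A minimal Python sketch of equations (6) and (7), under several assumptions that the slide leaves open: $\hat{\delta}(\beta_o)$ is taken to be the least squares fit of $y - Y\beta_o$ on an intercept and a single covariate $x_1$, there is one instrument $x_2$ (so $c'p_2c$ reduces to $(x_2'c)^2/(x_2'x_2)$), the scores are Wilcoxon scores, and there are no ties or censoring. All function names are hypothetical.

```python
def aligned_residuals(v, x1):
    """Residuals from least squares of v on an intercept and one covariate x1."""
    n = len(v)
    xbar = sum(x1) / n
    vbar = sum(v) / n
    sxx = sum((xi - xbar) ** 2 for xi in x1)
    slope = sum((xi - xbar) * (vi - vbar) for xi, vi in zip(x1, v)) / sxx
    return [vi - vbar - slope * (xi - xbar) for xi, vi in zip(x1, v)]


def rank_labels(values):
    """Rank labels 1..n of the values (continuous data, no ties assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for position, i in enumerate(order, start=1):
        labels[i] = position
    return labels


def gam(y, Y, x1, x2, beta0):
    """GAM(beta0) = c' p2 c for a single instrument x2, using Wilcoxon scores."""
    n = len(y)
    v = [yi - Yi * beta0 for yi, Yi in zip(y, Y)]
    labels = rank_labels(aligned_residuals(v, x1))
    c = [2.0 * r / (n + 1) - 1.0 for r in labels]     # Wilcoxon score of each rank
    x2c = sum(a * b for a, b in zip(x2, c))
    return x2c * x2c / sum(a * a for a in x2)


y = [0.0, 3.0, 1.0, 2.0]
Y = [0.0, 0.0, 0.0, 0.0]
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 2.0, 0.0, 1.0]
print(gam(y, Y, x1, x2, beta0=0.0))
```

Because only the ranks of the aligned residuals enter the statistic, adding any linear function of `x1` to `y` leaves the value unchanged.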

  11. Rank scores.
     ◮ Rank scores are derived to be efficient for certain distributional specifications, $F_o$.
     ◮ However, they are robust to misspecification. [4]
     ◮ The score vector satisfies a non-decreasing and non-constant condition, $c_{(1)} \le \dots \le c_{(n)}$ and $c_{(1)} \ne c_{(n)}$, where $(i)$ is the rank label of the associated aligned residual order statistic.
     ◮ Two related and asymptotically equivalent scores are the quantile $F_o$ scores and the expected value $F_o$ scores.
     [4] Chernoff and Savage (1958).

  12. Rank scores: Quantile and expected value. [5]
     ◮ Quantile $F_o$ scores:
       $$c_{(i)} = F_o^{-1}\left(\frac{(i)}{n + 1}\right). \quad (8)$$
     ◮ Expected value $F_o$ scores:
       $$c^*_{(i)} = E_{F_o}[V_{(i)}], \quad (9)$$
       where $V_{(i)}$ is the $i$th order statistic in a random sample of size $n$ and $(i)$ is the rank label of the associated aligned residual order statistic.
     [5] Randles and Wolfe (1979).

  13. Quantile scores.
     ◮ Quantile scores use the rank label to reconstruct the variate values from the quantile function of a presumed distribution.
     ◮ Normal quantile function of van der Waerden (1953):
       $$c_{(i)} = \Phi^{-1}((i)^*). \quad (10)$$
     ◮ Logistic:
       $$c_{(i)} = \ln\left(\frac{(i)^*}{1 - (i)^*}\right). \quad (11)$$
     ◮ Gumbel:
       $$c_{(i)} = -\ln(-\ln((i)^*)). \quad (12)$$
     where $(i)^* = \frac{(i)}{n + 1}$.
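
Equations (10) to (12) translate almost directly into code. The sketch below is a Python illustration (the talk's own implementation was in Mata and is not reproduced in this transcript); the function name `quantile_scores` and the `dist` argument are assumptions made for the example.

```python
import math
from statistics import NormalDist


def quantile_scores(rank_labels, dist="normal"):
    """Quantile scores c_(i) for rank labels (i) in 1..n, using (i)* = (i)/(n+1)."""
    n = len(rank_labels)
    scores = []
    for r in rank_labels:
        p = r / (n + 1)                                # (i)* lies strictly in (0, 1)
        if dist == "normal":                           # van der Waerden, eq. (10)
            scores.append(NormalDist().inv_cdf(p))
        elif dist == "logistic":                       # eq. (11)
            scores.append(math.log(p / (1.0 - p)))
        elif dist == "gumbel":                         # eq. (12)
            scores.append(-math.log(-math.log(p)))
        else:
            raise ValueError("dist must be 'normal', 'logistic' or 'gumbel'")
    return scores


for name in ("normal", "logistic", "gumbel"):
    print(name, quantile_scores([1, 2, 3], name))
```

Dividing by $n + 1$ rather than $n$ keeps every $(i)^*$ strictly inside $(0, 1)$, so all three quantile functions stay finite even for the largest rank.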

  14. Mata code: Quantile scores

  15. Expected value scores.
     ◮ Well-known classical expected value scores:
     ◮ Wilcoxon (1945), where the expected value of the order statistic is derived from sampling the logistic distribution, giving:
       $$c^*_{(i)} = \frac{2(i)}{n + 1} - 1.$$
     ◮ Savage (1956), where the expected value of the order statistic is derived from sampling the exponential distribution, giving:
       $$c^*_{(i)} = \frac{1}{n} + \frac{1}{n - 1} + \dots + \frac{1}{n - (i) + 1} - 1.$$
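
The two closed forms above can be checked numerically with a short Python sketch. The function names are hypothetical, and the input is assumed to be a list of rank labels $(i) \in \{1, \dots, n\}$ with no censoring.

```python
def wilcoxon_scores(rank_labels):
    """Wilcoxon scores c*_(i) = 2(i)/(n+1) - 1."""
    n = len(rank_labels)
    return [2.0 * r / (n + 1) - 1.0 for r in rank_labels]


def savage_scores(rank_labels):
    """Savage scores c*_(i) = 1/n + 1/(n-1) + ... + 1/(n-(i)+1) - 1."""
    n = len(rank_labels)
    return [sum(1.0 / (n - j + 1) for j in range(1, r + 1)) - 1.0
            for r in rank_labels]


labels = [1, 2, 3]
print(wilcoxon_scores(labels))  # [-0.5, 0.0, 0.5]
print(savage_scores(labels))
```

Both score vectors sum to zero over a complete set of ranks, which is a convenient check when implementing them.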

  16. Right censoring.
     ◮ We assume a right censoring scheme in which the censoring indicator, $d$, is independently distributed.
     ◮ The observed time is now $t^* = \min(\tau, t)$, in which $\tau$ is the censoring time.
     ◮ Utilize the framework of Prentice (1978) to adjust the rank scores for right censoring.
     ◮ Index each censored observation within any adjacent non-censored pair by $m$.
     ◮ All censored observations within the same non-censored interval receive the same score.
     ◮ Conceptually, all censored observations now contribute to the rank vector probability via their survivor function.
     ◮ May only be applied to expected value scores.

  17. Right censoring.
     Utilizing the above framework, the expected value rank scores [6] are:
     ◮ Wilcoxon (1945)
       $$c_{(i)} = 1 - 2\prod_{j=1}^{i}\frac{n_j}{n_j + 1}, \qquad c_{(i),m} = 1 - \prod_{j=1}^{i}\frac{n_j}{n_j + 1}.$$
     ◮ Savage (1956)
       $$c_{(i)} = \sum_{j=1}^{i} n_j^{-1} - 1, \qquad c_{(i),m} = \sum_{j=1}^{i} n_j^{-1},$$
     where $n_j$ denotes the number of individuals at risk commencing period $t_{(j)}$.
     [6] Kalbfleisch and Prentice (2002), Chapter 7.
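
The censored scores above can be computed in one pass over the ordered observed times. The Python sketch below is an illustration only: it assumes no tied times, codes `event[i]` as `True` for an uncensored observation (the slides do not state the coding of $d$), and takes $n_j$ to be the number of observations with observed time at least $t_{(j)}$. The function name `censored_scores` is hypothetical.

```python
def censored_scores(times, event, kind="wilcoxon"):
    """Censoring-adjusted expected value scores, returned in the input order.

    Uncensored observations get c_(i); a censored observation falling between
    the i-th and (i+1)-th uncensored times gets c_(i),m.
    """
    if kind not in ("wilcoxon", "savage"):
        raise ValueError("kind must be 'wilcoxon' or 'savage'")
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    scores = [0.0] * n
    prod = 1.0     # running product of n_j / (n_j + 1) over uncensored times
    total = 0.0    # running sum of 1 / n_j over uncensored times
    for position, i in enumerate(order):
        at_risk = n - position                  # n_j when observation i is uncensored
        if event[i]:
            prod *= at_risk / (at_risk + 1.0)
            total += 1.0 / at_risk
            scores[i] = 1.0 - 2.0 * prod if kind == "wilcoxon" else total - 1.0
        else:
            scores[i] = 1.0 - prod if kind == "wilcoxon" else total
    return scores


times = [1.0, 2.0, 3.0, 4.0]
event = [True, False, True, True]               # second observation is censored
print(censored_scores(times, event, "wilcoxon"))
print(censored_scores(times, event, "savage"))
```

When no observation is censored, $n_j = n - j + 1$ and both formulas collapse to the uncensored Wilcoxon and Savage scores, which the function reproduces.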

  18. Mata code: Wilcoxon, Savage

  19. Simulation
     An empirically relevant simulation design adopts the data generating process:
     $$y = Y\beta + X_1\delta + \epsilon, \qquad Y = h\left(X_1\pi_1 + X_2\pi_2 + \sqrt{1 - \rho^2}\,\mu + \rho\epsilon\right).$$
     Size control is achieved in all specifications. Power is increasing in:
     ◮ Instrument strength.
     ◮ Instrument balance.
     ◮ Effect size (clinically relevant difference).
     ◮ Sample size.
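
To make the role of $\rho$ concrete, here is a small Python sketch of the data generating process above. Several choices are assumptions for illustration and are not taken from the slides: $X_1$, $X_2$, $\mu$ and $\epsilon$ are drawn as independent standard normals, $h$ defaults to the identity, all variables are scalar per observation, and there is no censoring. The function name `simulate_dgp` is hypothetical.

```python
import math
import random


def simulate_dgp(n, beta, delta, pi1, pi2, rho, seed=0, h=lambda z: z):
    """Draw n observations (y, Y, x1, x2, eps) from the slide's DGP.

    The first-stage disturbance sqrt(1 - rho^2) * mu + rho * eps has unit
    variance and correlation rho with eps, which is what makes Y endogenous.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        mu = rng.gauss(0.0, 1.0)
        first_stage = x1 * pi1 + x2 * pi2 + math.sqrt(1.0 - rho ** 2) * mu + rho * eps
        Y = h(first_stage)
        y = Y * beta + x1 * delta + eps
        rows.append((y, Y, x1, x2, eps))
    return rows


sample = simulate_dgp(5, beta=1.0, delta=0.5, pi1=0.3, pi2=0.7, rho=0.8, seed=1)
for row in sample:
    print(row)
```

In this parameterization `pi2` governs instrument strength and `rho` the degree of endogeneity, so the power comparisons listed on the slide correspond to varying `pi2`, `beta` and `n` across replications.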
