 
Causal inference on the difference of the restricted mean lifetime between two groups
Work of P. Chen and A. Tsiatis (Biometrics 2001), among others

Tianchen Qian
Department of Biostatistics
Bloomberg School of Public Health
The Johns Hopkins University
SLAM Seminar, Mar 14, 2014

Tianchen Qian (JHSPH) Causal inference with Cox model SLAM Seminar, Mar 14, 2014 1 / 27

Outline of the Talk
1 Introduction and Motivation
2 Method
    Cox’s model and its asymptotic properties
    Rubin’s Causal Model
    Constructing estimator under two Cox models
    Asymptotic distribution of the estimator
3 Simulations and Example
    Simulations
    Example
4 Summary and Discussion

The Data Problem
Data source: observational study of acute coronary syndrome patients from Duke University Medical Center.
Duration of study: 5 years (start of 1994 to end of 1998).
Sample size: 6033 patients. 3786 were followed for 5+ years or died before the end of the study (1998); the rest have censored survival times.
Treatment groups: PCI group (3868 patients) and MED group (2165 patients). (PCI: percutaneous coronary intervention. MED: medically treated.)
Outcome of interest: survival time up to 5 years.
Goal: compare the restricted mean lifetime between the two treatment groups to assess the treatment effect.

Solution 1: compare group means directly
Throw away censored data (assume non-informative censoring).
Compare group means.
Cons: loss of efficiency.

Solution 2: use Kaplan-Meier estimate
Denote the survival function of group $j$ as $S_j(t)$, $j = 0, 1$.
Kaplan-Meier estimator $\hat{S}_j(t)$, using data from group $j$.
Mean survival time:
\[ \mu = E[T] = \int_0^L P(T \ge t)\,dt = \int_0^L S(t)\,dt, \tag{1} \]
where $L = 5$ years.
Difference between groups:
\[ \hat{\delta} = \hat{\mu}_1 - \hat{\mu}_0 = \int_0^L \left\{ \hat{S}_1(t) - \hat{S}_0(t) \right\} dt. \tag{2} \]
Cons: does not adjust for different covariate distributions between the groups, so the estimated “treatment effect” is likely to be biased.
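
Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names `km_restricted_mean` and `km_difference` are hypothetical, and the integral is computed exactly as the area under the Kaplan-Meier step function on $[0, L]$.

```python
import numpy as np

def km_restricted_mean(x, delta, L):
    """Area under the Kaplan-Meier curve on [0, L].

    x     : observed times X_i = min(T_i, C_i)
    delta : 1 if the event was observed, 0 if censored
    L     : restriction time
    """
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=int)
    event_times = np.unique(x[(delta == 1) & (x <= L)])  # sorted, distinct
    surv = 1.0     # current value of the KM curve
    prev_t = 0.0   # left end of the current flat piece
    area = 0.0
    for t in event_times:
        area += surv * (t - prev_t)            # curve is flat on [prev_t, t)
        at_risk = np.sum(x >= t)
        deaths = np.sum((x == t) & (delta == 1))
        surv *= 1.0 - deaths / at_risk         # KM product-limit update
        prev_t = t
    area += surv * (L - prev_t)                # last flat piece up to L
    return float(area)

def km_difference(x, delta, a, L):
    """Unadjusted estimate (2): group 1 minus group 0 (a is the 0/1 group label)."""
    x, delta, a = (np.asarray(v) for v in (x, delta, a))
    return (km_restricted_mean(x[a == 1], delta[a == 1], L)
            - km_restricted_mean(x[a == 0], delta[a == 0], L))
```

With no censoring, the result reduces to the sample mean of $\min(T_i, L)$, which gives a quick sanity check.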

Solution 3: use Cox model for $\hat{S}_j(t)$
Still use
\[ \hat{\delta} = \hat{\mu}_1 - \hat{\mu}_0 = \int_0^L \left\{ \hat{S}_1(t) - \hat{S}_0(t) \right\} dt \tag{3} \]
as the treatment effect estimator.
Estimate $\hat{S}_j(t)$ using Cox’s proportional hazards model, which can incorporate covariate information.
This is the model we focus on.

Notations
$T_i$: restricted survival time ($\le L$).
$C_i$: censoring time.
$\Delta_i = I(T_i \le C_i)$: censoring indicator.
$X_i = \min(T_i, C_i)$: observed failure time.
$Z_i$: covariate vector.
$N_i(t) = I(X_i \le t, \Delta_i = 1)$.
$Y_i(t) = I(X_i \ge t)$.
$M_i(t) = N_i(t) - \int_0^t \lambda_i(u) Y_i(u)\,du$.
$M(t) = \sum_{i=1}^n M_i(t)$, $N(t) = \sum_{i=1}^n N_i(t)$, $Y(t) = \sum_{i=1}^n Y_i(t)$.

Review of Cox model
Assume
\[ \lambda(t \mid Z) = \lambda_0(t)\, e^{\beta^T Z}, \tag{4} \]
where $\lambda_0(t)$ is the unspecified baseline hazard.
The estimator $\hat{\beta}$ is the maximizer of the partial likelihood function:
\[ L_P(\beta) = \prod_{i=1}^n \left\{ \frac{e^{\beta^T Z_i(x_i)}}{\sum_{j \in R_i} e^{\beta^T Z_j(x_i)}} \right\}^{\delta_i}, \tag{5} \]
where $x_1, \ldots, x_n$ are the $n$ observed survival times, $R_i = \{ j : x_j \ge x_i \}$ is the risk set, and $\delta_i = I(t_i \le c_i)$ is the observed version of $\Delta_i$.
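
As an illustration of (5), the log partial likelihood can be evaluated directly from the risk sets. The sketch below is an assumption-laden example rather than the talk's code: it takes time-fixed covariates (so $Z_j(x_i) = Z_j$), and the function name `log_partial_likelihood` is hypothetical.

```python
import numpy as np

def log_partial_likelihood(beta, x, delta, Z):
    """log L_P(beta) from equation (5), assuming time-fixed covariates.

    beta  : coefficient vector (or scalar)
    x     : observed times
    delta : event indicators (1 = event, 0 = censored)
    Z     : covariates, one row per subject
    """
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=float)
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    Z = np.asarray(Z, dtype=float).reshape(len(x), -1)
    eta = Z @ beta                           # linear predictors beta^T Z_i
    # at_risk[i, j] is True when subject j belongs to R_i = {j : x_j >= x_i}
    at_risk = x[None, :] >= x[:, None]
    log_denom = np.log((at_risk * np.exp(eta)[None, :]).sum(axis=1))
    # censored subjects (delta_i = 0) contribute nothing to the sum
    return float(np.sum(delta * (eta - log_denom)))
```

An estimate of $\beta$ could then be obtained by passing the negative of this function to a general-purpose optimizer; at $\beta = 0$ the value is $-\sum_i \delta_i \log |R_i|$, which is easy to check by hand.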

Review of Cox model (continued)
We will use Breslow’s estimator (Breslow, 1972, JRSSB [2]) to estimate the cumulative baseline hazard:
\[ \hat{\Lambda}_0(t) = \sum_{x_i \le t} \frac{\delta_i}{\sum_{j \in R_i} e^{\hat{\beta}^T Z_j(x_i)}}. \tag{6} \]
With the above definitions, Breslow’s estimator can be rewritten as:
\[ \hat{\Lambda}_0(t) = \int_0^t \frac{\sum_{i=1}^n dN_i(u)}{\sum_{i=1}^n Y_i(u)\, e^{\hat{\beta}^T Z_i}}. \tag{7} \]
Asymptotic results: Andersen and Gill, 1982, Annals of Statistics [1].
Basic idea: use the counting process martingale representation, then apply the martingale central limit theorem.
See Fleming and Harrington’s book “Counting Processes and Survival Analysis” [4] for a good reference.
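
Formula (6) translates almost literally into code. The following is a minimal sketch assuming time-fixed covariates and an already fitted coefficient vector; the name `breslow_cumhaz` is hypothetical.

```python
import numpy as np

def breslow_cumhaz(t, beta, x, delta, Z):
    """Breslow estimate of the cumulative baseline hazard at time t, equation (6)."""
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=int)
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    Z = np.asarray(Z, dtype=float).reshape(len(x), -1)
    risk_score = np.exp(Z @ beta)            # exp(beta^T Z_j)
    total = 0.0
    # one jump per observed event with x_i <= t
    for i in np.flatnonzero((delta == 1) & (x <= t)):
        total += 1.0 / risk_score[x >= x[i]].sum()   # sum over the risk set R_i
    return total
```

With $\hat{\beta} = 0$ every risk score equals 1, and the estimator reduces to the Nelson-Aalen estimator, which adds one over the number at risk at each event time.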

Rubin’s causal model (very brief)
For individual $i$, define $T_i^0$ and $T_i^1$ to be the outcome if the individual were assigned treatment 0 or 1.
Individual causal treatment effect: $\delta_i = T_i^1 - T_i^0$.
Average causal treatment effect for a group of $n$ people:
\[ \delta = \frac{1}{n} \sum_{i=1}^n \delta_i = \frac{1}{n} \sum_{i=1}^n T_i^1 - \frac{1}{n} \sum_{i=1}^n T_i^0. \tag{8} \]
This can be estimated by
\[ \hat{\delta} = \frac{1}{n} \sum_{i=1}^n \hat{T}_i^1 - \frac{1}{n} \sum_{i=1}^n \hat{T}_i^0. \tag{9} \]

Our estimator
According to Rubin’s model, we want to compare:
the restricted mean lifetime if everyone were in treatment group 1;
the restricted mean lifetime if everyone were in treatment group 0.
So the estimator is:
\[ \hat{\delta} = \int_0^L \left\{ \hat{S}_1(u) - \hat{S}_0(u) \right\} du \tag{10} \]
\[ \phantom{\hat{\delta}} = \int_0^L \left\{ \frac{1}{n} \sum_{i=1}^n \hat{S}_1(u \mid Z_i) - \frac{1}{n} \sum_{i=1}^n \hat{S}_0(u \mid Z_i) \right\} du. \tag{11} \]
We estimate the above using two different Cox models.

Two models
Consider two models ($A$ being the treatment indicator).
Model 1:
\[ \lambda(t \mid A = 0, Z) = \lambda_0(t)\, e^{\beta_0^T Z}, \tag{12} \]
\[ \lambda(t \mid A = 1, Z) = \lambda_1(t)\, e^{\beta_1^T Z}. \tag{13} \]
Model 2:
\[ \lambda(t \mid A, Z) = \lambda_0(t)\, e^{\gamma_0 A + \gamma_1^T Z} = \lambda_0(t)\, e^{\gamma^T W}, \tag{14} \]
where $\gamma = (\gamma_0, \gamma_1^T)^T$ and $W = (A, Z^T)^T$.
Bias-variance tradeoff: Model 1 is more flexible, while Model 2 has fewer parameters.

Estimate parameters in model 1
For model 1,
\[ \lambda(t \mid A = 0, Z) = \lambda_0(t)\, e^{\beta_0^T Z}, \tag{15} \]
\[ \lambda(t \mid A = 1, Z) = \lambda_1(t)\, e^{\beta_1^T Z}. \tag{16} \]
Use individuals in treatment group 0 to estimate $\hat{\beta}_0$ and $\hat{\Lambda}_0(u)$:
\[ \hat{\Lambda}_0(u) = \int_0^u \frac{\sum_{i=1}^n (1 - A_i)\, dN_i(t)}{\sum_{i=1}^n (1 - A_i)\, e^{\hat{\beta}_0^T Z_i}\, Y_i(t)}. \tag{17} \]
Use individuals in treatment group 1 to estimate $\hat{\beta}_1$ and $\hat{\Lambda}_1(u)$.
\[ \hat{S}_j(u \mid Z_i) = \exp\left\{ -\hat{\Lambda}_j(u) \exp\left( \hat{\beta}_j^T Z_i \right) \right\}, \quad j = 0, 1. \]
\[ \hat{\delta} = \int_0^L \left\{ \frac{1}{n} \sum_{i=1}^n \hat{S}_1(u \mid Z_i) - \frac{1}{n} \sum_{i=1}^n \hat{S}_0(u \mid Z_i) \right\} du. \]
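
The last two formulas for Model 1 can be sketched as follows. This is an illustrative example under stated assumptions, not the talk's implementation: the fitted cumulative hazards are assumed to be supplied on a common time grid from 0 to $L$, the integral is approximated with the trapezoidal rule, and the name `model1_delta` is hypothetical.

```python
import numpy as np

def model1_delta(grid, cumhaz0, cumhaz1, beta0, beta1, Z):
    """Estimate delta under Model 1 by averaging predicted survival curves.

    grid             : increasing time points from 0 to L
    cumhaz0, cumhaz1 : fitted Lambda_0 and Lambda_1 evaluated on grid
    beta0, beta1     : fitted coefficients for groups 0 and 1
    Z                : covariates of ALL n subjects, one row per subject
    """
    grid = np.asarray(grid, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(Z), -1)

    def mean_survival(cumhaz, beta):
        risk = np.exp(Z @ np.atleast_1d(np.asarray(beta, dtype=float)))
        # surv[i, k] = S_j(grid[k] | Z_i) = exp(-Lambda_j(grid[k]) * exp(beta_j^T Z_i))
        surv = np.exp(-np.outer(risk, np.asarray(cumhaz, dtype=float)))
        return surv.mean(axis=0)             # average over all n subjects

    diff = mean_survival(cumhaz1, beta1) - mean_survival(cumhaz0, beta0)
    # trapezoidal approximation of the integral over [0, L]
    return float(np.sum(0.5 * (diff[1:] + diff[:-1]) * np.diff(grid)))
```

Note that both curves are averaged over the covariates of the whole sample, not only over the group used to fit each model; this is what makes the comparison a standardized one.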

Estimate parameters in model 2
For model 2,
\[ \lambda(t \mid A, Z) = \lambda_0(t)\, e^{\gamma_0 A + \gamma_1^T Z} = \lambda_0(t)\, e^{\gamma^T W}, \tag{18} \]
where $\gamma = (\gamma_0, \gamma_1^T)^T$ and $W = (A, Z^T)^T$.
Use all the data from both treatment groups to get $\hat{\gamma}$ and $\hat{\Lambda}_0(u)$:
\[ \hat{\Lambda}_0(u) = \int_0^u \frac{\sum_{i=1}^n dN_i(t)}{\sum_{i=1}^n e^{\hat{\gamma}^T W_i}\, Y_i(t)}. \tag{19} \]
\[ \hat{S}_0(u \mid Z_i) = \exp\left\{ -\hat{\Lambda}_0(u) \exp\left( \hat{\gamma}_1^T Z_i \right) \right\}, \]
\[ \hat{S}_1(u \mid Z_i) = \exp\left\{ -\hat{\Lambda}_0(u) \exp\left( \hat{\gamma}_0 + \hat{\gamma}_1^T Z_i \right) \right\}. \]
\[ \hat{\delta} = \int_0^L \left\{ \frac{1}{n} \sum_{i=1}^n \hat{S}_1(u \mid Z_i) - \frac{1}{n} \sum_{i=1}^n \hat{S}_0(u \mid Z_i) \right\} du. \]
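
Under Model 2 the two counterfactual curves share one baseline hazard and differ only through $\hat{\gamma}_0$. The sketch below shows that step, assuming the fitted $\hat{\Lambda}_0$ is given on a time grid from 0 to $L$ and using a trapezoidal approximation of the integral; the name `model2_delta` is hypothetical.

```python
import numpy as np

def model2_delta(grid, cumhaz0, gamma0, gamma1, Z):
    """Estimate delta under Model 2 by setting A = 1 and then A = 0 for everyone.

    grid    : increasing time points from 0 to L
    cumhaz0 : fitted common baseline cumulative hazard on grid
    gamma0  : fitted treatment coefficient
    gamma1  : fitted covariate coefficients
    Z       : covariates of all n subjects, one row per subject
    """
    grid = np.asarray(grid, dtype=float)
    cumhaz0 = np.asarray(cumhaz0, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(Z), -1)
    lin = Z @ np.atleast_1d(np.asarray(gamma1, dtype=float))   # gamma_1^T Z_i
    # predicted survival with everyone assigned to A = 0, then to A = 1
    s0 = np.exp(-np.outer(np.exp(lin), cumhaz0)).mean(axis=0)
    s1 = np.exp(-np.outer(np.exp(gamma0 + lin), cumhaz0)).mean(axis=0)
    diff = s1 - s0
    # trapezoidal approximation of the integral over [0, L]
    return float(np.sum(0.5 * (diff[1:] + diff[:-1]) * np.diff(grid)))
```

A negative $\hat{\gamma}_0$ lowers the hazard under treatment 1 and so yields a positive $\hat{\delta}$; $\hat{\gamma}_0 = 0$ gives exactly zero.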

$\mathrm{Var}(\hat{\delta})$: Influence function
Definition
Let $X^n = (X_1, \ldots, X_n)$, with $X_i$ i.i.d. following some probability model. Suppose we are interested in estimating some parameter $\gamma$, whose true value is $\gamma_0$. An estimator $\hat{\gamma}(X^n)$ of $\gamma$ is said to be asymptotically linear if there exists $\varphi(x)$ such that
\[ \sqrt{n}\left( \hat{\gamma}(X^n) - \gamma_0 \right) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \varphi(X_i) + o_P(1), \tag{20} \]
with $E[\varphi(X)] = 0$ and $E\left[ \varphi(X) \varphi(X)^T \right]$ finite and non-singular.
The function $\varphi(x)$ is called the influence function for the estimator $\hat{\gamma}(X^n)$.
Useful in computing asymptotic variance.
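
To see why (20) is useful for variances, take the simplest case: the sample mean, whose influence function is $\varphi(x) = x - \gamma_0$. The sketch below is an illustration only (the helper name `se_from_influence` is hypothetical); it plugs the estimated influence values into $\widehat{\mathrm{Var}}(\hat{\gamma}) = n^{-2} \sum_i \hat{\varphi}(X_i)^2$ for a scalar parameter.

```python
import numpy as np

def se_from_influence(phi_hat):
    """Standard error of a scalar estimator from its estimated influence values.

    Uses Var(gamma_hat) ~ (1/n) * mean(phi_hat^2) = sum(phi_hat^2) / n^2.
    """
    phi_hat = np.asarray(phi_hat, dtype=float)
    n = len(phi_hat)
    return float(np.sqrt(np.sum(phi_hat ** 2)) / n)

# Example: the sample mean, with influence function phi(x) = x - mean
x = np.array([2.0, 4.0, 4.0, 6.0])
phi_hat = x - x.mean()
se_mean = se_from_influence(phi_hat)
```

For the sample mean this reproduces the familiar standard deviation divided by $\sqrt{n}$ (with the $1/n$ variance convention), which suggests the same recipe can be applied once an influence function for $\hat{\delta}$ is available.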

$\mathrm{Var}(\hat{\delta})$: derive IF of $\hat{\delta}$
General idea:
1 Derive influence functions for $\hat{S}_0(u)$ and $\hat{S}_1(u)$. Use Andersen and Gill’s result (1982).
2 Derive influence functions for $\int_0^L \hat{S}_0(u)\,du$ and $\int_0^L \hat{S}_1(u)\,du$.
3 Derive influence functions for $\hat{\delta} = \int_0^L \hat{S}_1(u)\,du - \int_0^L \hat{S}_0(u)\,du$.
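
A heuristic sketch of how steps 2 and 3 follow from step 1, assuming the expansion of $\hat{S}_j(u)$ holds uniformly in $u \in [0, L]$ so that integration and the remainder term can be interchanged. The symbols $\varphi_{S_j}$ and $\varphi_\delta$ are labels introduced here for illustration and are not notation from the talk.

```latex
% Step 1 (assumed given): for each u in [0, L],
%   sqrt(n) { \hat S_j(u) - S_j(u) } = n^{-1/2} \sum_i \varphi_{S_j}(u; X_i) + o_P(1).
\begin{align*}
\text{Step 2:}\quad
  \sqrt{n}\left\{ \int_0^L \hat{S}_j(u)\,du - \int_0^L S_j(u)\,du \right\}
  &= \frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^L \varphi_{S_j}(u; X_i)\,du + o_P(1), \\
\text{Step 3:}\quad
  \varphi_\delta(X_i)
  &= \int_0^L \varphi_{S_1}(u; X_i)\,du - \int_0^L \varphi_{S_0}(u; X_i)\,du, \\
\text{Variance:}\quad
  \widehat{\mathrm{Var}}(\hat{\delta})
  &= \frac{1}{n^2} \sum_{i=1}^n \hat{\varphi}_\delta(X_i)^2 .
\end{align*}
```

Working with a single influence function for the difference avoids having to treat the two integrals as independent, which they need not be since both average over the same covariates $Z_1, \ldots, Z_n$.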

Simulation 1
Under the strong null hypothesis: $H_0^*: S_1(u \mid Z) = S_0(u \mid Z)$ for all $Z$.
$Z \sim N(0, 1)$.
$P(A = 1 \mid Z) = e^Z / \left( 1 + e^Z \right)$.
$T^0, T^1 \sim \mathrm{Exponential}\left( e^{1 + 4Z} \right)$.
Independent censoring: $C \sim \mathrm{Exponential}(0.1)$.
$L = 12.04$.

Strong null hypothesis, $\delta = 0$ (table extracted from Chen and Tsiatis, 2001 [3]):

                                  $\hat{\delta}_1$   $\hat{\delta}_2$   $\hat{\delta}_{KM}$
Bias                                   .0289              .0019             -3.0417
$se(\hat{\delta})$                     .2297              .1124               .5114
$\widehat{se}(\hat{\delta})$           .2302              .1125               .5136
Coverage Prob.                         .9470              .9520               .0000