Structural Nested Mean Models for Assessing Time-Varying Causal Effect Moderation



  1. Structural Nested Mean Models for Assessing Time-Varying Causal Effect Moderation
     Daniel Almirall (1), Thomas R. Ten Have (2), Susan A. Murphy (3)
     (1) Health Services Research in Primary Care, Durham VA MC; Biostatistics & Bioinformatics Department, Duke Univ MC
     (2) Clinical Epi & Biostatistics, Univ of Pennsylvania Medicine
     (3) Statistics Department & ISR, Univ of Michigan
     May 21, 2009. 2009 Atlantic Causal Modeling Conference, Philadelphia, Pennsylvania

  2. Contents
     1 Warm-up: Suppose we want A → Y (slide 4)
     2 Effect Moderation in One Time Point (slide 7)
     3 Mean Model in One Time Point (slide 9)
     4 Time-Varying Effect Moderation (slide 10)
     5 Robins’ Structural Nested Mean Model (slide 13)
     6 Estimation in Time-Varying Setting (slide 15)

  3. Contents (continued)
     7 Sequential Ignorability Given $\bar{S}_K$ (slide 21)
     8 Application of the SNMM (slide 22)
     9 Future Work (slide 25)
     10 Extra Slides (slide 34)

  4. 1 Warm-up: Suppose we want A → Y.
     [Diagram: nodes A, Y, and S, with a "?" on the A → Y effect.]
     Examples:
     S = pre-A covariate    A = treatment/exposure      Y = outcome
     Suicidal?              Medication?                 Depression
     Gender, SES            SAT Coaching?               SAT Math Score
     Social Support         Inpatient vs. Outpatient    Substance Abuse
     Why condition on (“adjust for”) pre-exposure covariables S?

  5. Suppose we want the effect of A on Y. Why condition on (adjust for) pre-treatment (or pre-exposure) variables S?
     1. Confounding: S is correlated with both A and Y. In this case, S is known as a “confounder” of the effect of A on Y.
     2. Precision: S may be a pre-treatment measure of Y, or any other variable highly correlated with Y.
     3. Missing Data: The outcome Y is missing for some units, S and A predict missingness, and S is associated with Y.
     4. Effect Heterogeneity: S may moderate, temper, or specify the effect of A on Y. In this case, S is known as a “moderator” of the effect of A on Y.

  6. Suppose we want the effect of A on Y. Why condition on (adjust for) pre-treatment (or pre-exposure) variables S?
     [Diagram: nodes A, S, and Y.]
     4. Effect Heterogeneity: S may moderate, temper, or specify the effect of A on Y. In this case, S is known as a “moderator” of the effect of A on Y. This is formalized on the next slide.

  7. 2 Effect Moderation in One Time Point
     $\mu(s, a) \equiv E(Y(a) - Y(0) \mid S = s)$
     Example: $\mu(s) = E(Y(\text{inpat}) - Y(\text{outpat}) \mid S = s)$
     [Plot; labels: Y(a) = Substance Use (low is better); a = 1 is residential, a = 0 is outpatient; $\mu = 0$ means no effect; S = Social Support (high is better).]
     Outpatient substance abuse treatment is better than residential treatment for individuals with higher levels of social support.

  8. Causal Effect Moderation in Context: Relevance?
     Theoretical Implication: Understanding the heterogeneity of the effects of treatments or exposures enhances our understanding of various (competing) scientific theories, and it may suggest new scientific hypotheses to be tested.
     Elaboration of Yu Xie’s Social Grouping Principle: We really want $Y_i(a) - Y_i(0)$ for all $i$. We settle for “groupings” of effects (here, groupings by S); $\mu(s, a)$ “comes closer” than $E(Y(a) - Y(0))$.
     Practical Implication: Identifying types, or subgroups, of individuals for which treatment or exposure is not effective may suggest altering the treatment to suit the needs of those types of individuals.

  9. 3 Mean Model in One Time Point
     Decomposition of the conditional mean $E(Y(a) \mid S)$ and the prototypical linear model:
     $E(Y(a) \mid S = s) = E(Y(0) \mid S = 0) + [E(Y(0) \mid S = s) - E(Y(0) \mid S = 0)] + E(Y(a) - Y(0) \mid S = s)$
     $= \eta_0 + \phi(s) + \mu(s, a)$
     e.g. $= \eta_0 + \eta_1 s + \beta_1 a + \beta_2 a s$.
     This is precisely what I would do, too.
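The prototypical linear model on this slide can be sketched with ordinary least squares on simulated data. Everything below is an illustrative assumption rather than material from the talk: the function names, the parameter values, the randomized binary treatment, and the use of NumPy. The signs are chosen so that, as in the substance abuse example, the effect of residential treatment becomes unfavorable at higher values of S.

```python
import numpy as np

def simulate_one_time_point(n, eta0=1.0, eta1=0.5, beta1=-0.4, beta2=0.6, seed=0):
    """Simulate (S, A, Y) from the linear model; all values are hypothetical."""
    rng = np.random.default_rng(seed)
    s = rng.normal(size=n)                         # pre-treatment moderator S
    a = rng.integers(0, 2, size=n).astype(float)   # randomized binary treatment A
    y = eta0 + eta1 * s + beta1 * a + beta2 * a * s + rng.normal(size=n)
    return s, a, y

def fit_linear_moderation(s, a, y):
    """OLS fit of E(Y | S, A) = eta0 + eta1*s + beta1*a + beta2*a*s."""
    X = np.column_stack([np.ones_like(s), s, a, a * s])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # order: eta0, eta1, beta1, beta2

def mu(s, a, beta1, beta2):
    """Conditional causal effect mu(s, a) = beta1*a + beta2*a*s."""
    return a * (beta1 + beta2 * s)

s, a, y = simulate_one_time_point(n=20_000)
eta0_hat, eta1_hat, beta1_hat, beta2_hat = fit_linear_moderation(s, a, y)
print("beta1_hat, beta2_hat:", round(beta1_hat, 3), round(beta2_hat, 3))
print("estimated mu(s=-1, a=1):", round(mu(-1.0, 1.0, beta1_hat, beta2_hat), 3))
print("estimated mu(s=+1, a=1):", round(mu(1.0, 1.0, beta1_hat, beta2_hat), 3))
```

Because A is randomized in this sketch, the OLS coefficients on `a` and `a * s` estimate the causal parameters directly; with observational data that would additionally require S to capture all confounding.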

  10. 4 Time-Varying Effect Moderation
      The data structure in the time-varying setting is: $S_1$, $a_1$, $S_2(a_1)$, $a_2$, $Y(a_1, a_2)$.
      Example: PROSPECT (Prevention of Suicide in Primary Care Elderly: CT)
      - $(a_1, a_2)$: time-varying treatment pattern; $a_t$ is binary (0, 1)
      - $Y(a_1, a_2)$: depression at the end of the study; continuous
      - $S_1$: suicidal ideation at baseline visit; continuous
      - $S_2(a_1)$: suicidal ideation at second visit; continuous
      Ex: What is the effect of switching off treatment for depression early versus later, as a function of time-varying suicidal ideation?

  11. Formal Definition of Time-Varying Causal Effects
      Conditional Intermediate Causal Effect at t = 2:
      $\mu_2(\bar{s}_2, \bar{a}_2) = E[Y(a_1, a_2) - Y(a_1, 0) \mid S_1 = s_1, S_2(a_1) = s_2]$
      [Diagram: $S_1$, $a_1$, $S_2(a_1)$, $a_2$, $Y(a_1, a_2)$.]
      Conditional Intermediate Causal Effect at t = 1:
      $\mu_1(s_1, a_1) = E[Y(a_1, 0) - Y(0, 0) \mid S_1 = s_1]$
      [Diagram: $S_1$, $a_1$, "Set $a_2 = 0$", $Y(a_1, 0)$.]

  12. Formal Definition of Time-Varying Causal Effects
      Conditional Intermediate Causal Effect at t = 2:
      $\mu_2(\bar{s}_2, \bar{a}_2) = E[Y(a_1, a_2) - Y(a_1, 0) \mid S_1 = s_1, S_2(a_1) = s_2]$
      [Diagram: $S_1$, $a_1$, $S_2(a_1)$, $a_2$, $Y(a_1, a_2)$.]
      Conditional Intermediate Causal Effect at t = 1:
      $\mu_1(s_1, a_1) = E[Y(a_1, 0) - Y(0, 0) \mid S_1 = s_1]$
      [Diagram: $S_1$, $a_1$, $S_2(a_1)$, "Set $a_2 = 0$", $Y(a_1, 0)$.]

  13. 5 Robins’ Structural Nested Mean Model
      The SNMM for the conditional mean of $Y(a_1, a_2)$ given $\bar{S}_2(a_1)$ is:
      $E[Y(a_1, a_2) \mid S_1, S_2(a_1)]$
      $= E[Y(0,0)] + \{E[Y(0,0) \mid S_1] - E[Y(0,0)]\} + E[Y(a_1,0) - Y(0,0) \mid S_1] + \{E[Y(a_1,0) \mid \bar{S}_2(a_1)] - E[Y(a_1,0) \mid S_1]\} + E[Y(a_1,a_2) - Y(a_1,0) \mid \bar{S}_2(a_1)]$
      $= \mu_0 + \epsilon_1(s_1) + \mu_1(s_1, a_1) + \epsilon_2(\bar{s}_2, a_1) + \mu_2(\bar{s}_2, \bar{a}_2)$
      e.g. $= \mu_0 + \epsilon_1(s_1) + \beta_{10} a_1 + \beta_{11} a_1 s_1 + \epsilon_2(\bar{s}_2, a_1) + \beta_{20} a_2 + \beta_{21} a_2 s_1 + \beta_{22} a_2 s_2$

  14. Constraints on the Causal and Nuisance Portions
      $E[Y(a_1, a_2) \mid \bar{S}_2(a_1) = \bar{s}_2] = \mu_0 + \epsilon_1(s_1) + \mu_1(s_1, a_1) + \epsilon_2(\bar{s}_2, a_1) + \mu_2(\bar{s}_2, \bar{a}_2)$, where
      - $\mu_2(\bar{s}_2, a_1, 0) = 0$ and $\mu_1(s_1, 0) = 0$,
      - $\epsilon_2(\bar{s}_2, a_1) = E[Y(a_1, 0) \mid \bar{S}_2(a_1) = \bar{s}_2] - E[Y(a_1, 0) \mid S_1 = s_1]$,
      - $\epsilon_1(s_1) = E[Y(0, 0) \mid S_1 = s_1] - E[Y(0, 0)]$,
      - $E_{S_2 \mid S_1}[\epsilon_2(\bar{s}_2, a_1) \mid S_1 = s_1] = 0$ and $E_{S_1}[\epsilon_1(s_1)] = 0$.
      The $\epsilon_t$’s make the SNMM a non-standard regression model.
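As a short worked check, not taken from the slides, the five pieces of the SNMM telescope back to the conditional mean, and the mean-zero constraint on $\epsilon_2$ follows from iterated expectations. The derivation assumes the bar notation means $\bar{S}_2(a_1) = (S_1, S_2(a_1))$, so that conditioning on $\bar{S}_2(a_1)$ refines conditioning on $S_1$.

```latex
\begin{align*}
&E[Y(0,0)]
 + \underbrace{\big(E[Y(0,0)\mid S_1] - E[Y(0,0)]\big)}_{\epsilon_1(s_1)}
 + \underbrace{\big(E[Y(a_1,0)\mid S_1] - E[Y(0,0)\mid S_1]\big)}_{\mu_1(s_1,a_1)} \\
&\quad + \underbrace{\big(E[Y(a_1,0)\mid \bar{S}_2(a_1)] - E[Y(a_1,0)\mid S_1]\big)}_{\epsilon_2(\bar{s}_2,a_1)}
 + \underbrace{\big(E[Y(a_1,a_2)\mid \bar{S}_2(a_1)] - E[Y(a_1,0)\mid \bar{S}_2(a_1)]\big)}_{\mu_2(\bar{s}_2,\bar{a}_2)} \\
&= E[Y(a_1,a_2)\mid \bar{S}_2(a_1)], \\[6pt]
&E\big[\epsilon_2(\bar{S}_2(a_1),a_1)\mid S_1 = s_1\big]
 = E\big[E[Y(a_1,0)\mid \bar{S}_2(a_1)]\mid S_1 = s_1\big] - E[Y(a_1,0)\mid S_1 = s_1] = 0 .
\end{align*}
```

The same argument with the empty conditioning set gives $E_{S_1}[\epsilon_1(S_1)] = 0$, and setting $a_2 = 0$ (or $a_1 = 0$) in the definitions makes $\mu_2$ (or $\mu_1$) a difference of identical terms, hence zero.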

  15. 6 Estimation in Time-Varying Setting
      Recall that parametric models for our causal estimands $\mu_1$ and $\mu_2$ are based on the set of parameters $\beta = (\beta_1', \beta_2')'$. We considered two estimators for $\beta$:
      1. Proposed 2-Stage Regression Estimator
      2. Robins’ Semi-parametric G-Estimator
      In order to make causal inferences, both estimators rely on Robins’ Sequential Ignorability (or Sequential Randomization) Assumption. We discuss the two estimators in turn, but first . . .

  16. So what’s wrong with the Traditional Estimator?
      An Example of the Traditional Estimator: Apply OLS with
      $E(Y \mid \bar{S}_2 = \bar{s}_2, \bar{A}_2 = \bar{a}_2) = \beta^*_0 + \eta_1 s_1 + \beta^*_1 a_1 + \beta^*_2 a_1 s_1 + \eta_2 s_2 + \beta^*_3 a_2 + \beta^*_4 a_2 s_1 + \beta^*_5 a_2 s_2$
      - Possibly incorrectly specified nuisance functions.
      - Two problems arise with the interpretation of $\beta^*_1$ and $\beta^*_2$ (i.e., parameters thought to represent $\mu_1$) when using the traditional regression estimator. We describe them next.
      - These problems may occur even in the absence of time-varying confounders (that is, even under Sequential Ignorability) . . .

  17. First problem with the Traditional Approach: Wrong Effect
      [Diagram: timeline Baseline, 4-month Visit, 8-month Visit; nodes $S_1$, $a_1$, $S_2(a_1)$, "Set $a_2 = 0$", $Y(a_1, 0)$.]
      But what about the effect transmitted through $S_2(a_1)$? The term $\beta^*_1 a_1 + \beta^*_2 a_1 s_1$ does not capture the “total” impact of $(a_1, 0)$ vs $(0, 0)$ on $Y(a_1, a_2)$ given values of $S_1$.

  18. Second problem with the Traditional Approach: Spurious Effect
      [Diagram: timeline Baseline, 4-month Visit, 8-month Visit; nodes $V_0$, $S_1$, $a_1$, $S_2(a_1)$, "Set $a_2 = 0$", $Y(a_1, 0)$.]
      This is also known as “Berkson’s paradox” and is related to Judea Pearl’s back-door criterion.
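A toy simulation can illustrate the spurious-effect problem. It is a sketch under assumptions that are not in the slides: `v0` plays the role of an unobserved common cause of the intermediate variable and the outcome, $S_1$ is omitted for brevity, the first treatment is randomized, and the first treatment has no effect at all on the outcome. All names and values are hypothetical.

```python
import numpy as np

def simulate_collider(n, gamma=1.0, seed=0):
    """Toy data in which a1 has NO effect on y, while s2 is a common
    effect of a1 and an unobserved variable v0 that also drives y."""
    rng = np.random.default_rng(seed)
    a1 = rng.integers(0, 2, size=n).astype(float)  # randomized first treatment
    v0 = rng.normal(size=n)                        # unobserved cause of s2 and y
    s2 = gamma * a1 + v0 + rng.normal(size=n)      # intermediate variable
    y = v0 + rng.normal(size=n)                    # outcome depends on v0 only
    return a1, s2, y

def ols(X, y):
    """Least-squares coefficients for the design matrix X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

a1, s2, y = simulate_collider(n=50_000)
ones = np.ones_like(a1)
unadjusted = ols(np.column_stack([ones, a1]), y)      # does not condition on s2
adjusted = ols(np.column_stack([ones, a1, s2]), y)    # conditions on s2
print("a1 coefficient, not conditioning on s2:", round(unadjusted[1], 3))
print("a1 coefficient, conditioning on s2:    ", round(adjusted[1], 3))
```

In this toy model the coefficient on `a1` is close to zero when `s2` is left out, but conditioning on `s2` induces an association between `a1` and `v0`; with unit variances the population coefficient works out to `-gamma / 2`, even though the true effect of `a1` is zero.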

  19. Proposed 2-Stage Regression Estimator
      The proposed 2-Stage Estimator resembles the Traditional Estimator. Instead of using the Traditional Estimator
      $E(Y \mid \bar{S}_2 = \bar{s}_2, \bar{A}_2 = \bar{a}_2) = \beta^*_0 + \eta_1 s_1 + \beta^*_1 a_1 + \beta^*_2 a_1 s_1 + \eta_2 s_2 + \beta^*_3 a_2 + \beta^*_4 a_2 s_1 + \beta^*_5 a_2 s_2$,
      we use the following:
      $E(Y \mid \bar{S}_2 = \bar{s}_2, \bar{A}_2 = \bar{a}_2) = \beta^*_0 + \eta_1 s_1 + \beta^*_1 a_1 + \beta^*_2 a_1 s_1 + \eta_2 [s_2 - E(S_2 \mid A_1, S_1)] + \beta^*_3 a_2 + \beta^*_4 a_2 s_1 + \beta^*_5 a_2 s_2$.
      We call it “2-Stage” because first we estimate $E(S_2 \mid A_1, S_1)$, then use the residual $s_2 - \hat{E}(S_2 \mid A_1, S_1)$ in a second regression to get the $\beta$’s. Use sandwich/robust SEs for inference (p-values, CIs, etc.).
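The two-stage idea can be sketched as follows, next to the traditional regression for comparison. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are simulated with hypothetical parameter values, both treatments are randomized (so Sequential Ignorability holds by design), the stage-1 model for $E(S_2 \mid A_1, S_1)$ is taken to be linear in $S_1$ and $A_1$, and standard errors are omitted.

```python
import numpy as np

# Hypothetical true values for the simulation (not from the talk).
TRUE = dict(mu0=2.0, eta1=0.5, beta10=-0.5, beta11=0.3, eta2=0.8,
            beta20=-0.4, beta21=0.2, beta22=0.1, delta1=0.6, delta2=-1.0)

def simulate(n, seed=0):
    """Simulate (S1, A1, S2, A2, Y) from a linear SNMM with randomized A1, A2."""
    p = TRUE
    rng = np.random.default_rng(seed)
    s1 = rng.normal(size=n)
    a1 = rng.integers(0, 2, size=n).astype(float)
    mean_s2 = p["delta1"] * s1 + p["delta2"] * a1      # E(S2 | A1, S1)
    s2 = mean_s2 + rng.normal(size=n)
    a2 = rng.integers(0, 2, size=n).astype(float)
    y = (p["mu0"] + p["eta1"] * s1                      # mu0 + epsilon_1
         + p["beta10"] * a1 + p["beta11"] * a1 * s1     # mu_1
         + p["eta2"] * (s2 - mean_s2)                   # epsilon_2
         + p["beta20"] * a2 + p["beta21"] * a2 * s1 + p["beta22"] * a2 * s2  # mu_2
         + rng.normal(size=n))
    return s1, a1, s2, a2, y

def ols(X, y):
    """Least-squares coefficients for the design matrix X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def design(s1, a1, s2_term, a2, s2):
    """Columns: 1, s1, a1, a1*s1, s2_term, a2, a2*s1, a2*s2."""
    return np.column_stack([np.ones_like(s1), s1, a1, a1 * s1,
                            s2_term, a2, a2 * s1, a2 * s2])

def traditional_estimator(s1, a1, s2, a2, y):
    """Single OLS fit with s2 entered directly as a covariate."""
    return ols(design(s1, a1, s2, a2, s2), y)

def two_stage_estimator(s1, a1, s2, a2, y):
    """Stage 1: regress s2 on (1, s1, a1). Stage 2: OLS with the stage-1 residual."""
    Z = np.column_stack([np.ones_like(s1), s1, a1])
    resid = s2 - Z @ ols(Z, s2)
    return ols(design(s1, a1, resid, a2, s2), y)

s1, a1, s2, a2, y = simulate(n=50_000)
trad = traditional_estimator(s1, a1, s2, a2, y)
two_stage = two_stage_estimator(s1, a1, s2, a2, y)
print("true beta10:", TRUE["beta10"])
print("traditional a1 coefficient:", round(trad[2], 3))
print("two-stage   a1 coefficient:", round(two_stage[2], 3))
```

In this simulation the two fits span the same column space, so the coefficients on the `a2` terms coincide; only the intercept and the `s1` and `a1` coefficients move. The traditional `a1` coefficient targets `beta10 - eta2 * delta2` rather than `beta10`, which is the "wrong effect" problem in miniature.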
