A unification of mediation and interaction: a four-way decomposition

SLIDE 1

A unification of mediation and interaction: a four-way decomposition

Tyler J. VanderWeele Departments of Epidemiology and Biostatistics Harvard School of Public Health

SLIDE 2

Plan of Presentation

(1) Questions of Mediation and Interaction
(2) A Unification of Mediation and Interaction
(3) Regression Approaches and Ratio Scales
(4) Application to Genetic Epidemiology
(5) Relation to Prior Decompositions
(6) Concluding Remarks

SLIDE 3

Mediation

In some research contexts we might be interested in the extent to which the effect of some exposure A on some outcome Y is mediated by an intermediate variable M, and to what extent it is direct.

Stated another way, we are interested in the direct and indirect effects of the exposure.

In other research contexts we may be interested in whether A and M interact in their effects, and how much of their effects is due to interaction.

[Diagram: exposure A, mediator M, outcome Y]

SLIDE 4

Mediation

In some cases, we may be interested in both mediation and interaction.

In 2008, GWAS found variants on 15q25.1 associated with lung cancer (Thorgeirsson et al., 2008; Hung et al., 2008; Amos et al., 2008).

These same variants were known to be associated with smoking (average cigarettes per day) (Saccone et al., 2007; Spitz et al., 2008).

The variants also increased vulnerability to the harmful effect of smoking, a gene-environment interaction: e.g. carriers of the variant allele extract more nicotine and toxins from each cigarette (Le Marchand, 2008).

The causal inference literature has developed methods that can assess mediation in the presence of interaction to get direct and indirect effects.

In this example from genetic epidemiology, most of the effect seemed “direct” (94%) with respect to cigarettes per day (VanderWeele et al., 2012).

But this does not clarify the role of interaction itself.

SLIDE 5

Notation

Let Y denote some outcome of interest for each individual.
Let A denote some exposure or treatment of interest for each individual.
Let M denote some post-treatment intermediate(s) for each individual (potentially on the pathway between A and Y).
Let C denote a set of covariates for each individual.
Let $Y_a$ be the counterfactual outcome (or potential outcome) Y for each individual when intervening to set A to a.
Let $M_a$ be the counterfactual outcome M for each individual when intervening to set A to a.
Let $Y_{am}$ be the counterfactual outcome Y for each individual when intervening to set A to a and M to m.

SLIDE 6

A Unification of Mediation and Interaction

We can in fact decompose a total effect, $TE = Y_1 - Y_0$, into four components (VanderWeele, 2014) under the “composition” assumption that $Y_a = Y_{aM_a}$:

(1) A controlled direct effect (CDE): the effect of A in the absence of M
(2) A reference interaction (INTref): the interaction that operates only if the mediator is present in the absence of exposure
(3) A mediated interaction (INTmed): the interaction that operates only if the exposure changes the mediator
(4) A pure indirect effect (PIE): the effect of the mediator in the absence of the exposure times the effect of the exposure on the mediator
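As a rough check of the four components just listed, the sketch below writes them out for a single individual with binary A and M and confirms that they sum to the total effect under composition. The algebraic forms follow the verbal definitions on this slide as given in VanderWeele (2014); the function names `four_way` and `total_effect` are illustrative only and do not come from the talk.

```python
from itertools import product


def four_way(y00, y01, y10, y11, m0, m1):
    """Individual-level components for binary A and M.

    y_am stands for the counterfactual outcome Y_{am};
    m_a stands for the counterfactual mediator M_a.
    """
    interaction = y11 - y10 - y01 + y00   # additive interaction contrast
    cde = y10 - y00                       # controlled direct effect at m = 0
    int_ref = interaction * m0            # reference interaction
    int_med = interaction * (m1 - m0)     # mediated interaction
    pie = (y01 - y00) * (m1 - m0)         # pure indirect effect
    return cde, int_ref, int_med, pie


def total_effect(y00, y01, y10, y11, m0, m1):
    """TE = Y_1 - Y_0 under composition, i.e. Y_a = Y_{a M_a}."""
    y = {(0, 0): y00, (0, 1): y01, (1, 0): y10, (1, 1): y11}
    return y[(1, m1)] - y[(0, m0)]


# Enumerate every binary pattern of (Y00, Y01, Y10, Y11, M0, M1): 64 cases.
all_match = all(
    sum(four_way(*v)) == total_effect(*v) for v in product((0, 1), repeat=6)
)
print(all_match)
```

Enumerating all response patterns is only feasible because everything is binary here; it is meant as a sanity check of the identity, not as an estimation method.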

SLIDE 7

A Unification of Mediation and Interaction

We can summarize the four components as:

(1) CDE: Neither mediation nor interaction
(2) INTref: Interaction but not mediation
(3) INTmed: Both mediation and interaction
(4) PIE: Mediation but not interaction

SLIDE 8

A Unification of Mediation and Interaction

We cannot identify these effects for an individual but, under certain confounding assumptions (next slides), we can identify them on average for a population. If we let $p_{am} = P(Y=1 \mid A=a, M=m)$, the four components can be written in terms of these probabilities, and we could calculate the proportion due to each of the components.
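The slide's own formulas did not survive in this transcript. As a minimal sketch, assuming binary A and M, no covariates, and the identification assumptions on the next slides, the population components can be computed from $p_{am}$ and $P(M=1 \mid A=a)$ using the expressions in VanderWeele (2014). The function name and the input probabilities below are made up for illustration and are not data from the talk.

```python
def four_way_population(p, pm0, pm1):
    """Population-average four-way components (binary A and M, no covariates).

    p[(a, m)] = P(Y=1 | A=a, M=m)
    pm0, pm1  = P(M=1 | A=0), P(M=1 | A=1)
    """
    interaction = p[(1, 1)] - p[(1, 0)] - p[(0, 1)] + p[(0, 0)]
    components = {
        "CDE": p[(1, 0)] - p[(0, 0)],
        "INTref": interaction * pm0,
        "INTmed": interaction * (pm1 - pm0),
        "PIE": (p[(0, 1)] - p[(0, 0)]) * (pm1 - pm0),
    }
    total = sum(components.values())
    proportions = {name: value / total for name, value in components.items()}
    return components, total, proportions


# Hypothetical probabilities chosen only to exercise the function.
p = {(0, 0): 0.05, (0, 1): 0.10, (1, 0): 0.08, (1, 1): 0.25}
components, total, proportions = four_way_population(p, pm0=0.3, pm1=0.5)
for name in components:
    print(name, round(components[name], 4), round(proportions[name], 3))
```

With these inputs the components sum to the risk difference P(Y=1 | A=1) - P(Y=1 | A=0), which is what the decomposition requires in the no-covariate case.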

SLIDE 9

A Unification of Mediation and Interaction

The four components are E[CDE], E[INTref], E[INTmed] and E[PIE].

We could add E[INTref] and E[INTmed] for the overall proportion due to interaction.

We could add E[PIE] and E[INTmed] for the overall proportion due to mediation.
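Written out, the two sums described above become the following proportions. The label PM (portion mediated) is used later in the talk; the label PAI for the proportion attributable to interaction is an assumption made here for illustration.

```latex
\begin{align*}
\text{PAI} &= \frac{E[\text{INT}_{\text{ref}}] + E[\text{INT}_{\text{med}}]}{E[\text{TE}]}, \\
\text{PM}  &= \frac{E[\text{PIE}] + E[\text{INT}_{\text{med}}]}{E[\text{TE}]}.
\end{align*}
```

Because the mediated interaction is counted in both, PAI and PM overlap and need not sum with the CDE proportion to one.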

SLIDE 10

Identification

The confounding assumptions are the same as those generally used in the causal inference literature to identify direct and indirect effects:

(1) There are no unmeasured exposure-outcome confounders given C
(2) There are no unmeasured mediator-outcome confounders given (C,A)
(3) There are no unmeasured exposure-mediator confounders given C
(4) None of the mediator-outcome confounders are affected by exposure

For controlled direct effects, only assumptions (1) and (2) are needed.

Note (1) and (3) are guaranteed when treatment is randomized.

[Diagram: A, M, Y with confounders C1, C2, C3]

SLIDE 11

Identification

More formally, in counterfactual notation, these assumptions are:

(1) $Y_{am} \perp\!\!\!\perp A \mid C$
(2) $Y_{am} \perp\!\!\!\perp M \mid C, A$
(3) $M_a \perp\!\!\!\perp A \mid C$
(4) $Y_{am} \perp\!\!\!\perp M_{a^*} \mid C$

For controlled direct effects, only assumptions (1) and (2) are needed.

Note (1) and (3) are guaranteed when treatment is randomized.

[Diagram: A, M, Y with confounders C1, C2, C3]

SLIDE 12

Regression Approach

Under the confounding assumptions we can estimate each of the four components in a straightforward way using regression models for Y and M. Under these models, if our confounding assumptions hold, the effects for a change in the exposure from reference level a* to level a are given by closed-form expressions in the regression coefficients. Similar results hold if one or both of A or M are binary.
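The models and formulas on this slide were images and are not in the transcript. The sketch below assumes the linear models with an exposure-mediator product term from VanderWeele (2014), namely E[Y | a, m, c] = θ0 + θ1·a + θ2·m + θ3·a·m + θ4'c and E[M | a, c] = β0 + β1·a + β2'c, with continuous Y and M. The function names, the reference mediator level `m_star`, and the coefficient values are hypothetical.

```python
def four_way_linear(theta1, theta2, theta3, beta0, beta1, beta2c,
                    a, a_star, m_star):
    """Four-way components under the assumed linear outcome and mediator models.

    beta2c is the covariate contribution beta2'c at the chosen covariate value.
    m_star is the mediator level at which the CDE is evaluated.
    """
    d = a - a_star
    return {
        "CDE": (theta1 + theta3 * m_star) * d,
        "INTref": theta3 * (beta0 + beta1 * a_star + beta2c - m_star) * d,
        "INTmed": theta3 * beta1 * d * d,
        "PIE": (theta2 * beta1 + theta3 * beta1 * a_star) * d,
    }


def total_effect_linear(theta1, theta2, theta3, beta0, beta1, beta2c,
                        a, a_star):
    """E[Y_{a M_a} - Y_{a* M_a*} | c] computed directly from the two models."""
    def mean_y(x):
        mean_m = beta0 + beta1 * x + beta2c
        return theta1 * x + theta2 * mean_m + theta3 * x * mean_m
    return mean_y(a) - mean_y(a_star)


# Hypothetical coefficients, chosen only to show that the pieces add up.
coef = dict(theta1=0.5, theta2=0.3, theta3=0.2,
            beta0=1.0, beta1=0.4, beta2c=0.0)
parts = four_way_linear(**coef, a=1, a_star=0, m_star=0)
te = total_effect_linear(**coef, a=1, a_star=0)
print(parts, te)
```

In practice the coefficients would come from fitted regressions, and standard errors would need the delta method or a bootstrap; neither is shown here.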

SLIDE 13

Relation to Mediation Decompositions

Our basic four-way decomposition was: TE = CDE + INTref + INTmed + PIE

If we combine the CDE and INTref we obtain what is sometimes called the “natural/pure direct effect”. If we combine the PIE and INTmed we obtain what is sometimes called the “natural/total indirect effect” (Robins and Greenland, 1992; Pearl, 2001):

PDE = Pure direct effect (natural direct effect) = CDE + INTref
TIE = Total indirect effect (natural indirect effect) = PIE + INTmed

These are also sometimes called natural direct and indirect effects. This is the decomposition of Robins and Greenland (1992) and Pearl (2001). This is essentially the decomposition used in epidemiology and the social sciences when interaction is absent.

SLIDE 14

Relation to Prior Decompositions

VanderWeele and Tchetgen Tchetgen (2014) also showed the total effect could be divided into the CDE, the PIE and a proportion attributable to interaction; the four-way decomposition unites all of these.

We can summarize in a figure:

[Figure: relation of the four-way decomposition to prior decompositions]

SLIDE 15

Ratio Scale

A similar four-way decomposition also holds using a ratio scale, where $RR_{am} = p_{am}/p_{00}$ and where $\kappa = p_{00}/p_{a=0}$ is a scaling factor.

If we divide each component by the sum, then κ drops out.

We can estimate the components using logistic regression (with SAS code).

We can also proceed with case-control data under a rare outcome assumption.
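The ratio-scale formula itself is missing from the transcript. As a sketch, assuming binary A and M, no covariates, and reading $p_{a=0}$ as P(Y=1 | A=0), the excess relative risk RR - 1 splits into the same four pieces scaled by κ, following VanderWeele (2014). The function name and the probabilities are hypothetical.

```python
def four_way_ratio(p, pm0, pm1):
    """Ratio-scale four-way decomposition of the excess relative risk RR - 1.

    p[(a, m)] = P(Y=1 | A=a, M=m); pm0, pm1 = P(M=1 | A=0), P(M=1 | A=1).
    """
    rr = {am: p[am] / p[(0, 0)] for am in p}           # RR_am = p_am / p_00
    p_a0 = p[(0, 0)] * (1 - pm0) + p[(0, 1)] * pm0     # P(Y=1 | A=0)
    p_a1 = p[(1, 0)] * (1 - pm1) + p[(1, 1)] * pm1     # P(Y=1 | A=1)
    kappa = p[(0, 0)] / p_a0                           # scaling factor
    interaction = rr[(1, 1)] - rr[(1, 0)] - rr[(0, 1)] + 1
    components = {
        "CDE": kappa * (rr[(1, 0)] - 1),
        "INTref": kappa * interaction * pm0,
        "INTmed": kappa * interaction * (pm1 - pm0),
        "PIE": kappa * (rr[(0, 1)] - 1) * (pm1 - pm0),
    }
    excess_rr = p_a1 / p_a0 - 1
    total = sum(components.values())
    proportions = {name: value / total for name, value in components.items()}
    return components, excess_rr, proportions


# Hypothetical probabilities, not data from the talk.
p = {(0, 0): 0.05, (0, 1): 0.10, (1, 0): 0.08, (1, 1): 0.25}
ratio_components, excess_rr, ratio_proportions = four_way_ratio(p, 0.3, 0.5)
print(ratio_proportions)
```

Since κ multiplies every component, it cancels when each is divided by their sum, so the proportions do not depend on it.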

SLIDE 16

Genetic Epidemiology

In 2008, GWAS found variants on 15q25.1 associated with lung cancer (Thorgeirsson et al., 2008; Hung et al., 2008; Amos et al., 2008).

These same variants were known to be associated with smoking (average cigarettes per day) (Saccone et al., 2007; Spitz et al., 2008).

The variants also increased vulnerability to the harmful effect of smoking, a gene-environment interaction: e.g. carriers of the variant allele extract more nicotine and toxins from each cigarette (Le Marchand, 2008).

When methods for direct and indirect effects were employed, most of the effect seemed “direct” with respect to cigarettes per day (VanderWeele et al., 2012).

But this did not fully capture the role of interaction; there was evidence for such interaction (Li et al., 2010; Truong et al., 2010; VanderWeele et al., 2012).

Now we will examine what proportion of the effect is due (i) to just mediation, (ii) to just interaction, (iii) to both and (iv) to neither.

SLIDE 17

Genetic Epidemiology

The study sample, consisting of 1836 cases and 1452 controls, is from a case-control study (cf. Miller et al., 2002) assessing the molecular epidemiology of lung cancer, which began in 1992 at the Massachusetts General Hospital (MGH).

Eligible cases included any person over the age of 18 years with a diagnosis of primary lung cancer that was further confirmed by an MGH lung pathologist.

The controls were recruited from among the friends or spouses of cancer patients or the friends or spouses of other surgery patients in the same hospital.

Potential controls who carried a previous diagnosis of any cancer (other than non-melanoma skin cancer) were excluded from participation.

SLIDE 18

Genetic Epidemiology

Sample characteristics of cases and controls
_________________________________________________________________
                              Cases (N=1836)   Controls (N=1452)
_________________________________________________________________
Average Cigarettes per Day        25.42             13.97
Smoking Duration                  38.50             18.93
Age                               64.86             58.58
College Education                 31.3%             33.5%
Sex
  Male                            50.1%             56.1%
  Female                          49.9%             43.9%
rs8034191 C alleles
  0                               33.8%             43.3%
  1                               48.5%             43.7%
  2                               17.7%             13.0%
_________________________________________________________________

SLIDE 19

Assumptions About Confounding

To use our approach with the genetic variants we need to assume no unmeasured confounding for the (1) exposure-outcome, (2) mediator-outcome, and (3) exposure-mediator relationships.

Assumptions (1) and (3) are probably plausible for the exposure (the genetic variant) subject to no population stratification (the analysis was restricted to Caucasians).

(2) No confounding may be less plausible for the smoking-lung cancer association (e.g. SES / neighborhood). We consider sensitivity analysis later.

(4) Smoking duration may affect cigarettes/day and lung cancer and may be affected by the variant (though there is not much evidence), and results are similar when duration is omitted.

[Diagram: causal diagram with nodes A, M, Y, C and U]

SLIDE 20

Genetic Epidemiology

When we apply the four-way decomposition, using logistic regression for lung cancer and linear regression for the square root of average cigarettes per day (this measure is more normally distributed) and comparing 2 to 0 variant alleles, we obtain a total effect risk ratio of RR=1.77.

Most of the direct effect (which is 94%) appears to be due to INTref, i.e. to be due to interaction but not mediation; mediation is only about 6%.

As suspected, the proportion due to interaction is substantial, but now we can quantify this.

SLIDE 21

Genetic Epidemiology

With the two-way decomposition (Robins and Greenland, 1992; Pearl, 2001; VanderWeele et al., 2012) we obscure the role of interaction here because it combines the CDE and INTref into the PDE.

SLIDE 22

Study Summary

(1) Most of the effect seemed to be due to interaction, in the absence of mediation (at least with respect to cigarettes per day)
(2) Both the mediated effect and the reference interaction may be underestimated due to measurement error in the self-reported cigarettes per day measure (Valeri et al., 2014)
(3) Other aspects of smoking (e.g. depth of inhalation) may mediate more of the relationship
(4) The strong interaction might likewise depend on the smoking variable used (cigarettes per day versus depth of inhalation)
(5) At least with respect to cigarettes per day, however, most of the effect is not by increasing cigarettes per day (it does this only by 1 CPD for smokers) but rather because of the interaction

SLIDE 23

APOE and Memory

The same technique was applied to a similar genetic example with APOE e4 alleles (Sajeev et al., 2015).

APOE e4 is associated with Alzheimer’s, cognitive decline, memory, etc.

To what extent is the effect of e4 alleles on memory mediated by cerebrovascular disease markers, e.g. microbleeds? To what extent is it due to interaction?

Data come from 4121 participants in the population-based Age-Gene/Environment Susceptibility (AGES) Study in Reykjavik, Iceland.

We use techniques for the 4-way decomposition.

All models adjusted for age, sex, education, diabetes, smoking status, and midlife measures of physical activity, body mass index, systolic blood pressure, and total cholesterol.

SLIDE 24

APOE and Memory

SLIDE 25

Concluding Remarks

One further theoretical point is of interest. Sometimes a portion eliminated measure is proposed as being of more policy relevance (Robins and Greenland, 1992; VanderWeele, 2013):

$E[PE] = E[TE] - E[CDE] = E(Y_1 - Y_0) - E(Y_{10} - Y_{00})$

i.e. what portion (or proportion) of the effect could we eliminate if we set M=0.

The four-way decomposition gives a further causal interpretation of PE: it is the proportion due to mediation or interaction or both.

When we fix the mediator to 0, we eliminate both mediation and interaction.

This is different from the portion mediated (PM) in that it includes INTref.

In the example PM=6% but PE=61% because of the interaction.

The CDE (and thus also the portion eliminated) is easier to identify from the data (only confounding assumptions 1 and 2, not 3 and 4, are required).
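The relation between the portion eliminated and the four components follows directly from the decomposition TE = CDE + INTref + INTmed + PIE. The short derivation below uses only symbols already defined in the talk; the final line simply subtracts the slide's rounded percentages and should be read as approximate.

```latex
\begin{align*}
E[\text{PE}] &= E[\text{TE}] - E[\text{CDE}]
              = E[\text{INT}_{\text{ref}}] + E[\text{INT}_{\text{med}}] + E[\text{PIE}], \\
\frac{E[\text{PE}]}{E[\text{TE}]} - \text{PM}
             &= \frac{E[\text{INT}_{\text{ref}}]}{E[\text{TE}]}
              \approx 61\% - 6\% = 55\% \quad \text{in the example.}
\end{align*}
```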

SLIDE 26

Concluding Remarks

(1) The four-way decomposition makes clear what proportion of an effect is due (i) to just mediation, (ii) to just interaction, (iii) to both and (iv) to neither
(2) It unites, within a single framework, prior decompositions for mediation and prior decompositions for interaction
(3) It gives the most insight into both phenomena of mediation and interaction (cf. VanderWeele, 2015)
(4) It is relatively straightforward to implement with SAS code
(5) Sensitivity analyses for measurement error and unmeasured confounding are available for some mediation and interaction measures; it would be good to extend these to cover each of the four components

SLIDE 27

OXFORD UNIVERSITY PRESS

Explanation in Causal Inference: Methods for Mediation and Interaction

2015 | Hardcover | ISBN: 9780199325870

SLIDE 28

References

Amos, C.I., et al. (2008). Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nature Genetics, 40:616-622.

Baron, R.M. and Kenny, D.A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51:1173-1182.

Chanock, S.J. and Hunter, D.J. (2008). When the smoke clears… Nature, 452:537-538.

Hosmer, D.W. and Lemeshow, S. (1992). Confidence interval estimation of interaction. Epidemiology, 3:452-456.

Hung, R., et al. (2008). A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature, 452:633-637.

Imai, K., Keele, L. and Yamamoto, T. (2010a). Identification, inference, and sensitivity analysis for causal mediation effects. Statistical Science, 25:51-71.

SLIDE 29

References

Imai, K., Keele, L. and Tingley, D. (2010b). A general approach to causal mediation analysis. Psychological Methods, 15:309-334.

Imai, K., Keele, L., Tingley, D. and Yamamoto, T. (2010c). Causal mediation analysis using R. In: H.D. Vinod (ed.), Advances in Social Science Research Using R. New York: Springer (Lecture Notes in Statistics), p.129-154.

Judd, C.M. and Kenny, D.A. (1981). Process analysis: estimating mediation in treatment evaluations. Evaluation Review, 5:602-619.

Lange, T., Vansteelandt, S. and Bekaert, M. (2012). A simple unified approach for estimating natural direct and indirect effects. American Journal of Epidemiology, 176:190-195.

Le Marchand, L., et al. (2008). Smokers with the CHRNA lung cancer-associated variants are exposed to higher levels of nicotine equivalents and a carcinogenic tobacco-specific nitrosamine. Cancer Research, 68:9137-9140.

Pearl, J. (2001). Direct and indirect effects. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, 411-420. Morgan Kaufmann, San Francisco.

SLIDE 30

References

Robins, J.M. and Greenland, S. (1992). Identifiability and exchangeability for direct and indirect effects. Epidemiology, 3:143-155.

Rothman, K.J. (1986). Modern Epidemiology. 1st ed. Little, Brown and Company, Boston, MA.

Tchetgen Tchetgen, E.J. (2011). On causal mediation analysis with a survival outcome. International Journal of Biostatistics, 7:Article 33, 1-38.

Tchetgen Tchetgen, E.J. and Shpitser, I. (2012). Semiparametric theory for causal mediation analysis: efficiency bounds, multiple robustness, and sensitivity analysis. Annals of Statistics, 40:1816-1845.

Valeri, L., Lin, X. and VanderWeele, T.J. (2014). Mediation analysis when a continuous mediator is measured with error and the outcome follows a generalized linear model. Statistics in Medicine.

Valeri, L. and VanderWeele, T.J. (2013). Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods, 18:137-150.

SLIDE 31

References

VanderWeele, T.J. (2010). Bias formulas for sensitivity analysis for direct and indirect effects. Epidemiology, 21:540-551.

VanderWeele, T.J. (2011). Causal mediation analysis with survival data. Epidemiology, 22:582-585.

VanderWeele, T.J. (2013). Policy-relevant proportions for direct effects. Epidemiology, 24:175-176.

VanderWeele, T.J. (2013). A three-way decomposition of a total effect into direct, indirect, and interactive effects. Epidemiology, 24:224-232.

VanderWeele, T.J. (2014). A unification of mediation and interaction: a four-way decomposition. Epidemiology, 25:749-761.

VanderWeele, T.J. (2015). Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford University Press: New York, in press.

SLIDE 32

References

VanderWeele, T.J., Asomaning, K., Tchetgen Tchetgen, E.J., Han, Y., Spitz, M.R., Shete, S., Wu, X., Gaborieau, V., Wang, Y., McLaughlin, J., Hung, R.J., Brennan, P., Amos, C.I., Christiani, D.C. and Lin, X. (2012). Genetic variants on 15q25.1, smoking and lung cancer: an assessment of mediation and interaction. American Journal of Epidemiology, 175:1013-1020.

VanderWeele, T.J. and Tchetgen Tchetgen, E.J. (2014). Attributing effects to interactions. Epidemiology, 25:711-722.

VanderWeele, T.J. and Vansteelandt, S. (2009). Conceptual issues concerning mediation, interventions and composition. Statistics and Its Interface, 2:457-468.

VanderWeele, T.J. and Vansteelandt, S. (2010). Odds ratios for mediation analysis with a dichotomous outcome. American Journal of Epidemiology, 172:1339-1348.

VanderWeele, T.J. and Vansteelandt, S. (2013). Mediation analysis with multiple mediators. Epidemiologic Methods, 2:95-115.

SLIDE 33

General Decomposition

SLIDE 34

General Decomposition
