Comparing covariate adjustment in interventional and observational studies



slide-1
SLIDE 1

Markus Kalisch, Seminar für Statistik, ETH Zürich

Comparing covariate adjustment in interventional and observational studies

slide-2
SLIDE 2
What is the total causal effect?

Diagram: Treatment X → Outcome Y

  • If we apply treatment X, how will outcome Y change?
  • Data collection:
      • observational study
      • interventional study (RCT)

slide-3
SLIDE 3
Outline for the rest of the talk

  • Total causal effect and covariate adjustment
  • Issues in observational studies
  • Issues in interventional studies
  • Insights from recent theoretical developments

slide-4
SLIDE 4

Causal Model: What the real world might look like

  • We use directed acyclic graphs (DAGs): no feedback loops
  • Example: DAG G (figure: nodes X1, X2, X3, X4, X5, Y)
  • Terminology (think of a family tree):
      • Set of all variables: $\mathbf{X} = \{X_1, X_2, \ldots, X_5, Y\}$
      • Path: (X1, X2, X3, Y)
      • Directed path = "causal path": (X1, X2, X3)
      • Non-directed path = non-causal path: (X4, X3, Y)
      • Parents pa(X3) = {X2, X4}; children ch(X1) = {X2}
      • Ancestors an, descendants de, non-descendants nd

slide-5
SLIDE 5

More details: Structural Equation Model (SEM)

  • Example of SEM:

    $X_1 = N_1$
    $X_2 = 4 X_1 + N_2$
    $N_1, N_2 \overset{iid}{\sim} N(0, 1)$

  • Visualization of causal structure: X1 → X2 (causal interpretation)
  • Difference from an arbitrary hierarchical system of equations:
    Because of the causal interpretation, solving for a variable on the right-hand side is not meaningful in an SEM.

slide-6
SLIDE 6

Quantifying the total causal effect

Define the intervention distribution by replacing (some) structural equations

  • do-operator
    Reference: Pearl, J. (2009). Causality: Models, Reasoning and Inference. 2nd edition. Cambridge Univ. Press.

E.g. «intervention on X»:

  • Old SEM: $S$ with equation $X = 2 + X_5 + N_X$
  • New SEM: $\hat{S}$ with equation $X = 4$
  • The new SEM generates a new distribution:
    $P_{\hat{S}}(\mathbf{X}) = P_{S}(\mathbf{X} \mid do(X = 4))$ and in particular $P(Y \mid do(X = 4))$
  • Final goal: estimate the intervention distribution given observational data
  • Oftentimes the expectation is enough, e.g. $E(Y \mid do(X = 4))$
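A minimal sketch of "replacing a structural equation", using the two-variable SEM from the previous slide (X1 = N1, X2 = 4·X1 + N2) rather than the partially shown system above. The function name `sample`, the intervention values and the sample size are illustrative assumptions.

```python
import random

random.seed(0)
n = 50_000

def sample(do_x1=None, do_x2=None):
    """Draw one (X1, X2) from the SEM; a do-argument replaces that equation."""
    x1 = random.gauss(0, 1) if do_x1 is None else do_x1
    x2 = 4 * x1 + random.gauss(0, 1) if do_x2 is None else do_x2
    return x1, x2

def mean(values):
    return sum(values) / len(values)

# Intervening on the cause X1 shifts the effect X2 ...
mean_x2_do_x1 = mean([sample(do_x1=1.0)[1] for _ in range(n)])
# ... but intervening on the effect X2 leaves the cause X1 untouched.
mean_x1_do_x2 = mean([sample(do_x2=4.0)[0] for _ in range(n)])

print(f"E[X2 | do(X1 = 1)] is about {mean_x2_do_x1:.2f}")
print(f"E[X1 | do(X2 = 4)] is about {mean_x1_do_x2:.2f}")
```

In this model E[X2 | do(X1 = 1)] = 4, while E[X1 | do(X2 = 4)] = 0. The asymmetry is why algebraically solving the equation for X1 would not describe the intervention.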
slide-7
SLIDE 7
Covariate adjustment: Adjustment set

  • Idea: identify intervention effects using only conditional probabilities / expectations

    $P(Y = y \mid do(X = x)) = \sum_{b} P(Y = y \mid X = x, B = b)\, P(B = b)$

    (left-hand side: «do»; right-hand side: no «do»; B is the adjustment set)

  • In practice: often interested in $E(Y \mid do(X = x))$
  • Can show for a multivariate Gaussian density:

    $E(Y \mid do(X = x)) = \alpha + \gamma x + \beta^{T} E(B)$

  • Total causal effect:

    $\frac{d}{dx} E(Y \mid do(X = x)) = \gamma$

    This is the regression coefficient of X in the regression of Y on X and B.

Figure: DAG with nodes X, Y and adjustment set B
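The step from the adjustment formula to the regression coefficient can be spelled out as follows. This is a sketch assuming a linear conditional mean E[Y | X = x, B = b] = α + γx + βᵀb (which holds in the multivariate Gaussian case); the sum becomes an integral for continuous B.

```latex
\begin{align*}
E[Y \mid do(X = x)]
  &= \sum_{b} E[Y \mid X = x, B = b]\, P(B = b) \\
  &= \sum_{b} \bigl(\alpha + \gamma x + \beta^{T} b\bigr)\, P(B = b) \\
  &= \alpha + \gamma x + \beta^{T} E[B], \\
\frac{d}{dx}\, E[Y \mid do(X = x)] &= \gamma .
\end{align*}
```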

slide-8
SLIDE 8
Outline for the rest of the talk

  • Total causal effect and covariate adjustment
  • Issues in observational studies
  • Issues in interventional studies
  • Insights from recent theoretical developments

slide-9
SLIDE 9

Causal Diagram: Example 1 - confounder

Figure: causal diagram with nodes Treatment, Outcome and Lab (1, 2, …)

Should we add the lab information as a covariate?

slide-10
SLIDE 10
Example 1 in numbers

  • $\varepsilon_X \sim N(0,1)$, $\varepsilon_Z \sim N(0,1)$, $\varepsilon_Y \sim N(0,1)$ independent
  • True causal system:

    $Z = \varepsilon_Z$
    $X = 0.7 \cdot Z + \varepsilon_X$
    $Y = 1 \cdot X + 0.5 \cdot Z + \varepsilon_Y$

  • True causal effect of X on Y: 1
    If we increase X by one unit, Y will also increase by one unit
  • Can we estimate the true causal effect with a linear regression?

Figure: DAG with nodes X, Y, Z

slide-11
SLIDE 11
Example 1 in numbers

  • True causal effect of X on Y: 1
  • Simple regression: lm(Y ~ X) → incorrect
  • Multiple regression: lm(Y ~ X + Z) → correct

Missing the confounder introduced a bias!

Figure: DAG with nodes X, Y, Z
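The confounder example can be checked by simulation. The sketch below is a Python stand-in for the two `lm` calls on the slide. The helper `ols`, the seed and the sample size are assumptions made for illustration, and the variables follow the system above (Z = ε_Z, X = 0.7·Z + ε_X, Y = X + 0.5·Z + ε_Y).

```python
import random

def ols(y, *covariates):
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [(1.0,) + xs for xs in zip(*covariates)]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[j] * r[k] for r in rows) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(rows, y))] for j in range(p)]
    for j in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(j, p), key=lambda r: abs(a[r][j]))
        a[j], a[pivot] = a[pivot], a[j]
        a[j] = [v / a[j][j] for v in a[j]]
        for r in range(p):
            if r != j:
                a[r] = [v - a[r][j] * w for v, w in zip(a[r], a[j])]
    return [a[j][p] for j in range(p)]

random.seed(1)
n = 20_000
Z = [random.gauss(0, 1) for _ in range(n)]                    # confounder
X = [0.7 * z + random.gauss(0, 1) for z in Z]                 # treatment
Y = [x + 0.5 * z + random.gauss(0, 1) for x, z in zip(X, Z)]  # outcome

simple = ols(Y, X)[1]       # analogue of lm(Y ~ X)
adjusted = ols(Y, X, Z)[1]  # analogue of lm(Y ~ X + Z)
print(f"without Z: {simple:.3f}, with Z: {adjusted:.3f}")
```

In the population, the unadjusted slope is Cov(X, Y) / Var(X) = 1.84 / 1.49 ≈ 1.23, whereas the adjusted coefficient equals the true effect 1.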

slide-12
SLIDE 12

Causal Diagram: Example 2 - selection variable

Figure: causal diagram with nodes Treatment, Outcome and Follow-up test

Should we add the information from the follow-up test as a covariate?

slide-13
SLIDE 13
Example 2 in numbers

  • $\varepsilon_X \sim N(0,1)$, $\varepsilon_Z \sim N(0,1)$, $\varepsilon_Y \sim N(0,1)$ independent
  • True causal system:

    $X = \varepsilon_X$
    $Y = 0.7 \cdot X + \varepsilon_Y$
    $Z = 0.8 \cdot X + 0.5 \cdot Y + \varepsilon_Z$

  • True causal effect of X on Y: 0.7
    If we increase X by one unit, Y will also increase by 0.7 units
  • Can we estimate the true causal effect with a linear regression?

Figure: DAG with nodes X, Y, Z

slide-14
SLIDE 14
Example 2 in numbers

  • True causal effect of X on Y: 0.7
  • Simple regression: lm(Y ~ X) → correct
  • Multiple regression: lm(Y ~ X + Z) → incorrect

Including the selection variable introduced a bias!

Figure: DAG with nodes X, Y, Z
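A simulation sketch of the selection-variable example, again as a Python stand-in for `lm`. The helper `ols`, the seed and the sample size are illustrative assumptions. Here Z is a common effect of X and Y (X = ε_X, Y = 0.7·X + ε_Y, Z = 0.8·X + 0.5·Y + ε_Z).

```python
import random

def ols(y, *covariates):
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [(1.0,) + xs for xs in zip(*covariates)]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[j] * r[k] for r in rows) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(rows, y))] for j in range(p)]
    for j in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(j, p), key=lambda r: abs(a[r][j]))
        a[j], a[pivot] = a[pivot], a[j]
        a[j] = [v / a[j][j] for v in a[j]]
        for r in range(p):
            if r != j:
                a[r] = [v - a[r][j] * w for v, w in zip(a[r], a[j])]
    return [a[j][p] for j in range(p)]

random.seed(1)
n = 20_000
X = [random.gauss(0, 1) for _ in range(n)]                          # treatment
Y = [0.7 * x + random.gauss(0, 1) for x in X]                       # outcome
Z = [0.8 * x + 0.5 * y + random.gauss(0, 1) for x, y in zip(X, Y)]  # common effect

simple = ols(Y, X)[1]       # analogue of lm(Y ~ X)
adjusted = ols(Y, X, Z)[1]  # analogue of lm(Y ~ X + Z)
print(f"without Z: {simple:.3f}, with Z: {adjusted:.3f}")
```

Working out the population coefficients for this system gives 0.7 without Z and 0.24 with Z, so conditioning on the common effect pulls the estimate well away from the true value.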

slide-15
SLIDE 15
"Parent Criterion" (PC)

  • Take the parents of X as adjustment set (special case of Pearl's back-door criterion)
  • Sufficient but not complete
  • Example 1:
    PC: Z is a valid adjustment set; would {} be a valid adjustment set, too? → ??? (perhaps we cannot measure Z although we know it exists)
  • Example 2:
    PC: {} is a valid adjustment set; would Z be a valid adjustment set, too? → ???

Figure: the two example DAGs, each with nodes X, Y, Z

slide-16
SLIDE 16

Conclusion 1

In observational studies: judging whether an adjustment set is valid is not trivial

slide-17
SLIDE 17
Outline for the rest of the talk

  • Total causal effect and covariate adjustment
  • Issues in observational studies
  • Issues in interventional studies
  • Insights from recent theoretical developments

slide-18
SLIDE 18
RCT: Evaluation

  • Cage: experimental unit
  • 5 cages with treatment (X = 1), 5 cages with control (X = 0)
  • Randomize allocation: in the causal diagram, think of "deleting all incoming edges to X"

Figure: ten cages, five labelled Treatment and five labelled Control

slide-19
SLIDE 19

RCT in causal diagram

Figure: DAG with nodes X, Y, Z1, Z2, A, B, C, shown before and after randomization (RCT)

  • Before randomization, PC: valid adjustment set is {Z1, Z2}
  • After randomization, PC: valid adjustment set is {}
  • PC: {} is always a valid adjustment set after randomization

slide-20
SLIDE 20
RCT: Evaluation

  • Given a proper design, we can do a two-sample t-test with two groups (i.e. empty adjustment set).
  • What if we have more covariates (sex, age, intermediate blood test, follow-up information, …)?
  • Is it always better to add covariates to the analysis?

Figure: ten cages, five labelled Treatment and five labelled Control

slide-21
SLIDE 21
Messing up the evaluation of a randomized controlled trial (RCT)

  • You can bias ("mess up") the analysis by adding the "wrong" covariates.
  • RCT: It is always safe to add no covariates to the analysis.
  • Adding the "right" covariates might increase precision.

slide-22
SLIDE 22

Causal Diagram: Example 1

Figure: causal diagram with nodes Treatment, Intermediate Blood Test and Outcome

Should we add the intermediate blood test as a covariate?

slide-23
SLIDE 23
Example 1 in numbers

  • $\varepsilon_X \sim N(0,1)$, $\varepsilon_Z \sim N(0,1)$, $\varepsilon_Y \sim N(0,1)$ independent
  • True causal system:

    $X = \varepsilon_X$
    $Z = 2 \cdot X + \varepsilon_Z$
    $Y = 0.5 \cdot Z + \varepsilon_Y$

  • True causal effect of X on Y: 2 · 0.5 = 1
    If we increase X by one unit, Y will also increase by one unit
  • Can we estimate the true causal effect with a linear regression?

Figure: DAG with nodes X, Z, Y

slide-24
SLIDE 24
Example 1 in numbers

  • True causal effect of X on Y: 2 · 0.5 = 1
  • Simple regression: lm(Y ~ X) → correct
  • Multiple regression: lm(Y ~ X + Z) → incorrect

Adding a covariate introduced a bias!

Figure: DAG with nodes X, Z, Y

slide-25
SLIDE 25

Causal Diagram: Example 2

Figure: causal diagram with nodes Treatment, Outcome and Lab (1, 2, …)

Should we add the lab information as a covariate?

slide-26
SLIDE 26
Example 2 in numbers

  • $\varepsilon_X \sim N(0,1)$, $\varepsilon_Z \sim N(0,1)$, $\varepsilon_Y \sim N(0,1)$ independent
  • True causal system:

    $X = \varepsilon_X$
    $Z = \varepsilon_Z$
    $Y = 1 \cdot X + 0.5 \cdot Z + \varepsilon_Y$

  • True causal effect of X on Y: 1
    If we increase X by one unit, Y will also increase by one unit
  • Can we estimate the true causal effect with a linear regression?

Figure: DAG with nodes X, Y, Z

slide-27
SLIDE 27
Example 2 in numbers

  • True causal effect of X on Y: 1
  • Simple regression: lm(Y ~ X) → correct
  • Multiple regression: lm(Y ~ X + Z) → correct
  • Adding a covariate did not introduce a bias
  • Confidence interval with covariate is slightly narrower (0.12 vs 0.14)

Figure: DAG with nodes X, Y, Z
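The precision gain can be illustrated with a small Monte Carlo sketch. It compares the spread of the treatment coefficient across repeated simulated trials, with and without the covariate. It does not attempt to reproduce the interval widths quoted on the slide, and the helper `ols`, trial size and number of repetitions are assumptions.

```python
import random
import statistics

def ols(y, *covariates):
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [(1.0,) + xs for xs in zip(*covariates)]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[j] * r[k] for r in rows) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(rows, y))] for j in range(p)]
    for j in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(j, p), key=lambda r: abs(a[r][j]))
        a[j], a[pivot] = a[pivot], a[j]
        a[j] = [v / a[j][j] for v in a[j]]
        for r in range(p):
            if r != j:
                a[r] = [v - a[r][j] * w for v, w in zip(a[r], a[j])]
    return [a[j][p] for j in range(p)]

random.seed(3)
n, reps = 50, 3000
simple, adjusted = [], []
for _ in range(reps):
    X = [random.gauss(0, 1) for _ in range(n)]   # randomized treatment
    Z = [random.gauss(0, 1) for _ in range(n)]   # independent cause of Y
    Y = [x + 0.5 * z + random.gauss(0, 1) for x, z in zip(X, Z)]
    simple.append(ols(Y, X)[1])       # analogue of lm(Y ~ X)
    adjusted.append(ols(Y, X, Z)[1])  # analogue of lm(Y ~ X + Z)

sd_simple = statistics.stdev(simple)
sd_adjusted = statistics.stdev(adjusted)
print(f"sd without Z: {sd_simple:.3f}, sd with Z: {sd_adjusted:.3f}")
```

Both estimators are centred on the true effect 1. Adjusting for Z removes the 0.5·Z term from the residual noise (residual variance 1 instead of 1.25), which is where the narrower interval comes from.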

slide-28
SLIDE 28
Summary

  • Adding the wrong variable will introduce a bias
    "Wrong variable": on a causal path from X to Y, or a «descendant» of such a node (post-intervention)
  • Adding the right variables might increase precision
    "Right variable": parents of nodes on a causal path from X to Y (pre-intervention)
  • Problem in practice:
    Usually we don't know the true causal structure! What are "right" and "wrong" variables?
  • If in doubt, don't use the covariate!
  • Safe variables: things that clearly "preceded" X (e.g. gender)

Figure: example DAG with nodes X, Y, Z1, Z2, A, B, C

slide-29
SLIDE 29
Outline for the rest of the talk

  • Total causal effect and covariate adjustment
  • Issues in observational studies
  • Issues in interventional studies
  • Insights from recent theoretical developments

slide-30
SLIDE 30

Adjustment Criteria

Getting the "right estimate":

  • given causal structure, criterion to check if a set is a valid adjustment set
  • assuming causal structure is a strong assumption in practice
  • discussion can shift to discussing reasonable causal structures
  • Pearl's back-door criterion
  • Generalized Adjustment Criterion

slide-31
SLIDE 31

Background: d-separation

  • Given a DAG G: X and Y are d-separated («blocked») by {Z1, …, Zp} if you cannot walk from X to Y.
  • Rules for walking from X to Y:

Figure: walking rules; path configurations through nodes Zi and B, each marked "blocked" or "not blocked"

slide-32
SLIDE 32

d-separation: Example

  • X1 and X3 are d-separated by X2
  • X1 and X3 are not d-separated by {}
  • X2 and X4 are d-separated by {}
  • X2 and X4 are not d-separated by X3
  • X2 and X4 are not d-separated by X5

Figure: DAG with nodes X1, X2, X3, X4, X5

slide-33
SLIDE 33
Pearl's back-door criterion (PBC)

  • Improvement on Parent Criterion
  • PBC: Set Z satisfies the back-door criterion relative to (X, Y) if
      • no node in Z is a descendant of X, and
      • Z d-separates (blocks) every path between X and Y that contains an arrow into X
  • Example: Parents of X always satisfy the back-door criterion
  • Result (Pearl): If a set of variables Z satisfies the back-door criterion relative to (X, Y), then Z is a valid adjustment set:

    $P(Y = y \mid do(X = x)) = \sum_{z} P(Y = y \mid X = x, Z = z)\, P(Z = z)$

slide-34
SLIDE 34
Pearl's back-door criterion is not complete

Figure: DAG with nodes X, Y, Z; adjusting for Z is marked "Correct"

  • The empty set satisfies the back-door criterion
  • Z does not satisfy the back-door criterion, but Z is a valid adjustment set!
  • → Pearl's back-door criterion is not complete

slide-35
SLIDE 35

Improvements: Generalized Adjustment Criterion (GAC) & asymptotic variance

Getting the "right estimate":

  • "Sound and complete" (= correct and does not miss anything)
  • We will simplify and show results only for DAGs and single-node interventions
  • GAC is general:
      • DAGs, PDAGs, CPDAGs
      • MAGs, PAGs
      • sets and not only single variables

slide-36
SLIDE 36

GAC for DAGs: Preliminaries

  • Causal nodes Cn(X, Y, G) relative to X and Y in G:
    All nodes on a causal path from X to Y (excluding X but including Y)
  • Forbidden set Forb(X, Y, G) relative to nodes X and Y in DAG G:
    All nodes on causal paths from X to Y (excluding X but including Y) and all descendants of those nodes, together with X:

    $\mathrm{Forb}(X, Y, G) = \mathrm{De}(\mathrm{Cn}(X, Y, G)) \cup \{X\}$

  • Example (figure: DAG with nodes X, Y, Z1, Z2, Z3, A, B, C and a region labelled "post-treatment"):
    Cn(X, Y, G) = {A, Y}
    Forb(X, Y, G) = {A, Y, B, C, X}

See also Shpitser, 2012 for the DAG case

slide-37
SLIDE 37

GAC for DAGs

Z is an adjustment set relative to (X, Y) in G if and only if

  • no node in Z is in the forbidden set relative to X and Y in G and
  • all non-causal paths from X to Y are blocked by Z in G.

Example (figure: DAG with nodes X, Y, Z1, Z2, Z3, Z4, A, B, C and regions labelled "pre-treatment" and "post-treatment"):

    Cn(X, Y, G) = {A, Y}
    Forb(X, Y, G) = {A, Y, B, C, X}
    Possible choices for blocking: {Z3} ∪ any subset of {Z1, Z2, Z4} → 8 possible valid adjustment sets

  • R package dagitty
  • Online tool dagitty
slide-38
SLIDE 38

Getting more precision

(For linear structural equation models with Gaussian errors)

  • All 8 adjustment sets have no bias, but which one has the lowest (asymptotic) variance?
  • Optimal set: $O(X, Y, G) = \mathrm{Pa}(\mathrm{Cn}(X, Y, G), G) \setminus \mathrm{Forb}(X, Y, G)$
  • In the example: Cn(X, Y, G) = {A, Y}, Pa(Cn(X, Y, G), G) = {X, Z1, Z3, Z4}
    Of those, X is in Forb(X, Y, G). Thus, O(X, Y, G) = {Z1, Z3, Z4}

Figure: same example DAG as on the previous slide (nodes X, Y, Z1, Z2, Z3, Z4, A, B, C), with Cn(X, Y, G) = {A, Y}, Forb(X, Y, G) = {A, Y, B, C, X} and the 8 valid adjustment sets {Z3} ∪ any subset of {Z1, Z2, Z4}

Current research of L. Henckel and M. Maathuis
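The set computations can be sketched in a few lines. The edge list below is hypothetical: the figure is not recoverable, so the edges were chosen only so that the resulting sets match those stated on the slide. The convention that Pa of a node set excludes the set itself is likewise an assumption made to reproduce the slide's Pa(Cn(X, Y, G), G).

```python
# Hypothetical edge list (parent -> children), chosen only to reproduce the slide's sets.
CHILDREN = {
    "X": ["A"], "A": ["Y", "B"], "Y": ["C"], "B": [], "C": [],
    "Z1": ["A"], "Z2": ["X"], "Z3": ["X", "Y"], "Z4": ["Y"],
}

def descendants(nodes):
    """Nodes reachable along directed edges, including the start nodes."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(CHILDREN[v])
    return seen

def parents(nodes):
    """Parents of a node set, excluding the set itself (assumed convention)."""
    nodes = set(nodes)
    return {p for p, ch in CHILDREN.items() if nodes & set(ch)} - nodes

def causal_nodes(x, y):
    """Nodes on a directed path from x to y, excluding x but including y."""
    return {v for v in descendants([x]) - {x} if y in descendants([v])}

def forbidden(x, y):
    """Causal nodes, all of their descendants, and x itself."""
    return descendants(causal_nodes(x, y)) | {x}

def optimal_set(x, y):
    """Pa(Cn) minus Forb, the optimal set formula quoted on the slide."""
    return parents(causal_nodes(x, y)) - forbidden(x, y)

print("Cn   =", sorted(causal_nodes("X", "Y")))
print("Forb =", sorted(forbidden("X", "Y")))
print("O    =", sorted(optimal_set("X", "Y")))
```

For real analyses, the dagitty package mentioned in the talk is the more appropriate tool; this sketch only mirrors the definitions.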

slide-39
SLIDE 39
Summary

  • Total causal effect and covariate adjustment
    → find the "right" adjustment set → linear regression
  • Issues in observational studies
    → not easy to find the right adjustment set; bigger ≠ better
  • Issues in interventional studies
    → can "mess up" an RCT by using the "wrong" adjustment set; if in doubt, use the empty set after an RCT
  • Insights from recent theoretical developments
    → GAC is sound and complete for finding an adjustment set given the causal structure (strong assumption)
    → discussion can shift to discussing reasonable causal structures
    → RCT remains the gold standard