SLIDE 1

Applied Multivariate Statistics – Spring 2012 (not relevant for exam)

Can one extract causal information from high-dimensional observational data?

SLIDES 2-10

What is a causal effect?

Markus Kalisch, ETH Zurich

[Figure sequence relating "Drowning accidents" and "Ice cream sales", with a "?" on the link between them]

SLIDE 11

Another example: Smoking

SLIDES 12-14

Scenario 1: Observe 1000 smokers and count the incidence of lung cancer.

Scenario 2: Make 1000 random people smoke and count the incidence of lung cancer.

The two scenarios are different.

SLIDE 15

What is a causal effect?

CHANGE BY INTERVENTION
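
A minimal simulation sketch of "change by intervention", using the ice cream example. Everything here is a hypothetical toy model: a hidden variable `temperature` (an assumption, not named on the slides) drives both ice cream sales and drowning accidents, and sales have no causal effect on accidents. The coefficients and noise levels are made up for illustration.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def correlation_sales_accidents(n, intervene, seed=0):
    """Correlation between sales and accidents, with or without intervention."""
    rng = random.Random(seed)
    sales, accidents = [], []
    for _ in range(n):
        temperature = rng.gauss(0, 1)            # hypothetical hidden common cause
        if intervene:
            s = rng.gauss(0, 1)                  # do(sales): set by us, ignoring temperature
        else:
            s = temperature + rng.gauss(0, 0.5)  # observed: sales follow temperature
        a = temperature + rng.gauss(0, 0.5)      # accidents never depend on sales
        sales.append(s)
        accidents.append(a)
    return pearson(sales, accidents)

observed = correlation_sales_accidents(20000, intervene=False)
intervened = correlation_sales_accidents(20000, intervene=True)
print(f"observational correlation:  {observed:.2f}")
print(f"interventional correlation: {intervened:.2f}")
```

In this toy model the observed association is strong, while the association under intervention is close to zero, which is the sense in which a causal effect is a change by intervention.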

SLIDES 16-22

How to find causal effects?

Experimental Data

Two groups of plots: identical in all aspects (sunlight, water, soil quality, …). In practice: randomized assignment.

The outcome is due to the fertilizer, since everything else was equal.

SLIDE 23

How to find causal effects?

Sometimes, randomized controlled experiments are

  • too expensive (gene experiments)
  • too time-consuming (gene experiments)
  • unethical (HIV treatment)
  • just not practical (smoking).

SLIDES 24-30

If an experiment is impossible, observe the fields of two farmers.

Observational Data

The groups are not guaranteed to be identical in all aspects (sunlight, water, soil quality, …).

Is the outcome due to the fertilizer? We can’t tell!

SLIDE 31

How to find causal effects?

Can one extract causal information from observational data alone?

SLIDE 32

Goal of this talk

  • IDA finds a set of possible causal effects given observational data, consistently even in high dimensions.
  • One element of the set is the true causal effect; bounds on the set are useful.
  • Does not replace randomized experiments.
  • Helps in prioritizing and designing randomized experiments.

SLIDES 33-36

Example

  • Yeast: Saccharomyces cerevisiae
  • What are the causal effects among the thousands of genes?
  • Approach: Model the expression of each gene as a random variable. Can we use the joint distribution of gene expression to extract causal information?

SLIDE 37

Distribution Oracle

Here is a distribution oracle. Now find the causal effect!

SLIDE 38

Outline in Theory

[Diagram: Distribution Oracle → Causal Structure → Causal effects. The step from causal structure to causal effects is "do-calculus with known causal structure"; the whole procedure is labeled IDA.]

SLIDE 39

Pearl’s do-operator

  • Notation for causal intervention:

P(Y=y | do(X=x)): “distribution of Y, if there is an intervention in variable X”

  • Causal effect:

C(x’) = d/dx E[Y | do(X=x)] |_{x=x’}: “change in expected value of Y, if there is an intervention in variable X”

do-calculus with known causal structure

SLIDE 40

P(Y=y | X=x) ≠ P(Y=y | do(X=x))

Pick a random day:

P(rain | wet) = high
P(rain | do(wet)) = P(rain) = low
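
The rain example can be worked through exactly with a few lines of arithmetic. This is only a sketch: the causal model rain → wet and all probabilities below are hypothetical values chosen for illustration.

```python
# Hypothetical numbers for the assumed causal model rain -> wet.
P_RAIN = 0.2
P_WET_GIVEN = {True: 0.9, False: 0.1}   # P(wet | rain), P(wet | no rain)

def p_rain_given_wet():
    """Conditioning: Bayes' rule on the observational distribution."""
    joint_rain = P_RAIN * P_WET_GIVEN[True]
    joint_dry = (1 - P_RAIN) * P_WET_GIVEN[False]
    return joint_rain / (joint_rain + joint_dry)

def p_rain_do_wet():
    """Intervening: making the ground wet cuts the arrow rain -> wet,
    so rain keeps its marginal distribution."""
    return P_RAIN

print(f"P(rain | wet)     = {p_rain_given_wet():.3f}")
print(f"P(rain | do(wet)) = {p_rain_do_wet():.3f}")
```

Seeing a wet street makes rain likely; hosing the street down does not.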

SLIDE 41

Pearl’s do-calculus

[Diagram: a causal structure over X, Y, Z, together with the rules of do-calculus, turns an expression with “do” into an expression without “do”]

Judea Pearl, “Causality”, 2010, Cambridge University Press

SLIDES 42-44

Example: Back-door Adjustment

[Diagram: causal structure over X, Y, Z]

Assume Z is binary (0/1). The rules turn the expression with “do” into one without “do”:

P(Y=y | do(X=x)) = P(Y=y | X=x, Z=0) * P(Z=0) + P(Y=y | X=x, Z=1) * P(Z=1)
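
A small exact computation may help to see the back-door formula at work. The graph on the slide did not survive extraction, so this sketch assumes the structure Z → X, Z → Y, X → Y with all variables binary; the probability tables are hypothetical.

```python
# Hypothetical conditional probability tables for Z -> X, Z -> Y, X -> Y.
P_Z = {0: 0.5, 1: 0.5}
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.3, (1, 1): 0.7}

def joint(x, y, z):
    """Observational joint probability P(X=x, Y=y, Z=z)."""
    px = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
    py = P_Y1_GIVEN_XZ[(x, z)] if y == 1 else 1 - P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * px * py

def prob(event):
    """Probability of all (x, y, z) outcomes for which event(x, y, z) is true."""
    return sum(joint(x, y, z)
               for x in (0, 1) for y in (0, 1) for z in (0, 1)
               if event(x, y, z))

def conditional_y1_given_x(x):
    """P(Y=1 | X=x): plain conditioning, no 'do'."""
    return prob(lambda a, b, c: a == x and b == 1) / prob(lambda a, b, c: a == x)

def backdoor_y1_do_x(x):
    """P(Y=1 | do(X=x)) via the back-door formula, using only observational terms."""
    total = 0.0
    for z in (0, 1):
        p_y_given_xz = (prob(lambda a, b, c: a == x and b == 1 and c == z)
                        / prob(lambda a, b, c: a == x and c == z))
        total += p_y_given_xz * prob(lambda a, b, c: c == z)
    return total

for x in (0, 1):
    print(f"x={x}: P(Y=1 | X=x) = {conditional_y1_given_x(x):.2f}, "
          f"P(Y=1 | do(X=x)) = {backdoor_y1_do_x(x):.2f}")
```

With these tables, conditioning suggests that X raises P(Y=1) from 0.18 to 0.62, whereas the adjusted, interventional change is only from 0.30 to 0.50; the difference is due to Z.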

SLIDE 45

Conclusion 1

If the causal structure is known, we can infer causal effects from observations.

SLIDE 46

Outline in Theory

[Diagram repeated from slide 38]

SLIDE 47

Estimate Causal Structure

Oftentimes, the causal structure is unknown and has to be estimated.

Causal Structure

SLIDES 48-50

Causal Directed Acyclic Graph (DAG)

[Figure: DAG over X, W, Z, Y. Nodes: random variables. Arrows: direct causes.]

The DAG implies conditional independence relations among the variables.
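
To illustrate how a DAG implies conditional independence, here is a simulation sketch for a hypothetical chain X → W → Y (the edges of the slide's graph are not recoverable, so this chain and its coefficients are assumptions). In a linear Gaussian model, X and Y are dependent, but independent given W, which shows up as a partial correlation near zero.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def partial_correlation(x, y, given):
    """Correlation of x and y after removing the linear effect of one variable."""
    r_xy, r_xg, r_yg = pearson(x, y), pearson(x, given), pearson(y, given)
    return (r_xy - r_xg * r_yg) / math.sqrt((1 - r_xg ** 2) * (1 - r_yg ** 2))

# Hypothetical linear Gaussian chain X -> W -> Y.
rng = random.Random(1)
n = 20000
X = [rng.gauss(0, 1) for _ in range(n)]
W = [0.8 * x + rng.gauss(0, 1) for x in X]
Y = [0.8 * w + rng.gauss(0, 1) for w in W]

marginal = pearson(X, Y)
conditional = partial_correlation(X, Y, W)
print(f"cor(X, Y)     = {marginal:.2f}")
print(f"cor(X, Y | W) = {conditional:.2f}")
```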

SLIDES 51-52

Estimate a DAG model

The DAG encodes independence information. Given the independencies among the variables from the oracle, reverse-engineer the DAG.

PC Algorithm

  • P. Spirtes, C. Glymour, R. Scheines, “Causation, Prediction, and Search”, 2000, MIT Press

SLIDES 53-57

Ambiguity: Equivalence class

Several DAGs describe exactly the same list of independence relations.

[Figure: three DAGs over X, W, Z, Y from the same equivalence class]

Equivalence class: PARTIALLY Directed Acyclic Graph (PDAG)

The PC Algorithm finds the equivalence class.
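
Whether two DAGs describe the same independence relations can be checked with a standard graphical criterion that the slides do not state explicitly: same skeleton and same v-structures (colliders a → c ← b with a and b not adjacent). The sketch below applies it to small hypothetical graphs; edges are written as (parent, child) pairs.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of the graph."""
    return {frozenset(edge) for edge in edges}

def v_structures(edges):
    """All colliders a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pas in parents.items():
        for a, b in combinations(sorted(pas), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(dag1, dag2):
    """True if both DAGs encode the same conditional independencies."""
    return (skeleton(dag1) == skeleton(dag2)
            and v_structures(dag1) == v_structures(dag2))

# Hypothetical three-node examples.
chain = [("X", "W"), ("W", "Y")]            # X -> W -> Y
reversed_chain = [("Y", "W"), ("W", "X")]   # X <- W <- Y
fork = [("W", "X"), ("W", "Y")]             # X <- W -> Y
collider = [("X", "W"), ("Y", "W")]         # X -> W <- Y

print(markov_equivalent(chain, reversed_chain))
print(markov_equivalent(chain, fork))
print(markov_equivalent(chain, collider))
```

The chain, the reversed chain and the fork cannot be told apart from independence information alone, while the collider can.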

SLIDE 58

Outline in Theory

[Diagram repeated from slide 38, with the causal structure now determined up to its equivalence class]

SLIDES 59-62

Putting everything together

[Diagram: Distribution Oracle → (PC Algorithm) → PDAG → DAG 1, …, DAG n → (do-calculus) → Effect 1, …, Effect n → Set of causal effects → Bounds, e.g. minimum absolute value]
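
The last two steps of this pipeline can be sketched for linear Gaussian data, where the effect of X on Y in a given DAG is the coefficient of X when regressing Y on X and the parents of X in that DAG. This is a simplified illustration, not the pcalg implementation: the true model, its coefficients, and the two-member equivalence class (differing only in the direction of the edge between W and X) are assumptions, and the equivalence class is supplied by hand rather than estimated.

```python
import random

def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def ols_slopes(y, columns):
    """Least-squares slopes of y on the columns (intercept absorbed by centering)."""
    cols = [center(c) for c in columns]
    yc = center(y)
    k = len(cols)
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    aug = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
           + [sum(p * q for p, q in zip(cols[i], yc))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

def possible_effects(data, x, y, parent_sets):
    """One effect of x on y per candidate DAG, adjusting for that DAG's parents of x."""
    effects = []
    for parents in parent_sets:
        if y in parents:
            effects.append(0.0)  # y is a cause of x in this DAG, so x has no effect on y
        else:
            slopes = ols_slopes(data[y], [data[x]] + [data[p] for p in parents])
            effects.append(slopes[0])
    return effects

# Hypothetical truth: W -> X, W -> Y, Z -> Y, and X -> Y with effect 0.5.
rng = random.Random(0)
n = 5000
W = [rng.gauss(0, 1) for _ in range(n)]
Z = [rng.gauss(0, 1) for _ in range(n)]
X = [0.8 * w + rng.gauss(0, 1) for w in W]
Y = [0.5 * x + 0.7 * w + 0.6 * z + rng.gauss(0, 1) for x, w, z in zip(X, W, Z)]
data = {"W": W, "X": X, "Y": Y, "Z": Z}

# Parents of X in the two DAGs of the assumed equivalence class (W -> X or X -> W).
parent_sets = [["W"], []]
effects = possible_effects(data, "X", "Y", parent_sets)
lower_bound = min(abs(e) for e in effects)
print("possible effects:", [round(e, 2) for e in effects])
print("lower bound on |effect|:", round(lower_bound, 2))
```

One of the two values is close to the true effect of 0.5, but the data alone cannot say which one; the minimum absolute value is the kind of bound the diagram refers to.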

SLIDE 63

Outline in Theory

[Diagram: Distribution Oracle → Equivalence class of Causal Structure → Set of Causal effects, using do-calculus with known causal structure; the whole procedure is IDA]

SLIDE 64

I’m busy! Find your own information on the distribution…

SLIDES 65-68

Outline in Practice

[Diagram: Observational data → Equivalence class of Causal Structure → Set of Causal effects, using do-calculus with known causal structure; the whole procedure is IDA. In place of the oracle: conditional independence tests and estimated properties of the distribution.]
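
For Gaussian data, a common conditional independence test is based on Fisher's z-transform of the sample partial correlation. The sketch below shows that test in isolation; the sample size, the correlations and the significance level `alpha` are hypothetical values, and the function name is made up for this example.

```python
import math

def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: partial correlation = 0.

    r: sample partial correlation, n: sample size, k: size of the conditioning set.
    """
    z = 0.5 * math.log((1 + r) / (1 - r))        # Fisher's z-transform
    statistic = math.sqrt(n - k - 3) * abs(z)    # approximately N(0, 1) under H0
    return math.erfc(statistic / math.sqrt(2))   # equals 2 * (1 - Phi(statistic))

alpha = 0.01   # hypothetical significance level
for r in (0.1, 0.5):
    p = fisher_z_pvalue(r, n=100, k=1)
    verdict = "dependent" if p < alpha else "treated as independent"
    print(f"r = {r}: p = {p:.4f} -> {verdict}")
```

With finite data, "independent" only means that the test did not reject, so the chosen significance level influences which graph is estimated.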

SLIDES 69-70

Consistency in high dimensions: Gaussian case

Estimating graphical models with the PC algorithm

  • M. Kalisch, P. Bühlmann, “Estimating high-dimensional DAGs with the PC algorithm”, 2007, JMLR 8, 613 - 636

Do-calculus in high dimensions

  • M.H. Maathuis, M. Kalisch, P. Bühlmann, “Estimating high-dimensional intervention effects from observational data”, 2009, Annals of Statistics 37, 3133 - 3164

IDA: Intervention effects if DAG is Absent

SLIDE 71

Main assumptions & requirements

  • Gaussian data from unknown causal DAG
  • Faithfulness to this DAG
  • No hidden or selection variables
  • Involves a tuning parameter

SLIDE 72

Experimental validation

[Diagram: Complex system → Experiment → Top causal effects; Complex system → Observational data → (IDA) → Top causal effects. Do the two sets of top causal effects agree?]

SLIDE 73

Back to the beer: Experimental validation of IDA in Saccharomyces cerevisiae

SLIDE 74

Setting

  • 5361 observed genes
  • Experiments: 234 single-gene deletion mutants
  • Observational data: 63 wild-type cultures
  • Very high dimensional: 5361 variables, 63 observations

SLIDES 75-79

[Figure: out of 234 * 5360 effects, the top 10% of causal effects from the experiments serve as the target set. The top 5000 causal effects found using IDA, and the top 5000 effects found using other methods, are compared against it and counted as true positives or false positives.]

SLIDES 80-83

[Figure: number of true positives (axis up to 1000) against number of false positives (axis up to 4000) for IDA, Lasso, Elastic net and random guessing. Highlighted points for the top 1000 estimated effects, as true positives / false positives: 100 / 900, 130 / 870 and 400 / 600.]

M.H. Maathuis, D. Colombo, M. Kalisch, P. Bühlmann, “Predicting causal effects in large-scale systems from observational data”, 2010, Nature Methods 7, 247 - 248

SLIDES 84-85

Outline in Theory and in Practice

[Diagrams repeated from slides 63 and 65]

SLIDE 86

Summary of assumptions

  • Data is faithful to an underlying causal DAG
  • No hidden or selection variables
  • Consistent in high dimensions if
      • data are multivariate normal
      • some regularity conditions on partial correlations hold
      • underlying DAG is sparse
  • For IDA also: all conditional expectations are linear

SLIDE 87

R

  • Function “ida” in package “pcalg”