SLIDE 1

When causality matters for prediction: Investigating the practical tradeoffs

Robert E. Tillman, Peter Spirtes

Department of Philosophy, College of Humanities and Social Sciences
Machine Learning Department, School of Computer Science

NIPS 2008 Workshop on Causality: Objectives and Assessment

Outline: Causation and Prediction; Invariance of prediction functions; Experimental Results; Conclusions

SLIDE 5

Causal Discovery

[Figure: causal graph over Pollution, HeartDis, CiliaDam, BreathDis, LungCapac, Genotype, Smoker, Income, Parent]

The Usual Setup:
  Unobserved data generating process
  i.i.d. sample
Objective: Learn structure, e.g. causal Bayesian network
Assessment: Compare to “ground truth”, e.g. simulations, experimental studies, expert knowledge
Focus: Learn network models that accurately depict the data generating mechanism

SLIDE 9

Prediction

[Figure: diagram with nodes BB, Target, P1, P2, P3, P4, P5]

The Standard Problem:
  “Target” variable associated with “predictor” variables
  i.i.d. sample (training data)
Objective: Predict target from values of predictor variables
Assessment: Compare predictions to known target values, e.g. testing data, cross validation
Focus: Train classifier/regression model that minimizes loss function, i.e. makes accurate predictions
  Model need not resemble the true data generating mechanism, e.g. Naive Bayes

SLIDE 14

Causal Discovery and Prediction

Previous focus: predicting the effects of possible interventions
  Specify the distribution for a manipulated population
  Counterfactuals
  Assume the intervention has not been performed, i.e. there is no data from the manipulated population

Causation and Prediction Challenge:
  Training data from unmanipulated population
  (Structural) intervention is performed
  System stabilizes
  Draw i.i.d. sample for predictors from manipulated population
  Predict target using predictor values from stabilized manipulated distribution

SLIDE 20

Causation and Prediction Challenge

Results:
  Participants used causal methods and methods which ignore causality
  Some top-ranking participants did not use causal methods, e.g. they used support vector machines (for feature selection and classification)
  Other participants using causal methods did not do as well
Questions:
  Is causality useful for standard prediction tasks?
  Is it useful in practice?
  Is this a realistic scenario?
Possible Explanations:
  Sampling error, overfitting
  Parametric assumptions do not hold, e.g. linearity, Gaussianity
  Prediction for target is invariant under the manipulation

SLIDE 22

Invariance of prediction under manipulations

Simple example:

[Figure: two-node graph over X and Y]

Bayes optimal prediction for Y is P(Y|X)
Manipulating X does not change P(Y|X), which is still Bayes optimal
Prediction (once system stabilizes) is invariant under manipulation

SLIDE 23

Invariance of prediction under manipulations

Simple example:

[Figure: two-node graph over X and Y]

Bayes optimal prediction for Y is P(Y|X)
Manipulating Y does change P(Y|X): Y now depends on the manipulation
Incorrect predictions in stabilized manipulated population
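The two-node example can be checked numerically. The sketch below is only an illustration under stated assumptions: the edge direction X -> Y, the linear Gaussian mechanism Y = 2X + noise, and the manipulation (replacing a variable's mechanism with an independent draw from N(3, 1)) are all hypothetical choices, since the original figure did not survive extraction. It fits a regression for Y on unmanipulated data and then scores it on unmanipulated data, on data where X was manipulated, and on data where Y was manipulated.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def mse(model, xs, ys):
    """Mean squared error of the fitted line on (xs, ys)."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def sample(n, rng, manipulate=None):
    """Draw n pairs from the assumed graph X -> Y with Y = 2X + noise.

    manipulate="X" or manipulate="Y" replaces that variable's mechanism
    with an independent draw from N(3, 1) (a hypothetical intervention).
    """
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(3, 1) if manipulate == "X" else rng.gauss(0, 1)
        y = rng.gauss(3, 1) if manipulate == "Y" else 2 * x + rng.gauss(0, 0.5)
        xs.append(x)
        ys.append(y)
    return xs, ys

rng = random.Random(0)
model = fit_line(*sample(5000, rng))            # trained on unmanipulated data
mse_obs = mse(model, *sample(5000, rng))
mse_do_x = mse(model, *sample(5000, rng, manipulate="X"))
mse_do_y = mse(model, *sample(5000, rng, manipulate="Y"))
print(mse_obs, mse_do_x, mse_do_y)
```

Under these assumptions the error after manipulating X stays close to the unmanipulated error, while the error after manipulating Y is much larger, because the fitted relationship between X and Y no longer holds.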

SLIDE 24

Terminology

[Figure: causal graph over CiliaDam, LungCapac, BreathDis, Pollution, HeartDis, Genotype, Smoker, Income, Parent]

Predict CiliaDam

SLIDE 25

Terminology

[Figure: causal graph over CiliaDam, LungCapac, BreathDis, Smoker, Pollution, HeartDis, Genotype, Income, Parent]

Parents of CiliaDam

SLIDE 26

Terminology

[Figure: causal graph over CiliaDam, LungCapac, BreathDis, Pollution, HeartDis, Genotype, Smoker, Income, Parent]

Children of CiliaDam

SLIDE 27

Terminology

[Figure: causal graph over CiliaDam, LungCapac, BreathDis, Genotype, HeartDis, Pollution, Smoker, Income, Parent]

Coparents (spouses) of CiliaDam

SLIDE 29

Terminology

[Figure: causal graph over CiliaDam, LungCapac, BreathDis, Genotype, HeartDis, Smoker, Pollution, Income, Parent]

Definition: In a causal Bayesian network $B = \langle G, P \rangle$ over variables $\mathbf{V}$, the Markov blanket for $X \in \mathbf{V}$ is the minimal set of variables $MB^G_X \subseteq \mathbf{V} \setminus \{X\}$ such that $X \perp\!\!\!\perp \mathbf{V} \setminus MB^G_X \mid MB^G_X$.

Theorem (Pearl, 1988): The Markov blanket for $X$ consists of the parents, children and coparents of $X$ in $G$.
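Pearl's characterization translates directly into a graph computation. The sketch below assumes a DAG stored as a dictionary from each node to the set of its parents; the function name `markov_blanket` and the toy graph are hypothetical and are not the graph from the slides, whose edges did not survive extraction.

```python
def markov_blanket(parents, node):
    """Return the parents, children and coparents of `node`.

    `parents` maps each node to the set of its parents in a DAG;
    nodes without parents may be omitted from the mapping.
    """
    pa = set(parents.get(node, ()))
    children = {v for v, ps in parents.items() if node in ps}
    coparents = set()
    for child in children:
        coparents |= set(parents[child])
    coparents.discard(node)  # the node is not its own coparent
    return pa | children | coparents

# Hypothetical toy DAG: E -> A -> T -> C <- S, and C -> D
toy = {"A": {"E"}, "T": {"A"}, "C": {"T", "S"}, "D": {"C"}}
mb_t = markov_blanket(toy, "T")
mb_c = markov_blanket(toy, "C")
print(sorted(mb_t), sorted(mb_c))
```

In the toy graph, E and D fall outside the blanket of T: E acts on T only through A, and D depends on T only through C.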

SLIDE 30

Interventions

[Figure: causal graph over Pollution, HeartDis, CiliaDam, BreathDis, LungCapac, Genotype, Smoker, Income, Parent with an added Policy(Smoker) variable, shown once for Policy(Smoker)=0 and once for Policy(Smoker)=1]

SLIDE 31

Conditions for invariance of prediction under manipulations

[Figure: causal graph over CiliaDam, LungCapac, BreathDis, Genotype, HeartDis, Smoker, Pollution, Income, Parent]

Theorem (Prediction invariance): In a causal Bayesian network $B = \langle G, P \rangle$ over variables $\mathbf{V}$, let $T \in \mathbf{V}$ be a target, $\mathbf{X} \subseteq \mathbf{V}$ a set of predictor variables, and $\mathbf{Y} \subseteq \mathbf{V}$ the set of manipulated variables. If $\mathbf{X} \supseteq MB^G_T$ and $\forall Y \in \mathbf{Y}$, $Y \neq T$ and $Y \notin \mathrm{Children}(T)$, then prediction of $T$ using $\mathbf{X}$ is invariant under the manipulation.
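The premises of the prediction invariance theorem can be checked mechanically once a graph is known. The sketch below is a hypothetical helper (the name `satisfies_invariance_condition`, the parent-set dictionary representation, and the toy graph are assumptions, not part of the talk). Because the theorem states a sufficient condition, a `False` result means only that the theorem does not apply, not that the prediction necessarily changes.

```python
def satisfies_invariance_condition(parents, target, predictors, manipulated):
    """Check the premises of the prediction invariance theorem.

    parents:     dict mapping each node to the set of its parents (a DAG)
    predictors:  the set X of predictor variables
    manipulated: the set Y of manipulated variables
    """
    children = {v for v, ps in parents.items() if target in ps}
    coparents = set().union(*(parents[c] for c in children)) - {target}
    blanket = set(parents.get(target, ())) | children | coparents

    covers_blanket = blanket <= set(predictors)          # X contains MB(T)
    no_bad_manipulation = all(
        y != target and y not in children for y in manipulated
    )                                                    # Y avoids T and its children
    return covers_blanket and no_bad_manipulation

# Hypothetical toy DAG: E -> A -> T -> C <- S, and C -> D
toy = {"A": {"E"}, "T": {"A"}, "C": {"T", "S"}, "D": {"C"}}
ok = satisfies_invariance_condition(toy, "T", {"A", "C", "S", "D"}, {"A", "S"})
child_hit = satisfies_invariance_condition(toy, "T", {"A", "C", "S", "D"}, {"C"})
missing = satisfies_invariance_condition(toy, "T", {"A", "C"}, set())
print(ok, child_hit, missing)
```

The three calls cover the cases the theorem distinguishes: manipulating only a parent and a coparent, manipulating a child, and using a predictor set that omits part of the Markov blanket.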

SLIDE 35

Conditions for invariance of prediction under manipulations

$$
\begin{aligned}
P(T \mid \mathbf{X}) &= P(T \mid MB^G_T) \\
&= \frac{P(T, MB^G_T)}{\sum_T P(T, MB^G_T)} \\
&= \frac{\prod_{X \in \{T\} \cup \mathrm{Children}(T) \cup \mathrm{Parents}(T) \cup \mathrm{Coparents}(T)} P(X \mid \mathrm{Parents}(X))}{\sum_T \prod_{X \in \{T\} \cup \mathrm{Children}(T) \cup \mathrm{Parents}(T) \cup \mathrm{Coparents}(T)} P(X \mid \mathrm{Parents}(X))} \quad \text{(in the Markov blanket subgraph)} \\
&\;\;\vdots \\
&= \frac{\prod_{X \in \{T\} \cup \mathrm{Children}(T)} P(X \mid \mathrm{Parents}(X))}{\sum_T \prod_{X \in \{T\} \cup \mathrm{Children}(T)} P(X \mid \mathrm{Parents}(X))}
\end{aligned}
$$
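The cancellation behind the last step is easiest to see on a small case. The following worked example assumes a hypothetical four-node graph $A \to T \to C \leftarrow S$ (not the graph from the slides), in which $A$ is the only parent of $T$, $C$ its only child and $S$ its only coparent.

```latex
\begin{aligned}
P(T \mid A, C, S)
  &= \frac{P(A)\,P(S)\,P(T \mid A)\,P(C \mid T, S)}
          {\sum_{T'} P(A)\,P(S)\,P(T' \mid A)\,P(C \mid T', S)} \\
  &= \frac{P(T \mid A)\,P(C \mid T, S)}
          {\sum_{T'} P(T' \mid A)\,P(C \mid T', S)}
\end{aligned}
```

The factors $P(A)$ and $P(S)$ do not involve $T$, so they cancel between numerator and denominator. A manipulation that replaces either of them leaves the ratio unchanged, whereas a manipulation of $T$ or of $C$ replaces a factor that remains in the final expression.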

SLIDE 36

Correcting for manipulations

[Figure: causal graph over CiliaDam, LungCapac, BreathDis, Genotype, HeartDis, Smoker, Pollution, Income, Parent with an added Policy(BreathDis) variable; Policy(BreathDis) = 0]

Theorem (Causal correction): In a causal Bayesian network $B = \langle G, P \rangle$ over variables $\mathbf{V}$, let $T$ be a target and $\mathbf{Y} \subseteq \mathbf{V}$ the set of manipulated variables. $P\left(T \mid MB^{G(\mathrm{Policy}(\mathbf{Y}))}_T\right)$ is invariant under the manipulation of $\mathbf{Y}$ if $\nexists\, Y \in \mathbf{Y}$ such that $Y \in \mathrm{Children}(T)$ and $Y$ is an ancestor of some $C \in \mathrm{Children}(T) \cap \mathbf{V} \setminus \mathbf{Y}$.

SLIDE 37

Correcting for manipulations

[Figure: causal graph with an added Policy(BreathDis) variable; Policy(BreathDis) = 1]

SLIDE 38

Correcting for manipulations

[Figure: causal graph with an added Policy(LungCapac) variable; Policy(BreathDis) = 0]

SLIDE 39

Correcting for manipulations

[Figure: causal graph with an added Policy(LungCapac) variable; Policy(BreathDis) = 1]

Make Correction!
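The causal correction theorem suggests two computations: the Markov blanket of the target in the manipulated graph, and a check of the ancestor condition. The sketch below is one possible reading, not the authors' implementation. It assumes that a structural intervention removes every incoming edge of a manipulated variable and that ancestry is evaluated in the original graph; the function names and the toy graph are hypothetical.

```python
def ancestors_of(parents, node):
    """All ancestors of `node` in a DAG given as {node: set_of_parents}."""
    seen, stack = set(), list(parents.get(node, ()))
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen

def corrected_blanket(parents, target, manipulated):
    """Markov blanket of `target` after cutting edges into manipulated nodes
    (assumed model of a structural intervention)."""
    cut = {v: (set() if v in manipulated else set(ps)) for v, ps in parents.items()}
    children = {v for v, ps in cut.items() if target in ps}
    coparents = set().union(*(cut[c] for c in children)) - {target}
    return cut.get(target, set()) | children | coparents

def correction_condition_holds(parents, target, manipulated):
    """True if no manipulated child of the target is an ancestor of an
    unmanipulated child of the target."""
    children = {v for v, ps in parents.items() if target in ps}
    kept_children = children - set(manipulated)
    return not any(
        y in children and any(y in ancestors_of(parents, c) for c in kept_children)
        for y in manipulated
    )

# Hypothetical toy DAG: A -> T, T -> C1 <- S, T -> C2 <- C1, C2 -> D
toy = {"T": {"A"}, "C1": {"T", "S"}, "C2": {"T", "C1"}, "D": {"C2"}}
print(corrected_blanket(toy, "T", {"C2"}), correction_condition_holds(toy, "T", {"C2"}))
print(corrected_blanket(toy, "T", {"C1"}), correction_condition_holds(toy, "T", {"C1"}))
```

In the toy graph, manipulating C2 satisfies the condition, since C2 is not an ancestor of the remaining child C1. Manipulating C1 does not, because C1 is an ancestor of the unmanipulated child C2.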

SLIDE 40

Model for experiments

[Figure: causal graph used in the experiments, over nodes T, C0 to C9, GC0 to GC6, P0 to P4, S0, S1, GP0 to GP4, N0, U0]

SLIDE 46

Experiments

Method:
  Train causal and noncausal prediction methods on unmanipulated population (linear Gaussians)
  Manipulate 0, 5, 10 random nonchildren of T (including Markov blanket)
  Manipulate 0, ..., 9 children of T in addition
  Predict T from manipulated distribution
Hypotheses:
  Noncausal methods will be equivalent or better when no children are manipulated
  Causal methods will do increasingly better than noncausal methods as more children are manipulated
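The experimental protocol can be imitated at a very small scale. The sketch below is not the authors' simulation: it assumes a hypothetical three-node linear Gaussian chain Pa -> T -> C with unit coefficients and unit noise, and it models the manipulation of the child as replacing C with independent N(0, 1) noise. It compares a regression on the whole Markov blanket {Pa, C} with a regression that drops the child, which stands in for a correction for a manipulated child.

```python
import random

def ols1(x, y):
    """Least squares fit of y on one predictor; returns (slope, intercept)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (w - my) for u, w in zip(x, y))
    b = sxy / sxx
    return b, my - b * mx

def ols2(x1, x2, y):
    """Least squares fit of y on two predictors via the 2x2 normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2, my - b1 * m1 - b2 * m2

def draw(n, rng, manipulate_child=False):
    """Sample (parent, target, child) from the assumed chain Pa -> T -> C."""
    pa, t, c = [], [], []
    for _ in range(n):
        p = rng.gauss(0, 1)
        tv = p + rng.gauss(0, 1)
        cv = rng.gauss(0, 1) if manipulate_child else tv + rng.gauss(0, 1)
        pa.append(p)
        t.append(tv)
        c.append(cv)
    return pa, t, c

def mse(pred, truth):
    return sum((a - b) ** 2 for a, b in zip(pred, truth)) / len(truth)

rng = random.Random(1)
pa, t, c = draw(5000, rng)        # training data from the unmanipulated population
b1, b2, b0 = ols2(pa, c, t)       # regression on the whole Markov blanket {Pa, C}
a1, a0 = ols1(pa, t)              # regression that drops the child

results = {}
for label, flag in (("unmanipulated", False), ("child manipulated", True)):
    pa_te, t_te, c_te = draw(5000, rng, manipulate_child=flag)
    results[label] = {
        "blanket": mse([b0 + b1 * p + b2 * cv for p, cv in zip(pa_te, c_te)], t_te),
        "parents only": mse([a0 + a1 * p for p in pa_te], t_te),
    }
print(results)
```

In this toy model the population mean squared errors are roughly 0.5 (blanket) against 1 (parents only) without manipulation, and roughly 1.5 against 1 once the child is manipulated. That mirrors the two hypotheses stated above, for this one hypothetical model only.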

SLIDE 47

Differences between distributions

[Figure: average squared difference between predictions (y-axis) against number of manipulated children of T (x-axis), with one curve each for 0, 5 and 10 manipulated non-children of T]

Squared difference between ground truth predictions for T using unmanipulated and manipulated model

SLIDE 49

Prediction methods

Noncausal Methods:
  LR-ALL    linear regression using all predictors
  LR-MB     linear regression using only the Markov blanket
  LASSO     “least absolute shrinkage and selection operator”
  SVR-RBF   support vector regression using radial kernel
  RVR-RBF   relevance vector regression using radial kernel
Causal Methods:
  LR-MB/C   linear regression with Markov blanket, correcting for manipulated children
  LR-MB/C*  linear regression with Markov blanket, correcting for manipulated children and active paths to unmanipulated children

SLIDE 50

Total prediction error

[Figure: mean squared error (y-axis) against number of manipulated children of T (x-axis) for LR-ALL, LR-MB, LASSO, SVR-RBF, RVR-RBF, LR-MB/C and LR-MB/C*; panel: 0 Manipulated Nonchildren of T]

SLIDE 51

Total prediction error

[Figure: mean squared error (y-axis) against number of manipulated children of T (x-axis) for LR-ALL, LR-MB, LASSO, SVR-RBF, RVR-RBF, LR-MB/C and LR-MB/C*; panel: 5 Manipulated Nonchildren of T]

SLIDE 52

Total prediction error

[Figure: mean squared error (y-axis) against number of manipulated children of T (x-axis) for LR-ALL, LR-MB, LASSO, SVR-RBF, RVR-RBF, LR-MB/C and LR-MB/C*; panel: 10 Manipulated Nonchildren of T]

SLIDE 54

Nonlinear data

Repeated previous simulations adding nonlinear dependencies
Results so far inconclusive
In general, nonparametric methods do best, though performance is poor in all cases

SLIDE 63

Conclusions

Is causality relevant for prediction?
  Yes, as long as a noncausal method is not invariant under the manipulation
  But causality is needed to know noncausal methods are invariant
In practice?
  Tradeoff between errors related to causality and errors related to parametric assumptions, overfitting, etc.
  Noncausal prediction may be frequently invariant under manipulations or only make small errors related to causality
  Advantages of nonparametric methods and methods which deal with overfitting well may cancel out errors related to causality
  Many other variables involved, analysis incomplete
Future directions for causal discovery:
  Methods which deal with overfitting well
  Less restrictive parametric assumptions