Causality: Bernhard Schölkopf and Jonas Peters, MPI for Intelligent Systems, Tübingen (MLSS, Tübingen, 21st July 2015)

SLIDE 1

Causality

Bernhard Schölkopf and Jonas Peters

MPI for Intelligent Systems, Tübingen
MLSS, Tübingen, 21st July 2015

SLIDE 2

Charig et al.: “Comparison of treatment of renal calculi by open surgery, (...) ”, British Medical Journal, 1986

B. Schölkopf & J. Peters (MPI), Causality, 21st July 2015

SLIDE 4
  • J. Mooij et al.: Distinguishing cause from effect using observational data: methods and benchmarks, submitted
SLIDE 6

Assume $P(X_1, \dots, X_4)$ has been induced by

$X_1 = f_1(X_3, N_1)$
$X_2 = N_2$
$X_3 = f_3(X_2, N_3)$
$X_4 = f_4(X_2, X_3, N_4)$

  • $N_i$ jointly independent
  • $G_0$ has no cycles

[Figure: DAG $G_0$ on $X_1, \dots, X_4$ with edges $X_2 \to X_3$, $X_2 \to X_4$, $X_3 \to X_4$, $X_3 \to X_1$]

Functional causal model. Can the DAG be recovered from $P(X_1, \dots, X_4)$? No.

  • J. Peters, J. Mooij, D. Janzing and B. Schölkopf: Causal Discovery with Continuous Additive Noise Models, JMLR 2014
  • S. Shimizu, P. Hoyer, A. Hyvärinen, A. Kerminen: A linear non-Gaussian acyclic model for causal discovery, JMLR 2006
  • P. Bühlmann, J. Peters, J. Ernest: CAM: Causal add. models, high-dim. order search and penalized regr., Annals of Statistics 2014

SLIDE 7

Assume $P(X_1, \dots, X_4)$ has been induced by

$X_1 = f_1(X_3) + N_1$
$X_2 = N_2$
$X_3 = f_3(X_2) + N_3$
$X_4 = f_4(X_2, X_3) + N_4$

  • $N_i \sim \mathcal{N}(0, \sigma_i^2)$ jointly independent
  • $G_0$ has no cycles

[Figure: DAG $G_0$ on $X_1, \dots, X_4$ with edges $X_2 \to X_3$, $X_2 \to X_4$, $X_3 \to X_4$, $X_3 \to X_1$]

Additive noise model with Gaussian noise. Can the DAG be recovered from $P(X_1, \dots, X_4)$? Yes iff $f_i$ nonlinear.

  • J. Peters, J. Mooij, D. Janzing and B. Schölkopf: Causal Discovery with Continuous Additive Noise Models, JMLR 2014
  • P. Bühlmann, J. Peters, J. Ernest: CAM: Causal add. models, high-dim. order search and penalized regr., Annals of Statistics 2014
  • S. Shimizu, P. Hoyer, A. Hyvärinen, A. Kerminen: A linear non-Gaussian acyclic model for causal discovery, JMLR 2006
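
To make the additive noise model above concrete, here is a minimal sketch of how one could draw a sample from it by evaluating the structural equations in a topological order of the graph. The slides leave the functions and noise scales unspecified, so `f1`, `f3`, `f4` and `sigma` below are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical nonlinear mechanisms; the slides do not specify f1, f3, f4.
def f1(x3):
    return np.tanh(x3)

def f3(x2):
    return x2 ** 2

def f4(x2, x3):
    return np.sin(x2) + 0.5 * x3

# Jointly independent Gaussian noise N_i ~ N(0, sigma_i^2); the scales are made up.
sigma = np.array([0.5, 1.0, 0.3, 0.4])
N1, N2, N3, N4 = (rng.normal(0.0, s, size=n) for s in sigma)

# Evaluate the structural equations in a topological order: X2, X3, then X4 and X1.
X2 = N2
X3 = f3(X2) + N3
X4 = f4(X2, X3) + N4
X1 = f1(X3) + N1

sample = np.column_stack([X1, X2, X3, X4])  # one row per observation
```

Because the graph has no cycles, such an order always exists, which is why the sampling loop never needs a value that has not been computed yet.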

SLIDE 9

Consider a distribution generated by

$Y = f(X) + N_Y$ with $N_Y, X \overset{\text{ind}}{\sim} \mathcal{N}$ (graph: $X \to Y$)

Then, if $f$ is nonlinear, there is no

$X = g(Y) + M_X$ with $M_X, Y \overset{\text{ind}}{\sim} \mathcal{N}$ (graph: $Y \to X$)

  • J. Peters, J. Mooij, D. Janzing and B. Schölkopf: Causal Discovery with Continuous Additive Noise Models, JMLR 2014
SLIDE 10

Consider a distribution corresponding to

$Y = X^3 + N_Y$ with $N_Y, X \overset{\text{ind}}{\sim} \mathcal{N}$ (graph: $X \to Y$)

with $X \sim \mathcal{N}(1, 0.5^2)$ and $N_Y \sim \mathcal{N}(0, 0.4^2)$

SLIDE 11

[Figure: scatter plot of Y against X]

SLIDE 14

[Figure: scatter plot of gam(X ~ s(Y))$residuals and Y]
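
The cubic example and the residual plot suggest a simple procedure: regress in both directions and keep the direction whose residuals look independent of the regressor. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a cubic polynomial fit in place of the `gam` smoother named in the figure, and a biased HSIC estimate with Gaussian kernels as the dependence score (a raw score, not a calibrated test). The names `hsic` and `residual_dependence` are introduced here.

```python
import numpy as np

def hsic(a, b, bandwidth=1.0):
    """Biased HSIC estimate with Gaussian kernels on standardized inputs."""
    def gram(v):
        v = (v - v.mean()) / v.std()
        d = v[:, None] - v[None, :]
        return np.exp(-d ** 2 / (2.0 * bandwidth ** 2))
    K, L = gram(a), gram(b)
    # Double-centre K; then mean(Kc * L) equals trace(K H L H) / n^2.
    Kc = K - K.mean(axis=0, keepdims=True) - K.mean(axis=1, keepdims=True) + K.mean()
    return float((Kc * L).mean())

def residual_dependence(cause, effect, degree=3):
    """Fit effect = poly(cause) + residual and score dependence of residual on cause."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)

# Data as on the slide: Y = X^3 + N_Y, X ~ N(1, 0.5^2), N_Y ~ N(0, 0.4^2).
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(1.0, 0.5, size=n)
Y = X ** 3 + rng.normal(0.0, 0.4, size=n)

score_forward = residual_dependence(X, Y)   # model X -> Y
score_backward = residual_dependence(Y, X)  # model Y -> X
direction = "X -> Y" if score_forward < score_backward else "Y -> X"
print(score_forward, score_backward, direction)
```

A smaller score means the residuals look closer to independent of the regressor, so the direction with the smaller score is the one that admits an additive noise model on this sample.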

SLIDE 16

Surprise (under some assumptions): 2 variables ⇒ p variables

  • J. Peters, J. Mooij, D. Janzing and B. Schölkopf: Causal Discovery with Continuous Additive Noise Models, JMLR 2014

Let $P(X_1, \dots, X_p)$ be induced by a ...

model | equation | conditions | identif.
structural equation model | $X_i = f_i(X_{PA_i}, N_i)$ | - | ✗
additive noise model | $X_i = f_i(X_{PA_i}) + N_i$ | nonlin. fct. | ✓
causal additive model | $X_i = \sum_{k \in PA_i} f_{ik}(X_k) + N_i$ | nonlin. fct. | ✓
linear Gaussian model | $X_i = \sum_{k \in PA_i} \beta_{ik} X_k + N_i$ | linear fct. | ✗

(results hold for Gaussian noise)
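
Why the linear Gaussian case fails can be seen with two variables. The short derivation below is a standard argument rather than something stated on the slides; the symbols $\beta$, $\gamma$, $\sigma_X$, $\sigma_N$ are introduced here, and zero means are assumed for simplicity.

```latex
\begin{align*}
Y &= \beta X + N_Y, \qquad X \sim \mathcal{N}(0, \sigma_X^2),\quad
     N_Y \sim \mathcal{N}(0, \sigma_N^2),\quad N_Y \text{ independent of } X \\
\gamma &:= \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(Y)}
        = \frac{\beta \sigma_X^2}{\beta^2 \sigma_X^2 + \sigma_N^2},
        \qquad M_X := X - \gamma Y \\
\operatorname{Cov}(M_X, Y) &= \operatorname{Cov}(X, Y) - \gamma \operatorname{Var}(Y) = 0
\end{align*}
```

Since $(M_X, Y)$ is jointly Gaussian, zero covariance implies independence, so $X = \gamma Y + M_X$ is a valid linear Gaussian model in the backward direction and the two directions cannot be told apart from the joint distribution.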

SLIDE 19

GAUL GAUSS “the LINEAR”

SLIDE 20

[Figure: accuracy (%) against decision rate (%) for IGCI, LiNGaM, Additive Noise and PNL, with regions marked "Significant" and "Not significant"]

see also

  • D. Lopez-Paz, K. Muandet, B. Schölkopf, I. Tolstikhin: Towards a Learning Theory of Cause-Effect Inference, ICML 2015
  • E. Sgouritsa, D. Janzing, P. Hennig, B. Schölkopf: Inf. of Cause and Effect with Unsupervised Inverse Regr., AISTATS 2015
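
The slide does not say how the accuracy against decision rate curve is computed. One common reading, assumed here, is that each method attaches a confidence to every cause-effect decision, the decisions are ranked by confidence, and accuracy is reported over the most confident fraction. A minimal unweighted sketch under that assumption, with made-up toy inputs and a hypothetical function name:

```python
def accuracy_vs_decision_rate(confidences, correct):
    """Return (decision rate %, accuracy %) after each decision, most confident first."""
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])
    curve = []
    hits = 0
    for rank, i in enumerate(order, start=1):
        hits += bool(correct[i])
        curve.append((100.0 * rank / len(order), 100.0 * hits / rank))
    return curve

# Toy inputs, invented for illustration only (not benchmark results).
confidences = [0.9, 0.2, 0.6, 0.4]
correct = [True, False, True, False]
curve = accuracy_vs_decision_rate(confidences, correct)
for rate, acc in curve:
    print(f"decision rate {rate:5.1f}%  accuracy {acc:5.1f}%")
```

A method with well-ordered confidences shows high accuracy at low decision rates that decays towards its overall accuracy at 100%.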
SLIDE 24

Real data: genetic perturbation experiments for yeast (Kemmeren et al., 2014)

  • $p = 6170$ genes
  • $n_{\text{obs}} = 160$ wild-types
  • $n_{\text{int}} = 1479$ gene deletions (targets known)
  • true hits: ≈ 0.1% of pairs

"Invariant prediction" method: $E = \{\text{obs}, \text{int}\}$

  • J. Peters, P. Bühlmann, N. Meinshausen: Causal inference using inv. pred.: identification and conf. intervals, arXiv 1501.01332
  • D. Rothenhaeusler, C. Heinze et al.: backShift: Learning causal cyclic graphs from unknown shift interv., arXiv 1506.02494
  • M. Rojas-Carulla et al.: A Causal Perspective on Domain Adaptation, arXiv 1507.05333
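
The idea behind invariant prediction with two environments can be sketched as follows: for every candidate set of predictors, regress the target on that set using pooled data, check whether the residuals look the same in both environments, and report the intersection of all sets that pass. The code below is a simplified illustration on simulated data, not the yeast data and not the authors' implementation. It assumes a linear model, and uses normal-approximation two-sample tests on the residual mean and variance where a t-test and F-test would be the usual choice. All function names are introduced here.

```python
import itertools
import math
import numpy as np

def two_sample_p(a, b):
    """Two-sided p-value for equal means (Welch-type z-test, normal approximation)."""
    se = math.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = (a.mean() - b.mean()) / se
    return math.erfc(abs(z) / math.sqrt(2.0))

def invariance_p(X, y, env, subset):
    """p-value for: residuals of y on X[:, subset] look alike in both environments."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    r = y - design @ beta
    r0, r1 = r[env == 0], r[env == 1]
    p_mean = two_sample_p(r0, r1)
    p_var = two_sample_p((r0 - r0.mean()) ** 2, (r1 - r1.mean()) ** 2)
    return min(1.0, 2.0 * min(p_mean, p_var))  # Bonferroni over the two checks

def invariant_prediction(X, y, env, alpha=0.05):
    """Intersect all predictor sets whose invariance is not rejected at level alpha."""
    d = X.shape[1]
    subsets = [s for k in range(d + 1) for s in itertools.combinations(range(d), k)]
    p_values = {s: invariance_p(X, y, env, s) for s in subsets}
    accepted = [set(s) for s, p in p_values.items() if p > alpha]
    estimate = set.intersection(*accepted) if accepted else set()
    return estimate, p_values

# Simulated chain X1 -> Y -> X2; environment 1 intervenes on X1 and X2, never on Y.
rng = np.random.default_rng(0)
n = 500
env = np.repeat([0, 1], n)
X1 = np.concatenate([rng.normal(0.0, 1.0, n), rng.normal(1.0, 2.0, n)])
y = 1.5 * X1 + rng.normal(0.0, 1.0, 2 * n)
X2 = y + np.concatenate([0.5 * rng.normal(0.0, 1.0, n),
                         1.0 + 2.0 * rng.normal(0.0, 1.0, n)])
X = np.column_stack([X1, X2])

estimate, p_values = invariant_prediction(X, y, env)
print(estimate, p_values)
```

Taking the intersection is the conservative part of the design: a variable is reported only if every accepted set contains it, so the effect `X2` (column index 1) should not be reported here, while the cause `X1` (column index 0) is reported whenever the set containing only it is accepted.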
SLIDE 25

[Figure, three panels showing activity of gene 5954 and activity of gene 4710: observational training data; interventional training data (interv. on genes other than 5954 and 4710); interventional test data point (intervention on gene 5954)]

most significant pair

SLIDE 26

[Figure, three panels showing activity of gene 3729 and activity of gene 3730: observational training data; interventional training data (interv. on genes other than 3729 and 3730); interventional test data point (intervention on gene 3729)]

2nd most significant pair

SLIDE 27

[Figure, three panels showing activity of gene 3672 and activity of gene 1475: observational training data; interventional training data (interv. on genes other than 3672 and 1475); interventional test data point (intervention on gene 3672)]

3rd most significant pair

SLIDE 28

[Figure: # strong intervention effects against # intervention predictions, for PERFECT, INVARIANT, HIDDEN-INVARIANT, PC, RFCI, REGRESSION (CV-Lasso), GES and GIES, and RANDOM (99% prediction interval)]

SLIDE 29

http://xkcdsw.com/3039

SLIDE 30
  • B. Watterson: It’s a magical world, Andrews McMeel Publishing, 1996