JOINT PROBABILISTIC INFERENCE OF CAUSAL STRUCTURE (Dhanya Sridhar)



slide-1
SLIDE 1

JOINT PROBABILISTIC INFERENCE OF CAUSAL STRUCTURE

Dhanya Sridhar Lise Getoor U.C. Santa Cruz KDD Workshop on Causal Discovery August 14th, 2016


slide-2
SLIDE 2

Outline

  • Motivation
  • Problem Formulation
  • Our Approach
  • Preliminary Results


slide-3
SLIDE 3

Traditional to Hybrid Approaches

[Figure: two panels. Constraint Based: graph over X1, X2, X3, X4 with edges marked X. Search and Score Based: candidate graphs over X1, X2, X3, each evaluated with Score(G, D).]

slide-4
SLIDE 4

Traditional to Hybrid Approaches

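[Figure: three panels. Constraint Based: graph over X1, X2, X3, X4 with edges marked X. Search and Score Based: candidate graphs over X1, X2, X4, each evaluated with Score(G, D). Hybrid Approaches: combination of the two.]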

slide-5
SLIDE 5

Traditional to Hybrid Approaches

Hybrid Approaches:

  • PC-based DAG Search, Dash and Druzdzel, UAI 99
  • Max-Min Hill-Climbing, Tsamardinos et al., Machine Learning 06

[Figure: graph over X1, X2, X3, X4 with edges crossed out (Constraint Based), followed by the same graph evaluated with Score(G, D).]

slide-6
SLIDE 6

Joint Inference for Structure Discovery

[Figure: graph over X1, X2, X3, X4 annotated with causal edge variables C12, C24 and adjacency variables A13, A34.]

Joint inference of variables: causal edges Cij and adjacency edges Aij

slide-7
SLIDE 7

Joint Inference for Structure Discovery

[Figure: graph over X1, X2, X3, X4 annotated with causal edge variables C12, C24 and adjacency variables A13, A34.]

Joint inference of variables: causal edges Cij and adjacency edges Aij

Joint Inference Approaches:

  • Linear Programming Relaxations, Jaakkola et al., AISTATS 10

slide-8
SLIDE 8

Joint Inference for Structure Discovery

[Figure: graph over X1, X2, X3, X4 annotated with causal edge variables C12, C24 and adjacency variables A13, A34.]

Joint inference of variables: causal edges Cij and adjacency edges Aij

Joint Inference Approaches:

  • Linear Programming Relaxations, Jaakkola et al., AISTATS 10
  • MAX-SAT, Hyttinen et al., UAI 13

slide-9
SLIDE 9

Outline

  • Motivation
  • Problem Formulation
  • Our Approach
  • Preliminary Results


slide-10
SLIDE 10

Probabilistic Joint Model of Causal Structure

[Figure: graph over X1, X2, X3, X4 annotated with causal edge variables C12, C24 and adjacency variables A13, A34.]

Extending joint approaches: probabilistic model over causal structures

slide-11
SLIDE 11

Probabilistic Joint Model of Causal Structure

[Figure: graph over X1, X2, X3, X4 annotated with causal edge variables C12, C24 and adjacency variables A13, A34.]

Independence Tests

slide-12
SLIDE 12

Probabilistic Joint Model of Causal Structure

[Figure: graph over X1, X2, X3, X4 annotated with causal edge variables C12, C24 and adjacency variables A13, A34.]

Combining logical and structural constraints and probabilistic reasoning

slide-13
SLIDE 13

Outline

  • Motivation
  • Problem Formulation
  • Our Approach
  • Preliminary Results


slide-14
SLIDE 14

Probabilistic Soft Logic (PSL)

Bach et al. (2015). “Hinge-loss Markov Random Fields and Probabilistic Soft Logic.” arXiv. Open source software: https://psl.umiacs.umd.edu

Weighted rules:

5.0: Causes(A, B) ^ Causes(B, C) ^ Linked(A, C) → Causes(A, C)

  • Logic-like syntax with probabilistic, soft constraints
  • Describes an undirected graphical model
slide-15
SLIDE 15

Probabilistic Soft Logic (PSL)

Bach et al. (2015), arXiv. Open source software: https://psl.umiacs.umd.edu

Weighted rules:

5.0: Causes(A, B) ^ Causes(B, C) ^ Linked(A, C) → Causes(A, C)

Predicates are continuous random variables!

  • Logic-like syntax with probabilistic, soft constraints
  • Describes an undirected graphical model
slide-16
SLIDE 16

Probabilistic Soft Logic (PSL)

Bach et al. (2015), arXiv. Open source software: https://psl.umiacs.umd.edu

Weighted rules:

5.0: Causes(A, B) ^ Causes(B, C) ^ Linked(A, C) → Causes(A, C)

Predicates are continuous random variables!

Relaxations of logical operators

  • Logic-like syntax with probabilistic, soft constraints
  • Describes an undirected graphical model
slide-17
SLIDE 17

Probabilistic Soft Logic (PSL)

  • Rules instantiated with values from real network

5.0: Causes(A, B) ^ Causes(B, C) ^ Linked(A, C) → Causes(A, C)

[Figure: graph over X1, X2, X3, X4 annotated with causal edge variables C12, C24 and adjacency variables A13, A34.]

slide-18
SLIDE 18

Probabilistic Soft Logic (PSL)

  • Rules instantiated with variables from real network

5.0: Causes(X1, X2) ^ Causes(X2, X4) ^ Linked(X1, X4) → Causes(X1, X4)

[Figure: the instantiated rule connects the variables C12, C24, A14, and C14.]
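
The instantiation step can be sketched in a few lines of Python. This is only an illustration of what grounding a rule template means, not the PSL implementation: the function name `ground_transitivity_rule` and the tuple representation of atoms are hypothetical.

```python
from itertools import permutations

def ground_transitivity_rule(variables, weight=5.0):
    """Instantiate Causes(A,B) ^ Causes(B,C) ^ Linked(A,C) -> Causes(A,C)
    for every ordered triple of distinct variables (hypothetical helper)."""
    ground_rules = []
    for a, b, c in permutations(variables, 3):
        body = [("Causes", a, b), ("Causes", b, c), ("Linked", a, c)]
        head = ("Causes", a, c)
        ground_rules.append((weight, body, head))
    return ground_rules

# Four variables give 4 * 3 * 2 = 24 ground rules.
rules = ground_transitivity_rule(["X1", "X2", "X3", "X4"])
```

The ground rule shown on the slide corresponds to the triple (X1, X2, X4).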

slide-19
SLIDE 19

Soft Logic Relaxation

Bach et al. NIPS 12; Bach et al. UAI 13; Bach et al. (2015), arXiv

5.0: Causes(X1, X2) ^ Causes(X2, X4) ^ Linked(X1, X4) → Causes(X1, X4)

Convex relaxation of implication and distance to rule satisfaction

Linear function
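
A small sketch of the relaxation, assuming the Lukasiewicz operators used in the PSL literature: a conjunction of n truth values becomes max(0, sum - (n - 1)), and the distance to satisfaction of body → head becomes max(0, body - head). The function names and the truth values below are hypothetical.

```python
def soft_and(*truths):
    """Lukasiewicz conjunction of truth values in [0, 1]."""
    return max(0.0, sum(truths) - (len(truths) - 1))

def distance_to_satisfaction(body_truths, head_truth):
    """How far the rule body -> head is from being satisfied.

    Zero when the head is at least as true as the body; otherwise a
    linear function of the truth values.
    """
    return max(0.0, soft_and(*body_truths) - head_truth)

# Causes(X1,X2)=0.9, Causes(X2,X4)=0.8, Linked(X1,X4)=1.0, Causes(X1,X4)=0.3
d = distance_to_satisfaction([0.9, 0.8, 1.0], 0.3)
```

With these made-up values the body has truth 0.7, so the rule is violated by about 0.4.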

slide-20
SLIDE 20

Hinge-loss Markov Random Fields

Bach et al. NIPS 12; Bach et al. UAI 13; Bach et al. (2015), arXiv

$$p(Y \mid X) = \frac{1}{Z(w, X)} \exp\left[ -\sum_{j=1}^{m} w_j \left( \max\{\ell_j(Y, X), 0\} \right)^{\{1,2\}} \right]$$

Conditional random field

slide-21
SLIDE 21

Hinge-loss Markov random fields

Bach et al. NIPS 12; Bach et al. UAI 13; Bach et al. (2015), arXiv

$$p(Y \mid X) = \frac{1}{Z(w, X)} \exp\left[ -\sum_{j=1}^{m} w_j \left( \max\{\ell_j(Y, X), 0\} \right)^{\{1,2\}} \right]$$

Conditional random field

Feature functions are hinge-loss functions

slide-22
SLIDE 22

Hinge-loss Markov random fields

Bach et al. NIPS 12; Bach et al. UAI 13; Bach et al. (2015), arXiv

$$p(Y \mid X) = \frac{1}{Z(w, X)} \exp\left[ -\sum_{j=1}^{m} w_j \left( \max\{\ell_j(Y, X), 0\} \right)^{\{1,2\}} \right]$$

Conditional random field

Feature function for each instantiated rule
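
To make the density concrete, the sketch below evaluates its unnormalized part for a list of instantiated rules. It assumes each rule has already been reduced to a weight, the value of its linear function, and an exponent of 1 or 2; the function names and numbers are hypothetical, and the partition function Z(w, X) is deliberately left out.

```python
import math

def hinge_potential(linear_value, exponent=1):
    """(max{l_j(Y, X), 0})^p with p in {1, 2}."""
    if exponent not in (1, 2):
        raise ValueError("exponent must be 1 or 2")
    return max(linear_value, 0.0) ** exponent

def unnormalized_density(weighted_terms):
    """exp(-sum_j w_j * potential_j); the normalizer Z(w, X) is omitted."""
    energy = sum(w * hinge_potential(l, p) for w, l, p in weighted_terms)
    return math.exp(-energy)

# (weight, linear value, exponent): one violated rule, one satisfied rule.
terms = [(5.0, 0.4, 1), (1.0, -0.2, 2)]
score = unnormalized_density(terms)
```

Only the violated rule contributes to the energy, so the score equals exp(-2.0).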

slide-23
SLIDE 23

Hinge-loss Markov random fields

Bach et al. NIPS 12; Bach et al. UAI 13; Bach et al. (2015), arXiv

$$p(Y \mid X) = \frac{1}{Z(w, X)} \exp\left[ -\sum_{j=1}^{m} w_j \left( \max\{\ell_j(Y, X), 0\} \right)^{\{1,2\}} \right]$$

Conditional random field

5.0: Causes(X1, X2) ^ Causes(X2, X4) ^ Linked(X1, X4) → Causes(X1, X4)

slide-24
SLIDE 24

Hinge-loss Markov random fields

Bach et al. NIPS 12; Bach et al. UAI 13; Bach et al. (2015), arXiv

$$p(Y \mid X) = \frac{1}{Z(w, X)} \exp\left[ -\sum_{j=1}^{m} w_j \left( \max\{\ell_j(Y, X), 0\} \right)^{\{1,2\}} \right]$$

Conditional random field

MAP inference intuition: minimize distances to satisfaction!
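
The intuition can be illustrated with a toy solver. The sketch below minimizes the weighted sum of linear hinge losses by projected subgradient descent, which is a simple stand-in chosen for brevity and not the inference algorithm used by PSL. The function name, the atom representation, and the observed values are all hypothetical.

```python
def map_inference(ground_rules, observed, unknown, steps=200, step_size=0.1):
    """Minimize sum_j w_j * max(0, l_j(y)) over y in [0, 1]^n.

    ground_rules: list of (weight, body_atoms, head_atom).
    observed: dict atom -> fixed truth value.
    unknown: iterable of atoms whose truth values are inferred.
    """
    y = {atom: 0.0 for atom in unknown}

    def value(atom):
        return y[atom] if atom in y else observed[atom]

    for t in range(steps):
        grad = {atom: 0.0 for atom in y}
        for weight, body, head in ground_rules:
            linear = sum(value(a) for a in body) - (len(body) - 1) - value(head)
            if linear > 0:  # the hinge is active, so the rule is violated
                for atom in body:
                    if atom in grad:
                        grad[atom] += weight
                if head in grad:
                    grad[head] -= weight
        rate = step_size / (1 + t)  # diminishing step size
        for atom in y:
            # Step against the subgradient, then project back onto [0, 1].
            y[atom] = min(1.0, max(0.0, y[atom] - rate * grad[atom]))
    return y

c12, c24, a14, c14 = ("Causes", "X1", "X2"), ("Causes", "X2", "X4"), ("Linked", "X1", "X4"), ("Causes", "X1", "X4")
result = map_inference([(5.0, [c12, c24, a14], c14)], {c12: 1.0, c24: 1.0, a14: 1.0}, [c14])
```

With the whole body observed as true, the only way to remove the violation is to raise Causes(X1, X4) to 1.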

slide-25
SLIDE 25

Fast Inference in Hinge-loss MRFs

Convex, continuous inference objective…

Convex optimization!

  • Solved using an efficient message-passing algorithm called Alternating Direction Method of Multipliers
  • Algorithms for weight learning and reasoning with latent variables

Bach et al. NIPS 12; Bach et al. UAI 13; Bach et al. (2015), arXiv. Open source software: https://psl.umiacs.umd.edu

slide-26
SLIDE 26

Encoding PC Algorithm with PSL

  • PC Algorithm:
      • Assumes no latent variables or confounders
      • Constraint-based approach
  • PC with PSL:
      • Use all independence tests
      • All rule weights set to 1.0

slide-27
SLIDE 27

PSL Causal Structure Discovery

Multiple independence tests with various separation sets

No early pruning!
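
A sketch of what testing every pair against every separation set could look like in Python. The function name is hypothetical, and the default cap of 3 follows the maximum separation set size given in the experimental setup; nothing is pruned between rounds, so every test is generated.

```python
from itertools import combinations

def candidate_tests(variables, max_sep_size=3):
    """Yield (x, y, sep_set) for every unordered pair of variables and every
    separation set of size 0..max_sep_size drawn from the remaining ones."""
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        for k in range(min(max_sep_size, len(others)) + 1):
            for sep in combinations(others, k):
                yield x, y, frozenset(sep)

# 6 pairs, each with 1 + 2 + 1 = 4 possible separation sets.
tests = list(candidate_tests(["X1", "X2", "X3", "X4"]))
```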

slide-28
SLIDE 28

PSL Causal Structure Discovery


Colliders in triples using d-separation
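
The PSL rules for this step are not recoverable from the slide text. For reference, the sketch below shows the standard hard version of the collider rule from the PC algorithm: in a triple x - z - y where x and y are not adjacent, z is a collider if it is absent from the set that separated x and y. The function name and data layout are hypothetical.

```python
from itertools import combinations

def find_colliders(adjacent, sepsets):
    """Return triples (x, z, y) to be oriented as x -> z <- y.

    adjacent: dict mapping each variable to the set of its neighbours.
    sepsets: dict mapping frozenset({x, y}) to the set that separated x and y.
    """
    colliders = []
    for z, neighbours in adjacent.items():
        for x, y in combinations(sorted(neighbours), 2):
            if y in adjacent[x]:
                continue  # shielded triple: x and y are adjacent
            # A missing entry is treated as an empty separation set.
            if z not in sepsets.get(frozenset((x, y)), set()):
                colliders.append((x, z, y))
    return colliders

adjacent = {"X1": {"X3"}, "X2": {"X3"}, "X3": {"X1", "X2"}}
sepsets = {frozenset(("X1", "X2")): set()}
colliders = find_colliders(adjacent, sepsets)
```

Here X1 and X2 were separated by the empty set, so X3 is marked as a collider.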

slide-29
SLIDE 29

PSL Causal Structure Discovery


slide-30
SLIDE 30

PSL Causal Structure Discovery


slide-31
SLIDE 31

PSL Causal Structure Discovery


slide-32
SLIDE 32

Outline

  • Motivation
  • Problem Formulation
  • Our Approach
  • Preliminary Results


slide-33
SLIDE 33

Evaluation Dataset


Synthetic Causal DAG Dataset – 2000 examples

Causality Challenge: http://www.causality.inf.ethz.ch/data/LUCAS.html

slide-34
SLIDE 34

Evaluation

  • Experimental setup:
      • G2 independence tests for both PC and PSL
      • Max separation set of size 3
  • Evaluation details:
      • Run PC and PC-PSL algorithms and compare to causal ground truth
      • For PSL, round with threshold selected by cross-validation on causal edges
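
For readers unfamiliar with the G2 test, here is a minimal sketch of the unconditional statistic on a two-way contingency table, G2 = 2 Σ O ln(O / E). It is not the code used in the experiments: a conditional test would sum this statistic over the strata defined by the separation set, and the p-value helper only covers one degree of freedom (a 2x2 table). All names and counts are hypothetical.

```python
import math

def g2_statistic(table):
    """G^2 for a two-way table given as a list of rows of observed counts."""
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute zero
                expected = row_sums[i] * col_sums[j] / total
                g2 += 2.0 * observed * math.log(observed / expected)
    return g2

def p_value_df1(g2):
    """Chi-square survival function for one degree of freedom."""
    return math.erfc(math.sqrt(max(g2, 0.0) / 2.0))

independent = g2_statistic([[25, 25], [25, 25]])
dependent = g2_statistic([[40, 10], [10, 40]])
```

A large G2 (small p-value) is evidence against independence, so the pair would stay linked.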

slide-35
SLIDE 35

Causal Edge Prediction Results

Average causal edge prediction accuracy and F1 score on 3-fold cross validation

Method         Accuracy       F1 Score
PC Algorithm   0.91 ± 0.06    0.53 ± 0.26
PC-PSL         0.94 ± 0.02    0.58 ± 0.19
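
The two metrics can be computed from rounded edge predictions as sketched below. The edge scores, the threshold of 0.5, and the ground truth are toy values made up for illustration; they are not the data or results from the talk.

```python
def round_edges(scores, threshold):
    """Turn soft truth values into hard edge predictions."""
    return {edge: score >= threshold for edge, score in scores.items()}

def accuracy_and_f1(predicted, truth):
    """Accuracy and F1 over the edges listed in truth."""
    tp = sum(1 for e in truth if predicted[e] and truth[e])
    fp = sum(1 for e in truth if predicted[e] and not truth[e])
    fn = sum(1 for e in truth if not predicted[e] and truth[e])
    tn = sum(1 for e in truth if not predicted[e] and not truth[e])
    accuracy = (tp + tn) / len(truth)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1

# Toy soft scores for four candidate causal edges (hypothetical values).
scores = {("X1", "X2"): 0.9, ("X2", "X4"): 0.7, ("X1", "X4"): 0.2, ("X3", "X4"): 0.6}
truth = {("X1", "X2"): True, ("X2", "X4"): True, ("X1", "X4"): False, ("X3", "X4"): False}
accuracy, f1 = accuracy_and_f1(round_edges(scores, 0.5), truth)
```

In the toy example one false positive gives an accuracy of 0.75 and an F1 of 0.8, which shows how F1 can sit well below accuracy when true edges are rare.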
slide-36
SLIDE 36

Summary and Future Directions

  • Joint inference of causal structure using probabilistic, soft constraints
  • Incorporate prior and domain knowledge for causal edges from text-mining, ontological constraints, and variable selection methods
  • Extensive cross-validation experiments on multiple datasets