CREDAL SENTENTIAL DECISION DIAGRAMS Alessandro Antonucci, a - - PowerPoint PPT Presentation


ISIPTA 2019 GHENT 03/07 ALESSANDRO ANTONUCCI, ALESSANDRO FACCHINI & LILITH MATTEI CREDAL SENTENTIAL DECISION DIAGRAMS Alessandro Antonucci, a senior researcher in probabilistic graphical models and machine learning Istituto


slide-1
SLIDE 1

CREDAL SENTENTIAL DECISION DIAGRAMS

ALESSANDRO ANTONUCCI, ALESSANDRO FACCHINI & LILITH MATTEI

ISIPTA 2019 GHENT 03/07

slide-2
SLIDE 2

Istituto Dalle Molle di Studi per l’Intelligenza Artificiale

▸ Alessandro Antonucci, a senior researcher in probabilistic graphical models and machine learning

▸ Alessandro Facchini, a convenience* logician

*Concept and formulation by Yoichi Hirai

▸ Lilith Mattei, research assistant, wannabe PhD student

slide-3
SLIDE 3

FRAMING THE PROBLEM - STATE OF THE ART

WHAT ARE CSDD? FIRST SOME ZOOLOGY

slide-4
SLIDE 4

FRAMING THE PROBLEM - STATE OF THE ART

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984)

slide-5
SLIDE 5

FRAMING THE PROBLEM - STATE OF THE ART

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

slide-6
SLIDE 6

FRAMING THE PROBLEM - STATE OF THE ART

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

↓ tractable “deep” model?

Sum-product nets (Poon & Domingos, 2012)

slide-7
SLIDE 7

FRAMING THE PROBLEM - STATE OF THE ART

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

↓ tractable “deep” model?

Sum-product nets (Poon & Domingos, 2012) → imprecise version? → Credal SPNs (Mauá et al., 2017)

slide-8
SLIDE 8

FRAMING THE PROBLEM - STATE OF THE ART

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

↓ tractable “deep” model?

Sum-product nets (Poon & Domingos, 2012) → imprecise version? → Credal SPNs (Mauá et al., 2017)

↓ …and logical constraints?

Probabilistic Sentential Decision Diagrams, PSDDs (Kisa et al., 2014)

slide-9
SLIDE 9

FRAMING THE PROBLEM - STATE OF THE ART

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

↓ tractable “deep” model?

Sum-product nets (Poon & Domingos, 2012) → imprecise version? → Credal SPNs (Mauá et al., 2017)

↓ …and logical constraints?

Probabilistic Sentential Decision Diagrams, PSDDs (Kisa et al., 2014) → imprecise version? → CSDD (here)

slide-10
SLIDE 10

FRAMING THE PROBLEM

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

↓ tractable “deep” model?

Sum-product nets (Poon & Domingos, 2012) → imprecise version? → Credal SPNs (Mauá et al., 2017)

↓ …and logical constraints?

Probabilistic Sentential Decision Diagrams, PSDDs (Kisa et al., 2014) → imprecise version? → CSDD (here)

Do nice properties of CSPNs adapt to CSDD?

slide-11
SLIDE 11

FRAMING THE PROBLEM

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

↓ tractable “deep” model?

Sum-product nets (Poon & Domingos, 2012) → imprecise version? → Credal SPNs (Mauá et al., 2017)

↓ …and logical constraints?

Probabilistic Sentential Decision Diagrams, PSDDs (Kisa et al., 2014) → imprecise version? → CSDD (here)

Do nice properties of CSPNs adapt to CSDD? YES!

slide-12
SLIDE 12

FRAMING THE PROBLEM

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

↓ tractable “deep” model?

Sum-product nets (Poon & Domingos, 2012) → imprecise version? → Credal SPNs (Mauá et al., 2017)

  • Fast marginal inference algorithm for general CSPNs

  • Fast conditional inference algorithm for singly connected CSPNs

↓ …and logical constraints?

Probabilistic Sentential Decision Diagrams, PSDDs (Kisa et al., 2014) → imprecise version? → CSDD (here)

Do nice properties of CSPNs adapt to CSDD? YES!

slide-13
SLIDE 13

FRAMING THE PROBLEM

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

↓ tractable “deep” model?

Sum-product nets (Poon & Domingos, 2012) → imprecise version? → Credal SPNs (Mauá et al., 2017)

↓ …and logical constraints?

Probabilistic Sentential Decision Diagrams, PSDDs (Kisa et al., 2014) → imprecise version? → CSDD (here)

Do nice properties of CSPNs adapt to CSDD? YES!

Message of this work: CSDDs are to PSDDs what CSPNs are to SPNs

slide-14
SLIDE 14

FRAMING THE PROBLEM

WHAT ARE CSDD? FIRST SOME ZOOLOGY

Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)

↓ tractable “deep” model?

Sum-product nets (Poon & Domingos, 2012) → imprecise version? → Credal SPNs (Mauá et al., 2017)

↓ …and logical constraints?

Probabilistic Sentential Decision Diagrams, PSDDs (Kisa et al., 2014) → imprecise version? → CSDDs (here)

Do nice properties of CSPNs adapt to CSDD? YES!

Message of this work: CSDDs are to PSDDs what CSPNs are to SPNs

…but what are CSDDs?

slide-15
SLIDE 15

WHAT ARE CSDD’S ?

A FIRST GLIMPSE TO CSDD

slide-16
SLIDE 16

WHAT ARE CSDD’S ?

A FIRST GLIMPSE TO CSDD

  • CSDD = Credal version of Probabilistic Sentential

Decision Diagrams

slide-17
SLIDE 17

WHAT ARE CSDD’S ?

A FIRST GLIMPSE TO CSDD

  • CSDD = Credal version of Probabilistic Sentential

Decision Diagrams

▸ so, what are PSDDs?

slide-18
SLIDE 18

WHAT ARE CSDD’S ?

A FIRST GLIMPSE TO CSDD

  • CSDD = Credal version of Probabilistic Sentential

Decision Diagrams

▸ so, what are PSDDs? ▸ actually, what are SDDs?

slide-19
SLIDE 19

TOY EXAMPLE (FROM KISA ET AL. 2014)

100 STUDENTS ENROLLING IN 4 CLASSES: LOGIC (L), KNOWLEDGE REPRESENTATION (K), PROBABILITY (P), AI (A)

▸ 16 joint states

▸ Three logical constraints: (P ∨ L), (A → P), (K → A ∨ L)

[Table: truth table of the 16 joint states over L, K, P, A]

slide-20
SLIDE 20

TOY EXAMPLE (FROM KISA ET AL. 2014)

▸ 16 joint states

▸ Three logical constraints

ϕ := (P ∨ L) ∧ (A → P) ∧ (K → A ∨ L)

[Table: truth table of the 16 joint states over L, K, P, A]

100 STUDENTS ENROLLING IN 4 CLASSES: LOGIC (L), KNOWLEDGE REPRESENTATION (K), PROBABILITY (P), AI (A)

slide-21
SLIDE 21

TOY EXAMPLE (FROM KISA ET AL. 2014)

▸ 16 joint states

▸ Three logical constraints

▸ 7 states not satisfying the logical constraints (hence never observed)

ϕ := (P ∨ L) ∧ (A → P) ∧ (K → A ∨ L)

[Table: truth table of the 16 joint states over L, K, P, A]

100 STUDENTS ENROLLING IN 4 CLASSES: LOGIC (L), KNOWLEDGE REPRESENTATION (K), PROBABILITY (P), AI (A)

slide-22
SLIDE 22

TOY EXAMPLE (FROM KISA ET AL. 2014)

▸ 16 joint states

▸ Three logical constraints

▸ 7 states not satisfying the logical constraints (hence never observed)

▸ 1 state logically possible but never observed

ϕ := (P ∨ L) ∧ (A → P) ∧ (K → A ∨ L)

[Table: truth table of the 16 joint states over L, K, P, A]

100 STUDENTS ENROLLING IN 4 CLASSES: LOGIC (L), KNOWLEDGE REPRESENTATION (K), PROBABILITY (P), AI (A)
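
The state counts in the toy example can be checked by brute force. The following is a minimal sketch that enumerates all assignments of (L, K, P, A) and tests ϕ on each; the function and variable names are illustrative and not taken from the slides.

```python
from itertools import product


def phi(l, k, p, a):
    """ϕ := (P ∨ L) ∧ (A → P) ∧ (K → A ∨ L)."""
    return (p or l) and ((not a) or p) and ((not k) or a or l)


# All joint states, in the order (L, K, P, A).
states = list(product([False, True], repeat=4))
models = [s for s in states if phi(*s)]
violating = [s for s in states if not phi(*s)]

print(len(states), len(models), len(violating))  # 16 9 7
```

Nine states satisfy ϕ; since one of them is never observed, eight distinct states appear in the data.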

slide-23
SLIDE 23

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

slide-24
SLIDE 24

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)

slide-25
SLIDE 25

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

Partition: ⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)

slide-26
SLIDE 26

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

Take a subset of the variables and form a partition of the tautology, e.g.,

⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)

ϕ = (P ∨ L) ∧ (P ∨ ¬A) ∧ (A ∨ L ∨ ¬K)

Partition elements: ¬L ∧ K | L | ¬L ∧ ¬K

slide-27
SLIDE 27

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

Take a subset of the variables and form a partition of the tautology, e.g.,

⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)

ϕ = (P ∨ L) ∧ (P ∨ ¬A) ∧ (A ∨ L ∨ ¬K)

Partition elements: ¬L ∧ K | L | ¬L ∧ ¬K

What does ϕ become when L = ⊥, K = ⊤?

slide-28
SLIDE 28

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

Take a subset of the variables and form a partition of the tautology, e.g.,

⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)

ϕ = (P ∨ L) ∧ (P ∨ ¬A) ∧ (A ∨ L ∨ ¬K)

Partition elements: ¬L ∧ K | L | ¬L ∧ ¬K

With L = ⊥, K = ⊤ (element ¬L ∧ K), ϕ becomes P ∧ A

slide-29
SLIDE 29

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

Take a subset of the variables and form a partition of the tautology, e.g.,

⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)

ϕ = (P ∨ L) ∧ (P ∨ ¬A) ∧ (A ∨ L ∨ ¬K)

Partition elements: ¬L ∧ K | L | ¬L ∧ ¬K

With L = ⊥, K = ⊤ (element ¬L ∧ K), ϕ becomes P ∧ A

What does ϕ become when L = ⊤?

What does ϕ become when L = ⊥, K = ⊥?

slide-30
SLIDE 30

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

Take a subset of the variables and form a partition of the tautology, e.g.,

⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)

ϕ = (P ∨ L) ∧ (P ∨ ¬A) ∧ (A ∨ L ∨ ¬K)

Partition elements: ¬L ∧ K | L | ¬L ∧ ¬K

ϕ restricted to each element: P ∧ A | P ∨ ¬A | P

slide-31
SLIDE 31

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

Take a subset of the variables and form a partition of the tautology, e.g.,

⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)

Partition elements: ¬L ∧ K | L | ¬L ∧ ¬K

ϕ restricted to each element: P ∧ A | P ∨ ¬A | P

[(¬L ∧ K) ∧ (P ∧ A)] ∨ [L ∧ (P ∨ ¬A)] ∨ [(¬L ∧ ¬K) ∧ P] = ϕ

slide-32
SLIDE 32

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

A Sentential Decision Diagram representing ϕ is a “deterministic” logic circuit

Take a subset of the variables and form a partition of the tautology, e.g.,

⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)

Partition elements: ¬L ∧ K | L | ¬L ∧ ¬K

ϕ restricted to each element: P ∧ A | P ∨ ¬A | P

[(¬L ∧ K) ∧ (P ∧ A)] ∨ [L ∧ (P ∨ ¬A)] ∨ [(¬L ∧ ¬K) ∧ P] = ϕ

Proceed recursively…
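
The decomposition of ϕ over the partition of ⊤ can be verified with a truth table. The sketch below checks two things: that exactly one partition element holds for every assignment of (L, K), and that the three-branch disjunction agrees with ϕ everywhere. The names `prime` and `sub` follow common SDD terminology and, like the helper names, are assumptions not used on the slides.

```python
from itertools import product


def phi(l, k, p, a):
    """CNF form of ϕ as written on the slides."""
    return (p or l) and (p or not a) and (a or l or not k)


# Each branch pairs a partition element over (L, K) with ϕ restricted to it.
branches = [
    (lambda l, k: (not l) and k, lambda p, a: p and a),
    (lambda l, k: l, lambda p, a: p or not a),
    (lambda l, k: (not l) and not k, lambda p, a: p),
]


def decomposition(l, k, p, a):
    """Disjunction over the branches of (element AND restricted formula)."""
    return any(prime(l, k) and sub(p, a) for prime, sub in branches)


def check():
    for l, k, p, a in product([False, True], repeat=4):
        # Mutually exclusive and exhaustive: exactly one element holds.
        if sum(bool(prime(l, k)) for prime, _ in branches) != 1:
            return False
        if bool(decomposition(l, k, p, a)) != bool(phi(l, k, p, a)):
            return False
    return True


print(check())  # True
```

Because exactly one element is true in any state, at most one input of the top OR gate can be active, which is the sense in which the circuit is “deterministic”.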

slide-33
SLIDE 33

CONSTRAINTS FIRST: SDDS

MODELING CONSTRAINTS WITH CIRCUITS: SDD’S (DARWICHE 2011)

[(¬L ∧ K) ∨ (L ∧ ⊥)] ∧ [(P ∧ A) ∨ (¬P ∧ ⊥)]

∨ [(L ∧ ⊤) ∨ (¬L ∧ ⊥)] ∧ [(¬P ∧ ¬A) ∨ (P ∧ ⊤)]

∨ [(¬L ∧ ¬K) ∨ (L ∧ ⊥)] ∧ [(P ∧ ⊤) ∨ (¬P ∧ ⊥)]

= ϕ

Paired boxes: AND gates

Decision nodes: OR gates

slide-34
SLIDE 34

CONSTRAINTS FIRST, DATA AFTER: PSDD

MODELING DATA + CONSTRAINTS WITH CIRCUITS: PSDD’S (KISA, 2014)

▸ A Probabilistic Sentential Decision Diagram (PSDD) for ϕ is a parametrized SDD:

slide-35
SLIDE 35

CONSTRAINTS FIRST, DATA AFTER: PSDD

MODELING DATA + CONSTRAINTS WITH CIRCUITS: PSDD’S (KISA, 2014)

▸ A Probabilistic Sentential Decision Diagram (PSDD) for ϕ is a parametrized SDD:

▸ Parameters learned from data

slide-36
SLIDE 36

CONSTRAINTS FIRST, DATA AFTER: PSDD

MODELING DATA + CONSTRAINTS WITH CIRCUITS: PSDD’S (KISA, 2014)

▸ A Probabilistic Sentential Decision Diagram (PSDD) for ϕ is a parametrized SDD:

▸ Parameters learned from data

▸ Inducing a joint probability ℙ(A, L, P, K)

slide-37
SLIDE 37

CONSTRAINTS FIRST, DATA AFTER: PSDD

MODELING DATA + CONSTRAINTS WITH CIRCUITS: PSDD’S (KISA, 2014)

▸ A Probabilistic Sentential Decision Diagram (PSDD) for ϕ is a parametrized SDD:

▸ Parameters learned from data

▸ Inducing a joint probability ℙ(A, L, P, K)

Diagram annotations: ℙ(L ∧ ¬K)

slide-38
SLIDE 38

CONSTRAINTS FIRST, DATA AFTER: PSDD

MODELING DATA + CONSTRAINTS WITH CIRCUITS: PSDD’S (KISA, 2014)

▸ A Probabilistic Sentential Decision Diagram (PSDD) for ϕ is a parametrized SDD:

▸ Parameters learned from data

▸ Inducing a joint probability ℙ(A, L, P, K)

Diagram annotations: ℙ(L ∧ ¬K), ℙ(P|L)

slide-39
SLIDE 39

CONSTRAINTS FIRST, DATA AFTER: PSDD

MODELING DATA + CONSTRAINTS WITH CIRCUITS: PSDD’S (KISA, 2014)

▸ A Probabilistic Sentential Decision Diagram (PSDD) for ϕ is a parametrized SDD:

▸ Parameters learned from data

▸ Inducing a joint probability ℙ(A, L, P, K)

▸ Context-specific independences wrt ℙ derived from the structure

Diagram annotations: ℙ(L ∧ ¬K), ℙ(P|L)

slide-40
SLIDE 40

CONSTRAINTS FIRST, DATA AFTER: PSDD

MODELING DATA + CONSTRAINTS WITH CIRCUITS: PSDD’S (KISA, 2014)

▸ A Probabilistic Sentential Decision Diagram (PSDD) for ϕ is a parametrized SDD:

▸ Parameters learned from data

▸ Inducing a joint probability ℙ(A, L, P, K)

▸ Context-specific independences wrt ℙ derived from the structure

▸ Logically impossible events have zero probability: ℙ(x) > 0 ↔ x ⊧ ϕ

Diagram annotations: ℙ(L ∧ ¬K), ℙ(P|L)
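
How a parametrized circuit induces a joint distribution supported exactly on the models of ϕ can be illustrated with a one-level simplification of the toy circuit. This is only a sketch: in an actual PSDD the branch distributions are themselves sub-circuits, and every number below is hypothetical, chosen for illustration rather than taken from the slides.

```python
from fractions import Fraction as F
from itertools import product


def phi(l, k, p, a):
    return (p or l) and ((not a) or p) and ((not k) or a or l)


# Hypothetical parameters, for illustration only.
theta = [F(1, 10), F(3, 10), F(6, 10)]  # root decision node, one weight per branch
prime = [  # distributions over (L, K), one per branch
    {(0, 1): F(1)},
    {(1, 0): F(1, 5), (1, 1): F(4, 5)},
    {(0, 0): F(1)},
]
sub = [  # distributions over (P, A), one per branch
    {(1, 1): F(1)},
    {(0, 0): F(3, 5), (1, 0): F(3, 10), (1, 1): F(1, 10)},
    {(1, 0): F(1, 10), (1, 1): F(9, 10)},
]


def joint(l, k, p, a):
    """Mixture over branches of (L, K)-factor times (P, A)-factor."""
    return sum(t * pr.get((l, k), 0) * sb.get((p, a), 0)
               for t, pr, sb in zip(theta, prime, sub))


states = list(product([0, 1], repeat=4))
total = sum(joint(*s) for s in states)
support = {s for s in states if joint(*s) > 0}
models = {s for s in states if phi(*s)}
print(total, support == models)  # 1 True
```

Within each branch the (L, K) factor and the (P, A) factor multiply, which is the kind of context-specific independence the structure encodes.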

slide-41
SLIDE 41

DEFINING CSDD’S

CREDAL VERSION OF PSDD’S:

slide-42
SLIDE 42

DEFINING CSDD’S

CREDAL VERSION OF PSDD’S: REPLACE PMF’S WITH CS’S

slide-43
SLIDE 43

DEFINING CSDD’S

CREDAL VERSION OF PSDD’S: REPLACE PMF’S WITH CS’S

▸ Credal Sentential Decision Diagrams (CSDDs) for ϕ

Interval annotations in the diagram: [10/101, 11/101], [30/101, 31/101], [60/101, 61/101], [24/31, 25/31], [3/13, 4/13], [54/61, 55/61], [18/31, 19/31], [12/31, 13/31]

slide-44
SLIDE 44

DEFINING CSDD’S

CREDAL VERSION OF PSDD’S: REPLACE PMF’S WITH CS’S

▸ Credal Sentential Decision Diagrams (CSDDs) for ϕ

▸ Syntax: CS attached to each decision node and to each terminal node

Interval annotations in the diagram: [10/101, 11/101], [30/101, 31/101], [60/101, 61/101], [24/31, 25/31], [3/13, 4/13], [54/61, 55/61], [18/31, 19/31], [12/31, 13/31]

slide-45
SLIDE 45

DEFINING CSDD’S

CREDAL VERSION OF PSDD’S: REPLACE PMF’S WITH CS’S

▸ Credal Sentential Decision Diagrams (CSDDs) for ϕ

▸ Syntax: CS attached to each decision node and to each terminal node

▸ Semantics: collection of consistent PSDDs

Interval annotations in the diagram: [10/101, 11/101], [30/101, 31/101], [60/101, 61/101], [24/31, 25/31], [3/13, 4/13], [54/61, 55/61], [18/31, 19/31], [12/31, 13/31]

slide-46
SLIDE 46

DEFINING CSDD’S

CREDAL VERSION OF PSDD’S: REPLACE PMF’S WITH CS’S

▸ Credal Sentential Decision Diagrams (CSDDs) for ϕ

▸ Syntax: CS attached to each decision node and to each terminal node

▸ Semantics: collection of consistent PSDDs

▸ PSDD induces joint P, CSDD induces joint CS (“Strong extension”)

Interval annotations in the diagram: [10/101, 11/101], [30/101, 31/101], [60/101, 61/101], [24/31, 25/31], [3/13, 4/13], [54/61, 55/61], [18/31, 19/31], [12/31, 13/31]
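
The interval annotations are consistent with learning each local credal set by an Imprecise Dirichlet Model (the model named on the poster) with equivalent sample size s = 1. The sketch below reproduces the three root intervals under that assumption. Both the value of s and the counts (10, 30, 60) are inferred from the intervals themselves, not stated on the slides, and the function name is illustrative.

```python
from fractions import Fraction


def idm_interval(count, total, s=1):
    """IDM bounds for one outcome: [n / (N + s), (n + s) / (N + s)]."""
    lower = Fraction(count, total + s)
    upper = Fraction(count + s, total + s)
    return lower, upper


# Assumed branch counts at the root, inferred from the slide's intervals.
root_counts = [10, 30, 60]
for n in root_counts:
    lo, hi = idm_interval(n, sum(root_counts))
    print(f"[{lo}, {hi}]")
# [10/101, 11/101]
# [30/101, 31/101]
# [60/101, 61/101]
```

The width of every interval is s / (N + s), so nodes that see fewer observations get wider, more cautious intervals.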

slide-47
SLIDE 47

CSDD’S INFERENCE

CSDD’S INFERENCE

slide-48
SLIDE 48

CSDD’S INFERENCE

CSDD’S INFERENCE

Marginal queries:

Given evidence e, calculate

ℙ̲(e) = min_{ℙ(X) ∈ 𝕃(X)} ℙ(e)

ℙ̲(x|e) = min_{ℙ(X) ∈ 𝕃(X)} ℙ(x, e) / ℙ(e)

slide-49
SLIDE 49

CSDD’S INFERENCE

CSDD’S INFERENCE

Marginal queries:

Given evidence e, calculate

ℙ̲(e) = min_{ℙ(X) ∈ 𝕃(X)} ℙ(e)

Conditional queries:

Given available evidence e and a queried variable X, calculate

ℙ̲(x|e) = min_{ℙ(X) ∈ 𝕃(X)} ℙ(x, e) / ℙ(e)

slide-50
SLIDE 50

TWO POLYTIME ALGORITHMS

▸ Adaptation of CSPNs algorithms (Mauá et al.) to CSDDs:

Marginal queries:

  • Bottom-up propagation of LP task’s results

  • Coefficients of each LP task are computed in the lower level

  • Feasible regions are the local CSs

Conditional queries:

  • Decisional version of original task

  • Bottom-up propagation of LP task’s results

  • Coefficients of each LP task are computed in the lower level, depending on evidence

  • Feasible regions are the local CSs

CSDD’S INFERENCE
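
The LP task at a single decision node can be sketched as follows, assuming the local credal set is given by probability intervals as in the toy diagram. For that special case the linear program has a simple greedy solution, so no LP solver is needed. The function name and the coefficient vectors are hypothetical; in the circuit the coefficients would be the values already propagated from the children.

```python
from fractions import Fraction as F


def lower_expectation(intervals, coeffs):
    """min sum_i theta_i * c_i  s.t.  l_i <= theta_i <= u_i, sum_i theta_i = 1."""
    lows = [lo for lo, _ in intervals]
    slack = 1 - sum(lows)
    if slack < 0 or sum(hi for _, hi in intervals) < 1:
        raise ValueError("intervals do not contain a probability mass function")
    theta = list(lows)
    # Give the leftover mass to the smallest coefficients first.
    for i in sorted(range(len(coeffs)), key=lambda i: coeffs[i]):
        extra = min(slack, intervals[i][1] - intervals[i][0])
        theta[i] += extra
        slack -= extra
    return sum(t * c for t, c in zip(theta, coeffs))


root = [(F(10, 101), F(11, 101)), (F(30, 101), F(31, 101)), (F(60, 101), F(61, 101))]
print(lower_expectation(root, [1, 0, 0]))    # 10/101
print(-lower_expectation(root, [-1, 0, 0]))  # 11/101
```

The second call obtains an upper bound by minimizing the negated coefficients. For credal sets that are not described by intervals, a general LP solver would be required instead of the greedy step.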

slide-51
SLIDE 51

TWO POLYTIME ALGORITHMS

▸ Adaptation of CSPNs algorithms (Mauá et al.) to CSDDs:

Marginal queries:

  • Bottom-up propagation of LP task’s results

  • Coefficients of each LP task are computed in the lower level

  • Feasible regions are the local CSs

Conditional queries:

  • Decisional version of original task

  • Bottom-up propagation of LP task’s results

  • Coefficients of each LP task are computed in the lower level, depending on evidence

  • Feasible regions are the local CSs

CSDD’S INFERENCE

Needs singly connected topology
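
The “decisional version” of a conditional query asks, for a threshold µ, whether the lower conditional probability exceeds µ; repeating that test while bisecting on µ recovers the bound itself. The sketch below illustrates the idea on an explicit, hypothetical list of extreme points, each a pair (ℙ(x, e), ℙ(¬x, e)) with ℙ(e) > 0. It does not traverse a circuit, and all names and numbers are assumptions for illustration.

```python
def decision(vertices, mu):
    """True iff min over the set of (1 - mu) * P(x, e) - mu * P(not x, e) is > 0."""
    return min((1 - mu) * pxe - mu * pnxe for pxe, pnxe in vertices) > 0


def lower_conditional(vertices, iterations=50):
    """Bisection on mu; each step answers one decision problem."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if decision(vertices, mid):
            lo = mid  # the lower conditional probability exceeds mid
        else:
            hi = mid
    return lo


# Hypothetical extreme points, each a pair (P(x, e), P(not x, e)).
vertices = [(0.2, 0.3), (0.3, 0.3), (0.1, 0.4)]
print(round(lower_conditional(vertices), 6))  # 0.2
```

The three points have conditional probabilities 0.4, 0.5 and 0.2, so the bisection converges to the smallest of them. In a CSDD the inner minimum would be computed by the bottom-up LP propagation rather than by enumerating extreme points.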

slide-52
SLIDE 52

CONCLUSIONS AND FUTURE WORK

▸ CSDDs as a new tool for sensitivity analysis in PSDDs

▸ Robust marginalisation and conditioning (for singly connected circuits) with polynomial complexity

▸ Application to “credal” ML with structured spaces

▸ Complexity and approximation results for multiply connected CSDDs

▸ Hybrid (structured/unstructured) models

▸ Structural learning (trade-off small SDD / likelihood / independences)

▸ CNs vs. CSDDs?

slide-54
SLIDE 54

Credal Sentential Decision Diagrams (CSDDs)

Alessandro Antonucci, Alessandro Facchini, Lilith Mattei

{alessandro, alessandro.facchini, lilith}@idsia.ch
A NEW CLASS OF (CREDAL) GRAPHICAL MODELS
  • Bayesian nets (BNs) as classical (precise) probabilistic graphical models
  • With imprecise probabilities? Credal networks (CNs, Cozman, 2000)
  • With deep structure (and tractable inference)? Sum-product networks (SPNs, Poon & Domingos, 2011)
  • With deep structure and imprecise probabilities? Credal sum-product networks (CSPNs, Mauá et al., 2017)
  • With deep structure and embedding logical constraints? Probabilistic sentential decision diagrams (PSDDs, Kisa et al., 2014)
  • Deep structure, imprecise probabilities and logical constraints? Credal sentential decision diagrams (CSDDs, this paper)
TOY EXAMPLE: CLASSES ENROLLMENT
  • Data about 100 students in four classes: Logic, Knowledge, Probability and Artificial Intelligence
  • Logical constraints for classes: φ := (P ∨ L) ∧ (A → P) ∧ (K → A ∨ L)
  • Out of 2^4 = 16 joint configurations, only eight appear in the data set (seven are logically impossible, one is possible but unobserved)
  • Robust learning of a model over (L, K, P, A)?
  • Consistent with the logical constraints φ?
  • The solution is a CSDD!
[Table: counts (#) of the joint configurations of (L, K, P, A) in the data set; the entries are not recoverable from the extraction]
FROM SDDS TO CSDDS (THROUGH PSDDS)
  • Logical skeleton? φ as a circuit alternating OR and AND gates
  • This is a sentential decision diagram (SDD, Choi & Darwiche, 2013)
  • Probabilistic model? Probability mass functions annotating the OR gates of the SDD (PSDDs)
  • A PSDD is a joint probability mass function consistent with the constraints: P(L, K, P, A) such that P(l, k, p, a) = 0 iff (l, k, p, a) ⊭ φ
  • CSDD? Credal version of PSDD: credal sets instead of mass functions
  • Credal sets on OR gates and terminal nodes
  • Semantics: all PSDDs with parameters consistent with the local credal sets
  • Strong extension K(L, K, P, A) as the joint credal set of all the joint mass functions induced by the consistent PSDDs
  • CSDD inference? Lower/upper bounds wrt the strong extension
  • Base theorem: for each z, the lower probability P̲(z) > 0 iff z ⊨ φ, and the upper probability P̄(z) = 0 iff z ⊭ φ
  • Learning CSDDs? Parameters are conditional probabilities; Imprecise Dirichlet Model to learn local (conditional) credal sets
  • Data scarcity issue on the leaves justifies the imprecise approach!
MARGINAL QUERIES
  • Circuit traversal from leaves in reverse topological order
  • Every time a decision node is processed, an LP task whose feasible region is the local credal set of the node must be solved
  • Analogous to Mauá et al. (2017) for CSPNs, with additional support for logical constraints
CONDITIONAL QUERIES
  • Conditional queries solved by generalized Bayes’ rule (GBR)
  • Associated decision problem: deciding whether or not, for a given µ ∈ [0, 1], P̲(x|e) > µ
  • As P(x|e) + P(¬x|e) = 1 for each P(X) ∈ K_r(X), and assuming that P(e) > 0, this corresponds to: min_{P(X) ∈ K_r(X)} [(1 − µ) P(x, e) − µ P(¬x, e)] > 0
  • Recursive formulation (for singly connected circuits): min_{[θ_1, ..., θ_k] ∈ K_r(P)} Σ_{i=1}^{k} π(p_i) σ(s_i) θ_i > 0
  • where π(p_i) = min_{P_{p_i}(Z) ∈ K_{p_i}(Z)} [(1 − µ) P_{p_i}(x, e_l) − µ P_{p_i}(¬x, e_l)]
  • and σ(s_i) = P̄_{s_i}(e_r) (upper probability) if π(p_i) < 0, and P̲_{s_i}(e_r) (lower probability) otherwise
  • Circuit traversal from leaves (as for marginal queries)
  • LP tasks on decision nodes whose coefficients are computed with marginal queries
  • Bracketing scheme to solve GBR
  • Again analogous to the Mauá et al. (2017) result for CSPNs
CONCLUSIONS & OUTLOOKS
  • CSDDs as a new tool for sensitivity analysis in PSDDs
  • Fast robust marginalisation and conditioning (but conditioning works for singly connected circuits only)
  • Complexity results and approximate algorithms are needed
  • CNs vs. CSDDs? Credal classification with CSDDs?
REFERENCES
  • Hoifung Poon and Pedro Domingos. Sum-product networks: a new deep architecture. In IEEE ICCV Workshops, pages 689-690. IEEE, 2011.
  • Denis Mauá, Fabio Cozman, Diarmaid Conaty, and Cassio de Campos. Credal sum-product networks. In Proceedings of ISIPTA ’17, pages 205-216, 2017.
  • Denis Deratani Mauá, Diarmaid Conaty, Fabio Cozman, Katja Poppenhaeger, and Cassio de Campos. Robustifying sum-product networks. International Journal of Approximate Reasoning, 2018.
  • Doga Kisa, Guy Van den Broeck, Arthur Choi, and Adnan Darwiche. Probabilistic sentential decision diagrams. In Proceedings of the Fourteenth International Conference on the Principles of Knowledge Representation and Reasoning, 2014.
  • Fabio Cozman. Credal networks. Artificial Intelligence, 120:199-233, 2000.
  • Arthur Choi and Adnan Darwiche. Dynamic minimization of sentential decision diagrams. In Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013.