CREDAL SENTENTIAL DECISION DIAGRAMS
ALESSANDRO ANTONUCCI, ALESSANDRO FACCHINI & LILITH MATTEI
ISIPTA 2019, GHENT, 03/07
Istituto Dalle Molle di Studi per l’Intelligenza Artificiale
▸ Alessandro Antonucci, a senior researcher in probabilistic graphical models and machine learning
▸ Alessandro Facchini, a convenience* logician
*Concept and formulation by Yoichi Hirai
▸ Lilith Mattei, research assistant, wannabe PhD student
FRAMING THE PROBLEM - STATE OF THE ART
WHAT ARE CSDD? FIRST SOME ZOOLOGY
▸ Bayesian nets (Pearl, 1984) → imprecise version? → Credal nets (Cozman, 2000)
▸ Tractable “deep” model? Sum-product nets (Poon & Domingos, 2012) → imprecise version? → Credal SPNs, CSPNs (Mauá et al., 2017)
▸ …and logical constraints? Probabilistic Sentential Decision Diagrams, PSDDs (Kisa et al., 2014) → imprecise version? → CSDDs (here)
▸ Do nice properties of CSPNs (for general CSPNs, for singly connected CSPNs) adapt to CSDDs? YES!
▸ Message of this work: CSDDs stand to PSDDs as CSPNs stand to SPNs
▸ …but what are CSDDs?
WHAT ARE CSDDS?
A FIRST GLIMPSE OF CSDDS
Decision Diagrams
▸ so, what are PSDDs?
▸ actually, what are SDDs?
TOY EXAMPLE (FROM KISA ET AL. 2014)
100 STUDENTS ENROLLING IN 4 CLASSES: LOGIC (L), KNOWLEDGE REPRESENTATION (K), PROBABILITY (P), AI (A)
▸ 16 joint states
▸ Three logical constraints: ϕ := (P ∨ L) ∧ (A → P) ∧ (K → A ∨ L)
▸ 7 states not satisfying the logical constraints (hence never observed)
▸ 1 state logically possible but never observed
[Table: the 16 joint states of L, K, P, A with their enrollment counts; the entries are not recoverable from the extraction]
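The state counts on this slide can be checked by brute force. The following is a minimal sketch that enumerates the 16 joint states and splits them by whether they satisfy ϕ; the function and variable names are illustrative and do not come from the talk.

```python
from itertools import product

def phi(l, k, p, a):
    """ϕ := (P ∨ L) ∧ (A → P) ∧ (K → A ∨ L), reading x → y as (not x) or y."""
    return bool((p or l) and ((not a) or p) and ((not k) or a or l))

# Each state is a tuple of truth values for (L, K, P, A).
states = list(product([False, True], repeat=4))
models = [s for s in states if phi(*s)]          # states allowed by the constraints
non_models = [s for s in states if not phi(*s)]  # logically impossible states

print(len(states), len(models), len(non_models))  # 16 9 7
```

The 9 models are the only states a probabilistic model built on ϕ may give positive probability to.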
CONSTRAINTS FIRST: SDDS
MODELING CONSTRAINTS WITH CIRCUITS: SDDS (DARWICHE 2011)
▸ A Sentential Decision Diagram (SDD) representing ϕ is a “deterministic” logic circuit
ϕ = (P ∨ L) ∧ (P ∨ ¬A) ∧ (A ∨ L ∨ ¬K)
▸ Take a subset of the variables, form a partition of the tautology, e.g.,
⊤ = (¬L ∧ K) ∨ L ∨ (¬L ∧ ¬K)
▸ What does ϕ become when L = ⊥, K = ⊤? P ∧ A
▸ What does ϕ become when L = ⊤? P ∨ ¬A
▸ What does ϕ become when L = ⊥, K = ⊥? P
▸ Hence ϕ = ((¬L ∧ K) ∧ (P ∧ A)) ∨ (L ∧ (P ∨ ¬A)) ∨ ((¬L ∧ ¬K) ∧ P)
▸ Proceed recursively…
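The decomposition step can be verified exhaustively. The sketch below, with illustrative names only, checks two things over all 16 assignments: the three formulas over (L, K) form a partition (exactly one holds in every state), and the disjunction of "partition block ∧ conditioned formula" agrees with ϕ.

```python
from itertools import product

def phi(l, k, p, a):
    """ϕ = (P ∨ L) ∧ (P ∨ ¬A) ∧ (A ∨ L ∨ ¬K)."""
    return bool((p or l) and (p or not a) and (a or l or not k))

# (block of the partition over L, K ; what ϕ becomes over P, A inside that block)
ELEMENTS = [
    (lambda l, k: (not l) and k,     lambda p, a: p and a),
    (lambda l, k: l,                 lambda p, a: p or not a),
    (lambda l, k: (not l) and not k, lambda p, a: p),
]

def check_decomposition():
    for l, k, p, a in product([False, True], repeat=4):
        fired = [bool(block(l, k)) for block, _ in ELEMENTS]
        if sum(fired) != 1:  # blocks must be mutually exclusive and exhaustive
            return False
        decomposed = any(block(l, k) and sub(p, a) for block, sub in ELEMENTS)
        if bool(decomposed) != phi(l, k, p, a):
            return False
    return True

print(check_decomposition())  # True
```

Because exactly one block holds in each state, at most one disjunct of the decomposition can be true, which is what "deterministic" refers to on the slide.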
ϕ = ([(¬L ∧ K) ∨ (L ∧ ⊥)] ∧ [(P ∧ A) ∨ (¬P ∧ ⊥)])
  ∨ ([(L ∧ ⊤) ∨ (¬L ∧ ⊥)] ∧ [(¬P ∧ ¬A) ∨ (P ∧ ⊤)])
  ∨ ([(¬L ∧ ¬K) ∨ (L ∧ ⊥)] ∧ [(P ∧ ⊤) ∨ (¬P ∧ ⊥)])
▸ Paired boxes: AND gates
▸ Decision nodes: OR gates
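To make the reading of the circuit concrete, here is a small sketch of the fully decomposed formula as a data structure. A decision node is an OR over paired boxes and each paired box is an AND. The tuple encoding and the helper names (`lit`, `dec`, `evaluate`) are assumptions made for illustration, not an existing SDD library API.

```python
from itertools import product

TOP, BOT = ("top",), ("bot",)

def lit(var, positive=True):
    return ("lit", var, positive)

def dec(*elements):
    """Decision node: a list of (left box, right box) pairs."""
    return ("dec", list(elements))

def evaluate(node, world):
    kind = node[0]
    if kind == "top":
        return True
    if kind == "bot":
        return False
    if kind == "lit":
        _, var, positive = node
        return world[var] == positive
    # OR gate over the elements, AND gate inside each paired box
    return any(evaluate(p, world) and evaluate(s, world) for p, s in node[1])

L, K, P, A = "L", "K", "P", "A"
root = dec(
    (dec((lit(L, False), lit(K)), (lit(L), BOT)),
     dec((lit(P), lit(A)), (lit(P, False), BOT))),
    (dec((lit(L), TOP), (lit(L, False), BOT)),
     dec((lit(P, False), lit(A, False)), (lit(P), TOP))),
    (dec((lit(L, False), lit(K, False)), (lit(L), BOT)),
     dec((lit(P), TOP), (lit(P, False), BOT))),
)

def phi(w):
    return bool((w[P] or w[L]) and (w[P] or not w[A]) and (w[A] or w[L] or not w[K]))

worlds = [dict(zip((L, K, P, A), bits)) for bits in product([False, True], repeat=4)]
agree = all(evaluate(root, w) == phi(w) for w in worlds)
print(agree)  # True
```

Evaluation is a single bottom-up pass, so its cost is linear in the size of the circuit.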
CONSTRAINTS FIRST, DATA AFTER: PSDD
MODELING DATA + CONSTRAINTS WITH CIRCUITS: PSDDS (KISA ET AL., 2014)
▸ A Probabilistic Sentential Decision Diagram (PSDD) for ϕ is a parametrized SDD:
▸ Parameters learned from data
▸ Inducing a joint probability ℙ(A, L, P, K)
▸ Context-specific independences wrt ℙ derived from the structure
▸ Logically impossible events have zero probability: ℙ(x) > 0 ↔ x ⊧ ϕ
[Figure: the PSDD for ϕ, with annotations ℙ(P|L) and ℙ(L ∧ ¬K)]
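The last bullet can be illustrated with a flattened, one-level sketch of a PSDD for ϕ. Every number below is hypothetical (the learned parameters of the figure did not survive extraction), and the nested sub-circuits are collapsed into explicit PMFs over the models of each box, which is a simplification rather than the authors' construction.

```python
from fractions import Fraction as F
from itertools import product

def phi(l, k, p, a):
    return bool((p or l) and (p or not a) and (a or l or not k))

# One entry per element of the root decision node:
# (weight of the element, PMF over the (L, K) models of its left box,
#  PMF over the (P, A) models of its right box). All numbers are hypothetical.
ELEMENTS = [
    (F(1, 10), {(0, 1): F(1)}, {(1, 1): F(1)}),                       # ¬L ∧ K with P ∧ A
    (F(6, 10), {(1, 0): F(1, 2), (1, 1): F(1, 2)},
               {(0, 0): F(1, 4), (1, 0): F(1, 4), (1, 1): F(1, 2)}),  # L with P ∨ ¬A
    (F(3, 10), {(0, 0): F(1)}, {(1, 0): F(1, 2), (1, 1): F(1, 2)}),   # ¬L ∧ ¬K with P
]

def prob(l, k, p, a):
    """Joint probability of one state: at most one element contributes."""
    return sum(w * lk.get((l, k), 0) * pa.get((p, a), 0) for w, lk, pa in ELEMENTS)

states = list(product([0, 1], repeat=4))  # (L, K, P, A)
total = sum(prob(*s) for s in states)
support_matches = all((prob(*s) > 0) == phi(*s) for s in states)
print(total, support_matches)  # 1 True
```

Within each element the (L, K) part and the (P, A) part are multiplied, which is the kind of context-specific independence the structure encodes.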
DEFINING CSDDS
CREDAL VERSION OF PSDDS: REPLACE PROBABILITY MASS FUNCTIONS (PMFS) WITH CREDAL SETS (CSS)
▸ Credal Sentential Decision Diagrams (CSDDs) for ϕ
▸ Syntax: CS attached to each decision node and to each terminal node ⊤
▸ Semantics: collection of consistent PSDDs
▸ PSDD induces joint P, CSDD induces joint CS (“strong extension”)
[Figure: the CSDD for ϕ, with interval-valued parameters [10/101, 11/101], [30/101, 31/101], [60/101, 61/101], [24/31, 25/31], [3/13, 4/13], [54/61, 55/61], [18/31, 19/31], [12/31, 13/31]]
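A credal set given by probability intervals can be checked with exact arithmetic. The sketch below assumes that the three intervals with denominator 101 belong to the same three-way decision node; the slide's layout is lost, so this grouping is an assumption, and the function names are illustrative.

```python
from fractions import Fraction as F

# Assumed to be the three intervals attached to one decision node.
INTERVALS = [(F(10, 101), F(11, 101)),
             (F(30, 101), F(31, 101)),
             (F(60, 101), F(61, 101))]

def is_nonempty(intervals):
    """Some PMF fits the bounds iff sum of lowers <= 1 <= sum of uppers."""
    return sum(lo for lo, _ in intervals) <= 1 <= sum(hi for _, hi in intervals)

def is_consistent(pmf, intervals):
    """True if pmf sums to one and respects every interval."""
    return sum(pmf) == 1 and all(lo <= t <= hi for t, (lo, hi) in zip(pmf, intervals))

print(is_nonempty(INTERVALS))                                          # True
print(is_consistent([F(1, 10), F(3, 10), F(6, 10)], INTERVALS))        # True
print(is_consistent([F(12, 101), F(29, 101), F(60, 101)], INTERVALS))  # False
```

Picking one consistent PMF at every node yields one of the PSDDs in the collection that gives the CSDD its semantics.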
CSDD’S INFERENCE
Marginal queries:
Given evidence e, calculate the lower probability
ℙ̲(e) = min_{ℙ(X) ∈ 𝕃(X)} ℙ(e)
Conditional queries:
Given available evidence e and a queried variable, calculate the lower conditional probability
ℙ̲(x|e) = min_{ℙ(X) ∈ 𝕃(X)} ℙ(x, e) / ℙ(e)
TWO POLYTIME ALGORITHMS
▸ Adaptation of the CSPN algorithms (Mauá et al.) to CSDDs:
Marginal queries: each node's task uses the results computed in the lower level and the local CSs
Conditional queries: results computed in the lower level, depending on evidence
▸ Needs singly connected topology
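As an illustration of one bottom-up step, the sketch below minimises a weighted sum of lower-level results over a local credal set. It assumes the local CS is specified by probability intervals, as in the CSDD example, and uses the standard greedy solution for that special case; it is not the authors' implementation, and the child values are hypothetical.

```python
from fractions import Fraction as F

def lower_expectation(intervals, values):
    """Minimise sum_i theta_i * values[i] subject to
    lo_i <= theta_i <= hi_i and sum_i theta_i = 1 (set assumed non-empty)."""
    theta = [lo for lo, _ in intervals]  # start from the lower bounds
    slack = 1 - sum(theta)               # mass still to be distributed
    # Give the remaining mass to the smallest values first.
    for i in sorted(range(len(values)), key=lambda i: values[i]):
        extra = min(intervals[i][1] - theta[i], slack)
        theta[i] += extra
        slack -= extra
    return sum(t * v for t, v in zip(theta, values))

intervals = [(F(10, 101), F(11, 101)),
             (F(30, 101), F(31, 101)),
             (F(60, 101), F(61, 101))]
child_values = [1, 0, 1]  # hypothetical results passed up from the lower level

print(lower_expectation(intervals, child_values))  # 70/101
```

Each such step is a small linear optimisation, so a pass that performs one per decision node stays polynomial in the size of the circuit.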
CONCLUSIONS AND FUTURE WORK
▸ CSDDs as a new tool for sensitivity analysis in PSDDs
▸ Robust marginalisation and conditioning (for singly connected circuits) with polynomial complexity
▸ Application to “credal” ML with structured spaces
▸ Complexity and approximation results for multiply connected CSDDs
▸ Hybrid (structured/unstructured) models
▸ Structural learning (trade-off small SDD / likelihood / independences)
▸ CNs vs. CSDDs?
Credal Sentential Decision Diagrams (CSDDs)
Alessandro Antonucci, Alessandro Facchini, Lilith Mattei
{alessandro, alessandro.facchini, lilith}@idsia.ch
A NEW CLASS OF (CREDAL) GRAPHICAL MODELS