SLIDE 1

Bayesian networks Ontologies BNs and Ontologies

Intégration de connaissances ontologiques pour l'apprentissage des réseaux bayésiens
(Integrating ontological knowledge for learning Bayesian networks)

Montassar Ben Messaoud (1,2), Mouna Ben Ishak (1,2), Philippe Leray (2), Nahla Ben Amor (1)

(1) LARODEC, ISG, Université de Tunis, Tunisia
(2) Connaissances et Décision, LINA UMR 6241, Nantes, France

Colloque Apprentissage Artificiel & Fouille de Données, 28 June 2012, Univ. Paris 13

Montassar Ben Messaoud, Mouna Ben Ishak, Philippe Leray, Nahla Ben Amor Apprentissage RB et Ontologies 1/22

SLIDE 2

Motivations

  • Knowledge-based systems aim to make expertise available for decision making and information sharing
  • To solve some complex problems, combining different knowledge-based systems can be very powerful

Our idea
Combine two KBSs, Bayesian networks and ontologies, so that each helps the other

SLIDE 5

Outline

1. Bayesian networks: BN definition, BN learning
2. Ontologies: ontology definition, ontology learning
3. BNs and Ontologies: existing work, our proposal

SLIDE 6

Bayesian network definition

Definition [Pearl, 1985]
A Bayesian network (BN) is defined by:

  • one qualitative description of the (conditional) dependences / independences between variables: a directed acyclic graph (DAG)
  • one quantitative description of these dependences: conditional probability distributions (CPDs)

SLIDE 7

Example

Variables: B = Burglary, E = Earthquake, A = Alarm, R = Radio, T = TV

  • one topological order: B, E, A, R, T (not unique)

P(Burglary) = [0.001 0.999]
P(Earthquake) = [0.0001 0.9999]

P(Alarm | Burglary, Earthquake)
Burglary, Earthquake   Y,Y    Y,N    N,Y    N,N
Alarm = Y              0.75   0.10   0.99   0.10
Alarm = N              0.25   0.90   0.01   0.90

P(Radio | Earthquake)
Earthquake   Y      N
Radio = Y    0.99   0.01
Radio = N    0.01   0.99

P(TV | Radio)
Radio    Y      N
TV = Y   0.99   0.50
TV = N   0.01   0.50

SLIDE 8

Consequence

Chain rule
$P(S) = P(S_1) \times P(S_2 \mid S_1) \times P(S_3 \mid S_1, S_2) \times \cdots \times P(S_n \mid S_1, \ldots, S_{n-1})$

Consequence with a BN (variables indexed in a topological order of the DAG)
$P(S_i \mid S_1, \ldots, S_{i-1}) = P(S_i \mid \mathrm{parents}(S_i))$, so $P(S) = \prod_{i=1}^{n} P(S_i \mid \mathrm{parents}(S_i))$

  • The (global) joint probability distribution is decomposed into a product of (local) conditional distributions
  • BN = compact representation of the joint distribution P(S), given some information about dependence relationships between variables
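As a minimal sketch (not part of the original slides), the factorization can be checked numerically on the Burglary / Earthquake example. The CPT values are copied from the example slide; the Python names (`joint`, `bernoulli`, the dictionaries) and the encoding Y = True, N = False are assumptions made for illustration.

```python
from itertools import product

# CPTs copied from the example slide; True stands for "Y", False for "N".
p_burglary = {True: 0.001, False: 0.999}
p_earthquake = {True: 0.0001, False: 0.9999}
# P(Alarm = Y | Burglary, Earthquake), keyed by (burglary, earthquake)
p_alarm_yes = {(True, True): 0.75, (True, False): 0.10,
               (False, True): 0.99, (False, False): 0.10}
p_radio_yes = {True: 0.99, False: 0.01}  # P(Radio = Y | Earthquake)
p_tv_yes = {True: 0.99, False: 0.50}     # P(TV = Y | Radio)


def bernoulli(p_yes, value):
    """Probability of `value` for a binary variable with P(Y) = p_yes."""
    return p_yes if value else 1.0 - p_yes


def joint(b, e, a, r, t):
    """P(B, E, A, R, T) as the product of the five local distributions."""
    return (p_burglary[b] * p_earthquake[e]
            * bernoulli(p_alarm_yes[(b, e)], a)
            * bernoulli(p_radio_yes[e], r)
            * bernoulli(p_tv_yes[r], t))


# Summing the product over all 2^5 configurations must give 1.
total = sum(joint(*values) for values in product([True, False], repeat=5))
```

With five binary variables, the full joint table needs 31 free parameters, whereas the five CPTs above need only 1 + 1 + 4 + 2 + 2 = 10.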

SLIDE 10

Markov equivalence

Definition
B1 and B2 are Markov equivalent iff both describe exactly the same conditional (in)dependence statements

Graphical properties
  • B1 and B2 have the same skeleton, V-structures and inferred edges
  • All the equivalent graphs (= equivalence class) can be summarized by one partially directed acyclic graph, named CPDAG or essential graph
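The graphical characterization can be turned into a small check. The sketch below is an illustration only: it assumes both DAGs are given as lists of `(parent, child)` edges over the same node set, and the function names are hypothetical. It compares skeletons and V-structures (a V-structure being a pair of non-adjacent parents sharing a child).

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of the graph: a set of unordered node pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Triples (a, child, b) with a -> child <- b and a, b not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """Same skeleton and same V-structures."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


chain = [("A", "B"), ("B", "C")]           # A -> B -> C
reversed_chain = [("C", "B"), ("B", "A")]  # A <- B <- C
collider = [("A", "B"), ("C", "B")]        # A -> B <- C
```

Here `chain` and `reversed_chain` are Markov equivalent, while `collider` shares their skeleton but adds a V-structure, so it belongs to a different equivalence class.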

SLIDE 12

Markov equivalence

[Figure: two graphs over the nodes A, S, T, L, B, O, X, D]

SLIDE 13

BN structure learning

How to build / learn a Bayesian network?

1. The DAG is known: how to determine the CPDs?
  • from experts: knowledge elicitation
  • from complete data / incomplete data

2. The DAG is unknown: how to determine it (or the CPDAG)?
  • from complete data / incomplete data

SLIDE 14

Structure learning is a complex task

Size of the "solution" space
The number of possible DAGs with n variables is super-exponential w.r.t. n (Robinson 77):

$NS(n) = \begin{cases} 1, & n = 0 \text{ or } 1 \\ \sum_{i=1}^{n} (-1)^{i+1} \binom{n}{i} 2^{i(n-i)} NS(n-i), & n > 1 \end{cases}$

$NS(5) = 29281$, $NS(10) = 4.2 \times 10^{18}$

An exhaustive search is impossible!

One thousand millennia $\approx 3.2 \times 10^{13}$ seconds
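The recurrence is easy to evaluate. The following is a short sketch, assuming the standard form of Robinson's recurrence with exponent i(n − i); the function name `count_dags` is chosen here for illustration.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled DAGs on n nodes (Robinson's recurrence)."""
    if n <= 1:
        return 1
    return sum((-1) ** (i + 1) * comb(n, i) * 2 ** (i * (n - i)) * count_dags(n - i)
               for i in range(1, n + 1))
```

`count_dags(5)` returns 29281, and `count_dags(10)` is about 4.2 × 10^18, in line with the values quoted on the slide.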

SLIDE 15

Structure learning algorithms

How to search for a good BN?

Constraint-based methods
BN = independence model ⇒ find conditional independences (CI) in the data in order to build the DAG

Score-based methods
BN = probabilistic model that must fit the data as well as possible ⇒ search the DAG space in order to maximize a scoring function

Hybrid methods

Some issues
Identifiability: learning algorithms cannot distinguish between structures in the same equivalence class


A priori knowledge: learning algorithms can use a priori knowledge to reduce the search space (white list, black list, node ordering, repetition of local structures, ...)

SLIDE 18

Ontology

Basics
Definition [Gruber, 1993]

  • shared understanding within a community of people
  • declarative specification of entities and their relationships with each other
  • separates the domain knowledge from the operational knowledge

Construction / evolution: expertise and/or machine learning
Reasoning: description logic reasoners

SLIDE 19

Elements of an ontology

  • C: classes (concepts)
  • P: attributes (properties)
  • H: hierarchical structure (is-a, part-of relations)
  • R: other semantic relationships
  • I: instances (individuals)
  • A: axioms (logic statements)

[Figure, built up step by step: an example ontology of catastrophes.
Classes: Catastrophes, Natural, Man-made, Landslide, Tsunami, Fire, Volcano, Earthquake, Flood.
Attributes shown for two classes: Name, Localization, Date, Nb_killed, and Name, Magnitude, Localization, Date, Nb_killed.
Hierarchy: is-a links between the classes.
Other semantic relationships: "Causes" links between classes.
Instances: Aleppo, Shaanxi, Haiti (is-instance links).]

SLIDE 25

Ontology learning : subtasks

Ontology population
Get new instances of concept(s) already present in the ontology

Ontology enrichment
Update (add or modify) concepts, properties and relations in a given ontology

Evolution vs. revolution
  • Evolution (ontology continuity): add new knowledge
  • Revolution (ontology discontinuity): modify existing knowledge

SLIDE 28

Ontology learning : methods

[Buitelaar & Cimiano, 2008] "Bridging the gap between text and knowledge"

Natural Language Processing [Buitelaar et al., 2003, Velardi et al., 2005]
  • Concept extraction
  • Taxonomy learning (is-a, part-of)
  • Population (information extraction)

Machine Learning
  • Clustering for taxonomy learning [Bisson et al., 2000]
  • Association rules for relation discovery [Maedche & Staab, 2000]
  • ILP for relation discovery [Rudolph et al., 2007]

SLIDE 32

Existing work

Bayesian Network ⇒ Ontology
  • BayesOWL [Ding & Peng, 2004]
  • OntoBayes [Yang & Calmet, 2005]
  • PR-OWL [Costa & Laskey, 2006]
Use of BNs for probabilistic modeling and reasoning (no learning)

Ontology ⇒ Bayesian Network
  • BN "basic" construction using ontologies [Devitt et al., 2006]
  • Ontology-based semi-automatic construction of BN in E-health applications [Jeon & Ko, 2007]
Use of ontologies to manually build a BN (without data)

SLIDE 37

Our proposal

"Bridging the gap between data and knowledge"

Ontology + data ⇒ Bayesian Network ⇒ Ontology
BN structure learning, then ontology evolution

2 variations

1. Causal BNs
  • semantical causal discovery: SemCaDo 1.0 [Ben Messaoud et al., 2009]
  • causal discovery for ontology evolution: SemCaDo 2.0 [Ben Messaoud et al., 2009]

2. Object-Oriented BNs
  • OOBN structure learning and ontology evolution: O2C [Ben Ishak et al., 2011]

SLIDE 40

Conclusion

  • BN structure learning is NP-hard and needs prior knowledge
  • Ontology evolution is difficult and is based on text mining
  • Cooperation can help with both tasks

Originality of our proposal
  • BN structure learning: use an ontology instead of expert knowledge, which separates expert knowledge acquisition from structure learning
  • Ontology evolution: use BN structure learning to discover relationships directly from data

Difficulties
  • No similar work or benchmark for a comparative study
  • SemCaDo: one concept = attribute = node
  • O2C: OOBN learning is too complex

SLIDE 43

Future work

What next?
  • Creation of benchmarks
  • Generalize SemCaDo or O2C with more general models

Ideas:
  • Multi-Entity BNs [Laskey, 2006]
  • Relational BNs [Getoor et al., 2007]

SLIDE 44

Thank you for your attention

philippe.leray@univ-nantes.fr

SLIDE 45

Bayesian network Structure Learning Causal Bayesian Networks SemCaDo algorithm O2C algorithm

Outline

4. Bayesian network structure learning: constraint-based methods, score-based methods, local search methods
5. Causal Bayesian networks: definition, causal BN structure learning
6. SemCaDo algorithm: definition, experimental study
7. O2C algorithm: definition, algorithm

SLIDE 47

Constraint-based methods

Two reference algorithms
  • Pearl and Verma: IC, IC*
  • Spirtes, Glymour and Scheines: SGS, PC, CI, FCI

Common principle
  • Build an undirected graph describing direct dependences between variables (χ² tests), either by adding edges (Pearl and Verma) or by deleting edges (SGS)
  • Detect V-structures (from previous statistical tests)
  • Propagate some edge orientations (inferred edges) in order to obtain a CPDAG

SLIDE 48

Constraint-based methods

Some drawbacks
  • Reliability of CI tests conditional on several variables (with a limited amount of data)
    SGS heuristic: if the sample size N is below 10 × df (df = degrees of freedom of the test), then declare dependence
  • Combinatorial explosion of the number of tests
    PC heuristic: begin with order 0 ($X_A \perp X_B$), then order 1 ($X_A \perp X_B \mid X_C$), etc.

SLIDE 49

PC algorithm

Example: a target BN over the nodes A, S, T, L, B, O, X, D is used to generate 5000 samples.
[Figure at each step: target BN (left) and current graph (right)]

Step 0: undirected complete graph

Step 1 (order 0): delete all order 0 independences discovered (χ²)
S⊥A, L⊥A, B⊥A, O⊥A, X⊥A, D⊥A, T⊥S, L⊥T, O⊥B, X⊥B

Step 1 (order 1): delete all order 1 independences discovered (χ²)
T⊥A|O, O⊥S|L, X⊥S|L, B⊥T|S, X⊥T|O, D⊥T|O, ...

Step 1 (order 2): delete all order 2 independences discovered (χ²)
D⊥S|{L, B}, X⊥O|{T, L}, D⊥O|{T, L}

Step 2: search for V-structures (χ²)
One V-structure, T → O ← L, is discovered

Step 3: inferred edges
None in this example

From CPDAG to DAG
Orientation of the remaining undirected edges (only constraint: do not create any new V-structure)

Obtained DAG versus target one
The χ² test with 5000 samples fails to discover A → T, O → X and O → D
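The edge-deletion and V-structure steps can be sketched as follows. This is an illustrative simplification, not the authors' implementation: the χ² test is replaced by a user-supplied function `independent(x, y, cond)` (a hypothetical independence oracle), and all names are assumptions.

```python
from itertools import combinations


def pc_skeleton(nodes, independent):
    """Start from the complete graph and delete edges by increasing order.

    `independent(x, y, cond)` returns True when x and y are judged
    independent given the set `cond` (oracle or statistical test).
    """
    adjacent = {v: set(nodes) - {v} for v in nodes}
    sepset = {}
    order = 0
    # Stop when no node has enough neighbours to form a conditioning set.
    while any(len(adjacent[x]) - 1 >= order for x in nodes):
        for x in nodes:
            for y in sorted(adjacent[x]):
                others = sorted(adjacent[x] - {y})
                for cond in combinations(others, order):
                    if independent(x, y, set(cond)):
                        adjacent[x].discard(y)
                        adjacent[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        order += 1
    edges = {frozenset((x, y)) for x in nodes for y in adjacent[x]}
    return edges, sepset


def find_v_structures(nodes, edges, sepset):
    """Orient x -> z <- y when x, y are non-adjacent and z did not separate them."""
    found = set()
    for z in nodes:
        neighbours = sorted(v for v in nodes if frozenset((v, z)) in edges)
        for x, y in combinations(neighbours, 2):
            pair = frozenset((x, y))
            if pair not in edges and z not in sepset[pair]:
                found.add((x, z, y))
    return found


def collider_oracle(x, y, cond):
    """Oracle for T -> O <- L: T and L are independent unless O is given."""
    return {x, y} == {"T", "L"} and "O" not in cond


nodes = ["T", "O", "L"]
edges, sepset = pc_skeleton(nodes, collider_oracle)
v_structs = find_v_structures(nodes, edges, sepset)
```

On this three-node oracle the sketch keeps the edges T-O and O-L and recovers the V-structure T → O ← L mentioned in step 2. With real data, the oracle would be replaced by a statistical test, whose errors explain the missed edges reported above.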

SLIDE 56

Score-based methods

How to search for a good BN?

Constraint-based methods
BN = independence model ⇒ find conditional independences (CI) in the data in order to build the DAG

Score-based methods
BN = probabilistic model that must fit the data as well as possible ⇒ search the DAG space in order to maximize a scoring function

Hybrid methods

SLIDE 57

Notion of score

General principle: Occam's razor
  • Pluralitas non est ponenda sine necessitate: plurality should not be posited without necessity
  • Frustra fit per plura quod potest fieri per pauciora: it is pointless to do with more what can be done with fewer

= Parsimony principle: find a model
  • fitting the data D: likelihood L(D|θ, B)
  • as simple as possible: dimension of B, Dim(B)

SLIDE 58

Score examples

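AIC and BIC
Compromise between likelihood and complexity
Application of the AIC (Akaike 70) and BIC (Schwarz 78) criteria, where $\theta^{MV}$ denotes the maximum-likelihood parameters:

$S_{AIC}(B, D) = \log L(D \mid \theta^{MV}, B) - \mathrm{Dim}(B)$

$S_{BIC}(B, D) = \log L(D \mid \theta^{MV}, B) - \frac{1}{2} \mathrm{Dim}(B) \log N$

Bayesian scores: BD, BDe, BDeu
$S_{BD}(B, D) = P(B, D)$ (Cooper and Herskovits 92)
BDe = BD + score equivalence (Heckerman 94)

$S_{BD}(B, D) = P(B) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{\Gamma(\alpha_{ij})}{\Gamma(N_{ij} + \alpha_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(N_{ijk} + \alpha_{ijk})}{\Gamma(\alpha_{ijk})}$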

SLIDE 59

Score properties

Two important properties

Decomposability
$\mathrm{(Global)Score}(B, D) = \sum_{i=1}^{n} \mathrm{(local)score}(X_i, pa_i)$

Score equivalence
If two BNs B1 and B2 are Markov equivalent, then S(B1, D) = S(B2, D)
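Both properties can be observed on a toy example with the BIC score. The sketch below is an illustration under stated assumptions: data are complete and discrete, given as a list of dictionaries; a structure maps each variable to its list of parents; Dim(B) is taken as the sum over variables of (r_i − 1) × q_i; all function and variable names are hypothetical.

```python
from collections import Counter
from math import log


def local_bic(data, child, parents, arity):
    """Local BIC term: maximised log-likelihood minus (1/2) * dim * log N."""
    n = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marginal = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(count * log(count / marginal[config])
                 for (config, _), count in joint.items())
    q = 1  # number of parent configurations
    for p in parents:
        q *= arity[p]
    dim = (arity[child] - 1) * q
    return loglik - 0.5 * dim * log(n)


def bic(data, structure, arity):
    """Decomposability: the global score is the sum of the local scores."""
    return sum(local_bic(data, child, parents, arity)
               for child, parents in structure.items())


# Toy data in which Y always equals X.
data = [{"X": 0, "Y": 0}] * 50 + [{"X": 1, "Y": 1}] * 50
arity = {"X": 2, "Y": 2}
score_x_to_y = bic(data, {"X": [], "Y": ["X"]}, arity)
score_y_to_x = bic(data, {"X": ["Y"], "Y": []}, arity)
score_empty = bic(data, {"X": [], "Y": []}, arity)
```

The two Markov-equivalent structures X → Y and Y → X receive the same score, and both beat the empty graph on this strongly dependent data.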

SLIDE 61

Heuristic exploration of search space

Search space and heuristics
B space
  restriction to tree space: Chow & Liu, MWST
  DAG with node ordering: K2 algorithm
  greedy search
  genetic algorithms, ...
E space
  greedy equivalence search

slide-62
SLIDE 62

Restriction to tree space

Principle
What is the best tree connecting all the nodes, i.e. the one maximizing a weight defined for each possible edge?
Answer: maximal weighted spanning tree (MWST)
weight = mutual information [Chow & Liu, 1968]
$W(X_A, X_B) = \sum_{a,b} \frac{N_{ab}}{N} \log \frac{N_{ab} N}{N_{a.} N_{.b}}$
weight = any local score variation [Heckerman, 1994]
$W(X_A, X_B) = \mathrm{score}(X_A, Pa(X_A) = X_B) - \mathrm{score}(X_A, \emptyset)$
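The MWST principle with mutual-information weights can be sketched as follows. This is a minimal illustration using Kruskal's algorithm with a union-find forest; the helper names and the toy data are assumptions for the example, and any maximum spanning tree routine would do.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(data, a, b):
    """Empirical mutual information: sum_ab (N_ab / N) log(N_ab N / (N_a. N_.b))."""
    n = len(data)
    n_ab = Counter((row[a], row[b]) for row in data)
    n_a = Counter(row[a] for row in data)
    n_b = Counter(row[b] for row in data)
    return sum(
        (c / n) * math.log(c * n / (n_a[va] * n_b[vb]))
        for (va, vb), c in n_ab.items()
    )

def chow_liu_tree(data, variables):
    """Undirected maximum weight spanning tree over the variables."""
    parent = {v: v for v in variables}  # union-find forest

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    edges = sorted(
        combinations(variables, 2),
        key=lambda e: mutual_information(data, *e),
        reverse=True,
    )
    tree = []
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # the edge joins two components, so it creates no cycle
            parent[ra] = rb
            tree.append((a, b))
    return tree

# Hypothetical toy data: B copies A, C is independent of both
data = [
    {"A": 0, "B": 0, "C": 0},
    {"A": 0, "B": 0, "C": 1},
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 1, "C": 1},
]
tree = chow_liu_tree(data, ["A", "B", "C"])
```

The strongly dependent pair (A, B) is selected first; C is then attached by a zero-weight edge because a spanning tree must connect every node.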

slide-64
SLIDE 64

Restriction to tree space

Remarks
MWST returns an undirected tree
This undirected tree = CPDAG of all the directed trees with this skeleton
Obtain a directed tree by (randomly) choosing one root and orienting the edges with a depth-first search over this tree
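The orientation step can be sketched as a traversal from the chosen root, directing every edge away from it. The function name and the edge-list encoding are assumptions for this illustration; on a tree, any traversal from the root yields the same orientation.

```python
def orient_tree(edges, root):
    """Direct the edges of an undirected tree away from the given root."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    directed, visited, stack = [], {root}, [root]
    while stack:  # explicit-stack traversal starting at the root
        node = stack.pop()
        for neighbour in adjacency.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                directed.append((node, neighbour))  # parent -> child
                stack.append(neighbour)
    return directed
```

For the chain A - B - C, choosing B as root gives B → A and B → C, while choosing A gives A → B → C; neither contains a V-structure, so both directed trees belong to the same equivalence class.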

slide-65
SLIDE 65

Example : obtained DAG vs. target one

[Figure: obtained DAG vs. target DAG, both over the nodes A, S, T, L, B, O, X, D]

MWST can discover neither cycles nor V-structures (tree space!)

slide-67
SLIDE 67

Greedy search

Principle
Exploration of the search space with traversal operators
  add edge
  invert edge
  delete edge
while respecting the DAG definition (no cycle)
Exploration can begin from any given DAG
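A minimal sketch of the three operators and of the resulting hill-climbing loop is given below. The function names are assumptions for this example, and `score` stands for any scoring function supplied by the caller (for instance BIC); the search stops at the first structure that no single operator improves, which is a local optimum.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Kahn's algorithm: the graph is a DAG iff every node can be removed."""
    indegree = {v: 0 for v in nodes}
    for _, child in edges:
        indegree[child] += 1
    queue = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while queue:
        v = queue.pop()
        removed += 1
        for parent, child in edges:
            if parent == v:
                indegree[child] -= 1
                if indegree[child] == 0:
                    queue.append(child)
    return removed == len(nodes)

def neighbours(nodes, edges):
    """All DAGs reachable with one add, delete or invert operation."""
    edges = frozenset(edges)
    candidates = []
    for a, b in permutations(nodes, 2):
        if (a, b) in edges:
            candidates.append(edges - {(a, b)})               # delete edge
            candidates.append((edges - {(a, b)}) | {(b, a)})  # invert edge
        elif (b, a) not in edges:
            candidates.append(edges | {(a, b)})               # add edge
    return [g for g in candidates if is_acyclic(nodes, g)]

def greedy_search(nodes, start, score):
    """Move to the best neighbour until no neighbour improves the score."""
    current = frozenset(start)
    while True:
        best = max(neighbours(nodes, current), key=score, default=current)
        if score(best) <= score(current):
            return current  # local optimum
        current = best
```

Because the loop only compares neighbouring structures, the result depends on the starting DAG, which is why the start point matters in practice.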

slide-68
SLIDE 68

Example : obtained DAG vs. target one

[Figure: obtained DAG vs. target DAG, both over the nodes A, S, T, L, B, O, X, D]

start = empty graph. GS result = local optimum :-(

slide-69
SLIDE 69

Example : obtained DAG vs. target one

[Figure: obtained DAG vs. target DAG, both over the nodes A, S, T, L, B, O, X, D]

start = MWST result. GS result is better

slide-71
SLIDE 71

What about changing our search space

Preliminaries
  IC/PC result = CPDAG
  MWST result = CPDAG
  Score-based methods do not distinguish equivalent DAGs
Search in E
  E = CPDAG space
  Better properties: YES
    2 equivalent structures = 1 unique structure in E
  Better size: NO
    the size of E is almost the same as that of the DAG space: the asymptotic ratio is 3.7 [Gillispie & Perlman, 2001]

slide-72
SLIDE 72

Greedy Equivalence Search

Principle [Chickering, 2002]
Greedy search in E
Phase 1: add edges until convergence
Phase 2: delete edges until convergence

slide-73
SLIDE 73

Add edge examples in E

[Figure: add-edge examples over X1, X2, X3, X4: E0 and the neighborhood of E0, then E1 and the neighborhood of E1]

slide-74
SLIDE 74

Score-based methods

How to search for a good BN?
Constraint-based methods
  BN = independence model
  ⇒ find conditional independences (CI) in the data in order to build the DAG
Score-based methods
  BN = probabilistic model that must fit the data as well as possible
  ⇒ search the DAG space in order to maximize a scoring/fitness function
Hybrid methods

slide-75
SLIDE 75

Hybrid methods = local search methods

Local search and global learning
Search one local neighborhood for a given node T
Reiterate for each T
Learn the global structure from this local information
Which neighborhood?
  PC(T): Parents and Children of T (without distinction)
  MB(T): Markov Blanket of T (parents, children and spouses)
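To make the two neighborhoods concrete, the sketch below reads PC(T) and MB(T) off a known DAG. The edge list is the classic ASIA benchmark structure, which is assumed here because the node names A, S, T, L, B, O, X, D of the earlier example figures suggest it; in structure learning these sets are of course estimated from data rather than read from the graph.

```python
def parents_children(edges, t):
    """PC(T): parents and children of t, without distinction."""
    return {a for a, b in edges if b == t} | {b for a, b in edges if a == t}

def markov_blanket(edges, t):
    """MB(T): parents, children and spouses (other parents of t's children)."""
    children = {b for a, b in edges if a == t}
    spouses = {a for a, b in edges if b in children and a != t}
    return parents_children(edges, t) | spouses

# Assumed ASIA structure (directed edges parent -> child)
asia = [
    ("A", "T"), ("S", "L"), ("S", "B"), ("T", "O"),
    ("L", "O"), ("O", "X"), ("O", "D"), ("B", "D"),
]
```

For node L this gives PC(L) = {S, O} and MB(L) = {S, O, T}: T enters the blanket only as a spouse, through the common child O.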

slide-77
SLIDE 77

Local search identification

Identification of MB(T) or PC(T)
  IAMB [Aliferis et al., 2002]
  MMPC [Tsamardinos et al., 2003], ...
Hybrid structure learning algorithms
  MMHC [Tsamardinos et al., 2006] = MMPC + Greedy search

slide-78
SLIDE 78

Outline ...

4 Bayesian network Structure Learning
  constraint-based methods
  score-based methods
  local search methods
5 Causal Bayesian Networks
  definition
  causal BN structure learning
6 SemCaDo algorithm
  definition
  experimental study
7 O2C algorithm
  definition
  algorithm

slide-79
SLIDE 79

A BN is not a causal model

Usual BN
  A → B does not imply a direct causal relationship between A and B
  Only edges from the CPDAG represent causal relationships ∗
Confusion
  When the DAG is given by an expert, this graph is very often causal
  When the DAG is learnt from data, there is no reason for it to be causal!
Causal BN
  Each A → B represents one direct causal relationship, i.e. A is the direct cause which generates B

slide-82
SLIDE 82

Intervention vs. Observation

Probabilistic inference :

we observe B = b, we compute P(A|B = b)

Causal inference [Pearl 00]:

we manipulate/intervene upon B : do(B = b)

example with A → B
  P(A|do(B = b)) = P(A), P(B|do(A = a)) = P(B|A = a)
example with A ← B
  P(A|do(B = b)) = P(A|B = b), P(B|do(A = a)) = P(B)
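The A → B case can be checked numerically. The probability tables below are invented for the illustration; the three functions simply encode Bayes' rule for observation and graph surgery (removing the arrows into the manipulated variable) for intervention.

```python
# Hypothetical parameters of a two-node causal BN A -> B
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}

def observe_b(b):
    """P(A | B = b), obtained by Bayes' rule."""
    joint = {a: p_a[a] * p_b_given_a[a][b] for a in p_a}
    z = sum(joint.values())
    return {a: v / z for a, v in joint.items()}

def do_b(b):
    """P(A | do(B = b)): forcing B cuts the arrow A -> B, so A keeps its prior."""
    return dict(p_a)

def do_a(a):
    """P(B | do(A = a)) = P(B | A = a), because A has no parents to cut."""
    return dict(p_b_given_a[a])
```

With these numbers, observing B = 1 raises the probability of A = 1 from 0.3 to about 0.77, whereas forcing B = 1 leaves it at 0.3.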

slide-83
SLIDE 83

Causal structure learning

Usual situation : observational data

whatever the method, the right result is the CPDAG
partial determination of the causal structure

How to find a full causal graph?
Use only experimental data, and decide at every step which experiment is the most informative to perform (active learning [Murphy, 2001], ...)
Use only observational data, for a very specific distribution (LiNGAM models [Hoyer et al., 2008])
Another idea: MyCaDo algorithm [Meganck et al., 2006]
  Use (already existing) observational data to find the CPDAG
  Complete the orientation with experimental data

slide-86
SLIDE 86

MyCaDo algorithm

[Figure: MyCaDo workflow. Observational data from the system feed a BN structure learning algorithm, which outputs a CPDAG. The MyCaDo algorithm then goes through choice of the experiment, execution of the experiment (experimental data) and analysis of the results, and outputs a causal Bayesian network.]

(1) Choice of the experiment = which variable M to manipulate?
the one potentially orienting the most edges, taking into account experiment/observation costs

slide-87
SLIDE 87

MyCaDo algorithm

(2) Experimentation
do(M = m) for all possible values m
observe all candidate variables C (C–M)

slide-88
SLIDE 88

MyCaDo algorithm

(3) Result analysis: P(C|M) (obs.) ≃ P(C|do(M)) (exp.)?
if equal, C ← M (M causes C), else M ← C (C causes M)
orient some other edges by applying specific rules
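The comparison in step (3) can be sketched as below. The function name, the nested-dictionary encoding of the conditional tables and the fixed tolerance `tol` are all assumptions made for this illustration; the slide only states an approximate equality, and a real implementation would more likely rely on a statistical test.

```python
def orient_edge(p_c_given_m, p_c_given_do_m, tol=0.05):
    """Orient the undirected edge C - M from observational and experimental tables.

    Both arguments map each value m of M to a distribution over C.
    'tol' is a hypothetical threshold standing in for the slide's "≃".
    """
    gap = max(
        abs(p_c_given_m[m][c] - p_c_given_do_m[m][c])
        for m in p_c_given_m
        for c in p_c_given_m[m]
    )
    # Agreement means manipulating M changes C as observation predicts: M causes C
    return "M -> C" if gap <= tol else "C -> M"
```

If manipulating M reproduces the observed conditional distribution of C, the edge is oriented from M to C; if C stays at a distribution unrelated to the forced value of M, the edge is oriented from C to M.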

slide-90
SLIDE 90

SemCaDo assumptions

BN ⇔ Ontology general assumptions
  Nodes ⇔ Concepts
  Random variables ⇔ Concept attributes
  Causal dependencies ⇔ Semantic causal relations
  Data ⇔ Concept-attribute instances
SemCaDo specific assumptions
  Causal relations concern concepts sharing the same semantic type
  Ontology continuity (evolution)

slide-92
SLIDE 92

Example

[Figure: example ontology. Concepts: Catastrophes, Natural, Man-made, Landslide, Tsunami, Fire, Volcano, Earthquake, Flood, linked by eight is-a relations and six Causes relations.]

Our causal BN will represent the grey part of the ontology

slide-93
SLIDE 93

SemCaDo three main steps

[Figure: SemCaDo workflow. 1st step, structure learning: observational data and the domain ontology yield a PDAG. 2nd step, causal discovery: choice of experiment, performing the experiment (interventional data) and analysis of the results, yielding a causal BN (CBN). 3rd step, ontology evolution: yields an enriched ontology.]

slide-94
SLIDE 94

1st step: initial BN structure learning

Extraction of the causal relationships (R_c) from the ontology
Integration of these edges as constraints in the structure learning algorithm [De Campos et al., 2007]
Continuity: these edges will not be "questioned" during learning
Interest
Ontology helps in reducing the search task complexity

slide-96
SLIDE 96

2nd step: serendipitous causal discovery

Experimentations are needed in order to find the causal orientation of some edges
Our solution: MyCaDo [Meganck et al., 2006], an iterative causal discovery process
Adaptation to take into account ontological knowledge: Rada distance on H between one set of concepts and their most specific common subsumer
Interest
Ontology helps in potentially orienting the most unexpected (serendipitous) links
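The ontological ingredient can be illustrated on the catastrophe example. In the sketch below the assignment of each leaf concept to Natural or Man-made is an assumption (the figure layout is not recoverable from the text), the hierarchy is assumed to be a tree, and summing the path lengths over the concepts is one plausible way to aggregate the distance, not necessarily the one used by SemCaDo.

```python
# Assumed is-a hierarchy H (child -> parent); leaf placement is hypothetical
is_a = {
    "Natural": "Catastrophes", "Man-made": "Catastrophes",
    "Earthquake": "Natural", "Tsunami": "Natural", "Volcano": "Natural",
    "Landslide": "Natural", "Flood": "Natural", "Fire": "Man-made",
}

def path_to_root(concept):
    """The concept followed by its successive subsumers up to the root."""
    path = [concept]
    while path[-1] in is_a:
        path.append(is_a[path[-1]])
    return path

def most_specific_common_subsumer(concepts):
    """Deepest concept that subsumes every concept of the set."""
    common = set(path_to_root(concepts[0]))
    for c in concepts[1:]:
        common &= set(path_to_root(c))
    # walking up from the first concept, the first common ancestor is the deepest
    return next(a for a in path_to_root(concepts[0]) if a in common)

def rada_to_subsumer(concepts):
    """Sum of is-a path lengths from each concept to the common subsumer."""
    mscs = most_specific_common_subsumer(concepts)
    return sum(path_to_root(c).index(mscs) for c in concepts)
```

Under these assumptions, a link between Earthquake and Tsunami (distance 2, common subsumer Natural) is semantically less surprising than a link between Earthquake and Fire (distance 4, common subsumer Catastrophes), so the latter would count as the more serendipitous discovery.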

slide-98
SLIDE 98

3rd Step: Ontology evolution process

[Stojanovic et al., 2002]
Interest
BN structure learning from data helps in discovering new relations in the ontology

slide-100
SLIDE 100

Experimental study (1)

Benchmark: no existing real benchmark & system :-(
  BN graph: random generation (50 to 200 nodes)
  Ontology:
    Causal relationships: BN edges
    Hierarchy of concepts: generated by clustering BN nodes
  Data is generated by using the BN as a generative model

Experimental protocol
  The hierarchy of concepts and 10% to 40% of the existing causal relationships are given as inputs
  Semantic gain: cumulative Rada distance of the discovered relationships

slide-102
SLIDE 102

Experimental study (2)

Ontology helps structure learning
SemCaDo performs better, in fewer steps, than MyCaDo

slide-103
SLIDE 103

Experimental study (2)

Structure learning helps ontology evolution
Original causal relations are discovered first and can be added to the ontology

slide-105
SLIDE 105

Object Oriented Bayesian Networks

An extension of BNs using the object paradigm

[Bangsø and Wuillemin, 2000a; Bangsø and Wuillemin, 2000b; Koller and Pfeffer, 1997]
Support several aspects of object-oriented modeling (e.g., inheritance, instantiation)
Designed to model large and complex domains

OOBN structure learning: OO-SEM
This algorithm [Langseth and Nielsen, 2003] is based on 2 steps
  Generation of a prior OOBN based on prior expert knowledge
    Grouping nodes into instantiations and instantiations into classes
    Giving prior information about the candidate interfaces
  Adaptation of the Structural EM algorithm to learn the final structure

slide-107
SLIDE 107

OOBN structure learning

Drawbacks
  This prior knowledge is not always easy to obtain
  The expert should be familiar with object-oriented modeling
Our idea
  Harness the representation capabilities of ontologies in order to generate the prior OOBN structure

Ontologies                  OOBNs
Concepts Cp                 Classes
Properties Pcpi             Real nodes
Inheritance relations HR    Class hierarchies
Semantic relations SR       Links / Interfaces

slide-109
SLIDE 109

The 2OC approach

[Figure: the 2OC approach. A morphing process turns the ontology into a prior OOBN; a learning process uses data to turn the prior OOBN into the final OOBN; a change detection process produces proposals of evolution for the ontology. Annotation: it seems interesting to refine the ontology used initially!]

(1) ontology to prior OOBN [Ben Ishak et al., 2011a]
(2) final OOBN to ontology [Ben Ishak et al., 2011b]

slide-110
SLIDE 110

Onto2PriorOOBN algorithm

Ontology-based generation of a prior OOBN
Ontology graph traversal and morphing into a prior OOBN structure
3 steps
  Initialization step: generate the global OOBN class and one class for each concept
  Discovery step: define the input, internal and output sets for each class of the OOBN
  Closing step: define the instances to add to the global OOBN class

slide-114
SLIDE 114

Illustrative example : initial ontology

[Figure: initial ontology. Concepts and their properties:
  Insurance: MedCost, ThisCarDam, ThisCarCost, PropCost, ILiCost, OtherCarCost
  CarOwner: Age, GoodStuden, SocioEcon, HomeBase, AntiTheft, VehicleYear, MakeModel, OtherCar
  Accident: Accident
  Car: Airbag, Cushioning, Mileage, CarValue, RuggedAuto, Antilock
  Theft: Theft
  Driver: DrivQuality, SeniorTrain, DrivingSkill, DrivHist, RiskAversion
Semantic relation labels: concerns, has driver characteristics, repaies, repaies, concerns]

slide-115
SLIDE 115

Illustrative example : prior OOBN

[Figure: prior OOBN. The global class Global_Insurance contains the instances I:Insurance, CO:CarOwner, A:Accident, C:Car, T:Theft and D:Driver; each instance holds the nodes corresponding to the properties of its concept.]

slide-116
SLIDE 116

FinalOOBN2Onto

Ontology enrichment based on OOBN learning
Part of the ontology evolution process
Consists in adding, removing or modifying concepts, properties and/or relations

slide-117
SLIDE 117

Example : remove relations

No common interface identified between two classes ⇒ their corresponding concepts should be independent

[Figure: three panels (after morphing, after learning, possible change) showing concepts Cp1 (p1.1, p1.2) and Cp2 (p2.1, p2.2), their class instances I1: cCp1 and I2: cCp2, and the semantic relation SR between the two concepts.]

slide-118
SLIDE 118

Example : add concepts / relations

If ccp communicates with only one class → add a relation
Otherwise, check class similarities → add concepts / relations

[Figure: three panels (after morphing, after learning, possible change) showing concepts Cp1 to Cp4 with their properties, their class instances I1: cCp1 to I4: cCp4, the semantic relations SR1, SR2 and SR3, and a new concept CpSuper (p super.1) linked by is-a relations.]

slide-119
SLIDE 119

Example : concepts redefinition

If the class contains more than one component, then the corresponding concept may be deconstructed into more refined ones

[Figure: after learning, the class Ccp of concept Cp (properties p1, p2, p3, p4) is a disconnected graph (parts labelled I1 and I2); possible change: split Cp into Cp1 (p1, p2, p3) and Cp2 (p4).]
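Detecting that a learned class graph is disconnected amounts to computing its connected components, each of which is a candidate refined concept. The sketch below assumes the class is given as a list of nodes and a list of learned edges; the function name and this encoding are choices made for the illustration.

```python
def connected_components(nodes, edges):
    """Connected components of the learned graph, edge directions ignored."""
    adjacency = {v: set() for v in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = [], [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            component.append(v)
            for w in adjacency[v] - seen:  # unvisited neighbours only
                seen.add(w)
                stack.append(w)
        components.append(sorted(component))
    return components
```

With nodes p1 to p4 and learned edges p1 - p2 and p2 - p3, the function returns two components, [p1, p2, p3] and [p4], which matches the proposed split of Cp into Cp1 and Cp2; the final decision is still left to the expert.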

slide-120
SLIDE 120

Concepts and relations identification

Semi-automated process
The possible changes are communicated to an expert
The expert semantically identifies the discovered relations and/or concepts
