

SLIDE 1

Learning Terminological Naïve Bayesian Classifiers Under Different Assumptions on Missing Knowledge

Pasquale Minervini, Claudia d’Amato, Nicola Fanizzi

Department of Computer Science, University of Bari

URSW 2011 ⋄ Bonn, October 23, 2011

SLIDE 2

Introduction & Motivation · Background · Learning a Terminological Naïve Bayesian Network · Classifying individuals with a TBN · Conclusions and Future Works

Contents

1. Introduction & Motivation
2. Background
3. Learning a Terminological Naïve Bayesian Network
4. Classifying individuals with a TBN
5. Conclusions and Future Works

  • C. d’Amato

Learning Terminological Naïve Bayesian Classifiers

SLIDE 3


Introduction & Motivations

• Uncertainty is inherently present in real-world knowledge
• In the SW context, difficulties arise when modeling real-world domains using only purely logical formalisms
• Several approaches for coping with uncertain knowledge have been proposed (probabilistic, fuzzy, ...)

• usually, probabilistic information is assumed to be available
• the CWA is adopted

⇓ Exploiting an already populated ontology, a method capturing probabilistic information could be of help

the OWA has to be taken into account

SLIDE 4


Paper Contributions

Proposal of a Terminological naïve Bayesian classifier for predicting class-membership probabilistically:
• it is a naïve Bayesian network modeling the conditional dependencies between a learned set of Description Logic (complex) concepts and a target concept
• it deals with the incomplete knowledge due to the OWA by considering different ignorance models:

• Missing Completely At Random
• Missing At Random
• Informatively Missing

SLIDE 5


Knowledge Base Representation

Assumption: resources, concepts and relationships are defined in terms of a representation that can be mapped to some DL language (with the standard model-theoretic semantics):
• K = ⟨T, A⟩
• the T-box T is a set of definitions
• the A-box A contains extensional assertions on concepts and roles, e.g. C(a) and R(a, b)
• the set of the individuals (resources) occurring in A will be denoted Ind(A)

SLIDE 6


Basics of Bayesian Networks...

• A Bayesian network (BN) is a DAG G representing the conditional dependencies in a set of random variables
• Each vertex in G corresponds to a random variable Xi
• Each edge in G indicates a direct influence relation between the two connected random variables
• A set of conditional probability distributions θG is associated with G, one CPD per vertex
• Given its parents, each vertex Xi in G is conditionally independent of any subset S ⊆ Nd(Xi) of vertices that are not descendants of Xi

SLIDE 7


...Basics of Bayesian Networks

The joint probability distribution Pr(X1, . . . , Xn) over a set of random variables {X1, . . . , Xn} is computed as:

Pr(X1, . . . , Xn) = ∏i=1..n Pr(Xi | parents(Xi))

• Given a BN, it is possible to evaluate inference queries by marginalization
• To decrease the inference complexity, the naïve Bayes network is often considered:

it is assumed that the presence (or absence) of a particular feature (random variable) of a class is unrelated to the presence (or absence) of any other feature, given the class variable (random variable)
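The factorization and the marginalization-based inference above can be sketched concretely. A minimal naïve Bayes network with one class variable and two boolean features; all probability values are invented for illustration:

```python
from itertools import product

# Toy naive Bayes network: class variable C, two boolean features F1, F2.
# All probability values below are invented for illustration.
p_c = {True: 0.3, False: 0.7}            # Pr(C)
p_f = [{True: 0.8, False: 0.1},          # Pr(F1 = true | C = c)
       {True: 0.6, False: 0.4}]          # Pr(F2 = true | C = c)

def joint(c, fs):
    """Pr(C = c, F1 = f1, ..., Fn = fn) = Pr(c) * prod_i Pr(fi | parents(fi))."""
    pr = p_c[c]
    for pf, f in zip(p_f, fs):
        pr *= pf[c] if f else 1.0 - pf[c]
    return pr

def posterior(fs):
    """Inference by marginalization: Pr(C = true | f1, ..., fn)."""
    num = joint(True, fs)
    return num / (num + joint(False, fs))
```

As a sanity check, the joint sums to 1 over all assignments; with these numbers, Pr(C = true | F1 = true, F2 = true) ≈ 0.84.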

SLIDE 8


Terminological Na¨ ıve Bayesian Network: Definition

A Terminological Bayesian Network (TBN) NK, w.r.t. a DL KB K, is defined as a pair ⟨G, ΘG⟩, where:
• G = ⟨V, E⟩ is a directed acyclic graph, in which:
  • V = {F1, . . . , Fn, C} is a set of vertices, each Fi representing a (possibly complex) DL concept defined over K, and C representing a target concept
  • E ⊆ V × V is a set of edges, modeling the dependence relations between the elements of V
• ΘG is a set of conditional probability distributions (CPDs), one for each V ∈ V, representing the conditional probability of the feature concept given its parents in the graph

In the case of a Terminological Naïve Bayesian Network, E = {⟨C, Fi⟩ | i ∈ {1, . . . , n}}, namely for all i, j ∈ {1, . . . , n} with i ≠ j, Fi is independent of Fj given the value of the target concept.
SLIDE 9


Terminological Na¨ ıve Bayesian Network: Example

Given:
• a set of DL feature concepts F = {Female, HasChild := ∃hasChild.Person} (variable names are used instead of complex feature concepts)
• a target concept Father
the Terminological Naïve Bayesian Network is:

Network structure: Father → Female, Father → HasChild, with CPDs Pr(Female | Father), Pr(Female | ¬Father), Pr(HasChild | Father), Pr(HasChild | ¬Father).
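A minimal sketch of this example network; the slide only names the distributions, so the numeric CPD values below are invented (with Pr(Female | Father) = 0, assuming fathers are male):

```python
# CPDs of the naive TBN of the Father example; numeric values are invented.
p_father = 0.3                               # prior Pr(Father)
cpd = {"Female":   {True: 0.0, False: 0.5},  # Pr(Female | Father), Pr(Female | ¬Father)
       "HasChild": {True: 1.0, False: 0.2}}  # Pr(HasChild | Father), Pr(HasChild | ¬Father)

def likelihood(father, evidence):
    """Product of Pr(feature | Father = father) over the observed features."""
    pr = 1.0
    for feat, val in evidence.items():
        p = cpd[feat][father]
        pr *= p if val else 1.0 - p
    return pr

def pr_father(evidence):
    """Pr(Father | evidence); evidence maps observed feature names to values."""
    num = p_father * likelihood(True, evidence)
    return num / (num + (1.0 - p_father) * likelihood(False, evidence))
```

For instance, an individual known to be ¬Female and to have a child gets posterior 0.3/0.37 ≈ 0.81 under these invented numbers.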

SLIDE 10


Learning a TBN: Problem Definition

Given:
• a target concept C
• a DL KB K = ⟨T, A⟩
• the sets of positive, negative and neutral examples for C, denoted Ind+C(A), Ind−C(A) and Ind0C(A), so that:
  ∀a ∈ Ind+C(A) : K ⊨ C(a)
  ∀a ∈ Ind−C(A) : K ⊨ ¬C(a)
  ∀a ∈ Ind0C(A) : K ⊭ C(a) ∧ K ⊭ ¬C(a)
• an ignorance model
• a scoring function score for a TBN NK w.r.t. IndC(A)

Find: a network N∗K maximizing the scoring function:

N∗K ← arg maxNK score(NK, IndC(A))

SLIDE 11


TBN: the Learning Algorithm...

function learn(K, IndC(A))
  {The TBN is initialized as containing only the target concept node}
  N∗K ← ⟨G, ΘG⟩, with G = ⟨V ← {C}, E ← ∅⟩; NK ← ∅
  repeat
    NK ← N∗K
    {A new network is created, having one more node and different parameters than the previous one}
    Network = ⟨c′, N′K, s′⟩ ← extend(NK, IndC(A))
    N∗K ← N′K
    {Possible stopping conditions: a) improvement in score below a threshold; b) reaching a maximum number of nodes}
  until stopping criterion on Network
  return NK
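The greedy outer loop can be sketched in Python. Here `extend` is a stub with a fixed, hypothetical candidate pool and invented scores, just to show the control flow and the improvement-threshold stopping condition:

```python
# Greedy outer loop of the structure search; pool and scores are invented.
POOL = ["Female", "HasChild", "Married"]     # hypothetical feature concepts
SCORES = {1: -10.0, 2: -7.0, 3: -6.9}        # invented score per network size

def extend(network):
    """Stub: add the next concept from POOL and report its (fake) score."""
    feats = network["features"] + [POOL[len(network["features"])]]
    return feats[-1], {"features": feats}, SCORES[len(feats)]

def learn(max_nodes=3, min_gain=0.5):
    best, score = {"features": []}, float("-inf")
    while len(best["features"]) < max_nodes:
        _, candidate, s = extend(best)
        if s - score < min_gain:             # stopping: improvement below threshold
            break
        best, score = candidate, s
    return best
```

With these invented scores the third feature improves the score by only 0.1, so the default threshold of 0.5 stops the search at two features.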

SLIDE 12


...TBN: the Learning Algorithm

function extend(NK, IndC(A))
  Concept ← Start; Best ← ∅
  repeat
    Concepts ← ∅
    for c′ ∈ {c′ ∈ ρcl↓(Concept) | |c′| ≤ min(|c| + d, maxLen)} do
      V′ ← V ∪ {c′}
      N′K ← optimalNetwork(V′, IndC(A))
      s′ ← score(N′K, IndC(A))
      Concepts ← Concepts ∪ {⟨c′, N′K, s′⟩}
    end for
    Best ← arg max⟨c′,N′K,s′⟩∈Concepts∪{Best} s′
    Concept ← c : ⟨c, NK, s⟩ = Best
    {Possible stopping conditions: a) exceeding a maximum number of iterations; b) exceeding a maximum number of refinement steps}
  until stopping criterion on Best
  return Best
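The hill-climbing over refinements can be sketched as follows; `refine` and `score` are invented stand-ins for the downward refinement operator ρ↓ and the network score (the score arbitrarily rewards candidates containing a hypothetical `Female` conjunct and penalizes length):

```python
# Hill-climbing over a toy refinement operator; refine/score are invented
# stand-ins for the downward refinement operator and the real network score.
def refine(concept):
    """Toy specialization: conjoin one more atomic concept not already present."""
    atoms = ["Female", "Male", "∃hasChild.⊤"]
    return [f"{concept} ⊓ {a}" for a in atoms if a not in concept]

def score(concept):
    # reward the (arbitrary) target conjunct, penalize long concepts
    return concept.count("Female") - 0.01 * len(concept)

def extend(start="⊤", max_steps=5):
    best = (start, score(start))
    concept = start
    for _ in range(max_steps):
        candidates = [(c, score(c)) for c in refine(concept)]
        top = max(candidates + [best], key=lambda cs: cs[1])
        if top == best:                      # no refinement improves the score
            break
        best = top
        concept = best[0]
    return best
```

Starting from ⊤, the first refinement step picks `⊤ ⊓ Female`, and since no further specialization improves this toy score, the search stops there.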

SLIDE 13


Learning a Na¨ ıve TBN: Example

Figure: the search in five stages — the initial network (target Father only); searching for the first feature among candidate concepts (e.g. Female, Male, Person, Mammal, ⊤, ∃hasChild.⊤, ∃hasParent.⊤, ∃hasSibling.⊤, ∃married.⊤); adding the first feature ∃hasChild.Person to the network; searching for the second feature (e.g. ∃hasParent.Person, Female); adding the second feature Female.

SLIDE 14


The ignorance models

To learn the TBN, different assumptions (ignorance models) on the nature of the missing information are considered, given an ideal KB K∗ having additional knowledge:

• MCAR (Missing Completely At Random) – the probability that a ∈ C^I is missing is independent of any kind of (additional) knowledge:
  Pr(K ⊭ C(a) ∧ K ⊭ ¬C(a) | K∗) = Pr(K ⊭ C(a) ∧ K ⊭ ¬C(a))
• MAR (Missing At Random) – the probability that a ∈ C^I is missing depends only on K and does not depend on additional knowledge:
  Pr(K ⊭ C(a) ∧ K ⊭ ¬C(a) | K∗) = Pr(K ⊭ C(a) ∧ K ⊭ ¬C(a) | K)
• NMAR (Not Missing At Random, or IM, Informatively Missing) – the probability that a ∈ C^I is missing could differ when additional knowledge is available:
  Pr(K ⊭ C(a) ∧ K ⊭ ¬C(a) | K∗) ≠ Pr(K ⊭ C(a) ∧ K ⊭ ¬C(a) | K)
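The three mechanisms can be simulated on a toy population to see how they differ; the population, the missingness probabilities and the `has_child` co-feature below are all invented for illustration:

```python
import random

random.seed(0)
# Toy population of (is_father, has_child) individuals; the Father-membership
# assertion may be missing from the ABox under three invented mechanisms.
population = [(True, True)] * 30 + [(False, True)] * 20 + [(False, False)] * 50

def observe(mechanism):
    """Return each Father-membership value, or None when it is missing."""
    out = []
    for father, has_child in population:
        if mechanism == "MCAR":     # missing independently of everything
            missing = random.random() < 0.3
        elif mechanism == "MAR":    # missing depending on observed knowledge only
            missing = has_child and random.random() < 0.5
        else:                       # NMAR/IM: missing depending on the value itself
            missing = father and random.random() < 0.8
        out.append(None if missing else father)
    return out
```

Under the NMAR mechanism the assertions that go missing are exactly those of fathers, so the observed frequency of Father is biased downward; this is why frequency-based estimation stops being adequate there.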

SLIDE 15


TBN under MCAR assumption

• Only positive and negative examples are considered
• Parameters are estimated from the frequency distribution
• score is computed as the log-likelihood on training data:

L(NK | IndC(A)) = Σa∈Ind+C(A) log Pr(C(a) | NK) + Σa∈Ind−C(A) log Pr(¬C(a) | NK)
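The MCAR estimation and scoring can be sketched with invented training data over two boolean feature concepts (say Female and HasChild): parameters come from relative frequencies, and the score is the log-likelihood over positives and negatives only:

```python
import math

# Invented positive/negative examples as (Female, HasChild) feature tuples.
pos = [(False, True)] * 8 + [(False, False)] * 2   # asserted instances of C
neg = [(True, False)] * 5 + [(False, False)] * 3 + [(True, True)] * 2

# Parameters by relative frequency (neutral examples play no role under MCAR).
p_c = len(pos) / (len(pos) + len(neg))
p_f_pos = [sum(e[i] for e in pos) / len(pos) for i in range(2)]  # Pr(Fi | C)
p_f_neg = [sum(e[i] for e in neg) / len(neg) for i in range(2)]  # Pr(Fi | ¬C)

def pr_c(e, c):
    """Pr(C = c | e) under the naive Bayes factorization."""
    def lik(params, prior):
        pr = prior
        for p, f in zip(params, e):
            pr *= p if f else 1.0 - p
        return pr
    a, b = lik(p_f_pos, p_c), lik(p_f_neg, 1.0 - p_c)
    return (a if c else b) / (a + b)

# Log-likelihood score over the positive and negative examples.
score = (sum(math.log(pr_c(e, True)) for e in pos) +
         sum(math.log(pr_c(e, False)) for e in neg))
```

A perfect classifier would give a score of 0; any residual uncertainty on the training examples makes it negative.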

SLIDE 16


TBN under MAR assumption

• Positive, negative and neutral examples are considered
• The EM algorithm is adopted for parameter estimation
• score is computed as the log-likelihood on training data, considering also the neutral examples:

L(NK | IndC(A)) = Σa∈Ind0C(A) ΣC′∈{C,¬C} log [Pr(C′(a) | NK) Pr(C′ | NK)] + Σa∈Ind+C(A) log Pr(C(a) | NK) + Σa∈Ind−C(A) log Pr(¬C(a) | NK)
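An EM sketch for the MAR case with a single boolean feature concept: labelled examples are counted directly, while neutral ones contribute fractionally through the current posterior. All data and starting parameter values below are invented:

```python
# EM for naive Bayes parameters with neutral (unlabelled) examples, MAR case.
pos = [True] * 6                 # feature value of each positive example
neg = [False] * 4                # feature value of each negative example
neutral = [True, True, False]    # feature values of the neutral examples

p_c, p_f1, p_f0 = 0.5, 0.7, 0.3  # Pr(C), Pr(F | C), Pr(F | ¬C): invented start

def post(f, p_c, p_f1, p_f0):
    """Current posterior Pr(C | F = f)."""
    a = p_c * (p_f1 if f else 1 - p_f1)
    b = (1 - p_c) * (p_f0 if f else 1 - p_f0)
    return a / (a + b)

for _ in range(50):
    w = [post(f, p_c, p_f1, p_f0) for f in neutral]           # E-step
    n_pos = len(pos) + sum(w)                                 # expected counts
    n_neg = len(neg) + sum(1 - wi for wi in w)
    p_c = n_pos / (n_pos + n_neg)                             # M-step
    p_f1 = (sum(pos) + sum(wi for wi, f in zip(w, neutral) if f)) / n_pos
    p_f0 = (sum(neg) + sum(1 - wi for wi, f in zip(w, neutral) if f)) / n_neg
```

On this toy data EM ends up assigning the two F = true neutrals to C and the F = false one to ¬C, so Pr(C) converges to 8/13.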

SLIDE 17


TBN under NMAR assumption

• Positive and negative examples are considered
• For the neutral examples, all the possible fillings are considered
• Robust Bayesian Estimation (RBE) is adopted to learn the conditional probability distributions: probability intervals are determined instead of single probability values
• score: as for MCAR, considering the mean value of the probability intervals
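The interval idea can be sketched for a single parameter: since a neutral example may be filled either way, the estimate is bounded by the two extreme completions. This is a crude version of the idea behind RBE (the actual RBE estimator also accounts for priors); the counts are invented:

```python
# Bounds on a parameter when neutral examples may be filled either way.
def interval(n_pos, n_neg, n_neutral):
    """Lower/upper bound on Pr(C) over all completions of the neutral examples."""
    total = n_pos + n_neg + n_neutral
    lower = n_pos / total                  # all neutrals resolved as negatives
    upper = (n_pos + n_neutral) / total    # all neutrals resolved as positives
    return lower, upper

lo, hi = interval(6, 4, 3)                 # invented counts
mid = (lo + hi) / 2                        # midpoint, as used for the score
```

With 6 positives, 4 negatives and 3 neutrals, Pr(C) is bounded by [6/13, 9/13], and the midpoint of that interval feeds the MCAR-style score.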

SLIDE 18


Classifying individuals with a TBN: Example

Given:
• the feature concepts F = {Female, HasChild}
• the target concept Father
• the naïve TBN of the previous example
• the DL KB K
• an individual a s.t. K ⊨ HasChild(a), while the membership of a to Female is not known
The probability that a is an instance of Father is given by:

Pr(Father(a)) = Pr(Father) Pr(HasChild | Father) / ΣFather′∈{Father,¬Father} Pr(Father′) Pr(HasChild | Father′)
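This posterior can be computed directly; the CPD values below are invented. Note that the unobserved feature (Female) simply drops out of the naïve Bayes computation:

```python
# Pr(Father(a)) when only HasChild(a) is known; CPD values are invented.
p_father = 0.3
p_haschild = {True: 0.9, False: 0.2}   # Pr(HasChild | Father), Pr(HasChild | ¬Father)

def pr_father_given_haschild():
    num = p_father * p_haschild[True]
    den = num + (1 - p_father) * p_haschild[False]
    return num / den
```

With these numbers the posterior is 0.27/0.41 ≈ 0.66.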

SLIDE 19


Classifying individuals using RBE: Example

A naïve TBN using Robust Bayesian Estimation for inferring posterior probability intervals under the NMAR assumption is such that the conditional probability tables contain probability intervals (defined by a lower and an upper bound) instead of single probability values.

The interval-valued CPDs (writing PrL and PrU for the lower and upper bound):
[PrL(Female | Father), PrU(Female | Father)], [PrL(Female | ¬Father), PrU(Female | ¬Father)], [PrL(HasChild | Father), PrU(HasChild | Father)], [PrL(HasChild | ¬Father), PrU(HasChild | ¬Father)]

Network structure: Father → Female, Father → HasChild

Inference on a being an instance of Father, given that K ⊨ HasChild(a), yields the interval [PrL(Father | HasChild), PrU(Father | HasChild)], where (abbreviating Fa = Father, HC = HasChild):

PrL(Fa(a)) = PrL(Fa | HC) = PrL(HC | Fa) PrL(Fa) / (PrL(HC | Fa) PrL(Fa) + PrU(HC | ¬Fa) PrU(¬Fa))
PrU(Fa(a)) = PrU(Fa | HC) = PrU(HC | Fa) PrU(Fa) / (PrU(HC | Fa) PrU(Fa) + PrL(HC | ¬Fa) PrL(¬Fa))
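The posterior interval can be sketched numerically; all CPD bounds below are invented. The lower (resp. upper) posterior pairs the pessimistic numerator with the optimistic competing term in the denominator:

```python
# Posterior interval for Father given HasChild; all interval bounds invented.
p_fa = (0.25, 0.35)       # [lower, upper] bounds on Pr(Father)
p_hc_fa = (0.85, 0.95)    # bounds on Pr(HasChild | Father)
p_hc_nfa = (0.15, 0.25)   # bounds on Pr(HasChild | ¬Father)

def posterior_interval():
    lo = (p_hc_fa[0] * p_fa[0]) / (
        p_hc_fa[0] * p_fa[0] + p_hc_nfa[1] * (1 - p_fa[0]))
    hi = (p_hc_fa[1] * p_fa[1]) / (
        p_hc_fa[1] * p_fa[1] + p_hc_nfa[0] * (1 - p_fa[1]))
    return lo, hi
```

With these invented bounds the classifier reports Pr(Father | HasChild) ∈ [0.53, 0.77] rather than a single point estimate.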

SLIDE 20


Conclusions & Future Work

Conclusions:
• Proposed a ML method, based on the naïve Bayes assumption, for estimating the probability that a generic individual belongs to a certain target concept, given its membership relation to an induced set of (complex) DL concepts
• an ignorance model for handling incomplete knowledge
Future works:
• experimenting with the method
• finding optimizations of the proposed method

SLIDE 21

The End

That’s all! Questions?