A Similarity Measure for the ALN Description Logic


SLIDE 1

A Similarity Measure for the ALN Description Logic

Nicola Fanizzi, Claudia d’Amato

Dipartimento di Informatica • Università degli Studi di Bari, Campus Universitario, Via Orabona 4, 70125 Bari, Italy

CILC 2006 ⋄ Bari

SLIDE 2

Introduction & Motivation The Reference Representation Language A Similarity Measure for ALN Conclusions and Further Developments

Contents

1. Introduction & Motivation
  • Motivations
  • Objectives

2. The Reference Representation Language
  • Knowledge Base & Subsumption
  • Normal Form

3. A Similarity Measure for ALN
  • Definition
  • Similarity Measure: example
  • Measure Involving Individuals
  • Discussion

4. Conclusions and Further Developments
  • Conclusions
  • Future Work

  • N. Fanizzi, C. d’Amato

A Similarity Measure

SLIDE 3

Motivations

Ontological knowledge

  • Is the result of a complex process of knowledge acquisition
  • Plays a key role for interoperability in the Semantic Web perspective
  • Is expressed by standard ontology mark-up languages, which are supported by the well-founded semantics of Description Logics (DLs)

There is a need for services able to build knowledge bases automatically or semi-automatically.

This can be done by means of inductive inference services.

SLIDE 4

Objectives

Induction of structural knowledge is known in ML as concept formation.

This is generally applied to zero-order representations.

  • Our goal → to form clusters of concepts or individuals asserted by means of ontological knowledge
  • Problem → to define a similarity/dissimilarity measure applicable to ontology languages

SLIDE 5

Why ALN Logic

  • Knowledge is represented by means of a Description Logic (ALN)
  • Description Logics are the formal counterpart of the OWL language, the de facto standard for knowledge representation in the Semantic Web

SLIDE 6

The Representation Language

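  • Primitive concepts $N_C = \{C, D, \ldots\}$: subsets of a domain
  • Primitive roles $N_R = \{R, S, \ldots\}$: binary relations on the domain
  • Interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, where $\Delta^{\mathcal{I}}$ is the domain of the interpretation and $\cdot^{\mathcal{I}}$ is the interpretation function:

Name                    Syntax             Semantics
top concept             $\top$             $\Delta^{\mathcal{I}}$
bottom concept          $\bot$             $\emptyset$
primitive concept       $A$                $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$
primitive negation      $\neg A$           $\Delta^{\mathcal{I}} \setminus A^{\mathcal{I}}$
concept conjunction     $C_1 \sqcap C_2$   $C_1^{\mathcal{I}} \cap C_2^{\mathcal{I}}$
universal restriction   $\forall R.C$      $\{x \in \Delta^{\mathcal{I}} \mid \forall y \in \Delta^{\mathcal{I}}\,((x, y) \in R^{\mathcal{I}} \rightarrow y \in C^{\mathcal{I}})\}$
at-most restriction     $\leq n.R$         $\{x \in \Delta^{\mathcal{I}} \mid |\{y \in \Delta^{\mathcal{I}} \mid (x, y) \in R^{\mathcal{I}}\}| \leq n\}$
at-least restriction    $\geq n.R$         $\{x \in \Delta^{\mathcal{I}} \mid |\{y \in \Delta^{\mathcal{I}} \mid (x, y) \in R^{\mathcal{I}}\}| \geq n\}$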
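The semantics of the ALN constructors can be illustrated by evaluating a concept over a small finite interpretation. The Python sketch below is only an illustration: the nested-tuple encoding of concepts, the function names and the three-individual interpretation are assumptions introduced here, not part of the slides.

```python
def successors(roles, role, x):
    """R-successors of x: every y with (x, y) in the interpretation of role."""
    return {y for (a, y) in roles.get(role, set()) if a == x}


def extension(concept, domain, atoms, roles):
    """Extension of an ALN concept in a finite interpretation.

    Hypothetical encoding: ("top",), ("bottom",), ("atom", A), ("not", A),
    ("and", C1, C2), ("all", R, C), ("atmost", n, R), ("atleast", n, R).
    """
    tag = concept[0]
    if tag == "top":
        return set(domain)
    if tag == "bottom":
        return set()
    if tag == "atom":
        return set(atoms.get(concept[1], set()))
    if tag == "not":  # negation applies to primitive concepts only
        return set(domain) - set(atoms.get(concept[1], set()))
    if tag == "and":
        return (extension(concept[1], domain, atoms, roles)
                & extension(concept[2], domain, atoms, roles))
    if tag == "all":
        _, role, filler = concept
        filler_ext = extension(filler, domain, atoms, roles)
        return {x for x in domain if successors(roles, role, x) <= filler_ext}
    if tag == "atmost":
        _, n, role = concept
        return {x for x in domain if len(successors(roles, role, x)) <= n}
    if tag == "atleast":
        _, n, role = concept
        return {x for x in domain if len(successors(roles, role, x)) >= n}
    raise ValueError(f"unknown constructor: {tag!r}")


# Illustrative interpretation (made up for this sketch).
domain = {"bob", "mary", "john"}
atoms = {"Person": {"bob", "mary", "john"}, "Male": {"bob", "john"}}
roles = {"isMarriedTo": {("bob", "mary")}}

# Person AND at-most-0 isMarriedTo
single = ("and", ("atom", "Person"), ("atmost", 0, "isMarriedTo"))
print(sorted(extension(single, domain, atoms, roles)))  # ['john', 'mary']
```

Note that the role is read directionally in this interpretation: mary has no outgoing isMarriedTo edge, so she satisfies the at-most-0 restriction.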

SLIDE 7

Knowledge Base & Subsumption

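$\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$

  • T-box: $\mathcal{T}$ is a set of definitions $C \equiv D$, meaning $C^{\mathcal{I}} = D^{\mathcal{I}}$, where $C$ is the concept name and $D$ is a description
  • A-box: $\mathcal{A}$ contains extensional assertions on concepts and roles, e.g. $C(a)$ and $R(a, b)$, meaning, respectively, that $a^{\mathcal{I}} \in C^{\mathcal{I}}$ and $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in R^{\mathcal{I}}$
  • Subsumption: given two concept descriptions $C$ and $D$, $C$ subsumes $D$, denoted by $C \sqsupseteq D$, iff for every interpretation $\mathcal{I}$ it holds that $C^{\mathcal{I}} \supseteq D^{\mathcal{I}}$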

SLIDE 8

Examples

Instances of concept definitions:

  Single ≡ Person ⊓ ≤ 0.isMarriedTo
  Polygamist ≡ Person ⊓ ∀isMarriedTo.Person ⊓ ≥ 2.isMarriedTo
  Bigamist ≡ Person ⊓ ∀isMarriedTo.Person ⊓ = 2.isMarriedTo
  MalePolygamist ≡ Male ⊓ Person ⊓ ∀isMarriedTo.Person ⊓ ≥ 2.isMarriedTo

Here = 2.isMarriedTo abbreviates ≥ 2.isMarriedTo ⊓ ≤ 2.isMarriedTo.

The following are instances of simple assertions:

  Male(Bob), Person(Mary), Single(John), isMarriedTo(Bob, Mary)

It is easy to see that the following relationship holds: Polygamist ⊒ MalePolygamist.
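One way to see why MalePolygamist is subsumed by the polygamist concept is to unfold both definitions and apply the semantics of conjunction. The derivation below is a sketch of that reasoning, added here for illustration; it holds for every interpretation $\mathcal{I}$ that satisfies the two definitions.

```latex
\begin{align*}
\mathrm{MalePolygamist}^{\mathcal{I}}
  &= \mathrm{Male}^{\mathcal{I}} \cap \mathrm{Person}^{\mathcal{I}}
     \cap (\forall \mathrm{isMarriedTo}.\mathrm{Person})^{\mathcal{I}}
     \cap ({\geq}\, 2.\mathrm{isMarriedTo})^{\mathcal{I}} \\
  &= \mathrm{Male}^{\mathcal{I}} \cap \mathrm{Polygamist}^{\mathcal{I}} \\
  &\subseteq \mathrm{Polygamist}^{\mathcal{I}}
\end{align*}
```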

SLIDE 9

Other Inference Services

  • instance checking: decide whether an individual is an instance of a concept
  • retrieval: find all individuals that are instances of a concept
  • realization problem: find the concepts which an individual belongs to, especially the most specific one, if any

Most specific concept: given an A-Box $\mathcal{A}$ and an individual $a$, the most specific concept of $a$ w.r.t. $\mathcal{A}$ is the concept $C$, denoted $\mathrm{MSC}_{\mathcal{A}}(a)$, such that $\mathcal{A} \models C(a)$ and $C \sqsubseteq D$ for every $D$ such that $\mathcal{A} \models D(a)$.

SLIDE 10

Normal Form

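$C$ is in ALN normal form iff $C \equiv \bot$ or $C \equiv \top$ or

$$C = \bigsqcap_{P \in \mathrm{prim}(C)} P \;\sqcap\; \bigsqcap_{R \in N_R} \left( \forall R.C_R \sqcap {\geq} n.R \sqcap {\leq} m.R \right)$$

where $C_R = \mathrm{val}_R(C)$, $n = \mathrm{min}_R(C)$ and $m = \mathrm{max}_R(C)$, and:

  • $\mathrm{prim}(C)$: set of all (negated) atoms occurring at $C$'s top level
  • $\mathrm{val}_R(C)$: conjunction $C_1 \sqcap \cdots \sqcap C_n$ in the value restriction on $R$, if any (otherwise $\mathrm{val}_R(C) = \top$)
  • $\mathrm{min}_R(C) = \max\{n \in \mathbb{N} \mid C \sqsubseteq ({\geq} n.R)\}$ (always a finite number)
  • $\mathrm{max}_R(C) = \min\{n \in \mathbb{N} \mid C \sqsubseteq ({\leq} n.R)\}$ (if unlimited, $\mathrm{max}_R(C) = \infty$)

For any $R$, every sub-description in $\mathrm{val}_R(C)$ is in normal form.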
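The normal form suggests a simple recursive data structure: a set of top-level (negated) atoms plus, for each role, a value restriction and a pair of cardinality bounds. The sketch below is one possible encoding, assuming a hypothetical `NormalForm` class whose names are not taken from the slides; it does not represent ⊥ and does not perform normalization, it only stores an already normalized description.

```python
from dataclasses import dataclass, field
from math import inf


@dataclass
class NormalForm:
    """Hypothetical container for an ALN concept in normal form."""
    prim: frozenset = frozenset()                     # prim(C)
    restrictions: dict = field(default_factory=dict)  # role -> (val, min, max)

    def val(self, role):
        # val_R(C) defaults to the top concept (empty normal form)
        return self.restrictions[role][0] if role in self.restrictions else NormalForm()

    def min_r(self, role):
        # min_R(C) defaults to 0 when no at-least restriction is present
        return self.restrictions[role][1] if role in self.restrictions else 0

    def max_r(self, role):
        # max_R(C) defaults to infinity when no at-most restriction is present
        return self.restrictions[role][2] if role in self.restrictions else inf

    def render(self):
        parts = sorted(self.prim)
        for role, (val, lo, hi) in sorted(self.restrictions.items()):
            if val.prim or val.restrictions:
                parts.append(f"∀{role}.({val.render()})")
            if lo > 0:
                parts.append(f"≥{lo}.{role}")
            if hi != inf:
                parts.append(f"≤{hi}.{role}")
        return " ⊓ ".join(parts) if parts else "⊤"


TOP = NormalForm()
person = NormalForm(frozenset({"Person"}))
# Person with exactly two spouses, all of whom are persons
bigamist = NormalForm(frozenset({"Person"}), {"isMarriedTo": (person, 2, 2)})
print(bigamist.render())
```

Storing the bounds explicitly, with 0 and ∞ as defaults, means the three components used by the similarity measure (prim, val and the min/max pair) can be read off directly for any role.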

SLIDE 11

A Similarity Measure for ALN: Definition / I

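  • $\mathcal{L} = \mathcal{ALN}/_{\equiv}$: the set of all concepts in ALN normal form
  • $\mathcal{I}$: canonical interpretation of the A-Box $\mathcal{A}$

The measure $s : \mathcal{L} \times \mathcal{L} \to [0, 1]$ is defined for all $C, D \in \mathcal{L}$ by

$$s(C, D) := \lambda \Big[ s_P(\mathrm{prim}(C), \mathrm{prim}(D)) + \frac{1}{|N_R|} \sum_{R \in N_R} s(\mathrm{val}_R(C), \mathrm{val}_R(D)) + \frac{1}{|N_R|} \sum_{R \in N_R} s_N\big((\mathrm{min}_R(C), \mathrm{max}_R(C)), (\mathrm{min}_R(D), \mathrm{max}_R(D))\big) \Big]$$

where $\lambda \in\, ]0, 1]$ (let $\lambda = 1/3$).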

SLIDE 12

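A Similarity Measure for ALN: Definition / II

$$s_P(\mathrm{prim}(C), \mathrm{prim}(D)) := \frac{\left| \bigcap_{P_C \in \mathrm{prim}(C)} P_C^{\mathcal{I}} \;\cap\; \bigcap_{Q_D \in \mathrm{prim}(D)} Q_D^{\mathcal{I}} \right|}{\left| \bigcap_{P_C \in \mathrm{prim}(C)} P_C^{\mathcal{I}} \;\cup\; \bigcap_{Q_D \in \mathrm{prim}(D)} Q_D^{\mathcal{I}} \right|}$$

$$s_N((m_C, M_C), (m_D, M_D)) := \begin{cases} \dfrac{\min(M_C, M_D) - \max(m_C, m_D) + 1}{\max(M_C, M_D) - \min(m_C, m_D) + 1} & \text{if } \min(M_C, M_D) > \max(m_C, m_D) \\[2ex] 0 & \text{otherwise} \end{cases}$$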
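The number-restriction term can be read as an overlap ratio between two integer intervals: the count of integers shared by [m_C, M_C] and [m_D, M_D], divided by the count of integers in the smallest interval covering both. The Python sketch below follows that reading; the function name `s_n` is an assumption made here, and the strict inequality is the condition used in the worked example of the slides.

```python
from fractions import Fraction


def s_n(bounds_c, bounds_d):
    """Overlap ratio of two cardinality intervals (m, M), as an exact fraction."""
    (m_c, M_c), (m_d, M_d) = bounds_c, bounds_d
    lo, hi = max(m_c, m_d), min(M_c, M_d)        # ends of the shared part
    if hi > lo:                                   # condition as stated on the slides
        hull = max(M_c, M_d) - min(m_c, m_d) + 1  # integers in the covering interval
        return Fraction(hi - lo + 1, hull)
    return Fraction(0)


print(s_n((0, 1), (0, 2)))  # 2/3
print(s_n((0, 1), (3, 5)))  # 0
```

With the strict inequality, two intervals that share only a single integer, including two identical one-point intervals such as (2, 2), score 0; the slides do not say whether this edge case is intended.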

SLIDE 13

Similarity Measure: example...

Let $\mathcal{A}$ be the considered ABox:

  Person(Meg), ¬Male(Meg), hasChild(Meg,Bob), hasChild(Meg,Pat),
  Person(Bob), Male(Bob), hasChild(Bob,Ann),
  Person(Pat), Male(Pat), hasChild(Pat,Gwen),
  Person(Gwen), ¬Male(Gwen),
  Person(Ann), ¬Male(Ann), hasChild(Ann,Sue), marriedTo(Ann,Tom),
  Person(Sue), ¬Male(Sue),
  Person(Tom), Male(Tom)

and let C and D be two descriptions in ALN normal form:

  C ≡ Person ⊓ ∀marriedTo.Person ⊓ ≤ 1.hasChild
  D ≡ Male ⊓ ∀marriedTo.(Person ⊓ ¬Male) ⊓ ≤ 2.hasChild
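Since the measure only needs the extensions of primitive (and negated primitive) concepts, the ABox above can be indexed once up front. The following Python sketch shows one way to do that; the list-of-tuples encoding and the variable names are assumptions made for illustration, and the assertions are copied from the example.

```python
from collections import defaultdict

# Concept assertions (concept, individual); negated atoms are stored by name.
concept_assertions = [
    ("Person", "Meg"), ("¬Male", "Meg"), ("Person", "Bob"), ("Male", "Bob"),
    ("Person", "Pat"), ("Male", "Pat"), ("Person", "Gwen"), ("¬Male", "Gwen"),
    ("Person", "Ann"), ("¬Male", "Ann"), ("Person", "Sue"), ("¬Male", "Sue"),
    ("Person", "Tom"), ("Male", "Tom"),
]
# Role assertions (role, subject, object).
role_assertions = [
    ("hasChild", "Meg", "Bob"), ("hasChild", "Meg", "Pat"),
    ("hasChild", "Bob", "Ann"), ("hasChild", "Pat", "Gwen"),
    ("hasChild", "Ann", "Sue"), ("marriedTo", "Ann", "Tom"),
]

extents = defaultdict(set)   # concept name -> asserted instances
for name, individual in concept_assertions:
    extents[name].add(individual)

fillers = defaultdict(set)   # (role, subject) -> asserted role fillers
for role, subject, obj in role_assertions:
    fillers[(role, subject)].add(obj)

domain = {a for _, a in concept_assertions} | {
    x for _, a, b in role_assertions for x in (a, b)
}

print(len(domain), sorted(extents["Male"]))  # 7 ['Bob', 'Pat', 'Tom']
```

The resulting index gives the seven-element domain and the extensions of Person, Male and ¬Male that the example computations rely on.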

SLIDE 14

...Similarity Measure: example...

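In order to compute $s(C, D)$, let $\lambda := \frac{1}{3}$ and note that $N_R = \{\text{hasChild}, \text{marriedTo}\}$, so $|N_R| = 2$:

$$s(C, D) := \frac{1}{3} \Big[ s_P(\mathrm{prim}(C), \mathrm{prim}(D)) + \frac{1}{2} \sum_{R \in N_R} s(\mathrm{val}_R(C), \mathrm{val}_R(D)) + \frac{1}{2} \sum_{R \in N_R} s_N\big((\mathrm{min}_R(C), \mathrm{max}_R(C)), (\mathrm{min}_R(D), \mathrm{max}_R(D))\big) \Big]$$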

SLIDE 15

...Similarity Measure: example...

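In order to compute $s_P$, note that prim(C) = {Person} and prim(D) = {Male}:

$$s_P(\{\text{Person}\}, \{\text{Male}\}) = \frac{|\{\text{Meg, Bob, Pat, Gwen, Ann, Sue, Tom}\} \cap \{\text{Bob, Pat, Tom}\}|}{|\{\text{Meg, Bob, Pat, Gwen, Ann, Sue, Tom}\} \cup \{\text{Bob, Pat, Tom}\}|} = \frac{|\{\text{Bob, Pat, Tom}\}|}{|\{\text{Meg, Bob, Pat, Gwen, Ann, Sue, Tom}\}|} = \frac{3}{7}$$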

SLIDE 16

...Similarity Measure: example...

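To compute $s$ for value restrictions, note that $N_R = \{\text{hasChild}, \text{marriedTo}\}$ and:

  • $\mathrm{val}_{\text{marriedTo}}(C) = \text{Person}$ and $\mathrm{val}_{\text{hasChild}}(C) = \top$
  • $\mathrm{val}_{\text{marriedTo}}(D) = \text{Person} \sqcap \neg\text{Male}$ and $\mathrm{val}_{\text{hasChild}}(D) = \top$

$$s(\text{Person}, \text{Person} \sqcap \neg\text{Male}) + s(\top, \top) = \frac{1}{3} \Big( s_P(\text{Person}, \text{Person} \sqcap \neg\text{Male}) + \frac{1}{2}(1 + 1) + \frac{1}{2}(1 + 1) \Big) + \frac{1}{3}(1 + 1 + 1) = \frac{1}{3} \Big( \frac{4}{7} + 1 + 1 \Big) + 1 = \frac{13}{7}$$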

SLIDE 17

...Similarity Measure: example...

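To compute $s$ for number restrictions, note that $N_R = \{\text{hasChild}, \text{marriedTo}\}$ and:

  • $\mathrm{min}_{\text{marriedTo}}(C) = 0$; $\mathrm{max}_{\text{marriedTo}}(C) = |\Delta| + 1 = 7 + 1 = 8$
  • $\mathrm{min}_{\text{hasChild}}(C) = 0$; $\mathrm{max}_{\text{hasChild}}(C) = 1$
  • $\mathrm{min}_{\text{marriedTo}}(D) = 0$; $\mathrm{max}_{\text{marriedTo}}(D) = |\Delta| + 1 = 7 + 1 = 8$
  • $\mathrm{min}_{\text{hasChild}}(D) = 0$; $\mathrm{max}_{\text{hasChild}}(D) = 2$

For both roles $\min(M_C, M_D) > \max(m_C, m_D)$, so, writing $m_R$ and $M_R$ for $\mathrm{min}_R$ and $\mathrm{max}_R$:

$$s_N\big((m_{\text{hasChild}}(C), M_{\text{hasChild}}(C)), (m_{\text{hasChild}}(D), M_{\text{hasChild}}(D))\big) + s_N\big((m_{\text{marriedTo}}(C), M_{\text{marriedTo}}(C)), (m_{\text{marriedTo}}(D), M_{\text{marriedTo}}(D))\big)$$

$$= \frac{\min(M_{\text{hasChild}}(C), M_{\text{hasChild}}(D)) - \max(m_{\text{hasChild}}(C), m_{\text{hasChild}}(D)) + 1}{\max(M_{\text{hasChild}}(C), M_{\text{hasChild}}(D)) - \min(m_{\text{hasChild}}(C), m_{\text{hasChild}}(D)) + 1} + 1 = \frac{\min(1, 2) - \max(0, 0) + 1}{\max(1, 2) - \min(0, 0) + 1} + 1 = \frac{2}{3} + 1 = \frac{5}{3}$$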
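The partial results of the example can be combined mechanically. The Python sketch below is one possible reading of the definition, not the authors' implementation: the tuple encoding of normal forms, the helper names and the guard for an empty union are assumptions made here, while the base case s(⊤, ⊤) = 1 and the replacement of an unlimited maximum by |Δ| + 1 follow the worked example. Exact fractions are used so that the intermediate values 3/7, 13/7 and 5/3 can be checked.

```python
from fractions import Fraction

# Extensions of the (negated) primitive concepts in the example ABox.
DOMAIN = frozenset({"Meg", "Bob", "Pat", "Gwen", "Ann", "Sue", "Tom"})
MALE = frozenset({"Bob", "Pat", "Tom"})
EXT = {"Person": DOMAIN, "Male": MALE, "¬Male": DOMAIN - MALE}
ROLES = ("hasChild", "marriedTo")
UNBOUNDED = len(DOMAIN) + 1  # the example replaces an unlimited max by |Δ| + 1

# Hypothetical encoding of a normal form: (prims, {role: (val, min, max)}).
TOP = (frozenset(), {})


def concept(prims=(), **restrictions):
    return (frozenset(prims), dict(restrictions))


def parts(c, role):
    """(val_R, min_R, max_R) of c, with defaults for unrestricted roles."""
    return c[1].get(role, (TOP, 0, UNBOUNDED))


def prim_ext(prims):
    ext = DOMAIN  # an empty intersection is taken to be the whole domain
    for name in prims:
        ext = ext & EXT[name]
    return ext


def s_p(prims_c, prims_d):
    a, b = prim_ext(prims_c), prim_ext(prims_d)
    if not (a | b):  # assumption: this case is not covered by the slides
        return Fraction(0)
    return Fraction(len(a & b), len(a | b))


def s_n(bounds_c, bounds_d):
    (m_c, M_c), (m_d, M_d) = bounds_c, bounds_d
    if min(M_c, M_d) > max(m_c, m_d):
        return Fraction(min(M_c, M_d) - max(m_c, m_d) + 1,
                        max(M_c, M_d) - min(m_c, m_d) + 1)
    return Fraction(0)


def s(c, d, lam=Fraction(1, 3)):
    if c == TOP and d == TOP:  # base case used in the worked example
        return Fraction(1)
    val = sum(s(parts(c, r)[0], parts(d, r)[0], lam) for r in ROLES)
    num = sum(s_n(parts(c, r)[1:], parts(d, r)[1:]) for r in ROLES)
    return lam * (s_p(c[0], d[0]) + val / len(ROLES) + num / len(ROLES))


C = concept(["Person"],
            marriedTo=(concept(["Person"]), 0, UNBOUNDED),
            hasChild=(TOP, 0, 1))
D = concept(["Male"],
            marriedTo=(concept(["Person", "¬Male"]), 0, UNBOUNDED),
            hasChild=(TOP, 0, 2))

print(s(C, D))  # 46/63
```

Under these assumptions the sketch reproduces the intermediate values of the example and gives s(C, D) = 1/3 · (3/7 + 13/14 + 5/6) = 46/63 ≈ 0.73, a value derived here rather than quoted from the slides. With λ = 1/3 the base case is also the fixed point of the definition, since 1/3 · (1 + 1 + 1) = 1.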

SLIDE 18

Measure Involving Individuals

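Let $c$ and $d$ be two individuals in a given A-Box. We can consider $C^* = \mathrm{MSC}^*(c)$ and $D^* = \mathrm{MSC}^*(d)$:

$$s(c, d) := s(C^*, D^*) = s(\mathrm{MSC}^*(c), \mathrm{MSC}^*(d))$$

Analogously, for every individual $c$ and concept $D$:

$$s(c, D) := s(\mathrm{MSC}^*(c), D)$$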

SLIDE 19

Discussion

  • The similarity value is mainly determined by the amount of overlap between the sets of individuals that are extensions of the concepts involved, considering also their sub-concepts
      • the influence of sub-concepts in determining the similarity value decreases w.r.t. their nesting level
  • The similarity measure is defined recursively
      • its complexity mainly depends on the complexity of the instance checking operator
      • limited to primitive concepts, it can be pre-compiled

SLIDE 20

Conclusions

  • The presented function s is a similarity measure
      • it is positive definite, symmetric, and has its maximal value only when the concepts are equivalent
  • The presented similarity measure is based on the A-Box semantics and is also applicable to pairs of individuals, or to a concept and an individual
  • s is defined using set theory and reasoning operators
  • It uses a numerical approach but is applied to symbolic representations

SLIDE 21

Further Developments

  • Testing the similarity measure using some classification and clustering algorithms (ongoing)
  • Extending the measure to more expressive DLs such as ALCN
  • Defining new similarity/dissimilarity measures for DL representations using kernel functions, which are a means to express a notion of similarity in some unknown feature space. This would make it possible to exploit the efficiency of kernel methods (e.g. SVMs) in a relational setting

SLIDE 22

The End

Thanks For Your Attention
