Exact Query Reformulation with First-order Ontologies and Databases


slide-1
SLIDE 1

Exact Query Reformulation with First-order Ontologies and Databases

Nhung Ngo

Free University of Bolzano

Joint work with Enrico Franconi and Volga Kernet

Nov 13, 2013

slide-2
SLIDE 2

Motivation
DBox
Query Determinacy
Exact Safe-range Query Reformulation
Application: Query Answering over an Expressive DL and DBox
Conclusion

slide-3
SLIDE 3

Query Answering Under Constraints and Databases

◮ Given a FOL fragment L
◮ Constraints: a set of L-sentences KB
◮ Database: a finite set of facts DB over a relational signature PDB
◮ Query: an (open) L-formula

slide-7
SLIDE 7

Ontology-based Data Access

◮ Given a description logic fragment DL
◮ TBox: a set of TBox assertions T
◮ Database: a set of ABox assertions A
◮ Query: a concept query

slide-12
SLIDE 12

ABox

◮ Syntactically, an ABox is a finite set of ground atomic facts.

Example

A = {Employee(Jon), Project(Winter)} is an ABox

◮ The semantics of ABoxes is given by first-order structures.
◮ An ABox does not correspond to a single first-order structure!

slide-16
SLIDE 16

ABox Semantics

Definition

A first-order structure I is a model of an ABox A if I ⊇ A

An ABox is an incomplete database, i.e., an ABox represents a class of databases.

Example

Given the ABox A = {Employee(Jon), Project(Winter)}

◮ {Employee(Jon), Project(Winter)} is a model of A
◮ {Employee(Jon), Employee(Rob), Project(Winter)} is a model of A
◮ {Employee(Jon), Employee(Rob), Project(River)} is not a model of A
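The ABox model condition I ⊇ A can be sketched as a plain superset test. The encoding of facts as (predicate, arguments) pairs and the name `is_abox_model` are illustrative assumptions, not notation from the slides.

```python
# Facts are encoded as (predicate, argument-tuple) pairs; this encoding is an
# illustrative assumption, not notation from the slides.
Fact = tuple[str, tuple[str, ...]]


def is_abox_model(interpretation: set[Fact], abox: set[Fact]) -> bool:
    """Open-world check: I is a model of A iff I contains every fact of A."""
    return interpretation >= abox


A = {("Employee", ("Jon",)), ("Project", ("Winter",))}

I1 = {("Employee", ("Jon",)), ("Project", ("Winter",))}
I2 = {("Employee", ("Jon",)), ("Employee", ("Rob",)), ("Project", ("Winter",))}
I3 = {("Employee", ("Jon",)), ("Employee", ("Rob",)), ("Project", ("River",))}

print(is_abox_model(I1, A))  # True
print(is_abox_model(I2, A))  # True: extra facts are allowed
print(is_abox_model(I3, A))  # False: Project(Winter) is missing
```

Because any superset qualifies, a single ABox admits many models, which is the sense in which it represents a class of databases.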

slide-20
SLIDE 20

Certain Answers

Definition

Given:

◮ a TBox T and an ABox A
◮ a concept query Q and an individual a

Question: (T, A) ⊨ Q(a)? That is, for all models I of (T, A), is it the case that I ⊨ Q(a)?

slide-21
SLIDE 21

ABox vs Database

◮ Additional constraint as a standard view over data:

∀x. Bad-Project(x) ↔ Project(x) ∧ ¬∃y. Work-for(y, x)

◮ Database:

Work-for = {<Jon, Winter>, <Rob, Winter>}
Project = {Winter, River}

◮ Q(x) := Bad-Project(x)

⇒ {River}

◮ ABox:

Work-for ⊇ {<Jon, Winter>, <Rob, Winter>}
Project ⊇ {Winter, River}

◮ Q(x) := Bad-Project(x)

⇒ {}

The ABox does not scale down to the standard DB answer: some model of the ABox may contain a Work-for fact about River, so River is not a certain answer.
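The gap between the two readings can be illustrated with a small sketch. It evaluates the Bad-Project view once over the stored tuples (closed reading) and once over one possible world of the ABox (open reading). The function name `bad_projects` and the extra fact Work-for(Jon, River) are hypothetical and chosen only for illustration.

```python
def bad_projects(project: set[str], work_for: set[tuple[str, str]]) -> set[str]:
    """Evaluate Bad-Project(x) <-> Project(x) and not exists y. Work-for(y, x)."""
    staffed = {p for (_, p) in work_for}
    return project - staffed


project = {"Winter", "River"}
work_for = {("Jon", "Winter"), ("Rob", "Winter")}

# Closed reading (database): the relations are exactly the stored tuples.
db_answer = bad_projects(project, work_for)
print(db_answer)  # {'River'}

# Open reading (ABox): any superset of the stored tuples is a possible world.
# One such world adds the hypothetical fact Work-for(Jon, River).
world = work_for | {("Jon", "River")}
answer_in_world = bad_projects(project, world)
print(answer_in_world)  # set()

# A certain answer must hold in every possible world, so it must lie in the
# intersection of the answers over the worlds considered here.
print(db_answer & answer_in_world)  # set()
```

Two worlds are enough here: the intersection is already empty, so no project can be a certain answer under the ABox reading.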

slide-28
SLIDE 28

ABox vs Database

◮ ABox:

Work-for ⊇ {<Jon, Winter>}
Project ⊇ {Winter, River}

◮ Query as a standard view over database:

Q(x) := ∃y. Work-for(y, x)
Q = Π2 Work-for

◮ Q = EVAL(Π2 Work-for)

⇒ {Winter, River}

◮ Q = Π2(EVAL(Work-for))

⇒ {Winter}

Queries are not compositional w.r.t. certain answer semantics!

slide-34
SLIDE 34

Our Proposal

We propose DBox as a way to model complete data

slide-35
SLIDE 35

DBox

Syntactically, a DBox is a finite set of ground atomic facts.

Example

D = {Employee(Jon), Project(Winter)} is a DBox

Definition

A first-order structure I is a model of a DBox D if

◮ I ⊇ D
◮ for every predicate P in the signature of D, P^I = {ā | P(ā) ∈ D}

A DBox captures the exact semantics of a database.

slide-38
SLIDE 38

DBox example

Given the DBox D = {Employee(Jon), Project(Winter)}

◮ {Employee(Jon), Project(Winter)} is a model of D
◮ {Employee(Jon), Manager(Rob), Project(Winter)} is a model of D, since Manager is not in the signature of D
◮ {Employee(Jon), Employee(Rob), Project(Winter)} is not a model of D
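The DBox model condition can be sketched as an equality test restricted to the DBox predicates. The sketch assumes that the signature of D consists of the predicates that occur in D, and it reuses a (predicate, arguments) encoding of facts; both choices, and the name `is_dbox_model`, are illustrative.

```python
Fact = tuple[str, tuple[str, ...]]


def is_dbox_model(interpretation: set[Fact], dbox: set[Fact]) -> bool:
    """I is a model of D iff I agrees with D exactly on the predicates of D."""
    # Assumption: the signature of D is the set of predicates occurring in D.
    signature = {pred for (pred, _) in dbox}
    restricted = {fact for fact in interpretation if fact[0] in signature}
    return restricted == dbox


D = {("Employee", ("Jon",)), ("Project", ("Winter",))}

J1 = {("Employee", ("Jon",)), ("Project", ("Winter",))}
J2 = {("Employee", ("Jon",)), ("Manager", ("Rob",)), ("Project", ("Winter",))}
J3 = {("Employee", ("Jon",)), ("Employee", ("Rob",)), ("Project", ("Winter",))}

print(is_dbox_model(J1, D))  # True
print(is_dbox_model(J2, D))  # True: Manager is outside the signature of D
print(is_dbox_model(J3, D))  # False: Employee(Rob) extends a DBox predicate
```

Predicates outside the DBox signature stay open, which is why a DBox can coexist with a TBox that talks about further predicates.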

slide-42
SLIDE 42

Certain Answers

Definition

Given:

◮ an ontology containing a TBox T and, optionally, an ABox A (written [A])
◮ a DBox D
◮ a concept query Q and an individual a

Question: (T, [A], D) ⊨ Q(a)? That is, for all models I of (T, [A], D), is it the case that I ⊨ Q(a)?

slide-43
SLIDE 43

Conjunctive Query Answering: DBox vs ABox

◮ Complexity w.r.t. ABox

Results by Calvanese, de Giacomo, Lembo, Lenzerini, Rosati, Krisnadhi, Lutz.

Description Logic    KB-type       Combined       Data
DL-Lite[core|F]      (T, A)        NP-complete    in AC0
EL                   (T, A)        NP-complete    PTIME-complete

◮ Complexity w.r.t. DBox

Results by Franconi, Seylan, Angélica Ibáñez-García.

Description Logic    KB-type       Combined       Data
DL-Lite[core|F]      (T, D)        coNP-hard      coNP-complete
DL-Lite[core|F]      (T, D, [A])   EXPTIME-hard   coNP-complete
EL                   (T, A)        EXPTIME-hard   coNP-hard

DBoxes are a natural but expensive formalism.

slide-47
SLIDE 47

DBox is expensive

◮ Do not consider every TBox ⇒ Safe TBox (Lutz & Wolter, 2011)
◮ Do not consider every query ⇒ Determinacy

slide-50
SLIDE 50

Determinacy

Given a TBox T and a DBox D, a query Q is determined by the DBox predicates PD if its answer depends functionally only on the extension of the DBox predicates.

Definition (Determinacy)

Let I_i(D) and I_j(D) be any two models of a TBox T embedding a DBox D.

A query Q is determined by the DBox predicates PD given the TBox T if the answer of Q over I_i(D) is the same as the answer of Q over I_j(D).

Checking determinacy of a query with a first-order TBox is reducible to entailment.

slide-53
SLIDE 53

Determinacy: Example

Let the DBox predicates be PD = {Mother, Father} and let T consist of:

  1. ∀x. Mother(x) ↔ ∃y. (Woman(x) ∧ hasChild(x, y))
  2. ∀x. Father(x) ↔ ∃y. (Man(x) ∧ hasChild(x, y))

and let Q = ∃x ∃y ((Woman(x) ∨ Man(x)) ∧ hasChild(x, y))

Fact

Q is determined by the DBox predicates given the TBox.
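As a finite sanity check of this fact (not a proof), the sketch below enumerates every interpretation of Woman, Man and hasChild over a two-element domain, derives Mother and Father from the two axioms, and verifies that the truth value of Q is a function of the Mother and Father extensions. The helper names `subsets` and `determined_on` and the domain {a, b} are illustrative assumptions.

```python
from itertools import chain, combinations, product


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def determined_on(domain) -> bool:
    """Check that Q's truth value is a function of (Mother, Father) on `domain`."""
    seen = {}  # (Mother, Father) extensions -> truth value of Q
    pairs = list(product(domain, repeat=2))
    for woman, man, has_child in product(
            subsets(domain), subsets(domain), subsets(pairs)):
        parents = {x for (x, _) in has_child}
        # The two TBox axioms fix Mother and Father from the other predicates.
        mother = frozenset(woman & parents)
        father = frozenset(man & parents)
        # Q = exists x, y. (Woman(x) or Man(x)) and hasChild(x, y)
        q = any(x in woman or x in man for x in parents)
        key = (mother, father)
        if seen.setdefault(key, q) != q:
            return False  # two models agree on the DBox predicates but not on Q
    return True


print(determined_on(["a", "b"]))  # True
```

The check covers 256 interpretations. A failure would exhibit two models that agree on the DBox predicates and disagree on Q, which is exactly what determinacy rules out.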

slide-55
SLIDE 55

Why Determinacy?

◮ It captures exactly the notion of "non-ambiguous" answers.
◮ Consider the models of a TBox T with a set of DBox predicates PD. A query Q determined by PD generates an "extended" DBox augmented with the relation associated with the determined query.
◮ Arbitrary determined queries can be composed and decomposed without affecting the outcome.

slide-58
SLIDE 58

Exact reformulation

Theorem (Beth, 1953)

Given a TBox T and a DBox D, a query Q is determined by the DBox predicates PD if and only if there exists an exact reformulation of Q.

Definition

A reformulation Q̂ of a query Q, expressed in terms of the DBox predicates PD, is exact if T ⊨ ∀x. (Q(x) ↔ Q̂(x)).

slide-60
SLIDE 60

Example

Let the DBox predicates be PD = {Mother, Father} and let T consist of:

  1. ∀x. Mother(x) ↔ ∃y. (Woman(x) ∧ hasChild(x, y))
  2. ∀x. Father(x) ↔ ∃y. (Man(x) ∧ hasChild(x, y))

and let Q = ∃x ∃y ((Woman(x) ∨ Man(x)) ∧ hasChild(x, y))

Fact

T ⊨ Q ↔ ∃x (Mother(x) ∨ Father(x)).

∃x (Mother(x) ∨ Father(x)) is an exact reformulation of Q.
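The equivalence can be checked by hand. The derivation below is a sketch that first distributes the conjunction and the quantifier ∃y over the disjunction (a purely logical step) and then applies the two axioms of T; the symbol ≡_T is used here, as an assumption of notation, for equivalence in all models of T.

```latex
\begin{align*}
Q &= \exists x\,\exists y\,\bigl((\mathit{Woman}(x) \lor \mathit{Man}(x)) \land \mathit{hasChild}(x,y)\bigr) \\
  &\equiv \exists x\,\bigl(\exists y\,(\mathit{Woman}(x) \land \mathit{hasChild}(x,y))
      \;\lor\; \exists y\,(\mathit{Man}(x) \land \mathit{hasChild}(x,y))\bigr) \\
  &\equiv_{\mathcal{T}} \exists x\,\bigl(\mathit{Mother}(x) \lor \mathit{Father}(x)\bigr)
\end{align*}
```

The last line mentions only Mother and Father, so it can be evaluated directly over the DBox.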

slide-62
SLIDE 62

Safe-range Queries

◮ The safe-range fragment is a syntactic fragment of FOL
◮ Intuition: A formula is safe-range if its variables (free and quantified) are bounded by positive predicates or equalities.
  ◮ ∃x. A(x) ∧ ¬B(x) - safe-range
  ◮ A(x) ∨ B(x) - safe-range
  ◮ ∀x. C(x) - not safe-range
◮ An open query is ground safe-range if its grounding is safe-range.
◮ The safe-range fragment of FOL is as expressive as the domain independent fragment of FOL and as relational algebra, the core of SQL. ⇒ SQL can be used for the evaluation of safe-range formulas.
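To make the link to SQL concrete, the following sketch evaluates two safe-range formulas with Python's built-in `sqlite3` module. It uses the open variant A(x) ∧ ¬B(x) of the first example formula and the disjunction A(x) ∨ B(x). The table layout and the sample constants a1, a2, b1 are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (x TEXT PRIMARY KEY);
    CREATE TABLE B (x TEXT PRIMARY KEY);
    INSERT INTO A VALUES ('a1'), ('a2');
    INSERT INTO B VALUES ('a2'), ('b1');
""")

# A(x) AND NOT B(x): the negated atom is guarded by the positive atom A(x),
# so the answers can only come from the stored table A.
open_query = "SELECT x FROM A WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.x = A.x)"
diff_rows = [row[0] for row in conn.execute(open_query)]
print(diff_rows)  # ['a1']

# A(x) OR B(x): a union of two positive atoms over the same variable.
union_rows = sorted(
    row[0] for row in conn.execute("SELECT x FROM A UNION SELECT x FROM B"))
print(union_rows)  # ['a1', 'a2', 'b1']
```

A formula such as ∀x. C(x) has no counterpart of this kind: its truth depends on the whole domain rather than on the stored tuples, which is why it falls outside the safe-range fragment.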

slide-71
SLIDE 71

Query Answering: from entailment to model checking

Theorem

Let T be an ontology, Q be a query, D be a consistent DBox for T, and AD the set of all individuals in D and Q. If Q̂ is

◮ an exact reformulation of Q,
◮ safe-range,

then Ans(Q, D, T) = Ans(Q̂, D, {}) = {a | AD, D ⊨ Q̂(a)}

The original query answering problem (based on entailment) is reduced to the problem of checking the validity of the reformulation Q̂ over the single interpretation given by the DBox with the active domain (a model checking problem), which can be executed by an SQL engine.

slide-73
SLIDE 73

Problem Statement

Given

◮ an ontology T in L;
◮ a DBox D in L;
◮ a concept query Q in L.

We need to solve the following PROBLEM:

◮ find a first-order logic safe-range exact reformulation of Q expressed in terms of DBox predicates.

slide-75
SLIDE 75

Semantic Characterisation Theorem

Given a set of database predicates PDB, a domain independent ontology T, and a query Q, a domain independent exact reformulation Q̂ of Q over PDB under T exists if and only if Q is implicitly definable from PDB under T and it is domain independent with respect to T.

slide-76
SLIDE 76

Constructive Theorem

If:

  1. T ∪ T̃ ⊨ ∀X. Q[X] ↔ Q̃[X], where T̃ and Q̃ are copies of T and Q with every non-database predicate renamed (that is, Q[X] is implicitly definable),
  2. Q is safe-range (that is, Q is domain independent),
  3. T is safe-range (that is, T is domain independent),

then there exists an exact reformulation Q̂ of Q as a safe-range query in FOL(C, P) over PDB under T, that can be obtained constructively.

slide-77
SLIDE 77

Oops! But databases are finite

Beth's theorem fails if we consider only models with a finite interpretation of database predicates.

Let P = {P, R, A}, PDB = {P, R}, and let T consist of:

  1. ∀x, y, z. R(x, y) ∧ R(x, z) → y = z,
  2. ∀x, y. R(x, y) → ∃z. R(z, x),
  3. (∀x, y. R(x, y) → ∃z. R(y, z)) → (∀x. A(x) ↔ P(x))

Fact

◮ ∀x, y. R(x, y) → ∃z. R(y, z) is entailed by the first two formulas only over finite interpretations of R.
◮ Query Q = A(x) is finitely determined by DBox predicates, but it is not determined under models with an unrestricted interpretation of R.
◮ This knowledge base does not enjoy finitely controllable determinacy.
◮ The characterization theorem fails too.
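The finite half of the first fact can be sanity-checked by brute force. The sketch below enumerates every binary relation R over domains of up to three elements and confirms that axioms 1 and 2 force ∀x, y. R(x, y) → ∃z. R(y, z). All function names are illustrative, and the bound of three elements is an arbitrary assumption that keeps the search small; this is a check on small cases, not a proof.

```python
from itertools import combinations, product


def all_relations(domain):
    """Yield every binary relation over `domain` as a set of pairs."""
    pairs = list(product(domain, repeat=2))
    for size in range(len(pairs) + 1):
        for chosen in combinations(pairs, size):
            yield set(chosen)


def is_functional(r):
    """Axiom 1: R(x, y) and R(x, z) imply y = z."""
    return all(y == z for (x, y) in r for (x2, z) in r if x == x2)


def sources_have_predecessor(r):
    """Axiom 2: R(x, y) implies exists z. R(z, x)."""
    targets = {y for (_, y) in r}
    return all(x in targets for (x, _) in r)


def targets_have_successor(r):
    """Consequence: R(x, y) implies exists z. R(y, z)."""
    sources = {x for (x, _) in r}
    return all(y in sources for (_, y) in r)


def consequence_holds_up_to(max_size: int) -> bool:
    """Check the entailment on every domain with at most `max_size` elements."""
    for n in range(1, max_size + 1):
        for r in all_relations(range(n)):
            if is_functional(r) and sources_have_predecessor(r):
                if not targets_have_successor(r):
                    return False
    return True


print(consequence_holds_up_to(3))  # True
```

Over an infinite domain the entailment breaks: with R = {(n + 1, n) | n a natural number}, axioms 1 and 2 hold, yet 0 occurs as a target and has no successor.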

slide-80
SLIDE 80

Finitely controllable determinacy

Definition

Given a set of database predicates PDB and an ontology T, we say that the determinacy of a query Q is finitely controllable if Q is determined iff Q is determined under the models of T where the extensions of database predicates are finite.

Finitely controllable determinacy guarantees the completeness of the characterization theorem.

Definition

A fragment L is said to have the finitely controllable determinacy property if the determinacy of every query is finitely controllable.

slide-82
SLIDE 82

The logic fragment that we want

◮ Domain independence/Safe-range
◮ Finitely controllable determinacy
◮ As expressive as possible

slide-85
SLIDE 85

ALCHOIQ

Syntax and semantics of ALCHOIQ concepts and roles

Syntax     Semantics
A          A^I ⊆ Δ^I
{o}        {o}^I ⊆ Δ^I
P          P^I ⊆ Δ^I × Δ^I
P⁻         {(y, x) | (x, y) ∈ P^I}
¬C         Δ^I \ C^I
C ⊓ D      C^I ∩ D^I
C ⊔ D      C^I ∪ D^I
≥ nR       {x | #({y | (x, y) ∈ R^I}) ≥ n}
≥ nR.C     {x | #({y | (x, y) ∈ R^I} ∩ C^I) ≥ n}
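The semantics of inverse roles and number restrictions can be read as set operations over a finite interpretation. The sketch below is a minimal illustration that assumes n ≥ 1 and represents a role as a set of pairs and a concept as a set of individuals. The names `inverse` and `at_least` are hypothetical, and the sample data reuses the Work-for facts from the earlier slides.

```python
def inverse(role):
    """P^- : swap the two components of every pair in the role."""
    return {(y, x) for (x, y) in role}


def at_least(n, role, concept=None):
    """Extension of (>= n R), or of (>= n R.C) when `concept` is given.

    Assumes n >= 1, so only individuals with at least one R-successor
    need to be inspected.
    """
    result = set()
    for x in {a for (a, _) in role}:
        successors = {b for (a, b) in role if a == x}
        if concept is not None:
            successors &= concept  # keep only the successors that are in C
        if len(successors) >= n:
            result.add(x)
    return result


work_for = {("Jon", "Winter"), ("Rob", "Winter")}

print(at_least(2, inverse(work_for)))           # {'Winter'}
print(at_least(2, inverse(work_for), {"Jon"}))  # set()
```

The first call reads as "projects with at least two workers"; the second restricts the counted workers to the set {Jon} and therefore returns nothing.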

slide-86
SLIDE 86

ALCHOIQGN

Syntax of ALCHOIQGN concepts and roles

R ::= P | P⁻
B ::= A | {o} | ≥ nR
C ::= B | ≥ nR.C | ≥ nR.¬C | B ⊓ ¬C | C ⊓ D | C ⊔ D

In GN fragments, only guarded negations are allowed.
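The grammar can be turned into a small membership check. This is a sketch only, assuming a hypothetical tuple encoding of concepts (`("not", C)`, `("and", C, D)`, `("or", C, D)`, `("nominal", o)`, `("atleast", n, R)` and `("atleast", n, R, C)`); it follows the productions literally, so in B ⊓ ¬C the guard must be the left conjunct.

```python
def is_basic(c):
    """B ::= A | {o} | >= n R"""
    if isinstance(c, str):
        return True
    return c[0] == "nominal" or (c[0] == "atleast" and len(c) == 3)

def is_negation(c):
    return isinstance(c, tuple) and c[0] == "not"

def is_gn(c):
    """C ::= B | >= n R.C | >= n R.¬C | B ⊓ ¬C | C ⊓ D | C ⊔ D"""
    if is_basic(c):
        return True
    op = c[0]
    if op == "atleast":
        filler = c[3]
        # The filler may be negated: the number restriction acts as the guard.
        return is_gn(filler[1]) if is_negation(filler) else is_gn(filler)
    if op == "and":
        left, right = c[1], c[2]
        if is_negation(right):
            # B ⊓ ¬C: the negation is guarded by a basic concept.
            return is_basic(left) and is_gn(right[1])
        return is_gn(left) and is_gn(right)
    if op == "or":
        return is_gn(c[1]) and is_gn(c[2])
    return False  # a bare negation is not allowed

print(is_gn(("and", "person", ("not", "male"))))  # guarded negation
print(is_gn(("not", "male")))                     # unguarded negation
```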

slide-88
SLIDE 88

Why does ALCHOIQGN make sense?

◮ Non-guarded negation should not appear in a cleanly designed ontology, and, if present, it should be fixed.

◮ The use of absolute negative information, as in “a non-male is a female” (¬male ⊑ female), is not meaningful in conceptual modelling, since the subsumer includes all sorts of objects in the universe. Only guarded negative information in the subsumee should be allowed: “a non-male person is a female” (person ⊓ ¬male ⊑ female).
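The difference between the two axioms shows up as soon as the domain changes while the data stay the same. A minimal sketch, with hypothetical extensions for `male` and `person`:

```python
male = {"bob"}
person = {"ann", "bob"}

def non_male(domain):
    """Unguarded negation: the complement depends on the whole domain."""
    return domain - male

def non_male_person(domain):
    """Guarded negation: the guard `person` bounds the result."""
    return person & (domain - male)

small = {"ann", "bob"}
large = small | {"rome", 42}   # same data, larger universe

print(non_male(small), non_male(large))
print(non_male_person(small), non_male_person(large))
```

The unguarded concept picks up `"rome"` and `42` in the larger domain, so the axiom ¬male ⊑ female would force them to be female; the guarded concept returns the same set in both domains.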

slide-90
SLIDE 90

ALCHOIQGN

◮ ALCHOIQGN TBoxes and concept queries are domain independent.

◮ Expressive power equivalence

Theorem

The domain independent fragment of ALCHOIQ and ALCHOIQGN are equally expressive.

◮ Finitely controllable determinacy

Theorem

ALCHOIQGN TBoxes with concept queries have finitely controllable determinacy.

slide-95
SLIDE 95

A complete procedure

Input: An ALCHOIQGN TBox T, a concept query Q in ALCHOIQGN, and a database signature (database atomic concepts and roles).

1. Check the implicit definability of the query Q by testing T ∪ T̃ ⊨ Q ≡ Q̃ using a standard OWL2 reasoner (ALCHOIQGN is a sub-language of OWL2), where T̃ and Q̃ are copies of T and Q in which the non-database predicates are renamed. If the entailment holds, continue.

2. Compute the Craig interpolant Q̂(x) based on the tableau proof of ((⋀T) ∧ Q(x)) → ((⋀T̃) → Q̃(x)).

3. For each free variable x which is not bound by any positive predicate in Q̂[X], do Q̂[X] := Q̂[X] ∧ Adom_Q̂(x).

Output: A safe-range reformulation Q̂ expressed over the database signature.
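Step 3 of the procedure can be illustrated in isolation. The sketch below assumes a hypothetical DBox stored as a dictionary from predicate names to sets of tuples, and a hypothetical reformulated query ¬Male(x) whose free variable is not bound by any positive predicate. Conjoining the active-domain predicate restricts the answers to values that actually occur in the database (plus any query constants). It does not implement the definability test or the interpolant computation.

```python
def active_domain(dbox, constants=()):
    """All values occurring in some database tuple, plus the given constants."""
    values = set(constants)
    for tuples in dbox.values():
        for t in tuples:
            values.update(t)
    return values

def answer_with_adom(dbox, formula, constants=()):
    """Evaluate formula(x) ∧ Adom(x): candidates range over the active domain only."""
    adom = active_domain(dbox, constants)
    return {x for x in adom if formula(x, dbox)}

# Hypothetical DBox used only for illustration.
dbox = {
    "Person": {("ann",), ("bob",)},
    "Male": {("bob",)},
    "hasChild": {("ann", "cat")},
}

# Hypothetical reformulated query: ¬Male(x), guarded by the active domain.
answers = answer_with_adom(dbox, lambda x, db: (x,) not in db["Male"])
print(answers)
```

Without the active-domain conjunct the negation would range over an unspecified universe; with it, the query can be evaluated as an ordinary relational algebra difference.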

slide-98
SLIDE 98

Conclusion and Future Work

Conclusion:

◮ We introduced a framework to compute an exact reformulation of a concept query under a description logic ontology over some set of concept and role names (DBox predicates).

◮ We found the conditions which guarantee that a safe-range exact reformulation exists.

◮ We proved that such a safe-range exact reformulation, evaluated as a relational algebra query over the DBox, gives the same answer as the original query under the ontology.

◮ An application of the framework to description logics was studied.

Future Work:

◮ Study optimisations of reformulations.

◮ How to choose the best reformulation in terms of query evaluation?
