Description Logics for Integration (Y. Angélica Ibáñez-García)

SLIDE 1

Ontology-based Data Integration Description Logics Description Logic-based Data Integration Discussion

Description Logics for Integration

Y. Angélica Ibáñez-García

KRDB Research Centre, Faculty of Computer Science, Free University of Bozen-Bolzano

DEIS 2010, 8-12 Nov.

SLIDE 2

Outline

1. Ontology-based Data Integration
   ◮ OB Data Integration Framework
   ◮ Issues in OB Data Integration
2. Description Logics
   ◮ Reasoning in DLs
   ◮ Query answering on Ontologies
   ◮ Tractable DLs
3. Description Logic-based Data Integration
4. Discussion
   ◮ Query rewriting
   ◮ Non-monotonic negation

SLIDE 4

Ontology-based Data Integration Framework

OB Data integration:

◮ unified and transparent access, through a global (or target) schema,
◮ to a collection of data stored in multiple, autonomous, and heterogeneous data sources.

More formally: a triple $\langle \mathcal{G}, \mathcal{S}, \mathcal{M} \rangle$ where

◮ $\mathcal{G}$: global schema, viewed as a conceptual schema, expressed in logic (ontology)
◮ $\mathcal{S}$: data sources, wrapped as relational databases
◮ $\mathcal{M}$: mappings, which semantically link data at the sources ($\mathcal{S}$) with the ontology ($\mathcal{G}$)

SLIDE 5

Problems in OB Data Integration

How to model the global schema:

◮ provide a description of the data of interest in semantic terms,
◮ represent the global view as a conceptual schema,
◮ formalize it as a logical theory (ontology),
◮ use the resulting logical theory for reasoning (e.g. query answering).

How to model the sources and the mappings.

How to answer queries expressed on the global schema.

SLIDE 7

Description Logics in a Nutshell

Logics specifically designed to represent and reason on structured knowledge:

◮ Concepts: sets of objects
◮ Roles: binary relations between (instances of) concepts

Knowledge Bases, aka Ontologies:

◮ Intensional Knowledge: TBoxes, general properties of concepts
◮ Extensional Knowledge: ABoxes, assertions about individuals/objects

Nice computational properties: decidability, tractability (in some cases).

Trade-off between expressive power and computational complexity of reasoning.

SLIDE 8

Current applications of Description Logics

DLs have evolved from being used “just” in KR.

Novel applications of DLs:

◮ Databases:
  ⋆ schema design, schema evolution
  ⋆ query optimization
  ⋆ integration of heterogeneous data sources, data warehousing
◮ Conceptual modeling
◮ Foundation for the Semantic Web (variants of OWL correspond to specific DLs)
◮ . . .

SLIDE 9

Reasoning over an Ontology

Reasoning Services

◮ Ontology Satisfiability: $\mathcal{O}$ admits at least one model.
◮ Concept Instance Checking: $c$ is an instance of a concept $C$ in every model of $\mathcal{O}$.
◮ Role Instance Checking: a pair $(a_1, a_2)$ of individuals is an instance of a role $R$ in every model of $\mathcal{O}$.
◮ Query Answering: computing the certain answers to a query over $\mathcal{O}$.

SLIDE 10

Query answering on Ontologies

An ontology imposes constraints on the data. Actual data may be incomplete or inconsistent w.r.t. such constraints.

$q,\ \mathcal{T},\ \mathcal{A} \;\longrightarrow\; \text{Logical Inference} \;\longrightarrow\; \mathit{cert}(q, \mathcal{T}, \mathcal{A})$

To be able to deal with data efficiently: separate the contribution of $\mathcal{A}$ from the contribution of $q$ and $\mathcal{T}$.

⇝ Query answering by query rewriting
⇝ Query answering by data completion

SLIDE 11

Queries over ontologies

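A Conjunctive Query (CQ) over an Ontology $\mathcal{O} = \langle \mathcal{T}, \mathcal{A} \rangle$ has the form:

$q(\vec{x}) \leftarrow \mathit{conj}(\vec{x}, \vec{y})$

where $\vec{x}$ denotes the distinguished variables, $\vec{y}$ the non-distinguished variables, and $\mathit{conj}(\vec{x}, \vec{y})$ is a conjunction of atoms. The predicates in atoms are concepts and roles of the ontology.

Union of Conjunctive queries (UCQ), in Datalog notation:

$Q(\vec{x}) \leftarrow \mathit{conj}_1(\vec{x}, \vec{y}_1)$
$\dots$
$Q(\vec{x}) \leftarrow \mathit{conj}_n(\vec{x}, \vec{y}_n)$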

SLIDE 12

Semantics of Queries

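Let $\mathcal{O} = \langle \mathcal{T}, \mathcal{A} \rangle$ be an ontology, $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ an interpretation of $\mathcal{O}$, and $q(\vec{x}) \leftarrow \varphi(\vec{x}, \vec{y})$ a CQ.

The answer to $q(\vec{x}) \leftarrow \varphi(\vec{x}, \vec{y})$ over $\mathcal{I}$, denoted $q^{\mathcal{I}}$, is the set of tuples $\vec{c}$ of constants of $\mathcal{A}$ such that there exists a tuple $\vec{o} \in \Delta^{\mathcal{I}} \times \dots \times \Delta^{\mathcal{I}}$ for which the formula $\varphi(\vec{c}, \vec{y})$ evaluates to true in $\mathcal{I}[\vec{y}/\vec{o}]$.

The certain answers to $q(\vec{x})$ over $\mathcal{O} = \langle \mathcal{T}, \mathcal{A} \rangle$, denoted $\mathit{cert}(q, \mathcal{O})$, are the tuples $\vec{c}$ of constants of $\mathcal{A}$ such that $\vec{c}$ is an answer of $q$ ($\vec{c} \in q^{\mathcal{I}}$) in every model $\mathcal{I}$ of $\mathcal{O}$.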
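The two definitions above can be made concrete on finite interpretations. The following is a minimal Python sketch, not part of the original slides: the function names `answers` and `certain_answers`, the encoding of an interpretation as a dictionary from predicate names to sets of tuples, and the sample data are all assumptions made for illustration. An ontology generally has infinitely many models, so intersecting over a hand-picked finite list of models only illustrates the definition; it is not a query answering procedure.

```python
from itertools import product

def answers(head, body, interp, domain):
    """q^I: head tuples for which some assignment of all variables satisfies every body atom."""
    variables = sorted({v for _, args in body for v in args})
    result = set()
    for values in product(sorted(domain), repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(tuple(env[v] for v in args) in interp.get(pred, set())
               for pred, args in body):
            result.add(tuple(env[v] for v in head))
    return result

def certain_answers(head, body, models, constants):
    """Tuples of ABox constants that are answers in every one of the given models."""
    per_model = [answers(head, body, interp, domain) for interp, domain in models]
    common = set.intersection(*per_model)
    return {t for t in common if all(c in constants for c in t)}

# Two hypothetical models of: ABox {employee(ann)}, TBox {employee ⊑ ∃WORKS-FOR}.
# They disagree on which project ann works for.
m1 = ({"employee": {("ann",)}, "WORKS-FOR": {("ann", "p1")}}, {"ann", "p1"})
m2 = ({"employee": {("ann",)}, "WORKS-FOR": {("ann", "p2")}}, {"ann", "p2"})
body = [("WORKS-FOR", ("x", "y"))]

who = certain_answers(("x",), body, [m1, m2], {"ann"})        # q(x)    <- WORKS-FOR(x, y)
pairs = certain_answers(("x", "y"), body, [m1, m2], {"ann"})  # q(x, y) <- WORKS-FOR(x, y)
```

Under these assumptions `who` contains the tuple for `ann`, since she works for something in both models, while `pairs` is empty because no specific project is shared by the two models.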

SLIDE 13

Tractable Description Logics

DL-Lite:

◮ family of DLs optimized according to the tradeoff between expressive power and complexity of query answering, with emphasis on data
◮ Nice computational properties for answering UCQs
  ⋆ same data complexity as relational databases
  ⋆ query answering can be delegated to a relational DB engine
◮ Captures conceptual modeling formalisms
◮ Is at the basis of the OWL2 QL profile of OWL2

EL:

◮ is particularly suitable for applications employing ontologies that define very large numbers of classes and/or properties
◮ ontology consistency, class expression subsumption, and instance checking can be decided in polynomial time
◮ e.g. the very large biomedical ontology SNOMED CT (≈ 400,000 axioms)

SLIDE 14

DL-Lite_A Syntax

Concept expressions:
  $B ::= A \mid \exists Q \mid \delta(U_C)$
  $C ::= \top_C \mid B \mid \neg B \mid \exists Q.C$

Value-domain expressions:
  $E ::= \rho(U_C)$
  $F ::= \top_D \mid T_1 \mid \cdots \mid T_n$

Role expressions:
  $Q ::= P \mid P^-$
  $R ::= Q \mid \neg Q$

Attribute expressions:
  $V_C ::= U_C \mid \neg U_C$

SLIDE 15

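Semantics of DL-Lite_A: objects vs. values

Definition (An interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$)

                              | Objects | Values
Domain: $\Delta^{\mathcal{I}}$ | $\Delta^{\mathcal{I}}_O$ | $\Delta^{\mathcal{I}}_V$
Constants: $\Gamma$            | $c \in \Gamma_O$, $c^{\mathcal{I}} \in \Delta^{\mathcal{I}}_O$ | $d \in \Gamma_V$, $d^{\mathcal{I}} \in \Delta^{\mathcal{I}}_V$
Concepts / Types               | Concept $C$, $C^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}_O$ | RDF datatype $T_i$, $T_i^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}_V$
Roles / Attributes             | Role $R$, $R^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}_O \times \Delta^{\mathcal{I}}_O$ | Attribute $V$, $V^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}_O \times \Delta^{\mathcal{I}}_V$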

SLIDE 16

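Semantics of DL-Lite_A constructs

Construct | Syntax | Semantics
top concept | $\top_C$ | $\Delta^{\mathcal{I}}_O$
negation | $\neg C$ | $\Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$
existential restriction | $\exists Q$ | $\{o \mid \exists o'.\ (o, o') \in Q^{\mathcal{I}}\}$
attribute domain | $\delta(U)$ | $\{o \mid \exists v.\ (o, v) \in U^{\mathcal{I}}\}$
inverse role | $P^-$ | $\{(b, a) \mid (a, b) \in P^{\mathcal{I}}\}$
role negation | $\neg Q$ | $(\Delta^{\mathcal{I}}_O \times \Delta^{\mathcal{I}}_O) \setminus Q^{\mathcal{I}}$
top domain | $\top_D$ | $\Delta^{\mathcal{I}}_V$
attribute range | $\rho(U)$ | $\{v \mid \exists o.\ (o, v) \in U^{\mathcal{I}}\}$
attribute negation | $\neg U$ | $(\Delta^{\mathcal{I}}_O \times \Delta^{\mathcal{I}}_V) \setminus U^{\mathcal{I}}$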

SLIDE 17

DL-Lite_A Ontologies

TBox $\mathcal{T}$

$B \sqsubseteq C$ | concept inclusion
$E \sqsubseteq F$ | value-domain inclusion
$Q \sqsubseteq R$ | role inclusion
$U_C \sqsubseteq V_C$ | attribute inclusion
$(\mathrm{funct}\ Q)$ | role functionality
$(\mathrm{funct}\ U_C)$ | attribute functionality
$(\mathrm{id}\ B\ I_1, \dots, I_n)$ | identification constraints, where each $I_i$ is a role name, an inverse role or an attribute

No functional or identifying role or attribute can be specialized by using it in the right-hand side of a role or attribute inclusion assertion.

ABox $\mathcal{A}$

$A(a)$, $P(a, b)$, $U_C(a, d)$, where $a$, $b$ are object constants, and $d$ is a value constant

SLIDE 18

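Semantics of DL-Lite_A assertions

Syntax | Semantics
$B \sqsubseteq C$ | $B^{\mathcal{I}} \subseteq C^{\mathcal{I}}$
$Q \sqsubseteq R$ | $Q^{\mathcal{I}} \subseteq R^{\mathcal{I}}$
$E \sqsubseteq F$ | $E^{\mathcal{I}} \subseteq F^{\mathcal{I}}$
$U \sqsubseteq V$ | $U^{\mathcal{I}} \subseteq V^{\mathcal{I}}$
$(\mathrm{funct}\ Q)$ | $\forall o, o_1, o_2.\ (o, o_1) \in Q^{\mathcal{I}} \wedge (o, o_2) \in Q^{\mathcal{I}} \rightarrow o_1 = o_2$
$(\mathrm{funct}\ U)$ | $\forall o, v_1, v_2.\ (o, v_1) \in U^{\mathcal{I}} \wedge (o, v_2) \in U^{\mathcal{I}} \rightarrow v_1 = v_2$
$(\mathrm{id}\ B\ I_1, \dots, I_n)$ | $I_1, \dots, I_n$ identify instances of $B$
$A(c)$ | $c^{\mathcal{I}} \in A^{\mathcal{I}}$
$P(a, b)$ | $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in P^{\mathcal{I}}$
$U(c, d)$ | $(c^{\mathcal{I}}, \mathit{val}(d)) \in U^{\mathcal{I}}$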
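As an illustration of the semantic conditions in this table, the sketch below checks inclusion and functionality assertions against one finite interpretation. It is a minimal Python sketch under stated assumptions: extensions are plain Python sets, and the helper names (`exists`, `inverse`, `satisfies_inclusion`, `is_functional`) and the sample extensions are hypothetical, not taken from the slides.

```python
def exists(role_ext):
    """Extension of ∃Q: objects with at least one Q-successor."""
    return {o for o, _ in role_ext}

def inverse(role_ext):
    """Extension of the inverse role P⁻."""
    return {(b, a) for a, b in role_ext}

def satisfies_inclusion(lhs_ext, rhs_ext):
    """An inclusion holds in I iff the left extension is a subset of the right one."""
    return lhs_ext <= rhs_ext

def is_functional(role_ext):
    """(funct Q) holds in I iff no object has two distinct Q-successors."""
    successor = {}
    for o, o1 in role_ext:
        if successor.setdefault(o, o1) != o1:
            return False
    return True

# Hypothetical finite interpretation.
employee = {"ann", "bob"}
project = {"p1", "p2"}
works_for = {("ann", "p1"), ("bob", "p1"), ("bob", "p2")}

ok_domain = satisfies_inclusion(employee, exists(works_for))         # employee ⊑ ∃WORKS-FOR
ok_range = satisfies_inclusion(exists(inverse(works_for)), project)  # ∃WORKS-FOR⁻ ⊑ project
ok_funct = is_functional(works_for)                                  # (funct WORKS-FOR)
```

In this interpretation both inclusions hold, while the functionality assertion fails because `bob` works for two projects.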

SLIDE 19

Query answering in DL-Lite_A

Based on query reformulation:

◮ Given a (U)CQ $q(\vec{x})$ and a satisfiable ontology $\mathcal{O} = \langle \mathcal{T}, \mathcal{A} \rangle$, rewrite $q(\vec{x})$ into an FO query $q_{\mathcal{T}}(\vec{x})$ (independently of $\mathcal{A}$) such that for all $\vec{a}$: $\langle \mathcal{T}, \mathcal{A} \rangle \models q[\vec{a}]$ iff $\mathcal{A} \models q_{\mathcal{T}}[\vec{a}]$.
◮ Evaluate the query $q_{\mathcal{T}}$ over $\mathcal{A}$, seen as a complete DB.

+ Off-the-shelf RDBMSs can be used for evaluating $q_{\mathcal{T}}$.
− Rewritten queries can be of size $(|\mathcal{T}| \cdot |q|)^{|q|}$.
− Not scalable when $|\mathcal{T}|$ is large (even if $|q|$ is relatively small).
− This rewriting approach is not applicable to other tractable DLs, e.g. EL.

SLIDE 20

Perfect rewriting in DL-Lite_A

To compute the perfect rewriting, starting from the original (U)CQ, iteratively get a CQ to be processed and either:

◮ expand positive inclusions & simplify redundant atoms, or
◮ unify atoms in the CQ to obtain a more specific CQ to be further expanded.

Each result of the above steps is added to the queries to be processed, until no further CQ can be added.

Note: negative inclusions, functionalities, and identification constraints play a role in ontology satisfiability, but not in query answering (i.e., we have separability).

SLIDE 21

Use the PIs as basic rewriting rules:

◮ when an atom in the query unifies with the head of the rule, substitute the atom with the body of the rule.

Apply in all possible ways unification between atoms in a query. Unifying atoms can make rules applicable that were not so before, and is required for completeness of the method.

SLIDE 22

Algorithm PerfectRef(Q, T_P)
Input: union of conjunctive queries Q, set of DL-Lite_A PIs T_P
Output: union of conjunctive queries PR

PR := Q;
repeat
    PR′ := PR;
    for each q ∈ PR′ do
        for each g in q do
            for each PI I in T_P do
                if I is applicable to g then
                    PR := PR ∪ {ApplyPI(q, g, I)};
        for each g1, g2 in q do
            if g1 and g2 unify then
                PR := PR ∪ {τ(Reduce(q, g1, g2))};
until PR′ = PR;
return PR

SLIDE 23

DL-Lite_A TBox

Example

TBox assertion | Rewriting rule
manager ⊑ employee | manager(x) → employee(x)
employee ⊑ person | employee(x) → person(x)
employee ⊑ ∃WORKS-FOR | employee(x) → WORKS-FOR(x, _)
∃WORKS-FOR⁻ ⊑ project | WORKS-FOR(_, y) → project(y)

Query: q(x) ← WORKS-FOR(x, y), project(y)

Perfect Reformulation:

q(x) ← WORKS-FOR(x, y), project(y)
q(x) ← WORKS-FOR(x, y), WORKS-FOR(_, y)
q(x) ← WORKS-FOR(x, _)
q(x) ← employee(x)
q(x) ← manager(x)
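The example can be replayed with a small executable version of the rewriting loop. The following is a simplified Python sketch, not the algorithm as implemented in any existing system: it assumes PIs only between atomic concepts and unqualified existentials (no role inclusions), queries without constants, and no normalization of variable names. The encodings (`("concept", A)`, `("exists", P)`, `("exists_inv", P)`) and the function names are hypothetical.

```python
from itertools import combinations

ANON = "_"  # anonymous (unbound) variable

def tau(head, body):
    """Replace each non-distinguished variable occurring only once by ANON."""
    counts = {}
    for _, args in body:
        for a in args:
            counts[a] = counts.get(a, 0) + 1
    def fix(a):
        return ANON if a != ANON and a not in head and counts[a] == 1 else a
    return head, frozenset((p, tuple(fix(a) for a in args)) for p, args in body)

def apply_pi(atom, pi):
    """Atom obtained by applying the PI lhs ⊑ rhs to atom, or None if not applicable."""
    lhs, rhs = pi
    pred, args = atom
    if len(args) == 1 and rhs == ("concept", pred):
        x = args[0]
    elif len(args) == 2 and args[1] == ANON and rhs == ("exists", pred):
        x = args[0]
    elif len(args) == 2 and args[0] == ANON and rhs == ("exists_inv", pred):
        x = args[1]
    else:
        return None
    kind, name = lhs
    if kind == "concept":
        return (name, (x,))
    return (name, (x, ANON)) if kind == "exists" else (name, (ANON, x))

def reduce_query(head, body, g1, g2):
    """Unify atoms g1 and g2 and apply the unifier to the query, or return None."""
    if g1[0] != g2[0] or len(g1[1]) != len(g2[1]):
        return None
    subst = {}
    def resolve(v):
        while v in subst:
            v = subst[v]
        return v
    merged = []
    for s, t in zip(g1[1], g2[1]):
        s, t = resolve(s), resolve(t)
        if s == ANON or s == t:
            merged.append(t)
        elif t == ANON:
            merged.append(s)
        else:
            if s in head:          # prefer keeping distinguished variables
                s, t = t, s
            subst[s] = t
            merged.append(t)
    new_body = {(p, tuple(resolve(v) for v in args))
                for p, args in body if (p, args) not in (g1, g2)}
    new_body.add((g1[0], tuple(resolve(v) for v in merged)))
    return tuple(resolve(v) for v in head), frozenset(new_body)

def perfect_ref(queries, pis):
    """Saturate the set of CQs under PI application and atom unification."""
    pr = {tau(h, b) for h, b in queries}
    while True:
        prev = set(pr)
        for head, body in prev:
            for g in body:
                for pi in pis:
                    new_atom = apply_pi(g, pi)
                    if new_atom is not None:
                        pr.add((head, (body - {g}) | {new_atom}))
            for g1, g2 in combinations(sorted(body), 2):
                reduced = reduce_query(head, body, g1, g2)
                if reduced is not None:
                    pr.add(tau(*reduced))
        if pr == prev:
            return pr

tbox = [
    (("concept", "manager"), ("concept", "employee")),
    (("concept", "employee"), ("concept", "person")),
    (("concept", "employee"), ("exists", "WORKS-FOR")),
    (("exists_inv", "WORKS-FOR"), ("concept", "project")),
]
q = (("x",), frozenset({("WORKS-FOR", ("x", "y")), ("project", ("y",))}))
rewriting = perfect_ref([q], tbox)
```

On this input the sketch yields the five conjunctive queries listed in the perfect reformulation above, each represented as a pair of head variables and a set of body atoms.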

SLIDE 24

Complexity of reasoning in DL-Lite_A

                                  | ABox + TBox | data complexity | TBox + query
Ontology satisfiability           | PTime | AC0 |
Query answering for CQs and UCQs  | PTime | AC0 | NP-complete

This is exactly as in relational DBs. In fact, reasoning (e.g. ontology satisfiability) can be done by constructing suitable FOL/SQL queries and evaluating them over the ABox: FOL-rewritability.

SLIDE 25

The EL family

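construct | syntax | semantics

Concepts:
top | $\top$ | $\Delta^{\mathcal{I}}$
bottom | $\bot$ | $\emptyset$
atomic concept | $A$ | $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$
qualified existential restriction | $\exists P.C$ | $\{o \mid \exists o'.\ (o, o') \in P^{\mathcal{I}} \wedge o' \in C^{\mathcal{I}}\}$
conjunction | $C_1 \sqcap C_2$ | $C_1^{\mathcal{I}} \cap C_2^{\mathcal{I}}$

Roles:
atomic role | $P$ | $P^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$

TBox:
concept inclusion | $C_1 \sqsubseteq C_2$ | $C_1^{\mathcal{I}} \subseteq C_2^{\mathcal{I}}$

ABox:
membership assertions | $C(a)$ | $a^{\mathcal{I}} \in C^{\mathcal{I}}$
                      | $P(a, b)$ | $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in P^{\mathcal{I}}$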

SLIDE 26

Data Completion / Combined approach

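◮ Extend the ABox to the canonical model $\mathcal{I}_{\mathcal{K}}$ of $(\mathcal{T}, \mathcal{A})$
◮ Encode it as a finite structure, $\mathcal{C}_{\mathcal{K}}$
◮ Rewrite $q$ into $q^{\dagger}$ to ensure that the answers to $q$ over $\mathcal{C}_{\mathcal{K}}$ are correct

$\mathcal{C}_{\mathcal{K}}$ can be constructed by first-order queries.

◮ Avoids exponential blow-up: polynomial rewritings for $\textit{DL-Lite}^{\mathcal{N}}_{\mathit{horn}}$
◮ Applicable to other DLs of the DL-Lite family, with exponential rewriting
◮ Needs access to the data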

SLIDE 27

Query rewriting for EL

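Rewrite a given CQ $q(\vec{x}) \leftarrow \varphi(\vec{x}, \vec{y})$ into an FO query $q^{\dagger}$ such that the answers to $q$ over $\mathcal{I}_{\mathcal{K}}$ are the same as the answers to $q^{\dagger}$ over $\mathcal{C}_{\mathcal{K}}$.

$|q^{\dagger}| = O(|q| \cdot |\mathcal{T}|)$

$q^{\dagger}(\vec{x}) \leftarrow \varphi \wedge \varphi_1 \wedge \varphi_2 \wedge \varphi_3$

◮ $\varphi_1$: answer variables and variables in cycles in $q$ must be mapped to the ABox
◮ $\varphi_2$: if $R(x_1, x_2)$, $R(x_3, x_2)$ in $q$ and $x_2$ is mapped outside the ABox then $x_1 = x_3$
◮ $\varphi_3$: if $R(x_1, x_2)$, $S(x_3, x_2)$ in $q$ and $R \neq S$ then $x_2$ must be mapped to the ABox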

SLIDE 28

Query rewriting, open questions

◮ Is the exponential blowup unavoidable for role inclusions?
◮ Is the exponential blowup unavoidable for positive existential queries?
◮ For which DLs can pure rewriting be polynomial?

Alternative query rewriting techniques based on resolution for more expressive logics (with recursive rewritings) [Pérez-Urbina et al., 2010].

SLIDE 30

Ontology-based data integration Systems

An Ontology-based data integration System is a triple $\mathcal{O} = \langle \mathcal{T}, \mathcal{M}, \mathcal{S} \rangle$ where:

◮ $\mathcal{T}$ is a TBox
◮ $\mathcal{S}$ is a relational database representing the sources
◮ $\mathcal{M}$ is a set of mapping assertions between $\mathcal{T}$ and $\mathcal{S}$

The mapping assertions are a crucial part of an Ontology-Based Data Integration System: they are used to extract the data from the sources to “populate” the ontology ⇝ virtual ABox

SLIDE 31

Ontology-based data integration: the DL-Lite_A solution

◮ The data sources are assumed to be wrapped and presented as relational sources.
◮ Data federation tools such as IBM Information Integrator can be used to integrate the sources into a single relational
◮ Use DL-Lite_A ontologies (with mappings) for the conceptual view on the data.
◮ Exploit effectiveness of query answering.
◮ Take advantage of the distinction between objects and values in DL-Lite_A to deal with the notorious impedance mismatch problem.

SLIDE 32

Impedance mismatch problem

◮ In RDBs, information is represented in the form of tuples of values.
◮ Ontologies use both objects and values.

◮ Use an alphabet $\Lambda$ of function symbols, each with an associated arity.
◮ Values are denoted by constants from an alphabet $\Gamma_V$.
◮ Instances of concepts are denoted by terms built out of $\Gamma_V$: $f(d_1, \dots, d_n)$, with $f \in \Lambda$ and $d_i \in \Gamma_V$.

Example

If a person is identified by her SSN, we can introduce a function symbol pers/1. If IBN81B24 is an SSN, then pers(IBN81B24) denotes a person.

SLIDE 33

Mappings

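A mapping assertion in $\mathcal{M}$ has the form:

$\Phi(\vec{x}) \leadsto \Psi(\vec{t}, \vec{y})$

where

◮ $\Phi$ is an arbitrary SQL query of arity $n > 0$ over $\mathcal{S}$,
◮ $\Psi$ is a conjunctive query over $\mathcal{T}$ of arity $n' > 0$ without non-distinguished variables,
◮ $\vec{x}$, $\vec{y}$ are variables with $\vec{y} \subseteq \vec{x}$,
◮ $\vec{t}$ are terms of the form $f(\vec{z})$, with $f \in \Lambda$ and $\vec{z} \subseteq \vec{x}$.

Split version of $\mathcal{M}$

For each atom $X \in \Psi$: $\Phi' \leadsto X$, where $\Phi'$ is the projection of $\Phi$ over the variables occurring in $X$.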

SLIDE 34

Semantics of mappings

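$\mathcal{I}$ satisfies a mapping assertion $\Phi \leadsto \Psi$ w.r.t. $\mathcal{S}$ if for each tuple of values $\vec{v} \in \mathit{Eval}(\Phi, \mathcal{S})$, and for each ground atom $X$ in $\Psi[\vec{x}/\vec{v}]$, if $X$ has the form

◮ $A(s)$ then $s^{\mathcal{I}} \in A^{\mathcal{I}}$
◮ $T(s)$ then $s^{\mathcal{I}} \in T^{\mathcal{I}}$
◮ $P(s_1, s_2)$ then $(s_1^{\mathcal{I}}, s_2^{\mathcal{I}}) \in P^{\mathcal{I}}$
◮ $U(s_1, s_2)$ then $(s_1^{\mathcal{I}}, s_2^{\mathcal{I}}) \in U^{\mathcal{I}}$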

SLIDE 35

Example

D1[SSN : STRING, PROJ : STRING, D : DATE],
D2[SSN : STRING, NAME : STRING]

M1: SELECT SSN, PROJ, D FROM D1
    ⇝ tempEmp(pers(SSN)),
      WORKS-FOR(pers(SSN), proj(PROJ)),
      ProjName(proj(PROJ), PROJ),
      until(pers(SSN), D)

M2: SELECT SSN, NAME FROM D2
    ⇝ employee(pers(SSN)),
      PersName(pers(SSN), NAME)

SLIDE 36

Semantics of OBDI systems

Model of an OBDI system

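An interpretation $\mathcal{I}$ is a model of $\mathcal{O} = \langle \mathcal{T}, \mathcal{M}, \mathcal{S} \rangle$ if:

◮ $\mathcal{I}$ is a model of $\mathcal{T}$,
◮ $\mathcal{I}$ satisfies $\mathcal{M}$ w.r.t. $\mathcal{S}$, i.e., $\mathcal{I}$ satisfies every assertion in $\mathcal{M}$ w.r.t. $\mathcal{S}$.

An OBDI system $\mathcal{O}$ is satisfiable if it admits at least one model.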

SLIDE 37

Query Answering on OB Data integration systems

Virtual ABox

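Let $M \in \mathcal{M}$, $M = \Phi \leadsto X$.

$\mathcal{A}_{M,\mathcal{S}} = \{ X[\vec{x}/\vec{v}] \mid \vec{v} \in \mathit{Eval}(\Phi, \mathcal{S}) \}$

$\mathcal{A}_{\mathcal{M},\mathcal{S}} = \bigcup \{ \mathcal{A}_{M,\mathcal{S}} \mid M \in \mathcal{M} \}$

Bottom-up approach:

◮ querying over $\mathcal{A}_{\mathcal{M},\mathcal{S}}$ is not really efficient in practice
◮ materializing the ABox is a PTime process
◮ requires mechanisms for updating the ABox w.r.t. the database evolution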
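To make the bottom-up construction concrete, the sketch below materializes a virtual ABox for the mappings M1 and M2 of the earlier example. It is a minimal Python sketch under stated assumptions: the source tables are in-memory lists of rows instead of SQL query results, object terms are rendered as strings, and the sample rows and the names `term` and `virtual_abox` are hypothetical.

```python
def term(f, *values):
    """Object term f(d1, ..., dn) built from value constants, rendered as text."""
    return f"{f}({', '.join(values)})"

# Hypothetical source rows (not from the slides).
D1 = [{"SSN": "IBN81B24", "PROJ": "tones", "D": "2010-12-31"}]
D2 = [{"SSN": "IBN81B24", "NAME": "Ann"}]

# Each mapping pairs the rows returned by its source query with the atom
# templates on the right-hand side of the mapping assertion.
M1 = (D1, [
    lambda r: ("tempEmp", term("pers", r["SSN"])),
    lambda r: ("WORKS-FOR", term("pers", r["SSN"]), term("proj", r["PROJ"])),
    lambda r: ("ProjName", term("proj", r["PROJ"]), r["PROJ"]),
    lambda r: ("until", term("pers", r["SSN"]), r["D"]),
])
M2 = (D2, [
    lambda r: ("employee", term("pers", r["SSN"])),
    lambda r: ("PersName", term("pers", r["SSN"]), r["NAME"]),
])

def virtual_abox(mappings):
    """Union, over all mappings and all source rows, of the instantiated atoms."""
    return {make_atom(row)
            for rows, templates in mappings
            for row in rows
            for make_atom in templates}

abox = virtual_abox([M1, M2])
```

With one row per source, `abox` holds six assertions, one per atom template. Every change to D1 or D2 would require recomputing or patching this set, which is the update problem mentioned in the last bullet.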

SLIDE 38

Top-Down Approach

Given an OBDI system $\mathcal{O} = \langle \mathcal{T}, \mathcal{M}, \mathcal{S} \rangle$, the computation of the certain answers to a UCQ $q$ consists of three steps:

1. Rewriting: Compute the perfect rewriting $q_{pr} = \mathit{PerfectRef}(q, \mathcal{T})$ of the original query $q$, using the inclusion assertions of the TBox $\mathcal{T}$.
2. Unfolding: Compute from $q_{pr}$ a new query $q_{unf}$ by unfolding $q_{pr}$ using (the split version of) the mappings $\mathcal{M}$. $q_{unf}$ is such that $\mathit{Eval}(q_{unf}, \mathcal{S}) = \mathit{Eval}(q_{pr}, \mathcal{A}_{\mathcal{M},\mathcal{S}})$.
3. Evaluation: Delegate the evaluation of $q_{unf}$ to the relational DBMS managing $\mathcal{S}$.

SLIDE 39

Unfolding

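The unfolding step is crucial for avoiding materializing the virtual ABox. To unfold a query $q_{pr}$ with respect to a set of mapping assertions:

1. For each non-split mapping assertion $\Phi_i(\vec{x}) \leadsto \Psi_i(\vec{t}, \vec{y})$, introduce a view definition: $\mathit{Aux}_i(\vec{x}) \leftarrow \Phi_i(\vec{x})$.
2. For each split version $\Phi_i(\vec{x}) \leadsto X_j(\vec{t}, \vec{y})$ of a mapping assertion, introduce a clause: $X_j(\vec{t}, \vec{y}) \leftarrow \mathit{Aux}_i(\vec{x})$.
3. Unify each atom $X(\vec{z})$ in the body of $q_{pr}$ (in all possible ways) with the head of a clause $X(\vec{t}, \vec{y}) \leftarrow \mathit{Aux}_i(\vec{x})$.
4. Substitute each atom $X(\vec{z})$ with $\theta(\mathit{Aux}_i(\vec{x}))$, where $\theta$ is the unifier computed in step 3.
5. The unfolded query $q_{unf}$ is the union of all queries $q_{aux}$ obtained, together with the view definitions for $\mathit{Aux}_i$ appearing in $q_{aux}$.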
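The effect of unfolding can be seen on a deliberately small case. The sketch below is a hand-simplified Python illustration using the standard `sqlite3` module, not the general procedure: it only handles rewritten CQs consisting of a single atom whose answer variable is the first argument, and the table contents, the `SPLIT_MAPPINGS` dictionary and the `unfold` helper are assumptions made for the example. Object terms such as pers(SSN) are built inside SQL by string concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE D1 (SSN TEXT, PROJ TEXT, D TEXT);
CREATE TABLE D2 (SSN TEXT, NAME TEXT);
INSERT INTO D1 VALUES ('IBN81B24', 'tones', '2010-12-31');
INSERT INTO D2 VALUES ('XYZ00A11', 'Bob');
""")

# Split mappings: ontology predicate -> SQL query computing its virtual extension.
SPLIT_MAPPINGS = {
    "WORKS-FOR": "SELECT 'pers(' || SSN || ')' AS a1, "
                 "'proj(' || PROJ || ')' AS a2 FROM D1",
    "employee": "SELECT 'pers(' || SSN || ')' AS a1 FROM D2",
}

# Single-atom CQs of a rewriting: q(x) <- WORKS-FOR(x, _), employee(x), manager(x).
Q_PR = ["WORKS-FOR", "employee", "manager"]

def unfold(single_atom_cqs, mappings):
    """Union of the mapped source queries; predicates without a mapping contribute nothing."""
    parts = [f"SELECT a1 FROM ({mappings[p]})"
             for p in single_atom_cqs if p in mappings]
    return " UNION ".join(parts)

sql = unfold(Q_PR, SPLIT_MAPPINGS)
result = {row[0] for row in conn.execute(sql)}
```

The resulting SQL is evaluated directly over D1 and D2, so no ABox is materialized. The `manager` query disappears because no mapping populates that concept.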

SLIDE 40

Computational complexity of Query answering

From the top-down approach to query answering, and the complexity results for DL-Lite_A, query answering in an OBDI system $\mathcal{O} = \langle \mathcal{T}, \mathcal{M}, \mathcal{S} \rangle$ is:

◮ Very efficiently tractable in the size of the database $\mathcal{S}$ (i.e., AC0, and in fact FOL-rewritable).
◮ Efficiently tractable in the size of the TBox $\mathcal{T}$ and the mappings $\mathcal{M}$ (i.e., PTime).
◮ Exponential in the size of the query (i.e., NP-complete).

Can we move to LAV or GLAV mappings? No, if we want to stay in AC0 [Calvanese et al., 2008].

SLIDE 42

The theoretical results indicate a good computational behavior in the size of the data. However, performance is a critical issue in practice:

◮ The rewriting consists of a large number of CQs. Query containment can be used to prune the rewriting. This is already implemented in the QuOnto system, but requires further optimizations.
◮ The SQL queries generated by the mapping unfolding are not easy to process by the DBMS engine (e.g., they may contain complex joins on skolem terms computed on the fly).
◮ Different mapping unfolding strategies have a strong impact on computational complexity. Experimentation is ongoing to assess the tradeoff.
◮ Further extensive experiments are ongoing:
  ⋆ on artificially generated data;
  ⋆ on real-world use cases.

SLIDE 43

CWA or OWA?

Datalog±:

◮ generalizes the DL-Lite family of DLs + stratified negation,
◮ while keeping ontology querying tractable (polynomial in data complexity).

Datalog alone:

◮ can neither express disjointness nor functionality,
◮ lacks value creation (e.g. employee ⊑ ∃WORKS-FOR).

SLIDE 44

Additions to Datalog:

◮ Existentially quantified variables in rule heads ⇝ tuple generating dependencies (TGDs)
◮ Rule bodies of TGDs are guarded ⇝ guarded TGDs, e.g. $P(X) \wedge R(X, Y) \wedge Q(Y) \rightarrow \exists Z.\, R(Y, Z)$
◮ Bodies contain single atoms only ⇝ linear TGDs
◮ Negative constraints and keys, e.g. $\mathit{employee}(X, Y) \wedge \mathit{retired}(X, Z) \rightarrow \bot$

SLIDE 45

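A Normal TGD (NTGD) has the form

$\forall \mathbf{X} \forall \mathbf{Y}\ \Phi(\mathbf{X}, \mathbf{Y}) \rightarrow \exists \mathbf{Z}\ \Psi(\mathbf{X}, \mathbf{Z})$

where $\Phi(\mathbf{X}, \mathbf{Y})$ is a conjunction of atoms and negated atoms, and $\Psi(\mathbf{X}, \mathbf{Z})$ is a conjunction of atoms.

◮ guarded: a positive atom in its body contains $\mathbf{X}$, $\mathbf{Y}$
◮ linear: is guarded, and has exactly one positive atom in its body

A normal Boolean conjunctive query (NBCQ) $Q$ is an existentially closed conjunction of atoms and negated atoms

$\exists \mathbf{X}\ p_1(\mathbf{X}) \wedge \dots \wedge p_m(\mathbf{X}) \wedge \neg p_{m+1}(\mathbf{X}) \wedge \dots \wedge \neg p_{m+n}(\mathbf{X})$

$Q$ is safe iff every variable in a negative atom also occurs in a positive atom.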
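The three syntactic conditions defined above (guarded, linear, safe) are easy to check mechanically. The following is a minimal Python sketch under stated assumptions: atoms are pairs of a predicate name and a tuple of variable names, constants are not handled, and the function names and example atoms are hypothetical.

```python
def variables(atoms):
    """All variables occurring in a collection of atoms."""
    return {v for _, args in atoms for v in args}

def is_guarded(pos_body, neg_body=()):
    """Some positive body atom contains every universally quantified variable of the body."""
    body_vars = variables(pos_body) | variables(neg_body)
    return any(body_vars <= set(args) for _, args in pos_body)

def is_linear(pos_body, neg_body=()):
    """Guarded, with exactly one positive atom in the body."""
    return len(pos_body) == 1 and is_guarded(pos_body, neg_body)

def is_safe(pos_atoms, neg_atoms):
    """Every variable of a negative atom also occurs in a positive atom."""
    return variables(neg_atoms) <= variables(pos_atoms)

# Body of P(X) ∧ R(X, Y) ∧ Q(Y) → ∃Z. R(Y, Z): guarded by R(X, Y), not linear.
body1 = [("P", ("X",)), ("R", ("X", "Y")), ("Q", ("Y",))]
# Body of employee(X) → ∃Z. WORKS-FOR(X, Z): a single atom, hence linear.
body2 = [("employee", ("X",))]

guarded1, linear1 = is_guarded(body1), is_linear(body1)
guarded2, linear2 = is_guarded(body2), is_linear(body2)

# ∃X employee(X) ∧ ¬retired(X) is safe; ∃X,Y employee(X) ∧ ¬WORKS-FOR(X, Y) is not.
safe_q = is_safe([("employee", ("X",))], [("retired", ("X",))])
unsafe_q = is_safe([("employee", ("X",))], [("WORKS-FOR", ("X", "Y"))])
```

A body such as P(X) ∧ Q(Y), which has no atom covering both variables, would fail the guardedness check.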

SLIDE 46

Theorem

◮ Answering safe NBCQs in guarded Datalog± can be done in polynomial time in data complexity.
◮ Answering safe NBCQs in linear Datalog± is FO-rewritable.

SLIDE 47

Thank you!

SLIDE 48

References

  • C. Beeri, A. Y. Levy, and M. C. Rousset. Rewriting Queries Using Views in Description Logics. In Proc. PODS'97 (Symposium on Principles of Database Systems), pp. 99-108, 1997.

  • A. Cali, G. Gottlob, and T. Lukasiewicz. A general datalog-based framework for tractable query answering over ontologies. In Proc. PODS'09 (Symposium on Principles of Database Systems), pp. 77-86, 2009.

  • D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. of Automated Reasoning 39(3), pp. 385-429, 2007.

  • A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, and R. Rosati. Linking Data to Ontologies. J. Data Semantics 10, pp. 133-173, 2008.

  • R. Kontchakov, C. Lutz, D. Toman, F. Wolter, and M. Zakharyaschev. Combined FO Rewritability for Conjunctive Query Answering in DL-Lite. Description Logics'09 (International Workshop on Description Logics), 2009.

SLIDE 49

Further Reading

  • D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Data complexity of query answering in description logics. Proceedings of KR, 2006.

  • D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, A. Poggi, R. Rosati, and M. Ruzzi. Data integration through DL-Lite_A ontologies. Proceedings of the 3rd Int. Workshop on Semantics in Data and Knowledge Bases (SDKB 2008).

  • C. Lutz, D. Toman, and F. Wolter. Conjunctive query answering in the description logic EL using a relational database system. Proceedings of IJCAI 2009.

  • H. Pérez-Urbina, B. Motik, and I. Horrocks. Tractable query answering and rewriting under description logic constraints. J. Applied Logic, 2010.

  • R. Rosati and A. Almatelli. Improving query answering over DL-Lite ontologies. Proceedings of KR 2010.
