Reasoning challenges in description logic
Roman Kontchakov and Michael Zakharyaschev
Department of Computer Science, Birkbeck College London http://www.dcs.bbk.ac.uk/~{roman,michael}
DL is a (large) family of knowledge representation & reasoning formalisms
(≈ decidable modal, hybrid logics)
Application-driven equilibrium: expressiveness vs. computational costs Recent applications:
Web Ontology Language (OWL) W3C standards OWL 1 (2004), OWL 2 (2009) OWL = DL + XML
DL = OWL − XML
Moscow, 23 August 2010 1
Man ≡ Human ⊓ Male Appendicitis ⊑ Disease ⊓ ∃morphology.Inflam ...
Man(john) hasChild(john, mary) ...
– concept names A0, A1, ...
(e.g., Person, Female, ...)
– role names R0, R1, ...
(e.g., hasChild, loves, ...)
– individual names a0, a1, ...
(e.g., john, mary, ...)
– concept constructs: ⊤, ⊓, ¬, ∃, ∀, ≥ q, ...
(e.g., Person ⊓ Female)
– role constructs: R−, R ◦ S, ...
(e.g., isChildOf)
– axiom construct: ⊑
(e.g., Man ⊑ Person)
– concept names –
where C, D are concepts and R a role Examples: Person ⊓ Female, Person ⊓ ¬Female, Person ⊓ ∃hasChild.⊤, Person ⊓ ∀hasChild.Male
– ∆^I is the domain of I (a non-empty set)
– ·^I is an interpretation function that maps:
  ∗ concept name A_i → subset A_i^I (A_i^I ⊆ ∆^I)
  ∗ role name R_i → binary relation R_i^I (R_i^I ⊆ ∆^I × ∆^I)
  ∗ individual name a_i → element a_i^I (a_i^I ∈ ∆^I)
– (⊤)^I = ∆^I and (⊥)^I = ∅
– (¬C)^I = ∆^I \ C^I
– (C ⊓ D)^I = C^I ∩ D^I
– (∀R.C)^I = {x ∈ ∆^I | ∀y ∈ ∆^I ((x, y) ∈ R^I → y ∈ C^I)}
– (∃R.C)^I = {x ∈ ∆^I | ∃y ∈ C^I (x, y) ∈ R^I}
– (≥ q R.C)^I = {x ∈ ∆^I | #{y ∈ C^I | (x, y) ∈ R^I} ≥ q}
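The semantic clauses above can be tried out on a small finite interpretation. The following is only a sketch: the tuple encoding of concepts, the function name `extension` and the toy family data are assumptions made for illustration, not part of the slides.

```python
def extension(concept, domain, concepts, roles):
    """Extension of a concept in a finite interpretation.

    Assumed encoding: "TOP", "BOT", a concept name, or one of the tuples
    ("not", C), ("and", C, D), ("exists", R, C), ("forall", R, C),
    ("atleast", q, R, C).
    """
    if concept == "TOP":
        return set(domain)
    if concept == "BOT":
        return set()
    if isinstance(concept, str):
        return set(concepts.get(concept, ()))
    op = concept[0]
    if op == "not":
        return set(domain) - extension(concept[1], domain, concepts, roles)
    if op == "and":
        return (extension(concept[1], domain, concepts, roles)
                & extension(concept[2], domain, concepts, roles))
    if op not in ("exists", "forall", "atleast"):
        raise ValueError(f"unknown construct: {op}")
    # The remaining constructs inspect the R-successors of each point.
    q = concept[1] if op == "atleast" else None
    role, filler = concept[-2], concept[-1]
    filler_ext = extension(filler, domain, concepts, roles)
    result = set()
    for x in domain:
        successors = {y for (u, y) in roles.get(role, ()) if u == x}
        if op == "exists" and successors & filler_ext:
            result.add(x)
        elif op == "forall" and successors <= filler_ext:
            result.add(x)
        elif op == "atleast" and len(successors & filler_ext) >= q:
            result.add(x)
    return result


# Hypothetical two-element interpretation.
domain = {"john", "mary"}
concepts = {"Person": {"john", "mary"}, "Female": {"mary"}, "Male": {"john"}}
roles = {"hasChild": {("john", "mary")}}

parents = extension(("and", "Person", ("exists", "hasChild", "TOP")),
                    domain, concepts, roles)
only_sons = extension(("and", "Person", ("forall", "hasChild", "Male")),
                      domain, concepts, roles)
```

Note that `only_sons` contains mary: she has no hasChild-successors, so the ∀-clause holds vacuously.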
statements about how concepts and roles are related to each other
A TBox T is a finite set of terminological axioms:
– C ⊑ D: C is subsumed by D (concept inclusion)
– R ⊑ S: R is a subrole of S (role inclusion)
An interpretation I satisfies an axiom:
– I ⊨ C ⊑ D iff C^I ⊆ D^I
– I ⊨ R ⊑ S iff R^I ⊆ S^I
An interpretation I is a model of T if I satisfies every axiom of T
assert knowledge about individuals
An ABox A is a finite set of assertional axioms:
– C(a): concept assertion for an individual
– R(a, b): role assertion for a pair of individuals
An interpretation I satisfies an assertion:
– I ⊨ C(a) iff a^I ∈ C^I
– I ⊨ R(a, b) iff (a^I, b^I) ∈ R^I
An interpretation I is a model of a knowledge base K = (T, A) if I satisfies every axiom of T and A
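For a finite interpretation the model conditions can be checked directly. The sketch below is deliberately restricted to inclusions between concept names or between role names; the data layout and the name `is_model` are assumptions of this illustration.

```python
def is_model(concepts, roles, individuals, tbox, abox):
    """Check I |= (T, A) for a finite interpretation I.

    Assumed layout: `concepts` maps concept names to sets of points,
    `roles` maps role names to sets of pairs, `individuals` maps
    individual names to points. TBox axioms are ("concept", A, B) or
    ("role", R, S); ABox assertions are (A, a) or (R, a, b).
    """
    for kind, lhs, rhs in tbox:
        table = concepts if kind == "concept" else roles
        if not table.get(lhs, set()) <= table.get(rhs, set()):
            return False  # inclusion lhs ⊑ rhs is violated
    for assertion in abox:
        if len(assertion) == 2:
            name, a = assertion
            if individuals[a] not in concepts.get(name, set()):
                return False
        else:
            name, a, b = assertion
            if (individuals[a], individuals[b]) not in roles.get(name, set()):
                return False
    return True


# Hypothetical example data.
individuals = {"john": 1, "mary": 2}
tbox = [("concept", "Man", "Person")]
abox = [("Man", "john"), ("hasChild", "john", "mary")]
good = is_model({"Man": {1}, "Person": {1, 2}}, {"hasChild": {(1, 2)}},
                individuals, tbox, abox)
bad = is_model({"Man": {1}, "Person": {2}}, {"hasChild": {(1, 2)}},
               individuals, tbox, abox)
```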
Protégé 4.0: a free, open source ontology editor, http://protege.stanford.edu/, where you can also find a library of ontologies
(tutorials explaining how to use Protégé are at http://www.co-ode.org/resources/tutorials/)
FaCT++ http://owl.man.ac.uk/factplusplus/
Pellet http://pellet.owldl.com/
HermiT http://hermit-reasoner.com/
Concept satisfiability: given K = (T, A) and C, decide whether there is I ⊨ K with C^I ≠ ∅
Subsumption: given T and concepts C, D, decide whether T ⊨ C ⊑ D, i.e., ∀I (I ⊨ T → I ⊨ C ⊑ D)
Instance checking: given K = (T, A), C and an individual a from A, decide whether K ⊨ C(a)
Exercise: show that these three problems are reducible to each other
Conjunctive query answering: given a KB K = (T, A), a CQ q(x) and a tuple a of individuals from A, decide whether K ⊨ q(a)
Query answering is typically a harder problem than the other three
A ❀ A(x)
¬C ❀ ¬C(x)
C ⊓ D ❀ C(x) ∧ D(x)
∀R.C ❀ ∀y (R(x, y) → C(y))
∃R.C ❀ ∃y (R(x, y) ∧ C(y))
≥ q R.C ❀ ∃≥q y (R(x, y) ∧ C(y))
C ⊑ D ❀ ∀x (C(x) → D(x))
(full FOL is undecidable; this fragment is in NExpTime)
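The translation into first-order logic is a simple structural recursion. A minimal sketch follows, assuming a tuple encoding of concepts and the hypothetical function names `to_fo` and `axiom_to_fo`; fresh variables y1, y2, ... are introduced for the quantifiers.

```python
from itertools import count


def to_fo(concept, x="x", fresh=None):
    """Translate a concept into an FO formula (as a string) with free variable x.

    Assumed encoding: a concept name, "TOP", or one of the tuples
    ("not", C), ("and", C, D), ("forall", R, C), ("exists", R, C),
    ("atleast", q, R, C).
    """
    fresh = fresh if fresh is not None else count(1)
    if concept == "TOP":
        return "⊤"
    if isinstance(concept, str):
        return f"{concept}({x})"
    op = concept[0]
    if op == "not":
        return f"¬{to_fo(concept[1], x, fresh)}"
    if op == "and":
        return f"({to_fo(concept[1], x, fresh)} ∧ {to_fo(concept[2], x, fresh)})"
    y = f"y{next(fresh)}"  # fresh variable for the quantifier
    role, filler = concept[-2], concept[-1]
    body = to_fo(filler, y, fresh)
    if op == "forall":
        return f"∀{y} ({role}({x}, {y}) → {body})"
    if op == "exists":
        return f"∃{y} ({role}({x}, {y}) ∧ {body})"
    if op == "atleast":
        return f"∃≥{concept[1]} {y} ({role}({x}, {y}) ∧ {body})"
    raise ValueError(f"unknown construct: {op}")


def axiom_to_fo(c, d):
    """Translate the concept inclusion C ⊑ D."""
    return f"∀x ({to_fo(c)} → {to_fo(d)})"


example = to_fo(("and", "Person", ("exists", "hasChild", "Male")))
axiom = axiom_to_fo("Man", ("and", "Human", "Male"))
```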
An interpretation I is a model of a KB K = (T, A) under the UNA if I ⊨ K and a_i^I ≠ a_j^I, for any distinct object names a_i and a_j occurring in A
to explicitly impose constraints on individual names
Price of =: have to check whether a = b in A under given equality constraints
Equivalent to reachability in undirected graphs, which is LogSpace-complete (Reingold 2008)
. . . just peanuts for most DLs, but not for DL-Lite & OWL 2 QL. . .
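Deciding whether a = b follows from a set of equality constraints is the connectivity question mentioned above. The sketch below uses a union-find structure, which is a convenient assumption for illustration: it runs in near-linear time but uses linear space, so it does not reflect the LogSpace bound.

```python
def same_individual(equalities, a, b):
    """Does a = b follow from the given equalities (pairs of names)?"""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in equalities:
        parent[find(u)] = find(v)  # merge the two equivalence classes
    return find(a) == find(b)


# Hypothetical constraints: a = b and b = c.
constraints = [("a", "b"), ("b", "c")]
```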
. . . – mid 1990s: efficient reasoning cannot afford full Booleans
  sub-Boolean DLs with ⊓ and ∀ are enough: FL, AL, . . . (combined complexity ≤ NP)
mid 1990s – 2005: ‘efficient’ reasoning possible for ExpTime DLs (FaCT, ...)
  full Booleans and other constructs: SHIQ, SHOIN (≈ OWL 1), SROIQ (≈ OWL 2) (≥ ExpTime)
mid 2005 – . . .: new challenges: answering queries & HUGE ontologies
  Horn DLs with ⊓ and ∃: DL-Lite and EL families (≤ P)
Aim: to achieve logical transparency in accessing data
– hide from the user where and how data is stored
– present only a conceptual view of the data
– query the data sources through the conceptual model using RDBMSs
[Figure: conceptual model (classes AcademicStaff, Lecturer, Module; role teaches with its domain and range; a subclass link) on top of the data sources]
[Figure: UML class diagram with classes Employee (empCode: Integer, salary: Integer), Manager, AreaManager and TopManager ({disjoint, complete}), and Project (projectName: String); associations boss, worksOn and manages with multiplicities 1..1, 1..* and 3..*]
Translating into DL:
TopManager ⊑ Manager
AreaManager ⊑ ¬TopManager
Manager ⊑ AreaManager ⊔ TopManager
Employee ⊑ ∃salary.⊤
∃salary−.⊤ ⊑ Integer
≥ 2 salary.⊤ ⊑ ⊥
Project ⊑ ≥ 3 worksOn−.⊤
manages ⊑ worksOn
CEO ⊓ (≥ 5 worksOn.⊤) ⊓ ∃manages.⊤ ⊑ ⊥ (integrity constraint)
DL-Lite^N_bool:
  R ::= P | P−
  B ::= ⊥ | A | ≥ q R
  C ::= B | ¬C | C1 ⊓ C2
  TBox axioms: C1 ⊑ C2
  under UNA: combined complexity of sat.: NP; data comp. instance: in AC0; data comp. query: coNP
DL-Lite^N_horn:
  TBox axioms: B1 ⊓ · · · ⊓ Bn ⊑ B
  combined complexity: P; data comp. instance: in AC0; data comp. query: in AC0
DL-Lite^N_krom:
  TBox axioms: B1 ⊑ B2, B1 ⊑ ¬B2, ¬B1 ⊑ B2
  d.c. instance: in AC0; d.c. query: coNP
DL-Lite^N_core = DL-Lite^N_horn ∩ DL-Lite^N_krom
  d.c. instance: in AC0; d.c. query: in AC0
DL-Lite_bool, DL-Lite_horn, DL-Lite_krom, DL-Lite_core: ∃R available
DL-Lite can only speak about the domains and ranges of binary relations, and how many successors and predecessors a point can have,
but not about the types of these successors/predecessors; types are defined uniformly by domain/range constraints
Examples. Describe the models of the following KBs:
≥ 2R ⊑ ⊥}, A = ∅
T = {A ⊑ ∃R, ∃R− ⊑ ∃R, ≥ 2 R− ⊑ ⊥}, A = {A(a)}
Let I and J be two interpretations. A relation ϱ ⊆ ∆^I × ∆^J is called a lite-bisimulation between I and J if
(concept) for every concept name A, if x ϱ y then x ∈ A^I iff y ∈ A^J
(role) for every role R, if x ϱ y then x ∈ (= q R)^I iff y ∈ (= q R)^J, where q ∈ N ∪ {∞}
(I, x) ∼ (J, y) if there is a lite-bisimulation ϱ between I and J with x ϱ y
DL-Lite^N_bool concepts are invariant under lite-bisimulations, that is, if (I, x) ∼ (J, y) then x ∈ C^I iff y ∈ C^J, for every concept C
A first-order formula ϕ(x) is equivalent to a DL-Lite^N_bool concept iff ϕ(x) is invariant under lite-bisimulations
A lite-bisimulation relation ϱ between I and J is global if
– for every x ∈ ∆^I there is y ∈ ∆^J with x ϱ y, and
– for every y ∈ ∆^J there is x ∈ ∆^I with x ϱ y
I is lite-bisimilar to J, I ∼ J, if there is a global lite-bisimulation between I and J
DL-Lite^N_bool TBoxes are invariant under global lite-bisimulations, that is, if I ∼ J then I ⊨ T iff J ⊨ T, for every DL-Lite^N_bool TBox T
Given I and x ∈ ∆^I, let
t_I(x) = {C | x ∈ C^I} (the type of x in I)
T_I = {t_I(x) | x ∈ ∆^I} (the set of all types in I)
I ∼ J iff T_I = T_J
models are determined by their types ❀ 1-ary predicates
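For finite interpretations the criterion I ∼ J iff T_I = T_J can be tested by comparing sets of types. In the sketch below a type is approximated by the concept names of a point together with its exact numbers of successors and predecessors for each role, which mirrors the (concept) and (role) conditions. The data layout and function names are assumptions, and both interpretations are expected to list the same role names.

```python
def point_type(x, concepts, roles):
    """Concept names of x plus exact successor/predecessor counts per role."""
    names = frozenset(a for a, ext in concepts.items() if x in ext)
    counts = []
    for r, pairs in sorted(roles.items()):
        out_deg = sum(1 for (u, v) in pairs if u == x)  # x in (= q R)
        in_deg = sum(1 for (u, v) in pairs if v == x)   # x in (= q R-)
        counts.append((r, out_deg, in_deg))
    return (names, tuple(counts))


def types(domain, concepts, roles):
    return {point_type(x, concepts, roles) for x in domain}


def globally_lite_bisimilar(i, j):
    """i and j are triples (domain, concepts, roles) over the same signature."""
    return types(*i) == types(*j)


# Hypothetical interpretations: one R-edge, two disjoint R-edges, one R-loop.
one_edge = ({1, 2}, {}, {"R": {(1, 2)}})
two_edges = ({"a", "b", "c", "d"}, {}, {"R": {("a", "b"), ("c", "d")}})
loop = ({1}, {}, {"R": {(1, 1)}})
```

The first two interpretations realise the same two types, so no DL-Lite^N_bool TBox over this signature can tell them apart; the loop realises a different type.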
Every model of a DL-Lite^N_bool TBox is globally lite-bisimilar to a tree-shaped model
Examples. Construct a tree-shaped model which is globally lite-bisimilar to
[Figure: three points of types t1, t2, t3 connected by R-arrows]
where t1, t2, t3 are distinct types
Tree models of DL-Lite^N_bool KBs?
[Figure: ABox with points labelled A and B]
Why is the tree-model property so important?
Satisfiability of DL-Lite^N_bool KBs is NP-complete (for combined complexity)
Proof: DL-Lite^N_bool KB K ❀ K† (a universal 1-variable FO formula)
T = {A ⊑ ∃P−, ∃P− ⊑ A, A ⊑ ≥ 2 P, ⊤ ⊑ ≤ 1 P−, ∃P ⊑ A}, A = {A(a), P(a, a′)}
∀x
∧ (E2P(x) → E1P(x)) ∧ (E2P−(x) → E1P−(x)) ∧ (E1P(x) → E1P−(dp−)) ∧ (E1P−(x) → E1P(dp))
(∃P)^I ≠ ∅ iff (∃P−)^I ≠ ∅: ∃x E1P(x) ↔ ∃x E1P−(x)
[Figure: three interpretations over the points dp−, dp, a, a′, each satisfying K†, the last one also satisfying K]
K is satisfiable iff K† is. K† is computed in LogSpace.
K† says that
– ∃ appropriate dr
– ∀ point is of proper type
(under UNA)
For DL-Lite^N_horn KBs K, the translation K† is a conjunction of formulas of the form
(horn) ∀x (A1(x) ∧ · · · ∧ An(x) → A(x))
For DL-Lite^N_krom KBs K, the translation K† is a conjunction of formulas of the form
(krom) ∀x (A1(x) → A2(x)), ∀x (A1(x) → ¬A2(x)), ∀x (¬A1(x) → A2(x))
For DL-Lite^N_core KBs K, the translation K† is a conjunction of formulas of the form
(core) ∀x (A1(x) → A2(x)), ∀x (A1(x) → ¬A2(x))
DL-Lite^N_horn and DL-Lite^N_core
For a consistent DL-Lite^N_horn KB K = (T, A), the canonical model I_K is constructed as follows:
– draw the missing R-arrows to fresh points and add ∃R− to their types
If I ⊨ K then there is a map h: ∆^{I_K} → ∆^I such that, for all x, y ∈ ∆^{I_K}, basic concepts B and roles R,
– if x ∈ B^{I_K} then h(x) ∈ B^I;
– if (x, y) ∈ R^{I_K} then (h(x), h(y)) ∈ R^I
K ⊨ q(a) iff I_K ⊨ q(a)
DL-Lite^F_core (only functionality) is NLogSpace-complete for combined complexity and in AC0 for data complexity
DL-Lite^HF_core (DL-Lite^F_core + R1 ⊑ R2) is ExpTime-complete for combined complexity and P-complete for data complexity
Example: A1 ⊓ A2 ⊑ C can be simulated by the axioms:
A1 ⊑ ∃R1, A2 ⊑ ∃R2, R1 ⊑ R12, R2 ⊑ R12, ≥ 2 R12 ⊑ ⊥,
∃R1− ⊑ ∃R3−, ∃R3 ⊑ C, R3 ⊑ R23, R2 ⊑ R23, ≥ 2 R23− ⊑ ⊥
DL-Lite^(RN)_α:
– if R has a proper sub-role in T then T contains no negative occurrences of ≥ q R or ≥ q R− with q ≥ 2
– ≥ q R.C: if ≥ q R.C occurs in T then T contains no negative occurrences of ≥ q′ R or ≥ q′ inv(R) with q′ ≥ 2
– no TBox can contain both a functionality constraint ≥ 2 R ⊑ ⊥ and ≥ q R.C, for any q ≥ 1
in particular, same complexity of DL-Lite^(RN)_α and DL-Lite^N_α
(NLogSpace-hard for data complexity)
Without UNA, satisfiability of DL-Lite^N_α KBs is NP-complete w.r.t. both combined and data complexity, for any α ∈ {core, krom, horn, bool}
source of non-determinism: different ways of identifying ABox individuals
Lower bound: by reduction of monotone 1-in-3 3SAT for ⋀_{k=1..n} (a_{k,1} ∨ a_{k,2} ∨ a_{k,3})
[Figure: points c_1, ..., c_n, each with three P-arrows to a_{k,1}, a_{k,2}, a_{k,3}, over the variables a_1, a_2, . . . , a_m]
A = {a_{k,i} ≠ a_{k,j} | i ≠ j} ∪ {P(c_k, a_{k,j}) | k ≤ n, j ≤ 3}
T = {≥ 4 P ⊑ ⊥}
Answer is yes iff there is a (true) variable a_i in the given CNF such that K_{a_i} = (T, A ∪ {P(c_k, a_i) | k ≤ n}) is satisfiable without UNA
NB: One can get rid of ≠ in A
[Figure: three charts summarising the complexity of the DL-Lite logics (core, krom, horn, bool; F, N): (1) UNA, no role inclusions; (2) no UNA, no role inclusions; (3) with/without UNA, role inclusions. Legend: satisfiability, combined complexity (NLogSpace, P, NP, ExpTime); instance checking, data complexity (AC0, P, coNP); query answering (coNP, or query answering = instance checking)]
‘An OWL 2 profile is a trimmed down version of OWL 2 that trades some expressive power for the efficiency of reasoning’
‘OWL 2 QL is aimed at applications that use very large volumes of instance data, and where query answering is the most important reasoning task. In OWL 2 QL, conjunctive query answering can be implemented using conventional relational database systems.’
OWL 2 QL = DL-Lite^H_core
– with/without UNA
– with ≠ (but no =)
– with (a)symmetric, (ir)reflexive and disjoint roles (but no transitive roles)
Why not DL-Lite^H_horn?
‘The OWL 2 EL profile is designed as a subset of OWL 2 that
very large numbers of classes and/or properties,
instance checking can be decided in polynomial time.’
For example, OWL 2 EL provides class constructors that are sufficient to express the very large biomedical ontology SNOMED CT (≈ 400,000 axioms)
Pericardium ⊑ Tissue ⊓ ∃cont_in.Heart
Pericarditis ⊑ Inflammation ⊓ ∃has_loc.Pericardium
Inflammation ⊑ Disease ⊓ ∃acts_on.Tissue
Disease ⊓ ∃has_loc.∃cont_in.Heart ⊑ Heartdisease ⊓ NeedsTreatment
EL concepts: C ::= ⊤ | ⊥ | A | ∃R.C | C1 ⊓ C2
EL TBoxes: finite sets of CIs C1 ⊑ C2
EL ABoxes: finite sets of assertions C(a), R(a, b)
Concept satisfiability: given T, C, decide whether there is I ⊨ T with C^I ≠ ∅
Subsumption: given T and concepts C, D, decide whether T ⊨ C ⊑ D
Instance checking: given a KB K = (T, A), C and an individual a from A, decide whether K ⊨ C(a)
Reducible to each other!
Conjunctive query answering: given a KB K = (T, A), a CQ q(x) and a tuple a of individuals from A, decide whether K ⊨ q(a)
EL can specify some positive information about types of points, viz:
(but not that it does not belong to a concept);
(but not that all outgoing R-arrows end in the concept);
Example. Describe the models of the following KBs: T = {A ⊑ B1, B1 ⊑ ∃R.B1, ∃R.B1 ⊑ B2, B1 ⊓ B2 ⊑ ∃S.B2}, A = {A(a)}
Let I and J be two interpretations. A relation ϱ ⊆ ∆^I × ∆^J is called a simulation of I in J if
(concept) for every concept name A, if x ϱ y then x ∈ A^I ⇒ y ∈ A^J
(role) for every role name R, if x ϱ y then (x, x′) ∈ R^I ⇒ ∃y′ ((y, y′) ∈ R^J and x′ ϱ y′)
(I, x) ≤ (J, y) if there is a simulation ϱ of I in J with x ϱ y
EL concepts are preserved under simulations, that is, if (I, x) ≤ (J, y) then x ∈ C^I ⇒ y ∈ C^J, for every concept C
EL concepts cannot distinguish between (I, x) and (J, y) if (I, x) ≤ (J, y) and (J, y) ≤ (I, x)
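Between finite interpretations the largest simulation can be computed by starting from all pairs that satisfy the (concept) condition and repeatedly deleting pairs that violate the (role) condition. This is a naive fixpoint sketch under an assumed data layout (triples of domain, concept table, role table), not an optimised algorithm.

```python
def largest_simulation(i, j):
    """Largest simulation of i in j; both are (domain, concepts, roles)."""
    dom_i, conc_i, role_i = i
    dom_j, conc_j, role_j = j

    def names(x, conc):
        return {a for a, ext in conc.items() if x in ext}

    # Start with all pairs satisfying the (concept) condition.
    sim = {(x, y) for x in dom_i for y in dom_j
           if names(x, conc_i) <= names(y, conc_j)}
    changed = True
    while changed:
        changed = False
        for (x, y) in list(sim):
            # (role): every R-successor x2 of x needs a matching R-successor of y.
            ok = all(
                any((y, y2) in role_j.get(r, set()) and (x2, y2) in sim
                    for y2 in dom_j)
                for r, pairs in role_i.items()
                for (x1, x2) in pairs if x1 == x
            )
            if not ok:
                sim.discard((x, y))
                changed = True
    return sim


# Hypothetical example: a finite R-path into an A-point versus an R-loop on an A-point.
path = ({"a", "b"}, {"A": {"b"}}, {"R": {("a", "b")}})
loop = ({"w"}, {"A": {"w"}}, {"R": {("w", "w")}})
forward = largest_simulation(path, loop)
backward = largest_simulation(loop, path)
```

Here (path, a) ≤ (loop, w) holds, but no point of the path simulates w, since the loop offers an infinite R-chain.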
What are the differences between DL-Lite and EL?
(basically the same construction as for DL-Lite^N_horn)
For a consistent EL KB K = (T, A), the canonical model I_K is constructed as follows:
– draw an R-arrow to a fresh point and add C to its type
If I ⊨ K then there is a map h: ∆^{I_K} → ∆^I such that, for all x, y ∈ ∆^{I_K}, concept and role names A and R,
– if x ∈ A^{I_K} then h(x) ∈ A^I;
– if (x, y) ∈ R^{I_K} then (h(x), h(y)) ∈ R^I
K ⊨ q(a) iff I_K ⊨ q(a)
I_K can be infinite
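ABox A: C(a)
TBox T: ⊤ ⊑ ∃R.A, ⊤ ⊑ ∃R.B
[Figure: canonical model I_K, an infinite tree of R-arrows starting at a (labelled C)]
[Figure: compact canonical model C_K over a (labelled C), w_A (labelled A) and w_B (labelled B), connected by R-arrows]
I_K is obtained by unravelling C_K; (C_K, a) ≤ (I_K, a)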
Con(K) = the set of all concepts in K
∆^{C_K} = Ind(A) ∪ {w_C | C ∈ Con(K)} (w_C is a witness for C)
A^{C_K} = {a | K ⊨ A(a)} ∪ {w_C | T ⊨ C ⊑ A} (A a concept name)
R^{C_K} = {(a, b) | R(a, b) ∈ A} ∪ {(a, w_C) | K ⊨ ∃R.C(a)} ∪ {(w_C, w_D) | T ⊨ C ⊑ ∃R.D} (R a role name)
❀ Satisfiability of EL KBs is PTime-complete
EL can be extended, without losing tractability, with
(e.g., R ◦ R ⊑ R means transitivity)
(the range of R is in C)
(the domain of R is in C)
≈ OWL 2 EL
Extensions with any of the constructs C ⊔ D, ∀R.C, ≥ q R, R−, symmetric roles result in ExpTime-hard reasoning
Exercise: construct an ELI (EL + inverse roles) KB K with C_K of exponential size
instance data + ontologies
Reasoning problem: answering queries over knowledge & data
q = C(x)
an ABox individual a is an answer iff T, A ⊨ C(a)
Example: T = {Boss ⊑ Employee}, A = {Boss(bob)}, q = Employee(x) (‘list all employees’)
Answer: x = bob (not an answer over A alone)
T, A ⊨ C(a) iff there is no I ⊨ T ∪ A such that I ⊨ ¬C(a) iff T ∪ A ∪ {¬C(a)} is not satisfiable
Instance checking is as complex as satisfiability checking
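The Boss/Employee example can be replayed with a tiny instance checker. The sketch assumes a TBox that consists only of inclusions between concept names (no ⊥, no ∃), for which following the inclusions upwards from each asserted concept is enough; the function name `certain_instances` is hypothetical.

```python
def certain_instances(tbox, abox, query_concept):
    """Individuals a with (T, A) |= C(a), for atomic inclusions only.

    tbox: list of pairs (A, B) meaning A ⊑ B over concept names.
    abox: set of pairs (A, a) meaning A(a).
    """
    answers = set()
    for concept, individual in abox:
        # Collect all concept names that subsume the asserted one.
        reachable, frontier = {concept}, [concept]
        while frontier:
            current = frontier.pop()
            for sub, sup in tbox:
                if sub == current and sup not in reachable:
                    reachable.add(sup)
                    frontier.append(sup)
        if query_concept in reachable:
            answers.add(individual)
    return answers


tbox = [("Boss", "Employee")]
abox = {("Boss", "bob")}
with_tbox = certain_instances(tbox, abox, "Employee")
without_tbox = certain_instances([], abox, "Employee")
```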
q = ∃y ϕ(x, y), where ϕ(x, y) is a conjunction of atoms A(z), R(z, z′) with z, z′ ∈ x ∪ y
a tuple a of ABox individuals is an answer iff I ⊨ ∃y ϕ(a, y) for every I ⊨ T ∪ A
usually more complex than satisfiability
complexity of answering CQs without quantified variables?
q = ∃y ϕ(x, y), ϕ may contain both ∧ and ∨ (but no ¬)
may contain ∧, ∨, ¬, ∀, ∃: no good, validity of FO formulas is undecidable
(1) as efficient as database query answering and (2) based on relational database management systems
bool: exercise
T:
Research ⊑ ∃worksIn, ∃worksIn− ⊑ Project, Project ⊑ ∃manages−,
∃manages ⊑ Academic ⊔ Visiting, ∃teaches ⊑ Academic ⊔ Research,
Academic ⊑ ∃teaches ⊓ ≤ 1 teaches, Research ⊓ Visiting ⊑ ⊥,
∃writes ⊑ Academic ⊔ Research
A = {teaches(a, b), teaches(a, c)}
q = ∃y
T′ = T ∪ {Visiting ⊑ ≥ 2 writes}
Disjunction is (NP-)hard even for data complexity
Only Horn logics can be suitable for ontology-based data access
Given a CQ q(x) over T, rewrite q(x) into an FO query q′(x) such that for all A and a, T, A ⊨ q[a] iff A ⊨ q′[a]
conjunctive query q + TBox T ❀ union of conjunctive queries q′; ABox A ❀ ABox A
‘Maximal’ DLs for which query answering is in FO (= AC0) for data complexity: DL-Lite^(HN)_horn under UNA and DL-Lite^H_horn without UNA
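For a single-atom query q(x) = A(x) the rewriting idea can be shown in a few lines: collect every basic concept B with T ⊨ B ⊑ A and take the disjunction of the corresponding atoms. The sketch assumes a TBox of positive inclusions between basic concepts only (concept names and ∃R, ∃R−), so it is not the rewriting algorithm of any particular system; the encoding and function names are hypothetical.

```python
def subsumees(target, inclusions):
    """All basic concepts B with B ⊑* target, following inclusions backwards."""
    found = [target]
    seen = {target}
    i = 0
    while i < len(found):
        for sub, sup in inclusions:
            if sup == found[i] and sub not in seen:
                seen.add(sub)
                found.append(sub)
        i += 1
    return found


def atom(basic, x="x"):
    """Render a basic concept as a query atom over variable x."""
    if isinstance(basic, tuple):          # ("exists", "R") or ("exists", "R-")
        role = basic[1]
        if role.endswith("-"):
            return f"EXISTS y. {role[:-1]}(y, {x})"
        return f"EXISTS y. {role}({x}, y)"
    return f"{basic}({x})"


def rewrite_atomic(target, inclusions):
    """Union of atomic queries equivalent to target(x) over any ABox."""
    return [atom(b) for b in subsumees(target, inclusions)]


# T = {B ⊑ ∃R, B ⊑ ∃S, ∃R ⊑ A}
tbox = [("B", ("exists", "R")), ("B", ("exists", "S")), (("exists", "R"), "A")]
rewriting = rewrite_atomic("A", tbox)
```

For this TBox the result is the disjunction A(x) ∨ ∃y R(x, y) ∨ B(x), which can be evaluated directly over the ABox stored in a database.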
Want: all tuples a of individuals in A such that I_K ⊨ q(a), where I_K is the canonical model of K = (T, A)
Can: query the ABox A (using an RDBMS)
To construct the canonical model I_K:
[Figure: ABox points labelled A, B, ∃R, ∃S]
by introducing ‘fresh’ witnesses
q′ should incorporate T
Compute the rewriting q′ for the following CQ and TBox:
[Figure: query graph of q(x) over x (labelled A), y1, y2, y3, y4 with R- and S-arrows]
T = {B ⊑ ∃R, B ⊑ ∃S, ∃R ⊑ A}
q(x) = ∃y1, y2, y3, y4
R(y1, y2) ∧ S(y1, y3) ∧ R(y4, y2) ∧ S(y4, y3)
(in ABox or the tree part)
Suppose y1 is in the ABox, while y2, y3, y4 are in the tree part
[Figure: the query split into q_abox (over x, labelled A, and y1) and q_tree (over y1, y2, y3, y4, with R- and S-arrows)]
T = {B ⊑ ∃R, B ⊑ ∃S, ∃R ⊑ A}
the canonical model?
rewritten query for this partition:
[Figure: x labelled A ∨ B ∨ ∃R, connected by R- and S-arrows to y1 labelled B]
take disjunction of such queries for all partitions
Off-the-shelf RDBMSs can be used for CQ answering in DL-Lite
working systems available (Quonto, Requiem, Presto)
Experimental results: not scalable for large DL-Lite_core ontologies: complexity paradox?
Reason: q over (T, A) ❀ q′ over A with |q′| = O((|T| · |q|)^|q|): is it optimal?
Is data complexity a proper measure? (in RDBMSs, typical queries are relatively small...)
Take the structure of A, T, q into account? Bounded treewidth? ...
The rewriting approach is not applicable to other tractable DLs, e.g., EL: why?
conjunctive query q + TBox T ❀ FO query q′; ABox A + TBox T ❀ ABox A′
❀ combined approach
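ABox A: [Figure: a (labelled A) and b, connected by an S-arrow]
TBox T: A ⊑ ∃R, ∃S− ⊑ B, ∃R− ⊑ ∃S, ∃S− ⊑ ∃S
[Figure: canonical model I_K over a (labelled A), b (labelled B) and fresh points labelled B, connected by S-arrows]
[Figure: ‘compact’ canonical model C_K over a (labelled A), b (labelled B), c_R and c_S (labelled B), connected by R- and S-arrows]
I_K is obtained by ‘unravelling’ C_K
Does C_K give correct answers to queries?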
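[Figure: I_K over a (labelled A), b (labelled B) and fresh points labelled B, with S-arrows; C_K over a, b, c_R and c_S with R- and S-arrows]
[Figure: query q over y0, y1, y2, y3, y4 with one R-arrow and three S-arrows]
What is the answer to q over C_K? What is the answer to q over I_K?
Find an FO expressible condition for such situations
[Figure: points a, c_S−, c_R, c_R, c_S]
Moscow, 23 August 2010 45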
Given K = (T, A), q and R(x, y) ∈ q,
f_{R(x,y)} : terms(q) → {c_S | S used in K} ∪ {ε} such that
– if f_{R(x,y)}(z) = ε then we must have x = z
– otherwise z must be mapped to f_{R(x,y)}(z)
In the previous example, f_{R(y1,y2)}(y3) = ε
f_{R(y,y)} does not exist
horn (1)
rewrite a given CQ q = ∃u ϕ into an FO query q† such that
answers to q over K = answers to q† over C_K
ϕ1 = ⋀_{v ∉ u} ⋀_R (v ≠ c_R)
‘all answer variables must get ABox values’
ϕ′1 = ⋀_{v ∉ u} ¬aux(v), where aux is a new relation containing all c_R, then |q†| = O(|q|)
horn (2)
ϕ2 = ⋀_{R(x,y) ∈ q, f_{R(x,y)} does not exist} (y ≠ c_R)
if no tree witness exists then y cannot be mapped to a non-ABox element
ϕ3 = ⋀_{R(x,y) ∈ q, f_{R(x,y)} exists} ((y = c_R) → ⋀_{f_{R(x,y)}(z) = ε} (z = x))
Exercise 1: compute q′ for the exercise on page 45
ϕ1 = ϕ2 = ⊤
ϕ3 = (y2 = cS) → (y1 = y3)
Exercise 2: Use the rewriting and combined approaches for the following KB and query:
T: Student ⊑ ∃hasTutor, ∃teaches− ⊑ Student, Professor ⊑ ∃teaches, ∃hasTutor− ⊑ Professor
A: {Student(a), Professor(b), teaches(b, a)}
q(x) = teaches(x, y), hasTutor(y, z), hasTutor(u, z)
what can we do with role inclusions?
Reduce positive existential queries over DL-Lite^(HN)_horn KBs to unions of (exponentially many) CQs over DL-Lite^N_horn KBs
Step 1. DL-Lite^(HN)_horn KB K = (T, A) ❀ DL-Lite^N_horn KB K_h = (T_h, A) by replacing all R ⊑∗ S with ∃R ⊑ ∃S (⊑∗ is the transitive closure of ⊑)
Step 2. Positive existential q over K ❀ union of CQs q_h over C_{K_h}:
– replace each R(t, t′) in q with ⋁_{S ⊑∗ R} S(t, t′)
– convert result into disjunctive normal form (exponential blowup): ≤ r^|q| conjuncts, where r is the depth of ⊑∗
K ⊨ q(a) iff C_{K_h} ⊨ q_h(a)
is there a polynomial rewriting?
❀ pure polynomial rewriting for DL-Lite^N_core
horn
(which is P-complete for data complexity)
with executing the original query over the data
(the formulas ϕ1–ϕ3 introduce additional selection conditions on top of the original query)
– is the exponential blowup unavoidable for role inclusions?
– is the exponential blowup unavoidable for positive existential queries?
– for which DLs can pure rewriting be polynomial?
The query rewriting approach cannot work for EL because already instance checking in EL is PTime-complete w.r.t. data complexity
Lower bound: by reduction of PTime-complete entailment for Horn CNF
E.g., ϕ = (a1 ∧ a2 → a3) ∧ (a2 → a1) ∧ a2 is encoded by the ABox A_ϕ
[Figure: ABox A_ϕ over a1, a2, a3, c1, c2 with P-, R- and S-arrows; a2 is labelled T]
and the (ϕ-independent) TBox T = {∃S.(∃P.T ⊓ ∃R.T) ⊑ T}
ϕ ⊨ a_i iff (T, A_ϕ) ⊨ T(a_i)
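The Horn entailment problem used in this reduction is itself solved by simple forward chaining, which makes it easy to see which atoms of the example formula ϕ are entailed. The clause encoding (body tuple, head) and the function name are assumptions of this sketch.

```python
def horn_consequences(clauses):
    """Atoms entailed by a set of definite Horn clauses (body, head)."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            # Fire the clause once its whole body has been derived.
            if head not in derived and set(body) <= derived:
                derived.add(head)
                changed = True
    return derived


# ϕ = (a1 ∧ a2 → a3) ∧ (a2 → a1) ∧ a2
phi = [(("a1", "a2"), "a3"), (("a2",), "a1"), ((), "a2")]
entailed_atoms = horn_consequences(phi)
```

Under the encoding on the slide, these are exactly the a_i for which (T, A_ϕ) ⊨ T(a_i).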
ABox A: C(a)
TBox T: ⊤ ⊑ ∃R.A, ⊤ ⊑ ∃R.B
[Figure: canonical model I_K, an infinite tree of R-arrows starting at a (labelled C)]
[Figure: compact canonical model C_K over a (labelled C), w_A (labelled A) and w_B (labelled B), connected by R-arrows]
I_K is obtained by unravelling C_K
Difference from DL-Lite: multiple R-successors of non-ABox points
rewrite a given CQ q = ∃u ϕ into an FO query q∗ such that
answers to q over K = answers to q∗ over C_K
ϕ1: answer variables and variables in cycles in q must be mapped to ABox
ϕ2: if R(x1, x2) and R(x3, x2) are in q and x2 is mapped outside the ABox then x1 = x3
ϕ3: if R(x1, x2) and S(x3, x2) are in q and R ≠ S then x2 must be mapped to ABox
ABox A: [Figure: a and b, both labelled A, connected by an R-arrow]
TBox T: A ⊑ ∃R.A
C_K: [Figure: a, b and w_A, all labelled A, connected by R-arrows]
q(x) = ∃y
x = b
q∗(x) = ∃y
ABox A: [Figure: a and b, both labelled A, connected by an R-arrow]
TBox T: A ⊑ ∃R.A
C_K: [Figure: a, b and w_A, all labelled A, connected by R-arrows]
q(x, x′) = ∃y
x′ = b
q∗(x) = ∃y
ABox(x) ∧ ABox(x′) ∧
no answer
comparing logical consequences over some common vocabulary Σ, not the syntactic form of the axioms (as in diff)
adding new axioms but preserving the relationships between terms of a certain part Σ of the vocabulary
importing an ontology and using its vocabulary Σ as originally defined
(relationships between terms of Σ should not change)
computing a subset M (ideally as small as possible) of an ontology T that ‘says’ the same about Σ as T
A fragment of a conceptual schema:
[Figure: classes Staff, Research, Visiting, Academic, ProjectManager and Project with disjointness (disj) and covering (cov) constraints; associations worksOn (1..*, 1..*) and manages (1..2)]
Translating into DL:
∃manages.⊤ ⊑ ProjectManager
∃manages−.⊤ ⊑ Project
Project ⊑ ∃manages−.⊤
≥ 3 manages−.⊤ ⊑ ⊥
Research ⊓ Visiting ⊑ ⊥
Academic ⊑ ProjectManager
ProjectManager ⊑ Academic ⊔ Visiting
. . .
DL-Lite_bool:
R ::= P | P−
B ::= ⊥ | A_i | ∃R | ≥ q R
C ::= B | ¬C | C1 ⊓ C2 | C1 ⊔ C2
TBox axioms: C1 ⊑ C2
ABox assertions: C(a), R(b, c)
Essentially positive existential queries: ∃y ϕ(x, y), built from C(t), R(t, t′), ∧, ∨
Let T1 and T2 be TBoxes and Σ a signature (concept and role names) When do T1 and T2 ‘say’ the same about Σ?
T1 c
Σ T2
T1 | = C ⊑ D implies T2 | = C ⊑ D
x) and ABoxes A, T1 q
Σ T2
(T1, A) | = q( a) implies (T2, A) | = q( a),
for all
a
for all Σ-interpretations I, T1 m
Σ T2
∃ I1 ⊇ I I1 | = T1 implies ∃ I2 ⊇ I I2 | = T2
T1 ≡S
Σ T2
T1 S
Σ T2 and T2 S Σ T1
Example 1. Σ = {Lecturer, Course}
T1 = ∅, T2 = {Lecturer ⊑ ∃teaches, ∃teaches− ⊑ Course}
T1 ≡^c_Σ T2 ?
T1 ≡^q_Σ T2 ?
Take A = {Lecturer(a)}, q = ∃y Course(y). Then (T1, A) ⊭ q but (T2, A) ⊨ q
Example 2. Σ = {Lecturer}
T1 = ∅, T2 = {Lecturer ⊑ ∃teaches, Lecturer ⊓ ∃teaches− ⊑ ⊥}
T1 ≡^c_Σ T2 ?
T1 ≡^q_Σ T2 ?
Take A = {Lecturer(a)}, q = ∃y ¬Lecturer(y). Then (T1, A) ⊭ q and (T2, A) ⊨ q
Example 3. Let T1 contain the axioms
Research ⊑ ∃worksOn, ∃worksOn− ⊑ Project, Project ⊑ ∃manages−,
∃manages ⊑ Academic ⊔ Visiting, ∃teaches ⊑ Academic ⊔ Research,
Academic ⊑ ∃teaches ⊓ ≤ 1 teaches, Research ⊓ Visiting ⊑ ⊥,
∃writes ⊑ Academic ⊔ Research,
T2 = T1 ∪ {Visiting ⊑ ≥ 2 writes} and Σ = {teaches}
T1 ≡^c_Σ T2: T2 ⊨ Visiting ⊑ Academic, but nothing new in the signature Σ
T1 ≢^q_Σ T2:
A = {teaches(a, b), teaches(a, c)}
q = ∃x
(T1, A) ⊭ q (I ⊨ (T1, A) but I ⊭ q)
(T2, A) ⊨ q
[Figure: three interpretations over a, b, c with further points labelled Research, Visiting, Project; Research, Visiting ⊔ Academic, Project; and Research, Academic, Project]
Let Q be a set of numerical parameters and Σ a signature
ΣQ-concepts B: A_i ∈ Σ and (≥ q R) with q ∈ Q and R ∈ Σ
A ΣQ-type t is a set of ΣQ-concepts containing B or ¬B (but not both), for all B
For a TBox T, a ΣQ-type t is T-realisable if t is satisfied in a model of T
(i.e., there is a model I of T and a point w in it such that w ∈ B^I iff B ∈ t)
a set Ξ of ΣQ-types is precisely T-realisable if there is a model of T realising precisely the types from Ξ
T1 Σ-concept entails T2 iff every T1-realisable ΣQ-type is T2-realisable
T1 Σ-query entails T2 iff every precisely T1-realisable set Ξ of ΣQ-types is precisely T2-realisable
Theorem.
Π^p_2-complete
elementary and coNExpTime-hard
bool
E.g. deciding Σ-concept and Σ-query inseparability for DL-Lite^N_horn is coNP-complete
Π^p_2-completeness means that the problem can be encoded as satisfiability of ∀∃ quantified Boolean formulas
Various QBF solvers can be used to check Σ-concept and Σ-query inseparability
(2ExpTime-complete for ALC, undecidable for ALCQIO)
Let T be a TBox, Q a set of numerical parameters and t a sig(T)Q-type
‘t_0 is T-realisable with t_1, . . . , t_n being witnesses’ = propositional formula Φ_T(b_0, b_1, . . . , b_n)
b_j is the vector of all propositional variables B∗ of the type t_j
Then the condition ‘every T1-realisable ΣQ-type t is T2-realisable’ is described by the following QBF
∀b^{ΣQ} [ ∃b^{T1\ΣQ} ∃b^{T1}_1 . . . ∃b^{T1}_{n1} Φ_{T1}(b^{ΣQ} · b^{T1\ΣQ}, b^{T1}_1, . . . , b^{T1}_{n1})
  → ∃b^{T2\ΣQ} ∃b^{T2}_1 · · · ∃b^{T2}_{n2} Φ_{T2}(b^{ΣQ} · b^{T2\ΣQ}, b^{T2}_1, . . . , b^{T2}_{n2}) ]
(b^{ΣQ} is the ΣQ-part of b_0 and b^{Ti\ΣQ} contains the rest of the variables)
TBox instances (standard Department Ontology + ICNARC)

series | description | instances | axioms T1 | axioms T2 | basic concepts T1 | basic concepts T2 | Σ
NN | T1 does not Σ-concept entail T2 | 840 | 59–308 | 74–396 | 47–250 | 49–300 | 5–103
YN | T1 Σ-concept but not Σ-query entails T2 | 504 | 56–302 | 77–382 | 44–246 | 58–298 | 6–89
YY | T1 Σ-query entails T2 | 624 | 43–178 | 43–222 | 40–158 | 40–188 | 5–64

QBF solvers
(http://skizzo.info/)
(http://www.cs.toronto.edu/~fbacchus/)
(http://www.princeton.edu/~chaff/quaffle.html)
(http://www.star.dist.unige.it/)
(http://www.mind-lab.it/aqme/)
(http://fmv.jku.at/depqbf/)

series | Σ-concept entailment QBF: variables | clauses | Σ-query entailment QBF: variables | clauses
NN | 1,469–48,631 | 2,391–74,621 | 1,715–60,499 | 5,763–1,217,151
YN | 1,460–46,873 | 2,352–71,177 | 1,755–59,397 | 7,006–1,122,361
YY | 1,006–16,033 | 1,420–23,363 | 1,202–20,513 | 2,963–204,889

number of clauses (in the number of roles): linear for Σ-concept entailment, quadratic for Σ-query entailment
[Figure: six plots, one per series (YY, YN, NN) for Σ-concept entailment and for Σ-query entailment, with % (10–100) on one axis and time in s (1–512, doubling scale) on the other]
studied under different names: forgetting, uniform interpolation, variable elimination. . .
A DL L admits forgetting (has uniform interpolation) if, for every T in L and every Σ, there exists T_Σ in L with sig(T_Σ) ⊆ Σ such that T and T_Σ are Σ-concept inseparable in L
Theorem. Both DL-Lite_bool and DL-Lite_horn have uniform interpolation and the uniform interpolant can be constructed in exponential time
DL-Lite^u_bool: C ::= . . . | ∃C | . . . (universal modality)
e.g., (≥ 2 teaches) ⊑ ∃(∃teaches ⊓ ≤ 1 teaches)
T_Σ with sig(T_Σ) ⊆ Σ is a uniform interpolant of T w.r.t. Σ in DL-Lite^u_bool if
T ⊨ C ⊑ D iff T_Σ ⊨ C ⊑ D, for every C ⊑ D in DL-Lite^u_bool with sig(C ⊑ D) ⊆ Σ
T′ Σ-query entails T iff T′ ⊨ C ⊑ D, for each C ⊑ D ∈ T_Σ
Theorem. For every T in DL-Lite_bool and every Σ one can construct a uniform interpolant T_Σ of T w.r.t. Σ in DL-Lite^u_bool in time exponential in T
Let S be an inseparability relation, T a TBox and Σ a signature. M ⊆ T is
– an S_Σ-module of T if M ≡^S_Σ T
– a self-contained S_Σ-module of T if M ≡^S_{Σ∪sig(M)} T
– a depleting S_Σ-module of T if ∅ ≡^S_{Σ∪sig(M)} T \ M
M is a minimal module of T if it can’t be made smaller
Facts:
depleting ≡^q_Σ-module ⇒ self-contained ≡^q_Σ-module ⇒ ≡^q_Σ-module
≡^q_Σ-module ⇒ ≡^c_Σ-module
(1) Publisher ⊑ ∃pubHasDistrib
(2) ∃pubHasDistrib− ⊑ Distributor
(3) Publisher ⊑ ¬Distributor
(4) ∃pubHasDistrib ⊑ Publisher
(5) Publisher ⊑ ≤ 1 pubHasDistrib
(6) Role ⊑ ¬Distributor
(7) User ⊑ ¬Distributor
(8) Publisher ⊑ ∃pubAdmedBy
(9) ∃pubAdmedBy− ⊑ AdmUser ⊔ BookUser
(10) AdmUser ⊑ User
(11) BookUser ⊑ User
(12) User ⊑ ∃hasRole
(13) ∃hasRole− ⊑ Role
(14) Role ⊑ ¬Publisher
(15) User ⊑ ¬Publisher
(16) Role ⊑ ¬User
(17) User ⊑ ∃userAdmedBy
(18) ∃userAdmedBy− ⊑ AdmUser
(19) ∃userAdmedBy ⊑ User
(20) ∃pubAdmedBy ⊑ Publisher
the minimal S^c_Σ-module is ∅
minimal S^q_Σ-modules of T: M_D, M_R and M_U
the minimal depleting S^q_Σ-module is T
input T, Σ
M := T
for each α ∈ M do
  if M \ {α} ≡^S_Σ M then M := M \ {α}
end for
NB: depends

input T, Σ
T′ := T; Γ := Σ; W := ∅
while T′ \ W ≠ ∅ do
  choose α ∈ T′ \ W
  W := W ∪ {α}
  if W ≢^S_Γ ∅ then
    T′ := T′ \ {α}; W := ∅; Γ := Γ ∪ sig(α)
  endif
end while
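Both extraction loops can be written generically against an inseparability oracle. In the sketch below the oracle is a deliberately simple stand-in: axioms are pairs (A, B) of concept names read as A ⊑ B, and two TBoxes count as Σ-inseparable if they entail the same inclusions between Σ-names. This toy oracle, the function names, and the reading of the second loop's output as T \ T′ are assumptions of the illustration; a real implementation would call a QBF-based or other decision procedure instead.

```python
def entailed(tbox, sigma):
    """Non-trivial inclusions between sigma-names entailed by atomic axioms."""
    pairs = set()
    for start in sigma:
        reach, frontier = {start}, [start]
        while frontier:
            cur = frontier.pop()
            for a, b in tbox:
                if a == cur and b not in reach:
                    reach.add(b)
                    frontier.append(b)
        pairs |= {(start, t) for t in reach if t in sigma and t != start}
    return pairs


def toy_inseparable(t1, t2, sigma):
    return entailed(t1, sigma) == entailed(t2, sigma)


def sig(axiom):
    return set(axiom)


def minimal_module(tbox, sigma, inseparable):
    """First loop: drop every axiom whose removal keeps M inseparable from M."""
    module = list(tbox)
    for axiom in list(tbox):
        candidate = [a for a in module if a != axiom]
        if inseparable(candidate, module, sigma):
            module = candidate
    return module


def depleting_module(tbox, sigma, inseparable):
    """Second loop: move axioms out of T' until the rest is inseparable from ∅."""
    rest, gamma, seen = list(tbox), set(sigma), []
    while any(a not in seen for a in rest):
        axiom = next(a for a in rest if a not in seen)
        seen.append(axiom)
        if not inseparable(seen, [], gamma):
            rest.remove(axiom)
            seen = []
            gamma |= sig(axiom)
    return [a for a in tbox if a not in rest]


tbox = [("A", "B"), ("B", "C"), ("D", "E")]
sigma = {"A", "C"}
m1 = minimal_module(tbox, sigma, toy_inseparable)
m2 = depleting_module(tbox, sigma, toy_inseparable)
```

The result of the first loop depends on the order in which axioms are visited, which is why it only yields a minimal module rather than a canonical one.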
[Figure: two bar charts (axis 200, 400, 600); values and labels in order of appearance: 34 MCM, 54 MQM, 294 MDQM, 465 ⊤⊥M, 206 SRS, 598 Prompt, 393 E-mod; 39 MCM, 66 MQM, 319 MDQM, 351 ⊤⊥M, 117 SRS, 509 Prompt, 191 E-mod]
Module sizes and standard deviation for |Σ| = 10
Σ-concept entailment is tractable;
for expressive DLs such as ALC;