

slide-1
SLIDE 1

Dichotomies in Ontology-Mediated Querying with the Guarded Fragment

Frank Wolter University of Liverpool

Based on joint work with A. Hernich, C. Lutz and F. Papacchini (PODS 2017)

slide-2
SLIDE 2

Dichotomy Theorems

Given a class of problems, we would like to classify them into the hard and the easy problems. Ideally, there shouldn’t be any intermediate problems.

[Diagram: regions labelled Hard, Easy, and Intermediate]

slide-3
SLIDE 3

We focus on P/NP Dichotomy Theorems

By Ladner’s Theorem, there are NP-intermediate problems (if P ≠ NP). Moreover, being in P is undecidable for problems in NP (if P ≠ NP). Thus, we can expect P/NP dichotomy theorems only for rather restricted classes of problems.

[Diagram: regions labelled NP-complete, in PTime, and NP-intermediate]

slide-8
SLIDE 8

Homomorphism (or CSP) Problems

Consider an undirected graph H. How hard is the following problem:

  • Input: an undirected graph G.
  • Question: is there a homomorphism h from G to H?

((h(a), h(b)) is an edge in H if (a, b) is an edge in G.)

  • if H is a single self-loop?
  • if H = K2 (K2 complete graph on two vertices)?
  • if H = K3?

Hell and Nešetřil (1990): This problem is in PTime iff H contains a self-loop or is bipartite. Otherwise this problem is NP-complete.
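
The homomorphism problem above can be made concrete with a brute-force check. The following is only a minimal sketch: the function name `has_homomorphism` and the example graphs are hypothetical, and the search is exponential, so it illustrates the problem statement rather than the PTime algorithms behind the Hell and Nešetřil theorem.

```python
from itertools import product

def has_homomorphism(g_vertices, g_edges, h_vertices, h_edges):
    """Return True if some map h: V(G) -> V(H) sends every edge of G to an edge of H."""
    # Undirected graphs: store every edge of H in both directions.
    h_adj = set(h_edges) | {(b, a) for (a, b) in h_edges}
    g_vertices = list(g_vertices)
    # Brute force over all |V(H)|^|V(G)| candidate maps (exponential time).
    for image in product(list(h_vertices), repeat=len(g_vertices)):
        h = dict(zip(g_vertices, image))
        if all((h[a], h[b]) in h_adj for (a, b) in g_edges):
            return True
    return False

# Hypothetical example graphs, given as (vertices, edges).
LOOP = (["v"], [("v", "v")])
K2 = ([0, 1], [(0, 1)])
K3 = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
TRIANGLE = (["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
SQUARE = (["p", "q", "r", "s"], [("p", "q"), ("q", "r"), ("r", "s"), ("s", "p")])

print(has_homomorphism(*TRIANGLE, *LOOP))  # every graph maps to a self-loop
print(has_homomorphism(*TRIANGLE, *K2))    # same as asking: is the triangle 2-colorable?
print(has_homomorphism(*TRIANGLE, *K3))    # same as asking: is the triangle 3-colorable?
```

With H = K2 the question is 2-colorability (bipartiteness) of G, and with H = K3 it is 3-colorability, which is where the PTime/NP-complete split in the theorem shows up.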
slide-11
SLIDE 11

Generalization to Relational Structures (CSP)

Let H be a finite relational structure (also called a template). The constraint satisfaction problem for H, CSP(H), is the following decision problem:

  • Input: a finite relational structure D.
  • Question: is there a homomorphism from D to H?

Feder-Vardi Conjecture (1993): There is a P/NP dichotomy for CSPs. Equivalently, there is such a dichotomy for digraphs.

Lots of progress over the past 20 years (mainly due to algebraic reformulation):

  • Early result: There is a P/NP dichotomy for CSPs with two elements (Schaefer 1978).
  • Example: There is a P/NP dichotomy for CSPs with three elements (Bulatov 2006).

slide-16
SLIDE 16

Ontology Mediated Querying of Data (Example 1)

  • Data D: finite set of ground atoms (often regarded as a finite relational structure); e.g., LiverpoolAcademic(peter), HasLiverpoolId(sue, Liv123)

  • Ontology O: a finite set of FO-sentences; e.g.,

∀x (LiverpoolAcademic(x) → ∃y HasLiverpoolId(x, y))

  • Query q(x̄): an FO-formula; e.g., q(x) = ∃y HasLiverpoolId(x, y)

A tuple ā of elements of dom(D) is a certain answer for q and O over D if D ∪ O ⊨ q(ā).

Here D ∪ O ⊨ q(a) ⇔ a ∈ {sue, peter}
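
For this particular example, the certain answers can be computed by completing the data with one placeholder value per academic who lacks an id and then evaluating the query, keeping only answers from dom(D). The sketch below hard-codes the single rule of Example 1; the function name `certain_answers` and the `_null` naming scheme are assumptions made for illustration, and the approach is not claimed to work for arbitrary FO ontologies.

```python
def certain_answers(data):
    """Certain answers to q(x) = ∃y HasLiverpoolId(x, y) under the rule
    ∀x (LiverpoolAcademic(x) → ∃y HasLiverpoolId(x, y))."""
    facts = set(data)
    # dom(D): all constants that occur in the original data.
    constants = {term for fact in data for term in fact[1:]}
    fresh = 0
    for fact in sorted(data):
        if fact[0] == "LiverpoolAcademic":
            x = fact[1]
            has_id = any(f[0] == "HasLiverpoolId" and f[1] == x for f in facts)
            if not has_id:
                # Add an anonymous witness for the existential quantifier.
                facts.add(("HasLiverpoolId", x, f"_null{fresh}"))
                fresh += 1
    # Evaluate the query, restricted to elements of dom(D).
    return {f[1] for f in facts if f[0] == "HasLiverpoolId" and f[1] in constants}

D = {("LiverpoolAcademic", "peter"), ("HasLiverpoolId", "sue", "Liv123")}
print(sorted(certain_answers(D)))  # ['peter', 'sue']
```

Note that Liv123 belongs to dom(D) but is not an answer, and that peter is an answer even though no concrete id for him is known.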

slide-20
SLIDE 20

Ontology Mediated Querying of Data (Reachability)

  • Ontology O:

{∀x (∃y (H(y) ∧ parent(x, y)) → H(x))}

  • Query q:

q(x) = H(x)

  • Data D:

parent(b0, b1), · · · , parent(b5, b6), H(b6)

  • Certain answers for q(x) and O over D are:

D ∪ O ⊨ q(a) ⇔ a ∈ {b0, b1, b2, b3, b4, b5, b6}.
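
The ontology here behaves like the datalog rule H(x) ← parent(x, y), H(y), so the certain answers can be obtained by a least-fixpoint computation. A minimal sketch, with the hypothetical function name `certain_h`:

```python
def certain_h(parent_edges, h_facts):
    """Least fixpoint of the rule H(x) <- parent(x, y), H(y)."""
    derived = set(h_facts)
    changed = True
    while changed:
        changed = False
        for x, y in parent_edges:
            # If the child y is in H, the rule forces the parent x into H.
            if y in derived and x not in derived:
                derived.add(x)
                changed = True
    return derived

# Data D from the slide: parent(b0, b1), ..., parent(b5, b6), H(b6).
parents = [(f"b{i}", f"b{i + 1}") for i in range(6)]
print(sorted(certain_h(parents, {"b6"})))
# ['b0', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6']
```

H propagates backwards along the parent chain from b6, which is why every b_i is a certain answer.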

slide-25
SLIDE 25

Ontology Mediated Querying of Data (Colorability)

  • Ontology O:

– ∀x (red(x) ∨ blue(x) ∨ green(x)).
– ∀x ∀y (red(x) ∧ E(x, y) ∧ red(y) → clash(x)) (same for blue, green).

  • Query q:

q() = ∃x clash(x)

  • Data D: undirected graph

D = (W, E)

  • Certain Answers to q and O over D:

O ∪ D ⊨ q iff D is not 3-colorable

  • One can do this for every CSP(H).
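
The equivalence between entailment of the clash query and non-3-colorability can be checked by brute force on small graphs. This is a sketch under the assumption that it suffices to consider models giving each vertex exactly one color (extra colors can only create more clashes); the function names are hypothetical and the search is exponential.

```python
from itertools import product

COLORS = ("red", "blue", "green")

def is_three_colorable(vertices, edges):
    """Return True if some coloring gives different colors to the ends of every edge."""
    vertices = list(vertices)
    for assignment in product(COLORS, repeat=len(vertices)):
        color = dict(zip(vertices, assignment))
        if all(color[a] != color[b] for a, b in edges):
            return True  # a clash-free model of O ∪ D exists
    return False

def entails_clash(vertices, edges):
    """O ∪ D ⊨ ∃x clash(x) holds iff every coloring produces a clash."""
    return not is_three_colorable(vertices, edges)

triangle = (["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
k4 = ([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)])
print(entails_clash(*triangle))  # False: the triangle is 3-colorable
print(entails_clash(*k4))        # True: K4 is not 3-colorable
```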
slide-30
SLIDE 30

Relevant Languages for Ontology-Mediated Querying

Lots of results over the past 15 years on the complexity of deciding D ∪ O ⊨ q(ā), where q is typically a conjunctive query (or primitive positive sentence), that is, an FO-sentence of the form

∃x̄ ⋀_{i∈I} R_i(x̄_i)

O is often in a fragment of the guarded fragment (GF) of FO, which admits only guarded quantifiers

∀ȳ(α(x̄, ȳ) → ϕ(x̄, ȳ)), ∃ȳ(α(x̄, ȳ) ∧ ϕ(x̄, ȳ))

where ϕ(x̄, ȳ) is in GF and α(x̄, ȳ) is an atomic formula containing all variables in x̄ ∪ ȳ.

GF inherits many nice properties from modal and description logics.

We also consider the 2-variable guarded fragment of FO with counting (GC2):

∀x(x = x → (∃≥200 y author_of(x, y) → ProlificAuthor(x)))

slide-34
SLIDE 34

Some Complexity Results

Deciding D ∪ O ⊨ q(ā) for q a conjunctive query and

  • O empty: NP-complete (homomorphism problem).
  • O in GF or GC2: 2ExpTime-complete (Bárány et al. 2010).
  • O in 2-variable fragment of FO: undecidable (Rosati 2007).

Slightly misleading!

slide-39
SLIDE 39

Data Complexity

When deciding D ∪ O ⊨ q(ā), assume that O and q are small and D is large. Then it is reasonable to assume that

  • O and q are fixed and D is the only input; thus focus on data complexity.

We obtain:

  • O empty: AC0.
  • O in GF or GC2: coNP-complete (Bárány et al. 2010).

For data complexity, coNP-hardness is very bad news! It has become a huge industry to determine fragments of GF and GC2 in PTime (or, even better, FO- or datalog-rewritable fragments).

slide-40
SLIDE 40

P/coNP Dichotomy Theorems for Ontology Mediated Querying in GF and GC2

  • O is in PTime if for every conjunctive query q, deciding O ∪ D ⊨ q is in PTime in data complexity.
  • O is coNP-hard if there exists a conjunctive query q such that deciding O ∪ D ⊨ q is coNP-hard in data complexity.

[Diagram: regions labelled coNP-complete, in PTime, and coNP-intermediate]

slide-45
SLIDE 45

Relevant Fragments

We consider the fragments uGF and uGC2 of GF and GC2 which are invariant under disjoint unions. Modulo logical equivalence, uGF and uGC2 sentences take the form

∀x(x = x → ϕ(x)), ∀x̄(R(x̄) → ϕ(x̄))

where ϕ contains no closed subformulas and does not use equality as a guard.

The depth of a uGF or uGC2 sentence is the number of nestings of guarded quantifiers, not counting the outermost universal guarded quantifier. The following sentence has depth 1:

∀x(x = x → ∀y(author_of(x, y) → Book(y)))

In uGF⁻ and uGC2⁻ we admit only x = x as the outermost guard.

385 out of 411 ontologies in the Bioportal repository are in GC2⁻ (depth 1).

slide-49
SLIDE 49

Illustration (all in GC2⁻(1))

  • The ontology

O1 = {∀x(∃≥200 y author_of(x, y) → ProlificWriter(x))} is in PTime.

  • The ontology

O2 = {∀x(Writer(x) → ∃y(author_of(x, y) ∧ Book(y)))} is in PTime.

  • The ontology O1 ∪ O2 is coNP-hard.
  • The ontology

O1 ∪ O2 ∪ {∀xy(author_of(x, y) → Book(y))} is again in PTime.

slide-50
SLIDE 50

Summary of Results

No dichotomy:

  • uGF2⁻(2, f)

CSP-hard (Datalog≠ ≠ PTime):

  • uGF2(1, =)
  • uGF2(2)
  • uGF2(1, f)

Dichotomy (Datalog≠ = PTime):

  • uGF⁻(1, =)
  • uGF(1)
  • uGF2⁻(2)
  • uGC2⁻(1, =)

Notation:

  • the number in brackets indicates the depth,
  • f indicates the presence of partial functions,
  • ·2 indicates the restriction to two variables,
  • ·⁻ restricts outermost guards to be equality.
slide-52
SLIDE 52

Necessary condition for PTime: Materializability

An ontology O is materializable if for every D there exists a model A of O ∪ D such that for all conjunctive queries q:

O ∪ D ⊨ q ⇔ A ⊨ q

Let O be an FO ontology invariant under disjoint unions. If O is not materializable, then O is coNP-hard.

slide-53
SLIDE 53

Materializability is not sufficient for PTime

We construct a materializable ontology from the ontology O encoding three-colorability:

  • ∀x (red(x) ∨ blue(x) ∨ green(x)).
  • ∀x ∀y (red(x) ∧ E(x, y) ∧ red(y) → clash(x)) (same for blue, green).

Clearly O itself is not materializable. Replace red(x), blue(x), and green(x) by complex formulas that are not directly visible to conjunctive queries, e.g.

∃y(R_red(x, y) ∧ ∀z(S_red(y, z) → red(z)))

Then the resulting ontology is still coNP-hard but materializable.

slide-57
SLIDE 57

A sufficient condition for PTime (even datalog-rewritability)

Every relational structure D can be unravelled into a guarded tree-decomposable structure D∗ (sometimes also called acyclic). This unravelling preserves formulas in GF and GC2.

An ontology O is unravelling tolerant if for every D the following holds for the unravelling D∗ of D: for all acyclic conjunctive queries q:

O ∪ D ⊨ q ⇔ O ∪ D∗ ⊨ q

Let O be a uGF or uGC2 ontology. If O is unravelling tolerant, then O is in PTime (actually datalog-rewritable).

slide-58
SLIDE 58

The Dichotomy Theorem

Let O be in any of the languages uGF⁻(1, =), uGF(1), uGF2⁻(2), uGC2⁻(1, =).

Then we have the following classification:

[Diagram: two regions. One is labelled coNP-complete; the other is labelled in PTime, Datalog rewritable, materializable, unravelling tolerant]

slide-61
SLIDE 61

Undecidability and Non-Dichotomy

In uGF2⁻(2, f) we have symbols for partial functions (weak counting), depth 2 formulas, and at most two variables.

For uGF2⁻(2, f) ontologies it is undecidable whether they are in PTime and whether they are coNP-hard (unless P = NP). Materializability and datalog rewritability are undecidable.

To show non-dichotomy we prove a variation of Ladner’s Theorem: There exists a Turing machine whose run fitting problem (can a partial run be extended to a full run of the machine?) is neither in PTime nor NP-hard (unless P = NP).

Using this result we show: For uGF2⁻(2, f) ontologies there is no P/coNP dichotomy (unless P = NP).

slide-62
SLIDE 62

Problems

  • Is there a P/coNP dichotomy for uGF? Many smaller steps...
  • Decidability: assume we have a dichotomy for a class of ontologies. Is it decidable whether an ontology O from the class is in PTime?