The Tractability Frontier of Well-designed SPARQL Queries


Talk by Miguel Romero (University of Oxford) at ACM PODS 2018, 12 June, Houston, USA, on well-designed SPARQL, a fragment of SPARQL, the standard query language for RDF graphs.


SLIDE 1

The Tractability Frontier of Well-designed SPARQL Queries

Miguel Romero (University of Oxford)

ACM PODS 2018, 12 June, Houston-USA

SLIDE 2

Well-designed SPARQL

SPARQL: standard query language for RDF graphs

Well-designed SPARQL (Perez, Arenas, Gutierrez 2006)

  • Evaluation is coNP-complete (PSPACE-complete for SPARQL)

This work:

  • Well-designed SPARQL restricted to AND, OPTIONAL, UNION
SLIDE 3

Tractable evaluation

Evaluating well-designed SPARQL becomes tractable for some classes

  • Most general condition: local tractability

(Letelier, Perez, Pichler, Skritek 2013; Barceló, Pichler, Skritek 2015)

Main Question: Which classes of well-designed SPARQL queries can be evaluated in polynomial time?

Our Contribution: The tractable classes are precisely those of bounded domination width

SLIDE 4

Well-designed Pattern Trees/Forests

Well-designed Pattern Forests = Well-designed SPARQL queries with AND, OPTIONAL, UNION

Well-designed Pattern Trees = Well-designed SPARQL queries with AND, OPTIONAL

(Letelier, Perez, Pichler, Skritek 2013)

In this talk: We focus on (well-designed) pattern forests

SLIDE 5

Basics of RDF graphs and pattern trees/forests

SLIDE 6

RDF Graphs

Fix: set of identifiers I, set of variables V

RDF Graph = finite set of triples (s, p, o) from I x I x I

[Figure: a triple (s, p, o) drawn as an edge labelled p]
SLIDE 7

Conjunctive Queries (CQs) over RDF graphs

Fix: set of identifiers I, set of variables V

Conjunctive query (CQ) = AND of triples from (I U V) x (I U V) x (I U V) + free variables

q(?y, ?z) = (?x, p, o) AND (?y, ?x, a) AND (o, ?z, ?y) AND (p, ?w, ?w)

Answer of a CQ q(X) over an RDF graph G: q(G) = {h|X : h is a homomorphism from q to G}

  • Full CQ = All variables are free (no projection)
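
The definition of q(G) can be made concrete with a small brute-force sketch. It is not taken from the talk: the function names (`is_var`, `homomorphisms`, `answers`) and the example graph `G` are assumptions made for illustration, and the search is exponential in the size of the query. Triples are Python tuples, and variables carry a leading `?` as on the slide.

```python
def is_var(term):
    """Variables are written with a leading '?', as on the slides."""
    return term.startswith("?")


def homomorphisms(patterns, graph, partial=None):
    """Yield each mapping of variables to identifiers that sends every
    triple pattern in `patterns` to a triple of `graph`."""
    partial = dict(partial or {})
    if not patterns:
        yield partial
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(partial)
        matches = True
        for term, value in zip(first, triple):
            if is_var(term):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    matches = False
                    break
            elif term != value:
                matches = False
                break
        if matches:
            yield from homomorphisms(rest, graph, candidate)


def answers(patterns, free_vars, graph):
    """q(G): the restrictions h|X of all homomorphisms h, as tuples over X."""
    return {tuple(h[v] for v in free_vars) for h in homomorphisms(patterns, graph)}


# The CQ q(?y, ?z) from the slide.
q = [("?x", "p", "o"), ("?y", "?x", "a"), ("o", "?z", "?y"), ("p", "?w", "?w")]
# Hypothetical RDF graph, chosen only for illustration.
G = {("s", "p", "o"), ("b", "s", "a"), ("o", "r", "b"), ("p", "c", "c")}

print(answers(q, ["?y", "?z"], G))  # {('b', 'r')}
```

On this graph the only homomorphism maps ?x to s, ?y to b, ?z to r and ?w to c, so projecting onto the free variables (?y, ?z) leaves the single answer (b, r).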
SLIDE 8

Well-designed Pattern Trees

Well-designed Pattern Tree = (T, pat), where T is a rooted tree and pat is a function mapping each node of T to a full CQ such that

  • For each variable ?x, the set {t in T | ?x in pat(t)} is connected in T
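
The connectivity condition can be checked directly. The following is a minimal sketch under assumed encodings that are not part of the talk: the tree T is a dict `parent` mapping each node to its parent (the root maps to `None`), and `pat` maps each node to a list of triple patterns. It relies on the fact that a set of nodes of a rooted tree is connected exactly when it has a single topmost node, that is, one node whose parent lies outside the set.

```python
def variables(patterns):
    """Variables (terms with a leading '?') occurring in a list of triples."""
    return {term for triple in patterns for term in triple if term.startswith("?")}


def is_well_designed(parent, pat):
    """Check that, for each variable, the nodes mentioning it are connected in T."""
    all_vars = set().union(*(variables(p) for p in pat.values()))
    for var in all_vars:
        nodes = {t for t in pat if var in variables(pat[t])}
        # Nodes of the set whose parent is outside the set (or that are the root).
        tops = [t for t in nodes if parent[t] not in nodes]
        if len(tops) != 1:
            return False
    return True


# Hypothetical three-node tree: root r with children a and b.
parent = {"r": None, "a": "r", "b": "r"}
good = {"r": [("?x", "p", "?y")], "a": [("?y", "q", "?z")], "b": [("?x", "q", "?w")]}
# Here ?z occurs in a and b but not in their common parent r.
bad = {"r": [("?x", "p", "?y")], "a": [("?y", "q", "?z")], "b": [("?z", "q", "?x")]}

print(is_well_designed(parent, good), is_well_designed(parent, bad))  # True False
```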

SLIDE 9

Well-designed Pattern Trees: semantics

P=(T, pat)

Subtree T’ of P = subtree of T containing the root

pat(T’) = AND of all the CQs in {pat(t) | t in T’}

[Figure: pattern tree P with a subtree T’, and an RDF graph G]

SLIDE 10

Well-designed Pattern Trees: semantics

P=(T, pat)

Subtree T’ of P = subtree of T containing the root

pat(T’) = AND of all the CQs in {pat(t) | t in T’}

Child of T’ = node not in T’ whose parent is in T’

[Figure: pattern tree P with a subtree T’, and an RDF graph G]

SLIDE 11

Well-designed Pattern Trees: semantics

P=(T, pat)

h is in P(G) iff there is a subtree T’ such that

  • h is a homomorphism from pat(T’) to G
  • for each child t of T’, h cannot be extended to a homomorphism from pat(T’) AND pat(t) to G

[Figure: subtree T’ mapped into G by h; a child t of T’ with pattern pat(t) and a candidate extension g]
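
The semantics above translates almost literally into a membership test. The sketch below is a naive illustration, not the algorithm of the talk: it enumerates every subtree containing the root, so it takes exponential time in general. The encodings are assumptions: `children` maps a node to its list of children, `pat` maps a node to a list of triple patterns, and h is taken to be defined exactly on the variables of pat(T’).

```python
def is_var(term):
    return term.startswith("?")


def extends(h, patterns, graph):
    """True if h can be extended to a homomorphism from `patterns` to `graph`."""
    if not patterns:
        return True
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        g = dict(h)
        if all((g.setdefault(t, v) == v) if is_var(t) else t == v
               for t, v in zip(first, triple)):
            if extends(g, rest, graph):
                return True
    return False


def subtrees(children, root):
    """Yield every node set that contains `root` and is closed under parents."""
    def grow(frontier, chosen):
        if not frontier:
            yield frozenset(chosen)
            return
        node, rest = frontier[0], frontier[1:]
        yield from grow(rest, chosen)                                    # leave node out
        yield from grow(rest + children.get(node, []), chosen | {node})  # take node
    yield from grow(list(children.get(root, [])), {root})


def in_answer(h, children, root, pat, graph):
    """Decide whether h is in P(G), following the definition on the slide."""
    for nodes in subtrees(children, root):
        body = [tr for t in nodes for tr in pat[t]]
        dom = {x for tr in body for x in tr if is_var(x)}
        if dom != set(h) or not extends(h, body, graph):
            continue  # h is not a homomorphism from pat(T') to G
        kids = [c for t in nodes for c in children.get(t, []) if c not in nodes]
        if not any(extends(h, body + pat[c], graph) for c in kids):
            return True  # no child of T' admits an extension of h
    return False


# Hypothetical OPTIONAL-style pattern tree: a name, optionally with an email.
children = {"root": ["opt"]}
pat = {"root": [("?x", "name", "?n")], "opt": [("?x", "email", "?e")]}
G = {("alice", "name", "Alice"), ("alice", "email", "a@example.org"),
     ("bob", "name", "Bob")}

print(in_answer({"?x": "bob", "?n": "Bob"}, children, "root", pat, G))      # True
print(in_answer({"?x": "alice", "?n": "Alice"}, children, "root", pat, G))  # False
```

The second call returns False because the mapping for alice can still be extended to the optional child, so it is not maximal.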

SLIDE 12

Well-designed Pattern Forests

Well-designed Pattern Forest = Union of well-designed pattern trees

Answer of F={P1,…,Pm} over RDF graph G: F(G) = P1(G) U … U Pm(G)

SLIDE 13

The Evaluation Problem

Let C be a class of well-designed pattern forests

EVAL(C)

Instance: well-designed pattern forest F in C, RDF graph G, mapping h

Question: does h belong to F(G)?

SLIDE 14

Domination width and main theorem

SLIDE 15

Main Theorem

Theorem: Assume FPT ≠ W[1]. Let C be a recursively enumerable class of well-designed pattern forests. Then the following are equivalent:

  • EVAL(C) can be solved in polynomial time
  • C has bounded domination width

Proof based on the corresponding characterisation for conjunctive queries (Dalmau, Kolaitis, Vardi 2002; Grohe 2003)

Treewidth of a CQ = measure of tree-likeness

ctw(q(X)) := treewidth of the core of q(X)

SLIDE 16

The case of Conjunctive Queries

Theorem (Dalmau, Kolaitis, Vardi 2002; Grohe 2003): Assume FPT ≠ W[1]. Let C be a recursively enumerable class of conjunctive queries of bounded arity. Then the following are equivalent:

  • CQ-EVAL(C) can be solved in polynomial time
  • C has bounded ctw

Tractability part via the existential k-pebble game (Kolaitis, Vardi 1995)

  • Relaxation for checking existence of homomorphisms (complete, but not correct in general: a homomorphism implies a win for the Duplicator, but not conversely)
  • Existence of a winning strategy for the Duplicator can be decided in polynomial time
  • Always correct for conjunctive queries q with ctw(q) < k

Hardness part via a reduction from the clique problem (W[1]-hardness)
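
The pebble-game relaxation can be sketched as a fixpoint computation. The code below is an illustrative textbook-style formulation, not the procedure from the cited papers: it starts from all partial homomorphisms on at most k variables and repeatedly discards those that are not closed under restriction or that cannot be extended by some further variable; the Duplicator wins iff the empty map survives. The function names and the example are assumptions, and the enumeration is polynomial only for fixed k.

```python
from itertools import combinations, product


def is_var(term):
    return term.startswith("?")


def is_partial_hom(f, query, graph):
    """f respects every triple pattern whose variables all lie in dom(f)."""
    for triple in query:
        if all(not is_var(t) or t in f for t in triple):
            if tuple(f.get(t, t) for t in triple) not in graph:
                return False
    return True


def duplicator_wins(query, graph, k):
    """Existential k-pebble game from `query` to `graph` (brute-force sketch)."""
    variables = sorted({t for tr in query for t in tr if is_var(t)})
    values = sorted({x for tr in graph for x in tr})
    family = set()
    for size in range(min(k, len(variables)) + 1):
        for dom in combinations(variables, size):
            for img in product(values, repeat=size):
                f = dict(zip(dom, img))
                if is_partial_hom(f, query, graph):
                    family.add(frozenset(f.items()))
    changed = True
    while changed:
        changed = False
        for f in list(family):
            # Closure under restriction: dropping one pebble stays in the family.
            ok = all(f - {item} in family for item in f)
            # Forth property: with a free pebble, any variable can be answered.
            if ok and len(f) < k:
                bound = {var for var, _ in f}
                for v in variables:
                    if v not in bound and not any(f | {(v, b)} in family for b in values):
                        ok = False
                        break
            if not ok:
                family.discard(f)
                changed = True
    return frozenset() in family


# Hypothetical example: a directed 3-cycle query against a 2-cycle graph.
q = [("?x", "e", "?y"), ("?y", "e", "?z"), ("?z", "e", "?x")]
G = {("a", "e", "b"), ("b", "e", "a")}

print(duplicator_wins(q, G, 2))  # True, although no homomorphism exists
print(duplicator_wins(q, G, 3))  # False
```

The example shows both sides of the relaxation: with 2 pebbles the game wrongly accepts, while with 3 pebbles it correctly rejects, in line with the query having treewidth 2.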

SLIDE 17

The case of Conjunctive Queries

Theorem (Dalmau, Kolaitis, Vardi 2002; Grohe 2003): Assume FPT ≠ W[1]. Let C be a recursively enumerable class of conjunctive queries of bounded arity. Then the following are equivalent:

  • CQ-EVAL(C) can be solved in polynomial time
  • C has bounded ctw

Can be extended to unions of CQs (UCQs) Q(X)={q1(X),…,qm(X)}

ctw(Q(X)) = minimum k such that for every qi(X), there is qj(X) such that ctw(qj(X)) is at most k and qj(X) can be mapped to qi(X) via a homomorphism
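
The definition of ctw for a UCQ can be unfolded into a single max-min expression. In the formula below, the arrow q_j → q_i abbreviates "q_j(X) can be mapped to q_i(X) via a homomorphism"; this notation is an assumption of this note and does not appear on the slides. The inner minimum is always over a non-empty set, because each q_i maps to itself via the identity.

```latex
% Requires amsmath for \substack.
\[
  \mathrm{ctw}\bigl(Q(X)\bigr)
  \;=\;
  \max_{1 \le i \le m}\;
  \min_{\substack{1 \le j \le m \\ q_j \to q_i}}
  \mathrm{ctw}\bigl(q_j(X)\bigr)
\]
```

Intuitively, a disjunct q_i of high core treewidth does no harm as long as some disjunct q_j of low core treewidth maps into it.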

SLIDE 18

Domination width

P=(T, pat)

h in P(G) ?

Is h a “potential solution”? (can be computed in poly time)

[Figure: subtree T’ of P mapped into G by h]

SLIDE 19

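Domination width

P=(T, pat)

h in P(G) ?

X := vars(T’), where t1, …, tn are the children of T’

CQ qti(X) := (pat(T’) AND pat(ti))(X)

UCQ QT’(X) := {qt1(X),…,qtn(X)}

h is not in P(G) iff h is in QT’(G)

[Figure: subtree T’ of P with children t1, …, ti, …, tn, mapped into G by h]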

SLIDE 20

Domination width

P=(T, pat)

h in P(G) ?

X := vars(T’), where t1, …, tn are the children of T’

CQ qti(X) := (pat(T’) AND pat(ti))(X)

UCQ QT’(X) := {qt1(X),…,qtn(X)}

h is not in P(G) iff h is in QT’(G)

dw(P) := maximum ctw(QT’(X)), over all subtrees T’

SLIDE 21

Domination width

P=(T, pat)

CQ qti(X) := (pat(T’) AND pat(ti))(X)

dw(P) := maximum ctw(QT’(X)), over all subtrees T’

dw(P) < k: for every child ti of T’ there is a child tj of T’ such that qtj(X) can be mapped to qti(X) via a homomorphism and ctw(qtj(X)) < k

SLIDE 22

Domination width

P=(T, pat)

CQ qti(X) := (pat(T’) AND pat(ti))(X)

dw(P) := maximum ctw(QT’(X)), over all subtrees T’

dw(P) < k

h in P(G) ?

  • existential k-pebble game

SLIDE 23

Domination width

F={P1,…,Pm}

h in F(G) ?

T’ AND T’’ (rename new variables)

[Figure: subtrees T’ and T’’ taken from pattern trees of F, each mapped into G by h]

SLIDE 24

Domination width

F={P1,…,Pm}

h in F(G) ?

X := vars(T’)=vars(T’’)=dom(h)

Q{T’,T’’}(X) := {pat(T’) AND pat(T’’) + choice of children} (and renaming)

h is not in F(G) iff h is in Q{T’,T’’}(G)

dw(F) = maximum ctw(QS(X)), over all sets S of subtrees over the same set of variables X and satisfying a certain closure property

SLIDE 25

Main Theorem

Theorem: Assume FPT ≠ W[1]. Let C be a recursively enumerable class of well-designed pattern forests. Then the following are equivalent:

  • EVAL(C) can be solved in polynomial time
  • C has bounded domination width

Tractability part: Application of the existential k-pebble game as for the case of conjunctive queries (Dalmau, Kolaitis, Vardi 2002)

Hardness part: Reduction from clique (Grohe 2003) + some basic properties of pattern forests with large dw

SLIDE 26

The case of UNION-free queries (pattern trees)

SLIDE 27

Branch Treewidth

P=(T, pat)

Branch Bt of t

[Figure: pattern tree with root r and a node t with pattern pat(t); the branch Bt of t is marked]

SLIDE 28

Branch Treewidth

P=(T, pat)

Branch Bt of t

X := vars(Bt)

CQ bt(X) := (pat(Bt) AND pat(t))(X)

bw(P) := maximum ctw(bt(X)) over all nodes t of T

Proposition: For every well-designed pattern tree P, we have dw(P)=bw(P)
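
The construction of the CQ bt(X) can be sketched as follows. The slides do not spell out what the branch Bt is; the code assumes it is the path from the root r down to the parent of t, which is a reading suggested by the figure labels and should be checked against the paper. The encodings (`parent`, `pat`) and the function names are likewise hypothetical. Computing ctw(bt(X)) itself would additionally need cores and treewidth and is not attempted here.

```python
def branch(parent, t):
    """Nodes on the path from the root to the parent of t (assumed meaning of Bt)."""
    path = []
    node = parent[t]
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]  # root first


def branch_cq(parent, pat, t):
    """Return the body and free variables X = vars(Bt) of the CQ bt(X)."""
    nodes = branch(parent, t)
    branch_body = [tr for n in nodes for tr in pat[n]]
    free = sorted({x for tr in branch_body for x in tr if x.startswith("?")})
    return branch_body + pat[t], free


# Hypothetical chain r -> t1 -> t2.
parent = {"r": None, "t1": "r", "t2": "t1"}
pat = {"r": [("?x", "p", "?y")], "t1": [("?y", "q", "?z")], "t2": [("?z", "q", "?w")]}

body, free = branch_cq(parent, pat, "t2")
print(free)  # ['?x', '?y', '?z']
```

In this example ?w occurs only in pat(t2), so it is the one variable of bt(X) that is not free.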

SLIDE 29

Final Remarks

Characterisation of tractable classes of pattern forests (well-designed SPARQL restricted to AND, OPTIONAL, UNION)

  • Dichotomy: A class C is either tractable or W[1]-hard

The {AND, OPTIONAL, UNION} fragment is maximal with this property:

  • Dichotomy fails when we add FILTER (CQs with inequalities) and SELECT (Kroll, Pichler, Skritek 2016)

Open problem: Characterise fixed-parameter tractable classes of queries with SELECT (running time f(|q|) · |G|^c)

(Recent characterisation for simple queries, Mengel, Skritek 2018)

Thank you!