Exercise 11: Graph Databases and Path Queries
Database Theory 2020-07-06 Maximilian Marx, David Carral
Exercise 1
Exercise. It was explained in the lecture that RDF and Property Graph can encode the same graph structures. How could we encode arbitrary hypergraphs (relational databases) in RDF? RDF can be considered as a synonym for “labelled directed graph” here; the technical details of the RDF standard are not important for this exercise.
Solution.
◮ Let G be some labelled hypergraph.
◮ We construct G_RDF by reifying hyperedges: for every p-labelled hyperedge ϕ = p(t_1, t_2, ..., t_ℓ) in G, we add to G_RDF
  ◮ labels p_1, p_2, ..., p_ℓ;
  ◮ a fresh vertex v_ϕ; and
  ◮ edges p_1(v_ϕ, t_1), p_2(v_ϕ, t_2), ..., p_ℓ(v_ϕ, t_ℓ).
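The reification step can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: hyperedges are given as (label, terms) pairs, edges are (label, source, target) triples, and the fresh vertex names v0, v1, ... are assumed not to clash with existing terms. The function name `reify` is hypothetical.

```python
from typing import Iterable, Set, Tuple

Hyperedge = Tuple[str, Tuple[str, ...]]   # (label p, terms (t_1, ..., t_l))
Edge = Tuple[str, str, str]               # (edge label, source, target)


def reify(hyperedges: Iterable[Hyperedge]) -> Set[Edge]:
    """Encode each hyperedge p(t_1, ..., t_l) as l binary edges p_i(v, t_i)."""
    edges: Set[Edge] = set()
    for index, (label, terms) in enumerate(hyperedges):
        # Fresh vertex standing for the hyperedge; the naming scheme is an assumption.
        vertex = f"v{index}"
        for position, term in enumerate(terms, start=1):
            # Position-indexed label p_i, written here as the string "p" + str(i).
            edges.add((f"{label}{position}", vertex, term))
    return edges


example = [("flight", ("dresden", "berlin", "lh123"))]
print(sorted(reify(example)))
```

Every hyperedge gets its own vertex, so two hyperedges with the same label and overlapping terms stay distinguishable in the resulting graph.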
[...] explain why there is none.
S(x, x) ← human(x)
S(x, y) ← parent(x, w) ∧ S(v, w) ∧ parent(y, v)
Solution.
1.
◮ S matches paths of the form parent^n ◦ human ◦ (parent^{-1})^n, with n ≥ 0.
◮ The set of these paths is not a regular language, and hence S cannot be expressed as a 2RPQ.
◮ Since the length of a matched path is not accessible in a C2RPQ, S cannot be expressed as a C2RPQ either.
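To see which pairs S relates, the two rules can be evaluated by a naive fixpoint. The sketch below is an illustration, not part of the slides; it assumes that `parent` is a set of pairs (x, w) read as parent(x, w), and the sample family is made up.

```python
def same_generation(humans, parent):
    """Naive fixpoint for the two S rules over a set of parent(x, w) pairs."""
    s = {(x, x) for x in humans}  # S(x, x) <- human(x)
    while True:
        # S(x, y) <- parent(x, w), S(v, w), parent(y, v)
        derived = {(x, y) for (x, w) in parent for (y, v) in parent if (v, w) in s}
        if derived <= s:
            return s
        s |= derived


# Hypothetical data: c1 and c2 share the parent p1; p1 and p2 share the parent g.
parent = {("c1", "p1"), ("c2", "p1"), ("p1", "g"), ("p2", "g"), ("d", "p2")}
humans = {"c1", "c2", "p1", "p2", "g", "d"}
s = same_generation(humans, parent)
print(("c1", "d") in s, ("c1", "p2") in s)
```

On this data, c1 and d are related (two parent steps up, two steps down), while c1 and p2 are not, because the numbers of steps would differ.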
AncCity(x, y, x′, y′) ← parent(x, x′) ∧ bornIn(x, y) ∧ bornIn(x′, y′)
AncCity(x, y, x′′, y′′) ← AncCity(x, y, x′, y′) ∧ AncCity(x′, y′, x′′, y′′)
Query(x, x′, y) ← AncCity(x, y, x′, y)
Solution.
2. The following C2RPQ expresses Query, provided that every person on the parent-path also has a bornIn edge (the AncCity rules require one for each of them):
(parent ◦ parent∗)(x, x′) ∧ bornIn(x, y) ∧ bornIn(x′, y)
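The Datalog program and the C2RPQ can be compared on small data. The following sketch is an illustration only; the function names and the sample facts are assumptions. Note that the AncCity rules need a bornIn fact for every person on the path, so the two queries agree only on data where that holds.

```python
def datalog_query(parent, born_in):
    """Evaluate AncCity by a naive fixpoint, then project to Query(x, x', y)."""
    anc = {(x, y, x2, y2)
           for (x, x2) in parent
           for (a, y) in born_in if a == x
           for (b, y2) in born_in if b == x2}
    while True:
        derived = {(x, y, x3, y3)
                   for (x, y, x2, y2) in anc
                   for (a, b, x3, y3) in anc if (a, b) == (x2, y2)}
        if derived <= anc:
            break
        anc |= derived
    return {(x, x2, y) for (x, y, x2, y2) in anc if y == y2}


def c2rpq_query(parent, born_in):
    """Evaluate (parent o parent*)(x, x') and bornIn(x, y) and bornIn(x', y)."""
    reach = set(parent)  # parent o parent*, i.e. the transitive closure
    while True:
        derived = {(x, z) for (x, y) in reach for (y2, z) in parent if y == y2}
        if derived <= reach:
            break
        reach |= derived
    return {(x, x2, y) for (x, x2) in reach
            for (a, y) in born_in if a == x and (x2, y) in born_in}


parent = {("ann", "bob"), ("bob", "cat")}
born_in = {("ann", "dresden"), ("bob", "leipzig"), ("cat", "dresden")}
print(datalog_query(parent, born_in) == c2rpq_query(parent, born_in))
```

Removing the fact bornIn(bob, leipzig) makes the Datalog answer empty while the C2RPQ still returns (ann, cat, dresden), which is the only situation in which the two differ.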
DDAnc(x, y) ← parent(x, y) ∧ bornIn(x, dresden) ∧ bornIn(y, dresden)
DDAnc(x, z) ← DDAnc(x, y) ∧ parent(y, z) ∧ bornIn(z, dresden)
Solution.
3.
◮ DDAnc matches parent-paths on which every node has a bornIn-connection to dresden.
◮ This is not expressible as a 2RPQ: a path cannot check the birthplace of a node and then continue from that same node, since (bornIn ◦ bornIn^{-1})(x, y) will generally also be true for x ≠ y.
◮ Since the intermediate nodes on a matched path are not accessible in a C2RPQ, this is also not expressible as a C2RPQ.
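The problem with bornIn ◦ bornIn^{-1} can be checked directly on two facts. This is a minimal sketch with made-up data; `compose` and `inverse` are hypothetical helper names for relational composition and inversion.

```python
def compose(r, s):
    """Relational composition: pairs (x, z) with r(x, y) and s(y, z) for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def inverse(r):
    """Inverse relation: swap source and target of every pair."""
    return {(y, x) for (x, y) in r}


born_in = {("ann", "dresden"), ("bob", "dresden")}
detour = compose(born_in, inverse(born_in))
print(("ann", "bob") in detour)
```

The detour through dresden may return to a different person, so it cannot serve as a test on the current node of the path.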
[...] required automaton “on the fly”?
Solution.
◮ Let E, E′ be regular expressions.
◮ Construct NFAs N and N′ deciding L(E) and L(E′):
  ◮ If E = ℓ ∈ L is a label, then N is the following NFA:
    [Figure: NFA with initial state i, final state f, and an ℓ-transition from i to f]
  ◮ If E = (E_1 ◦ E_2), and N_1 and N_2 are NFAs deciding L(E_1) and L(E_2), then N is the following NFA:
    [Figure: NFA with initial state i, the states i_N1, ..., f_N1 of N_1, the states i_N2, ..., f_N2 of N_2, final state f, and three ε-transitions]
  ◮ If E = (E_1 + E_2), and N_1 and N_2 are NFAs deciding L(E_1) and L(E_2), then N is the following NFA:
    [Figure: NFA with initial state i, the states i_N1, ..., f_N1 of N_1, the states i_N2, ..., f_N2 of N_2, final state f, and four ε-transitions]
  ◮ If E = E_1^∗ and N_1 is an NFA deciding L(E_1), then N is the following NFA:
    [Figure: NFA with initial state i, the states i_N1, ..., f_N1 of N_1, final state f, and four ε-transitions]
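The inductive construction can be sketched as follows. The exact ε-transitions of the original figures are not recoverable from the text, so this sketch assumes the usual Thompson-style wiring, which uses the same number of ε-transitions per case. The tuple-based expression syntax and all function names are assumptions for illustration.

```python
import itertools

_fresh = itertools.count()  # supplies fresh state names


def thompson(expr):
    """Return (initial, final, transitions) for a regular expression.

    expr is a label string, ("cat", e1, e2), ("alt", e1, e2) or ("star", e1).
    transitions is a set of (source, symbol, target); symbol None means ε.
    """
    i, f = next(_fresh), next(_fresh)
    if isinstance(expr, str):
        return i, f, {(i, expr, f)}
    op = expr[0]
    i1, f1, t1 = thompson(expr[1])
    if op == "star":
        return i, f, t1 | {(i, None, i1), (f1, None, f), (i, None, f), (f1, None, i1)}
    i2, f2, t2 = thompson(expr[2])
    if op == "cat":
        return i, f, t1 | t2 | {(i, None, i1), (f1, None, i2), (f2, None, f)}
    if op == "alt":
        return i, f, t1 | t2 | {(i, None, i1), (i, None, i2), (f1, None, f), (f2, None, f)}
    raise ValueError(f"unknown operator: {op!r}")


def eps_closure(states, transitions):
    """All states reachable from `states` using only ε-transitions."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for (src, sym, dst) in transitions:
            if src == q and sym is None and dst not in closure:
                closure.add(dst)
                stack.append(dst)
    return closure


def accepts(nfa, word):
    """Simulate the NFA on a sequence of labels."""
    initial, final, transitions = nfa
    current = eps_closure({initial}, transitions)
    for symbol in word:
        moved = {dst for (src, sym, dst) in transitions if src in current and sym == symbol}
        current = eps_closure(moved, transitions)
    return final in current


nfa = thompson(("cat", "a", ("star", ("alt", "a", "b"))))  # a ◦ (a + b)*
print(accepts(nfa, ["a", "b", "a"]), accepts(nfa, ["b"]))
```

Each case adds two states and at most four ε-transitions, so the NFA stays linear in the size of the expression.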
◮ Use the powerset construction to obtain equivalent (but exponentially large) DFAs D and D′.
◮ Let D̄′ be the DFA obtained from D′ by making all accepting states reject, and vice versa. Then w ∈ L(D̄′) iff w ∉ L(D′).
◮ Construct the product automaton D̂ of D and D̄′, which is polynomially large in the sizes of D and D̄′; then D̂ decides L(E) ∖ L(E′).
◮ E ⊑ E′ iff L(D̂) is empty: if there is a w ∈ L(D̂), then w ∈ L(E) but w ∉ L(E′).
◮ L(D̂) is empty iff no final state is reachable from the initial state.
◮ Reachability on directed graphs can be checked in nondeterministic logarithmic space.
◮ Since the state graph of D̂ is exponentially large, we can decide emptiness in nondeterministic polynomial space, provided that the states and transitions of D̂ are generated on the fly instead of being stored.
◮ Because of Savitch’s Theorem, we can thus decide containment in PSpace.
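The containment test can be sketched as a search over the product automaton whose states are built on the fly. This is an illustration under assumptions: NFAs are ε-free and given as (initial, finals, delta) with delta mapping (state, symbol) to a set of states, and the name `contained` is hypothetical. Unlike the nondeterministic procedure above, this deterministic search stores all visited states, so it does not meet the PSpace bound.

```python
from collections import deque


def successors(states, symbol, delta):
    """One powerset-construction step: all NFA states reachable via `symbol`."""
    return frozenset(t for s in states for t in delta.get((s, symbol), ()))


def contained(nfa1, nfa2, alphabet):
    """Decide whether L(nfa1) is a subset of L(nfa2)."""
    (i1, f1, d1), (i2, f2, d2) = nfa1, nfa2
    start = (frozenset({i1}), frozenset({i2}))
    seen, queue = {start}, deque([start])
    while queue:
        s1, s2 = queue.popleft()
        # Final state of the product: D accepts and the complement of D' accepts.
        if s1 & f1 and not (s2 & f2):
            return False  # some word is in L(nfa1) but not in L(nfa2)
        for symbol in alphabet:
            nxt = (successors(s1, symbol, d1), successors(s2, symbol, d2))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


a_star = (0, {0}, {(0, "a"): {0}})                     # a*
ab_star = (0, {0}, {(0, "a"): {0}, (0, "b"): {0}})     # (a + b)*
print(contained(a_star, ab_star, "ab"), contained(ab_star, a_star, "ab"))
```

Only the pairs of state sets that are actually reachable are ever constructed; the full DFAs D and D′ are never materialized.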
By a binary linear C2RPQ we mean a C2RPQ of the form
∃x_k1, ..., x_km. R_1(x_1, x_2) ∧ R_2(x_2, x_3) ∧ ··· ∧ R_{n−1}(x_{n−1}, x_n)
where each R_i(x_i, x_{i+1}) is an atom or a 2RPQ, and the x_kj are among the variables that occur in the query. Can every binary linear C2RPQ be expressed by a 2RPQ? Explain your answer.
Solution.
◮ Consider, e.g., ∃x. a(x, x) ∧ b(x, x), which cannot be expressed as a 2RPQ.
◮ Note that this is not a linear C2RPQ.
◮ Indeed, most binary linear C2RPQs can be expressed by a 2RPQ:
  ◮ Every atom p(x_i, x_{i+1}) in the query can be viewed as an RPQ with label p.
  ◮ Since every 2RPQ in the query starts at the endpoint of the previous 2RPQ, the conjunctions can be replaced by composition.
  ◮ Thus, ∃x_2, ..., x_{n−1}. (R_1 ◦ R_2 ◦ ··· ◦ R_{n−1})(x_1, x_n) is an equivalent 2RPQ.
◮ But in a 2RPQ, we lose access to x_2, ..., x_{n−1}, so this translation only applies when exactly these variables are existentially quantified.
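The step from conjunction to composition can be checked on a small example. This sketch is only an illustration; relations are sets of pairs and the data is made up. It compares the projection of all satisfying assignments (x_1, ..., x_n) onto (x_1, x_n) with the composition of the relations.

```python
from functools import reduce


def compose(r, s):
    """Relational composition: pairs (x, z) with r(x, y) and s(y, z) for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def chain_assignments(relations):
    """All tuples (x_1, ..., x_n) with R_i(x_i, x_{i+1}) for every i."""
    paths = set(relations[0])
    for rel in relations[1:]:
        paths = {p + (z,) for p in paths for (y, z) in rel if p[-1] == y}
    return paths


relations = [{(1, 2), (1, 3)}, {(2, 4), (3, 4)}, {(4, 5)}]
projected = {(p[0], p[-1]) for p in chain_assignments(relations)}
composed = reduce(compose, relations)
print(projected == composed, sorted(chain_assignments(relations)))
```

Both give the same pairs, but the composition no longer shows that two different assignments to the intermediate variables witness the pair, which is the access that is lost in a 2RPQ.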
[...]
Query(x, z) ← p_a(x, y) ∧ p_b(y, z)
Query(x, z) ← p_a(x, x′) ∧ Query(x′, z′) ∧ p_b(z′, z)
and that can be expressed as a C2RPQ.
Solution.
◮ The query matches paths of the form a^n b^n with n ≥ 1, which do not form a regular language.
◮ We add rules so that all nonempty paths over the labels a and b match, in particular all paths of the form a^n b^m with n + m ≥ 1. These paths form a regular language:
p_(a+b)∗(x, y) ← p_a(x, y)
p_(a+b)∗(x, y) ← p_b(x, y)
p_(a+b)∗(x, y) ← p_(a+b)∗(x, z) ∧ p_a(z, y)
p_(a+b)∗(x, y) ← p_(a+b)∗(x, z) ∧ p_b(z, y)
Query(x, y) ← p_(a+b)∗(x, y)
◮ The resulting program is equivalent to the C2RPQ ((a + b) ◦ (a + b)∗)(x, y), since every derivation uses at least one edge; it differs from (a + b)∗(x, y) only on the empty path.
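Both programs can be evaluated by naive fixpoints to compare their answers. The sketch below is an illustration with made-up data; `pa` and `pb` are sets of pairs for the two edge labels and the function names are assumptions. The original Query rules are omitted from the extended program because they only derive pairs that the new rules already derive.

```python
def original_query(pa, pb):
    """Fixpoint of the two original Query rules (paths a^n b^n, n >= 1)."""
    q = {(x, z) for (x, y) in pa for (y2, z) in pb if y == y2}
    while True:
        derived = {(x, z) for (x, x1) in pa for (z1, z) in pb if (x1, z1) in q}
        if derived <= q:
            return q
        q |= derived


def extended_query(pa, pb):
    """Fixpoint of the added rules: all nonempty paths over a and b."""
    edges = pa | pb
    p = set(edges)
    while True:
        derived = {(x, y) for (x, z) in p for (z2, y) in edges if z == z2}
        if derived <= p:
            return p
        p |= derived


# Hypothetical graph: 0 -a-> 1 -a-> 2 -b-> 3 -b-> 4
pa = {(0, 1), (1, 2)}
pb = {(2, 3), (3, 4)}
original = original_query(pa, pb)
extended = extended_query(pa, pb)
print(sorted(original), original <= extended, (0, 3) in extended)
```

On this graph the original query returns only the balanced pairs, the extended program contains them, and the pair (0, 3) for the unbalanced path a a b is matched only by the extended program.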