Path Logics for Querying Graphs: combining expressiveness and efficiency


Diego Figueira (CNRS, LaBRI, France). LFDS, 17/11/2015, UCL, London.


SLIDE 1

Diego Figueira

CNRS, LaBRI France

Path Logics for Querying Graphs

combining expressiveness and efficiency

LFDS - 17/11/2015 UCL, London

SLIDE 2

Graph databases

[Figure: an edge-labelled directed graph with edge labels a, b, c]

  • Semantic web / RDF / social networks / ...
  • "Entities + Relations"
  • Modelled as: edge-labelled directed graphs
  • The notion of path is of central importance

SLIDE 3

Graph databases

[Figure: the edge-labelled graph with a path π1 highlighted]

RPQ

π1: (ab)* c

Evaluation: P (combined complexity), NL (data complexity)
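A minimal sketch of how an RPQ such as (ab)* c can be evaluated in polynomial time: explore the product of the graph with an automaton for the expression. The toy graph `EDGES`, the hand-built DFA and the function name `rpq_targets` are assumptions made for illustration; they are not taken from the slides.

```python
from collections import deque

# Hypothetical toy graph: (source, label, target) triples.
EDGES = [(0, "a", 1), (1, "b", 0), (0, "c", 2), (1, "c", 3), (2, "a", 0)]

# Hand-built DFA for (ab)*c: state 0 is initial, state 2 is accepting.
DFA = {(0, "a"): 1, (1, "b"): 0, (0, "c"): 2}
INITIAL, ACCEPTING = 0, {2}


def rpq_targets(edges, source):
    """Nodes v such that some path source -> v is labelled by a word in (ab)*c."""
    out = {}
    for u, label, v in edges:
        out.setdefault(u, []).append((label, v))
    seen = {(source, INITIAL)}          # visited (node, automaton state) pairs
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in ACCEPTING:
            answers.add(node)
        for label, nxt in out.get(node, []):
            nxt_state = DFA.get((state, label))
            if nxt_state is not None and (nxt, nxt_state) not in seen:
                seen.add((nxt, nxt_state))
                queue.append((nxt, nxt_state))
    return answers
```

The search visits at most |nodes| × |states| pairs, which is where the polynomial bound comes from.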

SLIDE 4

Graph databases

[Figure: the edge-labelled graph with paths π1, π2, π3 highlighted]

CRPQ

  • π1: (ab)* c
  • π2: (ac)*
  • π3: a c*

Evaluation: NP (combined), NL (data); P for acyclic queries

Unions, inverse

SLIDE 5

Graph databases

[Figure: the edge-labelled graph with paths π1, π2, π3 highlighted]

CRPQ

  • π1: (ab)* c
  • π2: (ac)*
  • π3: a c*

What about… “All the pairs (u,v) that can reach some node z in the same number of steps”
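To make the question concrete, here is a small sketch that computes those pairs directly, outside of any query language. It searches backwards from the pairs (z, z), stepping both components at once so that the two paths always have equal length. The function name and the input encoding are assumptions made for illustration, and paths of length 0 are counted.

```python
from collections import deque


def same_length_pairs(edges, nodes):
    """Pairs (u, v) such that u and v reach a common node z via paths of equal length."""
    preds = {n: set() for n in nodes}
    for u, _label, v in edges:
        preds[v].add(u)
    seen = {(z, z) for z in nodes}      # both paths have length 0
    queue = deque(seen)
    while queue:
        x, y = queue.popleft()
        # Extend both paths by one edge at the same time.
        for px in preds[x]:
            for py in preds[y]:
                if (px, py) not in seen:
                    seen.add((px, py))
                    queue.append((px, py))
    return seen
```

The lockstep move on both components is exactly the "equal length" comparison between two paths that a plain CRPQ has no way to state.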

SLIDE 6
Graph databases

What about testing for relations on the paths?

  • |πi| = |πj|
  • πi is a prefix of πj
  • πi is a subsequence of πj
  • πi is a factor of πj
  • πi = πj projected onto A

CRPQ

  • π1: (ab)* c
  • π2: (ac)*
  • π3: a c*

CRPQ(S)

R(π1, π2), R ∈ S

Motivations from: entity resolution, semantic associations, crime detection, ...
SLIDE 7

Graph databases

What about testing for relations on the paths?

CRPQ

  • π1: (ab)* c
  • π2: (ac)*
  • π3: a c*

CRPQ(S)

R(π1, π2), R ∈ S

CRPQ(S) = CRPQ + tests R(πi1,…,πin), R ∈ S

S: class of well-behaved word relations…

SLIDE 8

Word relations

  • recognizable: RECk
  • regular: REGk
  • rational: RATk

RECk ⊆ REGk ⊆ RATk

SLIDE 9

Binary relations: R ⊆ 𝔹*×𝔹*

  • recognizable: REC2
  • regular: REG2 (prefix, equal, equal length, ...)
  • rational: RAT2 (suffix, infix, projection, subsequence, ...)

[Figure: pairs of words over {a, b, c, d} illustrating the three classes]
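A short sketch of why prefix is regular while subsequence is only rational. The prefix test below is a two-state automaton that reads both words in lockstep, padding the shorter one, which is the standard synchronous model behind REG2. The subsequence test has to advance its two inputs at different speeds. The function names and the use of `None` as padding symbol are assumptions made for illustration.

```python
from itertools import zip_longest

PAD = None  # padding symbol appended to the shorter word


def is_prefix_sync(u, v):
    """Two-state synchronous automaton for 'u is a prefix of v'."""
    state = "same"                      # u and v have agreed so far
    for a, b in zip_longest(u, v, fillvalue=PAD):
        if state == "same" and a == b:
            continue                    # both tapes read the same letter
        if a is PAD:
            state = "u_done"            # u has ended, the rest of v is free
            continue
        return False                    # mismatch, or v ended before u
    return True


def is_subsequence(u, v):
    """Greedy scan: the position in v may run ahead of the position in u."""
    remaining = iter(v)
    return all(letter in remaining for letter in u)
```

In `is_subsequence` the gap between the two reading positions is unbounded, which is what a lockstep automaton cannot simulate.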

SLIDE 10

Graph databases

CRPQ(S) = CRPQ + tests R(πi1,…,πin), R ∈ S

  • CRPQ(REC): NP/NL complexity
  • CRPQ(REG): PSPACE/NL complexity
  • CRPQ(RAT): undecidable

Related to the Intersection Problem: given relations R1,…,Rn, is R1∩···∩Rn ≠ ∅?

Can this be extended?

SLIDE 11

Intersection problem (R ⋂ S = ∅ ?)

𝓡, 𝓢: classes of binary relations

  • input: R ∈ 𝓡, S ∈ 𝓢
  • output: R ⋂ S = ∅ ?

(REG ⋂ RAT = ∅ ?) is already undecidable (PCP)

...but what about real world relations? it has been studied...

like suffix...? subword...? subsequence...?

[Figure: PCP encoding with pairs (u_{i1} … u_{in}, v_{i1} … v_{in}); example words u, v over {a, b, c} compared by the subsequence relation]

SLIDE 12

Can we extend CRPQ beyond REG relations?

Language                    Data complexity   Combined complexity
CRPQ(REGk)                  NL                PSPACE
CRPQ(RATk)                  Undecidable       Undecidable
CRPQ(REGk + suffix)         Undecidable       Undecidable
CRPQ(REGk + factor)         Undecidable       Undecidable
CRPQ(REGk + subsequence)    non-elementary    non-PR
CRPQ(suffix)                NL                PSPACE
CRPQ(factor)                PSPACE            PSPACE
CRPQ(subsequence)           PSPACE            NEXPTIME

∀ k>1

SLIDE 13

Can we extend CRPQ beyond REG relations?

Proposed alternative: approximate RAT through REG + counters

How?

  1) take an NFA
  2) add counters
  3) use it to read k-tuples of words

SLIDE 14

2 tapes over 𝔹 ≈ 1 tape over 𝔹×{1,2}

[[·]] : (𝔹×{1,2})* → 𝔹*×𝔹*

[Figure: a word over 𝔹×{1,2} whose letters tagged 1 spell ababb and whose letters tagged 2 spell baaaba, so its image under [[·]] is (ababb, baaaba); its projection onto {1,2} is the control word]

Eg: [[ ((a,1)(a,2)|(b,1)(b,2))* ]] = equality

Rel(L) = { [[S]] | S ∈ REG(𝔹×{1,2}) is L-controlled }, for L ⊆ {1,2}*

Eg: (1|2)*-controlled, (12)*-controlled

SLIDE 15

Eg:

  • Rel((12)*) = length-preserving REG2
  • Rel((12)*(1*|2*)) = REG2
  • Rel((1*|2*)(12)*) = REG2^rev
  • Rel(1*2*) = REC2
  • Rel((1|2)*) = RAT2
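The encoding of two tapes as one tagged word can be sketched in a few lines. Below, a word over 𝔹×{1,2} is a list of (letter, tape) pairs, `decode` plays the role of [[·]], and the control word is the projection onto the tape indices. The function names and the use of Python regular expressions for the control language L are assumptions made for illustration.

```python
import re


def decode(word):
    """[[word]]: split a word over B x {1,2} into the contents of tape 1 and tape 2."""
    tape1 = "".join(letter for letter, tape in word if tape == 1)
    tape2 = "".join(letter for letter, tape in word if tape == 2)
    return tape1, tape2


def control_word(word):
    """Projection of the word onto the tape indices {1,2}."""
    return "".join(str(tape) for _letter, tape in word)


def is_controlled(word, control_regex):
    """Does the control word belong to the control language L?"""
    return re.fullmatch(control_regex, control_word(word)) is not None


# A word of ((a,1)(a,2)|(b,1)(b,2))*, the language that denotes equality.
w = [("a", 1), ("a", 2), ("b", 1), ("b", 2), ("b", 1), ("b", 2)]
```

Here `decode(w)` yields two equal words, and `w` is (12)*-controlled but not 1*2*-controlled: the control language fixes how the two tapes may be interleaved.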

SLIDE 16

Idea

Approximate with regular relations that can count patterns

R = { (u,v) | # of times (ab)*c appears in u = 2 · (# of times c*b appears in v) }

More than just counting letters

SLIDE 17

Idea

Instead of regular languages… …use automata with counting

Rel(L) = { [[S]] | S ∈ REG(𝔹×{1,2}) is L-controlled }

SLIDE 18

Evaluation of CRPQ with counting is feasible

  • PSPACE in combined complexity
  • NL in data complexity

SLIDE 19

Parikh Automata [Klaedtke & Rueß]

NFA with n counters c1,…,cn and a semilinear set S ⊆ ℕⁿ: (𝔹, Q, q0, δ, F, n, S), where n is the dimension

Transitions of δ: (q, a, (x1,…,xn), q') ∈ Q×𝔹×ℕⁿ×Q (counters can only be incremented)

Run:

  • Initial configuration: (q0,(0,…,0)) ∈ Q×ℕⁿ
  • Step: (q,x) → (p, x+y) if (q,a,y,p) ∈ δ
  • Acceptance: last configuration in F×S

❉ Many equivalent definitions (eg. reversal-bounded counter systems)

SLIDE 20

Parikh Automata

Eg:

L_{ba=ca} = { w | number of a’s after a b = number of a’s after a c }

Example word: a b a a c a b a c a c a b a (c1++ on each of the three occurrences of “ba”, c2++ on each of the three occurrences of “ca”)

Parikh Automaton A = (𝔹, Q, q0, δ, F, 2, {(k,k) | k ∈ℕ})

  • dimension 2 (2 counters)
  • increment c1 whenever we see “ba”
  • increment c2 whenever we see “ca”
  • F=Q
  • The semilinear set ensures that a word is accepted only if the two counters are equal
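The automaton described above can be simulated directly. This is a minimal sketch, assuming the alphabet {a, b, c}, states that remember the last letter read, and the semilinear set given as a Python predicate; `pa_accepts`, `DELTA` and `accepts_ba_eq_ca` are names invented for the example.

```python
def pa_accepts(word, delta, q0, final, in_semilinear_set, dim):
    """Simulate a Parikh automaton: counters only grow, S is checked at the end."""
    configs = {(q0, (0,) * dim)}
    for letter in word:
        configs = {
            (p, tuple(x + y for x, y in zip(counters, inc)))
            for q, counters in configs
            for inc, p in delta.get((q, letter), [])
        }
    return any(q in final and in_semilinear_set(c) for q, c in configs)


# States remember the last letter read ("start" before any letter).
ALPHABET = "abc"
STATES = ["start", "a", "b", "c"]
DELTA = {}
for q in STATES:
    for letter in ALPHABET:
        if letter == "a" and q == "b":
            inc = (1, 0)        # saw "ba": c1++
        elif letter == "a" and q == "c":
            inc = (0, 1)        # saw "ca": c2++
        else:
            inc = (0, 0)
        DELTA[(q, letter)] = [(inc, letter)]


def accepts_ba_eq_ca(word):
    # F = Q, dimension 2, S = {(k, k) | k in N}
    return pa_accepts(word, DELTA, "start", set(STATES), lambda c: c[0] == c[1], 2)
```

On the slide's example word `abaacabacacaba` both counters end at 3, so the word is accepted.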
SLIDE 21

Parikh Automata

  • Decidable: non-emptiness, membership
  • Closed under: intersection, union, (inverse) homomorphisms, concatenation (not complementation/iteration)

SLIDE 22

PA relations

RelPA(L) = { [[S]] | S ∈ PA(𝔹×{1,2}) is L-controlled }

SLIDE 23

Eg:

  • REGPA2 = RelPA((12)*(1*|2*))
  • REGPA2^rev = RelPA((1*|2*)(12)*)
  • . . .
  • RATPA2 = RelPA((1|2)*)

SLIDE 24

Word relations

  • recognizable: RECk
  • regular: REGk
  • rational: RATk
  • Parikh-regular: REGPAk

SLIDE 25

Theorem: Evaluation of CRPQ(REGPA) is

  • PSPACE in combined complexity
  • NL in data complexity

Theorem: Evaluation of CRPQPA (no relations) is

  • NP in combined complexity
  • NL in data complexity

Proof ingredients:

  • Intersection problem for Parikh Automata: given PA’s A1,…,An, is L(A1) ∩ · · · ∩ L(An) ≠ ∅ ? This problem is PSPACE-complete.
  • Intersection closure for REGPA: for all R,S ∈ REGPA, R∩S ∈ REGPA; it suffices to intersect the automata representing them.
  • Closure under product of REGPA
SLIDE 26

Approximating rational relations

u and v are k-similar (u ~k v) iff for all w with |w| ≤ k, they have the same number of appearances of w (counted as a factor, or alternatively as a subsequence)

Given R ∈ RAT, Rk = { (u,v) | u ~k u', v ~k v', (u',v') ∈ R } ∈ REGPA
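A small sketch of the factor variant of k-similarity: count every factor of length at most k in both words and compare the counts. The function names are assumptions made for illustration, and only non-empty factors are counted here.

```python
from collections import Counter


def factor_counts(word, k):
    """Number of occurrences of every non-empty factor of length at most k."""
    counts = Counter()
    for length in range(1, k + 1):
        for i in range(len(word) - length + 1):
            counts[word[i:i + length]] += 1
    return counts


def k_similar(u, v, k):
    """Factor variant of u ~k v."""
    return factor_counts(u, k) == factor_counts(v, k)
```

For instance, `aabab` and `abaab` are 2-similar but not 3-similar, so the approximation Rk gets finer as k grows.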

SLIDE 27

Alternative: Syntactic restrictions

Gaifman multi-graph of path variables

[Figure: a multi-graph over the path variables π1, …, π7]

E.g. (acyclic)

  • π1: (ab)* c
  • π2: (ac)*
  • π3: a c*
  • R(π1,π3), S(π3,π2)
  • Gaifman multi-graph: π1, π3, π2 form a path

E.g. (cyclic)

  • π1: (ab)* c
  • π2: (ac)*
  • π3: a c*
  • R(π1,π3), S(π3,π2), R(π3,π2)
  • Gaifman multi-graph: two parallel edges between π3 and π2

Theorem: Evaluation of acyclic-CRPQ(RATPA) is

  • PSPACE in combined complexity
  • NL in data complexity
  • If also fixed join size: NP combined complexity
  • If also fixed PA dimension and unary representation: PTIME combined complexity

Maximum cardinality of connected component
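The acyclicity condition can be checked with a union-find pass over the relation atoms. This is a sketch under the assumption that all atoms are binary and are given as (relation name, path variable, path variable) triples; the function name `is_acyclic` is invented for the example. Because the graph is a multi-graph, two atoms over the same pair of variables already form a cycle.

```python
def is_acyclic(path_vars, relation_atoms):
    """Union-find test: the multi-graph is acyclic iff no edge closes a cycle."""
    parent = {p: p for p in path_vars}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for _name, x, y in relation_atoms:
        rx, ry = find(x), find(y)
        if rx == ry:
            return False    # self-loop, parallel edge, or longer cycle
        parent[rx] = ry
    return True
```

On the two examples above it returns True for the atoms R(π1,π3), S(π3,π2) and False once R(π3,π2) is added.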

SLIDE 28

Conclusion

Avoid the curse of rational relations:

  • Approximating by regular relations with counting
  • Or staying away from cycles in path relations

Counting does not increase complexity

Thank you