Optimization of Regular Path Queries in Large Graphs
Nikolay Yakovets, in collaboration with Parke Godfrey and Jarek Gryz

Optimization of RPQs: scalable and efficient evaluation of regular path queries
Topics: Evaluation, Implementation, RPQs, Optimization, WAVEGUIDE, Plans, Linked Data, Costs, Semantics

Graph Query Languages
‣ Adjacency Query: list all neighbours, find the k-neighbourhood of a node
‣ Pattern Matching Query: find all sub-graphs in a database that are isomorphic to a given query pattern graph
‣ Summarization Query: summarize or operate on query results, e.g. aggregation: avg(), min(), max(), etc.
‣ Reachability/Path Query: a navigational query that deals with paths in a graph; tests whether nodes are reachable in a graph; paths of fixed or arbitrary lengths

SPARQL - Query Language
SPARQL Protocol and RDF Query Language (SPARQL): adjacency, pattern matching, summarization
‣ declarative, based on pattern matching
‣ graph patterns describe subgraphs of the queried RDF graphs
‣ those subgraphs that match a description yield a result
Graph: ny:nikolay has foaf:name "Nikolay Yakovets" and is foaf:based_near dbpedia:Oakville, whose dp:population is "182520"
Query (a graph pattern with the variable ?pop): SELECT ?pop WHERE { :Oakville :population ?pop }

SPARQL Property Paths
‣ Part of the SPARQL 1.1 W3C recommendation
‣ Allow regular expressions to describe paths between nodes:
    p1 | p2   disjunction
    p1 / p2   concatenation
    p?        zero or one
    ^p        inverted
    !iri      negated
    p+        one or more
    p*        Kleene star
‣ Useful in many application domains: social networks, biological, encyclopedic
‣ Convenient declarative mechanism to answer queries without prior knowledge of underlying data paths

SPARQL Property Paths
‣ Example: DBPedia snippet, part of a LOD dataset
‣ Two datasets, English and Japanese, interlinked with OWL terms
G: nodes en:Gundam, en:Daiba, en:Tokyo, en:Japan, jp:ガンダム, jp:お台場, jp:東京, jp:関東地方, jp:本州, jp:日本, connected by :isLocatedIn and :sameAs edges
Q: select ?place { en:Gundam (:sameAs*/:isLocatedIn)+/:sameAs* ?place . }
‣ Query: Where is the Gundam statue located?
‣ Solution: need to resolve equivalent data entities (:sameAs) and traverse the spatial hierarchy (:isLocatedIn) to fully utilize the richer spatial information in the Japanese dataset

Formal Evaluation
‣ Property Paths in SPARQL are essentially Regular Path Queries (RPQs)
‣ RPQs were well studied before the advent of RDF and SPARQL
‣ Formal def.: Q = (x, L(r), y), where x and y are free variables and L(r) is a regular language
‣ Semantics of evaluation: [[Q]]_G, an evaluation of Q over graph database G, is a collection of pairs (s, t) such that ∃ a path p in G between s and t such that p conforms to regex r, i.e. the path-induced string λ(p) ∈ L(r)
‣ The collection is a bag (allow duplicates), aka solution counting (∀), or a set (discard duplicates), aka existential semantics (∃)
‣ The path is simple or arbitrary

Paths in SPARQL
‣ regular ∀, simple ∀: counting procedures are #P-complete on general graphs (Arenas et al., Losemann et al., 2013); tractable on DAGs, or with restricted compatible regex
‣ simple ∃: evaluation of simple paths is NP-complete on general graphs (Mendelzon et al., 1987); tractable on DAGs, or with restricted compatible regex
‣ regular ∃: SPARQL (W3C proposal for an RDF query language) supports RPQs through SPARQL 1.1 property paths

RPQ Evaluation
[[Q]]_G: an evaluation of Q over graph database G, considering existential semantics on regular paths
‣ FA-based: use finite state machines in evaluation (Mendelzon et al., 1987)
‣ α-RA-based: use relational algebra extended with the alpha-operator, which computes transitive closure (Losemann et al., 2013)

FA-based Evaluation
Q: select ?place { en:Gundam (:sameAs*/:isLocatedIn)+/:sameAs* ?place . }
1. From a parse tree, construct a query ε-NFA.
2. Minimize the query automaton, if necessary.
3. Construct a product P of the query and graph automata.
4. Check P for reachable accepting states to produce an answer to the query.
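
The product construction in steps 3 and 4 can be sketched as a breadth-first search over (graph node, automaton state) pairs. This is only an illustrative sketch, not the presenter's implementation: the toy graph, the hand-built ε-free NFA for (a/b)+, and the name `eval_rpq` are all assumptions introduced here, and the parsing and minimization of steps 1 and 2 are skipped.

```python
from collections import deque

def eval_rpq(edges, nfa, start_state, accepting, source):
    """Existential RPQ evaluation from one source node (hypothetical helper).

    edges: iterable of (node, label, node) triples.
    nfa:   dict mapping (state, label) -> set of next states (no ε-moves).
    """
    # Adjacency list: node -> [(label, target), ...]
    adj = {}
    for s, label, t in edges:
        adj.setdefault(s, []).append((label, t))

    # Explore the product of graph and automaton lazily, pair by pair.
    seen = {(source, start_state)}
    queue = deque(seen)
    while queue:
        node, state = queue.popleft()
        for label, target in adj.get(node, []):
            for next_state in nfa.get((state, label), ()):
                pair = (target, next_state)
                if pair not in seen:
                    seen.add(pair)
                    queue.append(pair)

    # A node is an answer if it was reached in an accepting state.
    return {node for node, state in seen if state in accepting}

# Toy data (assumed): a cycle n1 -a-> n2 -b-> n3 -a-> n4 -b-> n1.
edges = [("n1", "a", "n2"), ("n2", "b", "n3"),
         ("n3", "a", "n4"), ("n4", "b", "n1")]
# Hand-built NFA for (a/b)+: 0 -a-> 1 -b-> 2 -a-> 1, accepting state 2.
nfa = {(0, "a"): {1}, (1, "b"): {2}, (2, "a"): {1}}

print(eval_rpq(edges, nfa, 0, {2}, "n1"))
```

Tracking visited (node, state) pairs rather than paths is what keeps the search finite under existential semantics on arbitrary paths, even when the graph has cycles.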

α-RA-based Evaluation
Q: select ?place { en:Gundam (:sameAs*/:isLocatedIn)+/:sameAs* ?place . }
‣ Have SPRJU-RA extended with α
‣ α computes the least fixpoint: the transitive closure of a given relation
1. From a parse tree, construct an RA tree: Q parse tree → Q RA tree → favourite RDBMS
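
As a rough illustration of what the α (alpha) operator computes, the sketch below treats a binary relation as a Python set of (source, target) pairs and iterates to the least fixpoint. The names `compose` and `alpha` and the toy relations are assumptions made for this example; an RDBMS would express the same thing with joins and a recursive query.

```python
def compose(r, s):
    """Join r and s on r.target = s.source, projected to (source, target)."""
    by_source = {}
    for x, y in s:
        by_source.setdefault(x, set()).add(y)
    return {(x, z) for x, y in r for z in by_source.get(y, ())}

def alpha(rel):
    """Transitive closure of rel, computed as a least fixpoint."""
    closure = set(rel)
    delta = set(rel)          # tuples discovered in the previous round
    while delta:
        # Only extend the newly found tuples, then keep what is new.
        delta = compose(delta, rel) - closure
        closure |= delta
    return closure

# Toy relations (assumed): a-edges and b-edges over nodes 1..4.
a = {(1, 2), (3, 4)}
b = {(2, 3), (4, 1)}

# RA tree for (a/b)+ : alpha applied to the join of a and b.
ab = compose(a, b)
print(sorted(alpha(ab)))
```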

Comparing Approaches
Th.: FA and α-RA are incomparable plan spaces
Pf.: translation into Datalog; examine the induced sequence of joins
e.g. (?x, (a/b)+, ?y):
    P_FA  = ((((a ⋈ b) ⋈ a) ⋈ b) ⋈ a) ...
    P_αRA = (a ⋈ b) ⋈ (a ⋈ b) ⋈ (a ⋈ b) ...
P_αRA ∉ FA and P_FA ∉ α-RA, hence α-RA ⊈ FA and FA ⊈ α-RA
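
The two induced join orders for (?x, (a/b)+, ?y) can be put side by side in a small sketch. This is an assumption-laden illustration rather than the Datalog translation used in the proof: `fa_style` extends paths one edge at a time (left-deep, alternating a and b), while `alpha_style` first materializes (a ⋈ b) and then joins that relation with itself. All function names and the toy data are hypothetical.

```python
def compose(r, s):
    """Join r and s on r.target = s.source, projected to (source, target)."""
    by_source = {}
    for x, y in s:
        by_source.setdefault(x, set()).add(y)
    return {(x, z) for x, y in r for z in by_source.get(y, ())}

def fa_style(a, b):
    """((((a ⋈ b) ⋈ a) ⋈ b) ⋈ a)...: append one base relation per step."""
    after_a = set(a)      # pairs whose path spells (ab)*a
    after_b = set()       # pairs whose path spells (ab)+, i.e. the answers
    delta_a = set(a)
    while delta_a:
        delta_b = compose(delta_a, b) - after_b
        after_b |= delta_b
        delta_a = compose(delta_b, a) - after_a
        after_a |= delta_a
    return after_b

def alpha_style(a, b):
    """(a ⋈ b) ⋈ (a ⋈ b) ⋈ ...: join the materialized ab relation repeatedly."""
    ab = compose(a, b)
    closure, delta = set(ab), set(ab)
    while delta:
        delta = compose(delta, ab) - closure
        closure |= delta
    return closure

# Toy relations (assumed): a-edges and b-edges over nodes 1..4.
a = {(1, 2), (3, 4)}
b = {(2, 3), (4, 1)}
print(fa_style(a, b) == alpha_style(a, b))
```

Both functions return the same answer set; what differs is which intermediate relations get built along the way, which is the property the incomparability argument examines.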

WAVEGUIDE
Goal: need to consider both FA and α-RA plan spaces
‣ Search is driven by a waveplan, which guides a number of wavefronts that iteratively explore the graph (guided iterative graph search)
[Diagram: waveplan P_ab+ with wavefronts W_ab (transitions a·, ·b) and W_ab+ (transitions W_ab·)]

search wavefronts
a wavefront W_l
• an expanding search unit
• guided by a wavefront automaton W_l = (l, S, q0, Q, δ, E, L, F): label l, seed S, starting state q0, set of states Q, transition function δ, edge labels E, wavefront labels L, accepting states F
• labeled with the regex it evaluates
• seeded with S
a transition function δ
• δ : Q × ((E ∪ L) × {·, ·} ∪ {ε}) → 2^Q, where the two markers in {·, ·} stand for appending and prepending
• appending and prepending transitions
• transitions over graphs and views: graph edges, or wavefront labels (pipeline)
a seed S
• edge incoming into the starting state q0 in W_l
• defined with an RPQ, a wavefront, or by construction
• can be universal: any node in a graph

a waveplan P_Q
• produces an answer to a given query Q
• an ordered set of wavefront automata: a set of wavefronts with an ordering <_P
• the order defines which labels can be used in the seed and in transitions over a view
• higher wavefronts can use lower wavefronts as their labels and seeds, but not vice versa
• the query is answered by the highest wavefront
e.g., query (?x, (a/b)+, ?y), waveplan P_ab+:
• W_ab produces an answer for the (a/b) regex (transitions a·, ·b)
• W_ab+ uses W_ab as a view to compute (a/b)+ (transitions W_ab·)

WAVEGUIDE - iterative search
‣ Exploration procedure based on semi-naive evaluation
‣ Intermediate search results are kept in the search cache; the cache keeps track of end-nodes and corresponding states in a plan
• seed specifies node pairs to start from
• loop while new tuples are discovered:
    • crank advances simultaneously in a graph and automaton
    • reduce prunes the delta, handles unbounded computation
    • cache materializes according to the specified strategy
• extract produces answers
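
The seed / crank / reduce / cache / extract loop can be approximated for a single wavefront as follows. This is a simplified sketch under stated assumptions, not the WAVEGUIDE implementation: it supports appending transitions over graph edges only (no prepending, no views), the cache stores every tuple, and `run_wavefront`, the tuple layout, and the toy data are all hypothetical.

```python
def run_wavefront(edges, transitions, q0, accepting, seed_nodes):
    """Semi-naive search for one wavefront with append-only transitions.

    Tuples are (start_node, end_node, state); transitions maps
    (state, edge_label) -> set of next states.
    """
    adj = {}
    for s, label, t in edges:
        adj.setdefault(s, []).append((label, t))

    # seed: every seed node starts at itself in the starting state
    delta = {(n, n, q0) for n in seed_nodes}
    cache = set(delta)

    while delta:  # loop while new tuples are discovered
        # crank: advance one step in the graph and the automaton together
        cranked = {(start, target, q_next)
                   for start, end, q in delta
                   for label, target in adj.get(end, ())
                   for q_next in transitions.get((q, label), ())}
        # reduce: drop tuples already seen, which also bounds the search
        delta = cranked - cache
        # cache: here the strategy is simply to materialize everything
        cache |= delta

    # extract: node pairs that ended in an accepting state
    return {(start, end) for start, end, q in cache if q in accepting}

# Toy data (assumed): a cycle n1 -a-> n2 -b-> n3 -a-> n4 -b-> n1.
edges = [("n1", "a", "n2"), ("n2", "b", "n3"),
         ("n3", "a", "n4"), ("n4", "b", "n1")]
transitions = {(0, "a"): {1}, (1, "b"): {2}, (2, "a"): {1}}  # (a/b)+
nodes = {"n1", "n2", "n3", "n4"}  # universal seed: any node in the graph

answers = run_wavefront(edges, transitions, 0, {2}, nodes)
print(sorted(answers))
```

Only the delta from the previous round is cranked, which is the semi-naive part: tuples already in the cache are never re-expanded.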

Challenges
‣ efficient? optimal? vs. other techniques?
‣ plan space: enumerator size?
‣ cost model analysis?
‣ optimizations enabled by WAVEGUIDE?

WAVEGUIDE Plan Space
• WP subsumes both FA and α-RA
• adds exclusive new plans: α-RA ∪ FA ⊂ WP
• e.g., (?x, (a/b/c)+, ?y)

[Diagram: waveplan P_(abc)+ with a single wavefront W_(abc)+ (transitions a·, b·, c·)]

[Diagram: waveplan P_(abc)+ with wavefronts W_abc (transitions a·, b·, c·) and W_(abc)+ (transitions W_abc·), shown next to an RA tree applying α over joins on o = s of the selections σ_p=a, σ_p=b, σ_p=c over T]

[Diagram: waveplan P_(abc)+ with wavefronts W_bc (transitions ·b, c·) and W_(abc)+ (transitions a·, W_bc·)]
