The ingraph project and incremental evaluation of Cypher queries
Gábor Szárnyas, József Marton
Incremental Queries
Live railway model
[Figure: railway graph with segments a to g, a switch (div) and trains 1 and 2, connected by NEXT, STRAIGHT, TOP and ON edges. Two situations are highlighted on the model: proximity detection and trailing the switch.]
Proximity detection
[Figure: train t1 is ON segment seg1, train t2 is ON segment seg2, and seg2 is reachable from seg1 via NEXT: 1..2, i.e. the trains are ≤ 2 segments apart.]
MATCH (t1:Train)-[:ON]->(seg1:Segment)
  -[:NEXT*1..2]->(seg2:Segment)
  <-[:ON]-(t2:Train)
RETURN t1, t2, seg1, seg2
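A minimal sketch of what the proximity pattern matches, written as a plain (non-incremental) search in Python. The track layout, the train positions and the helper names `reachable` and `proximity` are assumptions made for illustration; they are not taken from the slides or from ingraph. Reachable segments are collected as a set, so the sketch reports each pair once rather than once per path as Cypher would.

```python
# Hypothetical track layout: a -> b -> c -> d via NEXT edges.
next_edges = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
# Hypothetical ON edges: train number -> segment it occupies.
on = {1: "a", 2: "c"}


def reachable(seg, min_hops, max_hops):
    """Segments reachable from seg via min_hops..max_hops NEXT edges."""
    frontier, found = {seg}, set()
    for hop in range(1, max_hops + 1):
        frontier = {nxt for s in frontier for nxt in next_edges[s]}
        if hop >= min_hops:
            found |= frontier
    return found


def proximity():
    """Rows (t1, t2, seg1, seg2) matching the NEXT*1..2 pattern."""
    return sorted(
        (t1, t2, seg1, seg2)
        for t1, seg1 in on.items()
        for seg2 in reachable(seg1, 1, 2)
        for t2, seg in on.items()
        if seg == seg2
    )


matches = proximity()
```

On this layout only train 1 has another train within two segments ahead of it.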
Trailing the switch
[Figure: train t is ON segment seg; switch div has a STRAIGHT edge to seg while it is set to the diverging position.]
MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw
Evaluate continuously
Incremental queries
- Register a set of standing queries
- Continuously evaluate queries on changes
- The Rete algorithm (1974)
- Originally for rule-based expert systems
- Indexes the graph and caches interim query results
Ujhelyi, Z. et al.: EMF-IncQuery: An integrated development environment for live model queries. Science of Computer Programming (SCP), 2015. http://www.sciencedirect.com/science/article/pii/S0167642314000082
[Figure: Rete network for “Trailing the switch”. Two input nodes index the STRAIGHT and ON edges; their tuples are joined (⋈) on the segment, filtered by σ sw.position = 'diverging' and projected by π t.number, sw.]
[Animation, step by step: the ON tuples (a, 1) and (e, 2) and the STRAIGHT tuple (e, div) enter the network. The join emits (e, div, 2), which passes the selection, and the projection yields the result (div, 2). When train 2 moves from segment e to segment d, the ON tuple (e, 2) is replaced by (d, 2), and both the join tuple and the result (div, 2) are retracted.]
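The network for “Trailing the switch” can be sketched as a tiny Rete-style dataflow in Python. This is an illustration under stated assumptions, not ingraph's implementation: the class names, the positional tuple layouts and the choice to carry the switch position inside the STRAIGHT tuple are all made up for the example. Each update is a tuple plus a sign (+1 insert, -1 delete) that flows from the join through the selection and projection to the result set.

```python
from collections import Counter, defaultdict


class Production:
    """Terminal node: keeps the current result set as a multiset."""
    def __init__(self):
        self.results = Counter()

    def update(self, tup, sign):
        self.results[tup] += sign
        if self.results[tup] == 0:
            del self.results[tup]


class Projection:
    def __init__(self, columns, child):
        self.columns, self.child = columns, child

    def update(self, tup, sign):
        self.child.update(tuple(tup[i] for i in self.columns), sign)


class Selection:
    def __init__(self, predicate, child):
        self.predicate, self.child = predicate, child

    def update(self, tup, sign):
        if self.predicate(tup):
            self.child.update(tup, sign)


class Join:
    """Joins on column 0 (the segment) and caches both inputs."""
    def __init__(self, child):
        self.memory = {"left": defaultdict(set), "right": defaultdict(set)}
        self.child = child

    def update(self, side, tup, sign):
        key = tup[0]
        own = self.memory[side][key]
        if sign > 0:
            own.add(tup)
        else:
            own.discard(tup)
        other_side = "right" if side == "left" else "left"
        for other in self.memory[other_side][key]:
            left, right = (tup, other) if side == "left" else (other, tup)
            self.child.update(left + right[1:], sign)


# Joined layout: (seg, sw, position, train_number)
production = Production()
project = Projection((3, 1), production)                 # t.number, sw
select = Selection(lambda t: t[2] == "diverging", project)
join = Join(select)

# Initial load: left = STRAIGHT (seg, sw, position), right = ON (seg, train)
join.update("right", ("a", 1), +1)
join.update("right", ("e", 2), +1)
join.update("left", ("e", "div", "diverging"), +1)
after_load = dict(production.results)

# Train 2 moves from segment e to segment d.
join.update("right", ("e", 2), -1)
join.update("right", ("d", 2), +1)
after_move = dict(production.results)
```

Only the tuples touched by the change are re-joined; the cached memories make it unnecessary to re-evaluate the whole query.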
ingraph
- PoC query engine for openCypher
- Based on the Rete algorithm
- Goals:
- Provide incremental query evaluation
- Cover standard openCypher constructs
- Run in parallel and in a distributed setting to scale
Szárnyas, G. et al. IncQuery-D: A distributed incremental model query framework in the cloud. MODELS, 2014, https://link.springer.com/chapter/10.1007/978-3-319-11653-2_40
Architecture & Building Blocks
MATCH (t:Train)-[:ON]->(seg:Segment) <-[:STRAIGHT]-(sw:Switch) WHERE sw.position = 'diverging' RETURN t.number, sw
- Query parser: openCypher query to query syntax tree
- Relational algebra builder: query syntax tree to Relational Graph Algebra
- Transformer and optimizer: Relational Graph Algebra to Rete network model
- Query deployer: Rete network model to Rete network
Operators of Relational Graph Algebra
- Basic relational algebra
- projection, selection, join, left outer join, antijoin, union
- Common extensions
- aggregation (γ), duplicate-elimination (δ), sort (τ), top (λ)
- Specific extensions
- get-vertices (○)
- expand-out (↑), expand-in (↓), expand-both (↕)
- all-different (≡)
- unwind (ω)
Szárnyas, G., Marton, J. and Varró, D.: Formalising openCypher Graph Queries in Relational Algebra. https://arxiv.org/abs/1705.02844
Incremental query evaluation
- RGA defines a plan for a search-based engine
- Some operators cannot be maintained incrementally
- expand-out, expand-in, …
- use edge indexers and joins instead
- Implement with graph transformations
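A small sketch of why an expand operator can be replaced by an edge indexer and a join. The relation contents and the function names `expand_out` and `join_with_edges` are hypothetical, chosen only for this example. The navigational version follows adjacency lists from each tuple, while the join version treats the indexed edges as an ordinary relation, which is the form a Rete join node can maintain.

```python
# Hypothetical inputs: result of get-vertices (t:Train) and the :ON edge index.
trains = [(1,), (2,)]
on_edges = [(1, "a"), (2, "e")]   # (source vertex, target vertex)


def expand_out(relation, edges):
    """Navigational expand: follow outgoing edges from each tuple's last column."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)
    return [row + (target,)
            for row in relation
            for target in adjacency.get(row[-1], [])]


def join_with_edges(relation, edges):
    """Same result as a join: last column of the tuple = edge source."""
    return [row + (target,)
            for row in relation
            for source, target in edges
            if row[-1] == source]


navigational = expand_out(trains, on_edges)
joined = join_with_edges(trains, on_edges)
```

Both functions return the same tuples here; only the join form has two explicit inputs whose changes can be propagated independently.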
Query “Trailing the switch”
Query “Close proximity”
Accessing attributes
“Assuming that x is a column of a graph relation, we use the notation “x.a” in selection conditions to express the access to the corresponding value of property a in the property graph.”
Hölsch, J. and Grossniklaus, M.: An algebra and equivalences to transform graph patterns in Neo4j. GraphQ 2016, EDBT. http://kops.uni-konstanz.de/handle/123456789/33584
Difficult to implement in incremental algorithms: the “schema calculation” problem
Query plan: π t.number, sw over σ sw.position = 'diverging' over ⋈ of (sw:Switch)-[:STRAIGHT]->(seg:Segment) and (t:Train)-[:ON]->(seg:Segment)

operator | 1. external schema | 2. extra attributes | 3. internal schema
π t.number, sw | t.number, sw | t.number | t.number, sw
σ sw.position = 'diverging' | t, seg, sw | t.number, sw.position | t, seg, t.number, sw, sw.position
⋈ | t, seg, sw | t.number, sw.position | t, seg, t.number, sw, sw.position
(sw:Switch)-[:STRAIGHT]->(seg:Segment) | sw, seg | sw.position | sw, seg, sw.position
(t:Train)-[:ON]->(seg:Segment) | t, seg | t.number | t, seg, t.number
This is the current implementation
Works, but fragile
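The three schema steps for “Trailing the switch” can be reproduced with a short Python sketch. The propagation rule used here is an assumption inferred from the example (external schemas are computed bottom-up, required properties are pushed top-down to every operator that can see the property's variable); the `Op` class and function names are hypothetical and do not describe ingraph's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str
    children: list = field(default_factory=list)
    binds: tuple = ()      # leaf operators: variables bound by the pattern
    reads: tuple = ()      # properties the operator itself accesses
    projects: tuple = ()   # projection only: output columns
    external: set = field(default_factory=set)
    extra: set = field(default_factory=set)
    internal: set = field(default_factory=set)


def variables(op):
    """Variables available to an operator from the patterns below it."""
    if not op.children:
        return set(op.binds)
    return set().union(*(variables(c) for c in op.children))


def calc_external(op):
    """Step 1: external schema, bottom-up."""
    for child in op.children:
        calc_external(child)
    if op.projects:
        op.external = set(op.projects)
    elif op.children:
        op.external = set().union(*(c.external for c in op.children))
    else:
        op.external = set(op.binds)


def calc_internal(op, inherited=frozenset()):
    """Steps 2 and 3: push required properties down, then merge."""
    wanted = set(inherited) | set(op.reads)
    op.extra = {p for p in wanted if p.split(".")[0] in variables(op)}
    op.internal = op.external | op.extra
    for child in op.children:
        calc_internal(child, op.extra)


on = Op("ON", binds=("t", "seg"))
straight = Op("STRAIGHT", binds=("sw", "seg"))
join = Op("join", children=[straight, on])
select = Op("select", children=[join], reads=("sw.position",))
project = Op("project", children=[select],
             reads=("t.number",), projects=("t.number", "sw"))

calc_external(project)
calc_internal(project)
```

The sketch shows why the approach is fragile: every operator's tuple layout depends on what its ancestors happen to need.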
Nested Relational Algebra (NRA)
- Additional operators
- Nest (ν) ~ collect
- Unnest (μ) ~ UNWIND
- Catch: incrementality requires Flat Relational Algebra (FRA)
Roth, M.A., Korth, H.F. and Silberschatz, A.: Extended algebra and calculus for nested relational databases. ACM Transactions on Database Systems (TODS), 1988. http://dl.acm.org/citation.cfm?id=49347

Nested relation:
name | works (year, company)
John | (1982, Big Biz, Inc.), (2010, Fusion Power Plant, Ltd.)

Unnested (flat) relation:
name | works.year | works.company
John | 1982 | Big Biz, Inc.
John | 2010 | Fusion Power Plant, Ltd.
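The nest and unnest operators can be illustrated on the John example with a few lines of Python. This is a sketch under simple assumptions: relations are lists of dictionaries, nested attributes are flattened into dotted column names, and the function names `nest` and `unnest` are chosen for this example.

```python
people = [{
    "name": "John",
    "works": [
        {"year": 1982, "company": "Big Biz, Inc."},
        {"year": 2010, "company": "Fusion Power Plant, Ltd."},
    ],
}]


def unnest(relation, attr):
    """One flat row per element of the nested relation stored in attr."""
    flat = []
    for row in relation:
        rest = {k: v for k, v in row.items() if k != attr}
        for inner in row[attr]:
            flat.append({**rest,
                         **{f"{attr}.{k}": v for k, v in inner.items()}})
    return flat


def nest(relation, attr):
    """Group rows on the non-nested columns and collect the attr.* columns."""
    prefix = attr + "."
    groups = {}
    for row in relation:
        key = tuple((k, v) for k, v in row.items() if not k.startswith(prefix))
        inner = {k[len(prefix):]: v
                 for k, v in row.items() if k.startswith(prefix)}
        groups.setdefault(key, []).append(inner)
    return [{**dict(key), attr: inners} for key, inners in groups.items()]


flat = unnest(people, "works")
round_trip = nest(flat, "works")
```

Nesting the unnested relation restores the original here; in general a row with an empty nested relation disappears on unnest, so the round trip is not always lossless.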
Property graphs as nested relations
- Node/relationship properties:
id | name | age | favColours | beerRatings
1 | John | 32 | [blue, green] | {lager: 5, ale: 3}
- List: favColours is stored as a nested relation (id, value) holding blue and green
- Map: beerRatings is stored as a nested relation (key, value) with rows (lager, 5) and (ale, 3)
- Paths: […]
Flattening NRA to FRA
- It is possible to transform NRA to flat algebra expressions
- Research questions:
- Does it solve the schema calculation problem?
- Is it fast enough for practical implementations?
Paredaens, J. and Van Gucht, D.: Converting nested algebra expressions into flat algebra expressions. ACM Transactions on Database Systems (TODS), 1992 http://dl.acm.org/citation.cfm?id=128768
Incremental maintenance of FRA
Szárnyas, G., Maginecz, J. and Varró, D.: Evaluation of optimization strategies for incremental graph queries. Periodica Polytechnica EECS, 2017. http://docs.inf.mit.bme.hu/preprints/perpol2016-gqo.pdf
For a change Δt on the input:
- define change Δu on the output
- update internal data structures
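As a worked example of defining the output change, consider two standard cases under set semantics with insert-only changes. The subscripts and the predicate θ are notation introduced here for illustration. For a selection the output change depends only on the input change, while a join with inputs t_1 and t_2 (their states before the change) must also consult the cached inputs:

```latex
\begin{align*}
u = \sigma_\theta(t)
  &\;\Rightarrow\; \Delta u = \sigma_\theta(\Delta t) \\
u = t_1 \bowtie t_2
  &\;\Rightarrow\; \Delta u = (\Delta t_1 \bowtie t_2)
     \,\cup\, (t_1 \bowtie \Delta t_2)
     \,\cup\, (\Delta t_1 \bowtie \Delta t_2)
\end{align*}
```

The join case is why the internal data structures have to be updated: the operator needs the previous contents of both inputs to compute Δu.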
Maintenance of the antijoin operator
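A minimal sketch of one way to maintain an antijoin incrementally, written in Python. It is not taken from IRE: the `AntiJoin` class, its method names and the example relations are assumptions, and the right input is assumed to contain no duplicate tuples. The node caches the left tuples per join key and counts the right tuples per key; a left tuple is in the output exactly while the count for its key is zero.

```python
from collections import Counter, defaultdict


class AntiJoin:
    """Left tuples that have no right tuple with the same key (column 0)."""
    def __init__(self):
        self.left = defaultdict(set)    # key -> cached left tuples
        self.right_count = Counter()    # key -> number of right tuples
        self.output = set()

    def _emit(self, tup, sign):
        if sign > 0:
            self.output.add(tup)
        else:
            self.output.discard(tup)

    def update_left(self, tup, sign):
        key = tup[0]
        if sign > 0:
            self.left[key].add(tup)
        else:
            self.left[key].discard(tup)
        if self.right_count[key] == 0:
            self._emit(tup, sign)

    def update_right(self, tup, sign):
        key = tup[0]
        before = self.right_count[key]
        self.right_count[key] += sign
        after = self.right_count[key]
        if before == 0 and after > 0:      # first blocker arrived
            for t in self.left[key]:
                self._emit(t, -1)
        elif before > 0 and after == 0:    # last blocker left
            for t in self.left[key]:
                self._emit(t, +1)


# Hypothetical use: segments with no train on them.
free = AntiJoin()
for seg in ("a", "d", "e"):
    free.update_left((seg,), +1)
free.update_right(("a", 1), +1)
free.update_right(("e", 2), +1)
before_move = set(free.output)

# Train 2 moves from segment e to segment d.
free.update_right(("e", 2), -1)
free.update_right(("d", 2), +1)
after_move = set(free.output)
```

Unlike a join, an insertion on the right input can cause deletions on the output, which is what makes this operator a useful test case for incremental maintenance.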
IRE – Incremental Relational Engine
- Incremental (flat) relational engine built on Akka
- Independent from Cypher and property graphs
Source code: https://github.com/ftsrg/ingraph/tree/master/ire
OCIM1 revisited
- Composite data structures
- Lists
- Maps
- Paths
- Nested data structures
[ e1, e2, {k: […]} ]
CIR-2017-220
Current challenges
Update operations
- Presume a perfectly working incremental query engine
- How to perform updates?
- Low-level API operations:
indexer.addTuple()
- Adding new nodes:
CREATE (…)
- Matching and creating:
MATCH (n) CREATE (n)-[:REL]->(:Label)
- Loading CSVs (legacy construct):
LOAD CSV FROM … AS line CREATE (:Label {prop1: toInt(line[2]), …})
Not well suited to Rete
Roadmap
- Research
- Formalise openCypher using Nested Relational Algebra
- Transform nested expressions to Flat Relational Algebra
- Development
- Support for LDBC’s Social Network Benchmark / BI workload
- Use TCK for testing
- Implement NRA to FRA transformation
- See if it works
- Run benchmarks
- Use Akka clustering and Docker Compose for deployment
- Discover more use cases
Related resources
- Repository:
https://github.com/ftsrg/ingraph
- Technical report:
http://docs.inf.mit.bme.hu/ingraph/pub/opencypher-report.pdf
- Formalisation (preprint):