

slide-1
SLIDE 1

The ingraph project

and incremental evaluation of Cypher queries

Gábor Szárnyas, József Marton

slide-2
SLIDE 2

Incremental Queries

slide-3
SLIDE 3

Live railway model

slide-4
SLIDE 4

Live railway model

slide-5
SLIDE 5

Live railway model

slide-6
SLIDE 6

Live railway model

slide-7
SLIDE 7

Live railway model

Proximity detection

slide-8
SLIDE 8

Live railway model

Proximity detection

slide-9
SLIDE 9

Live railway model

Proximity detection

slide-10
SLIDE 10

Live railway model

Trailing the switch

Proximity detection

slide-11
SLIDE 11

Live railway model

slide-12
SLIDE 12

Live railway model

slide-13
SLIDE 13

Live railway model

[Figure: railway with segments a to g, a switch labelled div, and trains 1 and 2]

slide-14
SLIDE 14

Live railway model

[Figure: the railway as a graph: segments a to g, a switch labelled div, trains 1 and 2; edge labels NEXT, STRAIGHT, TOP, ON]

slide-15
SLIDE 15

Proximity detection

≤ 2 segments

slide-16
SLIDE 16

Proximity detection

≤ 2 segments

[Figure: graph pattern with trains t1 and t2 ON segments seg1 and seg2, connected by NEXT: 1..2]

slide-17
SLIDE 17

Proximity detection

≤ 2 segments

[Figure: graph pattern with trains t1 and t2 ON segments seg1 and seg2, connected by NEXT: 1..2]

MATCH (t1:Train)-[:ON]->(seg1:Segment)
  -[:NEXT*1..2]->(seg2:Segment)
  <-[:ON]-(t2:Train)
RETURN t1, t2, seg1, seg2

slide-18
SLIDE 18

Proximity detection

≤ 2 segments

[Figure: graph pattern with trains t1 and t2 ON segments seg1 and seg2, connected by NEXT: 1..2]

MATCH (t1:Train)-[:ON]->(seg1:Segment)
  -[:NEXT*1..2]->(seg2:Segment)
  <-[:ON]-(t2:Train)
RETURN t1, t2, seg1, seg2
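The pattern above can be read as a search over the graph. A minimal Python sketch, assuming a hypothetical toy track (the `NEXT` and `ON` dictionaries and the train names are illustrative, not taken from the slides) and set semantics, so a pair is reported once even if several paths connect the segments:

```python
# Hypothetical toy track: NEXT edges between segments, ON edges from trains to segments.
NEXT = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}
ON = {"t1": "a", "t2": "c", "t3": "e"}

def segments_within(start, max_hops):
    """Segments reachable from `start` via 1..max_hops NEXT edges."""
    reached = set()
    frontier = [start]
    for _ in range(max_hops):
        frontier = [nxt for seg in frontier for nxt in NEXT.get(seg, [])]
        reached.update(frontier)
    return reached

def proximity_matches(max_hops=2):
    """(t1, t2, seg1, seg2) tuples where t2 is at most max_hops segments ahead of t1."""
    matches = set()
    for t1, seg1 in ON.items():
        near = segments_within(seg1, max_hops)
        for t2, seg2 in ON.items():
            # each train has a single ON edge here, so t1 and t2 must differ
            if t2 != t1 and seg2 in near:
                matches.add((t1, t2, seg1, seg2))
    return matches

matches = proximity_matches()
```

This re-evaluates the whole pattern on every call, which is the cost that incremental evaluation tries to avoid.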

slide-19
SLIDE 19

Trailing the switch

slide-20
SLIDE 20

Trailing the switch

[Figure: graph pattern with train t ON segment seg, connected by STRAIGHT to a switch labelled div]

slide-21
SLIDE 21

Trailing the switch

[Figure: graph pattern with train t ON segment seg, connected by STRAIGHT to a switch labelled div]

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

slide-22
SLIDE 22

Trailing the switch

[Figure: graph pattern with train t ON segment seg, connected by STRAIGHT to a switch labelled div]

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

slide-23
SLIDE 23

Trailing the switch

[Figure: graph pattern with train t ON segment seg, connected by STRAIGHT to a switch labelled div]

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

Evaluate continuously

slide-24
SLIDE 24

Incremental queries

  • Register a set of standing queries
  • Continuously evaluate queries on changes
  • The Rete algorithm (1974)
      • Originally for rule-based expert systems
      • Indexes the graph and caches interim query results

Ujhelyi, Z. et al.: EMF-IncQuery: An integrated development environment for live model queries. Science of Computer Programming (SCP), 2015, http://www.sciencedirect.com/science/article/pii/S0167642314000082
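The idea of caching interim results can be sketched with a single two-input join node that keeps a memory per input and updates its result on each insertion or deletion. This is a simplified illustration in Python, not ingraph's implementation; the class `JoinNode` and its method names are hypothetical, and the selection on sw.position is left out:

```python
from collections import defaultdict

class JoinNode:
    """Two-input join on a shared key; keeps one memory per input."""

    def __init__(self):
        self.left = defaultdict(set)    # key -> values seen on the left input
        self.right = defaultdict(set)   # key -> values seen on the right input
        self.results = set()            # cached join result

    def update(self, side, sign, key, value):
        """Apply an insertion (sign > 0) or deletion (sign < 0) on one input."""
        own, other = (self.left, self.right) if side == "left" else (self.right, self.left)
        if sign > 0:
            own[key].add(value)
        else:
            own[key].discard(value)
        # only the partners with the same key are touched, not the whole graph
        for partner in other[key]:
            row = (key, value, partner) if side == "left" else (key, partner, value)
            if sign > 0:
                self.results.add(row)
            else:
                self.results.discard(row)

# left input: STRAIGHT edges as (segment, switch); right input: ON edges as (segment, train)
join = JoinNode()
join.update("right", +1, "a", 1)
join.update("right", +1, "e", 2)
join.update("left", +1, "e", "div")
after_insert = set(join.results)

# train 2 moves from segment e to segment d
join.update("right", -1, "e", 2)
join.update("right", +1, "d", 2)
after_move = set(join.results)
```

Each update costs work proportional to the number of matching partners, which is the trade-off Rete makes in exchange for the extra memory.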

slide-25
SLIDE 25

Trailing the switch

[Figure: Rete network for the query (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) beside the graph pattern]

slide-26
SLIDE 26

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) beside the railway graph with segments a to g, a switch labelled div, and trains 1 and 2. No tuples shown yet]

slide-27
SLIDE 27

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) beside the railway graph with segments a to g, a switch labelled div, and trains 1 and 2. No tuples shown yet]

slide-28
SLIDE 28

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2)]

slide-29
SLIDE 29

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2)]

slide-30
SLIDE 30

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2)]

slide-31
SLIDE 31

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div)]

slide-32
SLIDE 32

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div)]

slide-33
SLIDE 33

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div)]

slide-34
SLIDE 34

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div)]

slide-35
SLIDE 35

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div); joined tuple (e, div, 2)]

slide-36
SLIDE 36

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div); joined tuple (e, div, 2)]

slide-37
SLIDE 37

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div); joined tuple (e, div, 2)]

slide-38
SLIDE 38

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div); joined tuple (e, div, 2)]

slide-39
SLIDE 39

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div); joined tuple (e, div, 2); result tuple with switch div and train 2]

slide-40
SLIDE 40

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div); joined tuple (e, div, 2); result tuple with switch div and train 2]

slide-41
SLIDE 41

Trailing the switch

[Figure: Rete network (ON and STRAIGHT inputs, σ sw.position = 'diverging', π t.number, sw) and railway graph. Tuples shown: ON (a, 1), (e, 2); STRAIGHT (e, div); joined tuple (e, div, 2); result tuple with switch div and train 2]

slide-42
SLIDE 42

Trailing the switch

[Figure: Rete network and railway graph after train 2 moves to segment d. Tuples shown: ON (a, 1), (e, 2) and the new ON tuple (d, 2); STRAIGHT (e, div); joined tuple (e, div, 2); result tuple with switch div and train 2]

slide-43
SLIDE 43

Trailing the switch

[Figure: Rete network and railway graph after train 2 moves to segment d. Tuples shown: ON (a, 1), (d, 2); STRAIGHT (e, div); joined tuple (e, div, 2); result tuple with switch div and train 2]

slide-44
SLIDE 44

Trailing the switch

[Figure: Rete network and railway graph after train 2 moves to segment d. Tuples shown: ON (a, 1), (d, 2); STRAIGHT (e, div); one remaining copy of the joined tuple (e, div, 2); result tuple with switch div and train 2]

slide-45
SLIDE 45

Trailing the switch

[Figure: Rete network and railway graph after train 2 moves to segment d. Tuples shown: ON (a, 1), (d, 2); STRAIGHT (e, div); result tuple with switch div and train 2. The joined tuple is no longer shown]

slide-46
SLIDE 46

Trailing the switch

[Figure: Rete network and railway graph after train 2 moves to segment d. Tuples shown: ON (a, 1), (d, 2); STRAIGHT (e, div). The result tuple is no longer shown]

slide-47
SLIDE 47

ingraph

  • Proof-of-concept (PoC) query engine for openCypher
  • Based on the Rete algorithm
  • Goals:
      • Provide incremental query evaluation
      • Cover standard openCypher constructs
      • Run in parallel and distributed settings for scalability

Szárnyas, G. et al. IncQuery-D: A distributed incremental model query framework in the cloud. MODELS, 2014, https://link.springer.com/chapter/10.1007/978-3-319-11653-2_40

slide-48
SLIDE 48

Architecture & Building Blocks

slide-49
SLIDE 49

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

[Figure: compilation pipeline. Artifacts: openCypher query]

slide-50
SLIDE 50

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

[Figure: compilation pipeline. Artifacts: openCypher query, query syntax tree]

slide-51
SLIDE 51

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

[Figure: compilation pipeline. Artifacts: openCypher query, query syntax tree. Components: query parser]

slide-52
SLIDE 52

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

[Figure: compilation pipeline. Artifacts: openCypher query, query syntax tree, Relational Graph Algebra. Components: query parser]

slide-53
SLIDE 53

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

[Figure: compilation pipeline. Artifacts: openCypher query, query syntax tree, Relational Graph Algebra. Components: query parser, relational algebra builder]

slide-54
SLIDE 54

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

[Figure: compilation pipeline. Artifacts: openCypher query, query syntax tree, Relational Graph Algebra. Components: query parser, relational algebra builder]

slide-55
SLIDE 55

MATCH (t:Train)-[:ON]->(seg:Segment)
  <-[:STRAIGHT]-(sw:Switch)
WHERE sw.position = 'diverging'
RETURN t.number, sw

[Figure: compilation pipeline. Artifacts: openCypher query, query syntax tree, Relational Graph Algebra. Components: query parser, relational algebra builder]

slide-56
SLIDE 56

[Figure: compilation pipeline, continued. Artifacts: Relational Graph Algebra, Rete network model, Rete network]

slide-57
SLIDE 57

[Figure: compilation pipeline, continued. Artifacts: Relational Graph Algebra, Rete network model, Rete network. Components: transformer and optimizer]

slide-58
SLIDE 58

[Figure: compilation pipeline, continued. Artifacts: Relational Graph Algebra, Rete network model, Rete network. Components: transformer and optimizer]

slide-59
SLIDE 59

[Figure: compilation pipeline, continued. Artifacts: Relational Graph Algebra, Rete network model, Rete network. Components: transformer and optimizer, query deployer]

slide-60
SLIDE 60

Operators of Relational Graph Algebra

  • Basic relational algebra
      • projection, selection, join, left outer join, antijoin, union
  • Common extensions
      • aggregation (γ), duplicate-elimination (δ), sort (τ), top (λ)
  • Specific extensions
      • get-vertices (○)
      • expand-out (↑), expand-in (↓), expand-both (↕)
      • all-different (≡)
      • unwind (ω)

Szárnyas, G., Marton, J. and Varró, D.: Formalising openCypher Graph Queries in Relational Algebra. https://arxiv.org/abs/1705.02844


slide-61
SLIDE 61
slide-62
SLIDE 62
slide-63
SLIDE 63

Incremental query evaluation

  • RGA defines a plan for a search-based engine
  • Some operators cannot be maintained incrementally
      • expand-out, expand-in, …
      • use edge indexers and joins instead
  • Implement with graph transformations
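The replacement of expand operators by joins can be illustrated on a tiny example. The sketch below, in Python with hypothetical helper names (`expand_out`, `get_edges`, `natural_join`) and made-up data, shows that navigating outgoing edges from bound vertices gives the same rows as joining the input with a binary edge relation, which is the form an edge indexer can feed into a Rete join:

```python
# Hypothetical edge store: (source, type, target) triples.
EDGES = [("t1", "ON", "e"), ("t2", "ON", "a"), ("div", "STRAIGHT", "e")]

def expand_out(rows, src_var, edge_type, dst_var):
    """Search-style plan: for each row, navigate outgoing edges of the bound vertex."""
    return [dict(row, **{dst_var: dst})
            for row in rows
            for (src, typ, dst) in EDGES
            if typ == edge_type and src == row[src_var]]

def get_edges(src_var, edge_type, dst_var):
    """Indexer-style input: all edges of one type as a binary relation."""
    return [{src_var: s, dst_var: d} for (s, t, d) in EDGES if t == edge_type]

def natural_join(left, right):
    """Join rows that agree on all shared attribute names."""
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

trains = [{"t": "t1"}, {"t": "t2"}]
via_expand = expand_out(trains, "t", "ON", "seg")
via_join = natural_join(trains, get_edges("t", "ON", "seg"))
```

The join form depends only on two relations, so changes to either input can be propagated as deltas.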
slide-64
SLIDE 64

Query “Trailing the switch”

slide-65
SLIDE 65

Query “Close proximity”

slide-66
SLIDE 66

Accessing attributes

Assuming that x is a column of a graph relation, we use the notation “x.a” in selection conditions to express the access to the corresponding value of property a in the property graph.

Hölsch, J. and Grossniklaus, M.: An algebra and equivalences to transform graph patterns in Neo4j. GraphQ 2016, EDBT, http://kops.uni-konstanz.de/handle/123456789/33584

slide-67
SLIDE 67

Accessing attributes

Assuming that x is a column of a graph relation, we use the notation “x.a” in selection conditions to express the access to the corresponding value of property a in the property graph.

Hölsch, J. and Grossniklaus, M.: An algebra and equivalences to transform graph patterns in Neo4j. GraphQ 2016, EDBT, http://kops.uni-konstanz.de/handle/123456789/33584

Difficult to implement in incremental algorithms: the “schema calculation” problem

slide-68
SLIDE 68

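[Figure: schema calculation on the operator tree of “Trailing the switch”, from the inputs up to the projection]

  • (t:Train)−[:ON]−>(seg:Segment): external schema t, seg; internal schema t, seg, t.number
  • (sw:Switch)−[:STRAIGHT]−>(seg:Segment): external schema sw, seg; internal schema sw, seg, sw.position
  • join: external schema t, seg, sw; internal schema t, seg, t.number, sw, sw.position
  • σ sw.position = 'diverging': external schema t, seg, sw; internal schema t, seg, t.number, sw, sw.position
  • π t.number, sw: external schema t.number, sw; internal schema t.number, sw
  • Extra attributes annotated between the operators: t.number, sw.position

Steps:

  • 1. external schema
  • 2. extra attributes
  • 3. internal schema

This is the current implementation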
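The three steps can be sketched as one top-down pass over the operator tree. This is an illustrative Python sketch, not the ingraph code: the `Op` class, the `uses` field, and the rule that an access x.a is attached wherever variable x is in the external schema are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Op:
    name: str
    external: list                                   # 1. schema by the algebra's semantics
    uses: list = field(default_factory=list)         # property accesses this operator evaluates
    children: list = field(default_factory=list)
    extra: list = field(default_factory=list)        # 2. accesses it must carry
    internal: list = field(default_factory=list)     # 3. external schema plus extras

def calculate(op, needed_above=()):
    """Push required property accesses down the tree and derive internal schemas."""
    needed = list(dict.fromkeys([*needed_above, *op.uses]))   # de-duplicate, keep order
    # assumption: x.a can be carried wherever variable x is part of the external schema
    op.extra = [a for a in needed if a.split(".")[0] in op.external]
    op.internal = op.external + [a for a in op.extra if a not in op.external]
    for child in op.children:
        calculate(child, needed)
    return op

on = Op("ON", ["t", "seg"])
straight = Op("STRAIGHT", ["sw", "seg"])
join = Op("join", ["t", "seg", "sw"], children=[on, straight])
select = Op("selection", ["t", "seg", "sw"], uses=["sw.position"], children=[join])
project = calculate(Op("projection", ["t.number", "sw"], uses=["t.number"], children=[select]))
```

The column order inside a schema is not significant here; only the set of attributes matters for the comparison with the slide.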

slide-69
SLIDE 69
slide-70
SLIDE 70

Works, but fragile

slide-71
SLIDE 71

Nested Relational Algebra (NRA)

  • Additional operators
      • Nest (ν) ~ collect
      • Unnest (μ) ~ UNWIND
  • Catch: incrementality requires Flat Relational Algebra (FRA)

Roth, M.A., Korth, H.F. and Silberschatz, A.: Extended algebra and calculus for nested relational databases. ACM Transactions on Database Systems (TODS), 1988, http://dl.acm.org/citation.cfm?id=49347

Nested relation:

name | works (year, company)
John | (1982, Big Biz, Inc.), (2010, Fusion Power Plant, Ltd.)

Unnested (flat) relation:

name | works.year | works.company
John | 1982 | Big Biz, Inc.
John | 2010 | Fusion Power Plant, Ltd.
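The two operators can be tried out on the example relation. A minimal Python sketch, assuming relations are lists of dictionaries and the dotted column naming (`works.year`) shown on the slide; the function names `unnest` and `nest` are illustrative:

```python
nested = [{"name": "John",
           "works": [{"year": 1982, "company": "Big Biz, Inc."},
                     {"year": 2010, "company": "Fusion Power Plant, Ltd."}]}]

def unnest(rows, attr):
    """Flatten the nested relation `attr`: one output row per inner tuple."""
    flat = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != attr}
        for inner in row[attr]:
            flat.append({**rest, **{f"{attr}.{k}": v for k, v in inner.items()}})
    return flat

def nest(rows, attr):
    """Group rows by the non-`attr` columns and collect the rest into a nested relation."""
    prefix = attr + "."
    groups = {}
    for row in rows:
        outer = tuple((k, v) for k, v in row.items() if not k.startswith(prefix))
        inner = {k[len(prefix):]: v for k, v in row.items() if k.startswith(prefix)}
        groups.setdefault(outer, []).append(inner)
    return [{**dict(outer), attr: inner} for outer, inner in groups.items()]

flat = unnest(nested, "works")
round_trip = nest(flat, "works")
```

Note that this `unnest` drops outer rows whose nested relation is empty, so the round trip only holds when every row has at least one inner tuple.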

slide-72
SLIDE 72

Property graphs as nested relations

  • Node/relationship properties:

id | name | age | favColours | beerRatings
1 | John | 32 | [blue, green] | {lager: 5, ale: 3}

slide-73
SLIDE 73

Property graphs as nested relations

  • Node/relationship properties:
  • List:

id | name | age | favColours | beerRatings
1 | John | 32 | [blue, green] | {lager: 5, ale: 3}

List: favColours becomes a nested relation with columns id and value, holding blue and green; beerRatings is still the map {lager: 5, ale: 3}.

slide-74
SLIDE 74

Property graphs as nested relations

  • Node/relationship properties:
  • List:
  • Map:

id | name | age | favColours | beerRatings
1 | John | 32 | [blue, green] | {lager: 5, ale: 3}

List: favColours becomes a nested relation with columns id and value, holding blue and green.

Map: beerRatings becomes a nested relation with columns key and value, holding (lager, 5) and (ale, 3).

slide-75
SLIDE 75

Property graphs as nested relations

  • Node/relationship properties:
  • List:
  • Map:
  • Paths: […]

id | name | age | favColours | beerRatings
1 | John | 32 | [blue, green] | {lager: 5, ale: 3}

List: favColours becomes a nested relation with columns id and value, holding blue and green.

Map: beerRatings becomes a nested relation with columns key and value, holding (lager, 5) and (ale, 3).

slide-76
SLIDE 76

Flattening NRA to FRA

  • It is possible to transform NRA to flat algebra expressions
  • Research questions:
  • Does it solve the schema calculation problem?
  • Is it fast enough for practical implementations?

Paredaens, J. and Van Gucht, D.: Converting nested algebra expressions into flat algebra expressions. ACM Transactions on Database Systems (TODS), 1992 http://dl.acm.org/citation.cfm?id=128768

slide-77
SLIDE 77

Incremental maintenance of FRA

Szárnyas, G., Maginecz, J. and Varró, D.: Evaluation of optimization strategies for incremental graph queries. Periodica Polytechnica EECS, 2017, http://docs.inf.mit.bme.hu/preprints/perpol2016-gqo.pdf

For a change Δ𝑡 on the input

  • define change Δ𝑢 on the output
  • update internal data structures

Maintenance of the antijoin operator
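As a hedged illustration of the two steps for the antijoin, the Python sketch below keeps a memory of left tuples and a per-key count of right tuples, and returns the output change for each input change. The class `AntijoinNode`, its method names, and the "free segments" example are assumptions for this sketch, not the algorithm from the cited paper.

```python
from collections import defaultdict

class AntijoinNode:
    """Maintains left ▷ right: left tuples whose key has no matching right tuple."""

    def __init__(self):
        self.left = defaultdict(set)         # key -> left tuples
        self.right_count = defaultdict(int)  # key -> number of right tuples

    def update_left(self, sign, key, tup):
        """Input change on the left; returns the list of (sign, tuple) output changes."""
        if sign > 0:
            self.left[key].add(tup)
        else:
            self.left[key].discard(tup)
        # the change is visible in the output only if no right tuple blocks the key
        return [(sign, tup)] if self.right_count[key] == 0 else []

    def update_right(self, sign, key):
        """Input change on the right; only the 0 <-> 1 transitions affect the output."""
        before = self.right_count[key]
        after = before + sign
        self.right_count[key] = after
        if before == 0 and after > 0:
            return [(-1, t) for t in sorted(self.left[key])]
        if before > 0 and after == 0:
            return [(+1, t) for t in sorted(self.left[key])]
        return []

# hypothetical use: segments with no train on them (segments ▷ ON)
node = AntijoinNode()
out = []
out += node.update_left(+1, "d", "d")
out += node.update_left(+1, "e", "e")
out += node.update_right(+1, "e")     # a train enters e
out += node.update_right(-1, "e")     # the train leaves e ...
out += node.update_right(+1, "d")     # ... and enters d
```

Counting right tuples instead of storing them is enough here because the antijoin output never contains right-side attributes.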

slide-78
SLIDE 78

IRE – Incremental Relational Engine

  • Incremental (flat) relational engine built on Akka
  • Independent from Cypher and property graphs

Source code: https://github.com/ftsrg/ingraph/tree/master/ire

slide-79
SLIDE 79

OCIM1 revisited

  • Composite data structures
      • Lists
      • Maps
      • Paths
  • Nested data structures

[ e1, e2, {k: […]} ]

slide-80
SLIDE 80

OCIM1 revisited

  • Composite data structures
      • Lists
      • Maps
      • Paths
  • Nested data structures

[ e1, e2, {k: […]} ]

slide-81
SLIDE 81

OCIM1 revisited

  • Composite data structures
      • Lists
      • Maps
      • Paths
  • Nested data structures

[ e1, e2, {k: […]} ]

slide-82
SLIDE 82

OCIM1 revisited

  • Composite data structures
      • Lists
      • Maps
      • Paths
  • Nested data structures

[ e1, e2, {k: […]} ]

slide-83
SLIDE 83

OCIM1 revisited

  • Composite data structures
      • Lists
      • Maps
      • Paths
  • Nested data structures

[ e1, e2, {k: […]} ]

slide-84
SLIDE 84

OCIM1 revisited

  • Composite data structures
      • Lists
      • Maps
      • Paths
  • Nested data structures

[ e1, e2, {k: […]} ]

CIR-2017-220

slide-85
SLIDE 85

Current challenges

slide-86
SLIDE 86

Update operations

  • Presume a perfectly working incremental query engine
  • How to perform updates?
  • Low-level API operations:

indexer.addTuple()

  • Adding new nodes:

CREATE (…)

  • Matching and creating:

MATCH (n) CREATE (n)-[:REL]->(:Label)

  • Loading CSVs (legacy construct):

LOAD CSV FROM … AS line CREATE (:Label {prop1: toInt(line[2]), …})

slide-87
SLIDE 87

Update operations

  • Presume a perfectly working incremental query engine
  • How to perform updates?
  • Low-level API operations:

indexer.addTuple()

  • Adding new nodes:

CREATE (…)

  • Matching and creating:

MATCH (n) CREATE (n)-[:REL]->(:Label)

  • Loading CSVs (legacy construct):

LOAD CSV FROM … AS line CREATE (:Label {prop1: toInt(line[2]), …})

Not well suited to Rete
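One way to see the mismatch: an update query is evaluated once, while the indexers behind a Rete network expect a stream of tuple-level changes. The Python sketch below is purely illustrative. The `Indexer` class only mimics the `indexer.addTuple()` call named on the slide, and its `subscribe` method, the tuple layouts, and the id allocation are assumptions. It shows MATCH (n) CREATE (n)-[:REL]->(:Label) as a snapshot read followed by low-level additions:

```python
class Indexer:
    """Hypothetical stand-in for a low-level indexer that feeds a Rete network."""

    def __init__(self):
        self.tuples = set()
        self.listeners = []

    def subscribe(self, callback):
        self.listeners.append(callback)

    def add_tuple(self, tup):
        if tup not in self.tuples:
            self.tuples.add(tup)
            for notify in self.listeners:
                notify(+1, tup)          # propagate the change as a delta

vertices = Indexer()                     # tuples: (id, label)
edges = Indexer()                        # tuples: (source, type, target)
log = []
edges.subscribe(lambda sign, tup: log.append((sign, tup)))

vertices.add_tuple((1, None))
vertices.add_tuple((2, None))

# MATCH (n): take a snapshot first, so nodes created below are not matched again
matched = [vid for (vid, _) in sorted(vertices.tuples, key=lambda t: t[0])]
next_id = max(matched) + 1

# CREATE (n)-[:REL]->(:Label): one new node and one new edge per matched node
for n in matched:
    vertices.add_tuple((next_id, "Label"))
    edges.add_tuple((n, "REL", next_id))
    next_id += 1
```

Without the snapshot, the freshly created :Label nodes would themselves match (n), which is one reason such one-shot update queries sit awkwardly next to continuously evaluated standing queries.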

slide-88
SLIDE 88

Roadmap

  • Research
  • Formalise openCypher using Nested Relational Algebra
  • Transform nested expressions to Flat Relational Algebra
  • Development
  • Support for LDBC’s Social Network Benchmark / BI workload
  • Use TCK for testing
  • Implement NRA to FRA transformation
  • See if it works
  • Run benchmarks
  • Use Akka clustering and Docker Compose for deployment
  • Discover more use cases
slide-89
SLIDE 89

Related resources

  • Repository:

https://github.com/ftsrg/ingraph

  • Technical report:

http://docs.inf.mit.bme.hu/ingraph/pub/opencypher-report.pdf

  • Formalisation (preprint):

https://arxiv.org/abs/1705.02844