SLIDE 1

Cypher-based Graph Pattern Matching in Gradoop

GRADES2017: Graph Data-management Experiences & Systems, Chicago, May 2017

Martin Junghanns (1), Max Kießling (1,2), Alex Averbuch (2), André Petermann (1) and Erhard Rahm (1)

(1) University of Leipzig – Database Research Group
(2) Neo Technology

SLIDE 2

Cypher-based Graph Pattern Matching in GRADOOP – GRADES 2017

Motivation | GRADOOP | Implementation | Benchmark

Motivation

SLIDE 3

Motivation

SLIDE 4

Motivation

“Who are the closest enemies of each Orc?”

SLIDE 5

Motivation

Cypher

SLIDE 6

Motivation

Flink Gelly

SLIDE 7

Motivation

SLIDE 8

Motivation

SLIDE 9

Motivation

“Which two clan leaders hate each other and one of them knows Frodo over one to ten hops?”

SLIDE 10

Motivation

Cypher

SLIDE 11

Motivation

Flink Gelly (or any other non-declarative graph processing system)

SLIDE 12

GRADOOP

SLIDE 13

GRADOOP

“An open-source graph dataflow system for declarative analytics of heterogeneous graph data.”

SLIDE 14

GRADOOP

  • Distributed Graph Storage (Apache HDFS)
  • Distributed Operator Execution (Apache Flink)
  • Extended Property Graph Model (EPGM)
  • Analytical API
  • I/O, Graph Operators, Graph Algorithms

SLIDE 15

GRADOOP

Property Graph Model

[Figure: example property graph with vertices 1 to 5 and edges 1 to 5]

Vertices (label, properties):
  • Hobbit (name: Samwise)
  • Orc (name: Azog)
  • Clan (name: Tribes of Moria, founded: 1981)
  • Orc (name: Bolg)
  • Hobbit (name: Frodo, yob: 2968)

Edges (label, properties):
  • leaderOf (since: 2790)
  • memberOf (since: 2013)
  • hates (since: 2301)
  • hates
  • knows (since: 2990)

SLIDE 16

GRADOOP

Extended Property Graph Model

[Figure: the property graph of the previous slide, with its vertices and edges assigned to two logical graphs]

Logical graphs (label, properties):
  • Area (title: Mordor)
  • Area (title: Shire)

SLIDE 17

GRADOOP

Gradoop Graph Transformations

Logical Graph
  • Unary: Aggregation, Pattern Matching, Transformation, Grouping, Call, Subgraph
  • Binary: Equality, Combination, Overlap, Exclusion, Fusion

Graph Collection
  • Unary: Limit, Selection, Pattern Matching, Distinct, Apply, Reduce, Call
  • Binary: Equality, Union, Intersection, Difference

SLIDE 18

GRADOOP

Pattern Matching (Single-Graph Setting)

[Figure: a pattern is matched against a logical graph; the result is a graph collection]

GraphCollection collection = graph3.cypher("MATCH (:Green)-[:orange]->(:Orange) RETURN *", ISO, ISO);

SLIDE 19

Implementation

SLIDE 20

Implementation

DataSet<EPGMGraphHead>

Id  Label  Properties
1   Area   {title:Mordor}
2   Area   {title:Shire}

DataSet<EPGMVertex>

Id  Label   Properties                            Graphs
1   Orc     {name:Azog}                           {1}
2   Clan    {name:Tribes of Moria, founded:1981}  {1}
3   Orc     {name:Bolg}                           {1,2}
4   Hobbit  {name:Frodo, yob:2968}                {2}
5   Hobbit  {name:Samwise}                        {2}

DataSet<EPGMEdge>

Id  Label     Source  Target  Properties    Graphs
1   leaderOf  1       2       {since:2790}  {1}
2   memberOf  3       2       {since:2013}  {1}
3   hates     3       4       {since:2301}  {2}
4   hates     3       5       {}            {2}
5   knows     5       4       {since:2990}  {2}
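
The role of the Graphs column can be illustrated with a small sketch in plain Java. This is not the Gradoop or Flink API: the names `EpgmSketch`, `Vertex` and `verticesOf` are hypothetical, an in-memory `List` stands in for a distributed `DataSet`, and properties are left out. It only shows how one shared vertex table can serve several logical graphs through per-row graph membership.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the vertex table on this slide; not the Gradoop API.
class EpgmSketch {

  // A vertex row: id, type label, and the ids of the logical graphs containing it.
  static final class Vertex {
    final long id;
    final String label;
    final Set<Long> graphs;

    Vertex(long id, String label, Long... graphs) {
      this.id = id;
      this.label = label;
      this.graphs = new HashSet<>(Arrays.asList(graphs));
    }
  }

  // The vertex table from the slide (properties omitted for brevity).
  static final List<Vertex> VERTICES = Arrays.asList(
      new Vertex(1, "Orc", 1L),
      new Vertex(2, "Clan", 1L),
      new Vertex(3, "Orc", 1L, 2L),
      new Vertex(4, "Hobbit", 2L),
      new Vertex(5, "Hobbit", 2L));

  // Ids of all vertices that belong to the given logical graph.
  static List<Long> verticesOf(long graphId) {
    return VERTICES.stream()
        .filter(v -> v.graphs.contains(graphId))
        .map(v -> v.id)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println("Logical graph 1: " + verticesOf(1));
    System.out.println("Logical graph 2: " + verticesOf(2));
  }
}
```

Vertex 3 (Bolg) appears in both results because its Graphs entry is {1,2}; the same lookup would apply to the edge table.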

SLIDE 21

Implementation

Parsing
  • Query variables: c1, o1, o2, c2, h
  • Predicates: (c1 != c2) AND (o1 != o2) AND (h.name = Frodo Baggins)

Planning
  • [Figure: alternative query plans, annotated => 23, => 42, => 84, => 123, => 456, => 789]

Execution

SLIDE 22

Implementation

PlanTableEntry | type: GRAPH | all-vars: [...] | proc-vars: [...] | attr-vars: [] | est-card: 23 | prediates: () | Plan :
|-FilterEmbeddingsNode{filterPredicate=((c1 != c2) AND (o1 != o2))}
|.|-JoinEmbeddingsNode{joinVariables=[o2], vertexMorphism=H, edgeMorphism=I}
|.|.|-JoinEmbeddingsNode{joinVariables=[o1], vertexMorphism=H, edgeMorphism=I}
|.|.|.|-JoinEmbeddingsNode{joinVariables=[c1], vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=c1, filterPredicate=((c1.label = Clan)), projectionKeys=[]}
|.|.|.|.|-FilterAndProjectEdgesNode{sourceVar='o1', edgeVar='_e0', targetVar='c1', filterPredicate=((_e0.label = leaderOf)), projectionKeys=[]}
|.|.|.|-JoinEmbeddingsNode{joinVariables=[o1], vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=o1, filterPredicate=((o1.label = Orc)), projectionKeys=[]}
|.|.|.|.|-FilterAndProjectEdgesNode{sourceVar='o1', edgeVar='_e1', targetVar='o2', filterPredicate=((_e1.label = hates)), projectionKeys=[]}
|.|.|-JoinEmbeddingsNode{joinVariables=[o2], vertexMorphism=H, edgeMorphism=I}
|.|.|.|-JoinEmbeddingsNode{joinVariables=[h], vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=h, filterPredicate=((h.label = Hobbit) AND (h.name = Frodo Baggins)), projectionKeys=[]}
|.|.|.|.|-ExpandEmbeddingsNode={startVar='o2', pathVar='_e3', endVar='h', lb=1, ub=10, direction=OUT, vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=o2, filterPredicate=((o2.label = Orc)), projectionKeys=[]}
|.|.|.|.|.|-FilterAndProjectEdgesNode{sourceVar='o2', edgeVar='_e3', targetVar='h', filterPredicate=((_e3.label = knows)), projectionKeys=[]}
|.|.|.|-JoinEmbeddingsNode{joinVariables=[c2], vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=c2, filterPredicate=((c2.label = Clan)), projectionKeys=[]}
|.|.|.|.|-FilterAndProjectEdgesNode{sourceVar='o2', edgeVar='_e2', targetVar='c2', filterPredicate=((_e2.label = leaderOf)), projectionKeys=[]}
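
The ExpandEmbeddingsNode in the plan handles the variable-length part of the query (lb=1, ub=10, direction=OUT). As a rough illustration of that idea only, and assuming nothing about how Gradoop implements the operator, the sketch below grows paths hop by hop over an in-memory edge list and keeps every path whose length lies within the bounds. The class `ExpandSketch`, the method `expand` and the sample `KNOWS` edges are invented for this example. An edge may not be reused within a path (mirroring edgeMorphism=I), while vertices may repeat (mirroring vertexMorphism=H); the check that a path ends at a vertex matching `h` is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: bounded path expansion over an in-memory edge list.
class ExpandSketch {

  // Directed edges as {source, target}; the list index serves as the edge id.
  static final List<long[]> KNOWS = Arrays.asList(
      new long[] {1, 2},
      new long[] {2, 3},
      new long[] {3, 1},
      new long[] {3, 4});

  // Returns every path (as a list of edge ids) that starts at `start`,
  // follows edges in OUT direction, uses no edge twice, and has between
  // lb and ub edges.
  static List<List<Integer>> expand(long start, int lb, int ub) {
    List<List<Integer>> results = new ArrayList<>();
    List<List<Integer>> frontier = new ArrayList<>();
    frontier.add(new ArrayList<>()); // the empty path at `start`

    for (int hops = 1; hops <= ub && !frontier.isEmpty(); hops++) {
      List<List<Integer>> next = new ArrayList<>();
      for (List<Integer> path : frontier) {
        // Current end vertex: target of the last edge, or `start` for the empty path.
        long end = path.isEmpty() ? start : KNOWS.get(path.get(path.size() - 1))[1];
        for (int e = 0; e < KNOWS.size(); e++) {
          if (KNOWS.get(e)[0] == end && !path.contains(e)) {
            List<Integer> longer = new ArrayList<>(path);
            longer.add(e);
            next.add(longer);
          }
        }
      }
      if (hops >= lb) {
        results.addAll(next); // paths of this length are within the bounds
      }
      frontier = next;
    }
    return results;
  }

  public static void main(String[] args) {
    for (List<Integer> path : expand(1, 1, 10)) {
      System.out.println(path);
    }
  }
}
```

Each loop iteration corresponds to one more hop, so the upper bound caps the number of iterations; the expansion also stops early once no path can be extended.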

SLIDE 23

Implementation

FilterAndProject

FlatMap(Vertex -> Embedding): DataSet<Vertex> to DataSet<Embedding>

Filter: Hobbit(name=Frodo Baggins)

$\sigma_{\mathit{Label} = \mathit{Hobbit} \,\wedge\, \mathit{name} = \mathit{Frodo}}(V)$

Example vertex: name: Frodo Baggins, height: 1.22m, gender: male, city: Bag End

Project: [ ]

$\pi_{h.Id}(V')$

DataSet<Vertex>:

id  Properties
1   {…}
2   {…}
3   {…}
…   …

DataSet<Embedding>:

h.id  h.name  h.height  …
31    Frodo   1.22      …

h.id
32
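
A minimal sketch of the filter-and-project step in plain Java, with `java.util.stream` standing in for Flink's FlatMap. The `Vertex` class, the method names and the list-based embedding layout are assumptions made for illustration and are not Gradoop's classes; the sample data is made up. Each vertex yields either one embedding (its id followed by the requested property values) or nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative only: plain Java streams stand in for Flink's FlatMap.
class FilterAndProjectSketch {

  // Hypothetical vertex type: id, label, and a property map.
  static final class Vertex {
    final long id;
    final String label;
    final Map<String, Object> properties = new HashMap<>();

    Vertex(long id, String label) {
      this.id = id;
      this.label = label;
    }

    Vertex with(String key, Object value) {
      properties.put(key, value);
      return this;
    }
  }

  // Emits one embedding [id, projected values...] if the vertex passes the
  // label and name filter, and nothing otherwise.
  static Stream<List<Object>> toEmbedding(
      Vertex v, String label, String name, List<String> projectionKeys) {
    if (!label.equals(v.label) || !name.equals(v.properties.get("name"))) {
      return Stream.empty();
    }
    List<Object> embedding = new ArrayList<>();
    embedding.add(v.id);
    for (String key : projectionKeys) {
      embedding.add(v.properties.get(key));
    }
    return Stream.of(embedding);
  }

  static List<List<Object>> filterAndProject(
      List<Vertex> vertices, String label, String name, List<String> projectionKeys) {
    return vertices.stream()
        .flatMap(v -> toEmbedding(v, label, name, projectionKeys))
        .collect(Collectors.toList());
  }

  // Made-up sample data; the ids are not taken from the slides.
  static List<Vertex> sample() {
    return Arrays.asList(
        new Vertex(1, "Hobbit").with("name", "Frodo Baggins").with("height", 1.22),
        new Vertex(2, "Hobbit").with("name", "Samwise"),
        new Vertex(3, "Orc").with("name", "Frodo Baggins"));
  }

  public static void main(String[] args) {
    // Empty projection: only the vertex id is kept in the embedding.
    System.out.println(filterAndProject(
        sample(), "Hobbit", "Frodo Baggins", Collections.<String>emptyList()));
    // Projection on two property keys.
    System.out.println(filterAndProject(
        sample(), "Hobbit", "Frodo Baggins", Arrays.asList("name", "height")));
  }
}
```

Doing the filter and the projection in a single pass means that vertices failing the predicate never produce an embedding, and unused properties are dropped before any join sees them.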

SLIDE 24

Implementation

JoinEmbeddings

$L \bowtie_{o1.id} R$

Left: (c1:Clan)<-[:hasLeader]-(o1:Orc)
Right: (o1:Orc)-[:hates]->(o2:Orc)

Left input (DataSet<Embedding>):

c.id  _e1.id  o1.id
51    11      2
52    12      3
…     …       …

Right input (DataSet<Embedding>):

o1.id  _e2.id  o2.id
2      13      5
3      14      3
…      …       …

FlatJoin(lhs, rhs -> combine(lhs, rhs))

Output (DataSet<Embedding>):

c.id  _e1.id  o1.id  _e2.id  o2.id
51    11      2      13      5
52    12      3      14      3
…     …       …      …       …

Combine
  • Check for vertex/edge isomorphism
  • Remove duplicate entries
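
The join on `o1.id` can be sketched as follows, under stated assumptions: embeddings are plain `long[]` rows, a nested loop replaces Flink's distributed join, and the class and method names (`JoinEmbeddingsSketch`, `join`, `distinct`) are hypothetical. The input rows are the sample values from the tables on this slide. "Remove duplicate entries" is read here as keeping the join column only once, and the morphism check is reduced to an optional test that the three vertex columns hold distinct ids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: nested-loop join of two embedding tables on o1.id.
class JoinEmbeddingsSketch {

  // Left rows: [c.id, _e1.id, o1.id]; right rows: [o1.id, _e2.id, o2.id].
  static List<long[]> join(List<long[]> left, List<long[]> right, boolean vertexIsomorphism) {
    List<long[]> result = new ArrayList<>();
    for (long[] l : left) {
      for (long[] r : right) {
        if (l[2] != r[0]) {
          continue; // join condition: both sides must agree on o1.id
        }
        // Combine both rows, keeping the join column o1.id only once.
        long[] combined = {l[0], l[1], l[2], r[1], r[2]};
        if (vertexIsomorphism && !distinct(combined[0], combined[2], combined[4])) {
          continue; // two vertex variables would map to the same vertex
        }
        result.add(combined);
      }
    }
    return result;
  }

  // True if no id occurs more than once.
  static boolean distinct(long... ids) {
    Set<Long> seen = new HashSet<>();
    for (long id : ids) {
      if (!seen.add(id)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<long[]> left = Arrays.asList(new long[] {51, 11, 2}, new long[] {52, 12, 3});
    List<long[]> right = Arrays.asList(new long[] {2, 13, 5}, new long[] {3, 14, 3});
    for (long[] row : join(left, right, false)) {
      System.out.println("homomorphism: " + Arrays.toString(row));
    }
    for (long[] row : join(left, right, true)) {
      System.out.println("isomorphism:  " + Arrays.toString(row));
    }
  }
}
```

Without the distinctness test both sample rows survive. With it, the row where `o1.id` and `o2.id` are both 3 is dropped, which is the kind of match a vertex-isomorphism check is meant to exclude.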

SLIDE 25

Implementation

SLIDE 26

Benchmark

SLIDE 27

Benchmark

  • 16x Intel(R) Xeon(R) 2.50GHz 6 (12), 48 GB RAM
  • 1 Gigabit Ethernet
  • Hadoop 2.6.0
  • Flink 1.1.2

Dataset       # Vertices  # Edges  Disk size
LDBC-SNB 10   29 M        167 M    19 GB
LDBC-SNB 100  271 M       1.6 B    191 GB

[Figures: queries Q2 and Q6]


SLIDE 28

Summary

  • Gradoop & Extended Property Graph Model
      • Schema flexible: Type Labels and Properties
      • Logical Graphs / Graph Collections
  • Cypher Pattern Matching Operator
      • Flexible operator for computing matches
      • Combination with existing analytical operators
      • Extensible architecture (planner, statistics, …)
  • Implemented on Apache Flink
      • Horizontal scalability
      • Combine with other Flink libraries

SLIDE 29

www.gradoop.com

  • Junghanns, M.; Kießling, M.; Averbuch, A.; Petermann, A.; Rahm, E., “Cypher-based Graph Pattern Matching in Gradoop”, Proc. ACM SIGMOD Workshop on Graph Data Management Experiences and Systems (GRADES), 2017.
  • Petermann, A.; Junghanns, M.; Kemper, S.; Gomez, K.; Teichmann, N.; Rahm, E., “Graph Mining for Complex Data Analytics”, Proc. ICDM Conf. (Demo), 2016.
  • Junghanns, M.; Petermann, A.; Teichmann, N.; Gomez, K.; Rahm, E., “Analyzing Extended Property Graphs with Apache Flink”, Int. Workshop on Network Data Analytics (NDA), SIGMOD, 2016.
  • Petermann, A.; Junghanns, M., “Scalable Business Intelligence with Graph Collections”, it – Special Issue on Big Data Analytics, 2016.
  • Petermann, A.; Junghanns, M.; Müller, M.; Rahm, E., “Graph-based Data Integration and Business Intelligence with BIIIG”, Proc. VLDB Conf. (Demo), 2014.