SLIDE 1

Extended Property Graphs and Cypher on Gradoop

1st openCypher Implementers Meeting 8 February 2017 Walldorf, Germany

Martin Junghanns

University of Leipzig – Database Research Group

SLIDE 2

Gradoop

„An open-source graph dataflow system for declarative analytics of heterogeneous graph data.“

SLIDE 3

Gradoop

Architecture (diagram labels):

  • Graph Dataflow Operators
  • Extended Property Graph Model (EPGM)
  • Apache Flink Operator Implementation
  • Distributed Operator Execution (Apache Flink)
  • Distributed Graph Storage (Apache HDFS)
  • I/O

SLIDE 4

Extended Property Graphs

  • Vertices and directed Edges
  • Logical Graphs
  • Identifiers
  • Type Labels
  • Properties

Example graph (figure; vertex, edge, and graph identifiers omitted):

Vertices

  • Hobbit {name: Samwise}
  • Orc {name: Azog}
  • Clan {name: Tribes of Moria, founded: 1981}
  • Orc {name: Bolg}
  • Hobbit {name: Frodo, yob: 2968}

Edges

  • leaderOf {since: 2790}
  • memberOf {since: 2013}
  • hates {since: 2301}
  • hates
  • knows {since: 2990}

Logical graphs

  • Area {title: Mordor}
  • Area {title: Shire}

SLIDE 5

Extended Property Graphs

Graph Operators/Transformations

Graph Collection

  • Unary: Limit, Selection, Pattern Matching, Distinct, Apply, Reduce, Call
  • Binary: Equality, Union, Intersection, Difference

Logical Graph

  • Unary: Aggregation, Pattern Matching, Transformation, Grouping, Call, Subgraph
  • Binary: Equality, Combination, Overlap, Exclusion, Fusion

SLIDE 6

Extended Property Graphs

Subgraph

(figure: input logical graph 3, vertex and edge filter UDFs, output logical graph 4)

LogicalGraph graph3 = readFromHDFS();
LogicalGraph graph4 = graph3.subgraph(
    vertex -> vertex.getLabel().equals("Green"),
    edge -> edge.getLabel().equals("orange"));
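The subgraph operator can be illustrated without a cluster. The following is a minimal in-memory sketch, not Gradoop code: the classes `Vertex`, `Edge`, and `Graph`, the method names, and the sample data are all assumptions made for illustration. It also assumes that an edge is dropped when one of its endpoints is filtered out, which the slide does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical in-memory stand-in for the subgraph operator; not Gradoop code.
class SubgraphSketch {
    static final class Vertex {
        final long id;
        final String label;
        Vertex(long id, String label) { this.id = id; this.label = label; }
    }

    static final class Edge {
        final long id;
        final String label;
        final long sourceId;
        final long targetId;
        Edge(long id, String label, long sourceId, long targetId) {
            this.id = id; this.label = label; this.sourceId = sourceId; this.targetId = targetId;
        }
    }

    static final class Graph {
        final List<Vertex> vertices;
        final List<Edge> edges;
        Graph(List<Vertex> vertices, List<Edge> edges) { this.vertices = vertices; this.edges = edges; }
    }

    // Keeps vertices and edges that satisfy the predicates. Assumption of this
    // sketch: an edge is also dropped when one of its endpoints was filtered out.
    static Graph subgraph(Graph g, Predicate<Vertex> vertexFilter, Predicate<Edge> edgeFilter) {
        List<Vertex> vertices = new ArrayList<>();
        Set<Long> keptIds = new HashSet<>();
        for (Vertex v : g.vertices) {
            if (vertexFilter.test(v)) { vertices.add(v); keptIds.add(v.id); }
        }
        List<Edge> edges = new ArrayList<>();
        for (Edge e : g.edges) {
            if (edgeFilter.test(e) && keptIds.contains(e.sourceId) && keptIds.contains(e.targetId)) {
                edges.add(e);
            }
        }
        return new Graph(vertices, edges);
    }

    // Made-up input: two Green vertices, one Orange vertex, three edges.
    static Graph example() {
        Graph graph3 = new Graph(
            Arrays.asList(new Vertex(1, "Green"), new Vertex(2, "Green"), new Vertex(3, "Orange")),
            Arrays.asList(new Edge(10, "orange", 1, 2), new Edge(11, "orange", 2, 3),
                          new Edge(12, "blue", 1, 2)));
        return subgraph(graph3, v -> v.label.equals("Green"), e -> e.label.equals("orange"));
    }

    public static void main(String[] args) {
        Graph graph4 = example();
        System.out.println(graph4.vertices.size() + " vertices, " + graph4.edges.size() + " edges");
    }
}
```

In the sample data, edge 11 satisfies the edge predicate but points to an Orange vertex, so the sketch removes it together with the blue edge.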

SLIDE 7

Extended Property Graphs

Pattern Matching (Single Graph Input)

(figure: input logical graph 3, the pattern, and the resulting graph collection)

GraphCollection collection = graph3.match("(:Green)-[:orange]->(:Orange)");

SLIDE 8

Cypher on Gradoop

„Which two clan leaders hate each other and one of them knows Frodo over one to ten hops?“

SLIDE 9

Cypher on Gradoop

SLIDE 10

Cypher on Gradoop

PlanTableEntry | type: GRAPH | all-vars: [...] | proc-vars: [...] | attr-vars: [] | est-card: 23 | prediates: () | Plan :
|-FilterEmbeddingsNode{filterPredicate=((c1 != c2) AND (o1 != o2))}
|.|-JoinEmbeddingsNode{joinVariables=[o2], vertexMorphism=H, edgeMorphism=I}
|.|.|-JoinEmbeddingsNode{joinVariables=[o1], vertexMorphism=H, edgeMorphism=I}
|.|.|.|-JoinEmbeddingsNode{joinVariables=[c1], vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=c1, filterPredicate=((c1.label = Clan)), projectionKeys=[]}
|.|.|.|.|-FilterAndProjectEdgesNode{sourceVar='o1', edgeVar='_e0', targetVar='c1', filterPredicate=((_e0.label = leaderOf)), projectionKeys=[]}
|.|.|.|-JoinEmbeddingsNode{joinVariables=[o1], vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=o1, filterPredicate=((o1.label = Orc)), projectionKeys=[]}
|.|.|.|.|-FilterAndProjectEdgesNode{sourceVar='o1', edgeVar='_e1', targetVar='o2', filterPredicate=((_e1.label = hates)), projectionKeys=[]}
|.|.|-JoinEmbeddingsNode{joinVariables=[o2], vertexMorphism=H, edgeMorphism=I}
|.|.|.|-JoinEmbeddingsNode{joinVariables=[h], vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=h, filterPredicate=((h.label = Hobbit) AND (h.name = Frodo Baggins)), projectionKeys=[]}
|.|.|.|.|-ExpandEmbeddingsNode={startVar='o2', pathVar='_e3', endVar='h', lb=1, ub=10, direction=OUT, vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=o2, filterPredicate=((o2.label = Orc)), projectionKeys=[]}
|.|.|.|.|.|-FilterAndProjectEdgesNode{sourceVar='o2', edgeVar='_e3', targetVar='h', filterPredicate=((_e3.label = knows)), projectionKeys=[]}
|.|.|.|-JoinEmbeddingsNode{joinVariables=[c2], vertexMorphism=H, edgeMorphism=I}
|.|.|.|.|-FilterAndProjectVerticesNode{vertexVar=c2, filterPredicate=((c2.label = Clan)), projectionKeys=[]}
|.|.|.|.|-FilterAndProjectEdgesNode{sourceVar='o2', edgeVar='_e2', targetVar='c2', filterPredicate=((_e2.label = leaderOf)), projectionKeys=[]}

SLIDE 11

Cypher on Gradoop

Embedding: data structure used for intermediate results

  • Identifiers
  • Properties (figure values: Frodo Baggins, 1.22, Saruman)
  • Paths (figure value: 45: [4,1,33])

EmbeddingMetaData: stores information about the embedding content

  • Mapping: Variable -> ID Column, {h: 0, e1: 1, o2: 5, ...}
  • Mapping: Variable.Property -> Property Column, {h.name: 0, h.height: 1, c1.name: 2, ...}
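To make the split between embedding data and metadata concrete, here is a minimal sketch of the idea. The class names, the list-based layout, and the lookup helpers `getId` and `getProperty` are assumptions for illustration; the slide does not show how the actual embedding is laid out in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified embedding layout; names and types are assumptions.
class EmbeddingSketch {
    // Maps query variables to ID columns and variable.property keys to property columns.
    static final class MetaData {
        final Map<String, Integer> idColumns = new HashMap<>();
        final Map<String, Integer> propertyColumns = new HashMap<>();
    }

    // One intermediate result: identifiers, property values, and paths.
    static final class Embedding {
        final List<Long> ids = new ArrayList<>();
        final List<Object> properties = new ArrayList<>();
        final List<long[]> paths = new ArrayList<>();
    }

    // Resolves a variable to its column via the metadata, then reads the ID.
    static long getId(Embedding e, MetaData m, String variable) {
        return e.ids.get(m.idColumns.get(variable));
    }

    // Resolves variable.key to its column via the metadata, then reads the value.
    static Object getProperty(Embedding e, MetaData m, String variable, String key) {
        return e.properties.get(m.propertyColumns.get(variable + "." + key));
    }

    static MetaData exampleMetaData() {
        MetaData m = new MetaData();
        m.idColumns.put("h", 0);
        m.idColumns.put("o2", 1);
        m.propertyColumns.put("h.name", 0);
        m.propertyColumns.put("h.height", 1);
        return m;
    }

    // Made-up values; only the column positions matter for the lookup.
    static Embedding exampleEmbedding() {
        Embedding e = new Embedding();
        e.ids.addAll(Arrays.asList(31L, 5L));
        e.properties.add("Frodo Baggins");
        e.properties.add(1.22);
        e.paths.add(new long[] {26L, 31L, 27L});
        return e;
    }

    public static void main(String[] args) {
        MetaData m = exampleMetaData();
        Embedding e = exampleEmbedding();
        System.out.println(getId(e, m, "h") + " " + getProperty(e, m, "h", "name"));
    }
}
```

Keeping the variable-to-column mapping outside the embedding means each row carries only values, while a single metadata object describes all rows produced by one operator.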

SLIDE 12

Cypher on Gradoop

Filter: Hobbit(name=Frodo Baggins)

Example vertex properties: name: Frodo Baggins, height: 1.22m, gender: male, city: Bag End

Project: [ ]

DataSet<Vertex> -> FlatMap(Vertex -> Embedding) -> DataSet<Embedding>

Input, DataSet<Vertex>:

id    Properties
1     {…}
2     {…}
3     {…}
…     …

Output, DataSet<Embedding> (figure tables):

h.id  h.name  h.height  …
31    Frodo   1.22      …

h.id
32

$\sigma_{Label = Hobbit \wedge name = Frodo}(V)$

$\pi_{h.Id}(V')$
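The filter and project step amounts to one pass over the vertices that emits zero or one row per vertex. The sketch below shows that pass on plain Java collections rather than a Flink DataSet; the class and method names, the row representation as a list of values, and the sample vertices are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of filter + project in a single pass; not Gradoop code.
class FilterAndProjectSketch {
    static final class Vertex {
        final long id;
        final String label;
        final Map<String, Object> properties = new HashMap<>();
        Vertex(long id, String label) { this.id = id; this.label = label; }
        Vertex with(String key, Object value) { properties.put(key, value); return this; }
    }

    // Emits one row per matching vertex: the vertex ID followed by the
    // projected property values. Rejected vertices emit nothing.
    static List<List<Object>> filterAndProject(List<Vertex> vertices, Predicate<Vertex> filter,
                                               List<String> projectionKeys) {
        List<List<Object>> rows = new ArrayList<>();
        for (Vertex v : vertices) {
            if (!filter.test(v)) {
                continue;
            }
            List<Object> row = new ArrayList<>();
            row.add(v.id);
            for (String key : projectionKeys) {
                row.add(v.properties.get(key));
            }
            rows.add(row);
        }
        return rows;
    }

    // Made-up sample vertices.
    static List<Vertex> exampleVertices() {
        return Arrays.asList(
            new Vertex(31, "Hobbit").with("name", "Frodo Baggins").with("height", 1.22),
            new Vertex(32, "Hobbit").with("name", "Samwise"),
            new Vertex(33, "Orc").with("name", "Azog"));
    }

    // Predicate corresponding to Hobbit(name=Frodo Baggins).
    static Predicate<Vertex> frodoFilter() {
        return v -> v.label.equals("Hobbit") && "Frodo Baggins".equals(v.properties.get("name"));
    }

    public static void main(String[] args) {
        System.out.println(filterAndProject(exampleVertices(), frodoFilter(),
                                            Arrays.asList("name", "height")));
    }
}
```

With an empty list of projection keys, each emitted row holds only the vertex ID, which keeps intermediate results small when later operators do not need any property.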

SLIDE 13

Cypher on Gradoop

JoinEmbeddings

Left: (c1:Clan)<-[:leaderOf]-(o1:Orc)
Right: (o1:Orc)-[:hates]->(o2:Orc)

$L \bowtie_{o1.id} R$

Left input, DataSet<Embedding>:

c.id  _e1.id  o1.id
51    11      2
52    12      3
…     …       …

Right input, DataSet<Embedding>:

o1.id  _e2.id  o2.id
2      13      5
3      14      3
…      …       …

FlatJoin(lhs, rhs -> combine(lhs, rhs))

Output, DataSet<Embedding>:

c.id  _e1.id  o1.id  _e2.id  o2.id
51    11      2      13      5
52    12      3      14      3
…     …       …      …       …

Combine: check for vertex/edge isomorphism, remove duplicate entries
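The join and the morphism check in the combine step can be sketched as a hash join over rows keyed by query variable. This is an illustrative sketch under assumptions: rows are modeled as `Map<String, Long>`, the names `join`, `distinctVertices`, `left`, and `right` are made up, and only the vertex isomorphism check is shown (edge checks and duplicate removal are left out). The sample rows reuse the values from the tables above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of joining embeddings on a shared variable; not Gradoop code.
class JoinEmbeddingsSketch {
    static Map<String, Long> row(String k1, long v1, String k2, long v2, String k3, long v3) {
        Map<String, Long> r = new LinkedHashMap<>();
        r.put(k1, v1);
        r.put(k2, v2);
        r.put(k3, v3);
        return r;
    }

    // Vertex isomorphism: distinct vertex variables must bind distinct vertices.
    static boolean distinctVertices(Map<String, Long> row, Set<String> vertexVars) {
        Set<Long> seen = new HashSet<>();
        for (String variable : vertexVars) {
            Long id = row.get(variable);
            if (id != null && !seen.add(id)) {
                return false;
            }
        }
        return true;
    }

    // Hash join on joinVar; combined rows failing the optional check are dropped.
    static List<Map<String, Long>> join(List<Map<String, Long>> left, List<Map<String, Long>> right,
                                        String joinVar, Set<String> vertexVars,
                                        boolean vertexIsomorphism) {
        Map<Long, List<Map<String, Long>>> index = new HashMap<>();
        for (Map<String, Long> r : right) {
            index.computeIfAbsent(r.get(joinVar), k -> new ArrayList<>()).add(r);
        }
        List<Map<String, Long>> out = new ArrayList<>();
        for (Map<String, Long> l : left) {
            List<Map<String, Long>> matches = index.get(l.get(joinVar));
            if (matches == null) {
                matches = Collections.emptyList();
            }
            for (Map<String, Long> r : matches) {
                Map<String, Long> combined = new LinkedHashMap<>(l);
                combined.putAll(r);
                if (!vertexIsomorphism || distinctVertices(combined, vertexVars)) {
                    out.add(combined);
                }
            }
        }
        return out;
    }

    static List<Map<String, Long>> left() {
        return Arrays.asList(row("c1", 51, "_e1", 11, "o1", 2), row("c1", 52, "_e1", 12, "o1", 3));
    }

    static List<Map<String, Long>> right() {
        return Arrays.asList(row("o1", 2, "_e2", 13, "o2", 5), row("o1", 3, "_e2", 14, "o2", 3));
    }

    public static void main(String[] args) {
        Set<String> vertexVars = new HashSet<>(Arrays.asList("c1", "o1", "o2"));
        System.out.println(join(left(), right(), "o1", vertexVars, false));
        System.out.println(join(left(), right(), "o1", vertexVars, true));
    }
}
```

Under homomorphism both joined rows survive. With the vertex isomorphism check enabled, the second row is dropped in this sketch because `o1` and `o2` both bind vertex 3.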

SLIDE 14

Cypher on Gradoop

ExpandEmbeddings

Left: (o2:Orc)
Edge: (o2)-[:knows*1..10]->(h)

Left input, DataSet<Embedding>:

o2.id
5

Edge input, DataSet<Embedding>:

_e3.sid  _e3.id  _e3.tid
5        26      31
31       27      32
32       28      33

Output, DataSet<Embedding>:

o2.id  _e3.id              h.id
3      [26]                31
3      [26,31,27]          32
3      [26,31,27,32,28]    33

BulkIteration

$L \bowtie_{o2.id = \_e3.sid} E$

$E' \bowtie_{e.tid = \_e3.sid} E$

FlatJoin(lhs, rhs -> combine(lhs, rhs))

Combine: check for vertex/edge isomorphism
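The iterative expansion of a variable-length path can be sketched as a loop that extends the current frontier of paths by one edge per round. This is an in-memory illustration, not the Flink BulkIteration itself: the classes `Edge` and `Path`, the nested-loop join, and the edge isomorphism rule (no edge used twice within a path) are assumptions of the sketch. The edge values reuse the edge table above, with vertex 5 as the start.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of expanding paths hop by hop; not Gradoop code.
class ExpandEmbeddingsSketch {
    static final class Edge {
        final long sourceId;
        final long id;
        final long targetId;
        Edge(long sourceId, long id, long targetId) {
            this.sourceId = sourceId; this.id = id; this.targetId = targetId;
        }
    }

    static final class Path {
        final long startId;
        final List<Long> via; // alternating edge ID, vertex ID, edge ID, ...
        final long endId;
        Path(long startId, List<Long> via, long endId) {
            this.startId = startId; this.via = via; this.endId = endId;
        }
        // Edge IDs sit at the even positions of via.
        boolean containsEdge(long edgeId) {
            for (int i = 0; i < via.size(); i += 2) {
                if (via.get(i) == edgeId) {
                    return true;
                }
            }
            return false;
        }
    }

    // Returns all paths from startId with between lb and ub hops (lb >= 1).
    static List<Path> expand(long startId, List<Edge> edges, int lb, int ub) {
        List<Path> result = new ArrayList<>();
        List<Path> frontier = new ArrayList<>();
        frontier.add(new Path(startId, new ArrayList<Long>(), startId));
        for (int hop = 1; hop <= ub && !frontier.isEmpty(); hop++) {
            List<Path> next = new ArrayList<>();
            for (Path p : frontier) {
                for (Edge e : edges) {
                    if (e.sourceId != p.endId || p.containsEdge(e.id)) {
                        continue; // edge does not continue the path, or was already used
                    }
                    List<Long> via = new ArrayList<>(p.via);
                    if (!via.isEmpty()) {
                        via.add(p.endId); // previous end becomes an intermediate vertex
                    }
                    via.add(e.id);
                    next.add(new Path(startId, via, e.targetId));
                }
            }
            if (hop >= lb) {
                result.addAll(next);
            }
            frontier = next;
        }
        return result;
    }

    static List<Edge> exampleEdges() {
        return Arrays.asList(new Edge(5, 26, 31), new Edge(31, 27, 32), new Edge(32, 28, 33));
    }

    public static void main(String[] args) {
        for (Path p : expand(5, exampleEdges(), 1, 10)) {
            System.out.println(p.startId + " " + p.via + " " + p.endId);
        }
    }
}
```

The loop stops early once the frontier is empty, so an upper bound of ten hops costs nothing extra when the longest path in the data is shorter.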

SLIDE 15

Cypher on Gradoop

SLIDE 16

Cypher on Gradoop

SLIDE 17

Cypher on Gradoop

SLIDE 18

Conclusion

  • Implement Cypher Technology Compatibility Kit (TCK) integration tests
  • Benchmarking
      • Implement and evaluate LDBC benchmarking queries
  • Optimizations
      • DP-Planner
      • Improve cost model (more statistics, Flink optimizer hints)
      • Reuse of intermediate results
      • Consider graph partitioning
  • Support more Cypher features
      • e.g. aggregation and functions
  • Introduce new Cypher features
      • e.g. regular path queries

SLIDE 19

Conclusion

  • Gradoop on Apache Flink
      • Extended Property Graph abstraction on Apache Flink
      • Schema flexible: Type Labels and Properties
      • Logical Graphs / Graph Collections
      • Graph Transformations for Graphs and Graph Collections
  • Cypher on Gradoop
      • Covering many Cypher features (variable length paths, predicates)
      • Query execution engine incl. greedy cost-based optimizer
      • Physical operators mapped to Flink transformations

SLIDE 20

www.gradoop.com
