Graph Exploration w/ Neo4j


slide-1
SLIDE 1

Graph Exploration w/ Neo4j

1

https://s3.amazonaws.com/dev.assets.neo4j.com/wp-content/uploads/graph-data-technologies-graph-databases-for-beginners.png

slide-2
SLIDE 2

Our Project Partners

2

slide-3
SLIDE 3

3

GRAPH EXPLORATION

Efficiently extracting knowledge from graph data

even if we do not know exactly what we are looking for

Graph Exploration: From Users to Large Graphs. CIKM 2016, SIGMOD 2017, KDD 2018

slide-4
SLIDE 4

Adaptive Databases

GRAPH EXPLORATION

4

Graph Exploration Stack

Intuitive queries Interactive algorithms

Users Graph

slide-5
SLIDE 5

Adaptive Databases Intuitive queries Interactive algorithms

GRAPH EXPLORATION

5

Graph Exploration Stack

Users Graph

This project is about...

slide-6
SLIDE 6

6

3.27 × 10^9 base pairs

Graph Exploration in Biology - Complex Graphs

http://jcs.biologists.org/content/joces/118/21/4947/F3.large.jpg

slide-7
SLIDE 7

Graph Exploration in Biology - Status Quo

7

MATCH (p1:Phenotype)-[:HAS]-(a1:Association)-[:HAS]-(snp:Snp)-[:HAS]-(a2:Association)-[:HAS]-(p2:Phenotype)
WHERE p1.name = 'foo1' AND p2.name = 'foo2' AND a1.p < 0.01 AND a2.p < 0.01
WITH DISTINCT snp ORDER BY snp.sid
RETURN collect(snp.sid)

MATCH (p1:Phenotype)-[:HAS]-(a1:Association)-[:HAS]-(snp:Snp)-[:HAS]-(a2:Association)-[:HAS]-(p2:Phenotype)
WHERE p1.name = 'foo1' AND p2.name = 'foo2' AND a1.p < 0.01 AND a2.p < 0.01
WITH DISTINCT snp
MATCH (snp)-[:IN]-(pw:PositionWindow)<-[:IN]-(l:Locus)--(g:Gene)
WHERE l.feature = 'gene'
RETURN collect(DISTINCT g.name)

MATCH (p1:Phenotype)-[:HAS]-(a1:Association)-[:HAS]-(snp:Snp)-[:HAS]-(a2:Association)-[:HAS]-(p2:Phenotype)
WHERE p1.name = 'foo1' AND p2.name = 'foo2' AND a1.p < 0.01 AND a2.p < 0.01
WITH DISTINCT snp
MATCH (snp)-[:IN]-(pw:PositionWindow)<-[:IN]-(l:Locus)--(g:Gene)
WHERE l.feature = 'gene'
WITH DISTINCT g ORDER BY g.name
MATCH (g)-[:CODES]-(:Transcript)-[:CODES]-(p:Protein)-[:MEMBER]-(go:Goterm)
WHERE go.namespace = 'biological_process'
WITH DISTINCT go, p
RETURN go.name, count(p) ORDER BY count(p) DESC LIMIT 10

MATCH (p1:Phenotype)-[:HAS]-(a1:Association)-[:HAS]-(snp:Snp)-[:HAS]-(a2:Association)-[:HAS]-(p2:Phenotype)
WHERE p1.name = 'foo1' AND p2.name = 'foo2' AND a1.p < 0.01 AND a2.p < 0.01
WITH DISTINCT snp
MATCH (snp)-[:IN]-(pw:PositionWindow)<-[:IN]-(l:Locus)--(g:Gene)
WHERE l.feature = 'gene'
WITH DISTINCT g ORDER BY g.name
MATCH (g)-[:CODES]-(:Transcript)-[:IS]-(ps:Probeset)-[:SIG]-(s:Sample)
WHERE s.name = 'mustafavi'
RETURN DISTINCT g.name

slide-8
SLIDE 8

Can we do better?

8

slide-9
SLIDE 9

Problem

  • Given two node sets:

How similar are they in my understanding?

  • Example

→ Set of movies I like
→ Set of movies I don’t know
→ Will I like the movies I don’t know?

9

As Good As It Gets, Hell or High Water, Pulp Fiction, The Matrix, Skyfall, Avatar

?

slide-10
SLIDE 10

What is a Knowledge Graph?

  • A (directed) graph G = ⟨V, E, φ, ψ⟩, where

V is a set of nodes,

E ⊆ V × V is a set of edges,

φ : V → L_V is a node labeling function, and

ψ : E → L_E is an edge labeling function.

We refer to the elements of L_V and L_E as node labels and edge labels, respectively.
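As a minimal sketch of this definition (an illustration, not the authors' implementation), the graph can be held in two Python dictionaries: one maps each node to its label (the map V → L_V), the other maps each directed edge to its label (the map E → L_E). The class name `KnowledgeGraph`, its methods, and the label names `Person`, `Movie` and `ACTED_IN` are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Directed labeled graph <V, E, phi, psi> (hypothetical helper class)."""
    phi: dict = field(default_factory=dict)  # node -> node label
    psi: dict = field(default_factory=dict)  # (source, target) -> edge label

    def add_node(self, node, label):
        self.phi[node] = label

    def add_edge(self, source, target, label):
        # E is a subset of V x V, so both endpoints must already be nodes.
        if source not in self.phi or target not in self.phi:
            raise KeyError("both endpoints must be added as nodes first")
        self.psi[(source, target)] = label

    @property
    def nodes(self):
        return set(self.phi)

    @property
    def edges(self):
        return set(self.psi)


# Small example graph; the label names are illustrative assumptions.
kg = KnowledgeGraph()
kg.add_node("Keanu Reeves", "Person")
kg.add_node("The Matrix", "Movie")
kg.add_edge("Keanu Reeves", "The Matrix", "ACTED_IN")
```

Storing the labeling functions as dictionaries keeps V and E implicit: the node set is the key set of `phi`, and the edge set is the key set of `psi`.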

10

slide-11
SLIDE 11

What are Meta-Paths?

Graph Path Graph Schema Meta-Path

A meta-path for a path ⟨n_1, ..., n_t⟩, n_i ∈ V, 1 ≤ i ≤ t, is a sequence P : ⟨φ(n_1), ψ(n_1, n_2), ..., ψ(n_{t−1}, n_t), φ(n_t)⟩ that alternates node and edge types along the path.
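The definition can be read as a small procedure: start with the label of the first node, then append an edge label and a node label for every step along the path. The sketch below follows that reading; the function name `meta_path`, the dictionary-based labeling functions, and the example labels are assumptions for illustration.

```python
def meta_path(path, phi, psi):
    """Return the alternating node-/edge-label sequence for a path of nodes.

    path: list of nodes [n_1, ..., n_t]
    phi:  dict mapping node -> node label
    psi:  dict mapping (source, target) -> edge label
    """
    if not path:
        return []
    labels = [phi[path[0]]]
    # Each consecutive pair of nodes contributes one edge label and one node label.
    for source, target in zip(path, path[1:]):
        labels.append(psi[(source, target)])
        labels.append(phi[target])
    return labels


# Illustrative labels (assumed, not taken from the slides).
phi = {"Keanu Reeves": "Person", "The Matrix": "Movie", "Science Fiction": "Genre"}
psi = {
    ("Keanu Reeves", "The Matrix"): "ACTED_IN",
    ("The Matrix", "Science Fiction"): "HAS_GENRE",
}
example = meta_path(["Keanu Reeves", "The Matrix", "Science Fiction"], phi, psi)
```

Many different paths in the graph map to the same meta-path, which is why meta-paths are a compact way to describe how two nodes are connected.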

11

slide-12
SLIDE 12

Motivating Example

Q: How famous is Diane Kruger in America?

MATCH (n:Person) WHERE n.name = "Diane Kruger" RETURN n

MATCH (m:Movie) WHERE m.location = "America" RETURN m

Diane Kruger

As Good As It Gets, Stand By Me, Top Gun, Pulp Fiction, A Few Good Men, The Matrix, Up

12

slide-13
SLIDE 13

How similar are they?

  • Similarity depends on

○ expert knowledge
○ connections among nodes

13

slide-14
SLIDE 14

What does the System do and how?

Overview

Compute Meta-Paths
Learn representation for meta-paths
Extract ratings
Calculate similarity
Individualized exploration

14

slide-15
SLIDE 15

Approximate Meta-Paths

Problem: How to compute all meta-paths fast?

  • Approx. Solution: Mine meta-paths using the graph’s schema and learn a classifier on real meta-paths
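The schema-based part of this idea can be sketched as a bounded walk over the schema graph: every walk of at most k edges yields one candidate meta-path. The slides do not describe the actual mining procedure or the classifier, so the function `schema_meta_paths`, the directed triple format, and the example schema below are assumptions for illustration only. The classifier step is not covered.

```python
def schema_meta_paths(schema_edges, max_edges):
    """Enumerate candidate meta-paths by walking a schema graph.

    schema_edges: iterable of (source_label, edge_label, target_label) triples
    max_edges:    maximum number of edges per meta-path
    """
    schema_edges = list(schema_edges)
    outgoing = {}
    for source, edge, target in schema_edges:
        outgoing.setdefault(source, []).append((edge, target))

    results = []

    def extend(sequence, remaining):
        # Record every sequence that contains at least one edge.
        if len(sequence) > 1:
            results.append(tuple(sequence))
        if remaining == 0:
            return
        for edge, target in outgoing.get(sequence[-1], []):
            extend(sequence + [edge, target], remaining - 1)

    labels = {s for s, _, _ in schema_edges} | {t for _, _, t in schema_edges}
    for start in sorted(labels):
        extend([start], max_edges)
    return results


# Hypothetical movie schema.
schema = [
    ("Person", "ACTED_IN", "Movie"),
    ("Person", "DIRECTED", "Movie"),
    ("Movie", "HAS_GENRE", "Genre"),
]
candidates = schema_meta_paths(schema, max_edges=2)
```

Because the schema is usually far smaller than the data graph, this enumeration is cheap, but it can propose meta-paths that have no instance in the data, which is presumably where checking against real meta-paths comes in.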

Meta-Paths Computation

Compute schema

15

slide-16
SLIDE 16

Learning a Meta-Path Embedding

Problem: Vector representation required for active learning and preference prediction.

Meta-Paths Embedding

?

(3 5 1)^T

16

slide-17
SLIDE 17

Learning a Meta-Path Embedding

Problem: Vector representation required for active learning and preference prediction.
Solution: Embed meta-paths → similar meta-paths should have similar vectors.
Our method: Transfer a text embedding method to meta-paths.
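The slides do not say which text embedding method is transferred. To illustrate the general idea only, the sketch below swaps in the simplest text representation, a bag-of-words count vector, and treats each meta-path as a sentence whose words are its node and edge types. The function names and the example meta-paths are assumptions, and a learned embedding would replace `bag_of_types` in practice.

```python
import math
from collections import Counter


def bag_of_types(meta_path, vocabulary):
    """Count vector of node/edge types, analogous to a bag-of-words text vector."""
    counts = Counter(meta_path)
    return [counts[token] for token in vocabulary]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical meta-paths over a movie schema.
paths = [
    ("Person", "ACTED_IN", "Movie"),
    ("Person", "ACTED_IN", "Movie", "HAS_GENRE", "Genre"),
    ("Movie", "HAS_GENRE", "Genre"),
]
vocabulary = sorted({token for path in paths for token in path})
vectors = [bag_of_types(path, vocabulary) for path in paths]
```

With these vectors, the first meta-path is closer to the second (which extends it) than to the third (which shares only the `Movie` type), matching the requirement that similar meta-paths receive similar vectors.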

Meta-Paths Embedding

(3 5 1)^T

17

slide-18
SLIDE 18

Learn the Domain Value of all Meta-Paths

  • Problem: Users don’t want to rate all meta-paths

→ too many
→ time-consuming
→ tedious and boring

  • Solution: Label only a few, but very informative paths
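The slides do not state how the informative paths are chosen. One common active-learning strategy is uncertainty sampling: ask the user about the meta-paths for which the current model is least sure. The sketch below assumes a hypothetical `predict_probability` callable that returns the model's estimated probability that the user rates a meta-path as relevant. The function name and the example scores are made up for illustration.

```python
def most_informative(unlabeled, predict_probability, batch_size=1):
    """Pick the meta-paths whose predicted relevance is closest to 0.5.

    unlabeled:           iterable of meta-paths the user has not rated yet
    predict_probability: callable mapping a meta-path to a value in [0, 1]
    batch_size:          how many meta-paths to ask the user about next
    """
    ranked = sorted(unlabeled, key=lambda path: abs(predict_probability(path) - 0.5))
    return ranked[:batch_size]


# Made-up model outputs for three hypothetical meta-paths.
predicted = {
    ("Person", "ACTED_IN", "Movie"): 0.95,
    ("Person", "DIRECTED", "Movie"): 0.52,
    ("Movie", "HAS_GENRE", "Genre"): 0.10,
}
next_to_rate = most_informative(predicted, predicted.get)
```

A rating on a meta-path the model is already confident about changes little, while a rating near the decision boundary tells the model the most, so the user labels fewer paths overall.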

Active Learning

✓ ✗ ◎

18

slide-19
SLIDE 19

Use Learned Preferences for Graph Exploration

Result Explanation

Icons made by Eucalyp from www.flaticon.com, licensed under CC 3.0 BY

Graph (with meta-paths) Domain Knowledge

What is important in the graph?

Personalized Exploration Tool Similarity Measure Related Nodes Stats

19

slide-20
SLIDE 20

Personalized Node Embedding

Result Explanation

Transform Nodes to Vectors

(Graph-Embedding)

Adapt Vectors Using Domain-Knowledge

Personalized Vector Space

precomputed

20

slide-21
SLIDE 21

Personalized Exploration Tool

Applications

What nodes are close to my selection?
How close are my sets?
Find clusters!
What are outliers?

Personalized Vector Space
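Once every node has a vector, the first two questions reduce to distance computations. The sketch below assumes Euclidean distance and set centroids, which the slides do not specify. The function names and the 2-D vectors are invented for illustration and do not come from the system.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest_nodes(embedding, selection, k=3):
    """Rank nodes outside the selection by distance to the selection's centroid."""
    center = centroid([embedding[node] for node in selection])
    candidates = [node for node in embedding if node not in selection]
    return sorted(candidates, key=lambda node: math.dist(embedding[node], center))[:k]


def set_distance(embedding, set_a, set_b):
    """Distance between the centroids of two node sets (smaller means more similar)."""
    return math.dist(
        centroid([embedding[node] for node in set_a]),
        centroid([embedding[node] for node in set_b]),
    )


# Made-up 2-D vectors, for illustration only.
embedding = {
    "Pulp Fiction": [0.9, 0.1],
    "The Matrix": [0.8, 0.3],
    "Skyfall": [0.7, 0.2],
    "Avatar": [0.1, 0.9],
}
liked = {"Pulp Fiction", "The Matrix"}
closest = nearest_nodes(embedding, liked, k=1)
```

Comparing `set_distance(embedding, liked, {"Skyfall"})` with `set_distance(embedding, liked, {"Avatar"})` gives a rough answer to "how close are my sets?" for these toy vectors. Clustering and outlier detection would operate on the same vector space.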

21

slide-22
SLIDE 22

System Architecture - How does it work with Neo4j?

Neo4j Graph Database

Neo4j Graph Algorithm Procedures (containing Meta-Paths Computation)

Python Backend Server: Meta-Path Embedding, Active Learning, Explanation

ReactJS Frontend: Node selection, Meta-Path ordering, Result visualization

22

slide-23
SLIDE 23
What about Neo4j?

  • Easy to get your code running in Neo4j.
  • Neo4j-graph-algorithms: efficiency vs. convenience.
  • Sometimes no stack trace when an error occurs.
  • Great support and community. Always available.
  • Cypher: Easy to begin with, hard to master.

(hpi)-[:LIKES]->(neo4j)

Meta-Paths Computation

23

slide-24
SLIDE 24

Trending: #tweetyourthesis

24

slide-25
SLIDE 25

Trending: #tweetyourthesis

25