Graph Mining
Marco Serafini
COMPSCI 532 Lecture 11
Classes of Graph Systems
Graph Computation: think like a vertex; linear algebra
Graph Search: find instances of path expressions
Graph Mining: mine patterns of …
[Figure: input graph, pattern, embeddings]
Number of unique embeddings by embedding size (log-scale plot): exponential!

Size of embedding    # unique embeddings
1                    4K
2                    22K
3                    335K
4                    7.8M
5                    117M
6                    1.7B
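// Keep only embeddings that are cliques.
boolean filter(Embedding e) {
    return isClique(e);
}

// Called for each embedding that passes the filter.
void process(Embedding e) {
}

// Stop expanding once the embedding reaches the maximum size.
boolean shouldExpand(Embedding embedding) {
    return embedding.getNumVertices() < maxsize;
}

// A clique stays a clique only if the newly added vertex is
// connected to every vertex that was already in the embedding.
boolean isClique(Embedding e) {
    return e.getNumEdgesAddedWithExpansion() == e.getNumVertices() - 1;
}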
[Figure: exploration step i and exploration step i+1, each with an input and an output]
Input: set of initial embeddings …
Expand by 1 vertex/edge
Filter: false → discard uninteresting candidates; true → Process, Save
Output: input of exploration step i+1
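The expand / filter / process / save loop above can be sketched as follows. This is a minimal single-machine illustration, not the system's implementation: the class `ExplorationSketch`, the helpers `graph`, `expand`, and `step`, and the list-of-vertex-IDs representation of an embedding are all assumptions made for the example. The filter used here is a clique check in the spirit of the slide's `isClique`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Predicate;

class ExplorationSketch {
    // Hypothetical helper: undirected graph stored as adjacency sets.
    static Map<Integer, Set<Integer>> graph(int[][] edges) {
        Map<Integer, Set<Integer>> adj = new TreeMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new TreeSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new TreeSet<>()).add(e[0]);
        }
        return adj;
    }

    // Expand by 1 vertex: append each neighbour of the embedding not already in it.
    static List<List<Integer>> expand(Map<Integer, Set<Integer>> adj, List<Integer> e) {
        Set<Integer> candidates = new TreeSet<>();
        for (int v : e) candidates.addAll(adj.get(v));
        candidates.removeAll(e);
        List<List<Integer>> children = new ArrayList<>();
        for (int c : candidates) {
            List<Integer> child = new ArrayList<>(e);
            child.add(c);
            children.add(child);
        }
        return children;
    }

    // Clique filter: the newly added (last) vertex must be adjacent to all earlier ones.
    static boolean isClique(Map<Integer, Set<Integer>> adj, List<Integer> e) {
        int last = e.get(e.size() - 1);
        for (int i = 0; i < e.size() - 1; i++) {
            if (!adj.get(last).contains(e.get(i))) return false;
        }
        return true;
    }

    // One exploration step: the returned list is the input of step i+1.
    static List<List<Integer>> step(Map<Integer, Set<Integer>> adj,
                                    List<List<Integer>> input,
                                    Predicate<List<Integer>> filter,
                                    List<List<Integer>> processed) {
        List<List<Integer>> output = new ArrayList<>();
        for (List<Integer> e : input) {
            for (List<Integer> child : expand(adj, e)) {
                if (!filter.test(child)) continue; // discard uninteresting candidate
                processed.add(child);              // "process": here we only record it
                output.add(child);                 // "save" for the next step
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // Triangle 1-2-3 plus a pendant edge 3-4.
        Map<Integer, Set<Integer>> adj = graph(new int[][] {{1, 2}, {1, 3}, {2, 3}, {3, 4}});
        List<List<Integer>> current = new ArrayList<>();
        for (int v : adj.keySet()) current.add(List.of(v)); // initial embeddings
        List<List<Integer>> processed = new ArrayList<>();
        for (int i = 1; i <= 2; i++) {
            current = step(adj, current, e -> isClique(adj, e), processed);
            System.out.println("step " + i + ": " + current);
        }
    }
}
```

The sketch does no deduplication, so the triangle {1, 2, 3} comes out of the second step once per vertex order. That redundancy is what a canonicality check such as `isCanonical(e)` is meant to remove.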
[Figure: two workers reach the same embedding in different vertex orders]
Worker 1: 1 2 3, isCanonical(e) → true
Worker 2: 3 2 1, isCanonical(e) → false
1 2 3 == 3 2 1
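One way such a check can work is sketched below, under assumptions the slide does not state: embeddings are vertex-induced and stored as ordered lists of vertex IDs, and an embedding is canonical when it starts with its smallest vertex and each later vertex is added in increasing ID order relative to the vertices that follow its first neighbour. The names `CanonicalitySketch`, `graph`, and `canExtend` are hypothetical; only `isCanonical` appears in the source.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class CanonicalitySketch {
    // Hypothetical helper: undirected graph stored as adjacency sets.
    static Map<Integer, Set<Integer>> graph(int[][] edges) {
        Map<Integer, Set<Integer>> adj = new TreeMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new TreeSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new TreeSet<>()).add(e[0]);
        }
        return adj;
    }

    // Incremental rule (an assumption of this sketch): may vertex v be appended
    // to an embedding "parent" that is already canonical?
    static boolean canExtend(Map<Integer, Set<Integer>> adj, List<Integer> parent, int v) {
        if (parent.contains(v)) return false; // no repeated vertices
        if (parent.get(0) > v) return false;  // first vertex must be the smallest
        boolean foundNeighbour = false;
        for (int u : parent) {
            if (!foundNeighbour && adj.get(v).contains(u)) {
                foundNeighbour = true;        // first vertex of parent adjacent to v
            } else if (foundNeighbour && u > v) {
                return false;                 // a larger vertex was added after that neighbour
            }
        }
        return foundNeighbour;                // v must be connected to the embedding
    }

    // Check a whole embedding by replaying its expansions one vertex at a time.
    static boolean isCanonical(Map<Integer, Set<Integer>> adj, List<Integer> e) {
        for (int i = 1; i < e.size(); i++) {
            if (!canExtend(adj, e.subList(0, i), e.get(i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Integer, Set<Integer>> triangle = graph(new int[][] {{1, 2}, {1, 3}, {2, 3}});
        System.out.println(isCanonical(triangle, List.of(1, 2, 3)));
        System.out.println(isCanonical(triangle, List.of(3, 2, 1)));
    }
}
```

Because the check depends only on the embedding and the graph, each worker can apply it locally and drop non-canonical copies without coordinating with other workers.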
[Figure: an initial embedding (e) in the input graph and its expansions]
[Figure: embeddings mapped directly to a canonical pattern]
3x expensive graph canonization → canonical pattern
[Figure: embeddings mapped to quick patterns, then to a canonical pattern]
1) Quick patterns: 3x linear matching to quick pattern
2) Canonical pattern: 2x expensive graph canonization
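The two-level scheme can be sketched as below. Everything here is an illustrative assumption rather than the system's code: embeddings are unlabeled edge lists, a quick pattern is obtained by renaming vertices 0, 1, 2, … in order of first appearance (one linear scan), and the "expensive" canonization is a brute-force search over all vertex permutations. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PatternAggregationSketch {
    static int canonizations = 0; // counts calls to the expensive step

    // Level 1: linear scan that renames vertices by order of first appearance.
    static int[][] quickPattern(int[][] edges) {
        Map<Integer, Integer> pos = new HashMap<>();
        int[][] qp = new int[edges.length][2];
        for (int i = 0; i < edges.length; i++) {
            for (int j = 0; j < 2; j++) {
                Integer p = pos.get(edges[i][j]);
                if (p == null) {
                    p = pos.size();
                    pos.put(edges[i][j], p);
                }
                qp[i][j] = p;
            }
        }
        return qp;
    }

    // Level 2: brute-force canonization, smallest edge-list string over all
    // vertex permutations. Only meant for tiny patterns (single-digit vertex IDs).
    static String canonicalPattern(int[][] qp) {
        canonizations++;
        int n = 0;
        for (int[] e : qp) n = Math.max(n, Math.max(e[0], e[1]) + 1);
        int[] perm = new int[n];
        for (int i = 0; i < n; i++) perm[i] = i;
        String[] best = {null};
        permute(perm, 0, qp, best);
        return best[0];
    }

    private static void permute(int[] perm, int k, int[][] qp, String[] best) {
        if (k == perm.length) {
            List<String> es = new ArrayList<>();
            for (int[] e : qp) {
                int a = perm[e[0]], b = perm[e[1]];
                es.add(Math.min(a, b) + "-" + Math.max(a, b));
            }
            Collections.sort(es);
            String s = String.join(",", es);
            if (best[0] == null || s.compareTo(best[0]) < 0) best[0] = s;
            return;
        }
        for (int i = k; i < perm.length; i++) {
            swap(perm, k, i);
            permute(perm, k + 1, qp, best);
            swap(perm, k, i);
        }
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Count embeddings per canonical pattern, canonizing once per quick pattern.
    static Map<String, Integer> aggregate(List<int[][]> embeddings) {
        Map<String, Integer> quickCounts = new LinkedHashMap<>();
        Map<String, int[][]> quickRep = new HashMap<>();
        for (int[][] e : embeddings) {
            int[][] qp = quickPattern(e);
            String key = Arrays.deepToString(qp);
            quickCounts.merge(key, 1, Integer::sum);
            quickRep.putIfAbsent(key, qp);
        }
        Map<String, Integer> canonicalCounts = new TreeMap<>();
        for (Map.Entry<String, Integer> q : quickCounts.entrySet()) {
            String canonical = canonicalPattern(quickRep.get(q.getKey()));
            canonicalCounts.merge(canonical, q.getValue(), Integer::sum);
        }
        return canonicalCounts;
    }

    public static void main(String[] args) {
        List<int[][]> embeddings = List.of(
            new int[][] {{1, 2}, {2, 3}},
            new int[][] {{4, 5}, {5, 6}},
            new int[][] {{7, 8}, {7, 9}});
        System.out.println(aggregate(embeddings) + ", canonizations = " + canonizations);
    }
}
```

In the example, three two-edge paths produce two distinct quick patterns, so the expensive step runs twice instead of three times, and both quick patterns collapse to the same canonical pattern.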
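[Figure: input graph with vertices 1 to 5, its canonical embeddings, and their ODAG]

Embedding list (canonical embeddings):
1 4 2
1 4 3
1 4 5
2 3 4
2 4 5
3 4 5

ODAG (one array per embedding position):
1st vertex: 1 2 3
2nd vertex: 3 4
3rd vertex: 2 3 4 5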
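The compression from an embedding list to an ODAG can be sketched as follows, assuming the ODAG keeps one array of distinct vertices per embedding position plus links from each vertex to the vertices that follow it at the next position. The class `OdagSketch` and its methods are hypothetical names for this illustration, and all embeddings are assumed to have the same length.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class OdagSketch {
    // levels.get(i) maps each vertex seen at position i to its successors at position i+1.
    final List<SortedMap<Integer, SortedSet<Integer>>> levels = new ArrayList<>();

    // Insert one embedding, given as an ordered list of vertex IDs.
    void add(int... embedding) {
        while (levels.size() < embedding.length) levels.add(new TreeMap<>());
        for (int i = 0; i < embedding.length; i++) {
            SortedSet<Integer> next =
                levels.get(i).computeIfAbsent(embedding[i], k -> new TreeSet<>());
            if (i + 1 < embedding.length) next.add(embedding[i + 1]);
        }
    }

    // Number of vertex entries stored across all positions.
    int storedVertices() {
        int n = 0;
        for (SortedMap<Integer, SortedSet<Integer>> level : levels) n += level.size();
        return n;
    }

    // Enumerate every path from the first to the last position.
    List<List<Integer>> paths() {
        List<List<Integer>> out = new ArrayList<>();
        if (levels.isEmpty()) return out;
        for (int v : levels.get(0).keySet()) walk(0, v, new ArrayList<>(), out);
        return out;
    }

    private void walk(int pos, int v, List<Integer> prefix, List<List<Integer>> out) {
        prefix.add(v);
        if (pos == levels.size() - 1) {
            out.add(new ArrayList<>(prefix));
        } else {
            for (int next : levels.get(pos).get(v)) walk(pos + 1, next, prefix, out);
        }
        prefix.remove(prefix.size() - 1); // backtrack (removes by index)
    }

    public static void main(String[] args) {
        OdagSketch odag = new OdagSketch();
        int[][] embeddings = {{1, 4, 2}, {1, 4, 3}, {1, 4, 5}, {2, 3, 4}, {2, 4, 5}, {3, 4, 5}};
        for (int[] e : embeddings) odag.add(e);
        System.out.println("stored vertices: " + odag.storedVertices());
        System.out.println("paths: " + odag.paths());
    }
}
```

For the six embeddings on the slide, this structure stores 9 vertex entries instead of 18. The trade-off in this sketch is that enumerating paths yields a superset of the original list (for example 2 4 2 and 3 4 2), so extracted paths have to be checked again before use.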