
Graph Mining. Marco Serafini, COMPSCI 532, Lecture 11



  1. Graph Mining. Marco Serafini, COMPSCI 532, Lecture 11

  2. Classes of Graph Systems
     • Graph Computation
       • Think like a vertex
       • Linear algebra
     • Graph Search
       • Find instances of path expressions
     • Graph Mining
       • Mine patterns of interest and their matches

  3. Applications of Graph Mining
     • Web and Advertising
       • Link spam detection
       • Identify sub-markets
       • Attributed edges in knowledge bases
     • Biology
       • DNA motif detection
       • Protein-protein interaction
     • Social computing
       • Friend recommendation
       • Community detection

  4. Graph Mining - Concepts
     [Figure: three panels labeled "Input graph", "Pattern", and "Embeddings", showing a pattern and its matches in the input graph.]

  5. Graph Exploration
     • Enumerate (& prune) embeddings
     • Aggregate by pattern
     [Figure: embeddings enumerated from the input graph.]

  6. Challenges
     • Exponential number of embeddings
     [Figure: number of unique embeddings (log scale) versus size of embedding (1 to 6). Labeled values: 4K, 22K, 335K, 7.8M, 117M, 1.7B. The curve is annotated "Exponential!!!".]

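  7. API Example: Clique finding

         // Keep only the candidate embeddings that are cliques.
         boolean filter(Embedding e) {
             return isClique(e);
         }

         // Emit every embedding that passed the filter.
         void process(Embedding e) {
             output(e);
         }

         // Stop growing embeddings once they reach maxsize vertices.
         boolean shouldExpand(Embedding embedding) {
             return embedding.getNumVertices() < maxsize;
         }

         // The embedding was a clique before the last expansion, so it is
         // still one iff the new vertex connects to all the other vertices.
         boolean isClique(Embedding e) {
             return e.getNumEdgesAddedWithExpansion() == e.getNumberOfVertices() - 1;
         }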

  8. Model - Think Like an Embedding
     1. Start from a set of initial embeddings
     2. Candidates: expand by 1 vertex/edge
     3. Filter uninteresting candidates
     4. Produce outputs
     [Figure: exploration steps i and i+1, each with an input and an output set of embeddings. Candidates go through Filter; if it returns true they go to Process and are saved, if it returns false they are discarded.]
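The four steps above can be sketched as a level-by-level loop. This is a minimal, single-machine illustration and not the system's actual implementation: the class name `ExplorationSketch`, the adjacency-matrix graph, and the list-of-vertex-IDs embedding are all assumptions made for the example. As a simplification, duplicates are avoided by only adding vertices with a larger ID than the last one, which is sufficient for cliques but is not the general canonicality rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of filter-process exploration, specialized to cliques.
class ExplorationSketch {
    final boolean[][] adj; // toy undirected graph as an adjacency matrix
    final int maxSize;     // stop expanding at this many vertices

    ExplorationSketch(boolean[][] adj, int maxSize) {
        this.adj = adj;
        this.maxSize = maxSize;
    }

    // Filter: the newly added (last) vertex must be adjacent to all earlier ones.
    boolean filter(List<Integer> e) {
        int last = e.get(e.size() - 1);
        for (int i = 0; i < e.size() - 1; i++) {
            if (!adj[e.get(i)][last]) return false;
        }
        return true;
    }

    boolean shouldExpand(List<Integer> e) {
        return e.size() < maxSize;
    }

    // A candidate vertex must neighbor at least one vertex of the embedding.
    boolean isNeighbor(List<Integer> e, int v) {
        for (int u : e) {
            if (adj[u][v]) return true;
        }
        return false;
    }

    List<List<Integer>> run() {
        List<List<Integer>> output = new ArrayList<>();
        // Step 1: initial embeddings are the single vertices.
        List<List<Integer>> current = new ArrayList<>();
        for (int v = 0; v < adj.length; v++) {
            current.add(Arrays.asList(v));
        }
        while (!current.isEmpty()) {
            List<List<Integer>> next = new ArrayList<>();
            for (List<Integer> e : current) {
                if (!shouldExpand(e)) continue;
                int last = e.get(e.size() - 1);
                // Step 2: candidates expand e by one neighboring vertex.
                // Only larger IDs are tried (simplified duplicate avoidance).
                for (int v = last + 1; v < adj.length; v++) {
                    if (!isNeighbor(e, v)) continue;
                    List<Integer> candidate = new ArrayList<>(e);
                    candidate.add(v);
                    // Step 3: filter; step 4: process (here: collect) and save.
                    if (filter(candidate)) {
                        output.add(candidate);
                        next.add(candidate);
                    }
                }
            }
            current = next;
        }
        return output;
    }

    public static void main(String[] args) {
        boolean[][] adj = new boolean[4][4];
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}, {2, 3}};
        for (int[] ed : edges) {
            adj[ed[0]][ed[1]] = true;
            adj[ed[1]][ed[0]] = true;
        }
        System.out.println(new ExplorationSketch(adj, 3).run());
    }
}
```

Each pass of the `while` loop corresponds to one exploration step: the embeddings saved in `next` become the input of the following step.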

  9. Avoiding redundant work
     • Problem: Automorphic embeddings
       • Automorphisms == subgraph equivalences
       • Redundant work
     [Figure: Worker 1 holds embedding 1-2-3 and Worker 2 holds embedding 3-2-1; the two are marked as equal (==).]

  10. Avoiding redundant work
      • Solution: Decentralized Embedding Canonicality
        • No coordination
        • Efficient
      [Figure: Worker 1 holds embedding 1-2-3, isCanonical(e) → true; Worker 2 holds the equal embedding 3-2-1, isCanonical(e) → false.]

  11. Embedding Canonicality
      • isCanonical(e) iff at every step add neighbor with smallest ID
      • Initial embedding (e): 1 - 3 - 6
      • Expansions:
        ● 1 - 3 - 6 - 5 → canonical
        ● 1 - 3 - 6 - 4 → canonical
        ● 1 - 3 - 6 - 2 → not canonical (1 - 2 - 3 - 6)
      [Figure: input graph with vertices 1 to 6, with embedding e highlighted.]
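One way to make the rule above checkable on a single embedding, without coordination, is sketched below. The formalization is an assumption: the first vertex must have the smallest ID, every later vertex must neighbor an earlier one, and no vertex placed after its first neighbor in the embedding may have a larger ID than it. The class name `CanonicalitySketch` and the edge set are hypothetical; the edges were chosen only so that the results agree with the slide's example, since the slide's figure is not recoverable here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a local canonicality check on a vertex sequence.
class CanonicalitySketch {
    // Build an undirected adjacency map from a list of edges.
    static Map<Integer, Set<Integer>> graph(int[][] edges) {
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        for (int[] ed : edges) {
            adj.computeIfAbsent(ed[0], k -> new HashSet<>()).add(ed[1]);
            adj.computeIfAbsent(ed[1], k -> new HashSet<>()).add(ed[0]);
        }
        return adj;
    }

    static boolean isCanonical(int[] e, Map<Integer, Set<Integer>> adj) {
        for (int i = 1; i < e.length; i++) {
            // The first vertex must have the smallest ID in the embedding.
            if (e[i] < e[0]) return false;
            // Find the position h of the first earlier neighbor of e[i].
            int h = -1;
            for (int j = 0; j < i; j++) {
                Set<Integer> nbrs = adj.getOrDefault(e[j], Collections.<Integer>emptySet());
                if (nbrs.contains(e[i])) {
                    h = j;
                    break;
                }
            }
            if (h < 0) return false; // not connected to the earlier vertices
            // Vertices added after position h must all have smaller IDs,
            // otherwise e[i] should have been added before them.
            for (int j = h + 1; j < i; j++) {
                if (e[j] > e[i]) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed edges, consistent with the slide's stated outcomes.
        Map<Integer, Set<Integer>> adj =
            graph(new int[][] {{1, 2}, {1, 3}, {2, 3}, {3, 6}, {4, 6}, {5, 6}});
        System.out.println(isCanonical(new int[] {1, 3, 6, 5}, adj));
        System.out.println(isCanonical(new int[] {1, 3, 6, 2}, adj));
    }
}
```

Because the check only reads the embedding and the graph, each worker can run it independently and drop non-canonical candidates on the spot.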

  12. Efficient Pattern Aggregation
      • Goal: Aggregate automorphic patterns to single key
        • Find canonical pattern
        • No known polynomial solution
      [Figure: three embeddings mapped to one canonical pattern through 3x expensive graph canonization.]

  13. Efficient Pattern Aggregation
      • Solution: 2-level pattern aggregation
        1. Embeddings → quick patterns
        2. Quick patterns → canonical pattern
      [Figure: three embeddings are mapped to quick patterns by 3x linear matching (1: quick patterns); the quick patterns are mapped to the canonical pattern by 2x expensive graph canonization (2: canonical pattern).]
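The two levels above can be sketched as follows, for unlabeled patterns only. Everything here is illustrative and assumed: the class `PatternAggregationSketch`, the string encoding of patterns, and the brute-force canonization over all vertex permutations, which merely stands in for the expensive canonization step. The quick pattern is obtained by relabeling each vertex with its position in the embedding (the sort makes this slightly more than linear in this toy version).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of 2-level pattern aggregation for unlabeled patterns.
class PatternAggregationSketch {
    static int canonizations = 0; // counts calls to the expensive step

    static class Embedding {
        final int[] vertices; // graph vertex IDs in visit order
        final int[][] edges;  // edges as pairs of graph vertex IDs
        Embedding(int[] vertices, int[][] edges) {
            this.vertices = vertices;
            this.edges = edges;
        }
    }

    // Encode edges over positions, after mapping each endpoint through perm.
    static String encode(int n, List<int[]> edges, int[] perm) {
        List<String> out = new ArrayList<>();
        for (int[] ed : edges) {
            int a = perm[ed[0]], b = perm[ed[1]];
            out.add(Math.min(a, b) + "-" + Math.max(a, b));
        }
        Collections.sort(out);
        return n + ":" + String.join(",", out);
    }

    // Level 1: cheap relabeling of vertices by their position in the embedding.
    static String quickPattern(Embedding e) {
        Map<Integer, Integer> pos = new HashMap<>();
        for (int i = 0; i < e.vertices.length; i++) pos.put(e.vertices[i], i);
        List<int[]> edges = new ArrayList<>();
        for (int[] ed : e.edges) edges.add(new int[] {pos.get(ed[0]), pos.get(ed[1])});
        int[] identity = new int[e.vertices.length];
        for (int i = 0; i < identity.length; i++) identity[i] = i;
        return encode(identity.length, edges, identity);
    }

    // Level 2: brute-force canonization (smallest encoding over all permutations).
    static String canonicalPattern(String quick) {
        canonizations++;
        String[] parts = quick.split(":", 2);
        int n = Integer.parseInt(parts[0]);
        List<int[]> edges = new ArrayList<>();
        if (!parts[1].isEmpty()) {
            for (String s : parts[1].split(",")) {
                String[] ab = s.split("-");
                edges.add(new int[] {Integer.parseInt(ab[0]), Integer.parseInt(ab[1])});
            }
        }
        int[] perm = new int[n];
        for (int i = 0; i < n; i++) perm[i] = i;
        String[] best = {null};
        permute(perm, 0, edges, best);
        return best[0];
    }

    static void permute(int[] perm, int k, List<int[]> edges, String[] best) {
        if (k == perm.length) {
            String s = encode(perm.length, edges, perm);
            if (best[0] == null || s.compareTo(best[0]) < 0) best[0] = s;
            return;
        }
        for (int i = k; i < perm.length; i++) {
            int t = perm[k]; perm[k] = perm[i]; perm[i] = t;
            permute(perm, k + 1, edges, best);
            t = perm[k]; perm[k] = perm[i]; perm[i] = t;
        }
    }

    // Count embeddings per quick pattern, then canonize each quick pattern once.
    static Map<String, Integer> aggregate(List<Embedding> embeddings) {
        Map<String, Integer> quickCounts = new HashMap<>();
        for (Embedding e : embeddings) quickCounts.merge(quickPattern(e), 1, Integer::sum);
        Map<String, Integer> canonicalCounts = new HashMap<>();
        for (Map.Entry<String, Integer> q : quickCounts.entrySet()) {
            canonicalCounts.merge(canonicalPattern(q.getKey()), q.getValue(), Integer::sum);
        }
        return canonicalCounts;
    }

    // Three 3-vertex paths; the third is visited starting from its middle vertex.
    static List<Embedding> exampleEmbeddings() {
        return Arrays.asList(
            new Embedding(new int[] {1, 2, 3}, new int[][] {{1, 2}, {2, 3}}),
            new Embedding(new int[] {4, 5, 6}, new int[][] {{4, 5}, {5, 6}}),
            new Embedding(new int[] {8, 7, 9}, new int[][] {{7, 8}, {8, 9}}));
    }

    public static void main(String[] args) {
        System.out.println(aggregate(exampleEmbeddings()) + ", canonizations=" + canonizations);
    }
}
```

In this toy input, three embeddings collapse into two quick patterns, so the expensive step runs twice instead of three times, mirroring the counts on the slide. The saving grows with the number of embeddings per quick pattern.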

  14. Handling Exponential growth
      • Goal: handle trillions+ different embeddings?
      • Solution: Overapproximating DAGs (ODAGs)
        • Compress into less restrictive superset
        • Deal with spurious embeddings
      [Figure: panels labeled "Input Graph", "Canonical Embeddings", "Embedding List", and "ODAG".]
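The compression idea can be illustrated with a small sketch. It assumes one particular layout, which is not spelled out on the slide: one level per embedding position, where each vertex at position i links to the vertices that follow it at position i+1 in at least one stored embedding. The class `OdagSketch` and the example embeddings are hypothetical. Enumerating all paths returns a superset of what was stored, so a real system would still have to discard the spurious paths, for example by re-running its filter and canonicality checks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of an overapproximating DAG over fixed-size embeddings.
class OdagSketch {
    final int size; // number of vertices per embedding
    // levels.get(i): vertex at position i -> vertices seen right after it
    final List<Map<Integer, Set<Integer>>> levels = new ArrayList<>();

    OdagSketch(int size) {
        this.size = size;
        for (int i = 0; i < size; i++) levels.add(new TreeMap<>());
    }

    // Store an embedding by recording each consecutive (position, vertex) link.
    void add(int[] e) {
        for (int i = 0; i < size; i++) {
            Set<Integer> successors = levels.get(i).computeIfAbsent(e[i], k -> new TreeSet<>());
            if (i + 1 < size) successors.add(e[i + 1]);
        }
    }

    // Enumerate every path through the levels: a superset of the stored embeddings.
    List<List<Integer>> enumerate() {
        List<List<Integer>> out = new ArrayList<>();
        for (int v : levels.get(0).keySet()) walk(0, v, new ArrayList<>(), out);
        return out;
    }

    private void walk(int level, int v, List<Integer> prefix, List<List<Integer>> out) {
        prefix.add(v);
        if (level == size - 1) {
            out.add(new ArrayList<>(prefix));
        } else {
            for (int next : levels.get(level).get(v)) walk(level + 1, next, prefix, out);
        }
        prefix.remove(prefix.size() - 1); // backtrack
    }

    public static void main(String[] args) {
        OdagSketch odag = new OdagSketch(3);
        odag.add(new int[] {1, 2, 3});
        odag.add(new int[] {4, 2, 5});
        // Also yields paths such as 1-2-5 that were never stored (spurious).
        System.out.println(odag.enumerate());
        System.out.println(Arrays.asList(1, 2, 5) + " is spurious");
    }
}
```

The trade-off is that shared vertices are stored once per position rather than once per embedding, at the cost of extra filtering work when the embeddings are read back.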

  15. Variants of Graph Mining Systems
      • G-Miner
        • For each embedding, decide how to expand
        • Easier to implement graph search
      • Systems for random walks
        • ASAP: Random walks for approximate subgraph enumeration
        • KnightKing: Random walks for node embeddings and graph neural networks
