
GRAPH MINING AND GRAPH KERNELS
Part I: Graph Mining
Karsten Borgwardt^ and Xifeng Yan*
^University of Cambridge
*IBM T. J. Watson Research Center
August 24, 2008 | ACM SIGKDD, Las Vegas

Graphs Are Everywhere
[Figure: Magwene et al., Genome Biology 2004, 5:R100]
Karsten Borgwardt and Xifeng Yan | Part I: Graph Mining

Part I: Graph Mining – from a pattern discovery perspective
Graph Pattern Mining
• Frequent graph patterns
• Pattern summarization
• Optimal graph patterns
• Graph patterns with constraints
• Approximate graph patterns
Graph Classification
• Pattern-based approach
• Decision tree
• Decision stumps
Graph Compression
Other important topics (graph model, laws, graph dynamics, social network analysis, visualization, summarization, graph clustering, link analysis, …)

Applications of Graph Patterns
• Mining biochemical structures
• Finding biological conserved subnetworks
• Finding functional modules
• Program control flow analysis
• Intrusion network analysis
• Mining communication networks
• Anomaly detection
• Mining XML structures
• Building blocks for graph classification, clustering, compression, comparison, correlation analysis, and indexing
• …

Graph Pattern Mining (multiple graphs setting)

Graph Patterns
Interestingness measures / Objective functions
• Frequency: frequent graph pattern
• Discriminative: information gain, Fisher score
• Significance: G-test
• …
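
To make the discriminative measure concrete, here is a minimal sketch of information gain for a single pattern in a two-class graph dataset. It assumes the only inputs are four counts (positive and negative graphs that do or do not contain the pattern). The function names and the example counts are made up for illustration and do not come from the tutorial.

```python
import math

def entropy(pos, neg):
    """Binary entropy (in bits) of a set with pos positive and neg negative graphs."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

def information_gain(pos_with, neg_with, pos_without, neg_without):
    """Information gain of splitting the graphs by whether they contain the pattern."""
    n_with = pos_with + neg_with
    n_without = pos_without + neg_without
    n = n_with + n_without
    before = entropy(pos_with + pos_without, neg_with + neg_without)
    after = ((n_with / n) * entropy(pos_with, neg_with)
             + (n_without / n) * entropy(pos_without, neg_without))
    return before - after

# Hypothetical counts: the pattern occurs in 8 of 10 positive and 1 of 10 negative graphs.
gain_good = information_gain(8, 1, 2, 9)
# A pattern that occurs equally often in both classes carries no information.
gain_useless = information_gain(5, 5, 5, 5)
# A pattern that separates the classes perfectly gains the full 1 bit.
gain_perfect = information_gain(10, 0, 0, 10)
```

A pattern miner that ranks by this score rather than by frequency would prefer the first pattern over the second, even if the second were more frequent.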

Frequent Graph Pattern

Example: Frequent Subgraphs
CHEMICAL COMPOUNDS
[Figure: (a) caffeine, (b) diurobromine, (c) viagra, …]
FREQUENT SUBGRAPH
[Figure: the substructure shared by the compounds]

Example (cont.)
PROGRAM CALL GRAPHS
[Figure: three call graphs (1), (2), (3) over the functions 1: makepat, 2: esc, 3: addstr, 4: getccl, 5: dodash, 6: in_set_2, 7: stclose]
FREQUENT SUBGRAPHS (MIN SUPPORT IS 2)
[Figure: two frequent subgraphs (1) and (2)]
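
The notion of support used in this example can be sketched in a few lines. The sketch assumes that every function name appears at most once per call graph, so that testing whether a pattern is a subgraph reduces to edge-set inclusion. General labeled graphs need a subgraph isomorphism test instead. The function names are taken from the slide's legend, but the edges below are invented, since the original figure is not recoverable.

```python
# Each call graph is a set of directed (caller, callee) edges.
# Function names come from the slide's legend; the edges are INVENTED for illustration.
call_graphs = [
    {("makepat", "esc"), ("makepat", "addstr"),
     ("makepat", "getccl"), ("getccl", "dodash")},
    {("makepat", "esc"), ("makepat", "getccl"),
     ("getccl", "dodash"), ("makepat", "stclose")},
    {("makepat", "addstr"), ("getccl", "dodash"), ("dodash", "in_set_2")},
]

def support(pattern, graphs):
    """Number of graphs that contain every edge of the pattern.

    This is a valid subgraph test only because node labels are assumed
    to be unique within each graph.
    """
    return sum(1 for g in graphs if pattern <= g)

def is_frequent(pattern, graphs, min_support):
    """True if the pattern occurs in at least min_support graphs."""
    return support(pattern, graphs) >= min_support

pattern_a = frozenset({("makepat", "getccl"), ("getccl", "dodash")})
pattern_b = frozenset({("dodash", "in_set_2")})

support_a = support(pattern_a, call_graphs)  # contained in the first two graphs
support_b = support(pattern_b, call_graphs)  # contained in the third graph only
```

With a minimum support of 2, `pattern_a` counts as frequent in this toy data and `pattern_b` does not.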

Graph Mining Algorithms
Inductive Logic Programming (WARMR, King et al. 2001)
– Graphs are represented by Datalog facts
Graph Based Approaches
• Apriori-based approach
– AGM/AcGM: Inokuchi et al. (PKDD’00)
– FSG: Kuramochi and Karypis (ICDM’01)
– PATH#: Vanetik and Gudes (ICDM’02, ICDM’04)
– FFSM: Huan et al. (ICDM’03) and SPIN: Huan et al. (KDD’04)
– FTOSM: Horvath et al. (KDD’06)
• Pattern growth approach
– Subdue: Holder et al. (KDD’94)
– MoFa: Borgelt and Berthold (ICDM’02)
– gSpan: Yan and Han (ICDM’02)
– Gaston: Nijssen and Kok (KDD’04)
– CMTreeMiner: Chi et al. (TKDE’05)
– LEAP: Yan et al. (SIGMOD’08)

Apriori Property

Cost Analysis

Properties of Graph Mining Algorithms
Search Order
• breadth vs. depth
• complete vs. incomplete
Generation of Candidate Patterns
• apriori vs. pattern growth
Discovery Order of Patterns
• DFS order
• path → tree → graph
Elimination of Duplicate Subgraphs
• passive vs. active
Support Calculation
• embedding store or not

Generation of Candidate Patterns
[Figure: Apriori-Based Approach VS. Pattern-Growth Approach]

Discovery Order: Free Extension

Discovery Order: Right-Most Extension (Yan and Han ICDM’02)
[Figure: right-most path, depth-first search]

Duplicates Elimination
Given the existing patterns and a newly discovered pattern:
Option 1
• Check graph isomorphism of the new pattern with each existing graph (slow)
Option 2
• Transform each graph to a canonical label, create a hash value for this canonical label, and check whether the new pattern matches an existing one (faster)
Option 3
• Build a canonical order and generate graph patterns in that order (fastest)
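
Option 2 can be sketched for very small patterns as follows. This is a toy, assuming undirected graphs with string vertex labels and no edge labels. The canonical label here is simply the smallest (labels, edges) encoding over all vertex orderings, found by brute force. It is not the canonical form used by any of the algorithms named in the tutorial, and the names `canonical_label`, `is_duplicate`, and `seen` are hypothetical.

```python
from itertools import permutations

def canonical_label(node_labels, edges):
    """Smallest (labels, edges) encoding over all vertex orderings.

    node_labels: list of labels, one per vertex 0..n-1
    edges: iterable of undirected (u, v) pairs
    Tries all n! orderings, so it is only usable for tiny graphs.
    """
    n = len(node_labels)
    best = None
    for perm in permutations(range(n)):
        # perm[old_vertex] is the new position of that vertex.
        labels = [None] * n
        for old, new in enumerate(perm):
            labels[new] = node_labels[old]
        relabeled = sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges)
        code = (tuple(labels), tuple(relabeled))
        if best is None or code < best:
            best = code
    return best

# Hash table of patterns found so far, keyed by canonical label.
seen = {}

def is_duplicate(node_labels, edges):
    """Return True if an isomorphic pattern was stored before; otherwise store it."""
    key = canonical_label(node_labels, edges)
    if key in seen:
        return True
    seen[key] = (node_labels, edges)
    return False

# The same C-C-O path written with two different vertex numberings.
first = is_duplicate(["C", "C", "O"], [(0, 1), (1, 2)])
second = is_duplicate(["O", "C", "C"], [(0, 1), (1, 2)])
# A different pattern with O in the middle.
third = is_duplicate(["C", "O", "C"], [(0, 1), (1, 2)])
```

Isomorphic graphs generate the same set of encodings, so their minima coincide and one dictionary lookup replaces a pairwise isomorphism check against every stored pattern.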

Performance: Run Time (Wörlein et al. PKDD’05)
The AIDS antiviral screen compound dataset from NCI/NIH
[Figure: run time comparison]

Performance: Memory Usage (Wörlein et al. PKDD’05)
[Figure: memory usage comparison]

Graph Pattern Explosion Problem
• If a graph is frequent, all of its subgraphs are frequent (the Apriori property)
• An n-edge frequent graph may have 2^n subgraphs!
• In the AIDS antiviral screen dataset with 400+ compounds, at the support level 5%, there are > 1M frequent graph patterns
Conclusions: Many enumeration algorithms are available (AGM, FSG, gSpan, Path-Join, MoFa, FFSM, SPIN, Gaston, and so on), but two significant problems exist
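
The 2^n bound can be checked on small cases by brute force. The sketch below counts the non-empty connected edge subsets of a graph, treating every vertex as distinct (equivalently, assuming all vertex labels differ, so that no two subsets describe the same pattern). The star and path graphs are illustrative choices, not examples from the tutorial.

```python
from itertools import combinations

def is_connected(edge_subset):
    """True if the given non-empty set of edges forms one connected component."""
    edges = list(edge_subset)
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    start = edges[0][0]
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(adjacency)

def count_connected_subgraphs(edges):
    """Count non-empty connected edge subsets by trying all 2^n - 1 subsets."""
    total = 0
    for k in range(1, len(edges) + 1):
        for subset in combinations(edges, k):
            if is_connected(subset):
                total += 1
    return total

n = 10
star = [(0, i) for i in range(1, n + 1)]  # n edges sharing one hub vertex
path = [(i, i + 1) for i in range(n)]     # n edges in a chain

star_count = count_connected_subgraphs(star)  # 2**n - 1 = 1023
path_count = count_connected_subgraphs(path)  # n * (n + 1) // 2 = 55
```

The star comes within one of the 2^n bound because every non-empty edge subset is connected, while the path stays quadratic. Real molecular graphs fall between these extremes.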

Pattern Summarization (Xin et al., KDD’06, Chen et al. CIKM’08)
• Too many patterns may not lead to more explicit knowledge
• It can confuse users as well as further discovery (e.g., clustering, classification, indexing, etc.)
• A small set of “representative” patterns that preserve most of the information
[Figure: relevance, significance]

Pattern Distance

Closed and Maximal Graph Pattern
Closed Frequent Graph
• A frequent graph G is closed if there exists no supergraph of G that carries the same support as G
• If some of G’s subgraphs have the same support, it is unnecessary to output these subgraphs (nonclosed graphs)
• Lossless compression: still ensures that the mining result is complete
Maximal Frequent Graph
• A frequent graph G is maximal if there exists no supergraph of G that is frequent
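
The two definitions can be applied directly to a mined result set. The sketch below assumes the complete set of frequent patterns is already available with its supports, and that patterns are edge sets over uniquely labeled vertices, so that "supergraph" means a proper superset of edges. Because any supergraph with the same support as a frequent pattern is itself frequent, it is enough to search within that set. The toy supports are invented.

```python
def closed_and_maximal(supports):
    """Split frequent patterns into closed and maximal ones.

    supports: dict mapping each frequent pattern (a frozenset of edges)
    to its support. Assumes the dict holds ALL frequent patterns.
    """
    closed, maximal = [], []
    for pattern, sup in supports.items():
        # Proper supergraphs of this pattern among the frequent patterns.
        supers = [q for q in supports if pattern < q]
        if not supers:
            maximal.append(pattern)
        if all(supports[q] < sup for q in supers):
            closed.append(pattern)
    return closed, maximal

# Invented toy result at minimum support 3, over two edges A-B and B-C.
e1, e2 = ("A", "B"), ("B", "C")
supports = {
    frozenset({e1}): 5,
    frozenset({e2}): 3,
    frozenset({e1, e2}): 3,
}

closed, maximal = closed_and_maximal(supports)
```

Here the single edge B-C is not closed, because the two-edge pattern has the same support, while A-B is closed but not maximal. Every maximal pattern is also closed, so the maximal set is the smaller of the two summaries.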

Number of Patterns: Frequent vs. Closed
