


  1. CS224W: Machine Learning with Graphs Jure Leskovec, Stanford University http://cs224w.stanford.edu

  2. Subnetworks, or subgraphs, are the building blocks of networks: they have the power to characterize and discriminate networks. (10/9/18 Jure Leskovec, Stanford CS224W: Machine Learning with Graphs, cs224w.stanford.edu)

  3. [Figure: subgraph decomposition of an electronic circuit. Credit: Oxford Protein Informatics Group]

  4. Let’s consider all possible (non-isomorphic) directed subgraphs of size 3.
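
The enumeration on this slide can be reproduced by brute force. The sketch below is illustrative and not from the lecture: it assumes "subgraphs of size 3" means weakly connected patterns on 3 nodes, and the function names are hypothetical. It builds every directed edge set on 3 nodes and keeps one representative per isomorphism class.

```python
from itertools import permutations, product

NODES = (0, 1, 2)
# The 6 possible directed edges on 3 nodes (no self-loops).
PAIRS = [(u, v) for u in NODES for v in NODES if u != v]


def canonical(edges):
    """Smallest sorted edge tuple over all relabelings of the 3 nodes."""
    return min(
        tuple(sorted((perm[u], perm[v]) for u, v in edges))
        for perm in permutations(NODES)
    )


def weakly_connected(edges):
    """True if all 3 nodes are reachable when edge direction is ignored."""
    adj = {n: set() for n in NODES}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for nbr in adj[stack.pop()]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == len(NODES)


def size3_directed_subgraph_types():
    """One canonical edge tuple per non-isomorphic connected 3-node digraph."""
    types = set()
    for mask in product([0, 1], repeat=len(PAIRS)):
        edges = [p for p, bit in zip(PAIRS, mask) if bit]
        if weakly_connected(edges):
            types.add(canonical(edges))
    return sorted(types)


if __name__ == "__main__":
    print(len(size3_directed_subgraph_types()))  # prints 13
```

Under the connectivity assumption this yields 13 subgraph types, so a significance profile over size-3 directed subgraphs would be a 13-dimensional vector.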

  5. For each subgraph:
     - Imagine you have a metric capable of classifying the subgraph “significance” [more on that later]
     - Negative values indicate under-representation
     - Positive values indicate over-representation

     We create a network significance profile:
     - A feature vector with values for all subgraph types

     Next: Compare profiles of different networks:
     - Regulatory network (gene regulation)
     - Neuronal network (synaptic connections)
     - World Wide Web (hyperlinks between pages)
     - Social network (friendships)
     - Language networks (word adjacency)

  6. [Figure: network significance profiles. Panels: gene regulation networks; neurons; Web and social; language networks.] Networks from the same domain have similar significance profiles (Milo et al., Science 2004).

  7. [Figure: networks grouped by significance profile similarity. Groups: gene regulatory networks; neurons; social; WWW; language. Callout: correlation in significance profile of the English and French language networks.] Closely related networks have more similar significance profiles (Milo et al., Science 2004).

  8. Outline:
     1) Subgraphs:
        - Defining motifs and graphlets
        - Finding motifs and graphlets
     2) Structural roles in networks:
        - RolX: Structural Role Discovery Method
     3) Discovering structural roles and its applications:
        - Structural similarity
        - Role generalization and transfer learning
        - Making sense of roles

  9. Network motifs: “recurring, significant patterns of interconnections”

     How to define a network motif:
     - Pattern: Small induced subgraph
     - Recurring: Found many times, i.e., with high frequency
     - Significant: More frequent than expected, i.e., more frequent than in randomly generated networks
       - Erdős-Rényi random graphs, scale-free networks

  10. Motifs:
      - Help us understand how networks work
      - Help us predict operation and reaction of the network in a given situation

      Examples:
      - Feed-forward loops: found in networks of neurons, where they neutralize “biological noise”
      - Parallel loops: found in food webs
      - Single-input modules: found in gene control networks

      [Figures: feed-forward loop; parallel loop; single-input module]

  11. Induced subgraph of interest (aka motif). [Figure: two candidate occurrences in a graph, one labeled “No match! (not induced)” and one labeled “Match! (induced)”.]

      An induced subgraph of graph G is a graph formed from a subset X of the vertices of G and all of the edges of G connecting pairs of vertices in X.
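
To make the induced/non-induced distinction concrete, here is a minimal sketch. The example graph and the helper name `induced_subgraph` are made up for illustration and are not from the slides. It extracts the subgraph induced by a vertex subset of an undirected graph and checks whether it is exactly a 3-node path.

```python
def induced_subgraph(edges, nodes):
    """Edges of the subgraph induced by `nodes`.

    Keeps every edge of the graph whose two endpoints both lie in `nodes`.
    Edges are undirected and returned as frozensets.
    """
    keep = set(nodes)
    return {frozenset(e) for e in edges if set(e) <= keep}


# Hypothetical example graph: a triangle 1-2-3 with a pendant node 4.
graph = [(1, 2), (2, 3), (1, 3), (3, 4)]

# Motif of interest: a path on three nodes (two edges, no closing edge).
on_123 = induced_subgraph(graph, {1, 2, 3})
on_234 = induced_subgraph(graph, {2, 3, 4})

# {1, 2, 3} contains the path 1-2-3 but also the extra edge 1-3,
# so it is NOT an induced occurrence of the path.
print(len(on_123))  # 3 edges: a triangle, not a path

# {2, 3, 4} induces exactly the two edges 2-3 and 3-4: an induced match.
print(on_234 == {frozenset((2, 3)), frozenset((3, 4))})  # True
```

The point of requiring induced matches is that the extra edge 1-3 changes the pattern: the triangle and the path are counted as different subgraph types.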

  12. Motif of interest: [figure]
      - Allow overlapping of motifs
      - Network on the right has 4 occurrences of the motif:
        - {1,2,3,4,5}
        - {1,2,3,4,6}
        - {1,2,3,4,7}
        - {1,2,3,4,8}

      Example borrowed from Pedro Ribeiro

  13. Key idea: Subgraphs that occur in a real network much more often than in a random network have functional significance (Milo et al., Science 2002).

  14. Motifs are overrepresented in a network when compared to randomized networks:
      - $Z_i$ captures statistical significance of motif $i$: $Z_i = (N_i^{\mathrm{real}} - \bar{N}_i^{\mathrm{rand}}) / \mathrm{std}(N_i^{\mathrm{rand}})$
        - $N_i^{\mathrm{real}}$ is #(subgraphs of type $i$) in network $G^{\mathrm{real}}$
        - $N_i^{\mathrm{rand}}$ is #(subgraphs of type $i$) in randomized network $G^{\mathrm{rand}}$

      Network significance profile (SP): $SP_i = Z_i / \sqrt{\sum_j Z_j^2}$
      - $SP$ is a vector of normalized Z-scores
      - $SP$ emphasizes relative significance of subgraphs:
        - Important for comparison of networks of different sizes
        - Generally, larger networks display higher Z-scores
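
The two formulas on this slide (a Z-score per subgraph type, then a significance profile normalized to unit length) can be sketched as follows. The function names are hypothetical. The use of the population standard deviation and the handling of a zero standard deviation are assumptions, since the slide does not specify either.

```python
import math
from statistics import mean, pstdev


def z_scores(real_counts, rand_counts):
    """Z-score per subgraph type.

    real_counts: counts in the real network, one entry per subgraph type.
    rand_counts: list of count vectors, one per randomized network.
    """
    zs = []
    for i, n_real in enumerate(real_counts):
        samples = [counts[i] for counts in rand_counts]
        sd = pstdev(samples)  # assumption: population standard deviation
        # Assumption: report 0.0 when the randomized counts do not vary.
        zs.append((n_real - mean(samples)) / sd if sd > 0 else 0.0)
    return zs


def significance_profile(zs):
    """Normalize the Z-score vector to unit Euclidean length."""
    norm = math.sqrt(sum(z * z for z in zs))
    return [z / norm for z in zs] if norm > 0 else [0.0] * len(zs)


if __name__ == "__main__":
    # Made-up counts for two subgraph types and two randomized networks.
    zs = z_scores([10, 2], [[4, 3], [6, 5]])
    print(zs)
    print(significance_profile(zs))
```

Because the profile is divided by its own length, two networks of very different sizes can be compared even when the raw Z-scores of the larger one are much higher.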

  15. Goal: Generate a random graph with a given degree sequence $k_1, k_2, \ldots, k_N$

      Useful as a “null” model of networks:
      - We can compare the real network $G^{\mathrm{real}}$ and a “random” $G^{\mathrm{rand}}$ which has the same degree sequence as $G^{\mathrm{real}}$

      Configuration model: [Figure, three panels on nodes A, B, C, D: nodes with spokes; randomly pair up “mini”-nodes; resulting graph]
      - We ignore double edges and self-loops when creating the final graph
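
A minimal sketch of the stub-matching idea behind the configuration model follows. The function name and the integer node labels are illustrative assumptions. Each node gets as many "spokes" as its degree, the spokes are shuffled and paired up, and self-loops and double edges are dropped, as on the slide.

```python
import random


def configuration_model(degrees, seed=None):
    """Random simple graph whose degrees approximate `degrees`.

    degrees[i] is the target degree of node i. Returns a set of
    undirected edges (u, v) with u < v.
    """
    rng = random.Random(seed)
    # One stub ("spoke") per unit of degree.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    if len(stubs) % 2:
        raise ValueError("degree sequence must have an even sum")
    rng.shuffle(stubs)
    edges = set()
    # Pair up consecutive stubs of the shuffled list.
    for u, v in zip(stubs[::2], stubs[1::2]):
        if u != v:  # ignore self-loops
            edges.add((min(u, v), max(u, v)))  # the set ignores double edges
    return edges


if __name__ == "__main__":
    print(sorted(configuration_model([3, 2, 2, 1], seed=0)))
```

Note that dropping self-loops and double edges means a node can end up with a degree below its target, so the degree sequence is only approximately preserved in this sketch.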

  16. Start from a given graph $G$

      Repeat the switching step $Q \cdot |E|$ times:
      - Select a pair of edges A → B, C → D at random
      - Exchange the endpoints to give A → D, C → B
      - Exchange edges only if no multiple edges or self-edges are generated

      Result: A randomly rewired graph:
      - Same node degrees, randomly rewired edges

      $Q$ is chosen large enough (e.g., $Q = 100$) for the process to converge
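
The switching procedure can be sketched for a directed simple graph as below. The function name `switch_randomize` and its parameters are hypothetical. The sketch assumes the input has no duplicate edges or self-loops, and it counts rejected proposals toward the Q·|E| budget, which is one possible reading of the slide.

```python
import random


def switch_randomize(edges, q=100, seed=None):
    """Degree-preserving rewiring of a directed simple graph.

    edges: list of (source, target) pairs without duplicates or self-loops.
    Performs q * |E| switching attempts and returns the rewired edge list.
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    if len(edge_list) < 2:
        return edge_list
    present = set(edge_list)
    for _ in range(q * len(edge_list)):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Proposed switch: A -> B, C -> D becomes A -> D, C -> B.
        if a == d or c == b:
            continue  # would create a self-edge
        if (a, d) in present or (c, b) in present:
            continue  # would create a multiple edge
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


if __name__ == "__main__":
    g = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3)]
    print(switch_randomize(g, q=10, seed=1))
```

Each accepted switch keeps every node's in-degree and out-degree unchanged, because each source keeps one outgoing edge and each target keeps one incoming edge.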

  17. [Figure: network significance profile, with $Z_i = (N_i^{\mathrm{real}} - \bar{N}_i^{\mathrm{rand}}) / \mathrm{std}(N_i^{\mathrm{rand}})$] (Milo et al., Science 2004)

  18. Count subgraphs $i$ in $G^{\mathrm{real}}$

      Count subgraphs $i$ in random networks $G^{\mathrm{rand}}$:
      - Configuration model: Each $G^{\mathrm{rand}}$ has the same #(nodes), #(edges) and degree distribution as $G^{\mathrm{real}}$

      Assign Z-score to $i$:
      - $Z_i = (N_i^{\mathrm{real}} - \bar{N}_i^{\mathrm{rand}}) / \mathrm{std}(N_i^{\mathrm{rand}})$
      - High Z-score: Subgraph $i$ is a network motif of $G$
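
The three steps (count in the real network, count in randomized networks, assign a Z-score) can be put together in a small end-to-end sketch. Several choices here are assumptions made to keep the example short: the subgraph type is the undirected triangle, the null model is an undirected edge-switching randomization used as a stand-in for the configuration model, and all function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is seen once from each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def rewire(edges, q, rng):
    """Degree-preserving switching for an undirected simple graph."""
    edge_list = [tuple(e) for e in edges]
    if len(edge_list) < 2:
        return edge_list
    present = {frozenset(e) for e in edge_list}
    for _ in range(q * len(edge_list)):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # undirected edge: try either orientation
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if a == d or c == b or new1 in present or new2 in present:
            continue  # would create a self-edge or a multiple edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def triangle_z_score(edges, n_random=100, q=10, seed=0):
    """Return (real count, mean randomized count, Z-score or None)."""
    rng = random.Random(seed)
    n_real = count_triangles(edges)
    n_rand = [count_triangles(rewire(edges, q, rng)) for _ in range(n_random)]
    sd = pstdev(n_rand)
    # Assumption: the Z-score is left undefined when the null has no variance.
    z = (n_real - mean(n_rand)) / sd if sd > 0 else None
    return n_real, mean(n_rand), z


if __name__ == "__main__":
    # Made-up graph: two triangles joined by a short chain.
    g = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
    print(triangle_z_score(g))
```

A clearly positive Z-score would mark the triangle as over-represented relative to degree-matched random graphs; on a graph this small the value mainly illustrates the mechanics.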

  19. Canonical definition:
      - Directed and undirected
      - Colored and uncolored
      - Temporal and static motifs

      Variations on the concept:
      - Different frequency concepts
      - Different significance metrics
      - Under-representation (anti-motifs)
      - Different constraints for null model

  20. [Figure: Z-scores of individual motifs for different networks] (Milo et al., Science 2004)

  21. [Figure: Z-scores of individual motifs for different networks] (Milo et al., Science 2004)

  22. Network of neurons and a gene network contain similar motifs:
      - Feed-forward loops and bi-fan structures
      - Both are information processing networks with sensory and acting components

      Food webs have parallel loops:
      - Prey of a particular predator share prey

      WWW network has bidirectional links:
      - Design that allows the shortest path between sets of related pages

  24. Graphlets: connected non-isomorphic subgraphs
      - Induced subgraphs of any frequency

      For $n = 3, 4, 5, \ldots, 10$ there are $2, 6, 21, \ldots, 11716571$ graphlets! (Przulj et al., Bioinformatics 2004)
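
The small graphlet counts can be checked by brute force: enumerate every undirected graph on n labeled nodes, keep the connected ones, and count isomorphism classes. This is a sketch with hypothetical function names, and it is only practical for very small n because it tries all n! relabelings of each graph.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """True if the undirected graph on nodes 0..n-1 is connected."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def count_graphlets(n):
    """Number of connected non-isomorphic undirected graphs on n nodes."""
    pairs = list(combinations(range(n), 2))
    perms = list(permutations(range(n)))
    classes = set()
    for mask in range(1 << len(pairs)):
        edges = [p for k, p in enumerate(pairs) if (mask >> k) & 1]
        if not is_connected(n, edges):
            continue
        # Canonical form: the smallest sorted edge tuple over all relabelings.
        classes.add(min(
            tuple(sorted((min(p[u], p[v]), max(p[u], p[v])) for u, v in edges))
            for p in perms
        ))
    return len(classes)


if __name__ == "__main__":
    print([count_graphlets(n) for n in (3, 4, 5)])  # prints [2, 6, 21]
```

The rapid growth of these counts is why graphlet-based methods usually restrict themselves to graphlets on a handful of nodes.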

  25. Next: Use graphlets to obtain a node-level subgraph metric

      Degree counts #(edges) that a node touches:
      - Can we generalize this notion for graphlets? Yes!

      Graphlet degree vector counts #(graphlets) that a node touches

  26. An automorphism orbit takes into account the symmetries of a subgraph

      Graphlet Degree Vector (GDV): a vector with the frequency of the node in each orbit position

      Example: Graphlet degree vector of node v [figure]

      For a node $u$ of graph $G$, the automorphism orbit of $u$ is $\mathrm{Orb}(u) = \{v \in V(G) : v = f(u) \text{ for some } f \in \mathrm{Aut}(G)\}$. Here $\mathrm{Aut}(G)$ denotes the automorphism group of $G$, where an automorphism is an isomorphism from $G$ to itself.
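
As a small worked example of a graphlet degree vector, the sketch below counts, for one node, how often it sits in each orbit of the graphlets on 2 and 3 nodes. The orbit numbering used here (0: endpoint of an edge, 1: end of an induced 3-node path, 2: middle of an induced 3-node path, 3: member of a triangle) is the commonly used convention and is an assumption, as are the function name and the example graph.

```python
from itertools import combinations


def gdv_up_to_3(edges, node):
    """Graphlet degree vector of `node` over orbits 0..3.

    edges: undirected simple graph as (u, v) pairs.
    Returns [orbit0, orbit1, orbit2, orbit3].
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nbrs = adj.get(node, set())

    orbit0 = len(nbrs)  # ordinary degree
    # Pairs of neighbors: adjacent pairs close a triangle (orbit 3),
    # non-adjacent pairs make `node` the middle of an induced path (orbit 2).
    orbit3 = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    orbit2 = sum(1 for a, b in combinations(nbrs, 2) if b not in adj[a])
    # `node` is the end of an induced path node - a - w when w is a
    # neighbor of a that is neither `node` nor adjacent to `node`.
    orbit1 = sum(
        1 for a in nbrs for w in adj[a] if w != node and w not in nbrs
    )
    return [orbit0, orbit1, orbit2, orbit3]


if __name__ == "__main__":
    # Hypothetical example: a triangle 1-2-3 with a pendant node 4 on node 3.
    g = [(1, 2), (2, 3), (1, 3), (3, 4)]
    print(gdv_up_to_3(g, 3))  # [3, 0, 2, 1]
    print(gdv_up_to_3(g, 4))  # [1, 2, 0, 0]
```

Nodes 1 and 2 of the example graph get the same vector, which reflects the symmetry that swaps them, while nodes 3 and 4 are distinguished by their different orbit counts.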
