
Community Detection: A Simple Example, by Joon Ho Park, Yumlembam Hemajit and Ki-Ho Lee



  1. Community Detection: A Simple Example. Joon Ho Park, Yumlembam Hemajit and Ki-Ho Lee

  2. Project Motivation
     • To understand the basics of community detection
     • To apply the ideas of traditional community detection methods to a known system
     • To figure out the clustering of proteins by analogy with the project

  3. Quick Review of Community Detection
     • Traditional Methods of Clustering
       – Graph partitioning
         • Dividing vertices into groups of predefined size
         • Minimizing cut size (# of edges running between clusters)
       – Hierarchical clustering
         • Including small clusters in larger clusters according to similarity
         • Agglomerative (bottom-up) or divisive (top-down) algorithms
       – Partitional clustering
         • Distance between vertices = dissimilarity between vertices
         • E.g., k-means clustering: minimizing the total intra-cluster distance
       – Spectral clustering
         • Clustering by eigenvectors of matrices (e.g., similarity matrix)

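  4. [Figure: map with labeled locations: Mountain Top, Valley Top, Mountain Hub, Valley Top, Mountain Condominium & Ski, Valley Condominium & Ski]

  5. [Figure: the map drawn as a graph] # of vertices: 16; # of edges: 27. Vertex labels: MT, Z1, H, VT, AT1, Z2, AP1, MH, V, VH, AP2, AT2, Z3, AP3, MC&S, VC&S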

  6. Graph Partitioning
     • Simplest conditions
       – Dividing into two groups of equal size
       – Minimal # of edges between the two groups
       – Maximal # of edges inside the modules
       – Kernighan-Lin algorithm
         • Maximizing Q
         • Q = (# of edges inside the modules) − (# of edges lying between them)
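The objective on slide 6 can be evaluated with a few lines of Python. The sketch below only computes the cut size and Q for a given two-group split; it does not perform the Kernighan-Lin search itself. The function name `cut_size_and_q` and the toy graph (two triangles joined by one edge) are illustrative assumptions, not the graph from the slides.

```python
def cut_size_and_q(edges, group_a):
    """Return (cut_size, Q) for a two-group split of an undirected graph.

    edges   -- list of (u, v) pairs
    group_a -- vertices in the first group; all others form the second group
    """
    group_a = set(group_a)
    # An edge is cut when exactly one of its endpoints lies in group_a.
    cut = sum(1 for u, v in edges if (u in group_a) != (v in group_a))
    inside = len(edges) - cut
    # Q = (# of edges inside the modules) - (# of edges lying between them)
    return cut, inside - cut


# Hypothetical toy graph: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

print(cut_size_and_q(edges, {"a", "b", "c"}))  # split along the bridge
print(cut_size_and_q(edges, {"a", "b", "d"}))  # a worse split
```

Splitting along the bridge cuts one edge, while the second split cuts five, so the first has the higher Q. A search procedure such as Kernighan-Lin would swap vertices between the groups to move from the second kind of split toward the first.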

  7. [Figure: two-group partition of the graph] Cut size: 11; Q = 4

  8. [Figure: two-group partition of the graph] Cut size: 9; Q = 9

  9. [Figure: two-group partition of the graph] Cut size: 8; Q = 10

  10. [Figure: two-group partition of the graph] Cut size: 6; Q = 15

  11. [Figure: two-group partition of the graph] Cut size: 5; Q = 17

  12. [Figure: two-group partition of the graph] Cut size: 5; Q = 17

  13. [Figure: two-group partition of the graph] Cut size: 5; Q = 17

  14. [Figure: two-group partition of the graph] Cut size: 5; Q = 17

  15. Hierarchical Clustering
      • Simplest conditions
        – Divisive algorithm
          • Clusters are iteratively split by removing edges connecting vertices with low similarity
        – Vertex similarity
          • Defined by the # of edge- (or vertex-) independent paths between two vertices
          • Independent paths do not share any edge (vertex).
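The similarity on slide 15, the number of edge-independent paths between two vertices, equals the maximum flow between them when every edge has capacity 1 (Menger's theorem). The following is a minimal sketch under that assumption, using breadth-first augmenting paths; the function name `edge_independent_paths` and the toy graph are hypothetical and not taken from the slides.

```python
from collections import defaultdict, deque


def edge_independent_paths(edges, source, target):
    """Count edge-independent paths between two vertices of an undirected graph."""
    if source == target:
        raise ValueError("source and target must differ")

    # Each undirected edge becomes two directed arcs of capacity 1.
    capacity = defaultdict(int)
    neighbors = defaultdict(set)
    for u, v in edges:
        capacity[(u, v)] += 1
        capacity[(v, u)] += 1
        neighbors[u].add(v)
        neighbors[v].add(u)

    paths = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in parent and capacity[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return paths
        # Send one unit of flow along the path that was found.
        v = target
        while parent[v] is not None:
            u = parent[v]
            capacity[(u, v)] -= 1
            capacity[(v, u)] += 1
            v = u
        paths += 1


# Hypothetical toy graph: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

print(edge_independent_paths(edges, "a", "b"))  # same triangle
print(edge_independent_paths(edges, "a", "f"))  # across the bridge
```

In this toy graph, vertices in the same triangle are joined by two independent paths, while any pair separated by the bridge has only one. A divisive step in the spirit of slide 15 would therefore remove the bridge edge first.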

  16. [Figure: the graph with each of its 27 edges labeled by a similarity value between 2 and 4]

  17. [Figure: the graph with a subset of edges labeled; labels shown: 2, 3, 3, 3, 3, 3, 3, 3, 2]

  18. [Figure: the graph with a subset of edges labeled; labels shown: 2, 3, 3, 3, 3]

  19. [Figure: the graph with a subset of edges labeled; labels shown: 3, 3, 3, 3, 3]

  20. [Figure: the graph with a subset of edges labeled; labels shown: 2, 3, 3, 3, 2]

  21. [Figure: the graph with vertices numbered 1 to 16] Cut size $= \frac{1}{4}\,\mathbf{s}^{T} L\, \mathbf{s}$, with
      $L = \begin{bmatrix} 3 & -1 & 0 & 0 & \cdots & 0 \\ -1 & 3 & -1 & 0 & \cdots & 0 \\ 0 & -1 & 4 & 0 & \cdots & 0 \\ 0 & 0 & -1 & 3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 2 \end{bmatrix}$

  22. [Figure: the graph with vertices numbered 1 to 16] Cut size $= \frac{1}{4}\,\mathbf{s}^{T} L\, \mathbf{s}$, with
      $\mathbf{s} = (1, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1)^{T}$

  23. 5
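The relation on slides 21 and 22, cut size = (1/4) s^T L s, can be checked numerically on a small graph. The sketch below builds the graph Laplacian L (vertex degrees on the diagonal, -1 for each edge) and evaluates the quadratic form for an indicator vector s whose entries are +1 for one group and -1 for the other. The helper names and the toy graph are assumptions made for illustration; the 16-vertex graph from the slides is not reproduced here.

```python
def laplacian(vertices, edges):
    """Build the graph Laplacian as a list of lists (degree minus adjacency)."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        i, j = index[u], index[v]
        L[i][i] += 1
        L[j][j] += 1
        L[i][j] -= 1
        L[j][i] -= 1
    return L


def cut_size_from_laplacian(L, s):
    """Evaluate (1/4) * s^T L s for a vector s with entries +1 or -1."""
    n = len(s)
    quad = sum(s[i] * L[i][j] * s[j] for i in range(n) for j in range(n))
    # For +1/-1 entries, s^T L s sums (s_i - s_j)^2 over edges,
    # so it is always a multiple of 4.
    return quad // 4


# Hypothetical toy graph: two triangles joined by the single edge c-d.
vertices = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
L = laplacian(vertices, edges)

print(cut_size_from_laplacian(L, [1, 1, 1, -1, -1, -1]))  # groups {a,b,c} / {d,e,f}
print(cut_size_from_laplacian(L, [1, 1, -1, 1, -1, -1]))  # groups {a,b,d} / {c,e,f}
```

Each cut edge contributes (1 - (-1))^2 = 4 to s^T L s and each uncut edge contributes 0, which is where the factor 1/4 comes from.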

  24. [Figure: the graph with each of its 27 edges labeled by a similarity value between 2 and 4]

  25. [Figure: the graph with a subset of edges labeled; labels shown: 2, 3, 3, 3, 3. Annotation: "Not good enough!"]

  26. Conclusions
      • Graph partitioning provides a basic idea for community detection
      • The concept of similarity is adopted in hierarchical, partitional and spectral clustering
      • We've realized that community detection can be used for the clustering of protein databases if the similarity is replaced by a score (TM-score, RMSD, etc.)
