Community Detection : A Simple Example Joon Ho Park, Yumlembam - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Community Detection : A Simple Example

Joon Ho Park, Yumlembam Hemajit and Ki-Ho Lee

slide-2
SLIDE 2

Project Motivation

  • To understand the basics of community detection
  • To apply the ideas of traditional community detection methods to a known system
  • To figure out the clustering of proteins by analogy with the project

slide-3
SLIDE 3

Quick Review of Community Detection

  • Traditional Methods of Clustering

– Graph Partitioning

  • Dividing vertices in groups of predefined size
  • Minimizing cut size (# edges running between clusters)

– Hierarchical clustering

  • Including small clusters in larger clusters according to similarity
  • Agglomerative (bottom-up) or divisive (top-down) algorithms

– Partitional clustering

  • Distance between vertices = dissimilarity between vertices
  • E.g., k-means clustering: minimizing the total intra-cluster distance

– Spectral clustering

  • Clustering by eigenvectors of matrices (e.g., similarity matrix)
slide-4
SLIDE 4

[Figure: diagram of the example system; location labels include Mountain Top, Valley Top, Mountain Hub, Mountain Condominium & Ski, and Valley Condominium & Ski]

slide-5
SLIDE 5
slide-6
SLIDE 6

# of vertices: 16
# of edges: 27

[Figure: the example system drawn as a graph; vertices: Z1, VT, V, VH, Z2, MT, H, AT1, AP1, MH, AT2, MC&S, AP3, Z3, VC&S, AP2]

slide-7
SLIDE 7

Graph Partitioning

  • Simplest conditions

– Dividing into two groups of equal size
– Minimal # of edges between the two groups
– Maximal # of edges inside the modules
– Kernighan-Lin algorithm

  • Maximizing Q
  • Q = (# of edges inside the modules) – (# of edges lying between them)
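The definition of Q and the idea of swapping vertices between the two groups can be sketched in a few lines. This is a minimal sketch, not the authors' procedure: the 8-vertex graph is hypothetical (the edge list of the 16-vertex network is not given in the slides), and `improve_by_swaps` is a simplified greedy variant of Kernighan-Lin that only accepts swaps that raise Q.

```python
from itertools import product

# Hypothetical 8-vertex graph (NOT the 16-vertex network from the slides):
# two dense groups {0,1,2,3} and {4,5,6,7} joined by two edges.
EDGES = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
    (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
    (3, 4), (2, 5),
]


def cut_size(edges, group_a):
    """Number of edges with exactly one endpoint in group_a."""
    return sum((u in group_a) != (v in group_a) for u, v in edges)


def quality(edges, group_a):
    """Q = (# edges inside the two groups) - (# edges between them)."""
    cut = cut_size(edges, group_a)
    return (len(edges) - cut) - cut


def improve_by_swaps(edges, group_a, group_b):
    """Repeat the single pair swap that raises Q the most until none helps.

    Simplified relative to Kernighan-Lin, which also accepts temporarily
    worsening swaps within a pass and locks vertices once swapped.
    """
    group_a, group_b = set(group_a), set(group_b)
    while True:
        current = quality(edges, group_a)
        best_gain, best_pair = 0, None
        for a, b in product(sorted(group_a), sorted(group_b)):
            trial = (group_a - {a}) | {b}
            gain = quality(edges, trial) - current
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            return group_a, group_b
        a, b = best_pair
        group_a = (group_a - {a}) | {b}
        group_b = (group_b - {b}) | {a}


START_A, START_B = {0, 1, 4, 5}, {2, 3, 6, 7}
FINAL_A, FINAL_B = improve_by_swaps(EDGES, START_A, START_B)
print("start:", cut_size(EDGES, START_A), quality(EDGES, START_A))
print("final:", sorted(FINAL_A), sorted(FINAL_B),
      cut_size(EDGES, FINAL_A), quality(EDGES, FINAL_A))
```

Because every edge lies either inside a group or between the groups, Q = m − 2 × (cut size) for a graph with m edges, so maximizing Q and minimizing the cut size are the same task for a fixed graph.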
slide-8
SLIDE 8

[Figure: the network divided into two groups]

Cut size: 11, Q = 4

slide-9
SLIDE 9

[Figure: the network divided into two groups]

Cut size: 9, Q = 9

slide-10
SLIDE 10

[Figure: the network divided into two groups]

Cut size: 8, Q = 10

slide-11
SLIDE 11

[Figure: the network divided into two groups]

Cut size: 6, Q = 15

slide-12
SLIDE 12

[Figure: the network divided into two groups]

Cut size: 5, Q = 17

slide-13
SLIDE 13

[Figure: the network divided into two groups]

Cut size: 5, Q = 17

slide-14
SLIDE 14

[Figure: the network divided into two groups]

Cut size: 5, Q = 17

slide-15
SLIDE 15

[Figure: the network divided into two groups]

Cut size: 5, Q = 17

slide-16
SLIDE 16

Hierarchical Clustering

  • Simplest conditions

– Divisive algorithm

  • Clusters are iteratively split by removing edges connecting vertices with low similarity

– Vertex similarity

  • Defined by the # of edge- (or vertex-) independent paths between two vertices

  • Independent paths do not share any edge (vertex).
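The edge-independent path count used as vertex similarity above can be computed as a maximum flow with unit capacity on every edge (Menger's theorem). The sketch below assumes a hypothetical 8-vertex graph, since the slides do not list the edges of the 16-vertex network; `edge_independent_paths` is an illustrative helper name, not something from the presentation.

```python
from collections import deque

# Hypothetical 8-vertex graph (NOT the network from the slides):
# two dense groups {0,1,2,3} and {4,5,6,7} joined by two edges.
EDGES = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
    (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
    (3, 4), (2, 5),
]


def edge_independent_paths(edges, source, target):
    """Count edge-disjoint paths between two distinct vertices.

    Equals the max flow with unit capacity per edge; found here with
    breadth-first augmenting paths (Edmonds-Karp).
    """
    if source == target:
        raise ValueError("source and target must differ")
    # Each undirected edge becomes two arcs of capacity 1.
    capacity, neighbors = {}, {}
    for u, v in edges:
        capacity[(u, v)] = 1
        capacity[(v, u)] = 1
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    flow = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and capacity[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return flow
        # Push one unit of flow back along the path that was found.
        v = target
        while parent[v] is not None:
            u = parent[v]
            capacity[(u, v)] -= 1
            capacity[(v, u)] += 1
            v = u
        flow += 1


# Similarity of the two endpoints of every edge.
SIMILARITY = {edge: edge_independent_paths(EDGES, *edge) for edge in EDGES}
LOWEST = min(SIMILARITY.values())
WEAKEST = [edge for edge in EDGES if SIMILARITY[edge] == LOWEST]
print(SIMILARITY)
print("edges to remove first:", WEAKEST)
```

In this toy graph the two edges joining the groups have the lowest similarity, so a divisive step that removes them first separates the two groups.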
slide-17
SLIDE 17

[Figure: the network with each of its 27 edges labeled by a similarity value (two edges labeled 4, sixteen labeled 3, nine labeled 2)]

slide-18
SLIDE 18

[Figure: the network with a subset of edges labeled; labels shown: 2, 3, 3, 3, 3, 3, 3, 3, 2]

slide-19
SLIDE 19

[Figure: the network with a subset of edges labeled; labels shown: 2, 3, 3, 3, 3]

slide-20
SLIDE 20

[Figure: the network with a subset of edges labeled; labels shown: 3, 3, 3, 3, 3]

slide-21
SLIDE 21

[Figure: the network with a subset of edges labeled; labels shown: 2, 3, 3, 3, 2]

slide-22
SLIDE 22

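[Figure: the network with vertices numbered 1 to 16]

\[
L =
\begin{bmatrix}
 3 & -1 & \cdots & & \\
-1 &  3 & -1 & \cdots & \\
\cdots & -1 & 4 & \cdots & \\
\cdots & & -1 & 3 & \cdots \\
\vdots & & & \ddots & \vdots \\
\cdots & & & \cdots & 2
\end{bmatrix}
\]

\[
\text{cut size} = \frac{1}{4}\, s^{T} L s
\]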
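The relation between the cut size and the matrix L can be checked numerically. The sketch below assumes that L is the graph Laplacian (vertex degrees on the diagonal, −1 for each pair of adjacent vertices), which matches the entries visible on the slide, and it uses a hypothetical 8-vertex graph because the full 16 × 16 matrix is not recoverable from the slides.

```python
# Hypothetical 8-vertex graph (NOT the network from the slides):
# two dense groups {0,1,2,3} and {4,5,6,7} joined by two edges.
EDGES = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
    (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
    (3, 4), (2, 5),
]
N = 8


def laplacian(n, edges):
    """Graph Laplacian: degree on the diagonal, -1 for each adjacent pair."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L


def cut_size_from_laplacian(L, s):
    """Evaluate (1/4) s^T L s for a vector s with entries +1 or -1."""
    n = len(s)
    quadratic_form = sum(s[i] * L[i][j] * s[j]
                         for i in range(n) for j in range(n))
    # For +1/-1 vectors the quadratic form is always a multiple of 4.
    return quadratic_form // 4


def cut_size_by_counting(edges, s):
    """Directly count edges whose endpoints carry different signs."""
    return sum(s[u] != s[v] for u, v in edges)


L = laplacian(N, EDGES)
GOOD_SPLIT = [1, 1, 1, 1, -1, -1, -1, -1]   # one sign per dense group
POOR_SPLIT = [1, 1, -1, -1, 1, 1, -1, -1]   # each dense group cut in half
for s in (GOOD_SPLIT, POOR_SPLIT):
    print(cut_size_from_laplacian(L, s), cut_size_by_counting(EDGES, s))
```

Both ways of computing the cut size agree for each split, and the split that follows the two dense groups gives the smaller value.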

slide-23
SLIDE 23

[Figure: the network with vertices numbered 1 to 16]

\[
\text{cut size} = \frac{1}{4}\, s^{T} L s
\]

\[
s = (s_1, \dots, s_{16})^{T}, \qquad s_i \in \{+1, -1\}
\]

(eight entries of s are +1 and eight are −1, one sign per group)

slide-24
SLIDE 24

5

slide-25
SLIDE 25
slide-26
SLIDE 26

[Figure: the network with each of its 27 edges labeled by a similarity value (two edges labeled 4, sixteen labeled 3, nine labeled 2)]

slide-27
SLIDE 27

[Figure: the network with a subset of edges labeled; labels shown: 2, 3, 3, 3, 3]

Not good enough!

slide-28
SLIDE 28

Conclusions

  • Graph partitioning provides a basic idea for community detection
  • The concept of similarity carries over to hierarchical, partitional and spectral clustering
  • We've realized that community detection can be used for the clustering of protein databases if the similarity is replaced by a structural score (TM-score or RMSD, etc.)
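One possible way to act on the last point is to turn pairwise scores into a graph and then cluster that graph. The sketch below is only an illustration under stated assumptions: the protein names `P1` to `P5` and all score values are made up, the 0.5 cutoff is an assumed threshold, and connected components stand in for whichever community detection method would actually be used.

```python
# Hypothetical pairwise TM-scores between five made-up structures
# (illustrative values only, not real data).
SCORES = {
    ("P1", "P2"): 0.82,
    ("P1", "P3"): 0.64,
    ("P2", "P3"): 0.71,
    ("P3", "P4"): 0.31,
    ("P4", "P5"): 0.77,
    ("P1", "P5"): 0.22,
}
TM_THRESHOLD = 0.5  # assumed cutoff for treating two structures as similar


def similarity_edges(scores, threshold):
    """Keep only the pairs whose score reaches the threshold."""
    return [pair for pair, score in scores.items() if score >= threshold]


def connected_components(edges, vertices):
    """Group vertices joined by the kept edges (depth-first search)."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, clusters = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        stack, cluster = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            cluster.append(u)
            for w in neighbors[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        clusters.append(sorted(cluster))
    return clusters


VERTICES = {name for pair in SCORES for name in pair}
KEPT_EDGES = similarity_edges(SCORES, TM_THRESHOLD)
CLUSTERS = connected_components(KEPT_EDGES, VERTICES)
print(CLUSTERS)
```

With RMSD the comparison would flip, because a lower RMSD means more similar structures, so pairs would be kept when the value falls below a chosen cutoff.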
slide-29
SLIDE 29