
Finding representatives in a heterogeneous network, by Laura Langohr



  1. Finding representatives in a heterogeneous network
     Laura Langohr, Department of Computer Science, University of Helsinki
     May 19, 2009

  2. Outline
     • Introduction
     • K-medoids
     • Experiments
     • Future Work
     • Conclusion

  3. Motivation
     • Finding representative vertices
     • Given a list of 100 vertices
     • But only resources to study 10 vertices
     • Cluster 100 vertices in 10 clusters
     • For each cluster suggest a vertex as representative

  4. Example graph
     [Figure: weighted graph on vertices A, B, C, D, E, F; edge weights between 0.51 and 0.9]

  5. K-medoids
     • Clustering method
     • Objects are partitioned into k clusters
     • First, an initial partitioning is created
     • The partition is then iteratively improved
     • Cluster centers are objects → medoids

  6. Algorithm
     1. K objects are randomly chosen as medoids
     2. Assign each remaining object to the nearest medoid
     3. Calculate a new medoid for each cluster
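
One pass of steps 2 and 3 might look like the following minimal sketch. It assumes a caller-supplied distance function; the helper names `assign` and `update_medoids` and the toy 1-D data are illustrative and do not come from the slides. The random choice in step 1 is replaced by a fixed pair so the result is reproducible.

```python
def assign(objects, medoids, dist):
    """Step 2: map each medoid to the list of objects nearest to it.

    Medoids are themselves in `objects`; each lands in its own cluster
    because its distance to itself is zero.
    """
    clusters = {m: [] for m in medoids}
    for obj in objects:
        nearest = min(medoids, key=lambda m: dist(obj, m))
        clusters[nearest].append(obj)
    return clusters


def update_medoids(clusters, dist):
    """Step 3: per cluster, pick the member with the smallest total distance to the others."""
    return [
        min(members, key=lambda c: sum(dist(c, o) for o in members))
        for members in clusters.values()
    ]


# Toy data (assumption): points on a line, absolute difference as distance.
points = [1, 2, 3, 10, 11, 12]
dist = lambda a, b: abs(a - b)

first_pass = update_medoids(assign(points, [1, 2], dist), dist)
second_pass = update_medoids(assign(points, first_pass, dist), dist)
```

Starting from the poor initial choice of 1 and 2, the first pass moves one medoid into the right-hand group, and the second pass settles on the middle point of each group.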

  7. K-means
     • K-medoids is similar to k-means
     • K-means uses the mean value as cluster center
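
The difference between the two kinds of cluster center can be illustrated with a small sketch. The values below are made up for illustration, and absolute difference is assumed as the distance.

```python
from statistics import mean

cluster = [1, 2, 3, 4, 100]  # toy values (assumption)

# k-means style center: the arithmetic mean, which need not be a member of the cluster.
center_mean = mean(cluster)

# k-medoids style center: the member with the smallest total distance to all members.
center_medoid = min(cluster, key=lambda c: sum(abs(c - o) for o in cluster))
```

Here the mean is a value that does not occur in the data, whereas the medoid is always one of the objects, which is what makes it usable as a representative.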

  8. K-medoids vs k-means

  9. K-medoids in a heterogeneous network
     • Select few representatives from a large set of vertices
     • Representatives should be independent of each other
     • Relations between two vertices in a graph → link
     • Including undiscovered relations
     • Undiscovered relations are manifested as path(s)

  10. Measure for link strength
      • Probability of a path is the product of the probabilities of the edges along the path: $g(p) = \prod_{i=1}^{k} w(e_i)$
      • Probability of the best path between two vertices: $P_{bp}(G, o, o') = \max_{p \in Pa(G, o, o')} g(p)$
      [Figure: graph on vertices A, B, C, D, E, F with edge weights 0.54, 0.62, 0.51, 0.55, 0.83, 0.71]
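
The best-path probability can be computed with a Dijkstra-style search, because multiplying by edge probabilities in (0, 1] never increases a path's value. This is a sketch under that assumption; the slides do not say how the measure is actually computed. The function name, the adjacency-dict representation, and the three-vertex toy graph are all hypothetical.

```python
import heapq


def best_path_probability(graph, source, target):
    """Return the maximum over all source-target paths of the product of edge probabilities.

    `graph` maps each vertex to a dict {neighbor: probability}, with
    probabilities in (0, 1]. Returns 0.0 if `target` is unreachable.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # negated keys turn heapq's min-heap into a max-heap
    while heap:
        neg_p, u = heapq.heappop(heap)
        p = -neg_p
        if u == target:
            return p
        if p < best.get(u, 0.0):
            continue  # stale heap entry, a better path to u was already found
        for v, w in graph.get(u, {}).items():
            q = p * w
            if q > best.get(v, 0.0):
                best[v] = q
                heapq.heappush(heap, (-q, v))
    return 0.0


# Toy undirected graph (assumption, not the graph from the slide).
toy = {
    "A": {"B": 0.9, "C": 0.5},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"A": 0.5, "B": 0.8},
}
p_ac = best_path_probability(toy, "A", "C")
```

In the toy graph the two-edge path A, B, C has probability 0.9 × 0.8 = 0.72, which beats the direct edge A, C with probability 0.5.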

  11. Algorithm
      1. Calculate similarity matrix
      2. Choose k objects randomly as initial medoids
      3. Assign each remaining object to the most similar medoid
      4. Calculate new medoid for each cluster: $medoid(C_j) = \operatorname{argmax}_{o \in C_j} \sum_{o' \in C_j,\, o' \neq o} P_{bp}(G, o, o')$
      Repeat steps 3 and 4 until the clustering converges
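
Steps 2 to 4 and the convergence loop can be sketched as follows, assuming step 1 has already produced a symmetric similarity matrix (for instance of best-path probabilities). The function name `k_medoids`, the `seed` and `max_iter` parameters, and the matrix values are illustrative assumptions, not taken from the slides.

```python
import random


def k_medoids(sim, k, seed=0, max_iter=100):
    """Cluster objects 0..n-1 given a symmetric similarity matrix `sim`.

    Returns (medoids, clusters), where clusters maps each medoid to its members.
    """
    n = len(sim)

    def assign(medoids):
        # Step 3: each non-medoid object joins its most similar medoid.
        clusters = {m: [m] for m in medoids}
        for o in range(n):
            if o not in clusters:
                clusters[max(medoids, key=lambda m: sim[o][m])].append(o)
        return clusters

    medoids = sorted(random.Random(seed).sample(range(n), k))  # step 2
    for _ in range(max_iter):
        # Step 4: new medoid = member with the largest summed similarity to the others.
        new_medoids = sorted(
            max(members, key=lambda o: sum(sim[o][p] for p in members if p != o))
            for members in assign(medoids).values()
        )
        if new_medoids == medoids:  # clustering has converged
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Made-up similarities: two groups {0, 1, 2} and {3, 4, 5}, weakly linked to each other.
sim = [
    [1.0, 0.9, 0.7, 0.1, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.1, 0.1],
    [0.7, 0.9, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9, 0.7],
    [0.1, 0.1, 0.1, 0.9, 1.0, 0.9],
    [0.1, 0.1, 0.1, 0.7, 0.9, 1.0],
]
medoids, clusters = k_medoids(sim, k=2)
```

On this toy matrix the objects 1 and 4 are the most similar to the rest of their groups, so they end up as the two representatives.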

  12. Biomine
      • 12 biological databases are integrated
      • Over 1 million vertices
      • Over 9 million edges
      [Figure: Biomine subgraph on vertices Gene:434, Pathway:04916, Gene:4157, Gene:4948, Phenotype:203200, Gene:7299 with edge weights 0.54, 0.62, 0.51, 0.55, 0.83, 0.71]
      http://biomine.cs.helsinki.fi

  13. Artificial example
      • Three phenotypes, for each three genes
      • k-medoids with nine genes, and k = 3

  14. Result

  15. Future Work
      • Hierarchical clustering
      • Statistical evaluation
      • Comparison to an existing method

  16. Conclusion
      • Finding representative vertices, e.g. genes
      • K-medoids on Biomine
      • Example with nine genes is promising
