Growing a Graph Matching from a Handful of Seeds


Ehsan KAZEMI¹, S. Hamed HASSANI², and Matthias GROSSGLAUSER¹. ¹School of Computer and Communication Sciences, EPFL; ²Department of Computer Science, ETHZ. September 1, 2015.


  1. Growing a Graph Matching from a Handful of Seeds. Ehsan KAZEMI¹, S. Hamed HASSANI², and Matthias GROSSGLAUSER¹. ¹School of Computer and Communication Sciences, EPFL; ²Department of Computer Science, ETHZ. September 1, 2015.

  2. Motivation. Example 1: network de-anonymization. [Figure: an anonymized e-mail network and LinkedIn connections; node labels x@epfl.ch, y@epfl.ch, z@epfl.ch, Ehsan@epfl.ch, Hamed@epfl.ch, Matthias@epfl.ch.] Example 2: protein-protein interaction network alignment. [Figure: a human network and a mouse network; nodes are labelled with protein identifiers such as P04637 and P55957.]

  3. Motivation. Graph matching (also known as network reconciliation or network alignment) is studied in many fields. Network analysis: matching networks in similar domains for friend suggestion and personalized advertisements. Bioinformatics: alignment of protein-protein interaction networks. Document and image processing: OCR and handwriting recognition. Biometric identification: face authentication and recognition. Image databases: matching graph segments of two scenes. [Figure: matching graph segments of scenes (Lazebnik et al., 2006).]

  4. What is Graph Matching? Goal: find the unknown matching (bijection) between the nodes in the intersection of two graphs $G_1(V_1, E_1)$ and $G_2(V_2, E_2)$, where the presence of an edge between the same pair of nodes in the two graphs is correlated. Questions: When is it possible to align? How to align (graph matching algorithms)? Is it possible to use only the graph structures to establish the true matching between the nodes?

  5. Algorithm, Model and Performance Guarantee. Algorithm: percolation graph matching [Yartseva and Grossglauser, 2013; Chiasserini et al., 2014; Korula and Lattanzi, 2014]. Model: a random bigraph generator [Pedarsani and Grossglauser, 2011; Kazemi et al., 2015]. Performance guarantee: theory of bootstrap percolation over random graphs [Janson et al., 2010].

  6. Percolation Graph Matching. Start from an initial candidate set of seed pairs. Every non-matched pair with at least r neighbouring seed pairs gets matched and becomes a new seed.


  10. Percolation Graph Matching (continued). [Figure: size of the final matching vs. number of initial seeds.]
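The percolation rule on these slides can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `percolation_graph_matching`, the adjacency-dict representation, and the toy graph `toy_adj` are assumptions made for the example. Seeds are treated as already matched, and a pair is matched once it has collected at least r marks.

```python
from collections import defaultdict
from itertools import product


def percolation_graph_matching(adj1, adj2, seeds, r=2):
    """Sketch of percolation graph matching (hypothetical helper).

    adj1, adj2: dict mapping each node to the set of its neighbours.
    seeds: iterable of (u, v) pairs, u in graph 1 and v in graph 2.
    Returns a dict u -> v with the final matching.
    """
    matched1, matched2 = {}, {}      # u -> v and v -> u
    marks = defaultdict(int)         # candidate pair -> marks received so far
    queue = []                       # matched pairs that still have to spread marks
    for u, v in seeds:
        if u not in matched1 and v not in matched2:
            matched1[u], matched2[v] = v, u
            queue.append((u, v))
    while queue:
        u, v = queue.pop()
        # A matched pair gives one mark to each of its neighbouring pairs.
        for a, b in product(adj1[u], adj2[v]):
            if a in matched1 or b in matched2:
                continue
            marks[(a, b)] += 1
            if marks[(a, b)] >= r:   # enough marks: match it, it becomes a new seed
                matched1[a], matched2[b] = b, a
                queue.append((a, b))
    return matched1


# Toy graph (an assumption for illustration): every node after the first two
# is attached to two earlier nodes.
toy_adj = {
    0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3, 4},
    3: {1, 2, 4, 5}, 4: {2, 3, 5}, 5: {3, 4},
}
two_seeds = percolation_graph_matching(toy_adj, toy_adj, [(0, 0), (1, 1)])
one_seed = percolation_graph_matching(toy_adj, toy_adj, [(0, 0)])
```

On this toy graph, matched against an identical copy, two seeds percolate to all six nodes, whereas a single seed can give each neighbouring pair only one mark, so the process stops at the seed.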

  11. $\mathrm{Bi}(G; t, s)$: A Random Bigraph Model. $\mathrm{Bi}(G; t, s)$ is a random bigraph model that generates two correlated graphs from a generator graph $G(V, E)$: node sampling followed by edge sampling yields $G_1(V_1, E_1)$ and $G_2(V_2, E_2)$.
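A sampling procedure of this kind might look as follows. The sketch assumes, as the roles of the two parameters suggest, that each node of G is kept independently with probability t and each edge between kept nodes independently with probability s, separately for each of the two graphs. The function names `sample_graph` and `bigraph` and the `seed` argument are hypothetical.

```python
import random


def sample_graph(nodes, edges, t, s, rng):
    """Keep each node with probability t, then each surviving edge with probability s."""
    kept_nodes = {v for v in nodes if rng.random() < t}
    kept_edges = {
        (u, v)
        for (u, v) in edges
        # An edge can survive only if both endpoints were kept.
        if u in kept_nodes and v in kept_nodes and rng.random() < s
    }
    return kept_nodes, kept_edges


def bigraph(nodes, edges, t, s, seed=0):
    """Draw two correlated graphs from the same generator graph (assumed model)."""
    rng = random.Random(seed)        # fixed seed only for reproducibility
    g1 = sample_graph(nodes, edges, t, s, rng)
    g2 = sample_graph(nodes, edges, t, s, rng)
    return g1, g2
```

Under this assumption a node appears in both graphs with probability t², which is consistent with the roughly n t² matchable nodes that appear in the performance guarantee.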

  12. Bootstrap Percolation. Marks are spread over the tensor product of the two graphs. Green nodes are correct pairs ($n$ nodes); red nodes are wrong pairs ($n^2 - n$ nodes). Green nodes are more connected. [Figure: the $n^2$ pairs $(u_i, u_j)$ for $n = 4$, from $(u_1, u_1)$ to $(u_4, u_4)$.]


  15. Bootstrap Percolation: Phase Transition. Supercritical regime: the process percolates to the whole network. Subcritical regime: the process dies young. [Figure: for each regime, seed set, PGM, matched set.]

  16. NoisySeeds Algorithms. State-of-the-art PGM algorithms need many seeds: with even a moderate number of seeds, percolation gets stuck in the early steps. Finding many seeds is difficult and expensive. Observation: PGM is robust to noise. [Figure: the pair graph with $n$ correct pairs and $n^2 - n$ wrong pairs.]


  19. NoisySeeds Algorithms. Adding many wrong pairs to the initial candidate set has a negligible effect on the performance of NoisySeeds. [Figure: seed set, Expand, expanded noisy seed set, NoisySeeds, matched set.]

  20. NoisySeeds: Performance Guarantee. Theorem (performance guarantee over $\mathrm{Bi}(G(n, p); t, s)$). For $\mathrm{Bi}(G(n, p); t, s)$ with fixed $s$ and $t$, assume $n^{-1} \ll p \le n^{-\frac{5}{6} - \epsilon}$. Provided a seed set of $a_{t,s,r} = \left(1 - \frac{1}{r}\right) \left( \frac{(r-1)!}{n t^2 (p s^2)^r} \right)^{\frac{1}{r-1}}$ correct pairs and $O(n)$ wrong pairs, with high probability NoisySeeds percolates and outputs $n t^2 \pm o(n)$ correct pairs and $o(n)$ wrong pairs.
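To get a feel for the size of the seed threshold $a_{t,s,r}$, the expression can be evaluated directly. The sketch below is only an illustration: the function name `seed_threshold` is hypothetical, it takes $t^2$ and $s^2$ as inputs, and the parameter values ($n = 10^6$, $p = 20/n$, $t^2 = 1$, $r = 2$) are assumptions chosen to mirror the random-graph experiment in this deck.

```python
import math


def seed_threshold(n, p, t2, s2, r):
    """Evaluate a_{t,s,r} = (1 - 1/r) * ((r-1)! / (n t^2 (p s^2)^r))^(1/(r-1)).

    t2 and s2 are t^2 and s^2; r must be at least 2.
    """
    base = math.factorial(r - 1) / (n * t2 * (p * s2) ** r)
    return (1.0 - 1.0 / r) * base ** (1.0 / (r - 1))


n = 10**6
p = 20 / n          # assumed average degree of 20
thresholds = {s2: seed_threshold(n, p, 1.0, s2, r=2) for s2 in (0.81, 0.64, 0.49)}
seeds_needed = {s2: math.ceil(a) for s2, a in thresholds.items()}
```

Under these assumptions the threshold is about 1905.2, 3051.8, and 5206.2 for $s^2$ = 0.81, 0.64, and 0.49. Rounded up, these are 1906, 3052, and 5207, the seed counts listed for PercolateMatched in the legend of the experiment plot.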

  21. ExpandWhenStuck. A heuristic based on the idea of robustness to noisy pairs. The percolation process is stuck; node $u$ is matched (correctly). [Figure: node $u$ and its neighbours in the two graphs, with node labels $u_1, \dots, u_5$.]

  22. ExpandWhenStuck. The unmatched neighbouring pairs of the node pair $[u, u]$ are new candidate pairs. Because the two graphs are correlated, a small fraction of the new candidate pairs is correct, e.g., $[u_1, u_1]$. PGM is robust to noise in the candidate pairs. [Figure: node $u$ and its neighbours in $G_1$ and $G_2$, with node labels $u_1, \dots, u_5$.]

  23. ExpandWhenStuck. Expand the candidate pairs with many noisy pairs whenever the percolation process gets stuck. [Figure: seed set, NoisySeeds, matched set, Expand, expanded noisy candidate set.]
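The expand-when-stuck idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm as published: the function name `expand_when_stuck`, the adjacency-dict representation, and the toy graph are hypothetical. Treating seeds as already matched, and always picking the unmatched pair with the most marks (ties broken by the smallest degree difference), are choices made for this sketch.

```python
from collections import defaultdict
from itertools import product


def expand_when_stuck(adj1, adj2, seeds, r=2):
    """Sketch: percolate, and whenever stuck, let unmatched neighbouring pairs spread marks."""
    matched1 = dict(seeds)                          # u -> v
    matched2 = {v: u for u, v in matched1.items()}  # v -> u
    marks = defaultdict(int)                        # pair -> marks received
    used = set()                                    # pairs that already spread their marks

    def spread(pair):
        used.add(pair)
        for neighbour_pair in product(adj1[pair[0]], adj2[pair[1]]):
            marks[neighbour_pair] += 1

    def best_unmatched_pair():
        eligible = [pq for pq, m in marks.items()
                    if m >= r and pq[0] not in matched1 and pq[1] not in matched2]
        if not eligible:
            return None
        # Most marks first; ties go to the pair with the most similar degrees.
        return max(eligible, key=lambda pq: (
            marks[pq], -abs(len(adj1[pq[0]]) - len(adj2[pq[1]]))))

    candidates = set(matched1.items())
    while candidates:
        for pair in candidates:                     # candidates may be wrong pairs
            spread(pair)
        while True:                                 # ordinary percolation step
            pair = best_unmatched_pair()
            if pair is None:
                break                               # stuck
            matched1[pair[0]], matched2[pair[1]] = pair[1], pair[0]
            if pair not in used:
                spread(pair)
        # Expand: unmatched, unused neighbouring pairs of matched pairs become candidates.
        candidates = {
            (a, b)
            for u, v in matched1.items()
            for a, b in product(adj1[u], adj2[v])
            if a not in matched1 and b not in matched2 and (a, b) not in used
        }
    return matched1


# Toy graph (an assumption for illustration), matched against an identical copy.
toy_graph = {
    0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3, 4},
    3: {1, 2, 4, 5}, 4: {2, 3, 5}, 5: {3, 4},
}
result = expand_when_stuck(toy_graph, toy_graph, [(0, 0)])
```

With threshold r = 2, a single seed gives each neighbouring pair only one mark, so plain percolation would stop at the seed. In this sketch the four neighbouring pairs of the seed (two correct, two wrong) then spread marks as noisy candidates, the correct pair (3, 3) collects four marks, and the matching grows to all six nodes.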


  29. Experiment 1: Random Graphs. ExpandWhenStuck vs. PercolateMatched [Yartseva and Grossglauser, 2013] over $\mathrm{Bi}(G(n, p); t, s)$ with $n = 10^6$, $p = 20/n$ and $t^2 = 1.0$. [Figure: total number of matched pairs (0 to $10^6$) vs. number of seeds (0 to 50). Legend: PercolateMatched with 1906 seeds for $s^2 = 0.81$, 3052 seeds for $s^2 = 0.64$, and 5207 seeds for $s^2 = 0.49$; ExpandWhenStuck for $s^2 = 0.81$, $s^2 = 0.64$, and $s^2 = 0.49$.] 238 times improvement for $s^2 = 0.81$.
