A Bayesian method for matching two similar graphs without seeds

Pedram Pedarsani (EPFL), Matthias Grossglauser (EPFL), Daniel R. Figueiredo (UFRJ)
IEEE Allerton Conference 2013
Adapted by Daniel R. Figueiredo

Approximate Graph Matching
• Match the nodes of two structurally related graphs
• Can we match the nodes?

Fundamental Questions
• When is approximate graph matching feasible?
  ◦ Assume a graph model (structure)
  ◦ Consider a model for graph similarity
  ◦ Provide conditions for finding the correct matching
• How to match the nodes of two graphs in practice?
  ◦ Polynomial-time algorithm to find the correct matching
  ◦ Settle for a mostly correct matching

Applications
• Computer vision: object recognition
  ◦ match parts of segmented images
• Biology: identifying genes or protein functions
  ◦ match gene regulatory or protein interaction networks
• Social networks: breaching privacy
  ◦ identifying nodes using network structure
• Many applications require matching similar structures

Edge Sampling Model
• Model for graph similarity
• Consider a fixed graph G
  ◦ could be a realization of G(n,p)
• Sample every edge of G independently with probability s
• G1 ~ G(s) and G2 ~ G(s)
  ◦ G1 and G2 are two independent samples of the same graph G
• Structural correlation between G1 and G2 is controlled by the parameter s
  ◦ s = 1: isomorphism problem; s = 0: no structure!
  ◦ preserves nodes; randomness only on edges
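The edge sampling model can be sketched in a few lines of Python. This is only an illustration: the graph `G`, the function name `sample_edges`, and the value s = 0.8 are assumptions chosen for the example, not taken from the talk.

```python
import random

def sample_edges(edges, s, rng):
    """Keep each edge of G independently with probability s (nodes are preserved)."""
    return {e for e in edges if rng.random() < s}

# Hypothetical fixed graph G on nodes 1..6, given as a set of undirected edges.
G = {(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (3, 6)}

rng = random.Random(0)          # seeded only to make the example repeatable
G1 = sample_edges(G, 0.8, rng)  # first independent sample of G
G2 = sample_edges(G, 0.8, rng)  # second independent sample of G
```

With s = 1 both samples equal G, so matching reduces to graph isomorphism; with s = 0 both samples are empty and carry no structure.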

Edge Sampling Example
[Figure: a fixed (or random) graph G on nodes 1 to 6, edge-sampled twice with probability s to produce G1 and G2]
• Problem: match the nodes of G1 and G2
• Q1: When is it possible?
• Q2: How to do it?

Theoretical Formulation
• Consider a mapping π between the nodes of G1 and G2
  ◦ n! possible mappings
• Consider an error function of a mapping
  ◦ ∆(π): number of edges that appear in G1 but not in G2 (and vice versa) under π
• Let π0 be the correct mapping. Find conditions such that
  $P\{\pi_0 \text{ is the unique minimizer of } \Delta(\pi)\} \to 1$
• An adversary can then correctly match using only the structure of the graphs
  ◦ inspect all mappings, choose the one with the lowest error
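The exhaustive adversary described above can be sketched as follows. This is a minimal illustration that is only usable for very small n; the function names and the toy graphs are assumptions made for the example.

```python
from itertools import permutations

def delta(pi, edges1, edges2):
    """Edges present in exactly one graph after relabeling G1 by the mapping pi."""
    mapped = {frozenset((pi[u], pi[v])) for u, v in edges1}
    target = {frozenset(e) for e in edges2}
    return len(mapped ^ target)

def best_mappings(nodes, edges1, edges2):
    """Score all n! mappings; return the minimum error and every mapping attaining it."""
    scored = []
    for image in permutations(nodes):
        pi = dict(zip(nodes, image))
        scored.append((delta(pi, edges1, edges2), pi))
    best = min(score for score, _ in scored)
    return best, [pi for score, pi in scored if score == best]

# Toy example: G2 is G1 relabeled by a hidden mapping pi0 (no edges removed).
nodes = [1, 2, 3, 4]
edges1 = [(1, 2), (2, 3), (3, 4), (2, 4)]
pi0 = {1: 3, 2: 1, 3: 4, 4: 2}
edges2 = [(pi0[u], pi0[v]) for u, v in edges1]

best, minimizers = best_mappings(nodes, edges1, edges2)
```

In this toy graph nodes 3 and 4 are interchangeable, so two mappings reach zero error. That is the kind of ambiguity the uniqueness condition has to rule out.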

Theoretical Result
• Assume fixed G ~ G(n,p)
• Thm [PG'12]: For the G(n,p; s) matching problem, if
  $nps \cdot \frac{s^2}{2-s} = 8 \log n + \omega(1)$
  then the correct permutation minimizes ∆(π), a.a.s.
  ◦ nps: E[degree] of G1, G2
  ◦ s²/(2−s): penalty for the difference between G1 and G2
  ◦ log n: threshold for aut(G)=1
  ◦ ω(1): "growing slowly"
• Two pieces of bad news
  ◦ surprisingly weak condition: average degree of G1, G2 growing faster than log n is sufficient
  ◦ the penalty decreases with s only quadratically
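To get a feel for the condition, read it as nps · s²/(2−s) = 8 log n + ω(1) and solve the leading term for the expected degree nps. The sketch below does only that. It assumes a natural logarithm and ignores the ω(1) term, and since the theorem is asymptotic the numbers are indicative only. The function name is an assumption for illustration.

```python
import math

def min_expected_degree(n, s):
    """Leading term of the condition, solved for nps: 8 log n * (2 - s) / s^2."""
    return 8 * math.log(n) * (2 - s) / s**2

# Required expected degree of G1, G2 for a hypothetical n and a few values of s.
for s in (1.0, 0.8, 0.5):
    print(s, round(min_expected_degree(2024, s), 1))
```

At s = 1 the penalty factor is 1, and at s = 0.5 it is (2 − 0.5)/0.25 = 6, so the required expected degree is six times larger.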

But in Practice?
• The previous result is theoretical
  ◦ assumes unconstrained computational power (n! mappings)
  ◦ does not help us find the right mapping
• Idea: Bayesian framework based on fingerprints of nodes
• Compute the confidence of pairwise matchings
• Reduce to a maximum weighted bipartite matching problem
• Iterative and incremental algorithm (produces evidence on the run)

Using Structural Evidence
• P[U1 = U2]: probability that two nodes chosen at random (U1 from G1, U2 from G2) correspond
  ◦ 1/n if there is no other information
• What if the degrees are D1 = 100, D2 = 97?
• What if the degrees are D1 = 100, D2 = 2?
• Use degree as structural evidence

Distances as Evidence
• Suppose s1 is mapped to s2
  ◦ (s1, s2): anchor pair
• X11: distance between U1 and s1
• X21: distance between U2 and s2
• Will consider multiple anchor pairs
• An anchor pair match can be wrong!

Evidence Probability
• Consider the fingerprints of nodes U1 and U2
  ◦ F_U1 = (D1, X11, X12, ..., X1s), with s anchor pairs (distances)
  ◦ F_U2 = (D2, X21, X22, ..., X2s)
  ◦ X1i, X2i: distances from U1 and U2 to anchor pair i
• Probability of observing these fingerprints given U1 = U2 (the nodes correspond to one another)
  ◦ P[F_U1, F_U2 | U1 = U2]
• Assume conditional independence between evidence pairs
  ◦ = P[D1, D2 | U1=U2] P[X11, X21 | U1=U2] ... P[X1s, X2s | U1=U2]
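A fingerprint of this form (degree plus distances to the anchors) can be computed with breadth-first search. The sketch below is illustrative only: the adjacency-list representation, the function names, and the choice of infinity for an unreachable anchor are assumptions, not details from the talk.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def fingerprint(adj, node, anchors):
    """(degree, distance to anchor 1, ..., distance to anchor s) for one node."""
    dists = []
    for a in anchors:
        d = bfs_distances(adj, a)
        # Assumption: an unreachable anchor is recorded as infinite distance.
        dists.append(d.get(node, float("inf")))
    return (len(adj[node]), *dists)

# Hypothetical graph: path 1-2-3-4 plus an isolated node 5.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3], 5: []}
fp = fingerprint(adj, 2, anchors=[1, 4])  # degree 2, one hop from 1, two hops from 4
```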

Evidence Probability
• How to calculate P[D1, D2 | U1=U2] or P[X11, X21 | U1=U2]?
• Need a sampling model and a prior distribution
• Consider a fixed but hidden G
  ◦ assume we know its degree and distance distributions
• Edge sampling model to generate G1 and G2
  ◦ each edge in G is sampled iid with probability s
• Can now compute P[D1, D2 | U1=U2]
  ◦ P[D1, D2 | U1=U2, D] is a product of binomials with parameters D, s and values D1 and D2
  ◦ uncondition on D by using the prior of G
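The degree term can be written out directly: conditioned on the hidden degree D, the two observed degrees are independent Binomial(D, s) draws, and D is then summed out against its prior. The sketch below assumes the prior is given as a dictionary, and the prior values are made up for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P[Binomial(n, p) = k]; zero outside the support."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def degree_likelihood_match(d1, d2, s, degree_prior):
    """P[D1=d1, D2=d2 | U1=U2], summing over the hidden degree D in G."""
    return sum(
        prior * binom_pmf(d1, d, s) * binom_pmf(d2, d, s)
        for d, prior in degree_prior.items()
    )

# Hypothetical prior over the degree D of the node in G.
degree_prior = {2: 0.5, 4: 0.5}
close = degree_likelihood_match(2, 2, 0.8, degree_prior)  # similar degrees
far = degree_likelihood_match(4, 0, 0.8, degree_prior)    # very different degrees
```

Similar observed degrees receive a much larger likelihood than very different ones, which is the intuition behind using degree as structural evidence.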

Match Probability
• Same reasoning for P[F_U1, F_U2 | U1 != U2]
  ◦ when nodes U1 and U2 do not correspond
• Use both, together with the prior P[U1 = U2] = 1/n
• Apply Bayes' rule to obtain the probability of a match given the fingerprints: P[U1 = U2 | FP1, FP2]
• Mi: indicator that anchor pair i is correctly mapped
  ◦ P[Mi = 1]: probability that anchor pair i is correctly mapped
  ◦ P[U1 = U2 | FP1, FP2, M1, ..., Ms]
  ◦ use priors to marginalize out Mi
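The Bayes step itself is short. The sketch below takes the two fingerprint likelihoods as given numbers and combines them with the 1/n prior. It deliberately leaves out the marginalization over the anchor indicators Mi, and the function name is an assumption for illustration.

```python
def match_posterior(lik_match, lik_nonmatch, n):
    """P[U1 = U2 | fingerprints] from the two likelihoods and the prior 1/n."""
    prior = 1.0 / n
    numerator = lik_match * prior
    evidence = numerator + lik_nonmatch * (1.0 - prior)
    # Assumption: if both likelihoods are zero, report probability zero.
    return numerator / evidence if evidence > 0 else 0.0

# Uninformative evidence leaves the prior unchanged.
p_flat = match_posterior(1.0, 1.0, 10)
# Evidence 1000 times likelier under a match pulls the posterior well above 1/n.
p_strong = match_posterior(0.2, 0.0002, 10)
```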

Weighted Bipartite Matching
• Complete bipartite graph: nodes in G1 on one side, nodes in G2 on the other
• Weight of edge (U1, U2) = log P[U1 = U2 | FP1, FP2]
• Assuming independence, P[all matched pairs | all evidence] = Π P[matched pair | evidence pair]
  ◦ weight of the maximum weight matching = log of the probability of the most probable matching
• Compute the maximum weight matching
  ◦ Hungarian algorithm, O(n^3)
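The reduction can be illustrated on a tiny example. Note that the sketch below is a brute-force search over all assignments, used as a stand-in for the Hungarian algorithm so that it stays self-contained. The posterior matrix is made up for illustration.

```python
import math
from itertools import permutations

def max_weight_matching(post):
    """post[i][j] = P[node i of G1 matches node j of G2].

    Returns the assignment maximizing the sum of log-probabilities, with its weight.
    Brute force over all n! assignments: only for small illustrative inputs.
    """
    n = len(post)

    def total(assign):
        return sum(
            math.log(post[i][j]) if post[i][j] > 0 else -math.inf
            for i, j in enumerate(assign)
        )

    best = max(permutations(range(n)), key=total)
    return list(best), total(best)

# Hypothetical pairwise match probabilities for 3 candidate nodes per graph.
post = [
    [0.7, 0.2, 0.1],
    [0.3, 0.6, 0.1],
    [0.1, 0.1, 0.8],
]
assignment, weight = max_weight_matching(post)
```

Summing log-weights is the same as multiplying the pairwise probabilities, so exponentiating the returned weight gives the probability of the chosen matching under the independence assumption.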

The Algorithm
• Idea: generate and use evidence on the run
  ◦ allows the matching to change
• The algorithm proceeds in phases
  ◦ in phase i, consider 2^i nodes to match
  ◦ the bipartite graph has only 2^i nodes
• Candidate nodes in phase i are the highest-degree nodes of each graph
• Use half of the matched nodes as anchors for the next phase
  ◦ best half: matches with the highest edge weight
• In phase i > 1, we use 2^(i-2) seeds as evidence
  ◦ the edge weight from phase i-1 is used as the prior that a seed is correctly matched in phase i
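The bookkeeping of the phases can be sketched as below. It covers only the schedule (how many candidates and seeds each phase involves) and the choice of candidates by degree, not the matching itself. The function names and the tie-breaking rule are assumptions made for the example.

```python
def phase_schedule(num_phases):
    """(candidates, seeds used, seeds produced) for phases 1..num_phases."""
    schedule = []
    for i in range(1, num_phases + 1):
        candidates = 2 ** i                        # 2^i nodes considered in phase i
        seeds_used = 2 ** (i - 2) if i > 1 else 0  # no seeds exist in phase 1
        seeds_produced = candidates // 2           # best half of the matches
        schedule.append((candidates, seeds_used, seeds_produced))
    return schedule

def top_degree_nodes(degrees, k):
    """The k highest-degree nodes; ties broken by node label (an assumption)."""
    return sorted(degrees, key=lambda v: (-degrees[v], v))[:k]

schedule = phase_schedule(3)
candidates = top_degree_nodes({"a": 3, "b": 5, "c": 1, "d": 5}, 2)
```

The first three phases give 2, 4, and 8 candidates with 0, 1, and 2 seeds used.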

Illustration of Algorithm
• Phase 1: 2 candidates, 0 seeds used, 1 seed produced
• Phase 2: 4 candidates, 1 seed used, 2 seeds produced
• Phase 3: 8 candidates, 2 seeds used, 4 seeds produced
• ...
[Figure: candidate nodes listed in decreasing degree. Green: correct; red: incorrect; thick: highest weight]

Evaluation
• Email exchange network among EPFL users
  ◦ social network, weekly timescale
• Experiment 1:
  ◦ accumulate the network for 5 weeks (2024 nodes, 25K edges)
  ◦ edge-sample the network twice for different values of s
• Experiment 2:
  ◦ accumulate the network for 10 weeks (considering only nodes that appear in all weeks)
  ◦ time-shifted accumulation gives the second network, with an overlap of 9, 8, ..., 1 weeks
  ◦ no explicit edge sampling; s is estimated from the dataset based on overlapping edges

Evaluation: Experiment 1
[Figure: run-time performance for different samples. x-axis: expected fraction of edges that appear in both G1 and G2]
• 90% error if overlap is 50%
• 5% error if overlap is 80%

Evaluation: Experiment 2
[Figure: axis labels "Time overlap" and "Expected fraction of edges that appear in both G1 and G2"]
• Results can be very good!
• Results indicate a sharp transition in edge overlap

Conclusions
• Network privacy seems hard
  ◦ in theory and in practice!
  ◦ two networks matched using only structure (no side information)
  ◦ conditions on average degree and edge overlap are not unrealistic
• Principled graph matching algorithm
  ◦ the sampling model allows for a Bayesian formulation and bipartite matching
  ◦ incremental and iterative approach: generate and use more evidence, with uncertainty
  ◦ performance is good if above the threshold

Thank You
• Questions or comments?
• Contact: daniel@land.ufrj.br
• Collaborators: Matthias Grossglauser, Pedram Pedarsani
