(Nearly) Efficient Algorithms for the Graph Matching Problem - PowerPoint PPT Presentation



SLIDE 1

(Nearly) Efficient Algorithms for the Graph Matching Problem

Tselil Schramm (Harvard/MIT)

with Boaz Barak, Chi-Ning Chou, Zhixian Lei & Yueqi Sheng (Harvard)

SLIDE 2

graph matching problem (approximate graph isomorphism)

input: two graphs G_0, G_1 on n vertices

goal: find the permutation of vertices that maximizes the # of shared edges:

max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩

(ρ permutes the rows and columns of the adjacency matrix A_{G_1})

SLIDE 3

graph matching problem (approximate graph isomorphism)

input: two graphs G_0, G_1 on n vertices

goal: find the permutation of vertices that maximizes the # of shared edges: max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩

# matched = 4

SLIDE 4

graph matching problem (approximate graph isomorphism)

input: two graphs G_0, G_1 on n vertices

goal: find the permutation of vertices that maximizes the # of shared edges: max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩

SLIDE 5

graph matching problem (approximate graph isomorphism)

input: two graphs G_0, G_1 on n vertices

goal: find the permutation of vertices that maximizes the # of shared edges: max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩

# matched = 5

SLIDE 6

computationally hard (of course)

NP-hard: reduction from the quadratic assignment problem (non-simple graphs). [Lawler’63]

also: reduction from sparse random 3-SAT to the approximate version. [O’Donnell-Wright-Wu-Zhou’14]

SLIDE 7

practitioners: undeterred

- computational biology [e.g. Singh-Xu-Berger’08]
- de-anonymization [e.g. Narayanan-Shmatikov’09]
- social networks [e.g. Korula-Lattanzi’14]
- image alignment [e.g. Cho-Lee’12]
- machine learning [e.g. Cour-Srinivasan-Shi’07]
- pattern recognition, e.g. β€œthirty years of graph matching in pattern recognition” [Conte-Foggia-Sansone-Vento’04]

SLIDE 8

average case: correlated random graphs

structured model: sample G ∼ G(n, p); keep each edge of G independently w/prob Ξ΄, twice, to get G_0 and G_1; apply a random permutation ρ to G_1.

- avg. degree pΞ΄ Β· n
- max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩ β‰ˆ pΞ΄Β² Β· (n choose 2)

β€œrobust average-case graph isomorphism”

[e.g. Pedarsani-Grossglauser’11, Lyzinski-Fishkind-Priebe’14, Korula-Lattanzi’14]

SLIDE 9

average case: correlated random graphs

structured model: sample G ∼ G(n, p); subsample edges w/prob Ξ΄ twice to get G_0, G_1; apply a random permutation ρ to G_1.

- avg. degree pΞ΄ Β· n

vs. a β€œnull” model with β‰ˆ (pΞ΄)Β² Β· (n choose 2) shared edges

β€œrobust average-case graph isomorphism”

SLIDE 10

average case: correlated random graphs

structured model: sample G ∼ G(n, p); subsample edges w/prob Ξ΄ twice to get G_0, G_1; apply a random permutation ρ to G_1.
- avg. degree pΞ΄ Β· n
- max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩ β‰ˆ pΞ΄Β² Β· (n choose 2)

β€œnull” model: sample G_0, G_1 ∼ G(n, pΞ΄) independently.
- avg. degree pΞ΄ Β· n
- max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩ β‰ˆ (pΞ΄)Β² Β· (n choose 2)

β€œrobust average-case graph isomorphism”
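Both sampling models are easy to simulate at toy scale. A minimal Python sketch of the two distributions, with hypothetical helper names (`sample_structured`, `sample_null`) that are ours, not the talk's:

```python
import random

def sample_structured(n, p, delta, rng):
    # Structured model: G ~ G(n, p); keep each edge of G independently
    # with probability delta, twice, to get G0 and G1; then relabel G1
    # by a uniformly random permutation rho.
    base = {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p}
    g0 = {e for e in base if rng.random() < delta}
    g1 = {e for e in base if rng.random() < delta}
    rho = list(range(n))
    rng.shuffle(rho)
    g1 = {tuple(sorted((rho[u], rho[v]))) for (u, v) in g1}
    return g0, g1, rho

def sample_null(n, p, delta, rng):
    # Null model: G0, G1 ~ G(n, p*delta), independent.
    q = p * delta
    def draw():
        return {(i, j) for i in range(n) for j in range(i + 1, n)
                if rng.random() < q}
    return draw(), draw()
```

Edges are stored as sorted pairs so that the two models produce directly comparable edge sets; both give average degree β‰ˆ pΞ΄ Β· n, matching the slide.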

SLIDE 11

information theoretic limit

Theorem [Cullina-Kiyavash’16&’17]: Iff pΞ΄Β² > log n / n, then with high probability ρ is the unique maximizing permutation.

for which p, Ξ΄ can we recover ρ?

SLIDE 12

algorithms for robust average case?

average-case graph isomorphism algorithms fail. e.g. matching local neighborhoods: match radius-r neighborhoods?

SLIDE 13

algorithms for robust average case?

average-case graph isomorphism algorithms fail. e.g. matching local neighborhoods: match radius-r neighborhoods? (the Ξ΄-subsampling perturbs the neighborhoods)

SLIDE 14

algorithms for robust average case?

average-case graph isomorphism algorithms fail. e.g. spectral algorithm: do unique entries in the top eigenvector v_max give the isomorphism?

SLIDE 15

algorithms for robust average case?

average-case graph isomorphism algorithms fail. e.g. spectral algorithm: do unique entries in the top eigenvector v_max give the isomorphism? the subsampling perturbs the eigenvectors by β‰ˆ Ξ΄.

SLIDE 16

actual algorithms for robust average case?

SLIDE 17

starting from a seed

ρ|_T known: the restriction of ρ to a seed set T is given, i.e. we know ρ(T).

SLIDE 18

starting from a seed

given the seed set T, match vertices v, w with similar adjacency into T.

SLIDE 19

starting from a seed

match vertices with similar adjacency into T: conclude ρ(v) = w.

if the seed has size β‰₯ Ξ©(n^Ξ΅), the seeded algorithm approximately recovers ρ. [Yartseva-Grossglauser’13]

but we need 2^{Γ•(n^Ξ΅)} time to guess a seed.
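A toy version of the seeded strategy can be sketched as a greedy loop: repeatedly match the pair of unmatched vertices that agrees with the current partial matching on the most neighbors. This is an illustrative sketch in the spirit of seeded percolation matching, with helper names of our choosing; it is not the algorithm analyzed on the slide:

```python
from itertools import product

def adj_from_edges(edges, n):
    # Build an adjacency dict {vertex: set of neighbors}.
    a = {v: set() for v in range(n)}
    for u, v in edges:
        a[u].add(v)
        a[v].add(u)
    return a

def seeded_match(adj0, adj1, seed):
    # Greedily extend the partial matching: at each step, match the
    # pair (v, w) of unmatched vertices supported by the most already-
    # matched neighbor pairs; stop when no pair has positive support.
    match = dict(seed)
    while True:
        best, best_score = None, 0
        used = set(match.values())
        for v, w in product(adj0, adj1):
            if v in match or w in used:
                continue
            score = sum(1 for u in adj0[v]
                        if u in match and match[u] in adj1[w])
            if score > best_score:
                best, best_score = (v, w), score
        if best is None:
            return match
        match[best[0]] = best[1]
```

On a path graph with a two-vertex seed, the greedy loop peels off the remaining vertices one by one; on correlated random graphs one would additionally need tie-breaking and error tolerance.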

SLIDE 20

our results

Theorem: For any Ξ΅ > 0, if p ∈ [n^{o(1)}/n, n^{1/153}/n] βˆͺ [n^{2/3}/n, n^{1βˆ’Ξ΅}/n] and Ξ΄ = Ξ©(1),* there is an n^{O(log n)}-time algorithm that recovers ρ on n βˆ’ o(n) of the vertices w/prob β‰₯ 0.99.

*we allow Ξ΄ = Ξ©(1/loglog n)

(figure: average-degree axis n^{o(1)}, n^{1/153}, n^{1/3}, n^{1/2}, n^{3/5}, n^{2/3}, n^{1βˆ’Ξ΅}, n/log n, with the covered ranges of the structured model G(n, p) highlighted)

SLIDE 21

our results

Theorem: For any Ξ΅ > 0, if p ∈ [n^{o(1)}/n, n^{1/153}/n] βˆͺ [n^{2/3}/n, n^{1βˆ’Ξ΅}/n] and Ξ΄ = Ξ©(1),* there is an n^{O(log n)}-time algorithm that recovers ρ on n βˆ’ o(n) of the vertices w/prob β‰₯ 0.99.

*we allow Ξ΄ β‰₯ 1/(log n)^{o(1)}

Theorem (hypothesis testing): If p, Ξ΄ are as above, then there is a poly(n)-time algorithm distinguishing the structured distribution from the null distribution.

SLIDE 22

our approach: small subgraphs

seedless algorithms!
- hypothesis testing: correlation of subgraph counts
- recovery: match rare subgraphs

SLIDE 23

outline

- distinguishing/hypothesis testing
- recovery
- concluding


SLIDE 25

distinguishing/hypothesis testing

Given G_0, G_1 sampled equally likely from the structured or the null distribution, decide with probability 1 βˆ’ o(1) which one.

brute force: is there a ρ with β‰₯ pΞ΄Β² Β· (n choose 2) matched edges?

SLIDE 26

…counting triangles?

cnt_{K_3}(G_0, G_1) := (# K_3 in G_0) Β· (# K_3 in G_1).

SLIDE 27

…counting triangles?

cnt_{K_3}(G_0, G_1) := (# K_3 in G_0) Β· (# K_3 in G_1).

null: the triangle counts in G_0, G_1 are independent, so 𝔼[cnt_{K_3}(G_0, G_1)] β‰ˆ (pΞ΄n)^6.
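At toy scale the statistic cnt_{K_3} can be computed by brute force over vertex triples. A minimal sketch (the function names are ours):

```python
from itertools import combinations

def count_triangles(edges, n):
    # Count triangles by checking every vertex triple; fine for small n.
    eset = {frozenset(e) for e in edges}
    return sum(1 for a, b, c in combinations(range(n), 3)
               if {frozenset((a, b)), frozenset((b, c)),
                   frozenset((a, c))} <= eset)

def cnt_K3(edges0, edges1, n):
    # cnt_{K3}(G0, G1) = (# triangles in G0) * (# triangles in G1).
    return count_triangles(edges0, n) * count_triangles(edges1, n)
```

For example, K_4 contains 4 triangles and the 5-cycle contains none, so cnt_{K_3}(K_4, K_4) = 16.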

SLIDE 28

…counting triangles?

structured: the triangle counts in G_0, G_1 are correlated:

𝔼[cnt_{K_3}(G_0, G_1)] β‰ˆ (pΞ΄n)^6 + (pΞ΄Β²n)Β³

SLIDE 29

…counting triangles?

null: 𝔼[cnt_{K_3}(G_0, G_1)] β‰ˆ (pΞ΄n)^6
structured: 𝔼[cnt_{K_3}(G_0, G_1)] β‰ˆ (pΞ΄n)^6 + (pΞ΄Β²n)Β³

Variance? Optimistically, in the null case, Var[cnt_{K_3}(G_0, G_1)]^{1/2} β‰ˆ (pΞ΄n)Β³, which swamps the (pΞ΄Β²n)Β³ gap between structured and null.

SLIDE 30

β€œindependent trials”

Suppose we had N β€œindependent trials”:

cnt_N(G_0, G_1) = (1/N) Β· Ξ£_{j=1}^{N} cnt_{K_3}^{(j)}(G_0, G_1)

null: 𝔼[cnt_N(G_0, G_1)] β‰ˆ (pΞ΄n)^6
structured: 𝔼[cnt_N(G_0, G_1)] β‰ˆ (pΞ΄n)^6 + (pΞ΄Β²n)Β³

Var[cnt_N(G_0, G_1)]^{1/2} β‰ˆ (1/√N) Β· (pΞ΄n)Β³

if N > 1/Ξ΄^6, cnt_N is a good test

SLIDE 31

β€œindependent trials”: near-independent subgraphs

Suppose we had N β€œindependent” subgraphs H_1, …, H_N:

cnt_N(G_0, G_1) = (1/N) Β· Ξ£_{j=1}^{N} cnt_{H_j}(G_0, G_1)

what properties must H_1, …, H_N have to be β€œindependent”?

slide-32
SLIDE 32

𝔽 #𝐼 𝐻 = 5! ȁ𝑏𝑣𝑒 𝐼 ȁ β‹… π‘œ 5 β‹… π‘ž7

surprisingly delicate (concentration)

π‘ž = π‘œβˆ’5/7

How many labeled copies of 𝐼 in 𝐻?

𝐻(π‘œ, π‘ž)

𝐻

𝐼

β‰ˆ π‘œ5π‘ž7 = Θ(1)

SLIDE 33

surprisingly delicate (concentration)

For p = n^{βˆ’5/7}: 𝔼[#H(G)] = (5!/|aut(H)|) Β· C(n,5) Β· p^7 β‰ˆ n^5 p^7 = Θ(1)

How many labeled copies of K_4 in G? 𝔼[#K_4(G)] = (4!/|aut(K_4)|) Β· C(n,4) Β· p^6 β‰ˆ n^4 p^6 = Θ(n^{βˆ’2/7})

#H(G) does not concentrate!
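The expectation formula on this slide can be evaluated exactly for small patterns, with |aut(H)| found by brute force over vertex permutations. A sketch, assuming the pattern's vertices are labeled 0..v-1 (function names are ours):

```python
from itertools import permutations
from math import comb, factorial

def aut_count(edges, v):
    # |aut(H)| by brute force over all v! vertex permutations.
    eset = {frozenset(e) for e in edges}
    return sum(1 for phi in permutations(range(v))
               if {frozenset((phi[a], phi[b])) for a, b in edges} == eset)

def expected_copies(edges, v, n, p):
    # E[#H(G)] for G ~ G(n, p):  C(n, v) * v!/|aut(H)| * p^{e(H)},
    # i.e. (number of placements of H in K_n) * (survival probability).
    return comb(n, v) * factorial(v) / aut_count(edges, v) * p ** len(edges)
```

For the triangle, |aut(K_3)| = 3! = 6, so the formula collapses to C(n,3) Β· pΒ³, the familiar expected triangle count.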

SLIDE 34

variance of subgraph counts

Lemma: For a constant-sized subgraph H and G ∼ G(n, p),

Var[#H(G)] = Θ(1) Β· 𝔼[#H(G)]Β² / min_{J βŠ† H} 𝔼[#J(G)]

where the minimum is attained by the subgraph J of H with the fewest expected appearances.

SLIDE 35

strict balance

Lemma: For a constant-sized subgraph H, Var[#H(G)] = Θ(1) Β· 𝔼[#H(G)]Β² / min_{J βŠ† H} 𝔼[#J(G)].

H is strictly balanced if every strict subgraph J βŠ‚ H has edge density e(J)/v(J) < e(H)/v(H).

if 𝔼[#H(G)] β‰ˆ n^{v(H)} p^{e(H)} = Θ(1) and H is strictly balanced, then 𝔼[#J(G)] = Ο‰(1) for every strict J βŠ‚ H, so Var[#H(G)] = o(1) Β· 𝔼[#H(G)]Β².
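For the small test graphs in play here, strict balance can be checked by brute force. It suffices to check induced subgraphs on proper vertex subsets, since deleting edges only lowers density. A sketch (the function name is ours), assuming vertices labeled 0..v-1:

```python
from itertools import combinations

def is_strictly_balanced(edges, v):
    # H is strictly balanced if every strict subgraph has edge density
    # strictly below e(H)/v(H); we check every induced subgraph on a
    # proper vertex subset.
    density = len(edges) / v
    for k in range(1, v):
        for subset in combinations(range(v), k):
            s = set(subset)
            es = sum(1 for a, b in edges if a in s and b in s)
            if es / k >= density:
                return False
    return True
```

Connected regular graphs such as the triangle and the 5-cycle pass the check, while a triangle with a pendant edge fails: its triangle subgraph already matches the overall density.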

SLIDE 36

concentration AND independence

If H_1, …, H_N are non-isomorphic strictly balanced graphs with 𝔼[#H_j(G)] = Θ(1), then:

βˆ€j ∈ [N]: Var[#H_j(G)] = o(1) Β· 𝔼[#H_j(G)]Β² (their counts concentrate)

βˆ€j β‰  k ∈ [N]: 𝔼[#H_j(G) Β· #H_k(G)] = (1 + o(1)) Β· 𝔼[#H_j(G)] Β· 𝔼[#H_k(G)] (their counts are asymptotically independent)

SLIDE 37

distinguishing algorithm

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges each, where e is set so that n^v (pΞ΄)^e β‰ˆ 1.

(structured G(n, p) vs. null G(n, pΞ΄))

SLIDE 38

distinguishing algorithm

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

compute cnt_N(G_0, G_1) = (1/N) Β· Ξ£_{j=1}^{N} cnt_{H_j}(G_0, G_1)

structured: 𝔼[cnt_N(G_0, G_1)] = n^{2v}(Ξ΄p)^{2e} + n^v(Ξ΄Β²p)^e
null: 𝔼[cnt_N(G_0, G_1)] = n^{2v}(Ξ΄p)^{2e}

null: Var[cnt_N(G_0, G_1)] = (1/N) Β· n^v(Ξ΄p)^e < n^v(Ξ΄Β²p)^e

threshold test: output β€œstructured” if cnt_N β‰₯ Ο„, β€œnull” if cnt_N < Ο„.

TODO: variance in the structured case.

SLIDE 39

outline

- distinguishing/hypothesis testing
- recovery
- concluding

SLIDE 40

outline

- distinguishing/hypothesis testing
- test graphs
- recovery
- concluding

SLIDE 41

designing a β€œtest set”

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

(remember the average-degree ranges from the theorem: n^{o(1)} to n^{1/153}, and n^{2/3} to n^{1βˆ’Ξ΅})
slide-42
SLIDE 42

designing a β€œtest set”

For 𝑀 =

1 poly 𝛿 , design a β€œtest set”

  • f π‘ˆ = 𝑀Ω(𝑓) strictly balanced graphs w/ 𝑀 vertices & 𝑓 edges.

set π‘œπ‘€ π‘žπ›Ώ 𝑓 β‰ˆ 1

𝐼1, … , πΌπ‘ˆ

SLIDE 43

designing a β€œtest set”

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

claim: connected d-regular graphs are strictly balanced.
proof: in any strict subgraph, the average degree is < d.

SLIDE 44

designing a β€œtest set”

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

claim: connected d-regular graphs are strictly balanced.
proof: in any strict subgraph, the average degree is < d.

(figure: average-degree axis n^{1/3}, n^{1/2}, n^{3/5}, n^{2/3}, n/log n for G(n, p))

SLIDE 45

β€œtest set” for non-integer degrees

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

what if we want average degree 2e/v = Ξ± Β· (d+1) + (1 βˆ’ Ξ±) Β· d? take a d-regular random graph on v vertices + a random matching on Ξ±v of the vertices.

strict balance? expansion.

SLIDE 46

β€œtest set” for non-integer degrees

what if we want average degree 2e/v = Ξ± Β· (d+1) + (1 βˆ’ Ξ±) Β· d? take a d-regular random graph on v vertices + a random matching on Ξ±v of the vertices.

strict balance? expansion.

d < 3? 2-regular graphs don’t expand.

SLIDE 47

β€œtest set” for non-integer degrees < 3

what if we want average degree 2e/v = Ξ± Β· 3 + (1 βˆ’ Ξ±) Β· 2? take a 3-regular random graph on Ξ±v vertices and subdivide its edges into length-β„“ and length-(β„“+1) paths.

strict balance? expansion.

SLIDE 48

designing a β€œtest set”

Conjecture: our construction (d-regular + matching; subdivision) achieves all needed ratios e/v, covering average degrees from n^{1/3} up through n^{1βˆ’Ξ΅} and n/log n,
+ more conditions (for recovery).

SLIDE 49

outline

- distinguishing/hypothesis testing
- test graphs
- recovery
- concluding


SLIDE 51

distinguishing β‰  recovery

distinguishing: subgraphs on 1/poly(Ξ΄) = O(1) vertices, each appearing O(1) times. Only O(1) vertices participate in subgraphs from our test set.

distinguishing counts subgraphs, so there is ambiguity in the matching; how do we conclude ρ(v) = w?

SLIDE 52

the β€œblack swan” approach

choose the test set H_1, …, H_N so that (Ξ΄Β²p)^e n^v ≫ (Ξ΄p)^{2e} n^{2v}: the expected number of copies that survive subsampling in both graphs far exceeds the expected number of unrelated pairs of copies. then if we see H_j in both graphs, it is most likely because of correlation.

choose a large test set H_1, …, H_N with v = O(log n) vertices, so that Ξ©(n) vertices participate in subgraphs from the test set.

identify rare subgraphs appearing in both graphs, and match their vertices.

SLIDE 53

the β€œblack swan” approach

identify rare subgraphs appearing in both graphs, and match their vertices.

Claim: with high probability, there is at most one copy of each H_j in G.

Claim: with high probability, Ξ©(n) vertices of G_0 ∩ G_1 appear in a subsampled copy that survives in both graphs.

proofs: second moment method

SLIDE 54

outline

- distinguishing/hypothesis testing
- test graphs
- recovery
- concluding


SLIDE 56

why subgraph counts/statistics?

emerging intuition/conjecture: β€œSoS ≑ low-degree polynomials”, i.e. the sum-of-squares (SoS) semidefinite program is at most as powerful as β€œlow-degree” statistics for average-case problems.

known to hold for:
- planted clique [Barak-Hopkins-Kelner-Kothari-Moitra-Potechin’16]
- CSP refutation [Grigoriev’01, Schoenebeck’08, Kothari-Mori-O’Donnell-Witmer’17]
- tensor PCA [Hopkins-Kothari-Potechin-Raghavendra-S-Steurer’17]

also known: SoS is at most as powerful as β€œlow-degree” spectral algorithms for average-case problems [Hopkins-Kothari-Potechin-Raghavendra-S-Steurer’17]

SLIDE 57

does SoS know about the black swans?

does the natural SoS relaxation of max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩ recover ρ? does the relaxation β€œcare about” the rare subgraphs?

one can ask similar questions about other low-degree functions, e.g. the non-backtracking random walk matrix.

SLIDE 58

more questions

- recovery in polynomial time? SoS? or, many variations on our theme are possible.
- all information-theoretically possible p ∈ [log n / n, O(1)]?
- practical heuristics?

SLIDE 59

Thank you!