(Nearly) Efficient Algorithms for the Graph Matching Problem - PowerPoint PPT Presentation



SLIDE 1

(Nearly) Efficient Algorithms for the Graph Matching Problem

Tselil Schramm (Harvard/MIT)

with Boaz Barak, Chi-Ning Chou, Zhixian Lei & Yueqi Sheng (Harvard)

SLIDE 2

graph matching problem (approximate graph isomorphism)

input: two graphs G_0, G_1 on n vertices

goal: find the permutation of vertices that maximizes the # of shared edges:

max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩

(ρ permutes the rows and columns of the adjacency matrix A_{G_1})

SLIDE 3

graph matching problem (approximate graph isomorphism)

input: two graphs G_0, G_1 on n vertices

goal: find the permutation of vertices that maximizes the # of shared edges: max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩

# matched = 4

SLIDE 4

graph matching problem (approximate graph isomorphism)

input: two graphs G_0, G_1 on n vertices

goal: find the permutation of vertices that maximizes the # of shared edges: max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩

SLIDE 5

graph matching problem (approximate graph isomorphism)

input: two graphs G_0, G_1 on n vertices

goal: find the permutation of vertices that maximizes the # of shared edges: max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩

# matched = 5

SLIDE 6

computationally hard (of course)

NP-hard: reduction from the quadratic assignment problem (non-simple graphs). [Lawler’63]

also: reduction from sparse random 3-SAT to the approximate version. [O’Donnell-Wright-Wu-Zhou’14]

SLIDE 7

practitioners: undeterred

- computational biology [e.g. Singh-Xu-Berger’08]
- de-anonymization [e.g. Narayanan-Shmatikov’09]
- social networks [e.g. Korula-Lattanzi’14]
- image alignment [e.g. Cho-Lee’12]
- machine learning [e.g. Cour-Srinivasan-Shi’07]
- pattern recognition, e.g. β€œthirty years of graph matching in pattern recognition” [Conte-Foggia-Sansone-Vento’04]

SLIDE 8

average case: correlated random graphs

structured model: sample G ∼ G(n, p); keep each edge of G independently w/prob Ξ΄, twice, to get G_0 and G_1; apply a random permutation ρ to G_1.

- avg. degree pΞ΄ Β· n
- max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩ β‰ˆ pΞ΄Β² Β· (n choose 2)

β€œrobust average-case graph isomorphism”

[e.g. Pedarsani-Grossglauser’11, Lyzinski-Fishkind-Priebe’14, Korula-Lattanzi’14]

SLIDE 9

average case: correlated random graphs

structured model: sample G ∼ G(n, p); subsample edges w/prob Ξ΄ twice to get G_0, G_1; apply a random permutation ρ to G_1.

- avg. degree pΞ΄ Β· n

vs. a β€œnull” model with β‰ˆ (pΞ΄)Β² Β· (n choose 2) shared edges

β€œrobust average-case graph isomorphism”

SLIDE 10

average case: correlated random graphs

structured model: sample G ∼ G(n, p); subsample edges w/prob Ξ΄ twice to get G_0, G_1; apply a random permutation ρ to G_1.
- avg. degree pΞ΄ Β· n
- max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩ β‰ˆ pΞ΄Β² Β· (n choose 2)

β€œnull” model: sample G_0, G_1 ∼ G(n, pΞ΄) independently.
- avg. degree pΞ΄ Β· n
- max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩ β‰ˆ (pΞ΄)Β² Β· (n choose 2)

β€œrobust average-case graph isomorphism”
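Both sampling models are easy to simulate at toy scale. A minimal Python sketch of the two distributions, with hypothetical helper names (`sample_structured`, `sample_null`) that are ours, not the talk's:

```python
import random

def sample_structured(n, p, delta, rng):
    # Structured model: G ~ G(n, p); keep each edge of G independently
    # with probability delta, twice, to get G0 and G1; then relabel G1
    # by a uniformly random permutation rho.
    base = {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p}
    g0 = {e for e in base if rng.random() < delta}
    g1 = {e for e in base if rng.random() < delta}
    rho = list(range(n))
    rng.shuffle(rho)
    g1 = {tuple(sorted((rho[u], rho[v]))) for (u, v) in g1}
    return g0, g1, rho

def sample_null(n, p, delta, rng):
    # Null model: G0, G1 ~ G(n, p*delta), independent.
    q = p * delta
    def draw():
        return {(i, j) for i in range(n) for j in range(i + 1, n)
                if rng.random() < q}
    return draw(), draw()
```

Edges are stored as sorted pairs so that the two models produce directly comparable edge sets; both give average degree β‰ˆ pΞ΄ Β· n, matching the slide.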

SLIDE 11

information theoretic limit

Theorem [Cullina-Kiyavash’16&’17]: Iff pΞ΄Β² > log n / n, then with high probability ρ is the unique maximizing permutation.

for which p, Ξ΄ can we recover ρ?

SLIDE 12

algorithms for robust average case?

average-case graph isomorphism algorithms fail. e.g. matching local neighborhoods: match radius-r neighborhoods?

SLIDE 13

algorithms for robust average case?

average-case graph isomorphism algorithms fail. e.g. matching local neighborhoods: match radius-r neighborhoods? (the Ξ΄-subsampling perturbs the neighborhoods)

SLIDE 14

algorithms for robust average case?

average-case graph isomorphism algorithms fail. e.g. spectral algorithm: do unique entries in the top eigenvector v_max give the isomorphism?

SLIDE 15

algorithms for robust average case?

average-case graph isomorphism algorithms fail. e.g. spectral algorithm: do unique entries in the top eigenvector v_max give the isomorphism? the subsampling perturbs the eigenvectors by β‰ˆ Ξ΄.

SLIDE 16

actual algorithms for robust average case?

SLIDE 17

starting from a seed

ρ|_T known: the restriction of ρ to a seed set T is given, i.e. we know ρ(T).

SLIDE 18

starting from a seed

given the seed set T, match vertices v, w with similar adjacency into T.

SLIDE 19

starting from a seed

match vertices with similar adjacency into T: conclude ρ(v) = w.

if the seed has size β‰₯ Ξ©(n^Ξ΅), the seeded algorithm approximately recovers ρ. [Yartseva-Grossglauser’13]

but we need 2^{Γ•(n^Ξ΅)} time to guess a seed.
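A toy version of the seeded strategy can be sketched as a greedy loop: repeatedly match the pair of unmatched vertices that agrees with the current partial matching on the most neighbors. This is an illustrative sketch in the spirit of seeded percolation matching, with helper names of our choosing; it is not the algorithm analyzed on the slide:

```python
from itertools import product

def adj_from_edges(edges, n):
    # Build an adjacency dict {vertex: set of neighbors}.
    a = {v: set() for v in range(n)}
    for u, v in edges:
        a[u].add(v)
        a[v].add(u)
    return a

def seeded_match(adj0, adj1, seed):
    # Greedily extend the partial matching: at each step, match the
    # pair (v, w) of unmatched vertices supported by the most already-
    # matched neighbor pairs; stop when no pair has positive support.
    match = dict(seed)
    while True:
        best, best_score = None, 0
        used = set(match.values())
        for v, w in product(adj0, adj1):
            if v in match or w in used:
                continue
            score = sum(1 for u in adj0[v]
                        if u in match and match[u] in adj1[w])
            if score > best_score:
                best, best_score = (v, w), score
        if best is None:
            return match
        match[best[0]] = best[1]
```

On a path graph with a two-vertex seed, the greedy loop peels off the remaining vertices one by one; on correlated random graphs one would additionally need tie-breaking and error tolerance.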

SLIDE 20

our results

Theorem: For any Ξ΅ > 0, if p ∈ [n^{o(1)}/n, n^{1/153}/n] βˆͺ [n^{2/3}/n, n^{1βˆ’Ξ΅}/n] and Ξ΄ = Ξ©(1),* there is an n^{O(log n)}-time algorithm that recovers ρ on n βˆ’ o(n) of the vertices w/prob β‰₯ 0.99.

*we allow Ξ΄ = Ξ©(1/loglog n)

(figure: average-degree axis n^{o(1)}, n^{1/153}, n^{1/3}, n^{1/2}, n^{3/5}, n^{2/3}, n^{1βˆ’Ξ΅}, n/log n, with the covered ranges of the structured model G(n, p) highlighted)

SLIDE 21

our results

Theorem: For any Ξ΅ > 0, if p ∈ [n^{o(1)}/n, n^{1/153}/n] βˆͺ [n^{2/3}/n, n^{1βˆ’Ξ΅}/n] and Ξ΄ = Ξ©(1),* there is an n^{O(log n)}-time algorithm that recovers ρ on n βˆ’ o(n) of the vertices w/prob β‰₯ 0.99.

*we allow Ξ΄ β‰₯ 1/(log n)^{o(1)}

Theorem (hypothesis testing): If p, Ξ΄ are as above, then there is a poly(n)-time algorithm distinguishing the structured distribution from the null distribution.

SLIDE 22

our approach: small subgraphs

seedless algorithms!
- hypothesis testing: correlation of subgraph counts
- recovery: match rare subgraphs

SLIDE 23

outline

- distinguishing/hypothesis testing
- recovery
- concluding


SLIDE 25

distinguishing/hypothesis testing

Given G_0, G_1 sampled equally likely from the structured or the null distribution, decide with probability 1 βˆ’ o(1) which one.

brute force: is there a ρ with β‰₯ pΞ΄Β² Β· (n choose 2) matched edges?

SLIDE 26

…counting triangles?

cnt_{K_3}(G_0, G_1) := (# K_3 in G_0) Β· (# K_3 in G_1).

SLIDE 27

…counting triangles?

cnt_{K_3}(G_0, G_1) := (# K_3 in G_0) Β· (# K_3 in G_1).

null: the triangle counts in G_0, G_1 are independent, so 𝔼[cnt_{K_3}(G_0, G_1)] β‰ˆ (pΞ΄n)^6.
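At toy scale the statistic cnt_{K_3} can be computed by brute force over vertex triples. A minimal sketch (the function names are ours):

```python
from itertools import combinations

def count_triangles(edges, n):
    # Count triangles by checking every vertex triple; fine for small n.
    eset = {frozenset(e) for e in edges}
    return sum(1 for a, b, c in combinations(range(n), 3)
               if {frozenset((a, b)), frozenset((b, c)),
                   frozenset((a, c))} <= eset)

def cnt_K3(edges0, edges1, n):
    # cnt_{K3}(G0, G1) = (# triangles in G0) * (# triangles in G1).
    return count_triangles(edges0, n) * count_triangles(edges1, n)
```

For example, K_4 contains 4 triangles and the 5-cycle contains none, so cnt_{K_3}(K_4, K_4) = 16.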

SLIDE 28

…counting triangles?

structured: the triangle counts in G_0, G_1 are correlated:

𝔼[cnt_{K_3}(G_0, G_1)] β‰ˆ (pΞ΄n)^6 + (pΞ΄Β²n)Β³

SLIDE 29

…counting triangles?

null: 𝔼[cnt_{K_3}(G_0, G_1)] β‰ˆ (pΞ΄n)^6
structured: 𝔼[cnt_{K_3}(G_0, G_1)] β‰ˆ (pΞ΄n)^6 + (pΞ΄Β²n)Β³

Variance? Optimistically, in the null case, Var[cnt_{K_3}(G_0, G_1)]^{1/2} β‰ˆ (pΞ΄n)Β³, which swamps the (pΞ΄Β²n)Β³ gap between structured and null.

SLIDE 30

β€œindependent trials”

Suppose we had N β€œindependent trials”:

cnt_N(G_0, G_1) = (1/N) Β· Ξ£_{j=1}^{N} cnt_{K_3}^{(j)}(G_0, G_1)

null: 𝔼[cnt_N(G_0, G_1)] β‰ˆ (pΞ΄n)^6
structured: 𝔼[cnt_N(G_0, G_1)] β‰ˆ (pΞ΄n)^6 + (pΞ΄Β²n)Β³

Var[cnt_N(G_0, G_1)]^{1/2} β‰ˆ (1/√N) Β· (pΞ΄n)Β³

if N > 1/Ξ΄^6, cnt_N is a good test

SLIDE 31

β€œindependent trials”: near-independent subgraphs

Suppose we had N β€œindependent” subgraphs H_1, …, H_N:

cnt_N(G_0, G_1) = (1/N) Β· Ξ£_{j=1}^{N} cnt_{H_j}(G_0, G_1)

what properties must H_1, …, H_N have to be β€œindependent”?

slide-32
SLIDE 32

𝔽 #𝐼 𝐻 = 5! ȁ𝑏𝑣𝑒 𝐼 ȁ β‹… π‘œ 5 β‹… π‘ž7

surprisingly delicate (concentration)

π‘ž = π‘œβˆ’5/7

How many labeled copies of 𝐼 in 𝐻?

𝐻(π‘œ, π‘ž)

𝐻

𝐼

β‰ˆ π‘œ5π‘ž7 = Θ(1)

SLIDE 33

surprisingly delicate (concentration)

For p = n^{βˆ’5/7}: 𝔼[#H(G)] = (5!/|aut(H)|) Β· C(n,5) Β· p^7 β‰ˆ n^5 p^7 = Θ(1)

How many labeled copies of K_4 in G? 𝔼[#K_4(G)] = (4!/|aut(K_4)|) Β· C(n,4) Β· p^6 β‰ˆ n^4 p^6 = Θ(n^{βˆ’2/7})

#H(G) does not concentrate!
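The expectation formula on this slide can be evaluated exactly for small patterns, with |aut(H)| found by brute force over vertex permutations. A sketch, assuming the pattern's vertices are labeled 0..v-1 (function names are ours):

```python
from itertools import permutations
from math import comb, factorial

def aut_count(edges, v):
    # |aut(H)| by brute force over all v! vertex permutations.
    eset = {frozenset(e) for e in edges}
    return sum(1 for phi in permutations(range(v))
               if {frozenset((phi[a], phi[b])) for a, b in edges} == eset)

def expected_copies(edges, v, n, p):
    # E[#H(G)] for G ~ G(n, p):  C(n, v) * v!/|aut(H)| * p^{e(H)},
    # i.e. (number of placements of H in K_n) * (survival probability).
    return comb(n, v) * factorial(v) / aut_count(edges, v) * p ** len(edges)
```

For the triangle, |aut(K_3)| = 3! = 6, so the formula collapses to C(n,3) Β· pΒ³, the familiar expected triangle count.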

SLIDE 34

variance of subgraph counts

Lemma: For a constant-sized subgraph H and G ∼ G(n, p),

Var[#H(G)] = Θ(1) Β· 𝔼[#H(G)]Β² / min_{J βŠ† H} 𝔼[#J(G)]

where the minimum is attained by the subgraph J of H with the fewest expected appearances.

SLIDE 35

strict balance

Lemma: For a constant-sized subgraph H, Var[#H(G)] = Θ(1) Β· 𝔼[#H(G)]Β² / min_{J βŠ† H} 𝔼[#J(G)].

H is strictly balanced if every strict subgraph J βŠ‚ H has edge density e(J)/v(J) < e(H)/v(H).

if 𝔼[#H(G)] β‰ˆ n^{v(H)} p^{e(H)} = Θ(1) and H is strictly balanced, then 𝔼[#J(G)] = Ο‰(1) for every strict J βŠ‚ H, so Var[#H(G)] = o(1) Β· 𝔼[#H(G)]Β².
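For the small test graphs in play here, strict balance can be checked by brute force. It suffices to check induced subgraphs on proper vertex subsets, since deleting edges only lowers density. A sketch (the function name is ours), assuming vertices labeled 0..v-1:

```python
from itertools import combinations

def is_strictly_balanced(edges, v):
    # H is strictly balanced if every strict subgraph has edge density
    # strictly below e(H)/v(H); we check every induced subgraph on a
    # proper vertex subset.
    density = len(edges) / v
    for k in range(1, v):
        for subset in combinations(range(v), k):
            s = set(subset)
            es = sum(1 for a, b in edges if a in s and b in s)
            if es / k >= density:
                return False
    return True
```

Connected regular graphs such as the triangle and the 5-cycle pass the check, while a triangle with a pendant edge fails: its triangle subgraph already matches the overall density.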

SLIDE 36

concentration AND independence

If H_1, …, H_N are non-isomorphic strictly balanced graphs with 𝔼[#H_j(G)] = Θ(1), then:

βˆ€j ∈ [N]: Var[#H_j(G)] = o(1) Β· 𝔼[#H_j(G)]Β² (their counts concentrate)

βˆ€j β‰  k ∈ [N]: 𝔼[#H_j(G) Β· #H_k(G)] = (1 + o(1)) Β· 𝔼[#H_j(G)] Β· 𝔼[#H_k(G)] (their counts are asymptotically independent)

SLIDE 37

distinguishing algorithm

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges each, where e is set so that n^v (pΞ΄)^e β‰ˆ 1.

(structured G(n, p) vs. null G(n, pΞ΄))

SLIDE 38

distinguishing algorithm

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

compute cnt_N(G_0, G_1) = (1/N) Β· Ξ£_{j=1}^{N} cnt_{H_j}(G_0, G_1)

structured: 𝔼[cnt_N(G_0, G_1)] = n^{2v}(Ξ΄p)^{2e} + n^v(Ξ΄Β²p)^e
null: 𝔼[cnt_N(G_0, G_1)] = n^{2v}(Ξ΄p)^{2e}

null: Var[cnt_N(G_0, G_1)] = (1/N) Β· n^v(Ξ΄p)^e < n^v(Ξ΄Β²p)^e

threshold test: output β€œstructured” if cnt_N β‰₯ Ο„, β€œnull” if cnt_N < Ο„.

TODO: variance in the structured case.

SLIDE 39

outline

- distinguishing/hypothesis testing
- recovery
- concluding

SLIDE 40

outline

- distinguishing/hypothesis testing
- test graphs
- recovery
- concluding

SLIDE 41

designing a β€œtest set”

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

(remember the average-degree ranges from the theorem: n^{o(1)} to n^{1/153}, and n^{2/3} to n^{1βˆ’Ξ΅})
slide-42
SLIDE 42

designing a β€œtest set”

For 𝑀 =

1 poly 𝛿 , design a β€œtest set”

  • f π‘ˆ = 𝑀Ω(𝑓) strictly balanced graphs w/ 𝑀 vertices & 𝑓 edges.

set π‘œπ‘€ π‘žπ›Ώ 𝑓 β‰ˆ 1

𝐼1, … , πΌπ‘ˆ

SLIDE 43

designing a β€œtest set”

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

claim: connected d-regular graphs are strictly balanced.
proof: in any strict subgraph, the average degree is < d.

SLIDE 44

designing a β€œtest set”

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

claim: connected d-regular graphs are strictly balanced.
proof: in any strict subgraph, the average degree is < d.

(figure: average-degree axis n^{1/3}, n^{1/2}, n^{3/5}, n^{2/3}, n/log n for G(n, p))

SLIDE 45

β€œtest set” for non-integer degrees

For v = 1/poly(Ξ΄), design a β€œtest set” H_1, …, H_N of N = v^{Ξ©(e)} strictly balanced graphs with v vertices & e edges, setting n^v (pΞ΄)^e β‰ˆ 1.

what if we want average degree 2e/v = Ξ± Β· (d+1) + (1 βˆ’ Ξ±) Β· d? take a d-regular random graph on v vertices + a random matching on Ξ±v of the vertices.

strict balance? expansion.

SLIDE 46

β€œtest set” for non-integer degrees

what if we want average degree 2e/v = Ξ± Β· (d+1) + (1 βˆ’ Ξ±) Β· d? take a d-regular random graph on v vertices + a random matching on Ξ±v of the vertices.

strict balance? expansion.

d < 3? 2-regular graphs don’t expand.

SLIDE 47

β€œtest set” for non-integer degrees < 3

what if we want average degree 2e/v = Ξ± Β· 3 + (1 βˆ’ Ξ±) Β· 2? take a 3-regular random graph on Ξ±v vertices and subdivide its edges into length-β„“ and length-(β„“+1) paths.

strict balance? expansion.

SLIDE 48

designing a β€œtest set”

Conjecture: our construction (d-regular + matching; subdivision) achieves all needed ratios e/v, covering average degrees from n^{1/3} up through n^{1βˆ’Ξ΅} and n/log n,
+ more conditions (for recovery).

SLIDE 49

outline

- distinguishing/hypothesis testing
- test graphs
- recovery
- concluding


SLIDE 51

distinguishing β‰  recovery

distinguishing: subgraphs on 1/poly(Ξ΄) = O(1) vertices, each appearing O(1) times. Only O(1) vertices participate in subgraphs from our test set.

distinguishing counts subgraphs, so there is ambiguity in the matching; how do we conclude ρ(v) = w?

SLIDE 52

the β€œblack swan” approach

choose the test set H_1, …, H_N so that (Ξ΄Β²p)^e n^v ≫ (Ξ΄p)^{2e} n^{2v}: the expected number of copies that survive subsampling in both graphs far exceeds the expected number of unrelated pairs of copies. then if we see H_j in both graphs, it is most likely because of correlation.

choose a large test set H_1, …, H_N with v = O(log n) vertices, so that Ξ©(n) vertices participate in subgraphs from the test set.

identify rare subgraphs appearing in both graphs, and match their vertices.

SLIDE 53

the β€œblack swan” approach

identify rare subgraphs appearing in both graphs, and match their vertices.

Claim: with high probability, there is at most one copy of each H_j in G.

Claim: with high probability, Ξ©(n) vertices of G_0 ∩ G_1 appear in a subsampled copy that survives in both graphs.

proofs: second moment method

SLIDE 54

outline

- distinguishing/hypothesis testing
- test graphs
- recovery
- concluding


SLIDE 56

why subgraph counts/statistics?

emerging intuition/conjecture: β€œSoS ≑ low-degree polynomials”, i.e. the sum-of-squares (SoS) semidefinite program is at most as powerful as β€œlow-degree” statistics for average-case problems.

known to hold for:
- planted clique [Barak-Hopkins-Kelner-Kothari-Moitra-Potechin’16]
- CSP refutation [Grigoriev’01, Schoenebeck’08, Kothari-Mori-O’Donnell-Witmer’17]
- tensor PCA [Hopkins-Kothari-Potechin-Raghavendra-S-Steurer’17]

also known: SoS is at most as powerful as β€œlow-degree” spectral algorithms for average-case problems [Hopkins-Kothari-Potechin-Raghavendra-S-Steurer’17]

SLIDE 57

does SoS know about the black swans?

does the natural SoS relaxation of max_ρ ⟨A_{G_0}, ρ(A_{G_1})⟩ recover ρ? does the relaxation β€œcare about” the rare subgraphs?

one can ask similar questions about other low-degree functions, e.g. the non-backtracking random walk matrix.

SLIDE 58

more questions

- recovery in polynomial time? SoS? or, many variations on our theme are possible.
- all information-theoretically possible p ∈ [log n / n, O(1)]?
- practical heuristics?

SLIDE 59

Thank you!