SLIDE 1

Lecture 6: Linear Programming for Sparsest Cut

SLIDE 2

Sparsest Cut and SOS

  • The SOS hierarchy captures the algorithms for sparsest cut, but they were discovered directly without thinking about SOS (and this is how we'll present them)
  • Why we are covering sparsest cut in detail:
    1. Quite interesting in its own right
    2. Illustrates the kinds of things SOS can capture
    3. Determining if SOS can do better is a major open problem on SOS.

SLIDE 3

Lecture Outline

  • Part I: Sparsest cut
  • Part II: Linear programming relaxation and analysis via metric embeddings
  • Part III: Bourgain's Theorem
  • Part IV: Tight example: expanders
SLIDE 4

Part I: Sparsest Cut

SLIDE 5

Flaw of Minimum Cut

  • We've seen that MIN-CUT can be solved efficiently
  • However, MIN-CUT may not be the best way to decompose a graph
  • Example:
SLIDE 6

Flaw of Minimum Cut

  • MIN-CUT:
  • Desired Cut:
SLIDE 7

Sparsest Cut Problem

  • Idea: Divide # of cut edges by # of pairs which could have been cut
  • Definition: Given a cut C = (S, S̄), define ρ(C) = (# of edges cut) / (|S| ⋅ |S̄|)
  • Sparsest cut problem: Minimize ρ(C)
  • Can also have a weighted version:
    ρ(C) = (Σ_{i,j: i∈S, j∈S̄, (i,j)∈E(G)} w(i, j)) / (Σ_{i,j: i∈S, j∈S̄} w(i, j))
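The definition ρ(C) = (# of edges cut) / (|S| ⋅ |S̄|) can be checked by exhaustive search on a tiny graph. A minimal sketch (the helper names `sparsity` and `sparsest_cut` are ours, not from the lecture; the search is exponential in n):

```python
from itertools import combinations

def sparsity(n, edges, S):
    """rho(C) = (# of edges cut) / (|S| * |S-bar|) for the cut C = (S, S-bar)."""
    S = set(S)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / (len(S) * (n - len(S)))

def sparsest_cut(n, edges):
    """Exhaustively minimize rho(C) over all nontrivial cuts."""
    best = None
    for k in range(1, n // 2 + 1):          # |S| <= n/2 by symmetry
        for S in combinations(range(n), k):
            r = sparsity(n, edges, S)
            if best is None or r < best[0]:
                best = (r, set(S))
    return best

# Two triangles joined by a single bridge edge: cutting the bridge alone
# gives rho = 1 / (3 * 3), beating every unbalanced alternative.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
rho, S = sparsest_cut(6, edges)
print(rho, sorted(S))  # 0.111... for S = {0, 1, 2}
```

This is exactly the "desired cut" picture from Slide 6: the balanced denominator |S| ⋅ |S̄| rewards splitting the two triangles apart rather than shaving off a single vertex.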

SLIDE 8

Linear Programming for Sparsest Cut

  • Theorem [LR99]: There is a linear programming relaxation for sparsest cut which gives an O(log n) approximation.

SLIDE 9

Part II: Linear Programming Relaxation and Analysis via Metric Embeddings

SLIDE 10

Metric and Pseudo-metric Spaces

  • Definition: A metric space (X, d) is a set of points X and a distance function d: X × X → ℝ≥0 where
    1. ∀x1, x2 ∈ X, d(x1, x2) = d(x2, x1)
    2. ∀x1, x2 ∈ X, d(x1, x2) = 0 ⬄ x1 = x2
    3. ∀x1, x2, x3 ∈ X, d(x1, x3) ≤ d(x1, x2) + d(x2, x3)
  • Example 1: Euclidean space: d(x, y) = ‖y − x‖
  • Example 2: ℓ1 distance: d(x, y) = Σ_i |y_i − x_i|
  • Without the second condition, this is called a pseudo-metric space

SLIDE 11

Cut Spaces

  • A cut C = (S, S̄) in a graph G induces a pseudo-metric space: Take d(u, v) = 0 if u, v ∈ S or u, v ∈ S̄, and otherwise take d(u, v) = c for some c > 0.
  • We call this a cut space.
SLIDE 12

Problem Reformulation

  • Reformulation: Minimize (Σ_{i,j: i<j, (i,j)∈E(G)} d(i, j)) / (Σ_{i,j: i<j} d(i, j)) over all cut spaces
  • First issue: Objective function is nonlinear
  • Fix: Set the denominator equal to 1.
  • Modified Reformulation: Minimize Σ_{i,j: i<j, (i,j)∈E(G)} d(i, j) over all cut spaces normalized so that Σ_{i,j: i<j} d(i, j) = 1
SLIDE 13

Problem Relaxation

  • Want to minimize Σ_{i,j: i<j, (i,j)∈E(G)} d(i, j) over all cut spaces normalized so that Σ_{i,j: i<j} d(i, j) = 1
  • Relaxation: Minimize Σ_{i,j: i<j, (i,j)∈E(G)} d(i, j) over all pseudo-metrics normalized so that Σ_{i,j: i<j} d(i, j) = 1. Linear program constraints:
    1. ∀i, j: d(i, j) = d(j, i) ≥ 0
    2. ∀i, j, k: d(i, k) ≤ d(i, j) + d(j, k)
    3. Σ_{i,j: i<j} d(i, j) = 1
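To see what a feasible point of this LP looks like, here is a sketch (helper names `shortest_paths` and `is_lp_feasible` are ours): the shortest-path metric of any connected graph, rescaled so the pairwise distances sum to 1, satisfies all three constraint families.

```python
from itertools import combinations, permutations

def shortest_paths(n, edges):
    """Floyd-Warshall with unit edge lengths."""
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v in edges:
        d[u][v] = d[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

def is_lp_feasible(n, d, tol=1e-9):
    """Check the three constraint families of the LP above."""
    sym = all(d[i][j] == d[j][i] >= 0 for i in range(n) for j in range(n))
    tri = all(d[i][k] <= d[i][j] + d[j][k] + tol
              for i, j, k in permutations(range(n), 3))
    norm = abs(sum(d[i][j] for i, j in combinations(range(n), 2)) - 1) < tol
    return sym and tri and norm

n, edges = 5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # the 5-cycle
d = shortest_paths(n, edges)
total = sum(d[i][j] for i, j in combinations(range(n), 2))
d = [[x / total for x in row] for row in d]              # normalize to sum 1
print(is_lp_feasible(n, d))  # True
```

In particular every normalized cut pseudo-metric is feasible too, which is why the LP optimum is a lower bound on the sparsest cut value.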

SLIDE 14

ℓ1 Spaces

  • Definition: We say that a pseudo-metric (X, d) is an ℓ1 space if there is a mapping f: X → ℝ^m such that ∀x, y ∈ X, d(x, y) = Σ_i |f(y)_i − f(x)_i|
  • In this case, we may as well pretend we are already in ℝ^m with the ℓ1 distance function
  • Lemma: For the sparsest cut relaxation, there is no gap between ℓ1 spaces and cut spaces!

SLIDE 15

ℓ1 Space Example

  • If x1 = (1, 2), x2 = (0, 3), and x3 = (4, 4), then in the ℓ1 metric, d(x1, x2) = 2, d(x1, x3) = 5, and d(x2, x3) = 5
  [figure: the three points plotted in the plane]

SLIDE 16

Decomposing ℓ1 Pseudo-metrics

  • Lemma: Any finite ℓ1 space can be decomposed as a linear combination of cut spaces.
  • Proof sketch: We can work coordinate by coordinate. For a single coordinate, here is the picture:
  [figure: points on the number line; the line metric on one coordinate equals a weighted sum of threshold cut metrics, one cut between each pair of consecutive points, weighted by the length of the gap]
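The single-coordinate step of the proof sketch can be written out directly. A sketch with our own helper name `cut_decomposition`: thresholds between consecutive sorted values give the cuts, and the gap lengths give the weights.

```python
def cut_decomposition(values):
    """Weighted cuts whose sum reproduces |v_i - v_j| on one coordinate."""
    order = sorted(set(values))
    cuts = []
    for a, b in zip(order, order[1:]):
        S = frozenset(i for i, v in enumerate(values) if v <= a)
        cuts.append((b - a, S))          # cut {i: v_i <= a} weighted by the gap b - a
    return cuts

values = [-1, 1, 2, 2]
cuts = cut_decomposition(values)
# verify: the weighted cut distances sum back to the line distances
for i in range(len(values)):
    for j in range(len(values)):
        recovered = sum(w for w, S in cuts if (i in S) != (j in S))
        assert recovered == abs(values[i] - values[j])
print([(w, sorted(S)) for w, S in cuts])  # [(2, [0]), (1, [0, 1])]
```

Applying this to every coordinate of an ℓ1 embedding and pooling the cuts gives the lemma.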

SLIDE 17

Useful Lemma

  • Lemma: If a, b ≥ 0 and c, d > 0 then min(a/c, b/d) ≤ (a + b)/(c + d) ≤ max(a/c, b/d)
  • Proof: Without loss of generality, assume a/c ≤ b/d. Take a′ = bc/d ≥ a and take b′ = ad/c ≤ b. Now a/c = (a + b′)/(c + d) ≤ (a + b)/(c + d) ≤ (a′ + b)/(c + d) = b/d
  • Together with the previous decomposition, this shows that for any ℓ1 space, there is always a cut space which is as good or better.
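A quick numeric sanity check of the lemma (the values are arbitrary random test data, not from the lecture):

```python
import random

random.seed(0)
for _ in range(1000):
    a, b = random.uniform(0, 10), random.uniform(0, 10)   # numerators >= 0
    c, d = random.uniform(0.1, 10), random.uniform(0.1, 10)  # denominators > 0
    mediant = (a + b) / (c + d)
    assert min(a / c, b / d) - 1e-12 <= mediant <= max(a / c, b / d) + 1e-12
print("lemma holds on 1000 random instances")
```

Applied here: the LP objective of an ℓ1 metric is a weighted mediant of the cut objectives, so some cut in its decomposition achieves a ratio at least as good.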

SLIDE 18

Metric Embeddings and Distortion

  • Often want to embed a more complicated metric space into a simpler one. This embedding won't be perfect, but may still be useful
  • Given metric spaces (X, d), (Y, d′) and a map f: X → Y:
    1. Define the expansion of f to be max_{u,v∈X} d′(f(u), f(v)) / d(u, v)
    2. Define the contraction of f to be max_{u,v∈X} d(u, v) / d′(f(u), f(v))
    3. Define the distortion of f to be the product of the expansion and the contraction of f
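The three definitions are easy to compute for finite spaces. A sketch (the helper name `distortion` and the example spaces are ours):

```python
from itertools import combinations

def distortion(points, d, d2, f):
    """Expansion, contraction, and their product for a map f: (points, d) -> (., d2)."""
    exp = max(d2(f(u), f(v)) / d(u, v) for u, v in combinations(points, 2))
    con = max(d(u, v) / d2(f(u), f(v)) for u, v in combinations(points, 2))
    return exp, con, exp * con

# K3 (all pairwise distances 1) mapped onto the line at positions 0, 1, 2:
# the pair (0, 2) is stretched by a factor of 2, nothing is contracted,
# so the distortion is 2.
pts = [0, 1, 2]
d = lambda u, v: 1.0
f = lambda u: float(u)
d2 = lambda x, y: abs(x - y)
exp, con, dist = distortion(pts, d, d2, f)
print(exp, con, dist)  # 2.0 1.0 2.0
```

Note that distortion is scale-invariant: composing f with any rescaling multiplies the expansion and divides the contraction by the same factor.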

SLIDE 19

Metric Embeddings into ℓ1

  • If the pseudo-metric given by our linear program can be embedded into ℓ1 with distortion β, this gives a β-approximation for the value of the sparsest cut.
  • Question: How well can general finite pseudo-metric spaces be embedded into ℓ1?

SLIDE 20

Part III: Bourgain’s Theorem

SLIDE 21

Bourgain's Theorem

  • Theorem [Bou85]: Every metric on n points can be embedded into an ℓ1 metric with distortion O(log n). Moreover, O((log n)²) coordinates are sufficient
  • Note: the bound on the number of coordinates is due to Linial, London, and Rabinovich [LLR95]

SLIDE 22

Fréchet Embeddings

  • Def: Given a set of points S, define d(x, S) = min_{s∈S} d(x, s)
  • Fréchet embedding: Gives a value to each point based on its distance from some subset S of points and takes the distance between these values. In other words, d_S(x, y) = |d(y, S) − d(x, S)|
  • Proposition: For any S, d_S(x, y) ≤ d(x, y)
SLIDE 23

Fréchet Embedding Example

  • Start with the distance metric d(u, v) = length of the shortest path from u to v on the graph shown. If we take S to be the set of red vertices, we get the values shown for d(v, S).
  [figure: a graph whose vertices are labeled with their values of d(v, S): 1 1 1 1 2 3 2]

SLIDE 24

Fréchet Embeddings Bound

  • d(x, S) = min_{s∈S} d(x, s)
  • d_S(x, y) = |d(y, S) − d(x, S)|
  • Proposition: For any S, d_S(x, y) ≤ d(x, y)
  • Proof: Let s be the point in S of minimal distance from x. Then d(y, S) ≤ d(y, s) ≤ d(x, s) + d(x, y) = d(x, y) + d(x, S)
  • By symmetry, d(x, S) ≤ d(x, y) + d(y, S), so d_S(x, y) = |d(y, S) − d(x, S)| ≤ d(x, y), as needed.
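The proposition can be verified exhaustively on a small example (the helper name `frechet` and the 4-cycle test graph are ours):

```python
from itertools import chain, combinations

def frechet(d, S):
    """x -> d(x, S) = min over s in S of d(x, s)."""
    return lambda x: min(d[x][s] for s in S)

# shortest-path metric of the 4-cycle 0-1-2-3-0
d = [[0, 1, 2, 1],
     [1, 0, 1, 2],
     [2, 1, 0, 1],
     [1, 2, 1, 0]]
n = 4
for S in chain.from_iterable(combinations(range(n), k) for k in range(1, n + 1)):
    g = frechet(d, S)
    for x, y in combinations(range(n), 2):
        assert abs(g(y) - g(x)) <= d[x][y]   # d_S never stretches a distance
print("d_S(x, y) <= d(x, y) for every subset S of the 4-cycle")
```

This 1-Lipschitz property is what bounds the expansion of the full embedding on the next slides.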

SLIDE 25

Bourgain's Theorem Proof Idea

  • Proof idea: Choose many Fréchet embeddings, with a coordinate for each one.
  • Resulting expansion is at most the sum of the weights on the embeddings (this will be O(log n) for us)
  • Challenge: Ensure that the contraction is O(1). In other words, ensure that some of the Fréchet embeddings preserve some of the distance between each pair of points x and y.

SLIDE 26

Bad Case #1

  • Issue: Could have that d_S(x, y) ≪ d(x, y). In fact, d_S(x, y) can easily be zero!
  • Case 1: All points in S are far from x and y and d(x, S) = d(y, S).
  • Example: [figure: x and y equidistant from the nearest point in S]

SLIDE 27

Bad Case #2

  • Case 2: There are two points s_x and s_y in S where s_x is very close to x and s_y is very close to y. If so, we can have that d(x, S) = d(x, s_x) = d(y, s_y) = d(y, S)
  • Example: [figure: x next to s_x and y next to s_y]

SLIDE 28

Attempt #1

  • Want S to contain exactly one point p which is very close to x or y.
  • Let d = d(x, y). Pick S so that S has precisely one point p which is within distance d/3 of either x or y.
  • Can be accomplished with constant probability by taking a random S of the appropriate size.

SLIDE 29

Attempt #1

  • Attempt #1: Pick S so that S has precisely one point p which is within distance d/3 of either x or y.
  • Danger: S may also contain point(s) of distance slightly more than d/3 from the other point.

SLIDE 30

Attempt #1

  • Possible fix: Require that S contains exactly one point within distance d/3 of x or y and no other points within distance d/2 of x or y
  • This implies d_S(x, y) ≥ d/6
  • However, this may be too much to ask for…

SLIDE 31

Actual Analysis

  • Def: Given r and p, define B_r(p) = {x: d(x, p) ≤ r}
  • For each j ∈ [1, ⌈log2 n⌉], define d_j to be d_j = min(min{r : |B_r(x) ∪ B_r(y)| ≥ 2^j}, d/3)
  • Lemma: If S consists of n/2^j points chosen at random then P(d_S(x, y) ≥ d_{j+1} − d_j) is Ω(1)
  • Proof: With probability Ω(1),
    1. ∃p ∈ S: p ∈ B_{d_j}(x) ∪ B_{d_j}(y)
    2. ∄p′ ∈ S: p′ ≠ p, min{d(x, p′), d(y, p′)} < d_{j+1}

SLIDE 32

Actual Analysis Picture

  • If S consists of n/2^j points chosen at random then with probability Ω(1):
  [figure: balls of radii d_j and d_{j+1} around x and y]

SLIDE 33

Actual Analysis Continued

  • Lemma: If S consists of n/2^j points chosen at random then with constant probability, d_S(x, y) ≥ d_{j+1} − d_j
  • Corollary: Averaging over all j ∈ [1, ⌈log n⌉], the expected value of d_S(x, y) is at least Ω(d/log n)
  • For each j ∈ [0, ⌈log n⌉], take O(log n) sets S of size 2^j at random. This ensures that everything is close to its expectation with high probability.

SLIDE 34

Actual Analysis Continued

  • Full embedding procedure: For each j ∈ [0, ⌈log n⌉ − 1], take m = O(log n) sets S of size 2^j at random. For each such S, create a coordinate where each point x has value (1/m) ⋅ d(x, S).
  • Averaging over many subsets of each size ensures that everything is close to its expectation with high probability.
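A direct, unoptimized rendering of the procedure above (helper names and the cycle test metric are ours; the lecture's proof, not this experiment, is what guarantees the O(log n) bound):

```python
import random
from itertools import combinations

def cycle_metric(n):
    """Shortest-path metric of the n-cycle."""
    return [[min(abs(i - j), n - abs(i - j)) for j in range(n)] for i in range(n)]

def bourgain_embedding(n, d, t, rng):
    """For each scale 2^j, take t random subsets S; coordinate x -> d(x, S) / t."""
    coords = []
    size = 1
    while size <= n:
        for _ in range(t):
            S = rng.sample(range(n), size)
            coords.append([min(d[x][s] for s in S) / t for x in range(n)])
        size *= 2
    return [[c[x] for c in coords] for x in range(n)]   # n points, one row each

rng = random.Random(0)
n = 16
d = cycle_metric(n)
emb = bourgain_embedding(n, d, t=8, rng=rng)
l1 = lambda p, q: sum(abs(a - b) for a, b in zip(p, q))
exp = max(l1(emb[x], emb[y]) / d[x][y] for x, y in combinations(range(n), 2))
con = max(d[x][y] / max(l1(emb[x], emb[y]), 1e-12)
          for x, y in combinations(range(n), 2))
print("expansion:", exp, "contraction:", con, "distortion:", exp * con)
```

Since every coordinate is 1-Lipschitz and each scale's t coordinates are averaged, the expansion is at most the number of scales, matching the "sum of the weights" bound on Slide 25.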

SLIDE 35

Part IV: Tight Example: Expanders

SLIDE 36

Expander Graphs

  • A vertex/edge expander is a graph G where every subset of G has a lot of neighbors/outgoing edges
  • Definition: The vertex expansion of a graph G is min_{S: 0<|S|≤n/2} |N(S)|/|S| where N(S) = {v: ∃u ∈ S: (u, v) ∈ E(G)}
  • Definition: The edge expansion of a graph G is min_{S: 0<|S|≤n/2} |δ(S)|/|S| where δ(S) = {(u, v): u ∈ S, v ∉ S, (u, v) ∈ E(G)}
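Both definitions can be evaluated by brute force on tiny graphs (helper names ours; exponential in n):

```python
from itertools import combinations

def vertex_expansion(n, edges):
    """min over 0 < |S| <= n/2 of |N(S)| / |S|, following the slide's N(S)."""
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return min(len(set().union(*(adj[u] for u in S))) / k
               for k in range(1, n // 2 + 1)
               for S in combinations(range(n), k))

def edge_expansion(n, edges):
    """min over 0 < |S| <= n/2 of (# edges leaving S) / |S|."""
    return min(sum(1 for u, v in edges if (u in S) != (v in S)) / k
               for k in range(1, n // 2 + 1)
               for S in combinations(range(n), k))

# K4: any S with |S| = 2 has N(S) = all four vertices and 4 outgoing edges,
# so both expansions equal 2.
edges = list(combinations(range(4), 2))
print(vertex_expansion(4, edges), edge_expansion(4, edges))  # 2.0 2.0
```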

SLIDE 37

Observations on Expander Graphs

  • Expander graphs are extremely useful in complexity theory.
  • Derandomization: random walks mix well
  • Here: Edge expanders have no sparse cuts.
  • Proposition: If G has edge expansion c then for all cuts C = (S, S̄), ρ(C) = (# of edges cut)/(|S| ⋅ |S̄|) ≥ c/n
  • Proof: By definition, # of edges cut ≥ c|S|, and |S̄| ≤ n
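A numeric check of the proposition on a concrete graph (helper name `cut_edges` and the test graph are ours): compute the edge expansion c by brute force and confirm every cut satisfies ρ(C) ≥ c/n.

```python
from itertools import combinations

def cut_edges(edges, S):
    """# of edges with exactly one endpoint in S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))

# a small 3-regular graph: the circulant C6(1, 3), i.e. K3,3
n = 6
edges = [(u, v) for u, v in combinations(range(n), 2) if v - u in (1, 3, 5)]
cuts = [set(S) for k in range(1, n // 2 + 1) for S in combinations(range(n), k)]
c = min(cut_edges(edges, S) / len(S) for S in cuts)        # edge expansion
rho_min = min(cut_edges(edges, S) / (len(S) * (n - len(S))) for S in cuts)
assert rho_min >= c / n                                     # the proposition
print(c, rho_min)  # 1.666..., 0.5
```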

SLIDE 38

Constructing Expanders

  • With high probability, random graphs are excellent expanders.
  • Constructing expanders explicitly is more challenging and is an entire field of research on its own.

SLIDE 39

Ω(log n) gap with expanders

  • Use the distance metric d_ij = smallest length of a path from i to j.
  • For a d-regular expander with edge expansion d/4:
    1. Σ_{i,j: i<j, (i,j)∈E(G)} d_ij = |E(G)|, which is O(nd)
    2. Σ_{i,j: i<j} d_ij is Ω(n² log n), as most pairs of vertices are logarithmic distance apart
  • Linear programming relaxation value: O(d/(n log n))
  • Actual value is Ω(d/n)
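The two sums can be illustrated concretely. Caveat (our choice, not the lecture's): the Boolean hypercube is used here only because its shortest-path distances are Hamming distances and average Θ(log n); it is not an expander, so it illustrates the sums but not the actual gap.

```python
from itertools import combinations

k = 6
n = 2 ** k                                   # 64 vertices, k-regular hypercube
dist = lambda u, v: bin(u ^ v).count("1")    # shortest path = Hamming distance
pairs = list(combinations(range(n), 2))
# sum over edges: every edge has distance 1, so this is just |E(G)| = nk/2
edge_sum = sum(d for d in (dist(u, v) for u, v in pairs) if d == 1)
# sum over all pairs: each of the k coordinates splits the pairs evenly
all_sum = sum(dist(u, v) for u, v in pairs)
print(edge_sum, all_sum, all_sum / len(pairs))  # 192, 6144, average ~ k/2
```

The ratio edge_sum / all_sum is the normalized LP objective; on a d-regular expander the numerator stays O(nd) while the denominator grows to Ω(n² log n), which is the source of the Ω(log n) gap.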

SLIDE 40

References

  • [Bou85] J. Bourgain. On Lipschitz embedding of finite metric spaces in Hilbert space. Israel J. Math., 52(1–2), p. 46–52. 1985.
  • [LR99] T. Leighton and S. Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM 46(6), p. 787–832. 1999.
  • [LLR95] N. Linial, E. London, Y. Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica 15(2), p. 215–245. 1995.