Data Streams & Communication Complexity Lecture 2: Graph Spanners, Sparsifiers, & Sketches



slide-1
SLIDE 1

Data Streams & Communication Complexity

Lecture 2: Graph Spanners, Sparsifiers, & Sketches Andrew McGregor, UMass Amherst

1/25

slide-3
SLIDE 3

Graph Streams

◮ Consider a stream of m edges e_1, e_2, ..., e_m defining a graph G with nodes V = [n] and edges E = {e_1, ..., e_m}.

◮ Semi-streaming: What can we compute with O(n · polylog n) space?

2/25

slide-4
SLIDE 4

Outline

Spanners and Distances
Sparsifiers and Cuts
Sketches and Dynamic Graphs
  Connectivity
  k-Connectivity
  Minimum Cut

3/25

slide-7
SLIDE 7

Graph Distances

◮ Goal: Approximate the length d_G(u, v) of the shortest path between any pair of nodes u, v ∈ G.

Definition

An α-spanner of a graph G is a subgraph H such that for any nodes u, v,
d_G(u, v) ≤ d_H(u, v) ≤ α · d_G(u, v).

5/25

slide-15
SLIDE 15

Warm-Up: Connectivity

◮ Goal: Compute the number of connected components.

◮ Algorithm: Maintain a spanning forest F.

  ◮ F ← ∅
  ◮ For each edge (u, v), if u and v aren't connected in F, F ← F ∪ {(u, v)}

◮ Analysis:

  ◮ F has the same number of connected components as G.
  ◮ F has at most n − 1 edges.

◮ Thm: Can count connected components in O(n log n) space.

6/25
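
The warm-up algorithm can be sketched in a few lines. This is only an illustration, assuming nodes are labelled 1..n and using a union-find structure to answer "are u and v connected in F?"; the function name `count_components` is made up for this example and is not from the lecture.

```python
def count_components(n, edge_stream):
    """One pass over the edges; stores only a spanning forest F (at most n - 1 edges)."""
    parent = list(range(n + 1))  # union-find over nodes 1..n (index 0 unused)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    forest = []
    for u, v in edge_stream:
        ru, rv = find(u), find(v)
        if ru != rv:  # u and v are not yet connected in F
            parent[ru] = rv
            forest.append((u, v))
    # Each forest edge merges two components, so n - |F| components remain.
    return n - len(forest), forest
```

The edge (1, 3) in a stream such as (1, 2), (2, 3), (1, 3), (4, 5) is discarded because its endpoints are already connected, which is why the stored forest never exceeds n − 1 edges.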

slide-22
SLIDE 22

Spanners

◮ Algorithm:

  ◮ H ← ∅.
  ◮ For each edge (u, v), if d_H(u, v) ≥ 2t, H ← H ∪ {(u, v)}

◮ Analysis:

  ◮ Distances increase by at most a factor 2t − 1 since an edge (u, v) is only forgotten if there's already a detour of length at most 2t − 1.
  ◮ Lemma: H has O(n^{1+1/t}) edges since all cycles have length ≥ 2t + 1.

Theorem

Can (2t − 1)-approximate all distances using only O(n^{1+1/t}) space.

7/25
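
A minimal sketch of the greedy spanner rule, assuming an unweighted graph and using a depth-limited BFS in H to test whether d_H(u, v) ≥ 2t. The helper names `bounded_distance` and `greedy_spanner` are illustrative only.

```python
from collections import deque


def bounded_distance(adj, source, target, limit):
    """BFS distance from source to target in H, or None if it exceeds `limit`."""
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        if dist[x] == limit:  # do not explore beyond the depth limit
            continue
        for y in adj.get(x, ()):
            if y not in dist:
                dist[y] = dist[x] + 1
                if y == target:
                    return dist[y]
                queue.append(y)
    return None


def greedy_spanner(edge_stream, t):
    adj = {}
    kept = []
    for u, v in edge_stream:
        # Keep (u, v) only if d_H(u, v) >= 2t, i.e. H has no detour of length <= 2t - 1.
        if bounded_distance(adj, u, v, 2 * t - 1) is None:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
            kept.append((u, v))
    return kept
```

With t = 2, the closing edge of a triangle is dropped (a detour of length 2 exists), while the closing edge of a 5-cycle is kept, matching the claim that every cycle in H has length at least 2t + 1.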

slide-29
SLIDE 29

Proof of Lemma

Lemma

A graph H on n nodes with no cycles of length ≤ 2t has O(n^{1+1/t}) edges.

◮ Let d = 2m/n be the average degree of H, where m is the number of edges of H.

◮ Let J be the graph formed by repeatedly removing nodes with degree less than d/2 until none remain. Note J ≠ ∅ because fewer than n(d/2) = m edges are removed.

◮ Grow a BFS tree of depth t from an arbitrary node in J.

◮ Because a) there are no cycles of length less than 2t + 1 and b) all degrees in J are at least d/2, the number of nodes at the t-th level of the BFS is at least (d/2 − 1)^t = (m/n − 1)^t.

◮ But (m/n − 1)^t ≤ |J| ≤ n and therefore m ≤ n + n^{1+1/t}.

8/25
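
The last step of the proof can be spelled out as follows. This worked derivation assumes m ≥ n, so that m/n − 1 is non-negative; when m < n the bound holds trivially.

```latex
\left(\frac{m}{n} - 1\right)^{t} \le n
\;\Longrightarrow\;
\frac{m}{n} - 1 \le n^{1/t}
\;\Longrightarrow\;
m \le n + n^{1 + 1/t} = O\!\left(n^{1 + 1/t}\right).
```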

slide-34
SLIDE 34

Cuts and Sparsifiers

◮ Goal: Approximate the capacity C_G(S) of any cut (S, V \ S) in G.

Definition

An α-sparsifier of a graph G is a weighted subgraph H such that for any cut (S, V \ S),
C_G(S) ≤ C_H(S) ≤ α · C_G(S),
where C_G(S) and C_H(S) are the capacities of the cut in G and H respectively.

Theorem (Batson, Spielman, Srivastava)

There exists an offline algorithm A returning a (1 + ε)-sparsifier with O(n ε^{-2}) edges.

◮ Idea: Use A as a black box to recursively sparsify the graph stream.

10/25

slide-36
SLIDE 36

Basic Properties of Sparsifiers

Lemma

If H_1 and H_2 are α-sparsifiers of G_1 and G_2, then H_1 ∪ H_2 is an α-sparsifier of G_1 ∪ G_2.

Lemma

If J is an α-sparsifier of H and H is an α-sparsifier of G, then J is an α^2-sparsifier of G.

11/25
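
The composition lemma follows by chaining the two sparsifier inequalities for an arbitrary cut S. The derivation below uses only the definition of an α-sparsifier given earlier.

```latex
C_G(S) \;\le\; C_H(S) \;\le\; C_J(S) \;\le\; \alpha\, C_H(S) \;\le\; \alpha^{2}\, C_G(S).
```

The first and last inequalities use that H is an α-sparsifier of G; the middle two use that J is an α-sparsifier of H.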

slide-48
SLIDE 48

Stream Sparsification

◮ Divide the stream into segments G_1, G_2, ... each of t = O(n ε^{-2}) edges.

◮ Consider a binary tree over the segments (shown for eight segments):

Leaves: G_1, G_2, G_3, G_4, G_5, G_6, G_7, G_8
Level 1: G_1∪G_2, G_3∪G_4, G_5∪G_6, G_7∪G_8
Level 2: G_1∪G_2∪G_3∪G_4, G_5∪G_6∪G_7∪G_8
Root: G = G_1∪G_2∪G_3∪G_4∪G_5∪G_6∪G_7∪G_8

◮ Recursively use A with parameter 1 + γ:

  ◮ Read in G_1: compute A(G_1) and forget G_1
  ◮ Read in G_2: compute A(G_2) and forget G_2
  ◮ Compute A(A(G_1) ∪ A(G_2)) and forget A(G_1) and A(G_2)
  ◮ Read in G_3: compute A(G_3) and forget G_3
  ◮ Read in G_4: compute A(G_4) and forget G_4
  ◮ Compute A(A(G_3) ∪ A(G_4)) and forget A(G_3) and A(G_4)
  ◮ Compute A(A(A(G_1) ∪ A(G_2)) ∪ A(A(G_3) ∪ A(G_4))) and forget . . .

◮ Results in a (1 + γ)^{log m}-sparsifier for G in O(n γ^{-2} log m) space.

◮ If γ = O(ε / log m), we get a (1 + ε)-sparsifier in O(n ε^{-2} log^3 m) space.

12/25
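
The merge pattern over the binary tree can be sketched as below. The offline algorithm A is not implemented here: `sparsify` is a stand-in callable supplied by the caller, and `stream_sparsify` is a hypothetical name. The point of the sketch is only the bookkeeping, namely that at most one sparsifier per tree level is stored at any time.

```python
def stream_sparsify(segments, sparsify):
    """Merge-and-reduce over a binary tree of stream segments.

    `segments` yields lists of weighted edges (u, v, w).
    `sparsify` stands in for the offline algorithm A (an assumption of this sketch).
    levels[h] holds the sparsifier of a complete subtree of height h, or None.
    """
    levels = []
    for segment in segments:
        current = sparsify(list(segment))
        h = 0
        while h < len(levels) and levels[h] is not None:
            # Two sparsifiers at the same height: union them, re-sparsify, move up.
            current = sparsify(levels[h] + current)
            levels[h] = None
            h += 1
        if h == len(levels):
            levels.append(None)
        levels[h] = current
    # Combine leftovers (needed when the number of segments is not a power of two).
    result = []
    for part in levels:
        if part is not None:
            result = sparsify(result + part) if result else part
    return result, len(levels)
```

With eight segments the list `levels` grows to four entries (heights 0 to 3), which mirrors the log-factor in the space bound: one sparsifier of O(n γ^{-2}) edges per level.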

slide-51
SLIDE 51

Dynamic Graph Streams

◮ Consider a stream of edge insertions and deletions, e.g.,

add(1, 2), add(1, 4), add(2, 3), add(1, 3), add(4, 5), add(3, 4), del(1, 4)

would result in the following graph:

[Figure: graph on nodes 1, 2, 3, 4, 5 with edges (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]

◮ Dynamic semi-streaming: What can we compute about a dynamic graph with only O(n · polylog n) space?

14/25
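
To make the semantics of the update stream concrete, the snippet below replays it and returns the final edge set. It stores the graph explicitly, so it illustrates only what the stream means, not a small-space algorithm; the tuple encoding of updates is an assumption of this example.

```python
def apply_dynamic_stream(updates):
    """Replay ("add" | "del", u, v) updates and return the resulting edge set."""
    edges = set()
    for op, u, v in updates:
        edge = (min(u, v), max(u, v))  # undirected: store endpoints in sorted order
        if op == "add":
            edges.add(edge)
        elif op == "del":
            edges.discard(edge)
        else:
            raise ValueError(f"unknown operation: {op}")
    return edges


stream = [("add", 1, 2), ("add", 1, 4), ("add", 2, 3), ("add", 1, 3),
          ("add", 4, 5), ("add", 3, 4), ("del", 1, 4)]
final_edges = apply_dynamic_stream(stream)
```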

slide-56
SLIDE 56

Connectivity

◮ Goal: Test whether G is connected.

◮ Our algorithm will actually return a spanning forest of G.

Lemma

Consider the offline algorithm:

1. For each node, select an incident edge.
2. Contract selected edges.
3. Repeat until no edges remain.

After log n steps, the number of nodes is the number of connected components in G. Furthermore, the set of selected edges contains a spanning forest.

◮ Idea: Emulate the above algorithm in a single pass using ℓ_0-sampling of a particular vector representation of G.

16/25
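
The offline select-and-contract procedure from the lemma can be sketched as follows. This version assumes nodes 1..n, represents contraction with union-find, and records only those selected edges that actually merge two supernodes, so the returned list is itself a spanning forest. The name `contract_rounds` is illustrative.

```python
def contract_rounds(n, edges):
    """Each round, every current supernode selects one incident edge; then contract."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    selected, rounds = [], 0
    while True:
        choice = {}  # supernode -> one edge leaving it
        for u, v in edges:
            ru, rv = find(u), find(v)
            if ru != rv:
                choice.setdefault(ru, (u, v))
                choice.setdefault(rv, (u, v))
        if not choice:  # no edges remain between supernodes
            break
        rounds += 1
        for u, v in choice.values():
            ru, rv = find(u), find(v)
            if ru != rv:  # skip edges made redundant earlier in this round
                parent[ru] = rv
                selected.append((u, v))
    components = len({find(x) for x in range(1, n + 1)})
    return components, selected, rounds
```

Every supernode with an outgoing edge is merged with at least one other supernode per round, so the number of supernodes in each non-trivial component at least halves, which is where the logarithmic number of rounds comes from.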

slide-59
SLIDE 59

Useful Graph Representation

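◮ Represent a graph on [n] with edges E ⊂ [n] × [n] as a matrix G ∈ {−1, 0, 1}^{n × (n choose 2)}, with one row per node and one column per node pair, whose non-zero entries are G_{j,(j,k)} = 1 and G_{k,(j,k)} = −1 if (j, k) ∈ E. E.g., the graph on nodes 1, ..., 5 with edges (1, 2), (1, 3), (2, 3), (3, 4), (4, 5) becomes

     (1,2) (1,3) (1,4) (1,5) (2,3) (2,4) (2,5) (3,4) (3,5) (4,5)
1      1     1     0     0     0     0     0     0     0     0
2     −1     0     0     0     1     0     0     0     0     0
3      0    −1     0     0    −1     0     0     1     0     0
4      0     0     0     0     0     0     0    −1     0     1
5      0     0     0     0     0     0     0     0     0    −1

◮ Lemma: For S ⊂ [n], support(Σ_{i∈S} a_i) = E(S), where a_i is the i-th row of G and E(S) is the set of edges across the cut (S, V \ S).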

17/25
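
The cut lemma is easy to check on the example graph. The sketch below stores each row a_i sparsely as a dictionary keyed by node pair; `incidence_rows` and `support_of_sum` are names invented for this illustration.

```python
def incidence_rows(n, edges):
    """Row a_j as a sparse dict: +1 at column (j, k) in row j, -1 in row k, for j < k."""
    rows = {j: {} for j in range(1, n + 1)}
    for u, v in edges:
        j, k = min(u, v), max(u, v)
        rows[j][(j, k)] = 1
        rows[k][(j, k)] = -1
    return rows


def support_of_sum(rows, S):
    """Columns where the sum of the rows indexed by S is non-zero."""
    total = {}
    for i in S:
        for pair, value in rows[i].items():
            total[pair] = total.get(pair, 0) + value
    return {pair for pair, value in total.items() if value != 0}


rows = incidence_rows(5, [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)])
crossing = support_of_sum(rows, {1, 2, 3})
```

An edge with both endpoints in S contributes +1 and −1 to the same column and cancels, so only edges crossing the cut survive in the sum.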

slide-63
SLIDE 63

Connectivity Algorithm

◮ Let A(a_1), A(a_2), ..., A(a_n) be sketches for ℓ_0 sampling. We can post-process each sketch to find an incident edge on each node.

◮ Suppose we found edges that connected, e.g., S = {a_1, a_2, a_3}. How can we find an edge e ∈ E(S) without taking another pass?

◮ Linearity: Because of linearity we can just add sketches,

A(a_1) + A(a_2) + A(a_3) = A(a_1 + a_2 + a_3) → e ∈ E(S)

◮ Under-the-rug: Actually we need to use log n independent sketch matrices B, C, D, ... to emulate each round of the algorithm. But this is fine: we can compute each B(a_i), C(a_i), D(a_i), ... during the same pass.

18/25
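
The linearity argument can be illustrated with a much simpler linear sketch than a real ℓ_0 sampler. The example below uses a two-number sketch (sum of entries, index-weighted sum of entries) that can recover the non-zero coordinate of a vector only when the vector is 1-sparse. That is an assumption of this demonstration; a full ℓ_0 sampler additionally needs subsampling and a verification test. The names `make_sketch`, `sketch_row`, and `recover` are hypothetical.

```python
from itertools import combinations


def make_sketch(n):
    pairs = list(combinations(range(1, n + 1), 2))
    index = {pair: i + 1 for i, pair in enumerate(pairs)}  # 1-based coordinates

    def sketch_row(node, edges):
        """Linear sketch (sum of entries, index-weighted sum) of the row a_node."""
        s0 = s1 = 0
        for u, v in edges:
            j, k = min(u, v), max(u, v)
            if node == j:
                value = 1
            elif node == k:
                value = -1
            else:
                continue
            s0 += value
            s1 += index[(j, k)] * value
        return (s0, s1)

    def recover(sketch):
        """Return the single non-zero coordinate, assuming a 1-sparse vector."""
        s0, s1 = sketch
        return pairs[s1 // s0 - 1]

    return sketch_row, recover


edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
sketch_row, recover = make_sketch(5)
sketches = [sketch_row(node, edges) for node in (1, 2, 3)]
# Adding the sketches gives the sketch of a_1 + a_2 + a_3.
combined = (sum(s[0] for s in sketches), sum(s[1] for s in sketches))
cut_edge = recover(combined)
```

Here the cut around S = {1, 2, 3} contains exactly one edge, so the summed vector is 1-sparse and the edge is recovered from the per-node sketches alone, without revisiting the stream.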

slide-69
SLIDE 69

k-Connectivity

◮ Goal: Test whether all cuts of G have size at least k.

◮ Our algorithm actually returns a certificate of k-connectivity.

Definition

We say a subgraph H is a k-certificate for G if, for all cuts (S, V \ S),
C_H(S) ≥ min(C_G(S), k).

Lemma

Let F_1 be a spanning forest of G and, for i ≥ 2, let F_i be a spanning forest of G \ (F_1 ∪ ... ∪ F_{i−1}). Then F_1 ∪ ... ∪ F_k is a k-certificate for G.

◮ Idea: Emulate the above algorithm in a single pass by exploiting linearity of the connectivity algorithm.

20/25
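
An offline version of the forest-peeling construction, together with a brute-force check of the k-certificate property, might look like this. It assumes a simple graph (no parallel edges) given as a list of node pairs, and the exhaustive cut check is feasible only for very small graphs. All function names are illustrative.

```python
from itertools import combinations


def spanning_forest(nodes, edges):
    parent = {x: x for x in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))
    return forest


def k_certificate(nodes, edges, k):
    """Union of F_1, ..., F_k, where F_i spans what remains after removing F_1..F_{i-1}."""
    remaining, certificate = list(edges), []
    for _ in range(k):
        forest = spanning_forest(nodes, remaining)
        certificate += forest
        chosen = set(forest)
        remaining = [e for e in remaining if e not in chosen]
    return certificate


def cut_size(edges, S):
    return sum((u in S) != (v in S) for u, v in edges)


def is_k_certificate(nodes, edges, H, k):
    nodes = list(nodes)
    for r in range(1, len(nodes)):
        for S in map(set, combinations(nodes, r)):
            if cut_size(H, S) < min(cut_size(edges, S), k):
                return False
    return True
```

Each forest has at most n − 1 edges, so the certificate has at most k(n − 1) edges regardless of how dense G is.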

slide-77
SLIDE 77

k-Connectivity Algorithm

◮ Can find F_1 using the connectivity algorithm.

◮ But how can we find F_2 without taking another pass over the data?

◮ Linearity: Suppose we have independent connectivity sketches A(G) and B(G) of the graph G.

1. Construct F_1 from A(G).
2. Construct B(F_1).
3. Then B(G) − B(F_1) = B(G \ F_1) can be used to construct F_2.

◮ Given A(G), B(G), C(G) we would find F_1 and F_2 as above. We then find F_3 from C(G) − C(F_1) − C(F_2) = C(G \ (F_1 ∪ F_2)).

◮ And so on. The resulting algorithm, connectivity_k, requires one pass and uses O(k · n · polylog n) space.

21/25

slide-83
SLIDE 83

Estimating Minimum Cut

◮ Goal: Estimate the size of the min-cut up to a (1 + ε) factor.

◮ If the min-cut size is O(ε^{-2} · polylog n), then the connectivity_k algorithm can find the min-cut exactly in O(ε^{-2} · n · polylog n) space.

◮ What can be done if the min-cut is large?

Theorem (Karger)

Let G = (V, E) be an unweighted graph with min-cut value λ. If we sample each edge with probability p ≥ p* := 6 λ^{-1} ε^{-2} log n and assign weight 1/p to sampled edges, then the resulting graph is a (1 + ε)-sparsification of G with high probability.

◮ Idea: Subsample the input graph at different rates and use connectivity_k to compute the min-cut size if it's small enough.

23/25
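
The sampling step in Karger's theorem can be sketched as below. Two assumptions are made for the example: "log n" is taken to be the natural logarithm (the slide does not state the base), and a seeded pseudo-random generator stands in for the sampling. `p_star` and `sample_graph` are made-up names.

```python
import math
import random


def p_star(min_cut, eps, n):
    """Threshold p* = 6 * log(n) / (lambda * eps^2); natural log is an assumption here."""
    return 6 * math.log(n) / (min_cut * eps ** 2)


def sample_graph(edges, p, seed=0):
    """Keep each edge independently with probability p and give it weight 1/p."""
    rng = random.Random(seed)
    weight = 1.0 / p
    return [(u, v, weight) for u, v in edges if rng.random() < p]
```

Since each kept edge carries weight 1/p, the expected weight across any cut equals its size in G; the theorem says that for p ≥ p* all cuts are simultaneously close to their expectation. If p* exceeds 1 the graph is too sparsely connected for sampling to help, and p = 1 (no sampling) is the only valid choice.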

slide-93
SLIDE 93

Min-Cut Algorithm

◮ Let h_i be a hash function such that for each e ∈ [n] × [n],

P[h_i(e) = 1] = 1/2^i

◮ Let G_i = (V, E_i) where E_i = {e ∈ E : h_i(e) = 1}

◮ Let H_i = connectivity_k(G_i) where k := 24 ε^{-2} log n

◮ Post-Processing: Let μ_i be the min-cut size of H_i. Return

2^j · μ_j where j = min{i : μ_i < k}

◮ Analysis:

  ◮ Let λ_i be the size of the min-cut of G_i.
  ◮ Karger's result implies 2^i λ_i = (1 ± ε)λ for all i = 0, 1, ..., ⌊lg 1/p*⌋.
  ◮ If λ_i < k, the connectivity_k algorithm guarantees λ_i = μ_i.
  ◮ Lemma: j ≤ ⌊lg 1/p*⌋

◮ Total space is O(k · n · polylog n) = O(ε^{-2} · n · polylog n).

◮ Can extend these ideas to get a (1 + ε)-sparsification of a dynamic graph in a single pass and O(ε^{-2} · n · polylog n) space.

24/25
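
The post-processing rule can be emulated offline on a tiny graph. This sketch makes several simplifying assumptions: the hash functions h_i are replaced by a SHA-256 based stand-in, the min-cut is computed by brute force over all cuts, and connectivity_k is not run at all. Instead the min-cut of G_i is capped at k, which leads to the same decision because μ_i < k exactly when λ_i < k. The function names are illustrative.

```python
import hashlib
from itertools import combinations


def keep(i, edge):
    """Stand-in for h_i: keeps `edge` with probability 2^-i (always at i = 0)."""
    digest = hashlib.sha256(f"{i}:{edge}".encode()).digest()
    return int.from_bytes(digest[:8], "big") < 2 ** (64 - i)


def min_cut(nodes, edges):
    """Brute-force min cut; only sensible for very small graphs."""
    nodes = list(nodes)
    best = len(edges)
    for r in range(1, len(nodes)):
        for S in map(set, combinations(nodes, r)):
            best = min(best, sum((u in S) != (v in S) for u, v in edges))
    return best


def estimate_min_cut(nodes, edges, k, max_level=32):
    for i in range(max_level + 1):
        sampled = [e for e in edges if keep(i, e)]
        mu = min(min_cut(nodes, sampled), k)  # exact whenever the min cut of G_i is below k
        if mu < k:
            return (2 ** i) * mu  # rescale by the inverse sampling rate
    return None
```

On the complete graph on five nodes with k = 5, level i = 0 already has min-cut 4 < k, so the exact value is returned without any rescaling error.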

slide-95
SLIDE 95

Proof of Lemma

◮ Consider i = ⌊lg 1/p*⌋, so the sampling probability for G_i is

2^{-i} < 2p* = 12 λ^{-1} ε^{-2} log n

◮ Consider a cut in G of size λ. The expected number of edges across the same cut in G_i is at most 2p* · λ = 12 ε^{-2} log n, and the number is < 24 ε^{-2} log n = k with high probability. Hence, λ_i < k.

25/25