Optimal Lower Bounds for Distributed and Streaming Spanning Forest Computation


slide-1
SLIDE 1

Optimal Lower Bounds for Distributed and Streaming Spanning Forest Computation

Huacheng Yu Oct 18, 2018

Harvard University

Joint work with Jelani Nelson

slide-6
SLIDE 6

Warm-up

Consider the following dynamic problem:

  • edges are inserted into an initially empty graph G on n vertices
  • must output a spanning forest when queried

Goal: minimize space

Space complexity: Θ(n log n) bits

  • upper bound: maintain the list of edges in the spanning forest, O(n log n) bits
  • lower bound: when the final graph is itself a tree, the whole graph has to be output, Ω(n log n) bits
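A minimal sketch of the insertion-only upper bound, assuming a union-find structure is used to decide which edges to keep (the slides only say to maintain the list of forest edges; the class and method names here are hypothetical). At most n − 1 edges are ever stored, each taking O(log n) bits.

```python
class InsertOnlySpanningForest:
    """Keeps only forest edges, so at most n - 1 edges are ever stored."""

    def __init__(self, n):
        self.parent = list(range(n))  # union-find parent pointers
        self.forest = []              # edges of the current spanning forest

    def _find(self, x):
        # Path halving keeps the union-find trees shallow.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def insert(self, u, v):
        ru, rv = self._find(u), self._find(v)
        if ru != rv:
            # The edge joins two components, so it belongs to the forest.
            self.parent[ru] = rv
            self.forest.append((u, v))
        # Otherwise the edge closes a cycle and can be forgotten.

    def query(self):
        return list(self.forest)


sf = InsertOnlySpanningForest(5)
for edge in [(0, 1), (1, 2), (0, 2), (3, 4)]:
    sf.insert(*edge)
print(sf.query())  # [(0, 1), (1, 2), (3, 4)]
```

This approach breaks as soon as deletions are allowed: once a forest edge is deleted, the forgotten cycle edges may be needed to reconnect the component.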

What if we allow edge deletions?

slide-8
SLIDE 8

Fully dynamic spanning forest

Maintain a dynamic graph on n vertices, supporting

  • edge insertions,
  • edge deletions, and
  • spanning forest queries

Goal: minimize space

Theorem (Ahn, Guha, McGregor’12) ... solvable using O(n log^3 n) bits of space with error probability 1/poly(n).

  • Only two more log factors!

slide-10
SLIDE 10

Fully dynamic spanning forest

Maintain a dynamic graph on n vertices, supporting

  • edge insertions,
  • edge deletions, and
  • spanning forest queries

Goal: minimize space

Theorem (Ahn, Guha, McGregor’12) ... solvable using O(n log(n/δ) log^2 n) bits of space with error probability δ.

  • Only two more log factors!

Why two more?

slide-12
SLIDE 12

Main result I

Theorem (This paper) Any data structure for fully dynamic spanning forest with error probability δ must use Ω(n log(n/δ) log^2 n) bits of memory, for any 2^(−n^0.99) < δ < 0.99.

δ is a constant ⇒ Ω(n log^3 n) bits of space: exactly two more log factors are needed!

slide-18
SLIDE 18

Simultaneous communication

The [Ahn, Guha, McGregor’12] solution also solves the following n-player communication problem.

A (fixed) graph on n vertices is given to n players with shared randomness:

  • each player only sees one vertex and its neighborhood
  • each player sends a message to a referee
  • referee outputs a spanning forest w.p. 1 − δ

Goal: minimize communication (compute a global function given small “sketches” of “local information”)

slide-21
SLIDE 21

AGM sketch for simultaneous communication

A graph on n vertices is given to n players with shared randomness:

  • each player only sees one vertex and its neighborhood
  • each player sends a message to a referee
  • referee outputs a spanning forest w.p. 1 − δ

Goal: minimize communication

Theorem (AGM’12) ... solvable using (worst-case) O(log(n/δ) log^2 n) bits of communication per player with error probability δ.

Trivial lower bound: Ω(log n) bits per player on average, since the referee has to learn Ω(n log n) bits.

slide-24
SLIDE 24

Main result II

Theorem (This paper) Any simultaneous communication protocol for spanning forest with error probability 0.99 must use Ω(log^3 n) bits of communication on average.

Exactly two more log factors are needed than the trivial information-theoretic lower bound.

Open: higher lower bounds when the error probability δ is lower?

slide-28
SLIDE 28

Graph sketching for spanning forest

[AGM’12] designed a (randomized) linear sketch S : N^(n choose 2) → N^(O(n log^2 n)) such that

  • S is a linear mapping with poly-bounded coefficients
  • S(G) is a concatenation of S_1(G), S_2(G), ..., S_n(G); each S_i(G) has O(log^2 n) dimensions and is computed from the neighborhood of vertex i
  • S(G) determines a spanning forest with probability 1 − 1/n^c

slide-29
SLIDE 29

Streaming algorithm

Store S(G) in memory:

  • update: S(G ± (u, v)) = S(G) ± S((u, v))
  • at end of stream: S(G) determines a spanning forest w.h.p.

Uses O(n log^3 n) bits of space
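A toy illustration of the update rule, assuming a small random integer matrix in place of the real AGM sketch (the dimension `K`, the seed, and the function names are hypothetical, and this toy sketch cannot recover a spanning forest). It only shows why linearity lets a stream of insertions and deletions be processed one edge at a time.

```python
import random

K = 4        # toy sketch dimension (hypothetical; the AGM sketch uses O(n log^2 n))
SEED = 2018  # stands in for the shared random bits


def edge_column(u, v):
    """Column of the toy sketch matrix for edge {u, v}, i.e. S((u, v))."""
    a, b = min(u, v), max(u, v)
    rng = random.Random(f"{SEED}:{a}:{b}")  # same edge always gives the same column
    return [rng.randint(-5, 5) for _ in range(K)]


def apply_update(sketch, u, v, sign):
    """S(G ± (u, v)) = S(G) ± S((u, v)); sign is +1 to insert, -1 to delete."""
    col = edge_column(u, v)
    return [s + sign * c for s, c in zip(sketch, col)]


def sketch_of(edges):
    """Sketch of a static graph, built by inserting its edges one by one."""
    sketch = [0] * K
    for u, v in edges:
        sketch = apply_update(sketch, u, v, +1)
    return sketch


# Stream: insert (0, 1), insert (1, 2), delete (0, 1).
s = [0] * K
for u, v, sign in [(0, 1, +1), (1, 2, +1), (0, 1, -1)]:
    s = apply_update(s, u, v, sign)

print(s == sketch_of([(1, 2)]))  # True: the deleted edge leaves no trace
```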

slide-30
SLIDE 30

Communication protocol

Given graph G:

  • Player i computes Si(G), and sends it to referee
  • referee concatenates all Si(G), obtains S(G)
  • referee outputs a spanning forest w.h.p.

Uses O(log^3 n) bits of communication per player
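The bit count can be unpacked as follows, assuming each coordinate of S_i(G) is an integer of magnitude poly(n) (which follows from the poly-bounded coefficients) and therefore fits in O(log n) bits:

```latex
\underbrace{O(\log^2 n)}_{\text{coordinates of } S_i(G)}
\times
\underbrace{O(\log n)}_{\text{bits per coordinate}}
= O(\log^3 n) \text{ bits per player},
\qquad
n \times O(\log^3 n) = O(n \log^3 n) \text{ bits in total.}
```

The total matches the space bound of the streaming algorithm, which stores the same concatenated sketch S(G).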

slide-31
SLIDE 31

Simultaneous communication complexity of spanning forest

slide-32
SLIDE 32

Recall...

An n-vertex graph is given to n players with shared randomness:

  • each player only sees one vertex and its neighborhood
  • each player sends a message to a referee
  • referee outputs a spanning forest w.p. 1 − δ

Goal: prove an average player must send Ω(log^3 n) bits for constant δ

slide-34
SLIDE 34

Recall...

An n-vertex graph is given to n players with shared randomness:

  • each player only sees one vertex and its neighborhood
  • each player sends a message to a referee
  • referee outputs a spanning forest w.p. 1 − δ

Goal: prove some player must send Ω(log^3 n) bits for δ = 1/n^c

Starting point: Universal Relation UR⊃...

slide-37
SLIDE 37

Universal Relation UR⊃

Alice and Bob share random bits.

  • Alice holds S ⊆ [n]; Bob holds T ⊂ S
  • Alice sends a message M to Bob
  • Bob must output any x ∈ S \ T

Theorem (KNPWWY’17) For failure probability δ > 2^(−n^0.99), the optimal length of M is Θ(log(1/δ) log^2 n).

In particular, for failure probability 1/n^c, the optimal length is Θ(log^3 n).
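The special case is a direct substitution into the bound stated above, written out here for a constant c > 0:

```latex
\delta = n^{-c}
\;\Longrightarrow\;
\log(1/\delta) = c \log n
\;\Longrightarrow\;
\Theta\!\left(\log(1/\delta)\,\log^2 n\right)
= \Theta\!\left(c \log n \cdot \log^2 n\right)
= \Theta(\log^3 n).
```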

slide-45
SLIDE 45

Connection to UR⊃

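[Figure: vertex v with neighbors u_1, u_2, ..., u_k; v sends its message M_v to the referee; some of the neighbors are marked "v is only neighbor".]

S: the neighbors of v
T: the neighbors of v whose only neighbor is v

The referee has to find some element of S \ T.

Why is this not already an Ω(log^3 n) lower bound? M_{u_1} may also reveal the edge (v, u_1)...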
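A small sketch of how the pair (S, T) is read off a graph for a given vertex v, assuming the graph is given as a dictionary of neighbor sets (the function name and the example graph are hypothetical):

```python
def ur_sets(adj, v):
    """Return (S, T) for vertex v.

    S is the neighborhood of v; T is the set of neighbors of v
    whose only neighbor is v.
    """
    S = set(adj[v])
    T = {u for u in S if adj[u] == {v}}
    return S, T


# Hypothetical example: v = 0 has neighbors 1, 2, 3; vertex 3 is also adjacent to 4.
adj = {
    0: {1, 2, 3},
    1: {0},
    2: {0},
    3: {0, 4},
    4: {3},
}
S, T = ur_sets(adj, 0)
print(S - T)  # {3}
```

In this example the only edge that connects v to the rest of the graph goes to vertex 3, the single element of S \ T.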

slide-53
SLIDE 53

Hard instances

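[Figure: a hard instance. Labels in the figure: n^(1−ε); |V_r| = n^ε; n^ε; a block V_i with its vertex v_i; the sets S_i and T_i. Vertices are randomly permuted.]

For vertex v_i, its neighbors encode the set S_i, and its neighbors on the left (in V_i) encode the set T_i.

The spanning forest contains an edge between v_i and V_r.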

slide-56
SLIDE 56

Hard distribution

Generating hard instances:

  1. Fix {v_i} arbitrarily, randomly partition the rest into {V_i}, V_r;
  2. For each v_i, generate S_i, T_i from the hard distribution for UR⊃;
  3. Connect each v_i to |T_i| random vertices in V_i;
  4. Connect each v_i to |S_i \ T_i| random vertices in V_r.
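The four steps can be sketched as a sampler. This is only an illustration under stated assumptions: the hard distribution for UR⊃ is not given in the slides, so `sample_ur_pair` is a placeholder that draws a uniformly random pair, and the parameters `num_blocks` and `block_size` (standing in for n^(1−ε) and n^ε) are hypothetical names.

```python
import random


def sample_ur_pair(universe_size, rng):
    """Placeholder for the UR hard distribution (NOT the one from the paper):
    a random non-empty S and a random proper subset T of S."""
    s_size = rng.randint(1, universe_size)
    S = rng.sample(range(universe_size), s_size)
    t_size = rng.randint(0, s_size - 1)
    return set(S), set(S[:t_size])


def sample_hard_instance(num_blocks, block_size, seed=0):
    rng = random.Random(seed)
    # Step 1: fix the special vertices, randomly partition the rest.
    special = list(range(num_blocks))
    rest = list(range(num_blocks, num_blocks + (num_blocks + 1) * block_size))
    rng.shuffle(rest)
    blocks = [rest[i * block_size:(i + 1) * block_size] for i in range(num_blocks)]
    v_r = rest[num_blocks * block_size:]
    edges = set()
    for i, v in enumerate(special):
        # Step 2: draw (S_i, T_i); only the sizes are needed below.
        S, T = sample_ur_pair(block_size, rng)
        # Step 3: |T_i| random neighbors inside the block V_i.
        for u in rng.sample(blocks[i], len(T)):
            edges.add((v, u))
        # Step 4: |S_i \ T_i| random neighbors inside V_r.
        for u in rng.sample(v_r, len(S) - len(T)):
            edges.add((v, u))
    return special, blocks, v_r, edges


special, blocks, v_r, edges = sample_hard_instance(num_blocks=4, block_size=3, seed=1)
print(len(blocks), len(v_r), len(edges))
```

Because T_i is a proper subset of S_i, every v_i gets at least one neighbor in V_r, which is the edge a spanning forest must reveal.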

slide-60
SLIDE 60

Reduction from UR⊃

Reduce from UR⊃. Main idea for solving UR⊃: embed the input (S, T) into one of the pairs (S_i, T_i), then simulate the spanning forest protocol.

Goals:

  1. Generate a graph G that “looks like” a hard instance
  2. The spanning forest tells us an element of S \ T
  3. Keep the communication cost low and preserve the success probability

slide-67
SLIDE 67

Solving UR⊃

Given (S, T) over universe [n^ε], generate a random graph G:

  1. Sample a random v_i and a random injection f : [n^ε] → V \ {v_j}_j (the vertices other than the special ones)
  2. Connect v_i to f(S)
  3. V_i := f(T) ∪ (n^ε − |T| other vertices)
  4. V_r := f([n^ε] \ T) ∪ (|T| other vertices)
  5. Randomly partition the other vertices into V_1, ..., V_{i−1}, V_{i+1}, ..., and sample the neighborhoods of v_1, ..., v_{i−1}, v_{i+1}, ...

The distribution of G is the hard distribution.

Let u be one of v_i's neighbors in V_r; then f^(−1)(u) ∈ S \ T.
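Steps 1 to 4 and the final claim can be checked on a small example. This is a sketch under assumptions: the function name `embed`, the vertex labels, and the seed are hypothetical, the universe is taken as `range(m)` with m standing in for n^ε, and step 5 (sampling the other blocks) is omitted because it does not affect the claim.

```python
import random


def embed(S, T, m, non_special, seed=0):
    """Embed a UR instance (S, T) over universe range(m) around one vertex v_i.

    non_special: list of at least 2*m vertex names (V minus the special vertices).
    Returns (f, neighbors of v_i, V_i, V_r).
    """
    assert T < S and S <= set(range(m))
    rng = random.Random(seed)
    image = rng.sample(non_special, m)        # step 1: random injection f
    f = dict(zip(range(m), image))
    neighbors = {f[x] for x in S}             # step 2: connect v_i to f(S)
    image_set = set(image)
    others = [u for u in non_special if u not in image_set]
    rng.shuffle(others)
    k = len(T)
    # Step 3: V_i is f(T) padded with m - |T| other vertices.
    V_i = {f[x] for x in T} | set(others[:m - k])
    # Step 4: V_r is f([m] \ T) padded with |T| other vertices.
    V_r = {f[x] for x in range(m) if x not in T} | set(others[m - k:m])
    return f, neighbors, V_i, V_r


m = 6
S, T = {0, 2, 3, 5}, {2, 5}
f, neighbors, V_i, V_r = embed(S, T, m, non_special=list(range(100, 120)))

# Every neighbor of v_i inside V_r maps back to an element of S \ T.
f_inv = {u: x for x, u in f.items()}
recovered = {f_inv[u] for u in neighbors & V_r}
print(recovered == S - T)  # True
```

Both V_i and V_r end up with exactly m vertices, matching the block sizes of the hard instance.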

slide-68
SLIDE 68

UR⊃ protocol

Given (S, T) over universe [n^ε]:

  • Alice: sends M_{v_i} based on f(S)
  • Bob: analyzes the distribution of G conditioned on f, T, M_{v_i}
  • Bob: finds the u ∈ V_r that is a neighbor of v_i with the highest probability, and outputs f^(−1)(u)

slide-69
SLIDE 69

Analyzing the protocol

The protocol for UR⊃ has

  • communication cost |M_{v_i}|, and
  • failure probability ≤ δ + 1/n^0.1.

By [KNPWWY’17], |M_{v_i}| ≥ Ω(log(1/δ) log^2 n) (an Ω(log^3 n) lower bound when δ = 1/n^c).

slide-72
SLIDE 72

Open question

Lower bounds for simultaneous communication when the error probability is small? Ω(log(n/δ) log^2 n)?

Can the same lower bounds be proved for maintaining connected components? And for connectivity, i.e., deciding whether the whole graph is connected?

Thank you!