SLIDE 1

Data Streams & Communication Complexity

Lecture 3: Communication Complexity and Lower Bounds
Andrew McGregor, UMass Amherst


SLIDES 2–6

Basic Communication Complexity

◮ Three friends Alice, Bob, and Charlie each have some information x, y, z and Charlie wants to compute some function P(x, y, z).

[Diagram: Alice holds x, Bob holds y, Charlie holds z; Alice sends m1 to Bob, Bob sends m2 to Charlie, and Charlie produces the output.]

◮ To help Charlie, Alice sends a message m1 to Bob, and then Bob sends a message m2 to Charlie.

◮ Question: How large must the total length of the messages be for Charlie to evaluate P(x, y, z) correctly?

◮ Deterministic: m1(x), m2(m1, y), out(m2, z) = P(x, y, z) (a toy sketch follows this slide)

◮ Random: m1(x, r), m2(m1, y, r), out(m2, z, r) where r is a public random string. Require Pr[out(m2, z, r) = P(x, y, z)] ≥ 9/10.
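
To make the deterministic model concrete, here is a minimal Python sketch. The predicate P(x, y, z) = parity of all the bits is an assumed toy example (not from the lecture, and deliberately easy: two 1-bit messages suffice).

```python
# One-way model: Alice -> Bob -> Charlie. Each message depends only on
# what its sender has seen. Toy predicate: parity of all input bits.

def P(x, y, z):
    return (sum(x) + sum(y) + sum(z)) % 2

def m1(x):                  # Alice's 1-bit message: parity of her bits
    return sum(x) % 2

def m2(msg1, y):            # Bob's 1-bit message: running parity
    return (msg1 + sum(y)) % 2

def out(msg2, z):           # Charlie's output
    return (msg2 + sum(z)) % 2

x, y, z = [0, 1, 1], [1, 0, 0], [1, 1, 0]
assert out(m2(m1(x), y), z) == P(x, y, z)   # 2 bits of communication total
```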

SLIDES 7–13

Stream Algorithms Yield Communication Protocols

◮ Let Q be some stream problem. Suppose there’s a reduction x → S1, y → S2, z → S3 such that knowing Q(S1 ◦ S2 ◦ S3) solves P(x, y, z).

[Diagram: as before, but Alice, Bob, and Charlie now hold the streams S1, S2, S3.]

◮ An s-bit stream algorithm A for Q yields a 2s-bit protocol for P (sketch below): Alice runs A on S1; sends the memory state to Bob; Bob instantiates A with this state and runs it on S2; sends the state to Charlie, who finishes running A on S3 and infers P(x, y, z) from Q(S1 ◦ S2 ◦ S3).
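
A minimal sketch of this simulation, assuming a toy stream algorithm (a running sum standing in for an arbitrary s-bit algorithm A): the only thing that ever crosses between players is the algorithm's memory state.

```python
# Simulating a stream algorithm as a communication protocol: Alice runs the
# algorithm on S1 and sends its memory state; Bob resumes from that state on
# S2; Charlie resumes again on S3 and reads off the answer.

class StreamAlg:
    """Toy s-bit algorithm: its entire memory is a running sum."""
    def __init__(self, state=0):
        self.state = state
    def process(self, stream):
        for item in stream:
            self.state += item      # stand-in for an arbitrary update rule
        return self.state

S1, S2, S3 = [1, 2], [3], [4, 5]
msg1 = StreamAlg().process(S1)          # Alice's message = state after S1
msg2 = StreamAlg(msg1).process(S2)      # Bob resumes, sends state after S2
answer = StreamAlg(msg2).process(S3)    # Charlie finishes the run
assert answer == sum(S1 + S2 + S3)
```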

SLIDES 14–15

Communication Lower Bounds Imply Stream Lower Bounds

◮ Had there been t players, the s-bit stream algorithm for Q would have led to a (t − 1)s-bit protocol for P.

◮ Hence, a lower bound of L on the communication required for P implies that s ≥ L/(t − 1) bits of space are required to solve Q.

SLIDE 16

Outline of Lecture

Classic Problems and Reductions
Information Statistics Approach
Hamming Approximation

SLIDE 17

Outline: Classic Problems and Reductions

SLIDES 18–21

Indexing

◮ Consider a binary string x ∈ {0, 1}^n and j ∈ [n], e.g., x = (0, 1, 0, 1, 1, 0) and j = 3, and define Index(x, j) = xj.

◮ Suppose Alice knows x and Bob knows j.

◮ How many bits need to be sent by Alice for Bob to determine Index(x, j) with probability 9/10? Ω(n)

SLIDES 22–25

Application: Median Finding

◮ Thm: Any algorithm that returns the exact median of a length-(2n − 1) stream requires Ω(n) memory.

◮ Reduction from Index (sketch below): On input x ∈ {0, 1}^n, Alice generates S1 = {2i + xi : i ∈ [n]}. On input j ∈ [n], Bob generates S2 = {n − j copies of 0 and j − 1 copies of 2n + 2}. E.g., x = (0, 1, 0, 1, 1, 0) → {2, 5, 6, 9, 11, 12} and j = 3 → {0, 0, 0, 14, 14}.

◮ Then median(S1 ∪ S2) = 2j + xj and this determines Index(x, j).

◮ An s-space algorithm implies an s-bit protocol, so s = Ω(n) by the communication complexity of Indexing.
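
The reduction can be verified exhaustively for small n; a sketch (the stream construction is exactly the one on this slide, and `median_low` just picks the middle element of the odd-length stream):

```python
# Verify median(S1 ∪ S2) = 2j + x_j for every x in {0,1}^6 and j in [6].
import itertools
import statistics

def alice_stream(x):
    n = len(x)
    return [2 * i + x[i - 1] for i in range(1, n + 1)]   # S1 = {2i + x_i}

def bob_stream(n, j):
    return [0] * (n - j) + [2 * n + 2] * (j - 1)         # S2

n = 6
for x in itertools.product([0, 1], repeat=n):
    for j in range(1, n + 1):
        S = alice_stream(x) + bob_stream(n, j)           # length 2n - 1
        assert statistics.median_low(S) == 2 * j + x[j - 1]
print("reduction verified: the median reveals Index(x, j)")
```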

SLIDES 26–29

Multi-Party Set-Disjointness

◮ Consider a t × n matrix where each column has weight 0, 1, or t, e.g.,

C = [ 0 0 0 1 0
      1 0 0 1 0
      0 1 0 1 1
      0 0 0 1 0 ]

and let Disjt(C) = 1 if there is an all-1s column and 0 otherwise.

◮ Consider t players where Pi knows the i-th row of C.

◮ How many bits need to be communicated between the players to determine Disjt(C)? Ω(n/t)

SLIDES 30–36

Application: Frequency Moments

◮ Thm: A 2-approximation algorithm for Fk needs Ω(n^(1−2/k)) space.

◮ Reduction from Set-Disjointness (sketch below): The i-th player generates the set Si = {j : Cij = 1}; e.g., the matrix C above yields the stream {4, 1, 4, 5, 2, 4, 4}.

◮ If all columns have weight 0 or 1: Fk(S) ≤ n

◮ If there’s a column of weight t: Fk(S) ≥ t^k

◮ If t > 2^(1/k) n^(1/k), then a 2-approximation of Fk(S) distinguishes the two cases.

◮ An s-space 2-approximation implies an s(t − 1)-bit protocol, so s = Ω(n/t²) = Ω(n^(1−2/k)) by the communication complexity of Set-Disjointness.
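
A sketch of the two cases on an assumed tiny instance (t = 4, n = 5, k = 2, so t^k = 16 exceeds 2n = 10 and the cases are separated):

```python
# F_k(S) = sum of f_j^k over item frequencies f_j. If every column of C has
# weight 0 or 1 the stream items are distinct; an all-ones column makes one
# item appear t times.
from collections import Counter

def F(k, stream):
    return sum(f ** k for f in Counter(stream).values())

t, n, k = 4, 5, 2

disjoint_stream = [1, 2, 4, 5]            # all column weights 0 or 1
assert F(k, disjoint_stream) <= n         # F_k(S) <= n

intersect_stream = [4] * t + [1, 2, 5]    # column 4 has weight t
assert F(k, intersect_stream) >= t ** k   # F_k(S) >= t^k
```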

SLIDES 37–40

Hamming Approximation

◮ Consider two binary vectors x, y ∈ {0, 1}^n, e.g., x = (0, 1, 0, 1, 1, 0) and y = (1, 1, 0, 0, 1, 1), and define the Hamming distance ∆(x, y) = |{i : xi ≠ yi}|.

◮ Suppose Alice knows x and Bob knows y.

◮ How many bits need to be communicated to estimate ∆(x, y) up to an additive √n error? Ω(n) bits.

SLIDES 41–46

Application: Distinct Elements

◮ Thm: A (1 + ε)-approximation algorithm for F0 needs Ω(1/ε²) space.

◮ Reduction from Hamming Approximation: On input x, y ∈ {0, 1}^n, the players form S1 = {j : xj = 1} and S2 = {j : yj = 1}, e.g., x = (0, 1, 0, 1, 1, 0) and y = (1, 1, 0, 0, 1, 1) → {2, 4, 5, 1, 2, 5, 6}.

◮ Note that 2F0(S) = |x| + |y| + ∆(x, y): an index where xj = yj = 1 is counted twice by |x| + |y|, while an index where exactly one of xj, yj equals 1 is counted once by |x| + |y| and once by ∆(x, y). (A sanity check follows this slide.)

◮ We may assume |x| and |y| are known to Bob. Hence, a (1 + ε)-approximation of F0 yields an additive approximation of ∆(x, y) to within ε(|x| + |y| + ∆(x, y))/2 ≤ nε.

◮ This is less than √n if ε < 1/√n.

◮ An s-space (1 + ε)-approximation implies an s-bit protocol, so s = Ω(n) = Ω(1/ε²) by the communication complexity of approximating Hamming distance.
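
The identity 2F0(S) = |x| + |y| + ∆(x, y) is easy to sanity-check; a sketch:

```python
# Check 2*F0 = |x| + |y| + Δ(x, y) on random binary vectors.
import random

n = 20
for _ in range(1000):
    x = [random.randint(0, 1) for _ in range(n)]
    y = [random.randint(0, 1) for _ in range(n)]
    S1 = {j for j in range(n) if x[j] == 1}
    S2 = {j for j in range(n) if y[j] == 1}
    F0 = len(S1 | S2)                       # distinct elements of S1 ∘ S2
    delta = sum(a != b for a, b in zip(x, y))
    assert 2 * F0 == sum(x) + sum(y) + delta
```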

SLIDE 47

Outline: Information Statistics Approach

SLIDES 48–50

Information Statistics Approach

◮ The information statistics approach is based on analyzing the “information revealed” about the input by the messages.

◮ It is useful for proving bounds on complicated functions in terms of simpler problems, e.g., proving a bound on Disjt(M) = ⋁j∈[n] Andt(M1,j, . . . , Mt,j) by first establishing a bound on Andt.

◮ We’ll first give some definitions and then run through an example.

SLIDES 51–59

Information Theory Definitions

◮ Let X and Y be random variables.

◮ Entropy: H(X) := Σi −P[X = i] lg P[X = i]

◮ Conditional Entropy: H(X|Y) := Ey∼Y[H(X|Y = y)] ≤ H(X)

◮ Mutual Information: I(X : Y) = H(X) − H(X|Y) (a numeric sketch follows this slide)

[Venn diagram: H(X) and H(Y) drawn as overlapping regions; the overlap is I(X : Y) and the remainders are H(X|Y) and H(Y|X).]

◮ Useful Facts:
◮ If X takes at most 2^ℓ values, then H(X) ≤ ℓ.
◮ If X and Y are independent, then I(XY : Z) ≥ I(X : Z) + I(Y : Z).
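
A small sketch that evaluates these definitions on an assumed toy joint distribution of (X, Y), just to make H, H(X|Y), and I(X : Y) concrete:

```python
from math import log2

p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}  # joint p(x, y)

def H(dist):                      # entropy of a distribution given as a dict
    return -sum(q * log2(q) for q in dist.values() if q > 0)

pX = {x: sum(q for (a, _), q in p.items() if a == x) for x in (0, 1)}
pY = {y: sum(q for (_, b), q in p.items() if b == y) for y in (0, 1)}

# H(X|Y) = E_{y~Y}[ H(X | Y = y) ]
HXgY = sum(pY[y] * H({x: p[(x, y)] / pY[y] for x in (0, 1)}) for y in (0, 1))
I = H(pX) - HXgY                  # mutual information I(X : Y)

print(f"H(X) = {H(pX):.3f}, H(X|Y) = {HXgY:.3f}, I(X:Y) = {I:.3f}")
assert HXgY <= H(pX) + 1e-12      # conditioning never increases entropy
```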

SLIDES 60–62

Information Cost

◮ Suppose you have a protocol Π for a two-party communication problem P in which Alice and Bob have random inputs X and Y.

◮ Let M be the (random) message sent by Alice and define:

cost(Π) = max |M|   and   icost(Π) = I(M : X)

◮ Note that icost(Π) = I(M : X) ≤ H(M) ≤ cost(Π).

SLIDES 63–70

Example: Indexing

◮ We’ll prove a lower bound on the information cost of Index, where X ∈R {0, 1}^n, in terms of a simpler problem, “Echo.”

◮ Echo: Alice has a single bit B ∈R {0, 1} and Bob wants to output B with probability at least 1 − δ.

◮ A protocol ΠIndex for Index yields a protocol ΠEcho,i for Echo where i is hard-coded into the protocol (sketch below):

1. Given B, Alice picks Xj ∈R {0, 1} for each j ≠ i and generates X = (X1, X2, . . . , Xi−1, B, Xi+1, . . . , Xn).
2. She sends the message M she’d have sent in ΠIndex if she’d had X.
3. Bob receives M and outputs the value he’d have returned in ΠIndex had his input been i.

◮ Note 1: If ΠIndex is correct with probability 1 − δ, then ΠEcho,i is also correct with probability 1 − δ.

◮ Note 2: The message in ΠIndex on input X ∈R {0, 1}^n is distributed identically to the message in ΠEcho,i on input B ∈R {0, 1}.
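
A sketch of the embedding, with a placeholder pair alice_msg/bob_out standing in for an arbitrary ΠIndex (here, the trivial protocol that sends all of x, used only to exercise the construction):

```python
import random

def alice_msg(x):          # hypothetical message function of Π_Index
    return tuple(x)

def bob_out(msg, j):       # hypothetical output function of Π_Index
    return msg[j]

def echo_protocol(B, i, n):
    X = [random.randint(0, 1) for _ in range(n)]
    X[i] = B               # Alice embeds her bit B at the hard-coded index i
    M = alice_msg(X)       # ...and sends Π_Index's message on X
    return bob_out(M, i)   # Bob answers as if his Index input were i

assert echo_protocol(1, 3, 8) == 1   # correct whenever Π_Index is correct
```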

SLIDES 71–76

Relating Information Cost of Index and Echo

◮ Since X1, X2, . . . , Xn are independent:

cost(ΠIndex) ≥ icost(ΠIndex)
             = I(X1 X2 . . . Xn : M)
             ≥ I(X1 : M) + I(X2 : M) + . . . + I(Xn : M)
             = icost(ΠEcho,1) + icost(ΠEcho,2) + . . . + icost(ΠEcho,n)

◮ By Fano’s inequality, solving Echo with probability > 1 − δ requires icost(ΠEcho,i) = H(B) − H(B|M) ≥ 1 − H2(δ), where H2(p) = −p lg p − (1 − p) lg(1 − p).

◮ Hence, cost(ΠIndex) ≥ (1 − H2(δ))n. (This bound is evaluated numerically below.)
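
A sketch evaluating the resulting bound (1 − H2(δ))n for a few error rates δ:

```python
from math import log2

def H2(p):                                   # binary entropy
    return -p * log2(p) - (1 - p) * log2(1 - p)

n = 1000
for delta in (0.1, 0.25, 0.4):
    print(f"delta = {delta}: cost >= {(1 - H2(delta)) * n:.0f} bits")
# e.g. delta = 0.1 gives 1 - H2(0.1) ≈ 0.531, a lower bound of about 0.53·n.
```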

SLIDES 77–81

Outline for Disjt Lower Bound

◮ Express Disjt in terms of Andt, where Andt(x1, . . . , xt) = ∧i xi:

Disjt(C) = ⋁j∈[n] Andt(C1,j, . . . , Ct,j)

◮ Define a random input C as follows: for each column j, pick Dj ∈R [t] and set CDj,j ∈R {0, 1}; all other entries are 0.

◮ Let M = (M1, . . . , Mt−1) be the messages sent in a t-party protocol and define the information cost of a protocol as icost(Π|D) = I(C : M|D) where D = (D1, . . . , Dn).

◮ A protocol for Disjt yields n different protocols ΠAndt,i for Andt:

icost(ΠDisjt|D) ≥ Σi∈[n] icost(ΠAndt,i|D)

◮ The result follows by showing icost(ΠAndt,i|D) = Ω(1/t).

SLIDE 82

Outline: Hamming Approximation

SLIDE 83

Hamming Approximation Lower Bound

Some communication results can be proved via a reduction from other communication results.

Theorem. If Alice and Bob have x, y ∈ {0, 1}^n and Bob wants to determine ∆(x, y) up to ±√n with probability 9/10, then Alice must send Ω(n) bits.

SLIDES 84–90

Hamming Approximation Lower Bound

◮ Reduction from the Index problem: Alice knows z ∈ {0, 1}^t and Bob knows j ∈ [t]. Let’s assume |z| = t/2 with t/2 odd (so r·z below is a sum of an odd number of ±1s and is never 0).

◮ Alice and Bob pick r ∈R {−1, 1}^t using public random bits.

◮ Alice computes sign(r·z) and Bob computes sign(rj).

◮ Lemma (simulated below): For some constant c > 0,

P[sign(r·z) = sign(rj)] = 1/2          if zj = 0
                        = 1/2 + c/√t   if zj = 1

◮ Repeat n = 25t/c² times, with a fresh r each time, to construct xi = I[sign(r·z) = +] and yi = I[sign(rj) = +].

◮ Note that zj = 0 ⇒ E[∆(x, y)] = n/2 and zj = 1 ⇒ E[∆(x, y)] = n/2 − 5√n, and by Chernoff bounds P[|∆(x, y) − E[∆(x, y)]| ≥ 2√n] < 1/10.

◮ Hence, a ±√n approximation of ∆(x, y) determines zj with probability > 9/10.
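
A Monte Carlo sketch of the lemma; t = 26 is an assumed value (so t/2 = 13 is odd and sign(r·z) is always defined), and j is taken to be the first coordinate:

```python
import random

def agree_rate(t, zj, trials=100_000):
    z = [0] * t
    for idx in random.sample(range(1, t), t // 2 - zj):
        z[idx] = 1                       # |z| = t/2 once z[0] is set
    z[0] = zj                            # coordinate j = 0 carries z_j
    hits = 0
    for _ in range(trials):
        r = [random.choice((-1, 1)) for _ in range(t)]
        dot = sum(ri * zi for ri, zi in zip(r, z))   # r·z, never 0 here
        hits += (dot > 0) == (r[0] > 0)  # sign(r·z) vs sign(r_j)
    return hits / trials

t = 26
print("z_j = 0:", agree_rate(t, 0))      # ≈ 1/2
print("z_j = 1:", agree_rate(t, 1))      # ≈ 1/2 + c/√t (≈ 0.61 for t = 26)
```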

SLIDES 91–98

Proof of Lemma

Claim. Let A be the event A = {sign(r·z) = rj}. For some constant c > 0,

P[A] = 1/2          if zj = 0
     = 1/2 + c/√t   if zj = 1

◮ If zj = 0: sign(r·z) and rj are independent, so P[A] = 1/2.

◮ If zj = 1: Let s = r·z − rj, the sum of an even number (ℓ = t/2 − 1) of independent ±1 values. Then

P[A] = P[A | s = 0] P[s = 0] + P[A | s ≠ 0] P[s ≠ 0]

◮ P[s = 0] = (ℓ choose ℓ/2)/2^ℓ = 2c/√t for some constant c > 0.

◮ P[A | s = 0] = 1 since s = 0 ⇒ r·z = rj ⇒ A.

◮ P[A | s ≠ 0] = 1/2 since s ≠ 0 ⇒ s ∈ {. . . , −4, −2, 2, 4, . . .}; as |s| ≥ 2 and rj = ±1, sign(r·z) = sign(s), which is independent of rj.

◮ So P[A] = P[s = 0] + P[s ≠ 0]/2 = 1/2 + P[s = 0]/2 = 1/2 + c/√t. (Computed exactly below.)
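
The decomposition can also be evaluated exactly; a sketch, with c read off as c = √t · P[s = 0]/2 (t = 26 is an assumed value with t/2 odd):

```python
from math import comb, sqrt

t = 26
ell = t // 2 - 1                          # even, since t/2 is odd
p_s0 = comb(ell, ell // 2) / 2 ** ell     # P[s = 0] = C(ell, ell/2) / 2^ell
pA = 1.0 * p_s0 + 0.5 * (1 - p_s0)        # P[A|s=0] = 1, P[A|s≠0] = 1/2
c = sqrt(t) * p_s0 / 2
print(f"P[s=0] = {p_s0:.4f}, P[A] = {pA:.4f} = 1/2 + {c:.4f}/sqrt(t)")
assert abs(pA - (0.5 + c / sqrt(t))) < 1e-12
```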