Data Streams & Communication Complexity
Lecture 3: Communication Complexity and Lower Bounds
Andrew McGregor, UMass Amherst
Basic Communication Complexity

◮ Three friends Alice, Bob, and Charlie each have some information: x, y, and z respectively. Charlie wants to evaluate P(x, y, z) for some function P.

    Alice →(m1) Bob →(m2) Charlie

◮ To help Charlie, Alice sends a message m1 to Bob, and then Bob sends a message m2 to Charlie.
◮ Question: How large must the total length of the messages be for Charlie to evaluate P(x, y, z)?
◮ Deterministic: m1(x), m2(m1, y), out(m2, z) = P(x, y, z)
◮ Randomized: m1(x, r), m2(m1, y, r), out(m2, z, r) where r is a public random string and we require out(m2, z, r) = P(x, y, z) with probability at least 1 − δ. (A concrete sketch of this format follows.)
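To make the protocol format concrete, here is a minimal sketch in Python for a toy choice of P, three-party parity; the function and the single-bit message encoding are our own illustration, not part of the lecture. Each message is one bit, so the total communication is 2 bits.

    # Minimal sketch of the one-way protocol structure for the toy
    # (hypothetical) target function P(x, y, z) = parity of all bits.

    def m1(x: str) -> str:
        """Alice's message: parity of her bits."""
        return str(x.count("1") % 2)

    def m2(m1_msg: str, y: str) -> str:
        """Bob's message: fold his bits into Alice's parity."""
        return str((int(m1_msg) + y.count("1")) % 2)

    def out(m2_msg: str, z: str) -> int:
        """Charlie's output: the parity P(x, y, z)."""
        return (int(m2_msg) + z.count("1")) % 2

    x, y, z = "1011", "0110", "111"
    assert out(m2(m1(x), y), z) == (x + y + z).count("1") % 2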
Stream Algorithms Yield Communication Protocols

◮ Let Q be some stream problem. Suppose there's a reduction x → S1, y → S2, z → S3 such that P(x, y, z) can be deduced from the answer to Q on the concatenated stream S1 ∘ S2 ∘ S3.

    Alice: S1 →(m1) Bob: S2 →(m2) Charlie: S3

◮ An s-bit stream algorithm A for Q yields a 2s-bit protocol for P: Alice runs A on S1 and sends the resulting memory state to Bob; Bob initializes A with this state, continues it on S2, and sends the new state to Charlie; Charlie finishes running A on S3 and deduces P(x, y, z) from the output. (A sketch of this simulation follows.)
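Below is a minimal sketch of the simulation, assuming a toy streaming algorithm whose entire memory is a running sum; only the state-passing pattern matters, and it applies verbatim to any algorithm whose state fits in s bits.

    # Toy streaming algorithm: its whole memory is one integer (the state).
    # Each "message" is just the serialized state, so an s-bit-state
    # algorithm gives a protocol with two s-bit messages.

    class StreamAlg:
        def __init__(self, state=0):
            self.state = state          # the s-bit memory

        def process(self, stream):
            for item in stream:
                self.state += item      # toy update rule
            return self.state

    S1, S2, S3 = [1, 2], [3], [4, 5]    # reductions of x, y, z (hypothetical)

    m1 = StreamAlg().process(S1)        # Alice: run on S1, send state
    m2 = StreamAlg(m1).process(S2)      # Bob: resume from m1, send state
    answer = StreamAlg(m2).process(S3)  # Charlie: resume from m2, output
    assert answer == sum(S1 + S2 + S3)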
◮ Had there been t players, the s-bit stream algorithm for Q would have yielded an s(t − 1)-bit protocol, since the memory state is forwarded t − 1 times.
◮ Hence, a lower bound of L on the communication required for P implies that Q requires at least L/(t − 1) bits of space, as the display below summarizes.
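In symbols, the two bullets above combine as follows (a one-line restatement, not new content):

    \[
      s\,(t-1) \;\ge\; L
      \quad\Longrightarrow\quad
      s \;\ge\; \frac{L}{t-1}.
    \]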
Index

◮ Consider a binary string x ∈ {0, 1}^n and an index j ∈ [n], e.g., x = 10110 and j = 2.
◮ Suppose Alice knows x and Bob knows j.
◮ How many bits need to be sent by Alice for Bob to determine Index(x, j) := x_j?
◮ Answer: Ω(n) bits, even for randomized protocols.
Lower Bound for Exact Median

◮ Thm: Any algorithm that returns the exact median of a stream of length 2n − 1 requires Ω(n) space.
◮ Reduction from Index: On input x ∈ {0, 1}^n, Alice generates the stream S1 = ⟨2i + x_i : i ∈ [n]⟩; on input j, Bob generates S2 consisting of n − j copies of 0 and j − 1 copies of 2n + 2.
◮ Then median(S1 ∪ S2) = 2j + x_j and this determines Index(x, j).
◮ An s-space algorithm implies an s-bit protocol, so s = Ω(n). (A sanity check of the reduction follows.)
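A quick check of the reduction on a random instance; this only verifies the median identity, while the lower bound itself comes from the communication bound for Index.

    # Sanity check of the median reduction on a random instance.
    import random

    n = 8
    x = [random.randint(0, 1) for _ in range(n)]
    j = random.randint(1, n)                    # 1-indexed, as on the slide

    S1 = [2 * i + x[i - 1] for i in range(1, n + 1)]  # Alice's stream
    S2 = [0] * (n - j) + [2 * n + 2] * (j - 1)        # Bob's stream

    stream = sorted(S1 + S2)                    # length 2n - 1
    median = stream[n - 1]                      # n-th smallest element
    assert median == 2 * j + x[j - 1]           # recovers Index(x, j) = x_j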
Set Disjointness

◮ Consider a t × n binary matrix C in which every column has weight 0, 1, or t, e.g., for t = 3 and n = 4:

    1 0 0 1
    0 0 1 1
    0 1 0 1

◮ Consider t players where P_i knows the i-th row of C.
◮ How many bits need to be communicated between the players to determine whether some column has weight t?
◮ Answer: Ω(n/t) bits.
Lower Bound for Frequency Moments

◮ Thm: A 2-approximation algorithm for Fk needs Ω(n^{1−2/k}) space.
◮ Reduction from Set Disjointness: The i-th player generates the set S_i = {j : C_{ij} = 1} and the stream is S = S_1 ∘ S_2 ∘ · · · ∘ S_t.
◮ If all columns have weight 0 or 1: Fk(S) ≤ n
◮ If there's a column of weight t: Fk(S) ≥ t^k
◮ If t > 2^{1/k} n^{1/k} then a 2-approximation of Fk(S) distinguishes the two cases.
◮ An s-space 2-approximation implies an s(t − 1)-bit protocol, so s(t − 1) = Ω(n/t). (See the derivation below.)
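The parameter setting that the last bullet compresses, written out (taking t = Θ(n^{1/k}) just large enough that t^k > 2n):

    \[
      s\,(t-1) = \Omega\!\left(\frac{n}{t}\right)
      \;\Longrightarrow\;
      s = \Omega\!\left(\frac{n}{t^{2}}\right)
        = \Omega\!\left(\frac{n}{n^{2/k}}\right)
        = \Omega\!\left(n^{1-2/k}\right)
      \qquad \text{for } t = \Theta(n^{1/k}).
    \]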
Gap Hamming

◮ Consider two binary vectors x, y ∈ {0, 1}^n, e.g., x = 10110 and y = 01100, for which the Hamming distance is ∆(x, y) = 3.
◮ Suppose Alice knows x and Bob knows y.
◮ How many bits need to be communicated to estimate ∆(x, y) up to an additive ±√n?
◮ Answer: Ω(n) bits; the reduction from Index proving this appears at the end of the lecture.
Lower Bound for Distinct Elements

◮ Thm: A (1 + ε)-approximation algorithm for F0 needs Ω(ε^{−2}) space.
◮ Reduction from Hamming Approximation: On input x, y ∈ {0, 1}^n, Alice generates the stream S1 = ⟨i : x_i = 1⟩ and Bob generates S2 = ⟨i : y_i = 1⟩.
◮ Note that 2F0(S1 ∘ S2) = |x| + |y| + ∆(x, y), where |·| is the Hamming weight. (This identity is checked numerically below.)
◮ We may assume |x| and |y| are known to Bob, since Alice can append them at an extra cost of O(log n) bits. Hence, a (1 + ε)-approximation of F0 yields an estimate of ∆(x, y) with additive error O(εn).
◮ This error is less than √n when ε < c/√n for a small enough constant c > 0.
◮ An s-space (1 + ε)-approximation implies an s-bit protocol, so taking n = Θ(ε^{−2}) gives s = Ω(n) = Ω(ε^{−2}).
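A quick numerical check of the identity 2F0 = |x| + |y| + ∆(x, y) on random inputs:

    # Verify 2*F0 = |x| + |y| + Hamming distance for random binary vectors.
    import random

    n = 20
    x = [random.randint(0, 1) for _ in range(n)]
    y = [random.randint(0, 1) for _ in range(n)]

    S1 = [i for i in range(n) if x[i] == 1]    # Alice's stream
    S2 = [i for i in range(n) if y[i] == 1]    # Bob's stream

    F0 = len(set(S1 + S2))                     # distinct elements in S1 ∘ S2
    delta = sum(a != b for a, b in zip(x, y))  # Hamming distance ∆(x, y)
    assert 2 * F0 == sum(x) + sum(y) + delta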
Information Statistics

◮ The information statistics approach is based on analyzing the information that a protocol's messages reveal about the players' inputs.
◮ It is useful for proving bounds on complicated functions in terms of bounds for simpler, primitive functions.
◮ We'll first give some definitions and then run through an example.
Entropy and Mutual Information

◮ Let X and Y be random variables.
◮ Entropy: H(X) := Σ_i −P[X = i] lg P[X = i]
◮ Conditional Entropy: H(X|Y) := E_{y∼Y}[H(X|Y = y)] ≤ H(X)
◮ Mutual Information: I(X : Y) = H(X) − H(X|Y) (checked numerically below)

[Venn diagram: H(X) and H(Y) drawn as overlapping circles; the overlap is I(X : Y) and the remainders are H(X|Y) and H(Y|X).]

◮ Useful Facts:
    ◮ If X takes at most 2^ℓ values, then H(X) ≤ ℓ.
    ◮ If X and Y are independent, then I(XY : Z) ≥ I(X : Z) + I(Y : Z).
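To ground the definitions, a short numerical check that I(X : Y) = H(X) − H(X|Y); the small joint distribution below is an arbitrary example of our own.

    # Check I(X:Y) = H(X) - H(X|Y) for a small joint distribution.
    from math import log2
    from collections import defaultdict

    # Joint distribution p[(x, y)] (arbitrary example values).
    p = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

    def H(dist):
        """Entropy of a distribution given as {outcome: probability}."""
        return -sum(q * log2(q) for q in dist.values() if q > 0)

    px, py = defaultdict(float), defaultdict(float)
    for (x, y), q in p.items():        # marginal distributions
        px[x] += q
        py[y] += q

    # Conditional entropy H(X|Y) = E_{y ~ Y}[ H(X | Y = y) ]
    HXgY = sum(py[y] * H({x: p[(x, y)] / py[y] for x in px}) for y in py)

    # Mutual information computed directly from the joint distribution
    I = sum(q * log2(q / (px[x] * py[y])) for (x, y), q in p.items())

    assert abs(I - (H(px) - HXgY)) < 1e-9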
Information Cost

◮ Suppose you have a protocol Π for a two-party communication problem in which Alice's input X is drawn from some distribution.
◮ Let M be the (random) message sent by Alice and define:

    icost(Π) := I(X : M)

◮ Note that cost(Π) ≥ H(M) ≥ I(X : M) = icost(Π), so a lower bound on the information cost is a lower bound on the communication cost.
Information Cost of Index

◮ We'll prove a lower bound on the information cost of Index where the input X ∈R {0, 1}^n is uniform.
◮ Echo: Alice has a single bit B ∈R {0, 1} and Bob wants to output B.
◮ A protocol ΠIndex for Index yields a protocol ΠEcho,i for Echo for each i ∈ [n]: Alice places her bit B in position i, fills the other n − 1 positions with uniform random bits, and sends the message that ΠIndex would send; Bob runs ΠIndex's output procedure with j = i.
◮ Note 1: If ΠIndex is correct with probability 1 − δ then ΠEcho,i is correct with probability 1 − δ.
◮ Note 2: The message in ΠIndex on input X ∈R {0, 1}^n is distributed identically to the message in ΠEcho,i, so icost(ΠEcho,i) = I(X_i : M).
Lower Bound for Index

◮ Since X1, X2, . . . , Xn are independent:

    icost(ΠIndex) = I(X : M) ≥ Σ_{i∈[n]} I(X_i : M) = Σ_{i∈[n]} icost(ΠEcho,i)

◮ By Fano's inequality, solving Echo with probability > 1 − δ requires icost(ΠEcho,i) = I(X_i : M) ≥ 1 − H2(δ), where H2 is the binary entropy function.
◮ Hence, cost(ΠIndex) ≥ (1 − H2(δ))n. (Numerical values of this bound appear below.)
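For a sense of scale, a small script of our own evaluating the bound (1 − H2(δ))n for a few failure probabilities:

    # Evaluate the Index lower bound cost >= (1 - H2(delta)) * n.
    from math import log2

    def H2(d):
        """Binary entropy function."""
        return 0.0 if d in (0.0, 1.0) else -d * log2(d) - (1 - d) * log2(1 - d)

    n = 1000
    for delta in (0.01, 0.1, 1 / 3):
        print(f"delta = {delta:.3f}: cost >= {(1 - H2(delta)) * n:.0f} bits")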
Disjointness via And

◮ Express Disjt in terms of Andt, where Andt(x1, . . . , xt) = ∧_i x_i:

    Disjt(C) = ∨_{j∈[n]} Andt(C_{1j}, . . . , C_{tj})

◮ Define a random input C by C_{Dj,j} ∈R {0, 1} where Dj ∈R [t]; all other entries are 0.
◮ Let M = (M1, . . . , Mt−1) be the messages sent in a t-party protocol for Disjt.
◮ A protocol for Disjt yields n different protocols ΠAndt,i for Andt: the Andt instance is embedded as the i-th column of C and the remaining columns are filled in according to the distribution above.
◮ The result follows by showing icost(ΠAndt,i | D) = Ω(1/t); summing over the n columns gives communication Ω(n/t).
Lower Bound for Gap Hamming

◮ Reduction from the Index problem: Alice knows z ∈ {0, 1}^t, with weight |z| = t/2 as the proof of the Lemma requires, and Bob knows j ∈ [t].
◮ Alice and Bob pick r ∈R {−1, 1}^t using public random bits.
◮ Alice computes sign(r · z) and Bob computes sign(r_j).
◮ Lemma: For some constant c > 0,

    P[sign(r · z) = sign(r_j)] = 1/2            if z_j = 0
    P[sign(r · z) = sign(r_j)] ≥ 1/2 + c/√t     if z_j = 1

◮ Repeat n = 25t/c² times with independent vectors r^1, . . . , r^n to construct x, y ∈ {0, 1}^n, where x_i encodes sign(r^i · z) and y_i encodes sign(r^i_j) as bits.
◮ Note that E[∆(x, y)] = n/2 if z_j = 0 whereas E[∆(x, y)] ≤ n/2 − 5√n if z_j = 1.
◮ Hence, a ±√n approximation of ∆(x, y) determines z_j with probability > 9/10. (A simulation of the Lemma follows.)
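A quick Monte Carlo check of the Lemma; the parameters and the sign(0) tie-breaking convention below are our own arbitrary choices.

    # Estimate P[sign(r.z) = sign(r_j)] for the two cases z_j = 0 and z_j = 1.
    import random

    def agreement(t, z, j, trials=100_000):
        agree = 0
        for _ in range(trials):
            r = [random.choice((-1, 1)) for _ in range(t)]
            dot = sum(ri for ri, zi in zip(r, z) if zi == 1)  # r . z
            s = 1 if dot >= 0 else -1                         # sign; ties -> +1
            agree += (s == r[j])
        return agree / trials

    t = 66                               # t = 2 (mod 4), so l = t/2 - 1 is even
    z = [1] * (t // 2) + [0] * (t // 2)  # weight t/2, as the reduction assumes
    print(agreement(t, z, j=t - 1))      # z_j = 0: about 0.5
    print(agreement(t, z, j=0))          # z_j = 1: about 0.5 + c/sqrt(t)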
Proof of Lemma

◮ Let A be the event that sign(r · z) = sign(r_j).
◮ If z_j = 0: sign(r · z) and r_j are independent, so P[A] = 1/2.
◮ If z_j = 1: Let s = r · z − r_j, the sum of an even number ℓ = t/2 − 1 of random ±1 values (take t ≡ 2 (mod 4) so that ℓ is even).
◮ P[s = 0] = C(ℓ, ℓ/2) · 2^{−ℓ} ≥ 2c/√t for some constant c > 0 (see the estimate below).
◮ P[A | s = 0] = 1, since s = 0 ⇒ r · z = r_j ⇒ A.
◮ P[A | s ≠ 0] = 1/2, since s ≠ 0 ⇒ s ∈ {. . . , −4, −2, 2, 4, . . .}, so sign(r · z) = sign(s) regardless of r_j.
◮ So P[A] = P[s = 0] + P[s ≠ 0]/2 = 1/2 + P[s = 0]/2 ≥ 1/2 + c/√t.
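The bound on P[s = 0] is the standard central binomial estimate, via Stirling's approximation:

    \[
      \Pr[s = 0] \;=\; \binom{\ell}{\ell/2} 2^{-\ell}
      \;=\; \Theta\!\left(\frac{1}{\sqrt{\ell}}\right)
      \;=\; \Theta\!\left(\frac{1}{\sqrt{t}}\right),
      \qquad \text{since } \ell = t/2 - 1.
    \]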