Joshua Brody and Amit Chakrabarti, Dartmouth College, 24th CCC, 2009



SLIDE 1

Gap-Hamming Lower Bound July 18, 2009

Lower Bounds for Gap-Hamming-Distance and Consequences for Data Stream Algorithms

Joshua Brody and Amit Chakrabarti

Dartmouth College 24th CCC, 2009, Paris

Joshua Brody 1

SLIDE 3

Counting Distinct Elements in a Data Stream

14 3 1 3 9 9 4 2 1 5 2 3 6

Input: Stream of integers σ = ⟨a_1, ..., a_m⟩

Output: F_0 := number of distinct elements in σ

Goal: Minimize space used to compute F_0
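
As a baseline for the space question, here is a minimal sketch (not from the slides) of the exact computation of F_0 with a hash set. The helper name `count_distinct` is hypothetical, and the stream values are copied from the example above as extracted.

```python
def count_distinct(stream):
    """Return F_0, the number of distinct elements, by storing every element seen."""
    seen = set()  # memory grows with the number of distinct elements
    for a in stream:
        seen.add(a)
    return len(seen)


sigma = [14, 3, 1, 3, 9, 9, 4, 2, 1, 5, 2, 3, 6]
f0 = count_distinct(sigma)
print(f0)
```

This exact approach stores every distinct element; the streaming question is how much less memory suffices once randomization and approximation are allowed.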

SLIDE 8

Previous Streaming Results

Frequency Moments: F_k = Σ_{i=1}^{n} freq(i)^k [Alon-Matias-Szegedy ’96]

  • Ω(n) space unless randomization and approximation used
  • Upper, lower bounds for randomized algorithms that approximate F_k
  • Spawned lots of research, won 2005 Gödel Prize

One-pass, randomized, ε-approximate: |output/answer − 1| ≤ ε

Status as of Jan 2009:

  • Space upper bound: O(ε^−2)
  • Space lower bound: Ω(ε^−2)
  • Also hold for other problems, e.g. empirical entropy

Do multiple passes help? If not, why not?
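
To make the definitions concrete, a small sketch of the exact frequency moment and of the ε-approximation criterion. The function names are hypothetical, and the sum runs over items that actually occur, so that k = 0 counts distinct elements.

```python
from collections import Counter


def frequency_moment(stream, k):
    """Exact F_k: sum of freq(i)**k over the items i that occur in the stream."""
    freq = Counter(stream)
    return sum(f ** k for f in freq.values())


def is_eps_approx(output, answer, eps):
    """Check the guarantee |output / answer - 1| <= eps."""
    return abs(output / answer - 1) <= eps


stream = [1, 1, 2, 3, 3, 3]
print(frequency_moment(stream, 0), frequency_moment(stream, 1), frequency_moment(stream, 2))
```

A randomized streaming algorithm would only be required to meet `is_eps_approx` with good probability, not to compute `frequency_moment` exactly.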

SLIDE 9

The Gap-Hamming-Distance Problem

Input: Alice gets x ∈ {0, 1}^n, Bob gets y ∈ {0, 1}^n.

Output:

  • ghd(x, y) = 1 if ∆(x, y) > n/2 + √n
  • ghd(x, y) = 0 if ∆(x, y) < n/2 − √n

Problem: Design a randomized, constant-error protocol to solve this.

Cost: Worst-case number of bits communicated.

[Figure: example strings x and y with n = 12; only the 1 entries survived extraction]

n = 12; ∆(x, y) = 3 ∈ [6 − √12, 6 + √12]
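
The promise problem can be written out directly. This is a sketch with hypothetical names; `None` stands for inputs inside the gap, where any answer is acceptable.

```python
import math


def hamming(x, y):
    """Hamming distance ∆(x, y) between two equal-length bit sequences."""
    return sum(a != b for a, b in zip(x, y))


def ghd(x, y):
    """Return 1, 0, or None (inside the gap) according to the definition above."""
    n = len(x)
    d = hamming(x, y)
    if d > n / 2 + math.sqrt(n):
        return 1
    if d < n / 2 - math.sqrt(n):
        return 0
    return None


x = [0] * 12
y = [1, 1, 1] + [0] * 9  # differs from x in exactly 3 positions
print(hamming(x, y), ghd(x, y))
```

With n = 12 and distance 3, the input falls inside the gap, matching the example above.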

SLIDE 10

The Reductions

E.g., Distinct Elements (Other problems: similar)

[Figure: example strings x and y together with the streams σ and τ of (index, bit) pairs built from them; bit values lost in extraction]

Alice: x → σ = (1, x_1), (2, x_2), ..., (n, x_n)

Bob: y → τ = (1, y_1), (2, y_2), ..., (n, y_n)

Notice: F_0(σ ◦ τ) = n + ∆(x, y), which is either < 3n/2 − √n or > 3n/2 + √n.

Set ε = 1/√n.
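
The identity F_0(σ ◦ τ) = n + ∆(x, y) can be checked on a small instance. This is a sketch; the helper names and the sample strings are made up for illustration.

```python
def to_stream(bits):
    """Map a bit string z to the stream (1, z_1), (2, z_2), ..., (n, z_n)."""
    return [(i, b) for i, b in enumerate(bits, start=1)]


def distinct(stream):
    """F_0 of a stream of hashable tokens."""
    return len(set(stream))


x = [1, 0, 1, 1, 0, 0]
y = [1, 1, 0, 1, 0, 1]
sigma, tau = to_stream(x), to_stream(y)
delta = sum(a != b for a, b in zip(x, y))
print(distinct(sigma + tau), len(x) + delta)
```

Each index i contributes one token when x_i = y_i and two tokens when they differ, which is where n + ∆(x, y) comes from.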

SLIDE 12

Communication to Streaming

p-pass streaming algorithm ⇒ (2p − 1)-round communication protocol

messages = memory contents of streaming algorithm

And Thus

Previous results [Indyk-Woodruff’03], [Woodruff’04], [C.-Cormode-McGregor’07]:

  • For one-round protocols, R^→(ghd) = Ω(n)
  • Implies the Ω(ε^−2) streaming lower bounds

Key open questions:

  • What is the unrestricted randomized complexity R(ghd)?
  • Better algorithm for Distinct Elements (or F_k, or H) using two passes?
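
The simulation behind "p passes ⇒ 2p − 1 rounds" can be sketched as follows. The function names and the toy algorithm are assumptions for illustration; the only point is that the memory state is handed over each time the stream crosses from Alice's part to Bob's or back.

```python
def run_as_protocol(step, init_state, sigma, tau, passes):
    """Simulate a `passes`-pass streaming algorithm on sigma followed by tau.

    Alice holds sigma and Bob holds tau. Whenever the algorithm crosses from
    one player's part of the stream to the other's, the current memory state
    is sent as a message. Returns (final_state, messages).
    """
    state = init_state
    messages = []
    for p in range(passes):
        for item in sigma:  # Alice runs the algorithm on her part
            state = step(state, item)
        messages.append(state)  # Alice -> Bob
        for item in tau:  # Bob continues on his part
            state = step(state, item)
        if p < passes - 1:
            messages.append(state)  # Bob -> Alice, to start the next pass
    return state, messages


# Toy algorithm (hypothetical): remember the set of items seen so far.
def remember(state, item):
    return state | {item}


final, msgs = run_as_protocol(remember, frozenset(), [1, 2, 2], [2, 3], passes=2)
print(len(msgs), sorted(final))
```

Bob produces the output after the last pass, so no final message back to Alice is needed; that is why the count is 2p − 1 rather than 2p.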

SLIDE 17

Our Results

Previous Results (Communication):

  • One-round (one-way) lower bound: R^→(ghd) = Ω(n) [Woodruff’04]
  • Simplification, clever reduction from index [Jayram-Kumar-Sivakumar]
    Hard distribution “contrived,” non-uniform
  • Multi-round case: R(ghd) = Ω(√n) [Folklore]
    Reduction from disjointness using “repetition code”
    Hard distribution again far from uniform

What we show:

  • Theorem 1: Ω(n) lower bound for any O(1)-round protocol
    Holds under uniform distribution
  • Theorem 2: one-round, deterministic: D^→(ghd) = n − Θ(√n log n)
  • Theorem 3: R^→(ghd) = Ω(n)
    (simpler proof, uniform distrib) (independently proved by [Woodruff’09])

SLIDE 18

Technique: Round Elimination

Base Case Lemma: There is no “nice” 0-round ghd protocol.

Round Elimination Lemma: If there is a “nice” k-round ghd protocol, then there is a “nice” (k − 1)-round ghd protocol.

  • The (k − 1)-round protocol will be solving a “simpler” problem
  • Parameters degrade with each round elimination step

SLIDE 20

Technique: Round Elimination

Base Case Lemma: There is no 0-round ghd protocol with error ε < 1/2.

Round Elimination Lemma: If there is a “nice” k-round ghd protocol, then there is a “nice” (k − 1)-round ghd′ protocol.

  • The (k − 1)-round protocol will be solving a “simpler” problem
  • Parameters degrade with each round elimination step

SLIDE 23

Parametrized Gap-Hamming-Distance Problem

The problem:

ghd_{c,n}(x, y) =
    1, if ∆(x, y) ≥ n/2 + c√n,
    0, if ∆(x, y) ≤ n/2 − c√n,
    ⋆, otherwise.

Hard input distribution: µ_{c,n}: uniform over (x, y) such that |∆(x, y) − n/2| ≥ c√n

Protocol assumptions (eventually, will lead to contradiction):

  • Deterministic k-round protocol for ghd_{c,n}
  • Each message is s ≪ n bits
  • Error probability ≤ ε, under distribution µ_{c,n}
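
A sketch of the parametrized problem and of one way to sample from µ_{c,n}: rejection sampling from uniform pairs. The names are hypothetical and the sampler is only an illustration, not something the slides prescribe.

```python
import math
import random


def ghd_c(x, y, c):
    """ghd_{c,n}(x, y): 1, 0, or None for the "⋆" (otherwise) case."""
    n = len(x)
    d = sum(a != b for a, b in zip(x, y))
    if d >= n / 2 + c * math.sqrt(n):
        return 1
    if d <= n / 2 - c * math.sqrt(n):
        return 0
    return None


def sample_mu(n, c, rng):
    """Draw (x, y) uniformly from the pairs with |∆(x, y) − n/2| ≥ c√n."""
    while True:
        x = [rng.randint(0, 1) for _ in range(n)]
        y = [rng.randint(0, 1) for _ in range(n)]
        if ghd_c(x, y, c) is not None:  # keep only pairs outside the gap
            return x, y


rng = random.Random(0)
x, y = sample_mu(16, 1.0, rng)
print(ghd_c(x, y, 1.0))
```

Conditioning a uniform pair on landing outside the gap yields the uniform distribution over such pairs, which is how µ_{c,n} is defined above.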

SLIDE 28

Round Elimination

Main Construction: Given k-round protocol P for ghd_{c,n}, construct (k − 1)-round protocol Q for ghd_{c′,n′}

First Attempt:

  • Fix Alice’s first message m in P, suitably
  • Protocol Q1:
    – Input: x′, y′ ∈ {0, 1}^A where A ⊆ [n], |A| = n′
    – Extend x′ → x s.t. Alice sends m on input x (why possible?)
    – Extend y′ → y uniformly at random
    – Output P(x, y); Note: first message unnecessary
  • Errors: Q1 correct, unless
    – BAD1: ghd_{c′,n′}(x′, y′) ≠ ghd_{c,n}(x, y).
    – BAD2: ghd_{c,n}(x, y) ≠ P(x, y).

SLIDE 33

VC-Dimension

Fixing Alice’s first message:

  • Call x good if Pr_y[P(x, y) ≠ ghd_{c,n}(x, y)] ≤ 2ε
    Then #{good x} ≥ 2^(n−1) (Markov)
  • Let M = M_m = {good x : Alice sends m on input x}.
  • Fix m to maximize |M|; then |M| ≥ 2^(n−1−s).

[Figure: 0/1 matrix with a coordinate set A marked; only the 1 entries survived extraction]

Shattering:

  • Say S ⊆ {0, 1}^n shatters A ⊆ [n] if #{x|_A : x ∈ S} = 2^|A|
  • VCD(S) := size of largest A shattered by S

Sauer’s Lemma: If VCD(S) < αn then |S| < 2^(nH(α)).

Corollary: VCD(M) ≥ n′ := n/3 (Because s ≪ n)

Extend x′ → x: pick x ∈ M such that x′ = x|_A
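
Shattering and the "extend" step can be illustrated by brute force on a tiny set. This is a sketch: the function names are hypothetical, coordinates are 0-indexed here, and the search is exponential, so it is only meant for toy sizes.

```python
from itertools import combinations


def shatters(S, A):
    """True if the restrictions x|_A of strings x in S realise all 2^|A| patterns."""
    patterns = {tuple(x[i] for i in A) for x in S}
    return len(patterns) == 2 ** len(A)


def vc_dimension(S, n):
    """Size of the largest coordinate set A within {0, ..., n-1} shattered by S."""
    best = 0
    for size in range(1, n + 1):
        if any(shatters(S, A) for A in combinations(range(n), size)):
            best = size
        else:
            break  # if no set of this size is shattered, no larger set is
    return best


def extend(x_prime, S, A):
    """Pick some x in S with x|_A equal to x_prime (exists whenever S shatters A)."""
    for x in S:
        if tuple(x[i] for i in A) == tuple(x_prime):
            return x
    return None


S = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]  # strings with at most one 1
print(vc_dimension(S, 3), extend((1,), S, (0,)))
```

In the argument above, S plays the role of M and the shattered set A has size n′, so every x′ ∈ {0, 1}^A has an extension x ∈ M on which Alice sends the fixed message m.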

SLIDE 38

The First Bad Event

Recall BAD1: ghd_{c′,n′}(x′, y′) ≠ ghd_{c,n}(x, y).

Notation: x = x′ ◦ x̄, y = y′ ◦ ȳ, n = n′ + n̄.

Definition: x̄, ȳ nearly orthogonal if |∆(x̄, ȳ) − n̄/2| < 2√n̄.

Lemma: Pr_ȳ[x̄, ȳ nearly orthogonal] > 7/8. (Binom distrib tail)

Lemma: If x̄, ȳ nearly orthogonal and c′ ≥ 2c, then

  • ghd_{c′,n′}(x′, y′) = 1 ⇒ ghd_{c,n}(x, y) = 1
  • ghd_{c′,n′}(x′, y′) = 0 ⇒ ghd_{c,n}(x, y) = 0

Corollary: Pr[BAD1] < 1/8.
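
The binomial tail behind the first lemma can be evaluated exactly for small n̄. This sketch assumes ȳ is uniform over {0, 1}^n̄, so that for any fixed x̄ the distance ∆(x̄, ȳ) is Binomial(n̄, 1/2); the function name is hypothetical.

```python
import math


def prob_nearly_orthogonal(n_bar):
    """Exact Pr over uniform ȳ that |∆(x̄, ȳ) − n̄/2| < 2√n̄, for any fixed x̄."""
    good = sum(
        math.comb(n_bar, d)
        for d in range(n_bar + 1)
        if abs(d - n_bar / 2) < 2 * math.sqrt(n_bar)
    )
    return good / 2 ** n_bar


for n_bar in (4, 16, 100, 400):
    print(n_bar, prob_nearly_orthogonal(n_bar) > 7 / 8)
```

The standard deviation of ∆(x̄, ȳ) is √n̄/2, so the window 2√n̄ is four standard deviations wide on each side, which is why the 7/8 bound has room to spare.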

SLIDE 39

The Second Bad Event

Recall BAD2: ghd_{c,n}(x, y) ≠ P(x, y).

Bounding Pr[BAD2] is subtle:

  • x is good, so Pr[P errs | x] ≤ 2ε
    – But this requires (x, y) ∼ µ_{c,n}
  • Random extension (x′, y′) → (x, y) is not ∼ µ_{c,n}.
  • Actual distrib (fixed x, random y):
    – (x, y) ∼ (µ_{c′,n′} | x) ⊗ Unif_n̄
    – y uniform over a subset of {0, 1}^n, just like in µ_{c,n}

[Figure not recoverable from extraction]

Lemma: Pr[BAD2] = O(ε).

SLIDE 49

Round Elimination, Second Attempt

Putting it together:

  • P is k-round ε-error protocol for ghd_{c,n}
  • Q1 is (k − 1)-round ε′-error protocol for ghd_{c′,n′} with
    – c′ = 2c, n′ = n/3
    – ε′ ≤ 1/8 + 16ε ← Can’t repeat this argument!

Second attempt: protocol Q:

  • Repeat Q1 2^O(k) times in parallel, take majority
  • Blows up communication by 2^O(k)
  • Error analysis even more subtle: not just a Chernoff bound

Lemma: Pr[Q errs] = O(ε).

SLIDE 52

Eventual Round Elimination Lemma

Lemma: If there is a k-round, ε-error protocol for ghd_{c,n} in which each player sends s ≪ n bits, then there is a (k − 1)-round, O(ε)-error protocol for ghd_{2c,n/3} in which each player sends 2^O(k) s bits.

Recall Base Case Lemma: There is no zero-round protocol with error < 1/2.

Consequence: Main Theorem

Theorem: There is no o(n)-bit, 1/3-error, O(1)-round randomized protocol for ghd_{c,n}. In other words, R^O(1)(ghd) = Ω(n).

More Specific: R^k(ghd) = n/2^O(k²).
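
How the parameters drift when the lemma is applied k times can be tabulated with a short sketch. The constant hidden in the 2^O(k) blow-up is not given on the slides, so `blowup_const` below is a hypothetical stand-in, and the growth of the error by a constant factor per step is not tracked.

```python
def eliminate_rounds(k, c, n, s, blowup_const=1):
    """Apply the round elimination step k times and record (rounds, c, n, s).

    blowup_const is a hypothetical stand-in for the constant hidden in the
    2^O(k) communication blow-up.
    """
    trace = [(k, c, n, s)]
    for rounds in range(k, 0, -1):
        c = 2 * c  # gap parameter doubles
        n = n / 3  # input length shrinks to n/3
        s = s * 2 ** (blowup_const * rounds)  # message length grows by 2^O(rounds)
        trace.append((rounds - 1, c, n, s))
    return trace


for row in eliminate_rounds(k=3, c=1, n=2700, s=1):
    print(row)
```

The exponents k, k − 1, ..., 1 add up to k(k + 1)/2, so under this assumption the total blow-up is 2^O(k²), consistent with the form of the "More Specific" bound. Once the message length is no longer ≪ n the lemma stops applying, and reaching zero rounds with error below 1/2 contradicts the Base Case Lemma.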

SLIDE 60

Why Did This Take So Long?

Multi-pass lower bounds for Distinct Elements and F_k have been an important open question since at least 2003. Why did it remain open for so long?

Underlying communication problem thorny! Resists the “usual” attacks:

  • Rectangle-based methods (discrepancy/corruption)
    Matrix has large near-monochromatic rectangles
  • Approximate polynomial degree
    Underlying predicate has approx degree O(√n)
  • Pattern matrix, Factorization norms [Sherstov’08], [Linial-Shraibman’07]
    Quantum communication upper bound O(√n log n)
  • Information complexity [C.-Shi-Wirth-Yao’01], [Bar-Yossef-J.-K.-S.’02]
    Hmm! Can’t see a concrete obstacle
    We’re biased (Amit helped invent it, so it’s his pet technique)

SLIDE 61

Open Problems

  1. The key problem here: Settle R(ghd).
  2. More generally: Understand communication complexity of “gap problems” better.
  3. This should help with other streaming problems, e.g., longest increasing subsequence.

Questions? Comments? Post-Doc/Job offers? Contact jbrody@cs.dartmouth.edu