Amit Chakrabarti, Dartmouth College. WAPMDS, IIT Kanpur, Dec 2009



slide-1
SLIDE 1

Multi-Pass Lower Bounds Dec 20, 2009

Multi-pass Data Stream Lower Bounds via Round Elimination

Amit Chakrabarti

Dartmouth College WAPMDS, IIT Kanpur, Dec 2009



slide-4
SLIDE 4

Lower Bounds Paradigms

Algorithm design: divide & conquer, greedy, dynamic programming, LP relaxation, ...
Lower bounds: ???

  • Information complexity paradigm [C.-Shi-Wirth-Yao’01]
  • Round elimination paradigm [Miltersen-Nisan-Safra-Wigderson’95]


slide-6
SLIDE 6

Multi-Pass Lower Bounds

Data streams: two broad application scenarios

  • Networks: Busy router, packets whizzing by
    – Web traffic statistics
    – Intrusion detection
  • Databases: Huge DB, linear scan cheaper than random access
    – Query optimisation: join size estimation
    – Log analysis
  • DB setting: Multiple passes meaningful

This talk: Pass/space tradeoffs for some basic stream problems

slide-7
SLIDE 7

Data Stream Model

  • Formally: input stream = n tokens, each token ∈ [m]
    – Assume log m = Θ(log n)
  • Compute some function of stream, using
    – Small space, s ≪ m, n ... ideally, s = O(log n)
    – Small number of passes, p
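To make the pass/space model concrete, here is a minimal Python sketch of a multi-pass algorithm that keeps only a few counters. It finds the (lower) median by binary search over the value range [1..m], one counting pass per step. This is a textbook illustration, not an algorithm from the talk; the names `multipass_median` and `stream_factory` are hypothetical, and `stream_factory` is assumed to return a fresh iterator over the same tokens on every call.

```python
def multipass_median(stream_factory, m):
    """Lower median of a stream of tokens in [1..m].

    Uses O(log m) passes and stores only a constant number of counters,
    each of O(log n + log m) bits.
    """
    # Pass 1: count the tokens.
    n = sum(1 for _ in stream_factory())
    passes = 1
    target = (n + 1) // 2          # rank of the lower median (1-indexed)

    lo, hi = 1, m                  # invariant: the median lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        # One pass: how many tokens are <= mid?
        count = sum(1 for tok in stream_factory() if tok <= mid)
        passes += 1
        if count >= target:
            hi = mid               # median is <= mid
        else:
            lo = mid + 1           # median is > mid
    return lo, passes


tokens = [5, 1, 9, 3, 7]
median, passes = multipass_median(lambda: iter(tokens), m=16)
```

Since log m = Θ(log n), this uses O(log n) passes with O(log n) bits of space.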


slide-11
SLIDE 11

Problems of Interest

Class A:

  • Median
  • Key question: Want s = O(log n); then p = ??
    – Dates back to first “data streams” paper [Munro-Paterson’78]

Class B:

  • Distinct elements, $F_0$
  • Frequency moments, $F_k = \sum_{i=1}^{m} \mathrm{freq}(i)^k$
  • Empirical entropy, $H = \sum_{i=1}^{m} (\mathrm{freq}(i)/m) \cdot \log(m/\mathrm{freq}(i))$
  • Key question: Want ε-approx; then s = ??
    – One-pass: $\tilde{O}(\varepsilon^{-2})$, $\tilde{\Omega}(\varepsilon^{-2})$ [Bar-Yossef-J.-K.-S.-T.’02]; [Woodruff’04]
    – Dependence of s on n: [A-M-S’96]; [C.-Khot-Sun’03]; [Gronemeier’09]
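As a reference point for the Class B quantities, the following Python sketch computes them exactly and offline, with no space restriction. It is only meant to pin down the definitions; the function names are hypothetical, and the entropy normalises by the stream length (called `length` here).

```python
from collections import Counter
from math import log2


def frequency_moment(stream, k):
    """F_k = sum over tokens i of freq(i)^k; F_0 counts distinct tokens."""
    freq = Counter(stream)
    if k == 0:
        return len(freq)
    return sum(f ** k for f in freq.values())


def empirical_entropy(stream):
    """H = sum over tokens i of (freq(i)/length) * log2(length/freq(i))."""
    freq = Counter(stream)
    length = len(stream)
    return sum((f / length) * log2(length / f) for f in freq.values())


stream = [1, 2, 2, 3, 3, 3]
f0 = frequency_moment(stream, 0)   # 3 distinct tokens
f2 = frequency_moment(stream, 2)   # 1 + 4 + 9 = 14
h = empirical_entropy(stream)
```

A streaming algorithm has to approximate these values using far less memory than the `Counter` used here.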


slide-14
SLIDE 14

Our Results (Answering the Key Questions)

Class A: Median [C.-Cormode-McGregor’08]

  • Achieving s = O(log n) requires p = Ω(log n)
  • If tokens randomly ordered, requires p = Ω(log log n)
    – Specifically: $s \approx \Omega(n^{1/p})$ $[\Omega(n^{2^{-p}})]$ for adversarial [random] order
  • Above lower bounds are tight [Guha-McGregor’07]

Class B: Distinct elements [Brody-C.’09]

  • Need $s = \Omega(1/\varepsilon^2)$ space for any p = O(1)
    – Specifically: $s = \tilde{\Omega}(1/(\varepsilon^2 p^2))$ [Brody-C.-Regev-Vidick-deWolf’10]
  • Holds under random order, and even random data
  • Matching upper bound, even with one pass and adversarial data

slide-15
SLIDE 15

Method: Reduce from Communication Complexity

[Figure: an example stream of 32 tokens: 32 17 1 25 31 5 6 27 16 21 24 13 12 9 18 4 14 22 11 29 2 7 3 23 30 8 20 19 15 10 28 26]

p-pass streaming algorithm ⟹ Θ(p)-round communication protocol
messages = memory contents of streaming algorithm
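The reduction can be sketched in a few lines of Python. The stream is cut into consecutive segments, one per player; each player runs the streaming algorithm on its own segment and then hands the memory contents to the next player. The function name `simulate_protocol` and the `update` callback are hypothetical, chosen only for this illustration.

```python
def simulate_protocol(segments, passes, init_state, update):
    """Simulate a multi-pass streaming algorithm as a communication protocol.

    segments: list of token lists, one per player, in stream order.
    update:   the streaming algorithm's per-token state transition.
    Returns the final state and the number of messages sent, where each
    message is exactly the algorithm's memory contents.
    """
    state = init_state
    messages = 0
    for p in range(passes):
        for i, segment in enumerate(segments):
            for token in segment:
                state = update(state, token)
            is_last = (p == passes - 1) and (i == len(segments) - 1)
            if not is_last:
                messages += 1      # hand the memory contents to the next player
    return state, messages


# Two players (Alice, Bob), two passes, algorithm = running sum.
state, messages = simulate_protocol([[3, 1], [4, 1, 5]], passes=2,
                                    init_state=0, update=lambda s, t: s + t)
```

With two players and p passes this gives 2p − 1 messages of at most s bits each, so a communication lower bound for a bounded number of rounds translates into a space lower bound for p passes.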


slide-17
SLIDE 17

Communication vs Data Stream

[Figure: the token stream is split amongst many players (Alice, Bob, Carl, ...); take a special-case input and interpret it combinatorially]

p-pass streaming algorithm ⟹ Θ(p)-round communication protocol
messages = memory contents of streaming algorithm


slide-19
SLIDE 19

The Round Elimination Paradigm

If there exists...

[Figure: a 3-round protocol on input A B C D, with messages msg1 (Round 1), msg2 (Round 2) and msg3 (Round 3)]

with short messages, then there exists...

[Figure: a 2-round protocol on padded input A A B B C C D D, with messages msg2 (Round 2) and msg3 (Round 3)]

Eventually, if original protocol too short, then 0-round protocol for a nontrivial problem ⟹ Contradiction

slide-20
SLIDE 20

Class A: Median

slide-21
SLIDE 21

Tree Pointer Jumping

Complete k-level t-ary tree T
Input $\varphi : V(T) \to [t]$ with $\varphi(\text{leaf}) \in \{0, 1\}$
Player i knows φ at level i

$g_\varphi(v) := \begin{cases} \varphi(v)\text{-th child of } v, & \text{if } v \text{ internal} \\ \varphi(v), & \text{if } v \text{ leaf} \end{cases}$

Desired output = $g_\varphi(g_\varphi(\cdots g_\varphi(\text{root}) \cdots))$
Model: k − 1 rounds of communication
Each round: (Plr 1, Plr 2, . . . , Plr k)
Call this $\mathrm{tpj}_{k,t}$

[Figure: example tree with Level 1, Level 2 and Level 3 marked and 0/1 values at the leaves]
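With the whole input in one place the function is trivial to evaluate, which is what makes the communication restriction the interesting part. The Python sketch below evaluates tree pointer jumping directly. The representation is an assumption made for illustration: nodes are tuples of child indices (the root is the empty tuple) and children are numbered from 0 rather than from 1.

```python
def tpj(k, phi):
    """Evaluate tree pointer jumping on a complete k-level tree.

    phi maps each node (a tuple of 0-based child indices) to a child index
    if the node is internal, or to a bit if the node is a leaf.
    """
    node = ()                       # start at the root
    for _ in range(k - 1):          # follow k - 1 pointers down to a leaf
        node = node + (phi[node],)
    return phi[node]                # the leaf's bit is the output


# k = 3 levels, t = 2 children per internal node.
phi = {
    (): 1,
    (0,): 0, (1,): 1,
    (0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 1,
}
result = tpj(3, phi)                # root -> child 1 -> child 1 -> leaf bit 1
```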


slide-25
SLIDE 25

Weight-Based TPJ

Theorem: For uniform random input, 1/3-error, $CC^p(\mathrm{tpj}_{p+1,t}) = \Omega(t/p^2)$

Contrast: $D^p(\mathrm{tpj}_{p+1,t}) = O(t)$ and $D^{p+1}(\mathrm{tpj}_{p+1,t}) = O(p \log t)$

Actually, use a variant w-tpj (weight-based):

  • Input specifies $x_v \in \{0, 1\}^{\ell_v}$ with $\varphi(v) = t/2 + \mathrm{bias}(|x_v|)$
  • Lengths $\ell_v = t^{\mathrm{level}(v)-1}$
  • For random order, $\ell_v \approx t^{2^{\mathrm{level}(v)-1}}$ (hence, smaller lower bound)

Median lower bound: reduction from w-tpj (next slide)

Robust communication complexity: Above CC lower bound still holds when input bits allocated amongst players at random.

Relevant theory developed in [C.-Cormode-McGregor’08]

slide-26
SLIDE 26

From TPJ to Median

Map each input bit to an integer: $x \longrightarrow$ multiset $S_x$, s.t. w-tpj(x) = LSB(median($S_x$))

Basic idea, for k = 2 levels:

  • At level 2, 0 → −∞ (min value) and 1 → +∞ (max value)
  • At level 1, $x_i \to 2i + x_i$ (for ith leaf)

[Figure: worked example for k = 2, in which the level-2 bits 0,1,1,1 become −∞, +∞, +∞, +∞]
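The two-level idea can be checked with a small Python sketch. It makes simplifying assumptions that are not stated on the slide: the level-2 node holds exactly t − 1 bits, and its pointer is 1 plus the Hamming weight of those bits (the talk's actual bias function is not given here). Under those assumptions the median of the multiset is the integer of the pointed-to leaf, and its least significant bit is that leaf's input bit.

```python
def two_level_multiset(root_bits, leaf_bits):
    """Map a 2-level weight-based instance to a multiset of numbers.

    Assumption for this sketch: len(root_bits) == len(leaf_bits) - 1.
    """
    t = len(leaf_bits)
    assert len(root_bits) == t - 1
    # Level 2: bit 0 -> -infinity, bit 1 -> +infinity.
    top = [float("inf") if b else float("-inf") for b in root_bits]
    # Level 1: the i-th leaf bit x_i -> 2i + x_i (i counted from 1).
    leaves = [2 * i + x for i, x in enumerate(leaf_bits, start=1)]
    return top + leaves


def lsb_of_median(values):
    """Least significant bit of the median of an odd-sized multiset."""
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    return int(median) & 1


root_bits = [0, 1, 1, 1]            # weight 3, so the pointer selects leaf 4
leaf_bits = [1, 1, 1, 0, 0]         # leaf 4 holds bit 0
answer = lsb_of_median(two_level_multiset(root_bits, leaf_bits))
```

Each extra +∞ pushes the median one leaf to the right and each −∞ pushes it one leaf to the left, which is how the level-2 weight acts as a pointer.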

slide-27
SLIDE 27

Class B: Distinct Elements

slide-28
SLIDE 28

The Gap-Hamming-Distance Problem

Input: Alice gets $x \in \{0, 1\}^n$, Bob gets $y \in \{0, 1\}^n$.

Output:

  • ghd(x, y) = 1 if $\Delta(x, y) > n/2 + \sqrt{n}$
  • ghd(x, y) = 0 if $\Delta(x, y) < n/2 - \sqrt{n}$

Want: randomized, constant error protocol
Cost: Worst case number of bits communicated

[Figure: example strings x and y with n = 12 and $\Delta(x, y) = 3 \in [6 - \sqrt{12}, 6 + \sqrt{12}]$]
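As a plain restatement of the definition, here is a Python sketch of the promise problem evaluated with both inputs in hand. Returning `None` when the promise is violated is a convention chosen for this sketch; a protocol may answer arbitrarily on such inputs.

```python
from math import sqrt


def hamming_distance(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def ghd(x, y):
    """Gap-Hamming-Distance: 1 if far, 0 if close, None inside the gap."""
    n = len(x)
    d = hamming_distance(x, y)
    if d > n / 2 + sqrt(n):
        return 1
    if d < n / 2 - sqrt(n):
        return 0
    return None                     # promise violated: any answer is allowed


x = [0] * 16
far = ghd(x, [1] * 16)              # distance 16 > 8 + 4
close = ghd(x, [0] * 16)            # distance 0 < 8 - 4
gap = ghd(x, [1] * 8 + [0] * 8)     # distance 8 lies inside the gap
```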

slide-29
SLIDE 29

The Reductions

E.g., Distinct Elements (Other problems: similar)

[Figure: example strings x and y together with their token streams σ and τ of (index, bit) pairs]

Alice: $x \longrightarrow \sigma = (1, x_1), (2, x_2), \ldots, (n, x_n)$
Bob: $y \longrightarrow \tau = (1, y_1), (2, y_2), \ldots, (n, y_n)$

Notice: $F_0(\sigma \circ \tau) = n + \Delta(x, y)$, which is either $< 3n/2 - \sqrt{n}$ or $> 3n/2 + \sqrt{n}$.

Set $\varepsilon = 1/\sqrt{n}$.
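The identity F0(σ ∘ τ) = n + ∆(x, y) is easy to confirm on a small case: index i contributes one distinct token when the two bits agree and two when they differ. The Python sketch below does this; the helper names are hypothetical.

```python
def to_stream(bits):
    """Encode a bit string as the token stream (1, b_1), ..., (n, b_n)."""
    return [(i, b) for i, b in enumerate(bits, start=1)]


def distinct_elements(stream):
    """F_0: the number of distinct tokens in the stream."""
    return len(set(stream))


x = [0, 1, 1, 0]
y = [0, 0, 1, 1]
sigma, tau = to_stream(x), to_stream(y)
f0 = distinct_elements(sigma + tau)                 # concatenated stream
delta = sum(a != b for a, b in zip(x, y))           # Hamming distance = 2
```

Here n = 4 and ∆(x, y) = 2, so the concatenated stream has 6 distinct tokens.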


slide-33
SLIDE 33

State of Play, Jan. 2009

Using one round = one message...

Previous results [Indyk-Woodruff’03], [Woodruff’04], [C.-Cormode-McGregor’07]:

  • For one-round protocols, $R^{\to}(\mathrm{ghd}) = \Omega(n)$
  • Implies the $\Omega(\varepsilon^{-2})$ streaming lower bounds

Key open questions:

  • What is the two-way randomized complexity R(ghd)?
  • Better algorithm for Distinct Elements (or $F_k$, or H) using two passes?

New Results

Summer Thm: $R^{O(1)}(\mathrm{ghd}) = \Omega(n)$; i.e., O(1) rounds/passes no better
Winter Thm: $R^p(\mathrm{ghd}) = \Omega(n/p^2)$; previously was $\Omega(n/2^{O(p^2)})$

Remark: These hold under uniform input distribution

slide-34
SLIDE 34

A Simplification

Will prove distributional lower bound under uniform dist
In this setting, may as well work with threshold version, thd

  • thd(x, y) = 1 if $\Delta(x, y) \ge n/2$
  • thd(x, y) = 0 if $\Delta(x, y) < n/2$


slide-38
SLIDE 38

Round Elimination V1.0: Subcube Lifting

First message constant on large set: $2^{0.99n}$ points

[Figure: the strings of the set S; the inner coords are the real input, the rest (outer coords) are padding]

Alice, Bob lift their (n/3)-dim inputs from inner coords to full n-dim space
First message now redundant, so eliminate! [Brody-C.’09]


slide-41
SLIDE 41

Subcube Lifting: Wasteful?

  • Each step: dimension n → n/3
  • Inherently, can eliminate at most O(log n) rounds
    In fact, get $R^p(\mathrm{ghd}) = n/2^{O(p^2)}$
  • Solved long-standing open problem (IITK 2006 list)... happy?

Rethinking Round Elimination

  • Crux: delete first round, solve simpler instance
  • Simpler need not mean smaller!
    E.g., could mean increased error prob.


slide-44
SLIDE 44

Round Elimination V2.0: Geometric Perturbation

Max message size = cn
First message constant over set A of size $2^{n-cn}$

[Figure: the set A inside $\{0,1\}^n$, with points x, y and z]

Alice: replace x with z = NearestNeighbour(x, A)
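The perturbation step itself is just a nearest-neighbour query in Hamming distance. A brute-force Python sketch follows; it is exponential in n for a set A of the size above, which is fine because Alice is computationally unbounded in the communication model. Breaking ties by taking the first closest element is an arbitrary choice made for this sketch.

```python
def hamming_distance(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def nearest_neighbour(x, A):
    """Return an element of A closest to x in Hamming distance."""
    return min(A, key=lambda z: hamming_distance(x, z))


A = [(0, 0, 0, 0), (1, 1, 1, 1)]
z = nearest_neighbour((1, 1, 0, 1), A)      # (1, 1, 1, 1), at distance 1
```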


slide-47
SLIDE 47

Geometric Perturbation: A Better Picture

[Figure: the cube $\{0,1\}^n$ containing the set A, with points x and z at distance about $c^{1/2} n$ and a region marked ERR]

$\Pr[A] = 2^{-cn}$ ... thus, w.h.p., $\Delta(x, z) \le (\sqrt{cn}$ std devs$) = \sqrt{c} \cdot n$

Assumed A is Hamming ball ... that’s indeed the worst case [Harper’66]
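The "√(cn) standard deviations" estimate can be unpacked as follows. This is a sketch under the slide's assumption that A is a Hamming ball, taken here to be centred at $0^n$ with radius $r = n/2 - k\sigma$, where $\sigma = \sqrt{n}/2$ is the standard deviation of the weight |x| of a uniform x. The symbols r, k and σ are introduced only for this derivation, and constants and lower-order terms are treated loosely.

```latex
\begin{align*}
  \Pr[A] = \Pr\bigl[\,|x| \le n/2 - k\sigma\,\bigr]
    &\le e^{-k^2/2}
    && \text{(Hoeffding, with } \sigma = \sqrt{n}/2\text{)} \\
  2^{-cn} \le e^{-k^2/2}
    &\;\Longrightarrow\; k \le \sqrt{2 \ln 2 \cdot cn}
    && \text{(about } \sqrt{cn} \text{ std devs)} \\
  \Delta(x, z) = \max(0,\, |x| - r)
    &\approx k\sigma
     \le \sqrt{2 \ln 2 \cdot cn} \cdot \frac{\sqrt{n}}{2}
     = \sqrt{\tfrac{\ln 2}{2}\, c}\; n
     \le \sqrt{c} \cdot n
    && \text{(typical } |x| \approx n/2\text{)}
\end{align*}
```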


slide-53
SLIDE 53

Round Elimination: Analysis

Alice: $x \in_R \{0, 1\}^n \longrightarrow z \sim ??$; Bob: $y \in_R \{0, 1\}^n$

Why does the shorter protocol work? How can it fail? Two ways:

  • E1: ∆(x, y) too close to n/2
  • E2: Not near threshold, but thd(x, y) ≠ thd(z, y)

Estimating the probabilities:

  • E1: “anticoncentration” of Binomial dist
    $\Pr[\,|\Delta(x, y) - n/2| < \delta\sqrt{n}\,] \le \delta$
  • E2: shift to assume $x = 0^n$
    $\Pr[\,|y| < n/2 - \delta\sqrt{n} \;\wedge\; |y \oplus z| > n/2\,] \le ??$

Recall: $|z| = \Delta(x, z) \le \sqrt{c} \cdot n$, w.h.p.
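The E1 estimate can be checked numerically. For uniform x and y, ∆(x, y) is Binomial(n, 1/2), so the probability is an exact finite sum. The Python sketch below computes it; the function name is hypothetical. For n = 10000 and δ = 0.1 the exact value comes out at roughly 0.15, so the "≤ δ" above is presumably meant up to a constant factor, i.e. O(δ).

```python
from math import comb, sqrt


def prob_near_half(n, delta):
    """Exact Pr[|X - n/2| < delta * sqrt(n)] for X ~ Binomial(n, 1/2)."""
    width = delta * sqrt(n)
    # Count the bit strings whose weight k falls strictly inside the window.
    favourable = sum(comb(n, k) for k in range(n + 1)
                     if abs(k - n / 2) < width)
    return favourable / 2 ** n


p = prob_near_half(10_000, 0.1)     # window of +-10 around 5000
```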


slide-58
SLIDE 58

Switcheroo

Fixed $y \in \{0, 1\}^n$, with $|y| < n/2 - \delta\sqrt{n}$
Random $z \in_R \{0, 1\}^n$, with $|z| \le \sqrt{c} \cdot n$
Recall: first message length = cn

$\Pr[\,|y \oplus z| > n/2\,] \le ??$

Random coordinate flipping: $y \longrightarrow y \oplus z$
Expect |y| to change by about √c · n
W.h.p., change is no more than $c^{1/4}\sqrt{n \log p}$ [Hoeffding’63]

We’re good if this = $\delta\sqrt{n}$, i.e., if $\delta = c^{1/4} \log^{1/2} p$
Overall error = δ + (tiny) ≈ $c^{1/4} \log^{1/2} p$


slide-60
SLIDE 60

Round Elimination: Wrap-Up

  • Killed a message of length cn, adding $c^{1/4} \log^{1/2} p$ to error
  • Have to do this p times
  • Final error must be Ω(1), else contradiction
    ⟹ $p \, c^{1/4} \log^{1/2} p = \Omega(1)$ ⟹ (max comm) = $\Omega(n/(p^4 \log^2 p))$
  • Work on sphere, not Hamming cube: $R^p(\mathrm{ghd}) = \Omega(n/(p^2 \log p))$
    $x \in \{0, 1\}^n \longrightarrow x \in \{-1/\sqrt{n}, 1/\sqrt{n}\}^n$; ghd ⟶ Gap-Inner-Product

[Brody-C.-Regev-Vidick-deWolf’10]
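The arithmetic behind the first bound is short enough to spell out. In this sketch κ > 0 stands for the constant hidden in the Ω(1) error requirement; it is a symbol introduced here, not one used in the talk.

```latex
\begin{align*}
  p \cdot c^{1/4} \log^{1/2} p \;\ge\; \kappa
    &\;\Longrightarrow\; c^{1/4} \;\ge\; \frac{\kappa}{p \, \log^{1/2} p} \\
    &\;\Longrightarrow\; c \;\ge\; \frac{\kappa^4}{p^4 \log^2 p} \\
    &\;\Longrightarrow\; \text{(max comm)} \;=\; cn
       \;=\; \Omega\!\left(\frac{n}{p^4 \log^2 p}\right)
\end{align*}
```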


slide-67
SLIDE 67

Why Did This Take So Long?

Multi-pass lower bounds for Distinct Elements and $F_k$ have been an important open question since at least 2003. Why did it remain open for so long?

Underlying communication problem thorny! Resists the “usual” attacks:

  • Rectangle-based methods (discrepancy/corruption)
    Matrix has large near-monochromatic rectangles
  • Approximate polynomial degree
    Underlying predicate has approx degree O(√n)
  • Pattern matrix, Factorization norms [Sherstov’08], [Linial-Shraibman’07]
    Quantum communication upper bound O(√n log n)
  • Information complexity [C.-Shi-Wirth-Yao’01], [Bar-Yossef-J.-K.-S.’02]
    Hmm! Can’t see a concrete obstacle


slide-69
SLIDE 69

Final Remarks

Summary:

  1. Round elimination is a great paradigm for proving lower bounds (especially when you don’t over-define it).
  2. Gives clean proofs
  3. Cases in point: Multi-player Pointer Jumping, Gap-Hamming-Distance
  4. Data stream consequences

Open “problems”:

  1. Understand communication complexity of “gap problems” better... get further streaming results.
  2. Apply round elimination to your favourite problem.

slide-70
SLIDE 70

Breaking News

Very recently, Oded Regev proved a remarkable new “correlation inequality” for Gaussian distributions.
This, plus a new generalization of the rectangle method, implies that R(ghd) = Ω(n).