Lecture 10



  1. Sublinear Algorithms, Lecture 10
     Last time
     • Multipurpose sketches
     • Count-min and count-sketch
     • Range queries, heavy hitters, quantiles
     Today
     • Limitations of streaming algorithms
     • Communication complexity
     10/6/2020, Sofya Raskhodnikova; Boston University

  2. Recall: Frequency Moments Estimation
     Input: a stream $a_1, a_2, \dots, a_m \in [n]^m$.
     • The frequency vector of the stream is $f = (f_1, \dots, f_n)$, where $f_i$ is the number of times $i$ appears in the stream.
     • The $p$-th frequency moment is $F_p = \|f\|_p^p = \sum_{i=1}^{n} f_i^p$.
     $F_0$ is the number of nonzero entries of $f$ (# of distinct elements).
     $F_1 = m$ (# of elements in the stream).
     $F_2 = \|f\|_2^2$ is a measure of non-uniformity, used e.g. for anomaly detection in network analysis.
     $F_\infty = \max_i f_i$ is the frequency of the most frequent element.
     We obtained streaming algorithms for $F_0$, $F_1$, $F_2$. What about $F_3$ to $F_\infty$?
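The definitions on this slide can be checked on a small stream. The sketch below is not a streaming algorithm: it stores the whole frequency vector and computes each moment exactly. The function name `frequency_moment` is an illustrative choice, not something from the lecture.

```python
from collections import Counter
import math

def frequency_moment(stream, p):
    """Exact F_p of a stream; p may be 0, a positive number, or math.inf."""
    freqs = Counter(stream)              # f_i for every element i that appears
    if p == 0:
        return len(freqs)                # number of distinct elements
    if p == math.inf:
        return max(freqs.values(), default=0)  # largest frequency
    return sum(f ** p for f in freqs.values())

stream = [3, 4, 1, 3, 5]                 # element 3 appears twice
print(frequency_moment(stream, 0))         # 4
print(frequency_moment(stream, 1))         # 5
print(frequency_moment(stream, 2))         # 7
print(frequency_moment(stream, math.inf))  # 2
```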

  3. Communication Complexity: A Method for Proving Lower Bounds

  4. (Randomized) Communication Complexity
     [Figure: Alice (input $x$) and Bob (input $y$) share a random string 1101000101110101110101010110… and exchange messages 0100, 11, 001, …, 0011 in order to compute $C(x, y)$.]
     Goal: minimize the number of bits exchanged.
     • Communication complexity of a protocol is the maximum number of bits exchanged by the protocol.
     • Communication complexity of a function $C$, denoted $R(C)$, is the communication complexity of the best protocol for computing $C$.
     Partially based on slides by Eric Blais

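  5. Example: Set Disjointness $\mathrm{DISJ}_k$
     [Figure: Alice and Bob share a random string 1101000101110101110101010110…]
     Alice's input: $S \subseteq [n]$, $|S| = k$. Bob's input: $T \subseteq [n]$, $|T| = k$.
     Compute $\mathrm{DISJ}_k(S, T) = \begin{cases} \text{accept} & \text{if } S \cap T = \emptyset \\ \text{reject} & \text{otherwise} \end{cases}$
     Theorem [Kalyanasundaram Schnitger 92, Razborov 92]: $R(\mathrm{DISJ}_k) \ge \Omega(k)$ for all $k \le n/2$.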

  6. One-Way Communication Complexity
     [Figure: Alice (input $x$) and Bob (input $y$) share a random string 1101000101110101110101010110…; Alice sends a single message $m_1$ to Bob, who computes $C(x, y)$.]
     Goal: minimize the number of bits Alice sends to Bob.
     One-way communication complexity of a function $C$, denoted $R^{\rightarrow}(C)$, is the communication complexity of the best one-way protocol for computing $C$.

  7. 3-Player One-Way Communication Complexity
     [Figure: Alice (input $x$), Bob (input $y$) and Carol (input $z$) share a random string 1101000101110101110101010110…; Alice sends $m_1$ to Bob, Bob sends $m_2$ to Carol, and Carol computes $C(x, y, z)$.]
     Goal: minimize $|m_1| + |m_2|$.
     • Require correct output w.p. at least 2/3 over the random string.

  8. Converting Streaming Algorithm to CC Protocol
     Let $\mathcal{P}$ be a streaming problem.
     • Suppose there is a transformation $x \to s_1$, $y \to s_2$, $z \to s_3$ such that $\mathcal{P}(s_1 \circ s_2 \circ s_3)$ suffices to compute $C(x, y, z)$.
     [Figure: Alice (input $x$, stream $s_1$) sends $m_1$ to Bob (input $y$, stream $s_2$), who sends $m_2$ to Carol (input $z$, stream $s_3$); Carol computes $C(x, y, z)$.]
     An $s$-bit algorithm $A$ for $\mathcal{P}$ gives a $2s$-bit protocol for $C$:
     • Alice runs $A$ on $s_1$ and sends the memory state, $m_1$, to Bob.
     • Bob instantiates $A$ with $m_1$, runs $A$ on $s_2$, and sends the memory state, $m_2$, to Carol.
     • Carol instantiates $A$ with $m_2$, runs $A$ on $s_3$ to get $\mathcal{P}(s_1 \circ s_2 \circ s_3)$, and computes $C(x, y, z)$.
     Based on Andrew McGregor’s slides: https://people.cs.umass.edu/~mcgregor/711S18/lowerbounds-1.pdf

  9. Converting Streaming Algorithm to CC Protocol (continued)
     Let $\mathcal{P}$ be a streaming problem, with a transformation $x \to s_1$, $y \to s_2$, $z \to s_3$ such that $\mathcal{P}(s_1 \circ s_2 \circ s_3)$ suffices to compute $C(x, y, z)$.
     An $s$-bit algorithm $A$ for $\mathcal{P}$ gives a $2s$-bit protocol for $C$.
     • If there are $p$ players, then the protocol uses $(p - 1)s$ bits.
     • A lower bound of $L$ bits for computing $C$ implies $s = \Omega(L/p)$.
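The conversion can be sketched in code: each player resumes the algorithm from the previous player's memory state, feeds it its own stream segment, and forwards the new state. Everything below is an illustrative assumption rather than part of the lecture. `ExactMaxFrequency` is a toy stand-in for $A$ (it uses linear space), and `pickle` is used only as a convenient way to serialize a memory state into a message.

```python
import pickle
from collections import Counter

class ExactMaxFrequency:
    """Toy streaming algorithm A: exact F_inf. Uses linear space; for illustration only."""

    def __init__(self, state=None):
        # Start fresh, or resume from a serialized memory state.
        self.counts = Counter() if state is None else pickle.loads(state)

    def process(self, segment):
        self.counts.update(segment)

    def memory_state(self):
        return pickle.dumps(self.counts)

    def output(self):
        return max(self.counts.values(), default=0)

def run_protocol(algorithm_cls, segments):
    """Simulate the one-way protocol on a non-empty list of segments s_1, ..., s_p.

    Returns (output of the last player, list of the p - 1 messages sent).
    """
    messages = []
    state = None
    for idx, segment in enumerate(segments):
        alg = algorithm_cls(state)       # instantiate A with the received state
        alg.process(segment)             # run A on this player's segment
        if idx < len(segments) - 1:      # every player but the last sends a message
            state = alg.memory_state()
            messages.append(state)
    return alg.output(), messages

out, msgs = run_protocol(ExactMaxFrequency, [[3, 4], [1, 3, 5], [3]])
print(out, len(msgs))                    # 3 2
```

With $p$ segments the simulation sends $p - 1$ messages, each the size of one memory state, which matches the $(p - 1)s$ bound on the slide.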

  10. A lower bound using CC method: Approximating $F_\infty$

  11. Application: Approximating $F_\infty$
      Theorem. Every algorithm that computes a 4/3-approximation of $F_\infty$ (w.p. $\ge 2/3$) needs $\Omega(n)$ space.
      Proof: Reduction from Set Disjointness.
      • On input $x, y \in \{0,1\}^n$, players generate $s_1 = \{j : x_j = 1\}$ and $s_2 = \{j : y_j = 1\}$.
        Example: $x = (0\ 0\ 1\ 1\ 0\ 0)$, $y = (1\ 0\ 1\ 0\ 1\ 0)$ $\to$ $\langle 3, 4;\ 1, 3, 5 \rangle$
      • Then $F_\infty = 1$ if $x, y$ represent disjoint sets (output $\le 4/3$), and $F_\infty = 2$ otherwise (output $\ge 3/2$).
      • An $s$-space algorithm implies an $s$-bit protocol: $s = \Omega(n)$ by communication complexity of Set Disjointness.
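The reduction can be traced on the slide's example. In this minimal sketch the exact `f_infinity` stands in for a 4/3-approximation algorithm, and the helper names are hypothetical. The threshold 4/3 separates the two cases because any 4/3-approximation outputs at most 4/3 when $F_\infty = 1$ and at least 3/2 when $F_\infty = 2$.

```python
from collections import Counter

def to_stream(bits):
    """Map a 0/1 vector to the 1-indexed positions of its ones."""
    return [j for j, b in enumerate(bits, start=1) if b == 1]

def f_infinity(stream):
    """Exact F_inf; a stand-in for any 4/3-approximation algorithm."""
    return max(Counter(stream).values(), default=0)

def disjoint_via_f_infinity(x, y, estimate=f_infinity):
    """Decide whether x and y encode disjoint sets from an estimate of F_inf(s1 ∘ s2)."""
    est = estimate(to_stream(x) + to_stream(y))
    return est <= 4 / 3

x = [0, 0, 1, 1, 0, 0]
y = [1, 0, 1, 0, 1, 0]
print(to_stream(x) + to_stream(y))       # [3, 4, 1, 3, 5]
print(disjoint_via_f_infinity(x, y))     # False (both sets contain 3)
```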

  12. A lower bound using CC method: Computing the median of a stream

  13. Index
      • Alice gets an $n$-bit string $x$, and Bob gets an index $j \in [n]$. Define $\mathrm{Index}(x, j) = x_j$.
      • One-way communication complexity of $\mathrm{Index}(x, j)$ is $\Omega(n)$.

  14. Application: Finding the Median of a Stream
      Theorem. Every algorithm that computes the median of a $(2n - 1)$-element stream exactly (w.p. $\ge 2/3$) needs $\Omega(n)$ space.
      Proof: Reduction from Index.
      • On input $x \in \{0,1\}^n$, Alice generates $s_1 = \{2i + x_i : i \in [n]\}$.
        Example: $0\ 0\ 1\ 1\ 0\ 1\ 1 \to \langle 2, 4, 7, 9, 10, 13, 15 \rangle$
      • On input $j \in [n]$, Bob generates $s_2$ = $n - j$ copies of 0 and $j - 1$ copies of $2n + 2$.
        Example: $j = 2 \to \langle 0, 0, 0, 0, 0, 16 \rangle$
      • Then $\mathrm{median}(s_1 \circ s_2) = 2j + x_j$ and $\mathrm{Index}(x, j) = (2j + x_j) \bmod 2$.
      • An $s$-space algorithm implies an $s$-bit protocol: $s = \Omega(n)$ by 1-way communication complexity of Index.
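The reduction can be run on the slide's example to confirm that the parity of the median recovers $x_j$. In this sketch the sort-based `exact_median` stands in for the streaming algorithm, and the function names are illustrative.

```python
def alice_stream(x):
    """s_1 = {2i + x_i : i in [n]}, with 1-indexed i."""
    return [2 * i + x_i for i, x_i in enumerate(x, start=1)]

def bob_stream(n, j):
    """s_2 = (n - j) copies of 0 followed by (j - 1) copies of 2n + 2."""
    return [0] * (n - j) + [2 * n + 2] * (j - 1)

def exact_median(stream):
    """Middle element of the sorted stream; for length 2n - 1 this is the n-th smallest."""
    ordered = sorted(stream)
    return ordered[len(ordered) // 2]

def index_via_median(x, j, median=exact_median):
    """Recover x_j from the median of s_1 ∘ s_2."""
    n = len(x)
    return median(alice_stream(x) + bob_stream(n, j)) % 2

x = [0, 0, 1, 1, 0, 1, 1]
print(alice_stream(x))          # [2, 4, 7, 9, 10, 13, 15]
print(bob_stream(len(x), 2))    # [0, 0, 0, 0, 0, 16]
print(index_via_median(x, 2))   # 0, which equals x_2
```

Bob's padding places $n - j$ elements below every element of $s_1$, so the $n$-th smallest element of the combined stream is the $j$-th element of $s_1$, namely $2j + x_j$.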

  15. A lower bound using CC method: Approximating Frequency Moments [Bar-Yossef, Jayram, Kumar, Sivakumar 04]

  16. Multi-party Set Disjointness
      • Consider a $p \times n$ binary matrix $M$ where each column has weight 0, 1 or $p$.
        Example: $M = \begin{pmatrix} 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{pmatrix}$ (row $i$ is the indicator vector of a set: $\{3, 4\}$, $\{1, 3, 5\}$, $\{3\}$, $\{3, 6\}$)
      • The input of player $i$ is row $i$ of $M$.
      • $\mathrm{DISJ}_p(M) = \begin{cases} 0 & \text{if there is a column of 1s} \\ 1 & \text{otherwise} \end{cases}$
      • Communication complexity of $\mathrm{DISJ}_p(M)$ is $\Omega(n/p)$.

  17. Application: Frequency Moments for $k > 2$
      Thm. Every algorithm that 2-approximates $F_k$ (w.p. $\ge 2/3$) needs $\Omega(n^{1 - 2/k})$ space.
      Proof: Reduction from multi-party Set Disjointness.
      • On input $M \in \{0,1\}^{p \times n}$, player $i$ generates $s_i = \{j : M_{ij} = 1\}$.
        Example: $\begin{pmatrix} 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{pmatrix} \to \langle 3, 4;\ 1, 3, 5;\ 3;\ 3, 6 \rangle$
      • If all columns have weight 0 or 1, then $F_k = \sum_{i=1}^{n} f_i^k \le n$.
      • If there is a column of weight $p$, then $F_k \ge p^k$.
      • A 2-approximation of $F_k$ distinguishes the cases if $p^k > 4n \Leftrightarrow p > (4n)^{1/k}$.
      • An $s$-space algorithm implies an $s(p - 1)$-bit protocol: $s = \Omega\left(\frac{n}{p^2}\right) = \Omega\left(\frac{n}{(4n)^{2/k}}\right) = \Omega(n^{1 - 2/k})$ by communication complexity of $\mathrm{DISJ}_p$, for constant $k$.
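The example matrix can be pushed through the reduction to see the gap between the two cases. This is a sketch with hypothetical helper names, using an exact $F_k$ in place of a 2-approximation. The decision threshold $2n$ is an assumption that follows from the slide's condition $p^k > 4n$: a 2-approximation returns at most $2n$ when all columns have weight 0 or 1, and at least $p^k / 2 > 2n$ when some column has weight $p$.

```python
from collections import Counter

def player_streams(M):
    """Player i's stream s_i = {j : M[i][j] = 1}, with 1-indexed columns."""
    return [[j for j, bit in enumerate(row, start=1) if bit == 1] for row in M]

def f_k(stream, k):
    """Exact k-th frequency moment; a stand-in for a 2-approximation algorithm."""
    return sum(f ** k for f in Counter(stream).values())

def has_full_column(M, k, estimate=f_k):
    """Decide whether M has an all-ones column, assuming p**k > 4n."""
    n = len(M[0])
    combined = [a for s in player_streams(M) for a in s]
    return estimate(combined, k) > 2 * n

M = [
    [0, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 1],
]
print(player_streams(M))        # [[3, 4], [1, 3, 5], [3], [3, 6]]
print(has_full_column(M, 3))    # True: F_3 = 4**3 + 4 = 68 > 2n = 12
```

Here $p = 4$, $n = 6$ and $k = 3$, so $p^k = 64 > 4n = 24$ and the condition on the slide holds.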

  18. A lower bound using CC method: Distinct Elements

  19. Gap Hamming
      • Alice and Bob get $n$-bit strings $x$ and $y$, respectively.
      • Hamming distance $\mathrm{Ham}(x, y)$ is the number of positions on which $x$ and $y$ differ.
      • Output: $\mathrm{Ham}(x, y)$ with additive error $\sqrt{n}$ w.p. $\ge 2/3$.
      • Communication complexity of $\mathrm{Ham}(x, y)$ is $\Omega(n)$, even when $|x|$ and $|y|$ are known to both players.

  20. Application: Distinct Elements
      Thm. Every algorithm $(1 + \varepsilon)$-approximating $F_0$ (w.p. $\ge 2/3$) needs $\Omega(1/\varepsilon^2)$ space.
      Proof: Reduction from Gap Hamming.
      • On input $x, y \in \{0,1\}^n$, players generate $s_1 = \{j : x_j = 1\}$ and $s_2 = \{j : y_j = 1\}$.
        Example: $x = (0\ 0\ 1\ 1\ 0\ 0)$, $y = (1\ 0\ 1\ 0\ 1\ 0)$ $\to$ $\langle 3, 4;\ 1, 3, 5 \rangle$
      • Then $2F_0 = |x| + |y| + \mathrm{Ham}(x, y)$.
      • When $|x|$ is known to Bob, a $(1 + \varepsilon)$-approximation of $F_0$ gives an additive approximation to $\mathrm{Ham}(x, y)$: the error in $F_0$ is at most $\frac{\varepsilon}{2} \cdot (|x| + |y| + \mathrm{Ham}(x, y)) \le \varepsilon n \le \sqrt{n}$ for $\varepsilon \le 1/\sqrt{n}$, which translates to an $O(\sqrt{n})$ additive error for $\mathrm{Ham}(x, y)$.
      • An $s$-space algorithm implies an $s$-bit protocol: $s = \Omega(n) = \Omega(1/\varepsilon^2)$ by communication complexity of Gap Hamming.
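The identity $2F_0 = |x| + |y| + \mathrm{Ham}(x, y)$ is what lets Bob turn a distinct-elements estimate into a Hamming-distance estimate. The sketch below checks it on the slide's example; the exact `f0` stands in for a $(1 + \varepsilon)$-approximation, and the helper names are illustrative.

```python
def to_stream(bits):
    """Map a 0/1 vector to the 1-indexed positions of its ones."""
    return [j for j, b in enumerate(bits, start=1) if b == 1]

def hamming(x, y):
    """Number of positions on which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def f0(stream):
    """Exact number of distinct elements; a stand-in for an approximation algorithm."""
    return len(set(stream))

def hamming_via_f0(x, y, distinct=f0):
    """Estimate Ham(x, y) as 2 * F_0(s1 ∘ s2) - |x| - |y|."""
    return 2 * distinct(to_stream(x) + to_stream(y)) - sum(x) - sum(y)

x = [0, 0, 1, 1, 0, 0]
y = [1, 0, 1, 0, 1, 0]
print(f0(to_stream(x) + to_stream(y)))   # 4 distinct elements: 1, 3, 4, 5
print(hamming_via_f0(x, y))              # 3
print(hamming(x, y))                     # 3
```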
