slide-1
SLIDE 1

Optimality of Linear Sketching under Modular Updates

Shachar Lovett (UCSD)

Kaave Hosseini (UCSD → CMU), Grigory Yaroslavtsev (Indiana)

slide-2
SLIDE 2

Streaming and sketching

slide-3
SLIDE 3

Streaming with binary updates

  • Counters $x_1, \dots, x_n \in \mathbb{F}_2$
  • Stream of updates: $x_i \leftarrow x_i \oplus 1$
  • At the end, we want to compute a function $f(x_1, \dots, x_n)$
  • For which functions can we do this using $\ll n$ memory?
slide-4
SLIDE 4

Example

  • Initially: 000000
  • Flip $x_1$: 100000
  • Flip $x_5$: 100010
  • Flip $x_2$: 110010
  • Flip $x_5$: 110000
  • …
  • Compute $f(x_1, \dots, x_n)$

slide-5
SLIDE 5

Linear sketching

  • Linear sketching is a useful primitive for streaming
  • Let $f\colon \mathbb{F}_2^n \to \{0,1\}$
  • $f$ has a linear sketch of size $k$ if it factors as $f(x) = p(L(x))$, where:
    (i) $L\colon \mathbb{F}_2^n \to \mathbb{F}_2^k$ is a linear function
    (ii) $p\colon \mathbb{F}_2^k \to \{0,1\}$ is a post-processing function
  • Equivalently, the "Fourier dimension" of $f$ is at most $k$
slide-6
SLIDE 6

Linear sketching implies streaming

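  • Assume $f\colon \mathbb{F}_2^n \to \{0,1\}$ factors as $f(x) = p(L(x))$, where:
    (i) $L\colon \mathbb{F}_2^n \to \mathbb{F}_2^k$ is a linear function
    (ii) $p\colon \mathbb{F}_2^k \to \{0,1\}$ is a post-processing function
  • To compute $f$ in the streaming model, maintain $L(x) \in \mathbb{F}_2^k$
  • This is easy to maintain under updates $x_i \leftarrow x_i \oplus 1$: by linearity, add $L(e_i)$ (the $i$-th column of $L$) to the stored sketch
  • Requires only $k$ bits of memory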
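The maintenance step above can be sketched in a few lines. This is an illustration under assumptions, not code from the talk: the map L (two parities over six coordinates), the post-processing function and all names are hypothetical choices, and coordinates are 0-indexed.

```python
import random

# Hypothetical example: n = 6 coordinates, sketch size k = 2.
# L(x) = (x0 + x1 + x2, x3 + x4 + x5) over F_2.
# COLUMNS[i] = L(e_i), packed into a 2-bit integer.
COLUMNS = [0b01, 0b01, 0b01, 0b10, 0b10, 0b10]


def post_process(sketch):
    """p: the AND of the two parities (an arbitrary illustrative choice)."""
    return int(sketch == 0b11)


def stream_sketch(flips):
    """Maintain L(x) under flips of coordinate i, storing only k = 2 bits."""
    sketch = 0
    for i in flips:
        sketch ^= COLUMNS[i]  # linearity: L(x + e_i) = L(x) + L(e_i)
    return sketch


def direct_sketch(x):
    """Compute L(x) from the full vector x, for comparison only."""
    sketch = 0
    for i, bit in enumerate(x):
        if bit:
            sketch ^= COLUMNS[i]
    return sketch


rng = random.Random(0)
flips = [rng.randrange(6) for _ in range(40)]
x = [0] * 6
for i in flips:
    x[i] ^= 1
# The streamed sketch agrees with L applied to the final vector.
assert stream_sketch(flips) == direct_sketch(x)
print(post_process(stream_sketch(flips)))
```

The streaming state never holds x itself; the full vector is built here only to check the result.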
slide-7
SLIDE 7

Randomized linear sketching

  • Randomization makes linear sketching more powerful
  • $f\colon \mathbb{F}_2^n \to \{0,1\}$ has a randomized linear sketch of size $k$ if it can be approximated by a distribution over linear sketches of size $k$
  • That is, if there exists a distribution over pairs $(L, p)$, where:
    (i) $L\colon \mathbb{F}_2^n \to \mathbb{F}_2^k$ is a linear function
    (ii) $p\colon \mathbb{F}_2^k \to \{0,1\}$ is a post-processing function
    such that $\Pr_{L,p}[f(x) = p(L(x))] \ge 1 - \epsilon$ for every input $x \in \mathbb{F}_2^n$

slide-8
SLIDE 8

Randomized sketching gives additional power

  • Consider the OR function: $\mathrm{OR}(x_1, \dots, x_n) = x_1 \vee \dots \vee x_n$
  • Deterministic sketching requires size $n$
  • Randomized sketching can be done in size $O(\log(1/\epsilon))$ (random parities)
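The random-parities idea can be sketched as follows. This is a hedged illustration rather than the speaker's construction: it assumes each parity is a uniformly random subset of the coordinates, so a nonzero input makes each parity equal to 1 with probability 1/2, and k = ⌈log2(1/ε)⌉ parities leave error at most ε. All function names are hypothetical.

```python
import math
import random


def parity(v):
    """Parity of the number of set bits in the integer v."""
    return bin(v).count("1") & 1


def sample_sketch(n, eps, rng):
    """Sample k = ceil(log2(1/eps)) uniformly random parities as n-bit masks."""
    k = max(1, math.ceil(math.log2(1 / eps)))
    return [rng.getrandbits(n) for _ in range(k)]


def apply_sketch(masks, x):
    """L(x): one bit <mask, x> mod 2 per parity (x is packed as an n-bit int)."""
    return [parity(mask & x) for mask in masks]


def post_process(bits):
    """p: declare OR(x) = 1 iff some sampled parity evaluates to 1."""
    return int(any(bits))


def error_rate(n, eps, x, trials, rng):
    """Fraction of sampled sketches that misclassify this fixed input x."""
    truth = int(x != 0)
    wrong = sum(
        post_process(apply_sketch(sample_sketch(n, eps, rng), x)) != truth
        for _ in range(trials)
    )
    return wrong / trials


rng = random.Random(0)
print(error_rate(16, 1 / 16, 0, 1000, rng))       # zero input is never misclassified
print(error_rate(16, 1 / 16, 0b1000, 1000, rng))  # expected to be close to 2**-4
```

The error is one-sided: the all-zero input always yields an all-zero sketch, and only nonzero inputs can be missed.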

slide-9
SLIDE 9

Is linear sketching universal?

  • Linear sketching seems like a very useful primitive for streaming
  • Is it universal?
  • That is: given a streaming algorithm that computes $f$ using $k$ bits of memory, can we extract from it a linear sketch for $f$ of size $\approx k$?

slide-10
SLIDE 10

Universality of linear sketching

slide-11
SLIDE 11

Universality of linear sketching

  • Let $f\colon \mathbb{F}_2^n \to \{0,1\}$
  • Assume: a randomized streaming algorithm supporting $N$ updates and using $k$ bits of memory
  • Goal: extract a randomized linear sketch of size $\approx k$
  • True if $N \ge 2^{2^{2^n}}$ [Li-Nguyen-Woodruff '14, Ai-Hu-Li-Woodruff '16]
  • True if $N = \Omega(n)$ for random inputs [Kannan-Mossel-Sanyal-Yaroslavtsev '18]
  • True if $N = \Omega(n^2)$ [This work]
slide-12
SLIDE 12

Main theorem: streaming

  • Let $f\colon \mathbb{F}_2^n \to \{0,1\}$
  • Assume there exists a randomized streaming algorithm for $f$ supporting $N = \Omega(n^2)$ updates which uses $k$ bits of memory
  • Then there exists a randomized linear sketch for $f$ of size $O(k)$
slide-13
SLIDE 13

Extensions (that I will not talk about)

  • Extends to approximating real-valued functions $f\colon \mathbb{F}_2^n \to [0,1]$
  • Extends to functions over other fields
  • Assuming only $N = \Omega(n)$ updates are supported, we can still extract a randomized linear sketch, but its size will be $\mathrm{poly}(k)$ instead of $O(k)$

slide-14
SLIDE 14

One-way communication complexity

slide-15
SLIDE 15

One-way communication complexity

  • Model a streaming algorithm as a one-way communication protocol
  • Break the $N$ updates into $M = N/n$ chunks of $n$ updates each
  • Setup: $M$ players, holding inputs $x_1, \dots, x_M \in \mathbb{F}_2^n$ ($x_i$ is the aggregate of the $n$ updates in the $i$-th chunk)
  • Goal: compute $f(x_1 + \dots + x_M)$
  • Communication model: one-way
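The chunking step can be illustrated with a short sketch. It is an assumption-laden toy, not taken from the talk: the helper name `chunk_inputs` is hypothetical, coordinates are 0-indexed, and each player's input is packed into an n-bit integer so that addition over $\mathbb{F}_2^n$ is XOR.

```python
from functools import reduce
from operator import xor


def chunk_inputs(n, flips):
    """Split a stream of flips into chunks of n updates each.

    Each chunk is aggregated into one vector in F_2^n, packed as an n-bit int
    (bit i is set iff coordinate i is flipped an odd number of times).
    """
    inputs = []
    for start in range(0, len(flips), n):
        aggregate = 0
        for i in flips[start:start + n]:
            aggregate ^= 1 << i
        inputs.append(aggregate)
    return inputs


n = 4
flips = [0, 3, 1, 3, 2, 2, 0, 2]   # N = 8 updates, so M = N / n = 2 players
inputs = chunk_inputs(n, flips)
print([format(v, "04b") for v in inputs])
# The sum of the players' inputs over F_2^n is the final state of the stream.
print(format(reduce(xor, inputs, 0), "04b"))
```

In the printed strings the leftmost character is the highest-indexed coordinate, since the vectors are shown as binary integers.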
slide-16
SLIDE 16

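One-way communication complexity

  • $M$ players, holding inputs $x_1, \dots, x_M \in \mathbb{F}_2^n$
  • Model: one-way communication with shared randomness
  • Goal: output $= f(x_1 + \dots + x_M)$ w.h.p. over the shared randomness

[Diagram: Player 1 (input $x_1 \in \mathbb{F}_2^n$) sends message $m_1 \in \{0,1\}^k$ to Player 2 (input $x_2 \in \mathbb{F}_2^n$), who sends message $m_2 \in \{0,1\}^k$, and so on; Player $M$ (input $x_M \in \mathbb{F}_2^n$) receives message $m_{M-1} \in \{0,1\}^k$ and produces the output $\mathrm{out} \in \{0,1\}$.]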

slide-17
SLIDE 17

Main theorem: one-way communication

  • Let $f\colon \mathbb{F}_2^n \to \{0,1\}$
  • Assume there exists a one-way communication protocol for computing $f(x_1 + \dots + x_M)$ for $M = \Omega(n)$ players with $k$-bit messages (recall: this corresponds to $N = Mn = \Omega(n^2)$ binary updates)
  • Then there exists a randomized linear sketch for $f$ of size $O(k)$
  • For $M = \Omega(1)$ players, we get a linear sketch of size $\mathrm{poly}(k)$
slide-18
SLIDE 18

Proof

slide-19
SLIDE 19

Proof

  • The proof uses:
    1. Standard techniques in communication complexity
    2. Additive combinatorics
slide-20
SLIDE 20

Proof step 1: Yao's minimax principle

  • Let $f\colon \mathbb{F}_2^n \to \{0,1\}$
  • Fix a "hard distribution" $\mu$ over inputs
  • Goal: linear sketch for $f(x)$ where $x \sim \mu$
  • Embed the hard distribution into the $M$ players:
    • The first $M-1$ players' inputs $x_1, \dots, x_{M-1}$ are uniform in $\mathbb{F}_2^n$
    • The last player's input $x_M$ is set so that $x_1 + \dots + x_M = x$
  • Intuition: the protocol has no information on $x$ until the last player
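The embedding of the target input into the players' inputs can be sketched as below. This is only an illustration: the function name `embed` is hypothetical, vectors in $\mathbb{F}_2^n$ are packed as n-bit integers, and the target x is passed in directly rather than sampled from a hard distribution.

```python
import random
from functools import reduce
from operator import xor


def embed(x, n, m, rng):
    """Split a target x in F_2^n among m players.

    The first m - 1 inputs are uniform in F_2^n; the last one is chosen so
    that all m inputs sum (XOR) to x.
    """
    first = [rng.getrandbits(n) for _ in range(m - 1)]
    last = reduce(xor, first, x)  # x_M = x_1 + ... + x_{M-1} + x over F_2^n
    return first + [last]


rng = random.Random(0)
x = 0b101101
players = embed(x, n=6, m=5, rng=rng)
assert reduce(xor, players, 0) == x
print([format(v, "06b") for v in players])
```

Any m - 1 of the inputs are jointly uniform and independent of x, which matches the intuition that nothing about x is revealed before the last player.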
slide-21
SLIDE 21

Proof step 2: protocol structure

  • Target: $x \sim \mu$
  • Players' inputs: $x_1, \dots, x_{M-1} \in \mathbb{F}_2^n$ uniformly, $x_M = x_1 + \dots + x_{M-1} + x$
  • We may assume the protocol is deterministic
  • Messages: $m_1(x_1)$, $m_2(m_1, x_2)$, $m_3(m_1, m_2, x_3)$, …
  • Output: $\mathrm{out}(m_1, \dots, m_{M-1}, x_M)$
  • With good probability, $\mathrm{out} = f(x_1 + \dots + x_M) = f(x)$
  • We can fix the messages (of the first $M-1$ players) to "typical messages" without hurting the success probability too much

slide-22
SLIDE 22

Proof step 3: fixing to typical messages

  • Fix typical messages $m_1^*, m_2^*, \dots, m_{M-1}^*$
  • These correspond to sets of inputs for the first $M-1$ players:
    • $A_1 = \{x_1 \in \mathbb{F}_2^n : m_1(x_1) = m_1^*\}$
    • $A_2 = \{x_2 \in \mathbb{F}_2^n : m_2(m_1^*, x_2) = m_2^*\}$
    • …
  • The sets are big: if the protocol uses $k$ bits, then $|A_i| \ge 2^{n-k}$
  • After conditioning on $x_1 \in A_1, \dots, x_{M-1} \in A_{M-1}$, the protocol output is a function of only $x_M = x_1 + \dots + x_{M-1} + x$
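One way to see why typical message sets are large is a standard counting argument. It is sketched here under assumptions and is not taken from the slides: the threshold $\delta$ and the notation $A^{(m)}$ for the set of inputs of player $i$ that produce message $m$ are introduced only for illustration.

```latex
% Fix the earlier messages m_1^*, ..., m_{i-1}^*. Player i's k-bit message
% partitions F_2^n into at most 2^k classes A^{(m)}, one per message m.
% For x_i uniform in F_2^n, the chance of landing in a small class is small:
\[
\Pr_{x_i}\Bigl[\, \bigl|A^{(m_i(x_i))}\bigr| < \delta\, 2^{n-k} \Bigr]
  = \sum_{m \,:\, |A^{(m)}| < \delta 2^{n-k}} \frac{|A^{(m)}|}{2^n}
  < 2^k \cdot \frac{\delta\, 2^{n-k}}{2^n}
  = \delta .
\]
```

So with probability at least $1 - \delta$ the message sent by player $i$ corresponds to a set of size at least $\delta\, 2^{n-k}$; the bound $2^{n-k}$ on the slide is presumably meant up to such a factor.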

slide-23
SLIDE 23

Proof step 4: mixing

  • Large sets $A_1, \dots, A_{M-1} \subset \mathbb{F}_2^n$ of density at least $2^{-k}$
  • If we sample $x_1 \in A_1, \dots, x_{M-1} \in A_{M-1}$ and $x \sim \mu$, then with high probability $\mathrm{out}(x_1 + \dots + x_{M-1} + x) = f(x)$
  • Technical lemma: for $M = \Omega(n)$, the sum $x_1 + \dots + x_{M-1}$ mixes in $\mathbb{F}_2^n$
  • More precisely, there exists a subspace $V \subset \mathbb{F}_2^n$ of co-dimension $O(k)$ such that the distribution of the sum is nearly invariant under a random shift from $V$

slide-24
SLIDE 24

Proof step 5: extracting linear sketch

  • We found a large subspace $V$ of co-dimension $O(k)$
  • If we sample $x_1 \in A_1, \dots, x_{M-1} \in A_{M-1}$, $x \sim \mu$ and $v \in V$, then with high probability $\mathrm{out}(x_1 + \dots + x_{M-1} + x + v) = f(x)$
  • This allows us to "factor out" $V$ from the output function and extract a linear sketch for $f(x)$

slide-25
SLIDE 25

Open problems

slide-26
SLIDE 26

Linear sketching for modular updates

  • For binary updates (or, more generally, modular updates), we prove that linear sketching is universal
  • Any streaming algorithm which supports $N = \Omega(n^2)$ updates implies a randomized linear sketch with similar guarantees
  • Open problem 1: can this be improved to $N = \Omega(n)$?
  • [Kannan-Mossel-Sanyal-Yaroslavtsev '18] proved a partial result in this regime, giving a linear sketch for $f$ on random inputs
  • Our results in this regime incur a polynomial loss in the sketch size
slide-27
SLIDE 27

Integer updates

  • Streaming is often considered in the integer case
  • Integer counters $x_1, \dots, x_n$
  • Updates $x_i \leftarrow x_i + 1$ or $x_i \leftarrow x_i - 1$
  • Sketching corresponds to linear functions over the integers
  • The results of [Li-Nguyen-Woodruff '14, Ai-Hu-Li-Woodruff '16] work in this regime as well, but require assuming $N \ge 2^{2^{2^n}}$
  • Open problem 2: can our techniques be imported to this regime?
  • Challenge: it is not clear what "mixing" should mean here
slide-28
SLIDE 28