CountMin and Count Sketches Lecture 10 February 14, 2019 Chandra - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CS 498ABD: Algorithms for Big Data, Spring 2019

CountMin and Count Sketches

Lecture 10

February 14, 2019

Chandra (UIUC) CS498ABD 1 Spring 2019 1 / 18

slide-2
SLIDE 2

Heavy Hitters Problem

Heavy Hitters Problem: Find all items $i$ such that $f_i > m/k$ for some fixed $k$.
Heavy hitters are very frequent items.
We saw the deterministic Misra-Gries algorithm, which uses $O(k)$ space and finds the heavy hitters, assuming they exist.
With a second pass over the stream, the algorithm correctly identifies the heavy hitters.
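To make the definition concrete, here is a minimal sketch of an exact-counting baseline. It is not Misra-Gries: it stores one counter per distinct item, so it only illustrates the threshold $f_i > m/k$. The function name `heavy_hitters_exact` is hypothetical.

```python
from collections import Counter


def heavy_hitters_exact(stream, k):
    """Return the set of items whose frequency exceeds m/k, where m = len(stream).

    Exact baseline for illustration only: space grows with the number of
    distinct items, unlike the O(k)-space Misra-Gries algorithm.
    """
    counts = Counter(stream)
    m = len(stream)
    return {item for item, f in counts.items() if f > m / k}


if __name__ == "__main__":
    stream = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
    print(heavy_hitters_exact(stream, 3))
```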


slide-3
SLIDE 3

(Strict) Turnstile Model

Turnstile model: each update is $(i_j, \Delta_j)$ where $\Delta_j$ can be positive or negative.
Strict turnstile: need $x_i \ge 0$ at all times for all $i$.
For the frequent-items problem, we want to estimate each $x_i$ up to additive error.
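The update model can be sketched in a few lines. This is an illustrative helper only: the name `apply_turnstile` and the choice to raise an error on a strict-turnstile violation are assumptions, not part of the lecture.

```python
from collections import defaultdict


def apply_turnstile(updates, strict=True):
    """Apply updates (item, delta) and return the resulting frequency vector x.

    With strict=True, raise ValueError as soon as some x_i drops below zero,
    which is exactly what the strict turnstile model forbids.
    """
    x = defaultdict(int)
    for item, delta in updates:
        x[item] += delta
        if strict and x[item] < 0:
            raise ValueError(f"strict turnstile violated at item {item}")
    return dict(x)


if __name__ == "__main__":
    print(apply_turnstile([(1, 3), (2, 1), (1, -2)]))
```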


slide-4
SLIDE 4

Basic Hashing/Sampling Idea

Heavy Hitters Problem: Find all items $i$ such that $f_i > m/k$.
Let $b_1, b_2, \ldots, b_k$ be the $k$ heavy hitters.
Suppose we pick $h : [n] \to [ck]$ for some $c > 1$.
$h$ spreads $b_1, \ldots, b_k$ among the buckets ($k$ balls into $ck$ bins).
In the ideal situation, each bucket can be used to count a separate heavy hitter.
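How close to the ideal situation a random $h$ gets can be estimated with a union bound. The calculation below assumes (this slide does not say so) that $h$ is drawn from a pairwise independent family, so any two distinct items collide with probability $1/(ck)$.

```latex
\Pr\bigl[\exists\, j \ne 1 : h(b_j) = h(b_1)\bigr]
  \;\le\; \sum_{j=2}^{k} \Pr\bigl[h(b_j) = h(b_1)\bigr]
  \;=\; \frac{k-1}{ck}
  \;<\; \frac{1}{c}.
```

So a fixed heavy hitter shares its bucket with no other heavy hitter with probability more than $1 - 1/c$.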


slide-5
SLIDE 5

Part I CountMin Sketch


slide-6
SLIDE 6

CountMin Sketch

[Cormode-Muthukrishnan]

CountMin-Sketch(w, d):
    $h_1, h_2, \ldots, h_d$ are pairwise independent hash functions from $[n] \to [w]$.
    While (stream is not empty) do
        $e_t = (i_t, \Delta_t)$ is current item
        for $\ell = 1$ to $d$ do
            $C[\ell, h_\ell(i_t)] \leftarrow C[\ell, h_\ell(i_t)] + \Delta_t$
    endWhile
    For $i \in [n]$ set $\tilde{x}_i = \min_{\ell=1}^{d} C[\ell, h_\ell(i)]$.

Counter $C[\ell, j]$ simply counts the sum of all $x_i$ such that $h_\ell(i) = j$. That is, $C[\ell, j] = \sum_{i : h_\ell(i) = j} x_i$.
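The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. The hash family $h(i) = ((a i + b) \bmod p) \bmod w$ with the Mersenne prime $p = 2^{61} - 1$ is an assumption chosen for the example (it is only approximately pairwise independent after the final reduction mod $w$), and the class and method names are hypothetical.

```python
import random


class CountMinSketch:
    """Minimal CountMin sketch for non-negative integer item ids."""

    PRIME = (1 << 61) - 1  # Mersenne prime; assumed larger than any item id

    def __init__(self, w, d, seed=0):
        rng = random.Random(seed)
        self.w, self.d = w, d
        self.C = [[0] * w for _ in range(d)]  # d rows of w counters
        # One (a, b) pair per row defines h_row(i) = ((a*i + b) mod p) mod w.
        self.coeffs = [
            (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
            for _ in range(d)
        ]

    def _bucket(self, row, item):
        a, b = self.coeffs[row]
        return ((a * item + b) % self.PRIME) % self.w

    def update(self, item, delta=1):
        """Process one stream element (item, delta)."""
        for row in range(self.d):
            self.C[row][self._bucket(row, item)] += delta

    def estimate(self, item):
        """Return the minimum over rows of the counter that item hashes to."""
        return min(
            self.C[row][self._bucket(row, item)] for row in range(self.d)
        )


if __name__ == "__main__":
    cm = CountMinSketch(w=8, d=4, seed=1)
    for item, delta in [(3, 5), (10, 2), (3, -1), (42, 7)]:
        cm.update(item, delta)
    print(cm.estimate(3))  # never below the true value 4 in the strict turnstile model
```

Each row's counters always sum to the total of all updates, which gives a quick sanity check on an implementation.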


slide-7
SLIDE 7

Intuition

Suppose there are $k$ heavy hitters $b_1, b_2, \ldots, b_k$.
Consider $b_i$: hash function $h_\ell$ sends $b_i$ to $h_\ell(b_i)$.
$C[\ell, h_\ell(b_i)]$ counts $x_{b_i}$ and also the other items that hash to the same bucket $h_\ell(b_i)$, so we always overcount (since we are in the strict turnstile model).
Repeating with many hash functions and taking the minimum is the right thing to do: for $b_i$ the goal is to avoid other heavy hitters colliding with it.


slide-9
SLIDE 9

Property of CountMin Sketch

Lemma

Let $d = \Omega(\log \frac{1}{\delta})$ and $w > \frac{2}{\epsilon}$. Then for any fixed $i \in [n]$, $x_i \le \tilde{x}_i$ and $\Pr[\tilde{x}_i \ge x_i + \epsilon \|x\|_1] \le \delta$.

Unlike Misra-Gries, we have overestimates.
Actual items are not stored (requires work to recover heavy hitters).
Works in the strict turnstile model and hence can handle deletions.
Space usage is $O(\frac{\log(1/\delta)}{\epsilon})$ counters and hence $O(\frac{\log(1/\delta)}{\epsilon} \log m)$ bits.


slide-15
SLIDE 15

Analysis

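Fix $\ell$: $h_\ell(i)$ is the bucket that $h_\ell$ hashes $i$ to.
$Z_\ell = C[\ell, h_\ell(i)]$ is the value of the counter that $i$ is hashed to.
$E[Z_\ell] = x_i + \sum_{i' \ne i} \Pr[h_\ell(i') = h_\ell(i)] \, x_{i'}$
By pairwise independence, $E[Z_\ell] = x_i + \sum_{i' \ne i} x_{i'}/w \le x_i + \epsilon \|x\|_1 / 2$
Via Markov applied to $Z_\ell - x_i$ (we use strict turnstile here, so that $Z_\ell - x_i \ge 0$): $\Pr[Z_\ell \ge x_i + \epsilon \|x\|_1] \le 1/2$
Since the $d$ hash functions are independent, $\Pr[\min_\ell Z_\ell \ge x_i + \epsilon \|x\|_1] \le 1/2^d \le \delta$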


slide-17
SLIDE 17

Summarizing

Lemma

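Let $d = \Omega(\log \frac{1}{\delta})$ and $w > \frac{2}{\epsilon}$. Then for any fixed $i \in [n]$, $x_i \le \tilde{x}_i$ and $\Pr[\tilde{x}_i \ge x_i + \epsilon \|x\|_1] \le \delta$.

Choose $d = 2 \log_2 n$ and $w = 2/\epsilon$: we have $\Pr[\tilde{x}_i \ge x_i + \epsilon \|x\|_1] \le 1/2^d = 1/n^2$.
By the union bound over the $n$ items, with probability at least $(1 - 1/n)$, for all $i \in [n]$, $\tilde{x}_i \le x_i + \epsilon \|x\|_1$.
Total space is $O(\frac{1}{\epsilon} \log n)$ counters and hence $O(\frac{1}{\epsilon} \log n \log m)$ bits.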


slide-19
SLIDE 19

CountMin as a Linear Sketch

Question: Why is CountMin a linear sketch?
Recall that for $1 \le \ell \le d$ and $1 \le s \le w$: $C[\ell, s] = \sum_{i : h_\ell(i) = s} x_i$
Thus, once hash function $h_\ell$ is fixed: $C[\ell, s] = \langle u, x \rangle$ where $u$ is a row vector in $\{0, 1\}^n$ such that $u_i = 1$ if $h_\ell(i) = s$ and $u_i = 0$ otherwise.
Thus, once the hash functions are fixed, the counter values can be written as $Mx$ where $M \in \{0, 1\}^{wd \times n}$ is the sketch matrix.
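The matrix view can be checked on a toy instance. The sketch below builds $M$ explicitly and verifies linearity. It stores a fully random bucket table for each row as a stand-in for the hash functions, which is an assumption made for brevity and defeats the space savings of a real sketch. The names `sketch_matrix` and `apply_sketch` are hypothetical.

```python
import random


def sketch_matrix(n, w, d, seed=0):
    """Return M in {0,1}^(w*d x n): row l*w + s has a 1 in column i iff h_l(i) = s."""
    rng = random.Random(seed)
    # Stand-in hash functions: one explicit random bucket per (row, item).
    h = [[rng.randrange(w) for _ in range(n)] for _ in range(d)]
    M = [[0] * n for _ in range(w * d)]
    for l in range(d):
        for i in range(n):
            M[l * w + h[l][i]][i] = 1
    return M


def apply_sketch(M, x):
    """Compute the counter vector Mx as a flat list of length w*d."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


if __name__ == "__main__":
    M = sketch_matrix(n=6, w=3, d=2)
    x = [1, 0, 2, 0, 0, 3]
    y = [0, 4, 0, 1, 0, 0]
    combined = apply_sketch(M, [a + b for a, b in zip(x, y)])
    separate = [a + b for a, b in zip(apply_sketch(M, x), apply_sketch(M, y))]
    print(combined == separate)
```

Linearity is what allows sketches of two streams to be merged by adding their counters, and deletions to be handled by negative updates.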


slide-20
SLIDE 20

Part II Count Sketch


slide-21
SLIDE 21

Count Sketch

[Charikar-Chen-FarachColton]

Count-Sketch(w, d):
    $h_1, h_2, \ldots, h_d$ are pairwise independent hash functions from $[n] \to [w]$.
    $g_1, g_2, \ldots, g_d$ are pairwise independent hash functions from $[n] \to \{-1, 1\}$.
    While (stream is not empty) do
        $e_t = (i_t, \Delta_t)$ is current item
        for $\ell = 1$ to $d$ do
            $C[\ell, h_\ell(i_t)] \leftarrow C[\ell, h_\ell(i_t)] + g_\ell(i_t) \Delta_t$
    endWhile
    For $i \in [n]$ set $\tilde{x}_i = \mathrm{median}\{g_1(i) C[1, h_1(i)], \ldots, g_d(i) C[d, h_d(i)]\}$.

Like CountMin, Count sketch has $wd$ counters. Now counter values can become negative even if $x$ is positive.
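A minimal Python sketch of the Count sketch pseudocode follows. As an assumption for the example, both $h_\ell$ and $g_\ell$ are derived from linear hashes modulo the Mersenne prime $2^{61} - 1$ (the sign is taken from the parity of the hash value), which is only approximately pairwise independent. Class and method names are hypothetical.

```python
import random
import statistics


class CountSketch:
    """Minimal Count sketch for non-negative integer item ids."""

    PRIME = (1 << 61) - 1  # Mersenne prime; assumed larger than any item id

    def __init__(self, w, d, seed=0):
        rng = random.Random(seed)
        self.w, self.d = w, d
        self.C = [[0] * w for _ in range(d)]
        pick = lambda: (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
        self.h_coeffs = [pick() for _ in range(d)]  # bucket hashes h_1..h_d
        self.g_coeffs = [pick() for _ in range(d)]  # sign hashes g_1..g_d

    def _bucket(self, row, item):
        a, b = self.h_coeffs[row]
        return ((a * item + b) % self.PRIME) % self.w

    def _sign(self, row, item):
        a, b = self.g_coeffs[row]
        return 1 if ((a * item + b) % self.PRIME) % 2 == 0 else -1

    def update(self, item, delta=1):
        """Add g_row(item) * delta to the counter item hashes to, in every row."""
        for row in range(self.d):
            self.C[row][self._bucket(row, item)] += self._sign(row, item) * delta

    def estimate(self, item):
        """Return the median over rows of g_row(item) * C[row][h_row(item)]."""
        return statistics.median(
            self._sign(row, item) * self.C[row][self._bucket(row, item)]
            for row in range(self.d)
        )


if __name__ == "__main__":
    cs = CountSketch(w=16, d=5)
    cs.update(7, 5)
    print(cs.estimate(7))  # a single item has no collisions, so every row reports 5
```

With an even `d`, `statistics.median` averages the two middle values, so choosing an odd `d` keeps the estimate an integer.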


slide-22
SLIDE 22

Intuition

Each hash function $h_\ell$ spreads the elements across $w$ buckets.
The hash function $g_\ell$ induces cancellations (inspired by the $F_2$ estimation algorithm).
Since the answer may be negative even if $x \ge 0$, we take the median.
Exercise: Show that Count sketch is also a linear sketch.


slide-24
SLIDE 24

Count Sketch Analysis

Lemma

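Let $d \ge 4 \log \frac{1}{\delta}$ and $w > \frac{3}{\epsilon^2}$. Then for any fixed $i \in [n]$, $E[\tilde{x}_i] = x_i$ and $\Pr[|\tilde{x}_i - x_i| \ge \epsilon \|x\|_2] \le \delta$.

Comparison to CountMin
Error guarantee is with respect to $\|x\|_2$ instead of $\|x\|_1$. For $x \ge 0$, $\|x\|_2 \le \|x\|_1$ and in some cases $\|x\|_2 \ll \|x\|_1$.
Space increases to $O(\frac{1}{\epsilon^2} \log n)$ counters from $O(\frac{1}{\epsilon} \log n)$ counters.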
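A quick numeric illustration of the gap between the two norms, using two made-up frequency vectors (they are examples for this note, not data from the lecture): a flat vector, where $\|x\|_2 \ll \|x\|_1$, and a vector with a single nonzero entry, where the norms coincide.

```python
import math


def l1_norm(x):
    """Sum of absolute values."""
    return sum(abs(v) for v in x)


def l2_norm(x):
    """Euclidean norm."""
    return math.sqrt(sum(v * v for v in x))


if __name__ == "__main__":
    flat = [1] * 10_000               # every item appears once
    spiky = [10_000] + [0] * 9_999    # one item carries all the mass
    print(l1_norm(flat), l2_norm(flat))
    print(l1_norm(spiky), l2_norm(spiky))
```

For the flat vector the $\ell_2$ norm is $\sqrt{n}$ while the $\ell_1$ norm is $n$, which is the regime where the Count sketch guarantee is much stronger.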


slide-28
SLIDE 28

Analysis

Fix an $i \in [n]$. Let $Z_\ell = g_\ell(i) C[\ell, h_\ell(i)]$.
For $i' \in [n]$ let $Y_{i'}$ be the indicator random variable that is 1 if $h_\ell(i) = h_\ell(i')$; that is, $i$ and $i'$ collide in $h_\ell$.
For $i' \ne i$, $E[Y_{i'}] = E[Y_{i'}^2] = 1/w$ from pairwise independence of $h_\ell$.
$Z_\ell = g_\ell(i) C[\ell, h_\ell(i)] = g_\ell(i) \sum_{i'} g_\ell(i') x_{i'} Y_{i'}$
Therefore, $E[Z_\ell] = x_i + \sum_{i' \ne i} E[g_\ell(i) g_\ell(i') Y_{i'}] \, x_{i'} = x_i$, because $E[g_\ell(i) g_\ell(i')] = 0$ for $i \ne i'$ from pairwise independence of $g_\ell$, and $Y_{i'}$ is independent of $g_\ell(i)$ and $g_\ell(i')$.


slide-30
SLIDE 30

Analysis

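$Z_\ell = g_\ell(i) C[\ell, h_\ell(i)]$. And $E[Z_\ell] = x_i$.
$Var(Z_\ell) = E[(Z_\ell - x_i)^2]$
$= E\left[\left(\sum_{i' \ne i} g_\ell(i) g_\ell(i') Y_{i'} x_{i'}\right)^2\right]$
$= E\left[\sum_{i' \ne i} x_{i'}^2 Y_{i'}^2 + \sum_{i' \ne i''} x_{i'} x_{i''} g_\ell(i') g_\ell(i'') Y_{i'} Y_{i''}\right]$
$= \sum_{i' \ne i} x_{i'}^2 E[Y_{i'}^2]$
$\le \|x\|_2^2 / w.$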


slide-33
SLIDE 33

Analysis

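$Z_\ell = g_\ell(i) C[\ell, h_\ell(i)]$.
We have seen: $E[Z_\ell] = x_i$ and $Var(Z_\ell) \le \|x\|_2^2 / w$.
Using Chebyshev: $\Pr[|Z_\ell - x_i| \ge \epsilon \|x\|_2] \le \frac{Var(Z_\ell)}{\epsilon^2 \|x\|_2^2} \le \frac{1}{\epsilon^2 w} \le 1/3$.
Via the Chernoff bound, $\Pr[|\mathrm{median}\{Z_1, \ldots, Z_d\} - x_i| \ge \epsilon \|x\|_2] \le e^{-cd} \le \delta$.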


slide-35
SLIDE 35

Summarizing

Lemma

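Let $d \ge 4 \log \frac{1}{\delta}$ and $w > \frac{3}{\epsilon^2}$. Then for any fixed $i \in [n]$, $E[\tilde{x}_i] = x_i$ and $\Pr[|\tilde{x}_i - x_i| \ge \epsilon \|x\|_2] \le \delta$.

Choose $d = \Theta(\ln n)$ and $w = 3/\epsilon^2$: we have $\Pr[|\tilde{x}_i - x_i| \ge \epsilon \|x\|_2] \le 1/n^2$.
By the union bound, with probability at least $(1 - 1/n)$, for all $i \in [n]$, $|\tilde{x}_i - x_i| \le \epsilon \|x\|_2$.
Total space is $O(\frac{1}{\epsilon^2} \log n)$ counters and hence $O(\frac{1}{\epsilon^2} \log n \log m)$ bits.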
