Lecture 4, Barna Saha, AT&T-Labs Research, September 19, 2013



SLIDE 1

Lecture 4

Barna Saha

AT&T-Labs Research

September 19, 2013

SLIDE 2

Outline

◮ Heavy Hitter Continued

◮ Frequency Moment Estimation

◮ Dimensionality Reduction

SLIDE 3

Heavy Hitter

◮ Heavy Hitter Problem: For $0 < \epsilon < \phi < 1$, find a set of elements $S$ that includes every $i$ with $f_i > \phi m$ and contains no element with frequency $\le (\phi - \epsilon)m$.

◮ Count-Min sketch guarantees: $f_i \le \hat{f}_i \le f_i + \epsilon m$ with probability $\ge 1 - \delta$, in space $\frac{e}{\epsilon} \log \frac{1}{(\phi - \epsilon)\delta}$.

◮ Insert only: Maintain a min-heap of size $k = \frac{1}{\phi - \epsilon}$. When an item arrives, estimate its frequency and, if the estimate is above $\phi m$, include the item in the heap. If the heap size exceeds $k$, discard the minimum-frequency element in the heap.
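The insert-only procedure above can be sketched in Python as follows. This is a minimal illustration, not the lecture's reference implementation: the names `CountMin` and `heavy_hitters`, the hash family `(a*x + b) mod p`, and the seeds are assumptions, items are assumed to be non-negative integers, and a small dict of at most $k$ candidates plays the role of the min-heap.

```python
import math
import random


class CountMin:
    """Count-Min sketch: d rows of w counters; estimates never undercount."""

    def __init__(self, eps, delta, seed=0):
        self.w = math.ceil(math.e / eps)                  # counters per row
        self.d = max(1, math.ceil(math.log(1 / delta)))   # number of rows
        self.p = (1 << 61) - 1                            # prime modulus (assumed choice)
        rng = random.Random(seed)
        self.ab = [(rng.randrange(1, self.p), rng.randrange(self.p))
                   for _ in range(self.d)]
        self.table = [[0] * self.w for _ in range(self.d)]

    def _cols(self, x):
        # One column per row; x is assumed to be a non-negative integer.
        return [((a * x + b) % self.p) % self.w for a, b in self.ab]

    def update(self, x, c=1):
        for row, col in zip(self.table, self._cols(x)):
            row[col] += c

    def estimate(self, x):
        return min(row[col] for row, col in zip(self.table, self._cols(x)))


def heavy_hitters(stream, phi, eps, delta=0.01):
    """Insert-only heavy hitters: Count-Min estimates plus <= k candidates."""
    cm = CountMin(eps, delta * (phi - eps))   # failure prob. split over candidates
    k = math.ceil(1 / (phi - eps))
    cand = {}                                 # item -> estimate at its last arrival
    m = 0
    for x in stream:
        m += 1
        cm.update(x)
        est = cm.estimate(x)
        if est > phi * m:
            cand[x] = est
            if len(cand) > k:                 # evict the minimum-frequency candidate
                del cand[min(cand, key=cand.get)]
    # Report only candidates that are still above the threshold at the end.
    return {x for x in cand if cm.estimate(x) > phi * m}
```

For example, `heavy_hitters(list(range(100, 110)) + [1] * 60 + [2] * 30, phi=0.25, eps=0.05)` always reports items 1 and 2, because Count-Min estimates are never below the true frequencies.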

SLIDE 4

Heavy Hitter

◮ Turnstile model:

  ◮ Maintain dyadic intervals over a binary search tree and maintain $\log n$ Count-Min sketches, one for each level, each using space $\frac{e}{\epsilon} \log \frac{2 \log n}{\delta(\phi - \epsilon)}$.

  ◮ At every level there are at most $\frac{1}{\phi}$ heavy hitters.

  ◮ Estimate the frequencies of the children of the heavy hitter nodes until the leaf level is reached.

  ◮ Return all the leaves with estimated frequency above $\phi m$.

◮ Analysis

  ◮ At most $\frac{2}{\phi - \epsilon}$ nodes are examined at every level.

  ◮ Each node with estimated frequency above $\phi m$ has true frequency $> (\phi - \epsilon)m$ with probability at least $1 - \frac{\delta(\phi - \epsilon)}{2 \log n}$.

  ◮ By the union bound, all of these true frequencies are above $(\phi - \epsilon)m$ with probability at least $1 - \delta$.
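The top-down descent over dyadic intervals can be illustrated with the short Python sketch below. To keep it deterministic, exact per-level dictionaries stand in for the per-level Count-Min sketches (an assumption made only for illustration, so there is no estimation error here). The class name `DyadicCounts` and the universe $[0, 2^{\text{bits}})$ are hypothetical.

```python
class DyadicCounts:
    """Counts over dyadic intervals of the universe [0, 2**bits).

    Assumption: exact dictionaries replace the per-level Count-Min sketches.
    """

    def __init__(self, bits):
        self.bits = bits
        # levels[l][p] = total count of items whose top l bits equal p
        self.levels = [dict() for _ in range(bits + 1)]
        self.m = 0

    def update(self, x, c=1):
        """Turnstile update: add c (possibly negative) to the count of x."""
        self.m += c
        for l in range(self.bits + 1):
            node = x >> (self.bits - l)
            self.levels[l][node] = self.levels[l].get(node, 0) + c

    def heavy_hitters(self, phi):
        """Expand only nodes whose count exceeds phi * m, down to the leaves."""
        frontier = [0]                        # root covers the whole universe
        for l in range(1, self.bits + 1):
            frontier = [child
                        for node in frontier
                        for child in (2 * node, 2 * node + 1)
                        if self.levels[l].get(child, 0) > phi * self.m]
        return sorted(frontier)
```

Because only children of heavy nodes are examined, the work per level is proportional to the number of heavy nodes rather than to the universe size, and deletions are handled by passing a negative `c`.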

SLIDE 5

l2 frequency estimation

◮ $|f_i - \hat{f}_i| \le \epsilon \sqrt{f_1^2 + f_2^2 + \cdots + f_n^2}$ [Count-sketch]

◮ $F_2 = f_1^2 + f_2^2 + \cdots + f_n^2$

◮ How do we estimate $F_2$ in small space?

SLIDE 6

AMS-F2 Estimation

◮ $H = \{h : [n] \to \{+1, -1\}\}$: a family of four-wise independent hash functions

◮ Maintain $Z_j = Z_j + a\,h_j(i)$ on arrival of $(i, a)$, for $j = 1, \ldots, t = \frac{c}{\epsilon^2}$

◮ Return $Y = \frac{1}{t} \sum_{j=1}^{t} Z_j^2$
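A minimal Python sketch of this estimator is given below. The class name `AMSF2`, the modulus `P`, and the seed are assumptions made for illustration. The sign hash evaluates a random degree-3 polynomial modulo a prime, which is a standard four-wise independent construction, and takes its low bit (this introduces a negligible bias).

```python
import random

P = (1 << 61) - 1  # prime modulus for the polynomial hash (assumed choice)


class AMSF2:
    """AMS estimator for F2 using t counters Z_1..Z_t."""

    def __init__(self, t, seed=0):
        rng = random.Random(seed)
        # One random degree-3 polynomial per counter.
        self.coeffs = [[rng.randrange(P) for _ in range(4)] for _ in range(t)]
        self.z = [0] * t

    def _sign(self, j, i):
        c0, c1, c2, c3 = self.coeffs[j]
        v = (((c3 * i + c2) * i + c1) * i + c0) % P   # Horner evaluation
        return 1 if v & 1 else -1                      # low bit -> +1 / -1

    def update(self, i, a=1):
        """Process stream element (i, a): Z_j += a * h_j(i) for every j."""
        for j in range(len(self.z)):
            self.z[j] += a * self._sign(j, i)

    def estimate(self):
        """Return Y = (1/t) * sum_j Z_j^2."""
        return sum(z * z for z in self.z) / len(self.z)
```

With a single item of frequency $f$, every $Z_j = \pm f$, so the estimate is exactly $f^2$ regardless of the hash values.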

SLIDE 7

Analysis

◮ $Z_j = \sum_{i=1}^{n} f_i h_j(i)$

◮ $E[Z_j] = 0$, $E[Z_j^2] = F_2$.

◮ $\mathrm{Var}[Z_j^2] = E[Z_j^4] - (E[Z_j^2])^2 \le 4F_2^2$.

◮ $E[Y] = F_2$. $\mathrm{Var}[Y] = \frac{1}{t^2} \sum_{j=1}^{t} \mathrm{Var}(Z_j^2) \le \frac{4\epsilon^2}{c} F_2^2$

◮ By Chebyshev's inequality, $\Pr[\,|Y - E[Y]| > \epsilon F_2\,] \le \frac{4}{c}$
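As a worked step, the identity $E[Z_j^2] = F_2$ and the Chebyshev bound can be expanded as below, using only that $h_j(i)^2 = 1$, that $E[h_j(i)] = 0$, and that distinct hash values are pairwise independent (which four-wise independence implies).

```latex
\begin{align*}
E[Z_j^2]
  &= E\Big[\Big(\sum_{i=1}^{n} f_i h_j(i)\Big)^2\Big]
   = \sum_{i=1}^{n} f_i^2\, E[h_j(i)^2]
     + \sum_{i \neq k} f_i f_k\, E[h_j(i)]\, E[h_j(k)] \\
  &= \sum_{i=1}^{n} f_i^2 = F_2, \\[4pt]
\Pr\big[\,|Y - E[Y]| > \epsilon F_2\,\big]
  &\le \frac{\mathrm{Var}[Y]}{\epsilon^2 F_2^2}
   \le \frac{4\epsilon^2 F_2^2 / c}{\epsilon^2 F_2^2}
   = \frac{4}{c}.
\end{align*}
```

For instance, choosing $c = 12$ makes the failure probability of a single estimate at most $1/3$.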

SLIDE 8

Boosting by Median

◮ Keep $Y_1, Y_2, \ldots, Y_s$, with $s = O(\log \frac{1}{\delta})$

◮ Return $A = \mathrm{median}(Y_1, Y_2, \ldots, Y_s)$

◮ By the Chernoff bound, $\Pr[\,|A - F_2| > \epsilon F_2\,] < \delta$
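The effect of taking a median rather than a mean can be seen in a tiny Python example. The numbers are made up for illustration: four of the five independent estimates are close to a true value near 100, and one is a rare bad estimate.

```python
import statistics


def median_boost(estimates):
    """Combine independent basic estimates Y_1..Y_s by taking their median."""
    return statistics.median(estimates)


# Hypothetical estimates: most are close to 100, one is far off.
ys = [98.0, 103.0, 5000.0, 101.0, 97.0]
boosted = median_boost(ys)        # unaffected by the single outlier
average = statistics.fmean(ys)    # dragged far away by the outlier
```

The median is wrong only if at least half of the estimates are bad, which is what the Chernoff bound makes exponentially unlikely in $s$.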
SLIDE 9

Linear Sketch

◮ The algorithm maintains a linear sketch $[Z_1, Z_2, \ldots, Z_t] = Rx$, where $x$ is the frequency vector and $R$ is a $t \times n$ random matrix with entries in $\{+1, -1\}$.

◮ Use $Y = \|Rx\|_2^2$ to estimate $t\|x\|_2^2$, with $t = O(\frac{1}{\epsilon^2})$.

◮ A streaming algorithm operating in the sketch model can be viewed as a dimensionality reduction technique.
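Linearity is what makes such a sketch usable under arbitrary updates and merges. The Python sketch below checks that the sketch of a sum equals the sum of the sketches. The helper names and the use of fully random signs (rather than a four-wise independent hash family) are assumptions made for brevity.

```python
import random


def make_sign_matrix(t, n, seed=0):
    """t x n matrix with independent uniform +1/-1 entries (assumed fully random)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(t)]


def sketch(R, x):
    """Compute the linear sketch Rx as a list of t numbers."""
    return [sum(r * xi for r, xi in zip(row, x)) for row in R]


R = make_sign_matrix(t=8, n=5)
x = [3, 0, -1, 4, 2]
y = [1, 1, 0, -4, 5]

# Sketching x + y gives the same result as adding the two sketches.
merged = sketch(R, [a + b for a, b in zip(x, y)])
summed = [a + b for a, b in zip(sketch(R, x), sketch(R, y))]
```

Here `merged` and `summed` are identical, so sketches of two streams can be combined without revisiting either stream.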

SLIDE 10

Dimensionality Reduction

◮ A streaming algorithm operating in the sketch model can be viewed as a dimensionality reduction technique.

◮ Stream $S$: a point in $n$-dimensional space; we want to compute $\ell_2(S)$.

◮ The sketch operator can be viewed as an approximate embedding of $\ell_2^n$ into a sketch space $C$ such that

  1. each point in $C$ can be described using only a small number (say $m$) of numbers, so $C \subset \mathbb{R}^m$, and

  2. the value of $\ell_2(S)$ is approximately equal to $F(C(S))$.

◮ $F(Y_1, Y_2, \ldots, Y_t) = \mathrm{median}(Y_1, Y_2, \ldots, Y_t)$

SLIDE 11

Dimensionality Reduction

◮ $F(Y_1, Y_2, \ldots, Y_t) = \mathrm{median}(Y_1, Y_2, \ldots, Y_t)$

◮ Disadvantage: $F$ is not a norm, so performing any nontrivial operations in the sketch space (e.g. clustering, similarity search, regression) becomes difficult.

◮ Can we embed from $\ell_2^n$ into $\ell_2^m$, $m \ll n$, approximately preserving distances? Johnson-Lindenstrauss Lemma

SLIDE 12

Interlude to Normal Distribution

Normal distribution N(0, 1):

◮ Range $(-\infty, \infty)$

◮ Density $f(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}}$

◮ Mean = 0, Variance = 1

Basic facts

◮ If $X$ and $Y$ are independent random variables with normal distributions, then $X + Y$ is also normally distributed.

◮ If $X$ and $Y$ are independent with mean 0, then $E[(X + Y)^2] = E[X^2] + E[Y^2]$.

◮ $E[cX] = cE[X]$, $\mathrm{Var}[cX] = c^2\,\mathrm{Var}[X]$
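The second basic fact follows in one line from expanding the square. Independence gives $E[XY] = E[X]\,E[Y]$, and the zero means make the cross term vanish:

```latex
\begin{align*}
E[(X + Y)^2]
  &= E[X^2] + 2\,E[XY] + E[Y^2] \\
  &= E[X^2] + 2\,E[X]\,E[Y] + E[Y^2] \\
  &= E[X^2] + E[Y^2].
\end{align*}
```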
SLIDE 13

A Different Linear Sketch

Instead of $\pm 1$, let the $r_i$ be i.i.d. random variables drawn from $N(0, 1)$.

◮ Consider $Z = \sum_i r_i x_i$

◮ $E[Z^2] = E[(\sum_i r_i x_i)^2] = \sum_i E[r_i^2]\,x_i^2 = \sum_i \mathrm{Var}[r_i]\,x_i^2 = \sum_i x_i^2 = \|x\|_2^2$.

◮ As before, we maintain $Z = [Z_1, Z_2, \ldots, Z_t]$ and define $Y = \|Z\|_2^2$

◮ $E[Y] = t\|x\|_2^2$

◮ We show that there exists a constant $C > 0$ such that, for small enough $\epsilon > 0$, $\Pr[\,|Y - t\|x\|_2^2| > \epsilon t\|x\|_2^2\,] \le e^{-C\epsilon^2 t}$ (JL lemma)

◮ Set $t = O(\frac{1}{\epsilon^2} \log \frac{1}{\delta})$
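A minimal Python sketch of this Gaussian sketch follows. The function names and the seed handling are assumptions made for illustration. The estimate divides $Y$ by $t$ so that it targets $\|x\|_2^2$ directly.

```python
import random


def gaussian_sketch(x, t, seed=0):
    """Return Z = [Z_1..Z_t], where Z_j = sum_i r_ji * x_i with r_ji ~ N(0, 1)."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) * xi for xi in x) for _ in range(t)]


def estimate_sq_norm(x, t, seed=0):
    """Estimate ||x||_2^2 by Y / t, where Y = ||Z||_2^2."""
    z = gaussian_sketch(x, t, seed)
    return sum(zj * zj for zj in z) / t
```

For `x = [3.0, 4.0]` the true value is $\|x\|_2^2 = 25$, and the estimate concentrates around it as `t` grows, in line with the $e^{-C\epsilon^2 t}$ tail bound.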

SLIDE 14

Johnson Lindenstrauss Lemma

Lemma

For any $0 < \epsilon < 1$ and any integer $m$, let $t$ be a positive integer such that

$t > \frac{4 \ln m}{\epsilon^2/2 - \epsilon^3/3}$.

Then for any set $V$ of $m$ points in $\mathbb{R}^n$, there is a map $f : \mathbb{R}^n \to \mathbb{R}^t$ such that for all $u, v \in V$,

$(1 - \epsilon)\|u - v\|_2^2 \le \|f(u) - f(v)\|_2^2 \le (1 + \epsilon)\|u - v\|_2^2$.

Furthermore, this map can be found in randomized polynomial time.
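To get a feel for the target dimension, the Python helper below evaluates the bound on $t$. It assumes the standard form of the lemma with $\epsilon^2/2 - \epsilon^3/3$ in the denominator, and the function name `jl_min_dim` is hypothetical.

```python
import math


def jl_min_dim(m, eps):
    """Smallest integer t with t > 4 ln(m) / (eps^2/2 - eps^3/3).

    Assumes the standard statement of the lemma; m is the number of points.
    """
    if not (0 < eps < 1) or m < 2:
        raise ValueError("need 0 < eps < 1 and m >= 2")
    bound = 4 * math.log(m) / (eps ** 2 / 2 - eps ** 3 / 3)
    return math.floor(bound) + 1
```

For $m = 1000$ points and $\epsilon = 0.5$, the bound is about 331.6, so `jl_min_dim(1000, 0.5)` returns 332. The result does not depend on the original dimension $n$.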