Bloom Filters, Count Sketches and Adaptive Sketches
Rice University
Anshumali Shrivastava
anshumali@rice.edu
29th August 2016
Anshumali Shrivastava (COMP 640) Sketching 29th August 2016 1 / 22
Basics: Universal Hashing
Basic tool for
Use Random Hash Function
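As a hedged illustration of what drawing a random hash function can look like, the sketch below samples one member of the classic Carter-Wegman family h(x) = ((a x + b) mod p) mod m. The slide does not specify a family, so the prime `P`, the helper name `make_hash`, and the integer keys are assumptions made for this example only.

```python
import random

# Assumption: keys are non-negative integers smaller than P.
P = (1 << 61) - 1  # a Mersenne prime used as the modulus


def make_hash(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m at random from the family."""
    a = rng.randrange(1, P)  # a must be non-zero
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m


rng = random.Random(0)          # fixed seed only to make the demo repeatable
h = make_hash(16, rng)          # one random function into 16 buckets
buckets = [h(x) for x in range(1000)]
```

The randomness lives in the choice of (a, b), not in the input: once `h` is drawn, it maps the same key to the same bucket every time.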
t = T ($A_T$), t = T-1 ($A_{T-1}$), t = T-2 ($A_{T-2}$), t = T-3 ($A_{T-3}$), t = T-4 ($A_{T-4}$), t = T-5 ($A_{T-5}$), t = T-6 ($A_{T-6}$)
(Matusevych, Smola and Ahmed 2012)
Data Stream -> Pre-emphasis (multiply by $f(t)$) -> Insert -> SKETCH -> Query -> De-emphasis (divide by the appropriate $f(t)$) -> RESULT
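The pre-emphasis and de-emphasis pipeline can be sketched on top of a small Count-Min Sketch, as below. This is a minimal illustration under stated assumptions, not the lecture's implementation: the class name `AdaCMS`, the joint hashing of the (item, time) pair, the integer item IDs, and the linear emphasis function written here as f(t) = 1 + t are all choices made for the example.

```python
import random


class AdaCMS:
    """Count-Min Sketch with pre-emphasis on insert and de-emphasis on query."""

    def __init__(self, width, depth, f, seed=0):
        rng = random.Random(seed)
        self.w, self.d, self.f = width, depth, f
        self.p = (1 << 61) - 1  # prime modulus for the row hashes (assumption)
        self.ab = [(rng.randrange(1, self.p), rng.randrange(self.p))
                   for _ in range(depth)]
        self.table = [[0.0] * width for _ in range(depth)]

    def _idx(self, key, row):
        a, b = self.ab[row]
        return ((a * hash(key) + b) % self.p) % self.w

    def insert(self, item, t, count=1):
        key = (item, t)  # assumption: item and time are hashed jointly
        for r in range(self.d):
            # Pre-emphasis: scale the update by f(t) before adding it.
            self.table[r][self._idx(key, r)] += self.f(t) * count

    def query(self, item, t):
        key = (item, t)
        est = min(self.table[r][self._idx(key, r)] for r in range(self.d))
        # De-emphasis: divide by the same f(t) used at insert time.
        return est / self.f(t)


sketch = AdaCMS(width=1024, depth=4, f=lambda t: 1.0 + t)
sketch.insert(7, t=3, count=5)
sketch.insert(9, t=10, count=2)
estimate = sketch.query(7, t=3)  # never below the true count of 5
```

With an increasing f(t), a colliding update from an old time step is divided by a larger f(t) when a recent item is queried, so recent estimates absorb less of that collision error; the cost is that older estimates absorb more.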
[Diagram: Pre-Emphasis, additive Error (+), De-Emphasis]
$\sum_{t'=0} (f(t'))^2$
[Figure: Absolute Error versus Time, w = $2^{18}$, panels AOL and Criteo; methods: CMS, Ada-CMS (exp), Ada-CMS (lin), Hokusai]
[Figure: Standard Deviation of Errors versus Time, w = $2^{18}$, panels AOL and Criteo; methods: CMS, Ada-CMS (exp), Ada-CMS (lin), Hokusai]