Recap
Hashing-based sketch techniques summarize large data sets
Summarize vectors:
– Test equality (fingerprints)
– Recover approximate entries (count-min, count sketch)
– Approximate Euclidean norm (F2) and dot product
– Approximate number of non-zero entries (F0)
– Approximate set membership (Bloom filter)
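As a reminder of how one of these summaries works, here is a minimal count-min sketch in Python. It is an illustration under assumptions, not the course's reference implementation: the class name `CountMinSketch`, the `width` and `depth` defaults, and the use of salted BLAKE2b in place of a pairwise-independent hash family are all choices made for this example. It assumes updates are non-negative.

```python
import hashlib


class CountMinSketch:
    """Illustrative count-min sketch for non-negative updates."""

    def __init__(self, width=256, depth=4):
        # width = counters per row, depth = number of rows (hash functions).
        # Both defaults are arbitrary choices for this example.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row, item):
        # One hash function per row, obtained by salting BLAKE2b with the
        # row index. This stands in for a pairwise-independent hash family.
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item, count=1):
        # Add the update to exactly one counter in every row.
        for row in range(self.depth):
            self.table[row][self._bucket(row, item)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum over rows
        # is the tightest estimate and never undercounts.
        return min(
            self.table[row][self._bucket(row, item)]
            for row in range(self.depth)
        )


sketch = CountMinSketch()
for word in ["a", "b", "a", "c", "a"]:
    sketch.update(word)

# The estimate is an upper bound on the true count (3 for "a").
print(sketch.estimate("a"))
```

The estimate can exceed the true count when items collide in every row, but with non-negative updates it never falls below it; widening the rows reduces the expected overcount, and adding rows reduces the probability of a bad estimate.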
Streams, Sketching and Big Data