
CS 498ABD: Algorithms for Big Data
Limited independence and Hashing 05-06
Lecture 06/07, September 8 and 10, 2020
Chandra (UIUC), CS498ABD, Fall 2020

Pseudorandomness

Randomized algorithms rely on independent random bits.
Pseudorandomness: when can we avoid or limit the number of random bits?
Motivated by fundamental theoretical questions and applications.
Applications: hashing, cryptography, streaming, simulations, derandomization, ...
A large topic in TCS with many connections to mathematics.
This course: need t-wise independent variables and hashing.

Part I: Pairwise and t-wise independent random variables


Pairwise independent random variables

Definition. Discrete random variables $X_1, X_2, \ldots, X_n$ from a range $B$ are independent if for all $b_1, b_2, \ldots, b_n \in B$,

$$\Pr[X_1 = b_1, X_2 = b_2, \ldots, X_n = b_n] = \prod_{i=1}^{n} \Pr[X_i = b_i].$$

They are uniformly distributed if $\Pr[X_i = b] = 1/|B|$ for all $i$ and all $b \in B$.

Definition. Random variables $X_1, X_2, \ldots, X_n$ from a range $B$ are pairwise independent if for all $1 \le i < j \le n$ and for all $b, b' \in B$,

$$\Pr[X_i = b, X_j = b'] = \Pr[X_i = b] \cdot \Pr[X_j = b'].$$


If $X_1, X_2, \ldots, X_n$ are independent then they are pairwise independent, but the converse is not necessarily true.

Example: $X_1, X_2$ are independent uniform bits (variables from $\{0, 1\}$) and $X_3 = X_1 \oplus X_2$. Then $X_1, X_2, X_3$ are pairwise independent but not independent: any two of them determine the third.
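The XOR example can be checked by brute force. The following is a minimal sketch, assuming X_1 and X_2 are fair bits; the helper names (`joint_distribution`, `prob`, `is_pairwise_independent`, `is_mutually_independent`) are illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import product


def joint_distribution():
    """Return {(x1, x2, x3): probability} where X3 = X1 XOR X2
    and X1, X2 are independent fair bits (an assumption of this sketch)."""
    dist = {}
    for x1, x2 in product((0, 1), repeat=2):
        outcome = (x1, x2, x1 ^ x2)
        dist[outcome] = dist.get(outcome, Fraction(0)) + Fraction(1, 4)
    return dist


def prob(dist, constraints):
    """Probability that X_i = b for every (i, b) in constraints (i is 0-based)."""
    return sum(p for outcome, p in dist.items()
               if all(outcome[i] == b for i, b in constraints.items()))


def is_pairwise_independent(dist, n=3):
    """Check Pr[X_i = b, X_j = c] = Pr[X_i = b] * Pr[X_j = c] for all i < j."""
    for i in range(n):
        for j in range(i + 1, n):
            for b, c in product((0, 1), repeat=2):
                if prob(dist, {i: b, j: c}) != prob(dist, {i: b}) * prob(dist, {j: c}):
                    return False
    return True


def is_mutually_independent(dist, n=3):
    """Check the full product rule over all n variables at once."""
    for bits in product((0, 1), repeat=n):
        joint = prob(dist, dict(enumerate(bits)))
        marginals = Fraction(1)
        for i, b in enumerate(bits):
            marginals *= prob(dist, {i: b})
        if joint != marginals:
            return False
    return True


dist = joint_distribution()
print(is_pairwise_independent(dist))   # True
print(is_mutually_independent(dist))   # False
```

The full product rule fails, for instance, at the outcome (0, 0, 1): it has probability 0, while the product of the three marginals is 1/8.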

t-wise independence

Generalizing pairwise independence:

Definition. Random variables $X_1, X_2, \ldots, X_n$ from a range $B$ are t-wise independent for an integer $t > 1$ if $X_{i_1}, X_{i_2}, \ldots, X_{i_t}$ are independent for any distinct indices $i_1, i_2, \ldots, i_t \in \{1, 2, \ldots, n\}$.

As $t$ increases the variables become more and more independent. If $t = n$ the variables are independent.

Motivation for pairwise/t-wise independence from streaming

Want $n$ uniformly distributed random variables $X_1, X_2, \ldots, X_n$, say bits.
But cannot store $n$ bits because $n$ is too large.
Achievable:
storage of $O(\log n)$ random bits;
given $i$ where $1 \le i \le n$, can generate $X_i$ in $O(\log n)$ time;
$X_1, X_2, \ldots, X_n$ are pairwise independent and uniform.
Hence, with small storage, can generate $n$ random variables "on the fly".
In several applications, pairwise independence (or generalizations) suffices.



Generating pairwise independent bits

Assume for simplicity $n = 2^k - 1$ (otherwise consider the nearest power of 2). Hence $k = O(\log n)$.
Let $Y_1, Y_2, \ldots, Y_k$ be independent uniform bits.
For any $S \subseteq \{1, 2, \ldots, k\}$, $S \neq \emptyset$, define $X_S = \bigoplus_{i \in S} Y_i$.
This gives $2^k - 1$ random variables $X_S$.

Claim: If $S \neq T$ then $X_S$ and $X_T$ are independent.

Proof. $X_S$ and $X_T$ are both uniformly distributed over $\{0, 1\}$. Suppose $S \setminus T \neq \emptyset$. Even knowing all outcomes of the variables $Y_i$ with $i \in T$, the variables $Y_i$ with $i \in S \setminus T$ are still independent uniform bits, and hence $\Pr[X_S = 0 \mid Y_i,\ i \in T] = 1/2$. Since $X_T$ is determined by the $Y_i$ with $i \in T$, $X_S$ is independent of $X_T$. If instead $S \subset T$, then $T \setminus S \neq \emptyset$ and the same argument applies with the roles of $S$ and $T$ exchanged.
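The construction can be sketched in a few lines by encoding each nonempty subset S as a bitmask s in 1..2^k - 1 and the seed Y_1, ..., Y_k as a k-bit integer, so that X_S is the parity of `seed & s`. The function names and the bitmask encoding below are assumptions made for illustration; the check simply enumerates all 2^k seeds.

```python
from itertools import product


def pairwise_bit(seed, s):
    """X_S for the subset S encoded by bitmask s (1 <= s < 2**k).

    seed is an integer whose k low-order bits play the role of Y_1..Y_k;
    the result is the XOR (parity) of the seed bits selected by s.
    """
    return bin(seed & s).count("1") % 2


def check_pairwise_independent(k):
    """Exhaustively verify that for every pair S != T the pair (X_S, X_T)
    is uniform over {0,1}^2 when the seed is uniform over k-bit strings."""
    total = 2 ** k
    n = total - 1
    for s in range(1, n + 1):
        for t in range(s + 1, n + 1):
            counts = {}
            for seed in range(total):
                key = (pairwise_bit(seed, s), pairwise_bit(seed, t))
                counts[key] = counts.get(key, 0) + 1
            # Each of the four outcomes must occur for exactly 1/4 of the seeds.
            if any(4 * counts.get(key, 0) != total
                   for key in product((0, 1), repeat=2)):
                return False
    return True


print(check_pairwise_independent(4))  # True: 15 pairwise independent bits from 4 seed bits
```

Only the k seed bits are stored; X_S is recomputed from `seed` and `s` whenever it is needed, which matches the "on the fly" usage described in the motivation.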


Pairwise independent variables with larger range

Suppose we want $n$ pairwise independent random variables in the range $\{0, 1, 2, \ldots, m - 1\}$ where $m = 2^k$ for some $k$ (for example, $m = 1024$ needs 10 bits).
Now each $X_i$ needs to be a $\log m$ bit string.
Use the preceding construction for each bit independently.
Requires $O(\log m \log n)$ random bits in total.
Can in fact do it with $O(\log n + \log m)$ random bits.
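A minimal sketch of the bit-by-bit approach, under the assumption that the range size is a power of two and that one independent seed is drawn per output bit. The names `parity`, `value`, and `check_larger_range`, and the parameters `k` (seed length) and `bits` (output length, i.e. log m), are illustrative only.

```python
from itertools import product


def parity(x):
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(x).count("1") % 2


def value(seeds, i):
    """X_i in {0, ..., 2**len(seeds) - 1}.

    Bit j of X_i is the pairwise independent bit built from seeds[j]
    and the bitmask i, so the seeds hold len(seeds) * k random bits overall.
    """
    return sum(parity(seed & i) << j for j, seed in enumerate(seeds))


def check_larger_range(k, bits):
    """Enumerate every choice of seeds and confirm that each pair (X_i, X_j),
    i != j, is uniform over {0, ..., m-1}^2 with m = 2**bits."""
    n, m = 2 ** k - 1, 2 ** bits
    all_seeds = list(product(range(2 ** k), repeat=bits))
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            counts = {}
            for seeds in all_seeds:
                key = (value(seeds, i), value(seeds, j))
                counts[key] = counts.get(key, 0) + 1
            if len(counts) != m * m or len(set(counts.values())) != 1:
                return False
    return True


print(check_larger_range(k=3, bits=2))  # True
```

With `k = 3` and `bits = 2` the seeds hold 6 random bits and yield 7 pairwise independent values in {0, 1, 2, 3}, in line with the O(log m log n) count above.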



Using prime numbers and fields

Assume $n = m = p$ where $p$ is a prime number.
Want $p$ pairwise independent random variables distributed uniformly in $\mathbb{Z}_p = \{0, 1, 2, \ldots, p - 1\}$.
Choose $a, b \in \{0, 1, 2, \ldots, p - 1\}$ uniformly and independently at random. Requires $2 \lceil \log p \rceil$ random bits.
For $0 \le i \le p - 1$ set $X_i = (a i + b) \bmod p$.
Note that one needs to store only $a, b, p$ and can generate $X_i$ efficiently on the fly from $i$.

Exercise: Prove that each $X_i$ is uniformly distributed in $\mathbb{Z}_p$.

Claim: For $i \neq j$, $X_i$ and $X_j$ are independent.
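For small p the claim can be checked exhaustively. The sketch below (function names are illustrative, not from the lecture) uses the fact that there are p^2 choices of (a, b) and p^2 possible value pairs, so X_i and X_j are uniform and independent exactly when every value pair arises from exactly one (a, b).

```python
from itertools import product


def affine_value(a, b, i, p):
    """X_i = (a*i + b) mod p, computed on the fly from the stored pair (a, b)."""
    return (a * i + b) % p


def check_affine_pairwise(p):
    """Return True if, for every i != j in {0, ..., p-1}, the map
    (a, b) -> (X_i, X_j) hits all p*p value pairs (hence each exactly once)."""
    for i in range(p):
        for j in range(i + 1, p):
            seen = {(affine_value(a, b, i, p), affine_value(a, b, j, p))
                    for a, b in product(range(p), repeat=2)}
            if len(seen) != p * p:
                return False
    return True


print(check_affine_pairwise(7))  # True
print(check_affine_pairwise(4))  # False: 4 is not prime
```

The failure for the modulus 4 shows where primality is used: with i = 0 and j = 2, the choices a = 0 and a = 2 give the same pair (X_0, X_2), so some value pairs are never produced.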

Some math is required to prove the claim: $\mathbb{Z}_p$ is a field for any prime $p$. That is, $\{0, 1, 2, \ldots, p - 1\}$ forms a commutative group under addition mod $p$ (easy). And, more importantly, $\{1, 2, \ldots, p - 1\}$ forms a commutative group under multiplication mod $p$.

Some math required...

Lemma (LemmaUnique). Let $p$ be a prime number and $x$ an integer in $\{1, \ldots, p - 1\}$. Then there exists a unique $y \in \{1, \ldots, p - 1\}$ such that $xy = 1 \bmod p$.

In other words: every nonzero element has a unique multiplicative inverse.
$\Longrightarrow$ $\mathbb{Z}_p = \{0, 1, \ldots, p - 1\}$, when working modulo $p$, is a field.

Proof of LemmaUnique

Claim. Let $p$ be a prime number. For any $x, y, z \in \{1, \ldots, p - 1\}$ such that $y \neq z$, we have $xy \bmod p \neq xz \bmod p$.

Proof. Assume for the sake of contradiction that $xy \bmod p = xz \bmod p$. Then
$x(y - z) = 0 \bmod p$
$\Longrightarrow$ $p$ divides $x(y - z)$
$\Longrightarrow$ $p$ divides $y - z$ (since $p$ is prime and does not divide $x$)
$\Longrightarrow$ $y - z = 0$ (since $|y - z| < p$)
$\Longrightarrow$ $y = z$.
And that is a contradiction.


Proof of LemmaUnique

Uniqueness. By the above claim, if $xy = 1 \bmod p$ and $xz = 1 \bmod p$ then $y = z$. Hence uniqueness follows.

Existence. For any $x \in \{1, \ldots, p - 1\}$, the claim shows that the $p - 1$ values $x \cdot 1 \bmod p, \ldots, x \cdot (p - 1) \bmod p$ are distinct, and none of them is 0 because $p$ divides neither factor. Hence
$\{x \cdot 1 \bmod p,\ x \cdot 2 \bmod p,\ \ldots,\ x \cdot (p - 1) \bmod p\} = \{1, 2, \ldots, p - 1\}$.
$\Longrightarrow$ There exists a number $y \in \{1, \ldots, p - 1\}$ such that $xy = 1 \bmod p$.
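The existence argument can be mirrored directly in code. The sketch below (helper names are illustrative) lists the multiples of x modulo p, confirms they form a permutation of {1, ..., p-1}, and finds the inverse by brute-force search. In practice Python 3.8 and later can compute the same value with the built-in `pow(x, -1, p)`; the search here is only meant to follow the proof.

```python
def is_permutation_of_nonzero(x, p):
    """True if {x*1 mod p, ..., x*(p-1) mod p} equals {1, ..., p-1}."""
    return sorted(x * y % p for y in range(1, p)) == list(range(1, p))


def inverse_mod(x, p):
    """Return the unique y in {1, ..., p-1} with x*y = 1 (mod p).

    Assumes p is prime and 1 <= x <= p-1; raises ValueError otherwise
    if no unique inverse exists.
    """
    candidates = [y for y in range(1, p) if x * y % p == 1]
    if len(candidates) != 1:
        raise ValueError("no unique inverse: is p prime and x nonzero mod p?")
    return candidates[0]


print(all(is_permutation_of_nonzero(x, 11) for x in range(1, 11)))  # True
print(inverse_mod(3, 7))                 # 5, since 3 * 5 = 15 = 1 (mod 7)
print(is_permutation_of_nonzero(2, 6))   # False: 6 is not prime
```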
