Algorithms for Big Data (III)


  1. Algorithms for Big Data (III). Chihao Zhang, Shanghai Jiao Tong University. Sept. 29, 2019.

  5. Review of the Last Lecture. Last time, we proved a few useful concentration inequalities. We introduced the notion of universal families of hash functions. We constructed a 2-universal family of hash functions.

  9. Review: the construction. Let $m, n$ be two integers and $p \ge m$ be a prime. The family is $\mathcal{H} = \{ h_{a,b} : 1 \le a \le p-1,\ 0 \le b \le p-1 \}$, where each $h_{a,b} : [m] \to [n]$ is defined as $h_{a,b}(x) = ((ax + b) \bmod p) \bmod n$. We proved that for every $x \ne y$, $\Pr_{h \in \mathcal{H}}[h(x) = h(y)] \le \frac{1}{n}$. So $\mathcal{H}$ is a 2-universal hash function family.
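The collision bound can be checked exhaustively for small parameters. The following is a minimal sketch, not part of the lecture: the helper names `make_hash` and `max_collision_probability` and the parameters $m = 10$, $n = 4$, $p = 11$ are illustrative assumptions, and $[m]$ is taken to be $\{0, \dots, m-1\}$.

```python
from fractions import Fraction
from itertools import combinations


def make_hash(a, b, p, n):
    """Return h_{a,b}(x) = ((a*x + b) mod p) mod n."""
    return lambda x: ((a * x + b) % p) % n


def max_collision_probability(m, n, p):
    """Largest Pr_h[h(x) = h(y)] over all pairs x != y in {0, ..., m-1}."""
    # Enumerate the whole family: 1 <= a <= p-1 and 0 <= b <= p-1.
    family = [make_hash(a, b, p, n) for a in range(1, p) for b in range(p)]
    worst = Fraction(0)
    for x, y in combinations(range(m), 2):
        collisions = sum(1 for h in family if h(x) == h(y))
        worst = max(worst, Fraction(collisions, len(family)))
    return worst


# Illustrative parameters (assumed, not from the slides).
print(max_collision_probability(10, 4, 11) <= Fraction(1, 4))
```

Exact fractions are used instead of floats so that the comparison against $1/n$ involves no rounding.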

  13. Strongly 2-Universal Hash Family. Recall that if we further require that for any distinct $x, y$ and any $u, v$, $\Pr_{h \in \mathcal{H}}[h(x) = u \wedge h(y) = v] = \frac{1}{n^2}$, then $\mathcal{H}$ is called a strongly 2-universal family of hash functions. When $m = n = p$ is a prime, we can modify the previously constructed $\mathcal{H}$ to get a strongly 2-universal family. In this case, we have $\mathcal{H} = \{ h_{a,b}(x) = (ax + b) \bmod p : 0 \le a, b \le p-1 \}$.
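For a small prime, the strong 2-universality condition can be verified by brute force: for every pair $x \ne y$, each target $(u, v)$ should be hit by exactly one of the $p^2$ pairs $(a, b)$. The sketch below assumes the domain is $\{0, \dots, p-1\}$; the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def pair_counts(x, y, p):
    """For each (u, v), count the pairs (a, b) with h(x) = u and h(y) = v."""
    counts = Counter()
    for a, b in product(range(p), repeat=2):
        counts[((a * x + b) % p, (a * y + b) % p)] += 1
    return counts


def is_strongly_2_universal(p):
    """Check that every (u, v) is hit exactly once for every x != y."""
    for x, y in product(range(p), repeat=2):
        if x == y:
            continue
        counts = pair_counts(x, y, p)
        if len(counts) != p * p or any(c != 1 for c in counts.values()):
            return False
    return True


print(is_strongly_2_universal(7))
```

Calling `is_strongly_2_universal(4)` returns `False`: with a composite modulus, $a = 0$ and $a = 2$ give the same values at $x = 0$ and $y = 2$, which shows why the construction asks for a prime.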

  18. Lemma. The equation $ax + b = 0 \bmod p$ has a unique solution (in $\mathbb{F}_p$) if $a \ne 0$ and $p$ is a prime. Proof (that $\mathcal{H}$ is strongly 2-universal). For $x_1 \ne x_2$, the equations $h_{a,b}(x_1) = y_1$ and $h_{a,b}(x_2) = y_2$ are equivalent to $a x_1 + b = y_1 \bmod p$ and $a x_2 + b = y_2 \bmod p$. They have a unique solution $a = \frac{y_2 - y_1}{x_2 - x_1} \bmod p$, $b = y_1 - a x_1 \bmod p$. Therefore, $\Pr_{h_{a,b} \in \mathcal{H}}[h_{a,b}(x_1) = y_1 \wedge h_{a,b}(x_2) = y_2] = \frac{1}{p^2}$.
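The unique solution in the proof can be computed directly, with division in $\mathbb{F}_p$ realised as multiplication by a modular inverse. This is a small sketch under the assumption of Python 3.8 or later (for `pow` with a negative exponent); `solve_ab` is a hypothetical helper name.

```python
def solve_ab(x1, y1, x2, y2, p):
    """Return the unique (a, b) with a*x1 + b = y1 and a*x2 + b = y2 (mod p).

    Requires p prime and x1 != x2 (mod p).
    """
    diff = (x2 - x1) % p
    if diff == 0:
        raise ValueError("x1 and x2 must be distinct modulo p")
    inv = pow(diff, -1, p)  # inverse of (x2 - x1) in F_p
    a = ((y2 - y1) * inv) % p
    b = (y1 - a * x1) % p
    return a, b


# Illustrative values (assumed): p = 7, h(2) = 3, h(5) = 1.
a, b = solve_ab(2, 3, 5, 1, 7)
print(a, b, (a * 2 + b) % 7, (a * 5 + b) % 7)
```

Since exactly one of the $p^2$ pairs $(a, b)$ satisfies both equations, the probability in the proof is $1/p^2$.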

  24. The General Case. The hash family we just constructed has the restriction that $m = n$. We can naturally generalize $m = p$ to $m = p^k$. Write every number $x$ in base $p$: $x = x_0 + x_1 \cdot p + x_2 \cdot p^2 + \dots + x_{k-1} \cdot p^{k-1}$. For every $\bar{a} = (a_0, a_1, \dots, a_{k-1})$ with $0 \le a_i \le p-1$ and $0 \le b \le p-1$, define $h_{\bar{a},b}(x) = \left( \sum_{i=0}^{k-1} a_i x_i + b \right) \bmod p$. Then $\mathcal{H} = \{ h_{\bar{a},b} : 0 \le a_i \le p-1,\ 0 \le b \le p-1 \}$.
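The base-$p$ construction might be implemented along the following lines. This is a sketch only: the names `digits_base_p` and `make_digit_hash` are assumptions, and digits are stored least significant first so that index $i$ matches $x_i$.

```python
def digits_base_p(x, p, k):
    """Return the k base-p digits [x_0, ..., x_{k-1}] of x, least significant first."""
    if not 0 <= x < p ** k:
        raise ValueError("x must lie in {0, ..., p^k - 1}")
    digits = []
    for _ in range(k):
        x, d = divmod(x, p)
        digits.append(d)
    return digits


def make_digit_hash(a, b, p):
    """Return h_{a,b}(x) = (sum_i a_i * x_i + b) mod p, with k = len(a)."""
    k = len(a)

    def h(x):
        xs = digits_base_p(x, p, k)
        return (sum(ai * xi for ai, xi in zip(a, xs)) + b) % p

    return h


# Illustrative values (assumed): p = 5, k = 3, so the domain is {0, ..., 124}.
h = make_digit_hash((3, 0, 4), 1, 5)
print(digits_base_p(57, 5, 3), h(57))
```

A function from the family is described by $k + 1$ numbers in $\{0, \dots, p-1\}$, so sampling one uniformly amounts to drawing those numbers independently.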

  26. Proof. Assume $x \ne y$ and that they differ at position $i$ ($x_i \ne y_i$). For every $u, v \in [p]$, we have the equations $a_i x_i + b = u - \sum_{j \ne i} a_j x_j \bmod p$ and $a_i y_i + b = v - \sum_{j \ne i} a_j y_j \bmod p$. For fixed $x, y, u, v$ and $a_j$ ($j \ne i$), a unique pair $(a_i, b)$ (out of $p^2$ pairs) is determined. Therefore, $\Pr_{h_{\bar{a},b} \in \mathcal{H}}[h_{\bar{a},b}(x) = u \wedge h_{\bar{a},b}(y) = v] = \frac{1}{p^2}$.
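The counting argument says that, for fixed $x \ne y$ and fixed $(u, v)$, each of the $p^{k-1}$ choices of the other coefficients $a_j$ leaves exactly one pair $(a_i, b)$, so $p^{k-1}$ of the $p^{k+1}$ functions hit $(u, v)$. A brute-force check for tiny parameters could look like this; the function names and the choices of $p$ and $k$ are illustrative assumptions.

```python
from itertools import product


def hash_value(a, b, x, p):
    """Compute (sum_i a_i * x_i + b) mod p, where x_i are the base-p digits of x."""
    total = b
    for ai in a:
        x, xi = divmod(x, p)  # peel off the next base-p digit
        total += ai * xi
    return total % p


def joint_counts_are_uniform(p, k):
    """Check that each (u, v) is hit by exactly p^(k-1) functions for all x != y."""
    expected = p ** (k - 1)
    for x, y in product(range(p ** k), repeat=2):
        if x == y:
            continue
        counts = {}
        # Enumerate all p^(k+1) functions: k coefficients a_i plus the offset b.
        for *a, b in product(range(p), repeat=k + 1):
            key = (hash_value(a, b, x, p), hash_value(a, b, y, p))
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != p * p or set(counts.values()) != {expected}:
            return False
    return True


print(joint_counts_are_uniform(3, 2))
```

The enumeration grows as $p^{3k+1}$, so this is only practical for very small $p$ and $k$.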
