
CS 473: Algorithms Chandra Chekuri Ruta Mehta University of - PowerPoint PPT Presentation



  1. CS 473: Algorithms. Chandra Chekuri, Ruta Mehta. University of Illinois, Urbana-Champaign. Fall 2016. Chandra & Ruta (UIUC) CS473 1 Fall 2016 1 / 22

  2. CS 473: Algorithms, Fall 2016. Fingerprinting. Lecture 11, September 28, 2016.

  3. Fingerprinting (source: Wikipedia). Fingerprinting is the process of mapping a large data item to a much shorter bit string, called its fingerprint. A fingerprint uniquely identifies its data item for all practical purposes. Fingerprints are typically used to avoid comparing and transmitting bulky data. Example: a web browser can store and fetch file fingerprints to check whether a file has changed. As you may have guessed, fingerprint functions are hash functions.
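
The browser example can be sketched in a few lines. This is only an illustration: it assumes SHA-256 from Python's standard `hashlib` as the fingerprint function (the slides do not name one), and the helper names `fingerprint` and `has_changed` are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Map arbitrarily large data to a short, fixed-length fingerprint.

    Assumption: SHA-256 is used as the hash function; the result is
    always 64 hex characters regardless of the input size.
    """
    return hashlib.sha256(data).hexdigest()


def has_changed(stored_fingerprint: str, current_data: bytes) -> bool:
    """Compare fingerprints instead of comparing the bulky data itself."""
    return fingerprint(current_data) != stored_fingerprint


# A client keeps only the fingerprint of the version it last saw.
cached = fingerprint(b"<html>version 1</html>")
print(has_changed(cached, b"<html>version 1</html>"))  # False
print(has_changed(cached, b"<html>version 2</html>"))  # True
```

Only the 64-character fingerprint has to be stored or transmitted, which is the point of the technique when the data item is large.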

  6. Bloom Filters. Hashing:
     1. To insert $x$ into the dictionary, store $x$ in the table at location $h(x)$.
     2. To look up $y$ in the dictionary, check the contents of location $h(y)$.
     Bloom filter: save space at the cost of false positives.
     1. Storing items in the dictionary is expensive in terms of memory, especially if the items are unwieldy objects such as long strings, images, etc., with non-uniform sizes.
     2. To insert $x$ into the dictionary, set the bit at location $h(x)$ to 1 (initially all bits are set to 0).
     3. To look up $y$: if the bit at location $h(y)$ is 1, say yes; otherwise say no.

  8. Bloom Filters. Bloom filter: save space at the cost of false positives.
     1. To insert $x$ into the dictionary, set the bit at location $h(x)$ to 1 (initially all bits are set to 0).
     2. To look up $y$: if the bit at location $h(y)$ is 1, say yes; otherwise say no.
     3. There are no false negatives, but false positives are possible due to collisions.
     Reducing false positives:
     1. Pick $k$ hash functions $h_1, h_2, \ldots, h_k$ independently.
     2. To insert $x$: for $1 \le i \le k$, set the bit at location $h_i(x)$ in table $i$ to 1.
     3. To look up $y$: compute $h_i(y)$ for $1 \le i \le k$ and say yes only if each bit in the corresponding location is 1; otherwise say no.
     If the probability of a false positive for one hash function is $\alpha < 1$, then with $k$ independent hash functions it is $\alpha^k$.
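
A minimal sketch of the $k$-table scheme described above follows. It rests on assumptions that are not in the slides: salted SHA-256 stands in for $k$ independently chosen hash functions, each "bit" is stored as one byte for readability, and the names `BloomFilter`, `bits_per_table`, and `seed` are hypothetical.

```python
import hashlib
import random


class BloomFilter:
    """Bloom filter with k hash functions, one bit table per function."""

    def __init__(self, bits_per_table: int, k: int, seed: int = 0):
        rng = random.Random(seed)
        self.m = bits_per_table
        # Assumption: a random salt per table approximates picking
        # k hash functions independently.
        self.salts = [rng.getrandbits(64).to_bytes(8, "big") for _ in range(k)]
        # All bits start at 0 (one byte per bit, for clarity only).
        self.tables = [bytearray(bits_per_table) for _ in range(k)]

    def _location(self, i: int, item: bytes) -> int:
        """Location h_i(item) in table i."""
        digest = hashlib.sha256(self.salts[i] + item).digest()
        return int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: bytes) -> None:
        """Set the bit at location h_i(item) in table i, for every i."""
        for i, table in enumerate(self.tables):
            table[self._location(i, item)] = 1

    def lookup(self, item: bytes) -> bool:
        """Say yes only if every table has a 1 at location h_i(item)."""
        return all(
            table[self._location(i, item)] for i, table in enumerate(self.tables)
        )


bf = BloomFilter(bits_per_table=1024, k=4)
bf.insert(b"alice@example.com")
print(bf.lookup(b"alice@example.com"))  # True: there are no false negatives
```

A lookup for an item that was never inserted usually returns `False`, but it can return `True` when all $k$ locations happen to collide with bits set by other items.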

  12. Outline. Use of hash functions for designing fast algorithms. Problem: given a text $T$ of length $m$ and a pattern $P$ of length $n$, with $m \gg n$, find all occurrences of $P$ in $T$. Karp-Rabin randomized algorithm:
      1. Sampling a prime
      2. String equality via mod $p$ arithmetic
      3. Rabin's fingerprinting scheme: rolling hash
      4. Karp-Rabin pattern matching algorithm: $O(m + n)$ time
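
To make the problem statement concrete, here is a rough sketch of rolling-hash pattern matching in the spirit of Karp-Rabin. It is not the lecture's exact algorithm: it assumes a fixed modulus (the Mersenne prime $2^{61} - 1$) rather than a randomly sampled prime, treats strings as bytes in base 256, and verifies every fingerprint match explicitly so that it never reports a false occurrence. The function and parameter names are hypothetical.

```python
def karp_rabin(text: bytes, pattern: bytes,
               p: int = (1 << 61) - 1, base: int = 256) -> list:
    """Return all start indices where pattern occurs in text."""
    m, n = len(text), len(pattern)
    if n == 0 or n > m:
        return []

    # Weight of the leading character in a length-n window, mod p.
    high = pow(base, n - 1, p)

    # Fingerprints of the pattern and of the first window of the text.
    hp = ht = 0
    for j in range(n):
        hp = (hp * base + pattern[j]) % p
        ht = (ht * base + text[j]) % p

    matches = []
    for i in range(m - n + 1):
        # Equal fingerprints only suggest a match; confirm by comparison.
        if ht == hp and text[i:i + n] == pattern:
            matches.append(i)
        if i + n < m:
            # Roll the window: drop text[i], append text[i + n].
            ht = ((ht - text[i] * high) * base + text[i + n]) % p
    return matches


print(karp_rabin(b"abracadabra", b"abra"))  # [0, 7]
```

Each window fingerprint is updated in constant time from the previous one, which is what makes a single pass over the text possible.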

  15. Sampling a prime. Problem: given an integer $x > 0$, sample a prime uniformly at random from all the primes between 1 and $x$. Procedure:
      1. Sample a number $p$ uniformly at random from $\{1, \ldots, x\}$.
      2. If $p$ is a prime, then output $p$. Otherwise go to Step 1.
      Checking whether $p$ is prime:
      1. Agrawal-Kayal-Saxena primality test: deterministic but slow.
      2. Miller-Rabin randomized primality test: fast, but with very low probability it outputs 'prime' for a number that is not prime.
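
The procedure above can be sketched as follows, using a textbook Miller-Rabin test as the primality check. The function names and the `rounds` parameter are hypothetical, and the sketch requires $x \ge 2$ because no prime exists below 2 and the loop would otherwise never terminate.

```python
import random


def is_probable_prime(n: int, rounds: int = 30) -> bool:
    """Miller-Rabin test: never rejects a prime; accepts a composite
    with probability at most 4**(-rounds)."""
    if n < 2:
        return False
    # Trial division by a few small primes handles all tiny inputs.
    for q in (2, 3, 5, 7, 11, 13):
        if n % q == 0:
            return n == q
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        y = pow(a, d, n)
        if y in (1, n - 1):
            continue
        for _ in range(s - 1):
            y = pow(y, 2, n)
            if y == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def sample_prime(x: int) -> int:
    """Rejection sampling: draw from {1, ..., x} until a prime appears."""
    if x < 2:
        raise ValueError("there is no prime in {1, ..., x} for x < 2")
    while True:
        p = random.randint(1, x)  # Step 1: uniform sample, inclusive
        if is_probable_prime(p):  # Step 2: output p if prime, else retry
            return p


print(sample_prime(10**9))
```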

  18. Sampling a Prime: Analysis. Is the returned prime sampled uniformly at random? Let $\pi(x)$ be the number of primes in $\{1, \ldots, x\}$.
      Lemma. For a fixed prime $p^* \le x$, $\Pr[\text{algorithm outputs } p^*] = 1/\pi(x)$.
      Proof. Let $A$ be the event that a prime is picked in a round, so $\Pr[A] = \pi(x)/x$. Let $B$ be the event that the number (prime) $p^*$ is picked, so $\Pr[B] = 1/x$. Since $B \subset A$,
      $$\Pr[B \mid A] = \frac{\Pr[A \cap B]}{\Pr[A]} = \frac{\Pr[B]}{\Pr[A]} = \frac{1/x}{\pi(x)/x} = \frac{1}{\pi(x)}.$$
      Running time in expectation. Q: How many samples in expectation before termination? A: $x/\pi(x)$. Exercise.
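
One way to work the exercise, offered as a sketch: the rounds are independent and each one terminates with probability $q = \Pr[A]$, so the number of samples $N$ (a symbol introduced here, not in the slides) is geometrically distributed. Summing over rounds also recovers the lemma.

```latex
\begin{align*}
q &= \Pr[A] = \frac{\pi(x)}{x}, \\
\Pr[N = i] &= (1 - q)^{i-1} q \qquad (i \ge 1), \\
\mathbb{E}[N] &= \sum_{i \ge 1} i \, (1 - q)^{i-1} q = \frac{1}{q} = \frac{x}{\pi(x)}, \\
\Pr[\text{algorithm outputs } p^*]
  &= \sum_{i \ge 1} (1 - q)^{i-1} \cdot \frac{1}{x}
   = \frac{1/x}{q} = \frac{1}{\pi(x)}.
\end{align*}
```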

  24. How many primes between 0 and $x$? $\pi(x)$: number of primes between 0 and $x$.
      Prime Number Theorem: $$\lim_{x \to \infty} \frac{\pi(x)}{x / \ln x} = 1.$$ Proved by Jacques Hadamard and Charles Jean de la Vallée-Poussin in 1896.
      Chebyshev (from 1848): $$\pi(x) \ge \frac{7}{8} \cdot \frac{x}{\ln x} = (1.262\ldots) \frac{x}{\lg x} > \frac{x}{\lg x}.$$
      If $y$ is drawn uniformly at random from $\{1, \ldots, x\}$, then $y$ is a prime with probability $\frac{\pi(x)}{x} > \frac{1}{\lg x}$.
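
These bounds can be checked numerically for small $x$. The sketch below counts primes with a sieve of Eratosthenes and compares $\pi(x)$ with $x / \ln x$ and with the $1 / \lg x$ bound on the probability that a uniform sample is prime. The helper name `prime_count` is hypothetical, and a finite check like this illustrates the statements without proving them.

```python
import math


def prime_count(x: int) -> int:
    """Return pi(x), the number of primes in {1, ..., x}, via a sieve."""
    if x < 2:
        return 0
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i * i.
            sieve[i * i::i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(sieve)


for x in (10**3, 10**4, 10**5, 10**6):
    pi_x = prime_count(x)
    ratio = pi_x / (x / math.log(x))           # tends to 1 as x grows
    bound_holds = pi_x / x > 1 / math.log2(x)  # Pr[y is prime] > 1 / lg x
    print(x, pi_x, round(ratio, 3), bound_holds)
```

Combined with the expected-sample count $x / \pi(x)$, the bound means that sampling a prime from $\{1, \ldots, x\}$ takes fewer than $\lg x$ samples in expectation.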
