
ADVANCED ALGORITHMS Lecture 16: hashing (fin), sampling



  1. ADVANCED ALGORITHMS Lecture 16: hashing (fin), sampling

  2. ANNOUNCEMENTS
 ➤ HW 3 is due tomorrow!
 ➤ Send project topics
 ➤ Send email to utah-algo-ta@googlegroups.com, with subject “Project topic”; one email per group; names and UIDs

  3. LAST CLASS
 ➤ Hashing: place n balls into n bins, independently and uniformly at random
 ➤ expected size of a bin = 1
 ➤ number of bins with k balls ≈ n/k!
 ➤ max size of a bin = O(log n / log log n)
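
A quick simulation makes the slide’s claims concrete (a minimal sketch in plain Python, not from the deck; the helper name and parameters are mine). Note the empirical count of bins with k balls matches the slide’s n/k! up to the constant factor e⁻¹ from the Poisson approximation:

    import math
    import random
    from collections import Counter

    # Hypothetical demo (not from the lecture): throw n balls into n bins
    # independently and uniformly at random and report the slide's statistics.
    def balls_into_bins(n, seed=0):
        rng = random.Random(seed)
        return Counter(rng.randrange(n) for _ in range(n))  # bin -> load

    n = 100_000
    loads = balls_into_bins(n)
    print("max load   :", max(loads.values()))   # grows like log n / log log n
    print("empty bins :", n - len(loads))        # roughly n/e bins receive no ball
    print("bins with 3:", sum(1 for v in loads.values() if v == 3),
          "vs n*e^-1/3! =", round(n / (math.e * 6)))  # slide's n/k!, up to e^-1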

  4. MAIN IDEAS
 ➤ Random variables as sums of “simple” random variables
 ➤ Linearity of expectation
 ➤ Markov’s inequality is usually not tight
 ➤ Union bound
 Theorem. Let X be a non-negative random variable. For any t > 0, Pr[ X > t ⋅ 𝔼[X] ] ≤ 1/t
 (Handwritten note: random variables don’t deviate too much from their expectations.)

  5. MAIN IDEAS
 ➤ Random variables as sums of “simple” random variables
 ➤ Linearity of expectation
 ➤ Markov’s inequality is usually not tight
 ➤ Union bound
 Theorem. Let X be a non-negative random variable. For any t > 0, Pr[ X > t ⋅ 𝔼[X] ] ≤ 1/t
 Theorem. Suppose E₁, E₂, …, Eₙ are n events in a probability space. Then Pr[ E₁ ∪ E₂ ∪ … ∪ Eₙ ] ≤ Pr[E₁] + Pr[E₂] + … + Pr[Eₙ]

  6. THOUGHTS
 ➤ When hashing n balls to n bins, outcomes not “as uniform” as one likes: max load of a bin ≈ log n / log log n
 ➤ Many empty bins (HW)
 ➤ What happens if there are more balls? Hashing m balls, where m ≫ n, behaves much better
 ➤ “Power of two choices” (Broder et al. 91): balance load by giving each ball two candidate bins; the max load drops to O(log log n) (see the sketch below)
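
A sketch of the two-choice comparison (my own illustration in Python; function names and parameters are mine, not the paper’s):

    import random

    # One uniform choice per ball, versus the "power of two choices" rule of
    # placing each ball in the lighter of two uniformly chosen bins.
    def one_choice_max_load(n, rng):
        loads = [0] * n
        for _ in range(n):
            loads[rng.randrange(n)] += 1
        return max(loads)

    def two_choice_max_load(n, rng):
        loads = [0] * n
        for _ in range(n):
            i, j = rng.randrange(n), rng.randrange(n)
            loads[i if loads[i] <= loads[j] else j] += 1
        return max(loads)

    rng = random.Random(1)
    n = 100_000
    print("one choice :", one_choice_max_load(n, rng))   # ~ log n / log log n
    print("two choices:", two_choice_max_load(n, rng))   # ~ log log n, far smaller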

  7. ESTIMATION
 Question: suppose each person votes R or B. Can we predict the winner without counting all votes?
 Answer: sample m of the people, ask who they will vote for, and output the winner in the sample.
 (Handwritten note: want the winner in the sample to be the winner in the full population.)

  8. THINGS THAT MATTER
 ➤ sampling ought to be truly uniform
 ➤ everyone answering truthfully
 ➤ n, the number of samples
 ➤ the margin: how close the true vote counts are
 ➤ the confidence in our prediction

  9. ANALYZING SAMPLING
 Natural formalism:
 ➤ Each person has a choice of 0 or 1; in the entire population of N people, N₀ vote 0 and N₁ vote 1
 ➤ Choose n people uniformly at random
 ➤ Let Xᵢ (0/1) be the outcome of the i’th person, so Pr[Xᵢ = 0] = N₀/N and Pr[Xᵢ = 1] = N₁/N
 ➤ Let n₀ and n₁ be the number of people in the sample voting 0 and 1
 ➤ Predicted winner: 0 if n₀ > n/2, 1 otherwise

  10.
 ➤ Let X = X₁ + X₂ + … + Xₙ; then n₁ = X and n₀ = n − X
 ➤ What is 𝔼[Xᵢ]? What are 𝔼[n₁] and 𝔼[n₀]?
 ➤ Argue that 𝔼[n₁]/n = N₁/N and 𝔼[n₀]/n = N₀/N: in expectation, the fraction of votes that 1 receives in the sample equals the fraction of votes that 1 receives in the population
 ➤ Estimation error: how far the sample fraction n₁/n is from the population fraction N₁/N
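
The argument is one application of linearity of expectation (slide 4): 𝔼[Xᵢ] = Pr[Xᵢ = 1] = N₁/N, so 𝔼[n₁] = 𝔼[X₁] + … + 𝔼[Xₙ] = n ⋅ N₁/N, and likewise 𝔼[n₀] = n ⋅ N₀/N.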

  11.
 ➤ We just argued 𝔼[n₁] = n ⋅ N₁/N and 𝔼[n₀] = n ⋅ N₀/N from the corresponding expressions
 ➤ If 0 is the true winner, is 0 also the winner in the sample?
 ➤ What if N₀ = 0.6N and N₁ = 0.4N? Then 𝔼[n₀] = 0.6n, and our prediction is right iff n₀ > n/2, i.e. iff n₀ > 𝔼[n₀] − 0.1n
 ➤ What if N₀ = 0.51N and N₁ = 0.49N? Then our prediction is right iff n₀ > 𝔼[n₀] − 0.01n
 ➤ Goal: if we take n samples, what is the probability that n₀ deviates from 𝔼[n₀] by more than 0.01n?
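
A small experiment matching this setup (my own Python sketch; sampling each voter i.i.d. with Pr[vote 0] = p0 stands in for uniform sampling from a large population):

    import random

    def poll_failure_rate(p0, n, trials=1000, seed=0):
        rng = random.Random(seed)
        failures = 0
        for _ in range(trials):
            n0 = sum(rng.random() < p0 for _ in range(n))
            if n0 <= n / 2:   # sample winner disagrees with the true winner 0
                failures += 1
        return failures / trials

    # With the close 0.51 vs 0.49 margin, the failure probability drops as n grows:
    for n in (100, 1_000, 10_000):
        print(n, poll_failure_rate(0.51, n))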

  12. ANALYZING SAMPLING
 Natural formalism:
 ➤ Choose n people uniformly at random
 ➤ Let Xᵢ (0/1) be the outcome of the i’th person
 ➤ Error in estimation: ||empirical mean − true expectation||?
 ➤ “Confidence”
 Ideal guarantee: ||empirical mean − true expectation|| < 0.001 w.p. 0.999

  13. MARKOV?
 ➤ We want Pr[ n₀ ≥ 𝔼[n₀] + 0.01n ] to be small
 ➤ Markov gives only Pr[ n₀ ≥ 𝔼[n₀] + 0.01n ] ≤ 𝔼[n₀] / (𝔼[n₀] + 0.01n)
 ➤ This failure probability is a constant independent of n: it does not improve even with n = 10⁶ samples

  14. CAN WE USE THE “NUMBER OF SAMPLES”?
 ➤ Variance of a random variable X: Var[X] = 𝔼[(X − μ)²], where μ = 𝔼[X]
 ➤ Key fact: Var[X₁ + … + Xₙ] = Var[X₁] + … + Var[Xₙ] if the Xᵢ are independent
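
Why independence matters (a standard one-step check, not on the slide): writing μ_X = 𝔼[X] and μ_Y = 𝔼[Y], Var[X + Y] = 𝔼[(X − μ_X)²] + 2 ⋅ 𝔼[(X − μ_X)(Y − μ_Y)] + 𝔼[(Y − μ_Y)²], and for independent X, Y the cross term factors as 𝔼[X − μ_X] ⋅ 𝔼[Y − μ_Y] = 0.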

  15. CHEBYCHEV’S INEQUALITY
 ➤ If a random variable has low variance, then Markov can be improved
 Theorem. Let X be a random variable whose variance is σ². Then Pr[ |X − 𝔼[X]| > t ⋅ σ ] ≤ 1/t²
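
Chebychev is Markov applied to the non-negative variable Y = (X − 𝔼[X])², which has 𝔼[Y] = σ²: Pr[ |X − 𝔼[X]| > t ⋅ σ ] = Pr[ Y > t² ⋅ σ² ] ≤ 𝔼[Y]/(t² ⋅ σ²) = 1/t².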

  16. BACK TO SAMPLING
 ➤ We wanted: Pr[ |n₀ − 𝔼[n₀]| ≥ 0.01n ] to be small
 ➤ Idea: compute Var[n₀] and apply Chebychev

  17. VARIANCE OF AVERAGE
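
This slide’s body is blank in the transcript; the standard computation it presumably covers: each Xᵢ is a 0/1 variable with p = Pr[Xᵢ = 0] = N₀/N, so Var[Xᵢ] = p(1 − p) ≤ 1/4, and (treating the samples as independent) Var[n₀] = n ⋅ p(1 − p) ≤ n/4; equivalently, the average n₀/n has Var[n₀/n] = p(1 − p)/n ≤ 1/(4n).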

  18. BOUND VIA CHEBYCHEV
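
Again blank in the transcript; presumably: Pr[ |n₀ − 𝔼[n₀]| ≥ 0.01n ] ≤ Var[n₀]/(0.01n)² ≤ (n/4)/(10⁻⁴ ⋅ n²) = 2500/n. The failure probability now shrinks with the number of samples; n = 2.5 × 10⁶ already pushes it below 10⁻³.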

  19. WHAT IF WE TAKE HIGHER POWERS?
 ➤ 𝔼[(X − 𝔼[X])⁴] ≤ …
 ➤ “Moment methods”
 ➤ Usually get improved bounds
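
The general recipe (standard): apply Markov to the non-negative variable (X − 𝔼[X])⁴ to get Pr[ |X − 𝔼[X]| > t ] ≤ 𝔼[(X − 𝔼[X])⁴]/t⁴; higher even moments give faster-decaying tail bounds whenever those moments can be controlled.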

  20. CHERNOFF BOUND
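
The slide body is blank in the transcript; one standard form of the bound: if X₁, …, Xₙ are independent 0/1 random variables, X = X₁ + … + Xₙ and μ = 𝔼[X], then for any 0 < ε < 1, Pr[ |X − μ| ≥ ε ⋅ μ ] ≤ 2 ⋅ exp(−ε²μ/3): an exponentially small failure probability, versus Chebychev’s polynomial one.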

  21. INTERPRETING THE CHERNOFF BOUND

  22. INTERPRETING THE CHERNOFF BOUND
 Useful heuristic:
 ➤ Sums of independent random variables don’t deviate much more than the variance
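
A small Python illustration of the heuristic (my own, with the deviation measured on the scale σ = √Var): a Binomial(n, p) sum rarely strays more than a few σ from its mean.

    import random
    import statistics

    rng = random.Random(0)
    n, p, trials = 10_000, 0.5, 200
    sigma = (n * p * (1 - p)) ** 0.5
    devs = [abs(sum(rng.random() < p for _ in range(n)) - n * p) / sigma
            for _ in range(trials)]
    print("typical |deviation| / sigma:", round(statistics.mean(devs), 2))  # about 0.8
    print("largest |deviation| / sigma:", round(max(devs), 2))              # a few, not more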

  23. MCDIARMID’S INEQUALITY
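
Blank in the transcript; the standard statement: if X₁, …, Xₙ are independent and f satisfies the bounded-differences condition (changing the i’th coordinate alone changes f by at most cᵢ), then Pr[ |f(X₁, …, Xₙ) − 𝔼[f]| ≥ t ] ≤ 2 ⋅ exp(−2t²/(c₁² + … + cₙ²)). This gives Chernoff-style concentration for functions that are not plain sums.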

  24. ESTIMATING THE SUM OF NUMBERS

  25. ESTIMATING THE SUM OF NUMBERS
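
The slide bodies are blank in the transcript; a minimal Python sketch of the natural estimator, whose analysis uses the tools above (the estimate is unbiased, and Chebychev/Chernoff control its error when the values are of comparable magnitude):

    import random

    # Sample n positions uniformly with replacement and scale the sample
    # mean by N; the result estimates sum(values).
    def estimate_sum(values, n, seed=0):
        rng = random.Random(seed)
        N = len(values)
        return N * sum(values[rng.randrange(N)] for _ in range(n)) / n

    rng = random.Random(42)
    values = [rng.random() for _ in range(1_000_000)]
    print("true sum:", round(sum(values)))
    print("estimate:", round(estimate_sum(values, n=10_000)))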
