ADVANCED ALGORITHMS
Lecture 16: hashing (fin), sampling
1
ANNOUNCEMENTS
➤ HW 3 is due tomorrow!
➤ Send project topics
➤ Send email to utah-algo-ta@googlegroups.com, with subject “Project topic”; one email per group; names and UIDs
2
LAST CLASS
3
➤ Hashing
➤ place n balls into n bins, independently and uniformly at random
➤ expected size of a bin = 1
➤ number of bins with exactly k balls ≈ n/(e · k!)
➤ max size of a bin = O(log n / log log n)
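A quick simulation bears these counts out (a Python sketch; the function name and parameters are illustrative, not from the lecture):

```python
import random
from collections import Counter

def balls_into_bins(n, seed=0):
    """Throw n balls into n bins independently and uniformly at random;
    return a map from bin index to its load."""
    rng = random.Random(seed)
    return Counter(rng.randrange(n) for _ in range(n))

n = 100_000
loads = balls_into_bins(n)
max_load = max(loads.values())
empty_frac = (n - len(loads)) / n   # fraction of bins with no ball
# Average load is exactly 1, yet the max load grows like log n / log log n,
# and roughly a 1/e fraction of the bins stay empty.
print(max_load, round(empty_frac, 3))
```

The empty fraction lands near 1/e ≈ 0.368, matching the n/(e · k!) count with k = 0.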
MAIN IDEAS
➤ Random variables as sums of “simple” random variables
➤ Linearity of expectation
➤ Markov’s inequality is usually not tight
➤ Union bound
4
Pr[X > t ⋅ 𝔼[X]] ≤ 1/t
Random variables do not deviate too much from their expectations.
Pr[E1 ∪ E2 ∪ … ∪ En] ≤ Pr[E1] + Pr[E2] + … + Pr[En]
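To see how loose Markov usually is, compare the 1/t bound against the empirical tail of a sum of coin flips (a Python sketch; the constants are illustrative):

```python
import random

rng = random.Random(1)
n, trials, t = 100, 20_000, 1.2
mean = n / 2                       # E[X] for X = #heads among n fair flips
samples = [sum(rng.random() < 0.5 for _ in range(n)) for _ in range(trials)]
empirical = sum(x > t * mean for x in samples) / trials
markov = 1 / t                     # Markov: Pr[X > t * E[X]] <= 1/t
print(empirical, markov)           # the empirical tail is far below 1/t
```

Here Markov only guarantees a tail probability below 0.833, while the true probability of exceeding 1.2 · 𝔼[X] is under a few percent.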
THOUGHTS
➤ When hashing n balls to n bins, outcomes are not “as uniform” as one would like
➤ Many empty bins (HW)
➤ What happens if there are more balls? hash m balls, where m ≫ n
➤ “Power of two choices” (Broder et al. 91)

6
Max load of a bin: ≈ log n / log log n with one choice; with two choices, max load = O(log log n), i.e., much better load balancing.
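The two-choice effect is easy to observe in simulation (a Python sketch; function and parameter names are illustrative):

```python
import random

def max_load(n, d, seed=0):
    """Place n balls into n bins; each ball picks d bins uniformly at
    random and goes to the least-loaded of them."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(d)]
        loads[min(candidates, key=loads.__getitem__)] += 1
    return max(loads)

n = 100_000
one, two = max_load(n, 1), max_load(n, 2)
print(one, two)   # d = 2 gives a dramatically smaller max load than d = 1
```

With d = 1 the max load grows like log n / log log n; with d = 2 it drops to O(log log n), typically a load of about 4 at this scale.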
ESTIMATION
7
Question: suppose each person votes R or B. Can we predict the winner without counting all votes?
Answer: sample n of the people, ask who they will vote for, and output the winner in the sample.

Want: winner in sample = winner in the full population.

Things that matter:
➤ sampling being truly uniform
➤ everyone answering truthfully
➤ the number n of samples
➤ the margin: how close the true vote counts are
➤ the confidence in our prediction
ANALYZING SAMPLING
8
Natural formalism:
➤ Choose n people uniformly at random.
➤ Let Xi (0/1) be the outcome of the i’th person
Each person has a choice in {0, 1}. Let N = size of the entire population, N0 = number that vote 0, and N1 = number that vote 1. In the sample, let m0 = number voting 0 and m1 = number voting 1.

Predicted winner: 0 if m0 > m1, and 1 otherwise.

Note that m1 = X1 + X2 + … + Xn, and m0 = n − m1.

What is 𝔼[Xi]? 𝔼[Xi] = N1/N. By linearity of expectation, 𝔼[m1] = n · N1/N and 𝔼[m0] = n · N0/N.

Estimation error = | fraction of votes i received in the sample − fraction of votes i received in the population |.

We just argued, from the expressions above: if m0 ≈ 𝔼[m0] and m1 ≈ 𝔼[m1], then sample winner = true winner.

Example 1: what if N0 = 0.4N and N1 = 0.6N? Our prediction is right iff m0 < n/2. Claim: 𝔼[m0] = 0.4n. So as long as |m0 − 𝔼[m0]| < 0.1n, the prediction is right.

Example 2: what if N0 = 0.49N and N1 = 0.51N? Then 𝔼[m0] = 0.49n, and we need |m0 − 𝔼[m0]| < 0.01n.

Goal: if we take n samples, what is the estimation error?
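The close-margin case can be simulated directly (a Python sketch; population size, sample size, and function names are illustrative, not from the lecture):

```python
import random

def poll(votes, n, seed):
    """Sample n voters uniformly at random (with replacement) and return
    the sample winner: 1 if m1 > n/2, else 0."""
    rng = random.Random(seed)
    m1 = sum(rng.choice(votes) for _ in range(n))
    return 1 if m1 > n / 2 else 0

# Example 2's close margin: N1 = 0.51N votes for 1, N0 = 0.49N for 0.
votes = [1] * 51_000 + [0] * 49_000
n = 2_000
correct = sum(poll(votes, n, seed=s) == 1 for s in range(100)) / 100
print(correct)   # fraction of polls whose sample winner = true winner
```

Even with n = 2000 samples, a 1% margin leaves a noticeable chance that the sample winner differs from the true winner, which is exactly why we need quantitative tail bounds.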
ANALYZING SAMPLING
9
➤ Error in estimation: || empirical mean − true expectation ||?
➤ “Confidence”
Natural formalism:
➤ Choose n people uniformly at random.
➤ Let Xi (0/1) be the outcome of the i’th person
Ideal guarantee: || empirical mean - true expectation || < 0.001 w.p. 0.999
MARKOV?
10
Want: Pr[ |m0 − 𝔼[m0]| ≥ 0.1n ] to be small.

Markov only bounds the upper tail: Pr[ m0 ≥ n/2 ] ≤ 𝔼[m0] / (n/2) = 0.4n / (0.5n) = 0.8.

So the bound on the failure probability is 0.8 no matter how large n is (even n = 10^6): Markov does not improve as we take more samples.
CAN WE USE THE “NUMBER OF SAMPLES”?
11
Variance of random variable X
var(X) = 𝔼[(X − μ)²], where μ = 𝔼[X].

Linearity: if X = X1 + X2 + … + Xn, then var(X) = var(X1) + var(X2) + … + var(Xn), if the Xi are independent.
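A numerical check of the additivity (a Python sketch with illustrative constants):

```python
import random
import statistics

rng = random.Random(2)
n, trials = 30, 50_000
# Each Xi is a fair 0/1 coin, so var(Xi) = 1/4 and, by independence,
# var(X1 + ... + Xn) = n/4 = 7.5 for n = 30.
sums = [sum(rng.randrange(2) for _ in range(n)) for _ in range(trials)]
print(statistics.variance(sums), n / 4)
```

The empirical variance of the sum comes out very close to n/4, as the additivity formula predicts.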
CHEBYCHEV’S INEQUALITY
12
If a random variable has low variance, then Markov can be improved.

Theorem: let X be a random variable whose variance is σ². Then

Pr[ |X − 𝔼[X]| ≥ t · σ ] ≤ 1/t²

Back to sampling: we wanted Pr[ |m0 − 𝔼[m0]| ≥ 0.1n ] to be small. Idea: compute var(m0).
VARIANCE OF AVERAGE
13
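The board derivation for this slide was not captured; a standard computation, assuming the n samples are independent 0/1 draws with p = N0/N (so that var(m0) = var(m1)), is:

```latex
\operatorname{var}\!\Big(\frac{m_0}{n}\Big)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{var}(X_i)
  = \frac{n\,p(1-p)}{n^2}
  = \frac{p(1-p)}{n}
  \;\le\; \frac{1}{4n},
  \qquad p = \frac{N_0}{N}.
```

In particular, var(m0) = n · p(1 − p) ≤ n/4, so the average concentrates as n grows.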
BOUND VIA CHEBYCHEV
14
WHAT IF WE TAKE HIGHER POWERS?
15
“Moment methods”
➤ Usually get improved bounds
𝔼[(X − 𝔼X)⁴] ≤ …
CHERNOFF BOUND
16
INTERPRETING THE CHERNOFF BOUND
17
INTERPRETING THE CHERNOFF BOUND
18
Useful heuristic:
➤ Sums of independent random variables don’t deviate much more than their standard deviation
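The heuristic is easy to check empirically: the worst deviation of a coin-flip sum from its mean, over many runs, stays within a few standard deviations (a Python sketch with illustrative constants):

```python
import math
import random

rng = random.Random(3)
n, trials = 10_000, 200
sd = math.sqrt(n) / 2    # standard deviation of a sum of n fair 0/1 coins
# Largest deviation from the mean n/2 over all trials, in units of sd.
worst = max(abs(sum(rng.randrange(2) for _ in range(n)) - n / 2) / sd
            for _ in range(trials))
print(worst)   # a small constant number of sds, nowhere near sqrt(n)
```

Chernoff makes this quantitative: the probability of deviating by t standard deviations decays exponentially in t², so even the worst of 200 trials is only a few sds out.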
MCDIARMID’S INEQUALITY
19
ESTIMATING THE SUM OF NUMBERS
20
ESTIMATING THE SUM OF NUMBERS
21