

  1. Probability and Statistics for Computer Science. “The weak law of large numbers gives us a very valuable way of thinking about expectations.” ---Prof. Forsythe. Credit: wikipedia. Hongye Liu, Teaching Assistant Prof, CS361, UIUC, 09.22.2020

  2. Last time ✺ Random Variable ✺ Expected value ✺ Variance & covariance

  3. Last time

  4. Content

  5. Content ✺ Random Variable ✺ Review with questions ✺ The weak law of large numbers ✺ Simulation & example of airline overbooking

  6. Expected value ✺ The expected value (or expectation) of a random variable X is E[X] = Σ_x x P(x) ✺ The expected value is a weighted sum of all the values X can take
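A direct computation of this weighted sum, sketched on a made-up three-point PMF (the distribution is illustrative, not from the slides):

```python
def expected_value(pmf):
    """E[X] = sum over x of x * P(x), for a finite PMF {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Made-up distribution for illustration: X takes values in {-1, 0, 1}.
pmf = {-1: 0.25, 0: 0.5, 1: 0.25}
print(expected_value(pmf))   # 0.0
```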

  7. Linearity of Expectation

  8. Expected value of a function of X

  9. Q: What is E[E[X]]? A. E[X] B. 0 C. Can’t be sure

  10. Probability distribution ✺ Given the random variable X with P(X = −1) = P(X = 1) = 1/2, what is E[2|X| + 1]? A. 0 B. 1 C. 2 D. 3 E. 5

  11. Probability distribution ✺ Given the random variable S, the sum of two fair 4-sided dice, whose range is {2,3,4,5,6,7,8} and whose probability distribution p(s) rises from p(2) = 1/16 to p(5) = 4/16 and falls back to p(8) = 1/16, what is E[S]? A. 4 B. 5 C. 6

  12. A neater expression for variance ✺ Variance of Random Variable X is defined as: var[X] = E[(X − E[X])^2] ✺ It’s the same as: var[X] = E[X^2] − (E[X])^2
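A quick numerical check that the two forms agree, on a made-up three-point PMF (an illustrative distribution, not from the slides):

```python
# Check var[X] = E[(X - E[X])^2] = E[X^2] - (E[X])^2 on a small PMF.
pmf = {0: 0.5, 1: 0.3, 2: 0.2}   # made-up distribution

def E(f, pmf):
    """Expectation of f(X) under a finite PMF {value: probability}."""
    return sum(f(x) * p for x, p in pmf.items())

mu = E(lambda x: x, pmf)
var_def = E(lambda x: (x - mu) ** 2, pmf)     # definition
var_alt = E(lambda x: x ** 2, pmf) - mu ** 2  # neater expression
print(mu, var_def, var_alt)   # both variance forms give ~0.61
```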

  13. Probability distribution and cumulative distribution ✺ Given the random variable X with P(X = −1) = P(X = 1) = 1/2, what is var[2|X| + 1]? A. 0 B. 1 C. 2 D. 3 E. -1

  14. Probability distribution ✺ Given the random variable X with P(X = −1) = P(X = 1) = 1/2, what is var[2|X| + 1]? Let Y = 2|X| + 1; since |X| = 1 with probability 1, P(Y = 3) = 1

  15. Probability distribution ✺ Given the random variable S, the sum of two fair 4-sided dice, whose range is {2,3,4,5,6,7,8} and whose probability distribution p(s) rises from p(2) = 1/16 to p(5) = 4/16 and falls back to p(8) = 1/16, what is var[S]?

  16. Content ✺ Random Variable ✺ Review with questions ✺ The weak law of large numbers

  17. Towards the weak law of large numbers ✺ The weak law says that if we repeat a random experiment many times, the average of the observations will “converge” to the expected value ✺ For example, if you repeat the profit example, the average earning will “converge” to E[X] = 20p − 10 ✺ The weak law justifies using simulations (instead of calculation) to estimate the expected values of random variables
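A minimal simulation sketch of this convergence, assuming one consistent reading of the profit example (gain 10 with probability p, lose 10 otherwise, so E[X] = 10p − 10(1 − p) = 20p − 10; the value p = 0.7 is assumed for illustration):

```python
import random

random.seed(0)
# Assumed reading of the profit example: gain 10 w.p. p, lose 10 otherwise.
p = 0.7
expected = 20 * p - 10   # E[X] = 4.0

N = 100_000
avg = sum(10 if random.random() < p else -10 for _ in range(N)) / N
print(avg)   # the sample average should be close to 4.0
```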

  18. Markov’s inequality ✺ For any random variable X that only takes values x ≥ 0 and constant a > 0: P(X ≥ a) ≤ E[X]/a ✺ For example, if a = 10 E[X]: P(X ≥ 10 E[X]) ≤ E[X]/(10 E[X]) = 0.1
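An empirical check of the bound; the nonnegative variable below (the sum of two fair dice shifted down by 2) is an illustrative choice, not from the slides:

```python
import random

random.seed(1)
# X = (sum of two fair dice) - 2, so X >= 0 and E[X] = 5.
N = 200_000
samples = [random.randint(1, 6) + random.randint(1, 6) - 2 for _ in range(N)]
mean = sum(samples) / N
a = 8
frac = sum(x >= a for x in samples) / N   # empirical P(X >= a)
print(frac, "<=", mean / a)               # Markov bound E[X]/a
```

The bound is loose here: the true P(X ≥ 8) is 6/36, well under the bound of roughly 5/8.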

  19. Proof of Markov’s inequality ✺ For X taking only values x ≥ 0 and a > 0: E[X] = Σ_x x P(x) ≥ Σ_{x ≥ a} x P(x) ≥ a Σ_{x ≥ a} P(x) = a P(X ≥ a), so P(X ≥ a) ≤ E[X]/a

  20. Chebyshev’s inequality ✺ For any random variable X and constant a > 0: P(|X − E[X]| ≥ a) ≤ var[X]/a^2 ✺ If we let a = kσ where σ = std[X]: P(|X − E[X]| ≥ kσ) ≤ 1/k^2 ✺ In words, the probability that X is at least k standard deviations away from the mean is small
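An empirical check of the k-standard-deviation form of the bound; the Uniform(0,1) population is an illustrative choice (its σ is near 1/√12 ≈ 0.289):

```python
import random, statistics

random.seed(2)
N = 200_000
xs = [random.uniform(0, 1) for _ in range(N)]   # X ~ Uniform(0,1)
mu = statistics.fmean(xs)
sigma = statistics.pstdev(xs)

# Empirical P(|X - mu| >= k*sigma) for a few k, vs the 1/k^2 bound.
fracs = {k: sum(abs(x - mu) >= k * sigma for x in xs) / N for k in (1.5, 2, 3)}
for k, frac in fracs.items():
    print(k, frac, "<=", 1 / k**2)
```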

  21. Proof of Chebyshev’s inequality ✺ Given Markov’s inequality, for a > 0 and X taking only values x ≥ 0: P(X ≥ a) ≤ E[X]/a ✺ We can rewrite it, for any random variable U and any w > 0: P(|U| ≥ w) ≤ E[|U|]/w

  22. Proof of Chebyshev’s inequality ✺ If U = (X − E[X])^2, then P(|U| ≥ w) ≤ E[|U|]/w = E[U]/w

  23. Proof of Chebyshev’s inequality ✺ Apply Markov’s inequality to U = (X − E[X])^2: P(|U| ≥ w) ≤ E[|U|]/w = E[U]/w = var[X]/w ✺ Substitute U = (X − E[X])^2 and w = a^2 (assume a > 0): P((X − E[X])^2 ≥ a^2) ≤ var[X]/a^2 ⇒ P(|X − E[X]| ≥ a) ≤ var[X]/a^2

  24. Now we are closer to the law of large numbers

  25. Sample mean and IID samples ✺ We define the sample mean X̄ to be the average of N random variables X_1, …, X_N ✺ If X_1, …, X_N are independent and have identical probability function P(x), then the numbers randomly generated from them are called IID samples ✺ The sample mean is a random variable

  26. Sample mean and IID samples ✺ Assume we have a set of IID samples from N random variables X_1, …, X_N that have probability function P(x) ✺ We use X̄ to denote the sample mean of these IID samples: X̄ = (Σ_{i=1}^{N} X_i) / N

  27. Expected value of sample mean of IID random variables ✺ By linearity of expected value: E[X̄] = E[(Σ_{i=1}^{N} X_i) / N] = (1/N) Σ_{i=1}^{N} E[X_i]

  28. Expected value of sample mean of IID random variables ✺ By linearity of expected value: E[X̄] = E[(Σ_{i=1}^{N} X_i) / N] = (1/N) Σ_{i=1}^{N} E[X_i] ✺ Given each X_i has identical P(x): E[X̄] = (1/N) Σ_{i=1}^{N} E[X] = E[X]

  29. Variance of sample mean of IID random variables ✺ By the scaling property of variance: var[X̄] = var[(1/N) Σ_{i=1}^{N} X_i] = (1/N^2) var[Σ_{i=1}^{N} X_i]

  30. Variance of sample mean of IID random variables ✺ By the scaling property of variance: var[X̄] = var[(1/N) Σ_{i=1}^{N} X_i] = (1/N^2) var[Σ_{i=1}^{N} X_i] ✺ And by independence of these IID random variables: var[X̄] = (1/N^2) Σ_{i=1}^{N} var[X_i]

  31. Variance of sample mean of IID random variables ✺ By the scaling property of variance: var[X̄] = var[(1/N) Σ_{i=1}^{N} X_i] = (1/N^2) var[Σ_{i=1}^{N} X_i] ✺ And by independence of these IID random variables: var[X̄] = (1/N^2) Σ_{i=1}^{N} var[X_i] ✺ Given each X_i has identical P(x), var[X_i] = var[X]: var[X̄] = (1/N^2) Σ_{i=1}^{N} var[X] = var[X]/N

  32. Expected value and variance of sample mean of IID random variables ✺ The expected value of the sample mean is the same as the expected value of the distribution: E[X̄] = E[X] ✺ The variance of the sample mean is the distribution’s variance divided by the sample size N: var[X̄] = var[X]/N
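Both facts can be checked by simulation; the fair six-sided die below (E[X] = 3.5, var[X] = 35/12) is an illustrative population, not from the slides:

```python
import random, statistics

random.seed(3)
N = 10           # sample size behind each sample mean
trials = 50_000  # number of simulated sample means

# Each entry of `means` is one draw of the random variable Xbar.
means = [statistics.fmean(random.randint(1, 6) for _ in range(N))
         for _ in range(trials)]
print(statistics.fmean(means))      # ~ E[X] = 3.5
print(statistics.pvariance(means))  # ~ var[X]/N = (35/12)/10 ~ 0.29
```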

  33. Weak law of large numbers ✺ Given a random variable X with finite variance, probability distribution function P(x), and the sample mean X̄ of size N ✺ For any positive number ε > 0: lim_{N→∞} P(|X̄ − E[X]| ≥ ε) = 0 ✺ That is: when the sample size is very large, the mean of the IID samples is, with high probability, very close to the expected value of the population

  34. Proof of Weak law of large numbers ✺ Apply Chebyshev’s inequality: P(|X̄ − E[X]| ≥ ε) ≤ var[X̄]/ε^2

  35. Proof of Weak law of large numbers ✺ Apply Chebyshev’s inequality: P(|X̄ − E[X]| ≥ ε) ≤ var[X̄]/ε^2 ✺ Substitute var[X̄] = var[X]/N and E[X̄] = E[X]

  36. Proof of Weak law of large numbers ✺ Apply Chebyshev’s inequality: P(|X̄ − E[X]| ≥ ε) ≤ var[X̄]/ε^2 ✺ Substitute var[X̄] = var[X]/N and E[X̄] = E[X]: P(|X̄ − E[X]| ≥ ε) ≤ var[X]/(N ε^2)

  37. Proof of Weak law of large numbers ✺ Apply Chebyshev’s inequality: P(|X̄ − E[X]| ≥ ε) ≤ var[X̄]/ε^2 ✺ Substitute var[X̄] = var[X]/N and E[X̄] = E[X]: P(|X̄ − E[X]| ≥ ε) ≤ var[X]/(N ε^2) ✺ As N → ∞, var[X]/(N ε^2) → 0

  38. Proof of Weak law of large numbers ✺ Apply Chebyshev’s inequality: P(|X̄ − E[X]| ≥ ε) ≤ var[X̄]/ε^2 ✺ Substitute var[X̄] = var[X]/N and E[X̄] = E[X]: P(|X̄ − E[X]| ≥ ε) ≤ var[X]/(N ε^2) ✺ As N → ∞, var[X]/(N ε^2) → 0 ✺ Therefore lim_{N→∞} P(|X̄ − E[X]| ≥ ε) = 0

  39. Applications of the Weak law of large numbers

  40. Applications of the Weak law of large numbers ✺ The law of large numbers justifies using simulations (instead of calculation) to estimate the expected values of random variables: lim_{N→∞} P(|X̄ − E[X]| ≥ ε) = 0 ✺ The law of large numbers also justifies using the histogram of large random samples to approximate the probability distribution function P(x); see the proof on pg. 353 of the textbook by DeGroot et al.

  41. Histogram of large random IID samples approximates the probability distribution ✺ The law of large numbers justifies using histograms to approximate the probability distribution. Given N IID random variables X_1, …, X_N, define the indicator Y_i = 1 if c_1 ≤ X_i < c_2 and Y_i = 0 otherwise ✺ According to the law of large numbers: Ȳ = (Σ_{i=1}^{N} Y_i) / N → E[Y_i] as N → ∞ ✺ As we know, for an indicator function: E[Y_i] = P(c_1 ≤ X_i < c_2) = P(c_1 ≤ X < c_2)
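A sketch of this indicator argument for one histogram bin; the Uniform(0,1) population and the bin edges c1 = 0.2, c2 = 0.5 are illustrative choices:

```python
import random

random.seed(4)
# Y_i = 1 when X_i lands in [c1, c2); the fraction of samples in the
# bin (Ybar) should approach P(c1 <= X < c2) as N grows.
c1, c2 = 0.2, 0.5
N = 200_000
xs = [random.random() for _ in range(N)]   # X ~ Uniform(0,1)
ybar = sum(c1 <= x < c2 for x in xs) / N
print(ybar)   # ~ P(0.2 <= X < 0.5) = 0.3
```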

  42. Simulation of the sum of two dice ✺ http://www.randomservices.org/random/apps/DiceExperiment.html
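The same experiment as the linked applet can be sketched in a few lines; the empirical frequencies should approach the exact probabilities (6 − |s − 7|)/36:

```python
import random
from collections import Counter

random.seed(5)
# Simulate the sum of two fair six-sided dice.
N = 100_000
counts = Counter(random.randint(1, 6) + random.randint(1, 6)
                 for _ in range(N))
for s in range(2, 13):
    print(s, counts[s] / N)   # exact probability is (6 - abs(s - 7)) / 36
```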

  43. Probability using the property of independence: Airline overbooking ✺ An airline has a flight with s seats. They always sell t (t > s) tickets for this flight. If ticket holders show up independently with probability p, what is the probability that the flight is overbooked? P(overbooked) = Σ_{u=s+1}^{t} C(t, u) p^u (1 − p)^{t−u}
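This binomial tail can be evaluated exactly; a sketch using Python's math.comb (the parameter values in the example call are arbitrary):

```python
from math import comb

def p_overbooked(s, t, p):
    """P(overbooked) = sum_{u=s+1}^{t} C(t, u) * p**u * (1-p)**(t-u)."""
    return sum(comb(t, u) * p**u * (1 - p)**(t - u)
               for u in range(s + 1, t + 1))

print(p_overbooked(7, 12, 0.8))   # P(more than 7 of 12 show up)
```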

  44. Simulation of airline overbooking ✺ An airline has a flight with 7 seats. They always sell 12 tickets for this flight. If ticket holders show up independently with probability p, estimate the following values ✺ Expected value of the number of ticket holders who show up ✺ Probability that the flight is overbooked ✺ Expected value of the number of ticket holders who can’t fly because the flight is overbooked
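A Monte Carlo sketch of the three estimates with s = 7 and t = 12; the show-up probability p = 0.8 is an assumed value, since the slide leaves p unspecified:

```python
import random
from statistics import fmean

random.seed(6)
s, t, p = 7, 12, 0.8   # seats, tickets sold; p = 0.8 is assumed
N = 100_000            # number of simulated flights

# Number of ticket holders who show up on each simulated flight.
shows = [sum(random.random() < p for _ in range(t)) for _ in range(N)]

mean_shows = fmean(shows)                      # E[#show-ups], ~ t*p = 9.6
p_over = sum(u > s for u in shows) / N         # P(flight overbooked)
bumped = fmean(max(u - s, 0) for u in shows)   # E[#passengers bumped]
print(mean_shows, p_over, bumped)
```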
