
CS70: Jean Walrand: Lecture 27. Expectation; Conditional Expectation; B(n, p); G(p)



  1. CS70: Jean Walrand: Lecture 27. Expectation; Conditional Expectation; B(n, p); G(p) 1. Review of Expectation 2. Linearity of Expectation 3. Conditional Expectation 4. Independence of RVs 5. Applications 6. Important Distributions and Expectations.

  2. Expectation. Recall: X : Ω → ℜ; Pr[X = a] := Pr[X^{-1}(a)]. Definition: The expectation of a random variable X is E[X] = ∑_a a × Pr[X = a]. Indicator: Let A be an event. The random variable X defined by X(ω) = 1 if ω ∈ A, and X(ω) = 0 if ω ∉ A, is called the indicator of the event A. Note that Pr[X = 1] = Pr[A] and Pr[X = 0] = 1 − Pr[A]. Hence, E[X] = 1 × Pr[X = 1] + 0 × Pr[X = 0] = Pr[A]. The random variable X is sometimes written as 1{ω ∈ A} or 1_A(ω).
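
As a sanity check of these definitions, here is a small Python sketch (the fair die and the event "outcome is even" are made-up examples, not from the lecture) that computes E[X] from a finite distribution and confirms that the expectation of an indicator equals the probability of its event.

    from fractions import Fraction

    # A fair six-sided die: Pr[X = a] = 1/6 for a = 1, ..., 6.
    pmf = {a: Fraction(1, 6) for a in range(1, 7)}

    # E[X] = sum over a of a * Pr[X = a].
    E_X = sum(a * p for a, p in pmf.items())
    print(E_X)                                            # 7/2

    # Indicator of the event A = {outcome is even}: E[1_A] should equal Pr[A].
    E_indicator = sum((1 if a % 2 == 0 else 0) * p for a, p in pmf.items())
    Pr_A = sum(p for a, p in pmf.items() if a % 2 == 0)
    print(E_indicator == Pr_A)                            # True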

  3. Linearity of Expectation. Theorem: E[X] = ∑_ω X(ω) × Pr[ω]. Theorem: Expectation is linear: E[a_1 X_1 + ··· + a_n X_n] = a_1 E[X_1] + ··· + a_n E[X_n]. Proof: E[a_1 X_1 + ··· + a_n X_n] = ∑_ω (a_1 X_1 + ··· + a_n X_n)(ω) Pr[ω] = ∑_ω (a_1 X_1(ω) + ··· + a_n X_n(ω)) Pr[ω] = a_1 ∑_ω X_1(ω) Pr[ω] + ··· + a_n ∑_ω X_n(ω) Pr[ω] = a_1 E[X_1] + ··· + a_n E[X_n].
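
Since the proof just regroups a finite sum, linearity can be checked mechanically on a small sample space. Below is an illustrative Python sketch (the sample space, random variables, and coefficients are invented for the example, not part of the lecture).

    from fractions import Fraction
    from itertools import product

    # Sample space: two rolls of a fair die; each outcome has probability 1/36.
    omega = list(product(range(1, 7), repeat=2))
    pr = {w: Fraction(1, 36) for w in omega}

    X1 = lambda w: w[0]             # pips on the first roll
    X2 = lambda w: w[0] + w[1]      # total pips on both rolls

    def E(X):
        # E[X] = sum over omega of X(w) * Pr[w]
        return sum(X(w) * pr[w] for w in omega)

    a1, a2 = 3, -2
    lhs = E(lambda w: a1 * X1(w) + a2 * X2(w))
    rhs = a1 * E(X1) + a2 * E(X2)
    print(lhs, rhs, lhs == rhs)     # -7/2 -7/2 True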

  4. Using Linearity - 1: Pips on dice. Roll a die n times. X_m = number of pips on roll m. X = X_1 + ··· + X_n = total number of pips in n rolls. E[X] = E[X_1 + ··· + X_n] = E[X_1] + ··· + E[X_n], by linearity, = n E[X_1], because the X_m have the same distribution. Now, E[X_1] = 1 × (1/6) + ··· + 6 × (1/6) = (6 × 7)/2 × (1/6) = 7/2. Hence, E[X] = 7n/2.
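
A brief simulation check of the 7n/2 formula (a sketch; the choices n = 10 and 200,000 trials are arbitrary):

    import random

    n, trials = 10, 200_000
    total = sum(sum(random.randint(1, 6) for _ in range(n)) for _ in range(trials))
    print(total / trials, 7 * n / 2)    # empirical mean of X vs. 35.0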

  5. Using Linearity - 2: Fixed point. Hand out assignments at random to n students. X = number of students that get their own assignment back. X = X_1 + ··· + X_n where X_m = 1{student m gets his/her own assignment back}. One has E[X] = E[X_1 + ··· + X_n] = E[X_1] + ··· + E[X_n], by linearity, = n E[X_1], because all the X_m have the same distribution, = n Pr[X_1 = 1], because X_1 is an indicator, = n(1/n), because student 1 is equally likely to get any one of the n assignments, = 1. Note that linearity holds even though the X_m are not independent (whatever that means).
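
The conclusion that E[X] = 1 no matter how many students there are is easy to test by simulation; here is a sketch (n = 20 students and the trial count are arbitrary choices):

    import random

    n, trials = 20, 100_000
    total_fixed = 0
    for _ in range(trials):
        perm = list(range(n))
        random.shuffle(perm)        # perm[i] = assignment handed back to student i
        total_fixed += sum(1 for i in range(n) if perm[i] == i)
    print(total_fixed / trials)     # should be close to 1.0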

  6. Using Linearity - 3: Binomial Distribution. Flip n coins with heads probability p. X = number of heads. Binomial distribution: Pr[X = i], for each i: Pr[X = i] = (n choose i) p^i (1 − p)^{n−i}. E[X] = ∑_i i × Pr[X = i] = ∑_i i × (n choose i) p^i (1 − p)^{n−i}. Uh oh. ... Or ... a better approach: Let X_i = 1 if the i-th flip is heads, and X_i = 0 otherwise. Then E[X_i] = 1 × Pr["heads"] + 0 × Pr["tails"] = p. Moreover, X = X_1 + ··· + X_n and E[X] = E[X_1] + E[X_2] + ··· + E[X_n] = n × E[X_i] = np.
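
The "uh oh" sum can of course be evaluated numerically, which gives a quick check that the two routes agree; a sketch with arbitrary example values of n and p:

    from math import comb

    n, p = 12, 0.3
    # Direct sum: E[X] = sum over i of i * (n choose i) * p^i * (1 - p)^(n - i).
    direct = sum(i * comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1))
    print(direct, n * p)    # both 3.6, up to floating-point error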

  7. Conditional Expectation. How do observations affect expectation? Example 1: Roll one die. You are told that the outcome X is at least 3. What is the expected value of X given that information? Given that X ≥ 3, we know that X is uniform in {3, 4, 5, 6}. Hence, the mean value is 4.5. We write E[X | X ≥ 3] = 4.5. Similarly, we have E[X | X < 3] = 1.5 because, given that X < 3, X is uniform in {1, 2}. Note that E[X | X ≥ 3] × Pr[X ≥ 3] + E[X | X < 3] × Pr[X < 3] = 4.5 × (4/6) + 1.5 × (2/6) = 3 + 0.5 = 3.5 = E[X]. Is this a coincidence?
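
The numbers in Example 1 can be recomputed straight from the definition E[X | A] = ∑_a a × Pr[X = a | A]; here is a small sketch (the helper cond_expect is ad hoc, not from the lecture):

    from fractions import Fraction

    pmf = {a: Fraction(1, 6) for a in range(1, 7)}      # one fair die

    def cond_expect(pmf, event):
        pr_A = sum(p for a, p in pmf.items() if event(a))
        return sum(a * p / pr_A for a, p in pmf.items() if event(a))

    E_hi = cond_expect(pmf, lambda a: a >= 3)           # 9/2
    E_lo = cond_expect(pmf, lambda a: a < 3)            # 3/2
    pr_hi = Fraction(4, 6)
    print(E_hi, E_lo, E_hi * pr_hi + E_lo * (1 - pr_hi))    # 9/2 3/2 7/2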

  8. Conditional Expectation. How do observations affect expectation? Example 2: Roll two dice. You are told that the total number X of pips is at least 8. What is the expected value of X given that information? Recall the distribution of X: Pr[X = 2] = Pr[X = 12] = 1/36, Pr[X = 3] = Pr[X = 11] = 2/36, .... Given that X ≥ 8, the distribution of X becomes {(8, 5/15), (9, 4/15), (10, 3/15), (11, 2/15), (12, 1/15)}. For instance, Pr[X = 8 | X ≥ 8] = Pr[X = 8] / Pr[X ≥ 8] = (5/36) / (15/36) = 5/15. Hence, E[X | X ≥ 8] = 8 × (5/15) + 9 × (4/15) + 10 × (3/15) + 11 × (2/15) + 12 × (1/15) = 140/15 ≈ 9.33.
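
The same computation for Example 2, building the distribution of the total from the 36 equally likely outcomes (an illustrative sketch):

    from fractions import Fraction
    from itertools import product

    pmf = {}
    for d1, d2 in product(range(1, 7), repeat=2):
        pmf[d1 + d2] = pmf.get(d1 + d2, 0) + Fraction(1, 36)

    pr_A = sum(p for s, p in pmf.items() if s >= 8)             # 15/36
    cond = {s: p / pr_A for s, p in pmf.items() if s >= 8}      # {8: 1/3, 9: 4/15, ...}
    print(sum(s * p for s, p in cond.items()))                  # 28/3, i.e. 140/15 ≈ 9.33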

  9. Conditional Expectation. How do observations affect expectation? Example 2, continued: Roll two dice. You are told that the total number X of pips is less than 8. What is the expected value of X given that information? We find that E[X | X < 8] = 2 × (1/21) + 3 × (2/21) + ··· + 7 × (6/21) = 112/21 ≈ 5.33. Observe that E[X | X ≥ 8] Pr[X ≥ 8] + E[X | X < 8] Pr[X < 8] = 9.33 × (15/36) + 5.33 × (21/36) = 7 = E[X]. Coincidence? Probably not.
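
To see that it is not a coincidence for this example, the identity E[X | X ≥ 8] Pr[X ≥ 8] + E[X | X < 8] Pr[X < 8] = E[X] can be verified exactly (a self-contained sketch that rebuilds the distribution of the total):

    from fractions import Fraction
    from itertools import product

    pmf = {}
    for d1, d2 in product(range(1, 7), repeat=2):
        pmf[d1 + d2] = pmf.get(d1 + d2, 0) + Fraction(1, 36)

    def cond_expect(event):
        pr = sum(p for s, p in pmf.items() if event(s))
        return sum(s * p for s, p in pmf.items() if event(s)) / pr, pr

    e_hi, p_hi = cond_expect(lambda s: s >= 8)
    e_lo, p_lo = cond_expect(lambda s: s < 8)
    print(e_hi * p_hi + e_lo * p_lo)    # 7, which is E[X]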

  10. Conditional Expectation. Definition: Let X be a RV and A an event. Then E[X | A] := ∑_a a × Pr[X = a | A]. It is easy (really) to see that E[X | A] = ∑_ω X(ω) Pr[ω | A] = (1/Pr[A]) ∑_{ω ∈ A} X(ω) Pr[ω]. Theorem: Conditional expectation is linear: E[a_1 X_1 + ··· + a_n X_n | A] = a_1 E[X_1 | A] + ··· + a_n E[X_n | A]. Proof: E[a_1 X_1 + ··· + a_n X_n | A] = ∑_ω [a_1 X_1(ω) + ··· + a_n X_n(ω)] Pr[ω | A] = a_1 ∑_ω X_1(ω) Pr[ω | A] + ··· + a_n ∑_ω X_n(ω) Pr[ω | A] = a_1 E[X_1 | A] + ··· + a_n E[X_n | A].
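
The two expressions for E[X | A] in the definition can be checked against each other on a small sample space; a sketch (the sample space, random variable, and event are invented for illustration):

    from fractions import Fraction
    from itertools import product

    omega = list(product(range(1, 7), repeat=2))    # two fair dice
    pr = {w: Fraction(1, 36) for w in omega}
    X = lambda w: w[0] + w[1]                       # total pips
    A = lambda w: w[0] >= 4                         # event: first die shows at least 4

    pr_A = sum(pr[w] for w in omega if A(w))
    # (1 / Pr[A]) * sum over omega in A of X(w) * Pr[w]
    via_omega = sum(X(w) * pr[w] for w in omega if A(w)) / pr_A
    # sum over a of a * Pr[X = a | A]
    values = set(X(w) for w in omega)
    via_values = sum(a * sum(pr[w] for w in omega if A(w) and X(w) == a) / pr_A
                     for a in values)
    print(via_omega, via_values, via_omega == via_values)   # 17/2 17/2 True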

  11. Conditional Expectation. Theorem: E[X] = E[X | A] Pr[A] + E[X | Ā] Pr[Ā]. Proof: The law of total probability says that Pr[ω] = Pr[ω | A] Pr[A] + Pr[ω | Ā] Pr[Ā]. Hence, E[X] = ∑_ω X(ω) Pr[ω] = ∑_ω X(ω) Pr[ω | A] Pr[A] + ∑_ω X(ω) Pr[ω | Ā] Pr[Ā] = E[X | A] Pr[A] + E[X | Ā] Pr[Ā].

  12. Geometric Distribution. Let's flip a coin with Pr[H] = p until we get H. For instance: ω_1 = H, or ω_2 = T H, or ω_3 = T T H, or ω_n = T T T T ··· T H. Note that Ω = {ω_n, n = 1, 2, ...}. Let X be the number of flips until the first H. Then X(ω_n) = n. Also, Pr[X = n] = (1 − p)^{n−1} p, n ≥ 1.

  13. Geometric Distribution. Pr[X = n] = (1 − p)^{n−1} p, n ≥ 1.

  14. Geometric Distribution. Pr[X = n] = (1 − p)^{n−1} p, n ≥ 1. Note that ∑_{n≥1} Pr[X = n] = ∑_{n≥1} (1 − p)^{n−1} p = p ∑_{n≥1} (1 − p)^{n−1} = p ∑_{n≥0} (1 − p)^n. Now, if |a| < 1, then S := ∑_{n≥0} a^n = 1/(1 − a). Indeed, S = 1 + a + a^2 + a^3 + ···, aS = a + a^2 + a^3 + a^4 + ···, so (1 − a)S = 1 + a − a + a^2 − a^2 + ··· = 1. Hence, ∑_{n≥1} Pr[X = n] = p × 1/(1 − (1 − p)) = 1.
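
A quick numeric check that these probabilities sum to 1, truncating the infinite sum at a large N (p and N are arbitrary example values):

    p, N = 0.3, 1000
    total = sum((1 - p) ** (n - 1) * p for n in range(1, N + 1))
    print(total)    # ≈ 1.0; the truncated tail (1 - p)^N is negligible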

  15. Geometric Distribution: Expectation. X =_D G(p), i.e., Pr[X = n] = (1 − p)^{n−1} p, n ≥ 1. One has E[X] = ∑_{n≥1} n Pr[X = n] = ∑_{n≥1} n (1 − p)^{n−1} p. Thus, E[X] = p + 2(1 − p)p + 3(1 − p)^2 p + 4(1 − p)^3 p + ··· and (1 − p) E[X] = (1 − p)p + 2(1 − p)^2 p + 3(1 − p)^3 p + ···. Subtracting the two identities gives p E[X] = p + (1 − p)p + (1 − p)^2 p + (1 − p)^3 p + ··· = ∑_{n≥1} Pr[X = n] = 1. Hence, E[X] = 1/p.
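
Checking E[X] = 1/p both by a truncated version of the sum above and by simulating flips until the first H (a sketch; p, the truncation point, and the trial count are arbitrary):

    import random

    p, N, trials = 0.3, 1000, 200_000
    truncated = sum(n * (1 - p) ** (n - 1) * p for n in range(1, N + 1))

    def flips_until_heads():
        n = 1
        while random.random() >= p:    # each flip is heads with probability p
            n += 1
        return n

    empirical = sum(flips_until_heads() for _ in range(trials)) / trials
    print(truncated, empirical, 1 / p)    # all close to 3.333...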

  16. Geometric Distribution: Renewal Trick. A different look at the algebra. We flip the coin once and, if we get T, let ω be the subsequent flips and Y(ω) the number of those flips up to and including the first H. Note that X(Hω) = 1 and X(Tω) = 1 + Y(ω). Hence, E[X] = ∑_ω 1 × Pr[Hω] + ∑_ω (1 + Y(ω)) Pr[Tω] = ∑_ω p Pr[ω] + ∑_ω (1 + Y(ω))(1 − p) Pr[ω] = p + (1 − p)(1 + E[Y]) = 1 + (1 − p) E[Y]. But E[X] = E[Y], since Y is again the number of flips until the first H. Thus, E[X] = 1 + (1 − p) E[X], so that E[X] = 1/p.

  17. Geometric Distribution: Memoryless. Let X be G(p). Then, for n ≥ 0, Pr[X > n] = Pr[first n flips are T] = (1 − p)^n. Theorem: Pr[X > n + m | X > n] = Pr[X > m], m, n ≥ 0. Proof: Pr[X > n + m | X > n] = Pr[X > n + m and X > n] / Pr[X > n] = Pr[X > n + m] / Pr[X > n] = (1 − p)^{n+m} / (1 − p)^n = (1 − p)^m = Pr[X > m].
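
The memoryless identity follows from Pr[X > n] = (1 − p)^n and can be checked numerically; a tiny sketch with arbitrary p, n, m:

    p, n, m = 0.25, 4, 7
    tail = lambda k: (1 - p) ** k              # Pr[X > k] for X distributed as G(p)
    lhs = tail(n + m) / tail(n)                # Pr[X > n + m | X > n]
    rhs = tail(m)                              # Pr[X > m]
    print(lhs, rhs)                            # equal up to floating-point error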

  18. Geometric Distribution: Memoryless - Interpretation. Pr[X > n + m | X > n] = Pr[X > m], m, n ≥ 0. Given X > n, the event {X > n + m} is the event A that the m flips after flip n are all T, while B = {X > n} concerns only the first n flips. Since A is independent of B, Pr[X > n + m | X > n] = Pr[A | B] = Pr[A] = Pr[X > m]. The coin is memoryless; therefore, so is X.

  19. Geometric Distribution: Yet another look. Theorem: For a r.v. X that takes the values {0, 1, 2, ...}, one has E[X] = ∑_{i≥1} Pr[X ≥ i]. [See later for a proof.] If X = G(p), then Pr[X ≥ i] = Pr[X > i − 1] = (1 − p)^{i−1}. Hence, E[X] = ∑_{i≥1} (1 − p)^{i−1} = ∑_{i≥0} (1 − p)^i = 1/(1 − (1 − p)) = 1/p.

  20. Expected Value of Integer RV. Theorem: For a r.v. X that takes values in {0, 1, 2, ...}, one has E[X] = ∑_{i≥1} Pr[X ≥ i]. Proof: One has E[X] = ∑_{i≥1} i × Pr[X = i] = ∑_{i≥1} i {Pr[X ≥ i] − Pr[X ≥ i + 1]} = ∑_{i≥1} {i × Pr[X ≥ i] − i × Pr[X ≥ i + 1]} = ∑_{i≥1} {i × Pr[X ≥ i] − (i − 1) × Pr[X ≥ i]}, by shifting the index in the second term, = ∑_{i≥1} Pr[X ≥ i].
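
The tail-sum formula holds for any random variable on {0, 1, 2, ...}, not just the geometric; here is a sketch checking it for an arbitrary made-up distribution on {0, ..., 9}:

    from fractions import Fraction

    weights = [5, 1, 4, 0, 2, 7, 3, 1, 6, 2]    # arbitrary weights for the values 0..9
    total = sum(weights)
    pmf = {i: Fraction(w, total) for i, w in enumerate(weights)}

    E_direct = sum(i * p for i, p in pmf.items())
    E_tail = sum(sum(p for j, p in pmf.items() if j >= i) for i in range(1, len(weights)))
    print(E_direct, E_tail, E_direct == E_tail)    # 143/31 143/31 True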

  21. Riding the bus. n buses arrive uniformly at random throughout a 24 hour day. What is the time between buses? What is the time to wait for a bus? The slide shows typical arrival times, independent and uniform in [0, 24], and an alternative picture of the same arrival times.

  22. Riding the bus. Add the black dot uniformly at random and pretend that it represents 0/24. This is legitimate, because given the black dot, the other dots are uniform at random. Then 24 = E[X_1 + ··· + X_5] = 5 E[X_1], by linearity and symmetry. Hence, E[X_1] = E[X_m] = 24/5; in general, E[X_m] = 24/(n + 1) for n buses.
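
A simulation sketch of this picture (n = 5 buses and the trial count are arbitrary choices): drop n bus times uniformly in [0, 24], add the uniformly random black dot, and measure the time from the dot to the next bus, wrapping around the 24-hour circle; the empirical mean should be close to 24/(n + 1).

    import random

    n, trials, day = 5, 200_000, 24.0
    total_wait = 0.0
    for _ in range(trials):
        buses = [random.uniform(0, day) for _ in range(n)]
        dot = random.uniform(0, day)                        # the added black dot
        total_wait += min((b - dot) % day for b in buses)   # wait until the next bus, wrapping around
    print(total_wait / trials, day / (n + 1))               # both ≈ 4.0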
