Review of Conditional Probability and Independence


1. Review of Conditional Probability and Independence

Definition L7.3 (Def 1.3.2 on p.20): If $A, B \in S$ and $P(B) > 0$, then
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

Bayes' Rule, Theorem L7.2 (Thm 1.3.5 on p.23): Let $A_1, A_2, \ldots$ be a partition of the sample space $S$ and $B \subset S$. If $P(B) > 0$ and $P(A_i) > 0$, then
$$P(A_i \mid B) = \frac{P(B \mid A_i)\, P(A_i)}{\sum_{j:\, P(A_j) > 0} P(B \mid A_j)\, P(A_j)}.$$

[Slide 19/25, Lecture 7: Methods of Estimation]
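To make Bayes' Rule concrete, here is a minimal R sketch over a three-event partition; the prior and likelihood values are made up for illustration and are not from the lecture.

prior <- c(A1 = 0.5, A2 = 0.3, A3 = 0.2)      # P(A_i); sums to 1
lik   <- c(A1 = 0.10, A2 = 0.40, A3 = 0.70)   # P(B | A_i)
posterior <- lik * prior / sum(lik * prior)   # P(A_i | B) by Theorem L7.2
posterior                                     # also sums to 1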

2. Review of Conditional Probability and Independence

Definition L7.4 (Def 4.2.1 on p.148): Let $(X, Y)$ be a discrete bivariate random vector with joint pmf $f(x, y)$ and marginal pmfs $f_X(x)$ and $f_Y(y)$. For any $x$ such that $P(X = x) = f_X(x) > 0$, the conditional pmf of $Y$ given that $X = x$ is the function of $y$ defined by
$$f(y \mid x) = P(Y = y \mid X = x) = \frac{f(x, y)}{f_X(x)}.$$
For any $y$ such that $P(Y = y) = f_Y(y) > 0$, the conditional pmf of $X$ given that $Y = y$ is the function of $x$ defined by
$$f(x \mid y) = P(X = x \mid Y = y) = \frac{f(x, y)}{f_Y(y)}.$$
If $g(Y)$ is a function of a discrete random variable $Y$, then the conditional expected value of $g(Y)$ given that $X = x$ is
$$E(g(Y) \mid x) = \sum_y g(y) f(y \mid x).$$

[Slide 20/25, Lecture 7: Methods of Estimation]
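A minimal R sketch of these definitions, using a made-up 3 x 2 joint pmf table (not from the lecture):

joint <- matrix(c(0.10, 0.20,
                  0.30, 0.15,
                  0.05, 0.20),
                nrow = 3, byrow = TRUE,
                dimnames = list(x = 1:3, y = 1:2))
fX <- rowSums(joint)                    # marginal pmf f_X(x)
f_y_given_x1 <- joint["1", ] / fX["1"]  # conditional pmf f(y | x = 1)
yvals <- as.numeric(colnames(joint))
sum(yvals^2 * f_y_given_x1)             # E[g(Y) | x = 1] with g(y) = y^2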

3. Review of Conditional Probability and Independence

Definition L7.5 (Def 4.2.3 on p.150): Let $(X, Y)$ be a continuous bivariate random vector with joint pdf $f(x, y)$ and marginal pdfs $f_X(x)$ and $f_Y(y)$. For any $x$ such that $f_X(x) > 0$, the conditional pdf of $Y$ given that $X = x$ is the function of $y$ defined by
$$f(y \mid x) = \frac{f(x, y)}{f_X(x)}.$$
For any $y$ such that $f_Y(y) > 0$, the conditional pdf of $X$ given that $Y = y$ is the function of $x$ defined by
$$f(x \mid y) = \frac{f(x, y)}{f_Y(y)}.$$
If $g(Y)$ is a function of a continuous random variable $Y$, then the conditional expected value of $g(Y)$ given that $X = x$ is
$$E(g(Y) \mid x) = \int_{-\infty}^{\infty} g(y) f(y \mid x)\, dy.$$

[Slide 21/25, Lecture 7: Methods of Estimation]
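The continuous case can be checked numerically. A minimal R sketch, assuming the joint pdf $f(x, y) = x + y$ on $(0, 1)^2$ (a standard textbook example, not from the lecture):

f_joint <- function(x, y) (x + y) * (x > 0 & x < 1) * (y > 0 & y < 1)
fX <- function(x) integrate(function(y) f_joint(x, y), 0, 1)$value
f_cond <- function(y, x) f_joint(x, y) / fX(x)             # f(y | x)
integrate(function(y) y * f_cond(y, x = 0.3), 0, 1)$value  # E[Y | X = 0.3]
(0.3/2 + 1/3) / (0.3 + 1/2)  # closed form (x/2 + 1/3)/(x + 1/2) for comparison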

4. Bayesian Estimation

The Bayesian approach differs greatly from the classical approach that we have been discussing. In the Bayesian approach, the parameter $\theta$ is assumed to be a random variable/vector with prior distribution $\pi(\theta)$. We can then update the pdf/pmf of the distribution of $\theta$ given data $X = x$ using Bayes' Rule:
$$\pi(\theta \mid x) = \frac{f(x, \theta)}{m(x)} = \frac{f(x \mid \theta)\, \pi(\theta)}{m(x)},$$
where $m(x)$ is the pdf/pmf of the marginal distribution of $X$. The updated prior is referred to as the posterior distribution. The Bayes estimator of $\theta$ is obtained by finding the mean of the posterior distribution; that is, $\hat{\theta}_B = E[\theta \mid X]$.

[Slide 22/25, Lecture 7: Methods of Estimation]
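When no closed form for $\pi(\theta \mid x)$ is available, the update can be carried out on a grid. A minimal R sketch, assuming a binomial likelihood with $x = 7$ successes in $n = 10$ trials and a beta(2, 2) prior (values chosen for illustration):

step  <- 0.001
theta <- seq(step, 1 - step, by = step)          # grid over the parameter
prior <- dbeta(theta, 2, 2)                      # pi(theta)
lik   <- dbinom(7, size = 10, prob = theta)      # f(x | theta)
post  <- lik * prior / sum(lik * prior * step)   # pi(theta | x), normalized
sum(theta * post * step)                         # Bayes estimator E[theta | x]

Here the posterior actually has a known form (beta(9, 5) by the conjugacy shown next, with mean 9/14 ≈ 0.643), so the grid answer can be checked against it.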

5. Bayesian Estimation

Example L7.7: Let $X_1, \ldots, X_n$ be a random sample from a Bernoulli($p$) distribution. Find the Bayes estimator of $p$, assuming that the prior distribution on $p$ is beta($\alpha$, $\beta$).

Answer to Example L7.7: Since $X_1, \ldots, X_n$ are iid Bernoulli($p$) random variables, $\sum_{i=1}^n X_i$ is binomial($n$, $p$). The posterior distribution of $p \mid \sum_{i=1}^n X_i = x$ is
$$\pi(p \mid x) = \frac{f(x \mid p)\, \pi(p)}{m(x)} = \frac{\binom{n}{x} p^x (1 - p)^{n - x}\, \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{\alpha - 1} (1 - p)^{\beta - 1}}{\int_0^1 \binom{n}{x} p^x (1 - p)^{n - x}\, \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{\alpha - 1} (1 - p)^{\beta - 1}\, dp}$$
$$= \frac{p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1}}{\int_0^1 p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1}\, dp} = \frac{\Gamma(n + \alpha + \beta)}{\Gamma(x + \alpha)\Gamma(n - x + \beta)}\, p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1}\, I_{(0,1)}(p).$$

[Slide 23/25, Lecture 7: Methods of Estimation]

6. Bayesian Estimation

Answer to Example L7.7 continued: Thus, $p \mid \sum_{i=1}^n X_i = x$ follows a beta($\sum_{i=1}^n x_i + \alpha$, $n - \sum_{i=1}^n x_i + \beta$) distribution. The Bayes estimator (posterior mean) is
$$\hat{p}_B = \frac{\sum_{i=1}^n X_i + \alpha}{\alpha + \beta + n} = \left(\frac{n}{\alpha + \beta + n}\right) \frac{\sum_{i=1}^n X_i}{n} + \left(\frac{\alpha + \beta}{\alpha + \beta + n}\right) \frac{\alpha}{\alpha + \beta}.$$
The Bayes estimator is a weighted average of $\bar{X}$ (the sample mean based on the data) and $E[p] = \frac{\alpha}{\alpha + \beta}$ (the mean of the prior distribution).

[Slide 24/25, Lecture 7: Methods of Estimation]
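A quick R check of the weighted-average decomposition on simulated data; the sample size, true $p$, and hyperparameters are assumed for illustration:

set.seed(1)
n <- 50; a <- 2; b <- 3
x <- rbinom(n, size = 1, prob = 0.4)   # Bernoulli(p) sample, true p = 0.4
p_bayes <- (sum(x) + a) / (a + b + n)  # mean of beta(sum(x)+a, n-sum(x)+b)
w <- n / (a + b + n)
all.equal(p_bayes, w * mean(x) + (1 - w) * a / (a + b))  # TRUE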

7. Bayesian Estimation

Definition L7.6 (Def 7.2.15 on p.325): Let $\mathcal{F}$ denote the class of pdfs or pmfs $f(x \mid \theta)$ (indexed by $\theta$). A class $\Pi$ of prior distributions is a conjugate family for $\mathcal{F}$ if the posterior distribution is in the class $\Pi$ for all $f \in \mathcal{F}$, all priors in $\Pi$, and all $x \in \mathcal{X}$.

As seen in Example L7.7, the beta family is conjugate for the binomial family.

[Slide 25/25, Lecture 7: Methods of Estimation]
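One practical consequence of conjugacy is that updating reduces to bookkeeping on the hyperparameters, so batches of data can be absorbed sequentially. A minimal R sketch with made-up counts:

a <- 2; b <- 2                   # beta(a, b) prior
x1 <- 7; n1 <- 10                # batch 1: 7 successes in 10 trials
a <- a + x1; b <- b + n1 - x1    # posterior beta(9, 5) is again a beta
x2 <- 3; n2 <- 5                 # batch 2 reuses the updated beta as its prior
a <- a + x2; b <- b + n2 - x2    # beta(12, 7): the family is closed under updating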

8. Bayesian Tests

Hypothesis testing is much different from a Bayesian perspective, where the parameter is considered random. From the Bayesian perspective, the natural approach is to compute
$$P(H_0 \text{ is true} \mid x) = P(\theta \in \Theta_0 \mid x) = \int_{\Theta_0} \pi(\theta \mid x)\, d\theta$$
and
$$P(H_1 \text{ is true} \mid x) = P(\theta \in \Theta_0^c \mid x) = \int_{\Theta_0^c} \pi(\theta \mid x)\, d\theta$$
based on the posterior distribution $\pi(\theta \mid x)$.

[Slide 7/14, Lecture 14: More Hypothesis Testing Examples]
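For a posterior with a known form, both probabilities come straight from the CDF. A minimal R sketch, assuming a beta(3, 7) posterior and $\Theta_0 = \{\theta \leq 0.5\}$ (values chosen for illustration):

p_H0 <- pbeta(0.5, 3, 7)   # P(theta in Theta_0 | x)
p_H1 <- 1 - p_H0           # P(theta in Theta_0^c | x)
c(p_H0, p_H1)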

9. Bayesian Tests

Example L14.2: Suppose we toss a coin 5 times and count the total number of heads that occur. We assume each toss is independent and the probability of heads (denoted by $p$) is the same on each toss. Consider a Bayesian model which assumes that $p$ follows a Uniform(0, 1) prior. What is the probability of the null hypothesis $H_0: p \leq .5$ if $\sum_{i=1}^5 X_i = 5$?

Answer to Example L14.2: Since $p \mid X = x \sim \text{beta}\left(\sum_{i=1}^5 x_i + 1,\; 5 - \sum_{i=1}^5 x_i + 1\right)$ from slide 7.24, the posterior here is beta(6, 1), and the probability is
$$P\left(p \leq .5 \,\middle|\, \sum_{i=1}^5 X_i = 5\right) = \int_0^{.5} 6 p^5\, dp = 1/64 = .015625.$$

[Slide 8/14, Lecture 14: More Hypothesis Testing Examples]
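The answer is easy to confirm in R, either from the beta CDF or by evaluating the integral directly:

pbeta(0.5, 6, 1)                              # 0.015625 = 1/64
integrate(function(p) 6 * p^5, 0, 0.5)$value  # same value by direct integration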

10. Finding a Bayesian Credible Interval

Interval estimators are much different from a Bayesian perspective, where the parameter is considered random.

Definition L16.5 (p.436): If $\pi(\theta \mid x)$ is the posterior distribution of $\theta$ given $X = x$, then for any set $A \subset \Theta$, the credible probability of $A$ is
$$P(\theta \in A \mid x) = \int_A \pi(\theta \mid x)\, d\theta \quad \text{(assuming } \theta \mid x \text{ is continuous)}$$
and $A$ is a credible set for $\theta$.

[Slide 23/24, Lecture 16: Confidence Intervals]
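A minimal R sketch of a credible probability, assuming the beta(5, 15) posterior of the next example and the set $A = (0.2, 0.4)$, chosen for illustration:

integrate(function(p) dbeta(p, 5, 15), 0.2, 0.4)$value  # P(theta in A | x)
pbeta(0.4, 5, 15) - pbeta(0.2, 5, 15)                   # same value via the CDF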

11. Finding a Bayesian Credible Interval

Example L16.5: Suppose $X_1, \ldots, X_n$ are iid Bernoulli($p$) random variables and suppose we consider a Bayesian model which assumes that $p$ follows a Uniform(0, 1) prior. Find a 90% credible set for $p$ for the data set with 4 successes and 14 failures.

Answer to Example L16.5: From slide 7.24, $p \mid \sum_{i=1}^n X_i = y \sim \text{beta}(y + \alpha,\; n - y + \beta)$, so we have $p \mid X = x \sim \text{beta}(4 + 1, 14 + 1) = \text{beta}(5, 15)$. We can then find $p_L$ such that $\int_0^{p_L} \pi(p \mid x)\, dp = .05$ and $p_U$ such that $\int_{p_U}^1 \pi(p \mid x)\, dp = .05$, where $\pi(p \mid x) = 58140\, p^4 (1 - p)^{14}$. Using the R commands qbeta(.05,5,15) and qbeta(.95,5,15), we obtain the 90% credible set $(.1099, .4191)$. The shortest 90% credible set $(.0953, .3991)$ can be obtained with the R commands alpha=.02931685; qbeta(c(alpha,.9+alpha),5,15), since

> dbeta(qbeta(c(alpha,.9+alpha),5,15),5,15)
[1] 1.180588 1.180588

(Equal posterior density at the two endpoints is what makes the interval shortest for a unimodal posterior.)

[Slide 24/24, Lecture 16: Confidence Intervals]
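Collecting the slide's computations into one runnable R chunk:

qbeta(c(.05, .95), 5, 15)            # equal-tail 90% set: 0.1099 0.4191
alpha <- .02931685                   # lower-tail mass for the shortest set
qbeta(c(alpha, .9 + alpha), 5, 15)   # shortest 90% set: 0.0953 0.3991
dbeta(qbeta(c(alpha, .9 + alpha), 5, 15), 5, 15)  # equal endpoint heights: 1.180588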
