

  1. Bayesian hypothesis testing. Dr. Jarad Niemi, STAT 544, Iowa State University. Jarad Niemi (STAT544@ISU), March 7, 2019, slide 1 / 25

  2. Outline
     - Scientific method
     - Statistical hypothesis testing: simple vs composite hypotheses
     - Simple Bayesian hypothesis testing: all simple hypotheses; all composite hypotheses
     - Propriety: posterior; prior predictive distribution
     - Bayesian hypothesis testing with mixed hypotheses (models): prior model probability; prior for parameters in composite hypotheses (WARNING: do not use non-informative priors); posterior model probability

  3. Scientific method. http://www.wired.com/wiredscience/2013/04/whats-wrong-with-the-scientific-method/

  4. Statistical hypothesis testing. Definition: a simple hypothesis specifies the value of all parameters of interest, while a composite hypothesis does not. Let Y_i ~ind Ber(θ) and
     H_0: θ = 0.5 (simple)
     H_1: θ ≠ 0.5 (composite)

  5. Prior probabilities on simple hypotheses. What is your prior probability for the following hypotheses?
     - a coin flip has exactly 0.5 probability of landing heads
     - a fertilizer treatment has zero effect on plant growth
     - inactivation of a mouse growth gene has zero effect on mouse hair color
     - a butterfly flapping its wings in Australia has no effect on temperature in Ames
     - guessing the color of a card drawn from a deck has probability 0.5
     Many null hypotheses have zero probability a priori, so why bother performing the hypothesis test?

  6. Bayesian hypothesis testing with all simple hypotheses. Let Y ~ p(y|θ) and H_j: θ = d_j for j = 1, ..., J. Treat this as a discrete prior on the d_j, i.e. P(θ = d_j) = p_j. The posterior is then
     P(θ = d_j | y) = p_j p(y|d_j) / Σ_{k=1}^J p_k p(y|d_k) ∝ p_j p(y|d_j).
     For example, suppose Y_i ~ind Ber(θ) and P(θ = d_j) = 1/11, where d_j = j/10 for j = 0, ..., 10. The posterior is
     P(θ = d_j | y) ∝ (1/11) ∏_{i=1}^n (d_j)^{y_i} (1 − d_j)^{1−y_i} = (1/11) (d_j)^{nȳ} (1 − d_j)^{n(1−ȳ)}.
     If j = 0 (j = 10), any y_i = 1 (y_i = 0) will make the posterior probability of H_0 (H_10) zero.
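The discrete-posterior update on this slide is easy to sketch numerically. A minimal Python illustration (the course materials use R; the data vector and function name here are made up for the example):

```python
# Posterior over a discrete grid of theta values for i.i.d. Bernoulli data,
# following the slide: d_j = j/10 with uniform prior p_j = 1/11.
import numpy as np

def discrete_posterior(y, d, prior):
    """P(theta = d_j | y) for Bernoulli data y and grid d with prior weights."""
    n, s = len(y), int(sum(y))            # n trials, s successes
    like = d**s * (1 - d)**(n - s)        # likelihood at each grid point
    unnorm = prior * like                 # prior times likelihood
    return unnorm / unnorm.sum()          # normalize over the grid

d = np.arange(11) / 10                    # d_j = j/10, j = 0, ..., 10
prior = np.full(11, 1 / 11)               # uniform prior over the 11 values
y = np.array([1, 0, 1, 1, 0, 1, 0])       # hypothetical data: 4 heads in 7
post = discrete_posterior(y, d, prior)
```

Note that `post[0]` and `post[10]` come out exactly zero here, matching the slide's observation that a single success (failure) rules out θ = 0 (θ = 1).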

  7. Discrete prior example.
     n = 13; y = rbinom(n, 1, .45); sum(y)
     [1] 7
     [Figure: prior (uniform at 1/11) and posterior probabilities plotted against theta at d_j = 0, 0.1, ..., 1.]

  8. Bayesian hypothesis testing with all composite hypotheses. Let Y ~ p(y|θ) and H_j: θ ∈ (E_{j−1}, E_j] for j = 1, ..., J. Just calculate the area under the curve, i.e. prior probabilities are
     P(H_j) = P(E_{j−1} < θ ≤ E_j) = ∫_{E_{j−1}}^{E_j} p(θ) dθ
     and posterior probabilities are
     P(H_j | y) = P(E_{j−1} < θ ≤ E_j | y) = ∫_{E_{j−1}}^{E_j} p(θ|y) dθ.
     For example, suppose Y_i ~ind Ber(θ) and E_j = j/10 for j = 0, ..., 10. Now, assume θ ~ Be(1, 1) and thus θ|y ~ Be(1 + nȳ, 1 + n[1 − ȳ]).
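The interval probabilities above are just areas under the posterior density. A self-contained Python sketch using the slide's Be(1, 1) prior and the n = 13, 7-successes data from slide 7, with a simple trapezoid rule standing in for a Beta CDF (function names are illustrative):

```python
# Posterior interval probabilities P(E_{j-1} < theta <= E_j | y) when
# theta | y ~ Be(1 + s, 1 + n - s), approximated by trapezoid integration.
from math import gamma

def beta_pdf(x, a, b):
    """Density of the Be(a, b) distribution at x."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * x**(a - 1) * (1 - x)**(b - 1)

def interval_prob(lo, hi, a, b, m=10_000):
    """Trapezoid-rule approximation of the Be(a, b) mass on (lo, hi)."""
    h = (hi - lo) / m
    fs = [beta_pdf(lo + i * h, a, b) for i in range(m + 1)]
    return h * (sum(fs) - 0.5 * (fs[0] + fs[-1]))

n, s = 13, 7                                  # 7 successes in 13 trials
a, b = 1 + s, 1 + n - s                       # posterior is Be(8, 7)
edges = [j / 10 for j in range(11)]           # E_j = j/10
probs = [interval_prob(edges[j - 1], edges[j], a, b) for j in range(1, 11)]
```

The ten values in `probs` sum to one and peak on the interval (0.5, 0.6], consistent with the posterior mode 7/13 ≈ 0.54.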

  9. Beta example. [Figure: posterior density of θ with the posterior probabilities of the ten intervals annotated: 0, 0.03, 0.12, 0.25, 0.3, 0.21, 0.08, 0.01, 0, 0.]

  10. Tonelli's Theorem (successor to Fubini's Theorem). Theorem: if X and Y are σ-finite measure spaces and f is non-negative and measurable, then
     ∫_X ∫_Y f(x, y) dy dx = ∫_Y ∫_X f(x, y) dx dy,
     i.e. you can interchange the integrals (or sums). On the following slides, the use of this theorem will be indicated by TT.

  11. Proper priors with discrete data. Theorem: if the prior is proper and the data are discrete, then the posterior is always proper. Proof. Let p(θ) be the prior and p(y|θ) be the statistical model. We need to show that
     p(y) = ∫_Θ p(y|θ) p(θ) dθ < ∞ for all y.
     For discrete y, we have
     p(y) ≤ Σ_{z∈Y} p(z) = Σ_{z∈Y} ∫_Θ p(z|θ) p(θ) dθ =_TT ∫_Θ [Σ_{z∈Y} p(z|θ)] p(θ) dθ = ∫_Θ p(θ) dθ = 1.
     Thus the posterior is always proper if y is discrete and the prior is proper.

  12. Proper priors with continuous data. Theorem: if the prior is proper and the data are continuous, then the posterior is almost always proper. Proof. Let p(θ) be the prior and p(y|θ) be the statistical model. We need to show that
     p(y) = ∫_Θ p(y|θ) p(θ) dθ < ∞ for almost all y.
     For continuous y, we have
     ∫_Y p(z) dz = ∫_Y ∫_Θ p(z|θ) p(θ) dθ dz =_TT ∫_Θ [∫_Y p(z|θ) dz] p(θ) dθ = ∫_Θ p(θ) dθ = 1,
     thus p(y) is finite except on a set of measure zero, i.e. p(θ|y) is almost always proper.

  13. Proper prior predictive distributions. In the previous derivations, when the prior is proper we showed that
     Σ_{z∈Y} p(z) = 1 and ∫_Y p(z) dz = 1
     for discrete and continuous data, respectively. Corollary: when the prior is proper, the prior predictive distribution is also proper.

  14. Improper prior predictive distributions. Theorem: if p(θ) is improper, then p(y) = ∫ p(y|θ) p(θ) dθ is improper. Proof.
     ∫ p(y) dy = ∫ ∫ p(y|θ) p(θ) dθ dy =_TT ∫ p(θ) [∫ p(y|θ) dy] dθ = ∫ p(θ) dθ.
     Since p(θ) is improper, so is p(y). A similar result holds for discrete y, replacing the integral with a sum.

  15. Bayesian hypothesis testing. To evaluate the relative plausibility of a hypothesis (model), we use the posterior model probability
     p(H_j | y) = p(y|H_j) p(H_j) / p(y) = p(y|H_j) p(H_j) / Σ_{k=1}^J p(y|H_k) p(H_k) ∝ p(y|H_j) p(H_j),
     where p(H_j) is the prior model probability,
     p(y|H_j) = ∫ p(y|θ) p(θ|H_j) dθ
     is the marginal likelihood under model H_j, and p(θ|H_j) is the prior for the parameters θ when model H_j is true.

  16. Marginal likelihood. The marginal likelihood calculation differs for simple vs composite hypotheses. A simple hypothesis can be considered to have a Dirac delta function for a prior, e.g. if H_0: θ = θ_0, then θ|H_0 ~ δ_{θ_0} and the marginal likelihood is
     p(y|H_0) = ∫ p(y|θ) p(θ|H_0) dθ = p(y|θ_0).
     A composite hypothesis has a continuous prior and thus
     p(y|H_j) = ∫ p(y|θ) p(θ|H_j) dθ.

  17. Two models. If we only have two models, H_0 and H_1, then
     p(H_0 | y) = p(y|H_0) p(H_0) / [p(y|H_0) p(H_0) + p(y|H_1) p(H_1)] = 1 / [1 + (p(y|H_1)/p(y|H_0)) (p(H_1)/p(H_0))],
     where
     p(H_1)/p(H_0) = p(H_1)/(1 − p(H_1))
     is the prior odds in favor of H_1 and
     BF(H_1 : H_0) = p(y|H_1)/p(y|H_0) = 1/BF(H_0 : H_1)
     is the Bayes factor for model H_1 relative to H_0.
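The two-model identity on this slide is a one-liner in code. A small sketch (the function name is made up for illustration):

```python
# Posterior model probability from a Bayes factor and a prior probability,
# using p(H0 | y) = 1 / (1 + BF(H1:H0) * p(H1)/p(H0)) from the slide.
def posterior_prob_H0(bf_10, prior_H0):
    """p(H0 | y) given BF(H1:H0) and the prior probability of H0."""
    prior_odds_10 = (1 - prior_H0) / prior_H0   # prior odds in favor of H1
    return 1 / (1 + bf_10 * prior_odds_10)

# With equal prior probabilities, a Bayes factor of 1 leaves p(H0|y) = 0.5,
# and BF(H1:H0) = 3 pushes it down to 0.25.
print(posterior_prob_H0(1.0, 0.5), posterior_prob_H0(3.0, 0.5))
```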

  18. Binomial model. Consider a coin-flipping experiment, so that Y_i ~ind Ber(θ), with null hypothesis H_0: θ = 0.5 versus alternative H_1: θ ≠ 0.5 and θ|H_1 ~ Be(a, b). Then
     BF(H_0 : H_1) = 0.5^n / ∫_0^1 θ^{nȳ} (1 − θ)^{n(1−ȳ)} θ^{a−1} (1 − θ)^{b−1} / Beta(a, b) dθ
                   = 0.5^n / [Beta(a + nȳ, b + n − nȳ) / Beta(a, b)]
                   = 0.5^n Beta(a, b) / Beta(a + nȳ, b + n − nȳ),
     and with p(H_0) = p(H_1) the posterior model probability is
     P(H_0 | y) = 1 / [1 + 1/BF(H_0 : H_1)].
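The closed-form Bayes factor above can be evaluated directly; working on the log scale keeps it stable for larger n. A Python sketch using the n = 13, 7-successes data from slide 7 (function names are illustrative):

```python
# BF(H0:H1) = 0.5^n * Beta(a, b) / Beta(a + s, b + n - s) for the coin-flip
# example, with s = n * ybar successes, computed via log-gamma for stability.
from math import exp, lgamma, log

def log_beta(a, b):
    """log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bf_H0_H1(n, s, a=1, b=1):
    """Bayes factor for H0: theta = 0.5 vs H1 with theta | H1 ~ Be(a, b)."""
    return exp(n * log(0.5) + log_beta(a, b) - log_beta(a + s, b + n - s))

n, s = 13, 7
bf = bf_H0_H1(n, s)           # Bayes factor in favor of H0, about 2.9
post_H0 = bf / (1 + bf)       # posterior model probability with p(H0) = p(H1)
```

With 7 heads in 13 flips the data sit close to ȳ = 0.5, so the Bayes factor mildly favors the point null over the uniform alternative.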

  19. Sample size and sample average. P(H_0) = P(H_1) = 0.5 and θ|H_1 ~ Be(1, 1). [Figure: posterior model probability p(H_0|y) plotted against ȳ for n = 10, 20, 30.]
