

1. Probability Review

Applied Bayesian Statistics
Dr. Earvin Balderama
Department of Mathematics & Statistics
Loyola University Chicago
August 31, 2017

Last edited September 8, 2017 by Earvin Balderama <ebalderama@luc.edu>

2. Random Variables

Mathematically, a random variable is a function that maps a sample space into the real numbers: X : S → ℝ.
  1. Countable (discrete).
  2. Uncountable (continuous).

Example: 3 coin tosses.
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
We may want to create a random variable, X, defined as the number of tails, so X ∈ {0, 1, 2, 3}.
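The coin-toss example is small enough to check by brute-force enumeration; a minimal Python sketch (variable names are mine):

```python
from itertools import product
from collections import Counter

# Sample space: all 2^3 = 8 outcomes of three coin tosses.
S = list(product("HT", repeat=3))

# Random variable X = number of tails; tally how many outcomes give each value.
X_counts = Counter(outcome.count("T") for outcome in S)

print(len(S))            # 8
print(sorted(X_counts))  # [0, 1, 2, 3] -- the possible values of X
```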

4. Probability

Mathematically, a probability function assigns numbers (between 0 and 1) to subsets of a sample space: P : ℬ → [0, 1], where ℬ is a collection of subsets of S.

Two interpretations:
  1. (Frequentist) Based on long-run relative frequencies of possible outcomes.
  2. (Bayesian) Based on belief about how likely each possible outcome is.

Regardless of interpretation, the same basic probability laws apply, e.g.,
  P(A) ≥ 0,
  P(S) = 1,
  P(A ∪ B) = P(A) + P(B), for mutually exclusive A and B.

5. Probability distributions

A probability distribution is a list of all possible values of a random variable and their corresponding probabilities.

  1. Discrete random variable: probability mass function (PMF)
     PMF: f(x) = Prob(X = x) ≥ 0
     Mean: E(X) = Σ_x x f(x)
     Variance: V(X) = Σ_x [x − E(X)]² f(x)

  2. Continuous random variable: probability density function (PDF)
     Prob(X = x) = 0 for all x
     PDF: f(x) ≥ 0, Prob(X ∈ B) = ∫_B f(x) dx
     Mean: E(X) = ∫ x f(x) dx
     Variance: V(X) = ∫ [x − E(X)]² f(x) dx
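The discrete mean and variance formulas translate directly into code. A sketch using the PMF of the earlier coin-toss variable (number of tails in 3 fair tosses, i.e. Binomial(3, 1/2)):

```python
# PMF of X = number of tails in 3 fair coin tosses (Binomial(3, 1/2)).
pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

# E(X) = sum_x x f(x)
mean = sum(x * fx for x, fx in pmf.items())

# V(X) = sum_x [x - E(X)]^2 f(x)
var = sum((x - mean) ** 2 * fx for x, fx in pmf.items())

print(mean, var)  # 1.5 0.75
```

These match the Binomial formulas on a later slide: nθ = 3(1/2) = 1.5 and nθ(1 − θ) = 0.75.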

6. Parametric families of distributions

A statistical analysis typically proceeds by selecting a PMF (or PDF) that seems to match the distribution of a sample. We rarely know the PMF (or PDF) exactly, but we may assume it is from a parametric family of distributions, and estimate the parameters.

  1. Discrete random variables
     Binomial (Bernoulli is a special case)
     Poisson
     NegativeBinomial

  2. Continuous random variables
     Normal
     Gamma (Exponential and χ² are special cases)
     InverseGamma
     Beta (Uniform is a special case)

7. X ∼ Bernoulli(θ)

Only two outcomes (success/failure, 0/1, zero/nonzero, etc.), where θ is the probability of success.

X ∈ {0, 1}
PMF: f(x) = Prob(X = x) = 1 − θ if x = 0, and θ if x = 1
Mean: E(X) = Σ_x x f(x) = 0(1 − θ) + 1·θ = θ
Variance: V(X) = Σ_x [x − θ]² f(x) = (0 − θ)²(1 − θ) + (1 − θ)²θ = θ(1 − θ)

8. X ∼ Binomial(n, θ)

X = number of “successes” in n independent “Bernoulli trials,” where θ is the probability of success on each trial.

X ∈ {0, 1, …, n}
PMF: f(x) = Prob(X = x) = (n choose x) θ^x (1 − θ)^(n−x)
Mean: E(X) = nθ
Variance: V(X) = nθ(1 − θ)
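A quick numerical check of the Binomial mean and variance formulas, using illustrative values n = 10 and θ = 0.4 (chosen here, not from the slides):

```python
from math import comb

def binom_pmf(x, n, theta):
    # Binomial PMF: (n choose x) * theta^x * (1 - theta)^(n - x)
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)

n, theta = 10, 0.4  # illustrative values
pmf = [binom_pmf(x, n, theta) for x in range(n + 1)]

mean = sum(x * fx for x, fx in enumerate(pmf))
var = sum((x - mean) ** 2 * fx for x, fx in enumerate(pmf))

print(round(sum(pmf), 10))  # 1.0 -- the PMF sums to one
print(round(mean, 10))      # 4.0 -- n * theta
print(round(var, 10))       # 2.4 -- n * theta * (1 - theta)
```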

9. X ∼ Poisson(λ)

X = number of events that occur in a unit of time.

X ∈ {0, 1, …}
PMF: f(x) = Prob(X = x) = λ^x e^(−λ) / x!
Mean: E(X) = λ
Variance: V(X) = λ

Note: Can be parameterized with λ = nθ, where θ is the expected number of events per unit time; then E(X) = V(X) = nθ.
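The E(X) = V(X) = λ identity can be verified numerically by summing over enough of the support (λ = 3 is an illustrative value, not from the slides):

```python
from math import exp, factorial

def pois_pmf(x, lam):
    # Poisson PMF: lam^x * e^(-lam) / x!
    return lam**x * exp(-lam) / factorial(x)

lam = 3.0  # illustrative rate
xs = range(60)  # truncate the infinite support; tail mass beyond 60 is negligible for lam = 3
pmf = [pois_pmf(x, lam) for x in xs]

mean = sum(x * fx for x, fx in zip(xs, pmf))
var = sum((x - mean) ** 2 * fx for x, fx in zip(xs, pmf))

print(round(mean, 8), round(var, 8))  # 3.0 3.0 -- E(X) = V(X) = lam
```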

10. X ∼ NegativeBinomial(r, θ)

X = number of “failures” until r “successes” in a sequence of independent “Bernoulli trials,” where θ is the probability of success on each trial.

X ∈ {0, 1, …}
PMF: f(x) = Prob(X = x) = (x + r − 1 choose x) θ^r (1 − θ)^x
Mean: E(X) = r(1 − θ)/θ
Variance: V(X) = r(1 − θ)/θ²

Note: The geometric distribution is a special case: Geom(θ) = NB(1, θ).
Note: There are MANY different ways to specify the NB distribution. The important thing to note is that NB is a discrete count distribution that is a more flexible model than the Poisson.
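A numerical check of the NB mean and variance under the failures-until-r-successes parameterization above, with illustrative values r = 5 and θ = 0.4:

```python
from math import comb

def nb_pmf(x, r, theta):
    # Negative binomial PMF: (x + r - 1 choose x) * theta^r * (1 - theta)^x
    return comb(x + r - 1, x) * theta**r * (1 - theta) ** x

r, theta = 5, 0.4  # illustrative values
xs = range(500)    # truncate the infinite support; tail mass is negligible here
pmf = [nb_pmf(x, r, theta) for x in xs]

mean = sum(x * fx for x, fx in zip(xs, pmf))
var = sum((x - mean) ** 2 * fx for x, fx in zip(xs, pmf))

print(round(mean, 6))  # 7.5   -- r(1 - theta)/theta
print(round(var, 6))   # 18.75 -- r(1 - theta)/theta^2
```

Because conventions differ, check any library's NB parameterization against this PMF before relying on it.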

11. X ∼ Normal(µ, σ²)

X ∈ (−∞, ∞)
PDF: f(x) = (1 / (√(2π) σ)) exp( −(1/2) ((x − µ)/σ)² )
Mean: E(X) = µ
Variance: V(X) = σ²
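A crude Riemann-sum sketch confirming the PDF integrates to 1 with mean µ (the values µ = 1, σ = 2 are illustrative, not from the slides):

```python
from math import exp, pi, sqrt

def norm_pdf(x, mu, sigma):
    # Normal PDF: exp(-((x - mu)/sigma)^2 / 2) / (sqrt(2*pi) * sigma)
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sqrt(2 * pi) * sigma)

mu, sigma = 1.0, 2.0  # illustrative values

# Left-Riemann sum on [-20, 22], roughly 10 sigma on each side of mu.
dx = 0.001
xs = [-20 + i * dx for i in range(42000)]
total = sum(norm_pdf(x, mu, sigma) * dx for x in xs)
mean = sum(x * norm_pdf(x, mu, sigma) * dx for x in xs)

print(round(total, 4), round(mean, 4))  # 1.0 1.0
```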

12. X ∼ Gamma(a, b)

X ∈ (0, ∞)
PDF: f(x) = (b^a / Γ(a)) x^(a−1) e^(−bx)
Mean: E(X) = a/b
Variance: V(X) = a/b²
Parameters: shape a > 0, rate b > 0.
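A numerical check of the Gamma mean and variance under the shape/rate parameterization above (a = 3, b = 2 are illustrative values):

```python
from math import exp, gamma as gamma_fn

def gamma_pdf(x, a, b):
    # Gamma PDF with shape a, rate b: (b^a / Gamma(a)) * x^(a-1) * e^(-b*x)
    return b**a / gamma_fn(a) * x ** (a - 1) * exp(-b * x)

a, b = 3.0, 2.0  # illustrative shape and rate

dx = 0.001
xs = [i * dx for i in range(1, 40000)]  # Riemann grid on (0, 40)
mean = sum(x * gamma_pdf(x, a, b) * dx for x in xs)
var = sum((x - mean) ** 2 * gamma_pdf(x, a, b) * dx for x in xs)

print(round(mean, 4), round(var, 4))  # 1.5 0.75 -- a/b and a/b^2
```

Beware that some libraries parameterize the Gamma by a scale parameter equal to 1/b rather than by the rate b.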

13. X ∼ InverseGamma(a, b)

If Y ∼ Gamma(a, b), then X = 1/Y ∼ InverseGamma(a, b).

X ∈ (0, ∞)
PDF: f(x) = (b^a / Γ(a)) x^(−a−1) e^(−b/x)
Mean: E(X) = b/(a − 1), for a > 1
Variance: V(X) = b² / [(a − 1)²(a − 2)], for a > 2
Parameters: shape a > 0, scale b > 0 (b is the rate of the underlying Gamma).
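A numerical check of the inverse-gamma mean formula, with illustrative values a = 4, b = 2 (chosen so the mean exists):

```python
from math import exp, gamma as gamma_fn

def invgamma_pdf(x, a, b):
    # Inverse-gamma PDF: (b^a / Gamma(a)) * x^(-a-1) * e^(-b/x)
    return b**a / gamma_fn(a) * x ** (-a - 1) * exp(-b / x)

a, b = 4.0, 2.0  # illustrative values with a > 1 so the mean exists

dx = 0.0005
xs = [i * dx for i in range(1, 100000)]  # Riemann grid on (0, 50)
mean = sum(x * invgamma_pdf(x, a, b) * dx for x in xs)

print(round(mean, 3))  # 0.667, i.e. b/(a - 1) = 2/3
```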

14. X ∼ Beta(a, b)

X ∈ [0, 1]
PDF: f(x) = (Γ(a + b) / (Γ(a)Γ(b))) x^(a−1) (1 − x)^(b−1)
Mean: E(X) = a/(a + b)
Variance: V(X) = ab / [(a + b)²(a + b + 1)]
Parameters: a > 0, b > 0.
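The same Riemann-sum check works for the Beta, whose support is just [0, 1] (a = 2, b = 5 are illustrative values):

```python
from math import gamma as gamma_fn

def beta_pdf(x, a, b):
    # Beta PDF: (Gamma(a+b) / (Gamma(a) * Gamma(b))) * x^(a-1) * (1-x)^(b-1)
    return gamma_fn(a + b) / (gamma_fn(a) * gamma_fn(b)) * x ** (a - 1) * (1 - x) ** (b - 1)

a, b = 2.0, 5.0  # illustrative values

dx = 1e-4
xs = [i * dx for i in range(1, 10000)]  # Riemann grid on (0, 1)
mean = sum(x * beta_pdf(x, a, b) * dx for x in xs)
var = sum((x - mean) ** 2 * beta_pdf(x, a, b) * dx for x in xs)

print(round(mean, 4))  # 0.2857 -- a/(a + b) = 2/7
print(round(var, 4))   # 0.0255 -- ab / ((a+b)^2 (a+b+1))
```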

15. Joint distributions

A random vector of p random variables: X = (X₁, X₂, …, X_p).

For now, suppose we have just p = 2 random variables, X and Y. (X, Y) can be discrete or continuous.

16. Joint distributions

  1. Discrete (X, Y)
     joint PMF: f(x, y) = Prob(X = x, Y = y)
     marginal PMF for X: f_X(x) = Prob(X = x) = Σ_y f(x, y)
     marginal PMF for Y: f_Y(y) = Prob(Y = y) = Σ_x f(x, y)

  2. Continuous (X, Y)
     joint PDF: f(x, y), with Prob[(X, Y) ∈ B] = ∫∫_B f(x, y) dx dy
     marginal PDF for X: f_X(x) = ∫ f(x, y) dy
     marginal PDF for Y: f_Y(y) = ∫ f(x, y) dx
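Marginalizing a discrete joint PMF is just summing over the other variable. A sketch with a made-up 2×2 joint PMF (the values are mine, chosen as dyadic fractions so the sums are exact in floating point):

```python
# A small illustrative joint PMF f(x, y), stored as a nested dict f[x][y].
f = {
    0: {0: 0.125, 1: 0.25},
    1: {0: 0.125, 1: 0.5},
}

# Marginal for X: sum the joint PMF over y.
f_X = {x: sum(row.values()) for x, row in f.items()}

# Marginal for Y: sum the joint PMF over x.
f_Y = {}
for row in f.values():
    for y, p in row.items():
        f_Y[y] = f_Y.get(y, 0.0) + p

print(f_X)  # {0: 0.375, 1: 0.625}
print(f_Y)  # {0: 0.25, 1: 0.75}
```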

17. Discrete random variables

Example: Patients are randomly assigned a dose and followed to determine whether they develop a tumor. X ∈ {5, 10, 20} is the dose; Y ∈ {0, 1} is 1 if a tumor develops and 0 otherwise. The joint PMF is given by:

                 X
  Y        5      10      20
  0    0.469   0.124   0.049
  1    0.231   0.076   0.051

20. Discrete random variables

Example: Find the marginal PMFs of X and Y.

f_Y(0) = Σ_x f(x, 0) = 0.469 + 0.124 + 0.049 = 0.642
f_Y(1) = Σ_x f(x, 1) = 0.231 + 0.076 + 0.051 = 0.358

f_X(5) = 0.7, f_X(10) = 0.2, f_X(20) = 0.1

                 X
  Y        5      10      20
  0    0.469   0.124   0.049   0.642
  1    0.231   0.076   0.051   0.358
         0.7     0.2     0.1       1

21. Discrete random variables

conditional PMF of Y given X:
f(y | x) = Prob(Y = y | X = x) = Prob(X = x, Y = y) / Prob(X = x) = f(x, y) / f_X(x)

conditional = joint / marginal
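Applying conditional = joint / marginal to the dose–tumor table from the earlier example (variable names are mine):

```python
# Joint PMF from the dose-tumor example, keyed by (x, y).
f = {
    (5, 0): 0.469, (10, 0): 0.124, (20, 0): 0.049,
    (5, 1): 0.231, (10, 1): 0.076, (20, 1): 0.051,
}

# Marginal PMF of X, then conditional PMF f(y | x) = f(x, y) / f_X(x).
f_X = {x: sum(p for (xi, _), p in f.items() if xi == x) for x, _ in f}
cond = {(x, y): p / f_X[x] for (x, y), p in f.items()}

print(round(f_X[5], 3))        # 0.7
print(round(cond[(5, 1)], 3))  # 0.33, i.e. Prob(tumor | dose 5) = 0.231 / 0.7
```

Each conditional distribution sums to one over y, as it must: cond[(x, 0)] + cond[(x, 1)] = 1 for every dose x.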
