Review Probability Basic definitions: Randomization experiment - PowerPoint PPT Presentation



  1. Review • Probability • Basic definitions: randomization experiment, sample spaces, elementary outcomes, event • Basic operations: conditional probability • Bayes' theorem

  2. Objectives • Random variable: discrete random variable, continuous random variable • Two probability distributions: binomial distribution, normal distribution

  3. Random variables • A random variable is a function that assigns a numeric value to each outcome in a sample space. We usually denote a random variable by a capital letter such as X, Y or Z. NOTE: (1) Randomness; (2) Numeric values • Example 1: Randomly select a student from a class. X = the student's number of siblings. X could be 0, 1, 2, … • Example 2: Randomly select a student from a class. X = the student's height. X could be any value greater than 0.

  4. Two types of random variables (1) Discrete random variable: its possible values form a set of discrete (isolated) values. E.g. X = number of siblings. (2) Continuous random variable: its possible values cannot be enumerated; there are infinitely many of them, and every single value has probability zero, i.e. Pr(X = x) = 0 for every x. E.g. X = the student's height.

  5. EG1. Tossing two coins. Let X = number of heads.
      Outcome(s)   TT    HT, TH   HH
      x            0     1        2
      Notation: X is the random variable; x is an observed value.

  6. Probability distribution function • A probability distribution function (pdf) is a mathematical relationship, or rule, that assigns to any possible value x of a discrete random variable X the probability Pr(X = x).

  7. Probability distribution of the random variable X = number of heads.
      Outcome(s)   TT    HT, TH   HH
      x            0     1        2
      P(X = x)     1/4   1/2      1/4
      [Figure: probability histogram of X]
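
The table above can be reproduced by enumerating the four equally likely outcomes. The following is a minimal sketch, assuming a fair coin; the names `outcomes` and `pmf` are illustrative and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of tossing two fair coins.
outcomes = ["".join(pair) for pair in product("HT", repeat=2)]

# X = number of heads; accumulate Pr(X = x) over the outcomes.
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, 0) + Fraction(1, len(outcomes))

for x in sorted(pmf):
    print(x, pmf[x])
```

Exact fractions are used so the probabilities can be compared with the table without rounding.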

  8. EG2. Tossing two dice. Y = the sum of the dots on the two dice. What are the possible values of Y?

  9. Probability distribution of the random variable Y = the sum of the dots on the two dice.
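
The table for Y is not preserved in this transcript. As a sketch of how it could be rebuilt, assuming two fair six-sided dice, the 36 equally likely pairs can be counted by their sum; `counts` and `dist_y` are illustrative names.

```python
from collections import Counter
from fractions import Fraction

# Count how many of the 36 equally likely (a, b) pairs give each sum.
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

# Pr(Y = y) = (number of pairs summing to y) / 36.
dist_y = {y: Fraction(c, 36) for y, c in sorted(counts.items())}

for y, prob in dist_y.items():
    print(y, prob)
```

Under these assumptions Y takes the values 2 through 12, with 7 the most likely sum.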

  10. Relative frequency In practice, a probability can be estimated by the relative frequency of an event "in the long run": $\text{Probability} = \frac{\text{frequency of occurrences}}{\text{frequency of all possible occurrences}}$, with $0 \le \text{Probability} \le 1$. If the experiment is repeated many times, the relative frequency histogram should look very much like the probability histogram.
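
A small simulation can illustrate the long-run idea. This sketch assumes fair coins and reuses the two-coin example; the seed, the number of trials, and the variable names are arbitrary choices made for illustration.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
n_trials = 100_000      # "many times"; an assumed value

# Each trial tosses two fair coins and records the number of heads.
counts = Counter(
    sum(rng.random() < 0.5 for _ in range(2))
    for _ in range(n_trials)
)

# Relative frequency = occurrences of the value / total number of trials.
rel_freq = {x: counts[x] / n_trials for x in range(3)}
print(rel_freq)
```

With this many trials the relative frequencies should land close to the probabilities 1/4, 1/2 and 1/4, though any single run will differ slightly.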

  11. Data set vs. probability distributions • Sample properties, based on a data set: sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$; sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$. • Model or population properties, based on a probability distribution: population mean $\mu = \sum_{i=1}^{R} x_i \Pr(X = x_i)$; population variance $\sigma^2 = \sum_{i=1}^{R} (x_i - \mu)^2 \Pr(X = x_i)$.
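
The sample formulas can be checked on a small data set. The numbers below are hypothetical sibling counts invented for illustration, not data from the slides; the sketch compares the hand-written formulas with Python's `statistics` module, whose `variance` function also divides by n - 1.

```python
import statistics

# Hypothetical data set: number of siblings for eight students.
data = [0, 1, 1, 2, 3, 1, 0, 2]
n = len(data)

# Sample mean: sum of the observations divided by n.
x_bar = sum(data) / n

# Sample variance: squared deviations from x_bar, divided by n - 1.
s2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)

print(x_bar, s2)
print(statistics.mean(data), statistics.variance(data))
```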

  12. Mean of a random variable • The mean or expected value of X, denoted E(X) or µ, is defined as $E(X) = \mu = \sum_{i=1}^{R} x_i \Pr(X = x_i)$, where $x_1, \dots, x_R$ are the possible values of X. • It is the sum of the possible values, each weighted by its probability. • The expectation represents the "average" value of the random variable.

  13. Mean of X, where X = number of heads.
      Outcome(s)   TT    HT, TH   HH
      x            0     1        2
      P(X = x)     1/4   1/2      1/4
      x·P(x)       0     1/2      1/2
      $E(X) = \mu = \sum_{i=1}^{3} x_i \Pr(X = x_i) = 1$

  14. Variance of a random variable • The variance of X is the expected squared distance from the population mean: $\mathrm{Var}(X) = \sigma^2 = \sum_{i=1}^{R} (x_i - \mu)^2 \Pr(X = x_i)$. • The standard deviation σ is the square root of the variance: $\mathrm{sd}(X) = \sigma = \sqrt{\mathrm{Var}(X)}$.

  15. Variance of X, where X = number of heads.
      x       P(x)    (x-µ)² P(x)
      0       0.25    (0-1)² × 0.25 = 0.25
      1       0.5     (1-1)² × 0.5 = 0
      2       0.25    (2-1)² × 0.25 = 0.25
      Total           0.50
      Thus σ² = 0.5. In summary, µ and σ are computed from the probability distribution; they are population properties.
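
The mean and variance of the two-coin example can be recomputed directly from the probability distribution. This is a minimal sketch using exact fractions; the dictionary `pmf` simply restates the table for X = number of heads.

```python
from fractions import Fraction

# Probability distribution of X = number of heads in two fair tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Mean: each possible value weighted by its probability.
mu = sum(x * p for x, p in pmf.items())

# Variance: expected squared distance from the mean.
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Standard deviation: square root of the variance.
sd = float(var) ** 0.5

print(mu, var, sd)
```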

  16. Two types of random variables (1) Discrete random variable: its possible values form a set of discrete (isolated) values. (2) Continuous random variable: its possible values cannot be enumerated; there are infinitely many of them, and every single value has probability zero, i.e. Pr(X = x) = 0 for every x.

  17. Continuous random variables • A balanced spinning pointer can stop anywhere on the circle. • X is the proportion of the total circumference at which it stops. • X can be any value between 0 and 1, so it has infinitely many possible values. • Pr(0.25 ≤ X ≤ 0.75) = 0.5. • Pr(X = 0.5) = 0, because X can take on an infinite number of values.

  18. Probability density function (pdf) of X • The curve $y = f(x)$ is the probability density function (pdf) of the random variable X. • $\Pr(a \le X \le b)$ is the area under the curve between the x values a and b: $\Pr(a \le X \le b) = \int_a^b f(x)\,dx$. • The total area under the density curve over the entire range of possible values of the random variable is 1: $\Pr(-\infty \le X \le \infty) = \int_{-\infty}^{\infty} f(x)\,dx = 1$.

  19. Probability density function (pdf) of X • The pdf $y = f(x)$ has large values in regions of high probability and small values in regions of low probability. • Pr(X = x) = 0 for any specific value x. • Generally, no distinction is made between probabilities such as Pr(X < x) and Pr(X ≤ x), or Pr(a ≤ X ≤ b) and Pr(a < X < b), when X is a continuous random variable.

  20. Expectation and variance of a continuous random variable • Mean µ: $E(X) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$, the center of the probability density. • Variance σ²: $\mathrm{Var}(X) = \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$, the spread of the probability density. • The standard deviation σ is the square root of the variance, that is, $\sigma = \sqrt{\mathrm{Var}(X)}$.
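
These integrals can be approximated numerically. The sketch below assumes the balanced spinner from slide 17 has a uniform density, f(x) = 1 on [0, 1], which is consistent with Pr(0.25 ≤ X ≤ 0.75) = 0.5 but is not stated on the slides. The helper `integrate` is a plain midpoint rule written for this example, not a library function.

```python
def integrate(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x):
    # Assumed density of the spinner: uniform on [0, 1], zero elsewhere.
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

# f is zero outside [0, 1], so integrating over [0, 1] covers the whole range.
total = integrate(f, 0.0, 1.0)                              # total area
prob = integrate(f, 0.25, 0.75)                             # Pr(0.25 <= X <= 0.75)
mu = integrate(lambda x: x * f(x), 0.0, 1.0)                # mean
var = integrate(lambda x: (x - mu) ** 2 * f(x), 0.0, 1.0)   # variance

print(total, prob, mu, var)
```

Under the uniform assumption the mean is 1/2 and the variance is 1/12, and the numerical values should agree with those to several decimal places.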

  21. Two distributions • Binomial: discrete • Normal: continuous

  22. Bernoulli trial Examples: • A heads-or-tails coin toss • A win-or-lose football game • A pass-or-fail automotive smog inspection Properties: • Two outcomes: success or failure • The success probability p is the same in each trial • Trials are independent

  23. Binomial random variable • X is the number of successes in n repeated Bernoulli trials, each with probability p of success. • The success probability p is the same in each trial. • Trials are independent.

  24. Binomial random variable • Probability distribution: the probability of obtaining k successes in n trials, each with success probability p, is $\Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. • $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$, where $n! = n \times (n-1) \times \dots \times 1$, counts all possible ways of getting k successes and n-k failures. • $p^k (1-p)^{n-k}$ is the probability of getting k successes and n-k failures in one particular order.
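
The formula translates directly into code. This is a minimal sketch using `math.comb` from the Python standard library for the binomial coefficient; the function name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Pr(X = k) when X is binomial with n trials and success probability p."""
    # comb(n, k) counts the orderings; p**k * (1-p)**(n-k) is the
    # probability of any one ordering with k successes and n-k failures.
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Two fair coin tosses: should match the earlier distribution of X.
two_coins = [binom_pmf(k, 2, 0.5) for k in range(3)]
print(two_coins)  # [0.25, 0.5, 0.25]
```

With n = 2 and p = 0.5 this reproduces the two-coin distribution of the number of heads.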

  25. Mean and variance of the binomial distribution: $\mu = np$, $\sigma^2 = np(1-p)$
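
These closed forms can be checked against the general definitions of mean and variance for a discrete random variable. The sketch below uses n = 10 and p = 0.3, values chosen arbitrarily for illustration.

```python
from math import comb

n, p = 10, 0.3  # hypothetical parameters, not from the slides

# Binomial probabilities Pr(X = k) for k = 0, ..., n.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and variance from the definitions (sums over all possible values).
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)                 # both close to 3.0
print(variance, n * p * (1 - p))   # both close to 2.1
```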

  26. Exercise Newborns were screened for HIV in a Massachusetts hospital. The positive rate for inner-city babies is p = 0.01. If 500 newborns are screened: 1. What is the exact binomial probability of 5 HIV-positive test results?

  27. Exercise Newborns were screened for HIV in a Massachusetts hospital. The positive rate for inner-city babies is p = 0.01. If 500 newborns are screened: 1. What is the exact binomial probability of 5 HIV-positive test results? Answer: $\Pr(X = 5) = \binom{500}{5}\, 0.01^5\, (1 - 0.01)^{495} = 0.176$. EXCEL: BINOMDIST(5,500,0.01,FALSE)

  28. Exercise Newborns were screened for HIV in a Massachusetts hospital. The positive rate for inner-city babies is p = 0.01. If 500 newborns are screened: 2. What is the exact binomial probability of at least 5 HIV-positive test results?

  29. Exercise Newborns were screened for HIV in a Massachusetts hospital. The positive rate for inner-city babies is p = 0.01. If 500 newborns are screened: 2. What is the exact binomial probability of at least 5 HIV-positive test results? Answer: $\Pr(X \ge 5) = 1 - \Pr(X \le 4) = 1 - F(4) = 1 - 0.44 = 0.56$. EXCEL: F(4) = BINOMDIST(4,500,0.01,TRUE)
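
Both exercise answers can be cross-checked outside Excel. This is a sketch in Python with n = 500 and p = 0.01 taken from the exercise; the variable names are illustrative.

```python
from math import comb

n, p = 500, 0.01  # values given in the exercise

def binom_pmf(k):
    """Pr(X = k) for the screening example."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Part 1: exactly 5 positives.
p_exactly_5 = binom_pmf(5)

# Part 2: at least 5 positives, via the complement 1 - F(4).
cdf_4 = sum(binom_pmf(k) for k in range(5))  # F(4) = Pr(X <= 4)
p_at_least_5 = 1 - cdf_4

print(round(p_exactly_5, 3), round(cdf_4, 2), round(p_at_least_5, 2))
```

The rounded results should agree with the answers on the slides (0.176, 0.44 and 0.56).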

  30. Normal distribution • The normal distribution is also called the Gaussian distribution, after the well-known mathematician Karl Gauss (1777-1855, "the Prince of Mathematicians").

  31. Normal distribution • The normal distribution is very useful. • Many things closely follow a normal distribution: heights of people, errors in measurement, blood pressure, scores on a test. • Many other distributions, such as the binomial, can be made approximately normal by transformation. • Most statistical methods considered in this text are based on the normal distribution.

  32. The pdf of the normal distribution • The normal distribution is defined by its pdf, which for some parameters µ and σ is given by $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$.
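
The density can be written out and sanity-checked numerically. This sketch compares a hand-written `normal_pdf` (an illustrative name) with `statistics.NormalDist` from the Python standard library, and approximates the total area with a midpoint sum over [-8, 8] for the standard normal, where the tails beyond that range are negligible.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

mu, sigma = 0.0, 1.0  # standard normal, chosen for illustration

# Compare with the standard-library implementation at one point.
print(normal_pdf(1.0, mu, sigma), NormalDist(mu, sigma).pdf(1.0))

# Midpoint-rule approximation of the area under the curve on [-8, 8].
step = 0.001
area = step * sum(
    normal_pdf(-8 + (i + 0.5) * step, mu, sigma) for i in range(16_000)
)
print(area)  # should be very close to 1
```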

  33. Other properties of the normal pdf • Mean = median = mode • Symmetric about the center • 50% of values are less than the mean
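
These properties can be illustrated with `statistics.NormalDist`. The parameters below (mean 120, standard deviation 15) are hypothetical values picked for the example, not figures from the slides.

```python
from statistics import NormalDist

dist = NormalDist(mu=120, sigma=15)  # hypothetical parameters

# 50% of values lie below the mean.
below_mean = dist.cdf(dist.mean)

# The median (50th percentile) coincides with the mean.
median = dist.inv_cdf(0.5)

# Symmetry: the density is equal at the same distance either side of the mean.
left, right = dist.pdf(105), dist.pdf(135)

print(below_mean, median, left, right)
```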

  34. Location is measured by µ • In the graph, $\mu_2 > \mu_1$

  35. Spread is measured by σ² • In the graph, $\sigma_2 > \sigma_1$
