SLIDE 1
ACMS 20340 Statistics for Life Sciences
Chapter 12: Discrete Probability Distributions
What about categorical variables?
We’ve studied various distributions of quantitative variables, most notably the Normal distributions. But what is the
SLIDE 2
SLIDE 3
Some Motivating Examples
◮ You toss a fair coin ten times.
  ◮ How many times does it come up heads?
  ◮ What is the probability of it coming up heads exactly three times?
◮ An obstetrician oversees 12 single-birth deliveries on a certain day.
  ◮ How many of the deliveries are of girls?
  ◮ What is the probability of there being exactly 7 girls in this “batch” of 12?
SLIDE 4
The Binomial Setting
1. There is a fixed number n of observations.
2. The n observations are independent, which means that knowing the result of one observation doesn’t change the probabilities we assign to other observations.
3. Each observation falls into one of two categories, one of which we will call “success” and the other “failure”.
4. The probability p of a success is the same for each observation.
SLIDE 5
The Binomial Distribution
The count X of successes in the binomial setting has the binomial distribution with parameters n and p. The parameter n is the number of observations, and p is the probability of a success on any one observation. The possible values of X are whole numbers from 0 to n. An important caveat: Not all counts have a binomial distribution, so we must ensure that we’re in the binomial setting before we conclude that a count has a binomial distribution.
SLIDE 6
Binomial Distribution Examples
◮ You toss a fair coin ten times and count the number of Hs.
  ◮ n = 10
  ◮ p = 1/2
◮ An obstetrician oversees 12 single-birth deliveries on a certain day and counts the number of girls born.
  ◮ n = 12
  ◮ p = 1/2
◮ You roll a fair die 100 times and count the number of occurrences of ‘1’.
  ◮ n = 100
  ◮ p = 1/6
SLIDE 7
A Non-Example
You select five balls from a barrel containing 50 red balls and 50 blue balls, without replacement. What is the probability of selecting only red balls?
$$\frac{50}{100} \cdot \frac{49}{99} \cdot \frac{48}{98} \cdot \frac{47}{97} \cdot \frac{46}{96} = \frac{1081}{38412} \approx 0.028$$
Why aren’t these counts binomially distributed?
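One way to explore the question is to compute the without-replacement probability next to what a binomial model would predict. The following is a minimal Python sketch using only the standard library; the function name `prob_all_red` is illustrative and not part of the course material.

```python
from fractions import Fraction

def prob_all_red(red, blue, draws):
    """Probability that every one of `draws` balls drawn without replacement is red."""
    prob = Fraction(1)
    for i in range(draws):
        # After i red balls are removed, red - i red balls remain among red + blue - i.
        prob *= Fraction(red - i, red + blue - i)
    return prob

without_replacement = prob_all_red(50, 50, 5)
binomial_model = Fraction(1, 2) ** 5  # what a constant p = 1/2 on every draw would give

print(without_replacement, float(without_replacement))
print(binomial_model, float(binomial_model))
```

The two values differ because the chance of drawing red changes from one draw to the next, so the draws are not independent with a common p.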
SLIDE 8
Binomial Probabilities 1
What we’d like is a formula for the probability that a binomial random variable takes any value. Idea: We add probabilities for the different ways of getting exactly that many successes in n observations. That is, if X is a binomial random variable, we want a formula for calculating P(X = k) for any k = 0, 1, 2, . . . , n.
SLIDE 9
Binomial Probabilities 2
Let’s first consider an example.
Each child born to a particular set of parents has probability 0.25 of having blood type O. If these parents have 5 children, what is the probability of exactly two of them having blood type O?
The count of children with blood type O is binomially distributed:
◮ n = 5
◮ p = 0.25
Let’s use “S” to stand for success (blood type O) and “F” to stand for failure.
SLIDE 10
Binomial Probabilities 3
Step 1: What is the probability that just the first and third children are successes? That is, P(SFSFF) = ?
The probability of a sequence of independent events is the product of the probabilities of the individual events:
$$P(SFSFF) = P(S) \cdot P(F) \cdot P(S) \cdot P(F) \cdot P(F) = (0.25)(0.75)(0.25)(0.75)(0.75) = (0.25)^2 (0.75)^3$$
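The product rule for one specific sequence can be sketched in a few lines of Python. The helper `sequence_probability` and its string encoding of outcomes are assumptions made for illustration only.

```python
p_success = 0.25            # P(S): a child has blood type O
p_failure = 1 - p_success   # P(F)

def sequence_probability(sequence):
    """Multiply the per-child probabilities for a string of 'S' and 'F' outcomes."""
    prob = 1.0
    for outcome in sequence:
        prob *= p_success if outcome == "S" else p_failure
    return prob

print(sequence_probability("SFSFF"))
```

Any other ordering of two S's and three F's, such as `"FFSSF"`, gives the same product, since multiplication does not depend on order.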
SLIDE 11
Binomial Probabilities 4
Step 2: Observe that any arrangement of 2 S’s and 3 F’s has this same probability: we always multiply 0.25 twice and 0.75 three times.
So the probability that X = 2 is the probability of getting 2 S’s and 3 F’s in any arrangement whatsoever:
SSFFF SFSFF SFFSF SFFFS FSSFF
FSFSF FSFFS FFSSF FFSFS FFFSS
There are ten such arrangements, each with the same probability, and hence
$$P(X = 2) = 10\,(0.25)^2 (0.75)^3 = 0.2637.$$
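The list of arrangements can also be generated by machine. This sketch, assuming Python's `itertools.combinations` as the enumeration tool, picks which 2 of the 5 birth positions hold an S and then multiplies the count by the common probability.

```python
from itertools import combinations

n, k, p = 5, 2, 0.25

# Each arrangement is determined by which k of the n positions hold an 'S'.
arrangements = [
    "".join("S" if i in positions else "F" for i in range(n))
    for positions in combinations(range(n), k)
]

prob_each = p**k * (1 - p) ** (n - k)      # same for every arrangement
prob_two_type_o = len(arrangements) * prob_each

print(arrangements)
print(len(arrangements), round(prob_two_type_o, 4))
```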
SLIDE 12
The Binomial Coefficient
The number of ways of arranging k successes among n observations is given by the binomial coefficient
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
for any k = 0, 1, 2, . . . , n.
Recall that the factorial of n is
$$n! = n \cdot (n-1) \cdot (n-2) \cdots 3 \cdot 2 \cdot 1,$$
and 0! = 1.
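As a quick check of the definition, the factorial formula can be coded directly and compared with Python's built-in `math.comb`. The wrapper name `binomial_coefficient` is our own choice for this sketch.

```python
from math import comb, factorial

def binomial_coefficient(n, k):
    """Number of ways to arrange k successes among n observations: n! / (k! (n - k)!)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    # Integer division is exact here because the quotient is always a whole number.
    return factorial(n) // (factorial(k) * factorial(n - k))

for k in range(6):
    print(k, binomial_coefficient(5, k), comb(5, k))
```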
SLIDE 13
The Binomial Coefficient in Action
How many different ways are there to have exactly two successes in five trials?
$$\binom{5}{2} = \frac{5!}{2!\,3!} = \frac{(5)(4)(3)(2)(1)}{(2)(1)\,(3)(2)(1)} = \frac{(5)(4)}{(2)(1)} = \frac{20}{2} = 10.$$
SLIDE 14
The Official Formula for Binomial Probabilities
If X has the binomial distribution with n observations and probability p of success for each observation, then the possible values of X are 0, 1, 2, . . . , n. If k is any one of these values, then
$$P(X = k) = \binom{n}{k}\, p^k (1-p)^{n-k}.$$
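The formula translates almost symbol for symbol into code. Below is a minimal sketch; the function name `binomial_pmf` is an assumption for illustration, and the inputs reuse the blood type example (n = 5, p = 0.25).

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial count X with n observations and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Full distribution for the blood type example: n = 5, p = 0.25.
distribution = [binomial_pmf(k, 5, 0.25) for k in range(6)]
for k, prob in enumerate(distribution):
    print(k, round(prob, 4))
```

The probabilities over k = 0, 1, . . . , n should add up to 1, which is a useful sanity check on any implementation.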
SLIDE 15
Example
One in ten boxes of Cracker Jacks contains a decoder ring. What is the probability that no more than one of ten randomly chosen boxes of Cracker Jacks contains a decoder ring?
◮ n = 10
◮ p = 0.1
$$\begin{aligned}
P(X \le 1) &= P(X = 0) + P(X = 1) \\
&= \binom{10}{0}(0.1)^0(0.9)^{10} + \binom{10}{1}(0.1)^1(0.9)^9 \\
&= \frac{10!}{0!\,10!}(1)(0.3487) + \frac{10!}{1!\,9!}(0.1)(0.3874) \\
&= (1)(1)(0.3487) + (10)(0.1)(0.3874) \\
&= 0.3487 + 0.3874 = 0.7361
\end{aligned}$$
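A "no more than" probability is a sum of individual binomial probabilities, which is easy to automate. This sketch assumes a helper named `binomial_cdf` (our name) and reuses n = 10, p = 0.1 from the example.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k): add the binomial probabilities for 0, 1, ..., k successes."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

prob_at_most_one = binomial_cdf(1, 10, 0.1)
print(round(prob_at_most_one, 4))
```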
SLIDE 16
Binomial mean and standard deviation
Q: In many repetitions of the binomial setting, with n observations and probability of success p, what will be the average count of successes? (In other words, what is the mean of the count variable X?)
A: If a count X has the binomial distribution with n observations and probability p of success, the mean and standard deviation of X are
$$\mu = np, \qquad \sigma = \sqrt{np(1-p)}.$$
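These shortcut formulas agree with the general definitions of mean and standard deviation for a discrete distribution. The sketch below compares the two routes numerically; both function names are illustrative, and n = 12, p = 0.5 is borrowed from the delivery example.

```python
from math import comb, sqrt

def binomial_mean_sd(n, p):
    """Mean and standard deviation from the formulas mu = np, sigma = sqrt(np(1 - p))."""
    return n * p, sqrt(n * p * (1 - p))

def mean_sd_from_pmf(n, p):
    """Same quantities computed directly from the probabilities P(X = k)."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * prob for k, prob in enumerate(pmf))
    variance = sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))
    return mean, sqrt(variance)

print(binomial_mean_sd(12, 0.5))
print(mean_sd_from_pmf(12, 0.5))
```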
SLIDE 17
Coin Tossing
You toss a fair coin ten times and count the occurrences of Hs.
◮ n = 10
◮ p = 1/2
If we repeat the ten tosses many times, how many heads should occur on average?
$$\mu = np = (10)(1/2) = 5$$
And the standard deviation?
$$\sigma = \sqrt{np(1-p)} = \sqrt{10(1/2)(1/2)} = \sqrt{5/2} \approx 1.58$$
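The phrase "on average" can be made concrete with a small simulation. This is a sketch only: the number of repetitions (20,000) and the seed are arbitrary choices, and `simulate_head_counts` is a hypothetical helper name.

```python
import random
from statistics import mean, pstdev

def simulate_head_counts(repetitions, tosses=10, p=0.5, seed=0):
    """Count heads in `tosses` coin flips, repeated `repetitions` times."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    return [
        sum(rng.random() < p for _ in range(tosses))
        for _ in range(repetitions)
    ]

counts = simulate_head_counts(20_000)
# The sample mean and standard deviation should land near 5 and sqrt(5/2).
print(mean(counts), pstdev(counts))
```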
SLIDE 18
The Normal Approximation to Binomial Distributions
Suppose that a count X has the binomial distribution with n observations and probability of success p.
When n is large, the distribution of X is approximately Normal,
$$N\left(np,\ \sqrt{np(1-p)}\right).$$
As a rule of thumb, we use the Normal approximation when n is so large that np ≥ 10 and n(1 − p) ≥ 10.
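The rule of thumb is a pair of inequalities and can be checked mechanically. In this sketch the function name and the `threshold` parameter are our own; the value 10 comes from the rule as stated.

```python
def normal_approximation_ok(n, p, threshold=10):
    """Rule of thumb: expected successes and expected failures are both at least 10."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approximation_ok(10, 0.5))     # ten coin tosses: np = 5
print(normal_approximation_ok(100, 1 / 6))  # 100 die rolls: np is about 16.7
```

For ten coin tosses the check fails, so the exact binomial formula is the safer tool there.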
SLIDE 19
Remember This?
SLIDE 20
One Last Example
About 60% of American adults are either overweight or obese. What is the probability that at least 1520 individuals from a random sample of 2500 adults are overweight or obese?
Given that our sample is random, we can take the 2500 members of our sample to be independent.
So we’re in the binomial setting:
◮ n = 2500
◮ p = 0.6
Using software, we find that P(X ≥ 1520) = 0.2131.
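The slides do not say which software was used. One possible way to reproduce such a calculation is sketched below in Python with exact rational arithmetic, which sidesteps the floating-point underflow that terms like 0.6^1520 would cause. The function name `binomial_upper_tail` is an assumption for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_upper_tail(k_min, n, p):
    """Exact P(X >= k_min) for a binomial count, with p supplied as a Fraction."""
    q = 1 - p
    # Sum the binomial probabilities for k_min, k_min + 1, ..., n exactly.
    total = sum(comb(n, k) * p**k * q ** (n - k) for k in range(k_min, n + 1))
    return float(total)  # convert to a decimal only at the very end

prob = binomial_upper_tail(1520, 2500, Fraction(3, 5))
print(round(prob, 4))
```

Summing nearly a thousand terms by hand is impractical, which is what motivates looking for an approximation.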
SLIDE 21
Let’s Use the Normal Approximation 1
$$\mu = np = (2500)(0.6) = 1500$$
$$\sigma = \sqrt{np(1-p)} = \sqrt{(2500)(0.6)(0.4)} = 24.49$$
The distribution of this binomial random variable is approximated well by the Normal distribution N(1500, 24.49) (since np = 1500 ≥ 10 and n(1 − p) = 1000 ≥ 10).
SLIDE 22
Let’s Use the Normal Approximation 2
$$P(X \ge 1520) = P\left(\frac{X - 1500}{24.49} \ge \frac{1520 - 1500}{24.49}\right) = P(Z \ge 0.82)$$
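The standardization can be checked numerically. This sketch uses `statistics.NormalDist` from Python's standard library (our choice of tool, not the course's) and evaluates the upper tail without rounding z to two decimals, so the result may differ slightly from a table lookup at z = 0.82.

```python
from math import sqrt
from statistics import NormalDist

n, p = 2500, 0.6
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Standardize the cutoff 1520, then take the upper-tail area P(Z >= z).
z = (1520 - mu) / sigma
approx_prob = 1 - NormalDist(mu, sigma).cdf(1520)

print(round(z, 2), approx_prob)
```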