ACMS 20340 Statistics for Life Sciences Chapter 12: Discrete - - PowerPoint PPT Presentation



slide-1
SLIDE 1

ACMS 20340 Statistics for Life Sciences

Chapter 12: Discrete Probability Distributions

slide-2
SLIDE 2

What about categorical variables?

We’ve studied various distributions of quantitative variables, most notably, the Normal distributions. But what is the appropriate probability model for the count of successful outcomes of a categorical variable? We will focus on one distribution in particular, the binomial distribution.

slide-3
SLIDE 3

Some Motivating Examples

◮ You toss a fair coin ten times.
    ◮ How many times does it come up heads?
    ◮ What is the probability of it coming up heads exactly three times?

◮ An obstetrician oversees 12 single-birth deliveries on a certain day.
    ◮ How many of the deliveries are of girls?
    ◮ What is the probability of there being exactly 7 girls in this “batch” of 12?

slide-4
SLIDE 4

The Binomial Setting

  1. There is a fixed number n of observations.
  2. The n observations are independent, which means that knowing the result of one observation doesn’t change the probabilities we assign to other observations.
  3. Each observation falls into one of two categories, one of which we will call “success”, and the other “failure”.
  4. The probability p of a success is the same for each observation.

slide-5
SLIDE 5

The Binomial Distribution

The count X of successes in the binomial setting has the binomial distribution with parameters n and p. The parameter n is the number of observations, and p is the probability of a success on any one observation. The possible values of X are whole numbers from 0 to n. An important caveat: Not all counts have a binomial distribution, so we must ensure that we’re in the binomial setting before we conclude that a count has a binomial distribution.

slide-6
SLIDE 6

Binomial Distribution Examples

◮ You toss a fair coin ten times and count the number of Hs.
    ◮ n = 10
    ◮ p = 1/2

◮ An obstetrician oversees 12 single-birth deliveries on a certain day and counts the number of girls born.
    ◮ n = 12
    ◮ p = 1/2

◮ You roll a fair die 100 times and count the number of occurrences of ‘1’.
    ◮ n = 100
    ◮ p = 1/6

slide-7
SLIDE 7

A Non-Example

You select five balls from a barrel containing 50 red balls and 50 blue balls, without replacement. What is the probability of selecting only red balls?

$$\frac{50}{100} \cdot \frac{49}{99} \cdot \frac{48}{98} \cdot \frac{47}{97} \cdot \frac{46}{96} = \frac{1081}{38412} \approx 0.028$$

Why aren’t these counts binomially distributed?
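
The product above can be checked with exact fractions. The sketch below is only an illustration (the function name `prob_all_red` is made up for this example). It also prints what a binomial model with p = 1/2 would give. The two answers differ because the chance of drawing red changes after every draw (50/100, then 49/99, and so on), so the observations are not independent and p is not constant.

```python
from fractions import Fraction

def prob_all_red(red=50, blue=50, draws=5):
    """Probability that every ball drawn WITHOUT replacement is red."""
    prob = Fraction(1)
    for i in range(draws):
        # After i red balls are gone, red - i reds remain among red + blue - i balls.
        prob *= Fraction(red - i, red + blue - i)
    return prob

without_replacement = prob_all_red()
# Hypothetical comparison: five independent draws with p = 1/2 (i.e. with replacement).
with_replacement = Fraction(1, 2) ** 5

print(without_replacement, float(without_replacement))
print(with_replacement, float(with_replacement))
```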

slide-8
SLIDE 8

Binomial Probabilities 1

What we’d like is a formula for the probability that a binomial random variable takes any value. Idea: We add probabilities for the different ways of getting exactly that many successes in n observations. That is, if X is a binomial random variable, we want a formula for calculating P(X = k) for any k = 0, 1, 2, . . . , n.

slide-9
SLIDE 9

Binomial Probabilities 2

Let’s first consider an example. Each child born to a particular set of parents has probability 0.25 of having blood type O. If these parents have 5 children, what is the probability of exactly two of them having blood type O?

The count of children with blood type O is binomially distributed:

◮ n = 5
◮ p = 0.25

Let’s use “S” to stand for success (blood type O) and “F” to stand for failure.

slide-10
SLIDE 10

Binomial Probabilities 3

Step 1: What is the probability that just the first and third children are successes? That is, P(SFSFF) = ?

The probability of a sequence of independent events is the product of the probabilities of each individual event:

$$P(SFSFF) = P(S) \cdot P(F) \cdot P(S) \cdot P(F) \cdot P(F) = (0.25)(0.75)(0.25)(0.75)(0.75) = (0.25)^2 (0.75)^3$$

slide-11
SLIDE 11

Binomial Probabilities 4

Step 2: Observe that any arrangement of 2 S’s and 3 F’s has this same probability: we always multiply 0.25 twice and 0.75 three times whenever we have 2 S’s and 3 F’s.

So the probability that X = 2 is the probability of getting 2 S’s and 3 F’s in any arrangement whatsoever:

SSFFF SFSFF SFFSF SFFFS FSSFF
FSFSF FSFFS FFSSF FFSFS FFFSS

There are ten such arrangements, each with the same probability, and hence

$$P(X = 2) = 10\,(0.25)^2 (0.75)^3 = 0.2637.$$
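
The counting argument in Steps 1 and 2 can be reproduced by brute force. This is a minimal sketch, not part of the original slides: it lists every placement of 2 S’s among 5 positions and multiplies the count by the probability of any single arrangement.

```python
from itertools import combinations

n, k, p = 5, 2, 0.25  # 5 children, 2 successes, P(blood type O) = 0.25

# Choose which positions hold an S; every other position holds an F.
arrangements = []
for success_positions in combinations(range(n), k):
    arrangements.append(
        "".join("S" if i in success_positions else "F" for i in range(n))
    )

prob_each = p**k * (1 - p)**(n - k)          # (0.25)^2 (0.75)^3
prob_x_equals_2 = len(arrangements) * prob_each

print(arrangements)
print(len(arrangements), round(prob_x_equals_2, 4))
```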

slide-12
SLIDE 12

The Binomial Coefficient

The number of ways of arranging k successes among n observations is given by the binomial coefficient

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$

for any k = 0, 1, 2, . . . , n.

Recall that the factorial of n, written n!, is

$$n! = n \cdot (n - 1) \cdot (n - 2) \cdots 3 \cdot 2 \cdot 1,$$

and 0! = 1.

slide-13
SLIDE 13

The Binomial Coefficient in Action

How many different ways are there to have exactly two successes in five trials?

$$\binom{5}{2} = \frac{5!}{2!\,3!} = \frac{(5)(4)(3)(2)(1)}{(2)(1)(3)(2)(1)} = \frac{(5)(4)}{(2)(1)} = \frac{20}{2} = 10.$$
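
The same calculation can be done in a few lines of Python. This is a sketch for illustration: `binomial_coefficient` is a hypothetical helper that follows the factorial formula directly, and it is compared against the standard library's `math.comb` (available in Python 3.8 and later).

```python
import math

def binomial_coefficient(n, k):
    """n! / (k! (n - k)!), computed straight from the factorial definition."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

ways = binomial_coefficient(5, 2)
print(ways)                      # 10, matching the hand calculation
print(ways == math.comb(5, 2))   # the built-in gives the same count
```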

slide-14
SLIDE 14

The Official Formula for Binomial Probabilities

If X has the binomial distribution with n observations and probability p of success for each observation, then the possible values of X are 0, 1, 2, . . . , n. If k is any one of these values, then

$$P(X = k) = \binom{n}{k}\, p^k (1 - p)^{n - k}.$$
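
As a rough sketch, the formula translates almost symbol for symbol into code. The function name `binomial_pmf` is an assumption made for this example. It is applied here to the blood type setting (n = 5, p = 0.25).

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X binomial with n observations and success probability p."""
    if not 0 <= k <= n:
        return 0.0  # values outside 0..n are impossible
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Full distribution of the number of children (out of 5) with blood type O.
probs = [binomial_pmf(k, 5, 0.25) for k in range(6)]
for k, prob in enumerate(probs):
    print(k, round(prob, 4))
```

The six probabilities sum to 1, as they must for a complete distribution.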
slide-15
SLIDE 15

Example

One in ten boxes of Cracker Jacks contains a decoder ring. What is the probability that no more than one of ten randomly chosen boxes of Cracker Jacks contains a decoder ring?

◮ n = 10
◮ p = 0.1

$$
\begin{aligned}
P(X \le 1) &= P(X = 0) + P(X = 1) \\
&= \binom{10}{0}(0.1)^0(0.9)^{10} + \binom{10}{1}(0.1)^1(0.9)^9 \\
&= \frac{10!}{0!\,10!}(1)(0.3487) + \frac{10!}{1!\,9!}(0.1)(0.3874) \\
&= (1)(1)(0.3487) + (10)(0.1)(0.3874) \\
&= 0.3487 + 0.3874 = 0.7361
\end{aligned}
$$
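
A short check of this sum, offered as a sketch only: `binomial_cdf` is a hypothetical helper that adds up the binomial probabilities for 0 through k successes.

```python
import math

def binomial_cdf(k, n, p):
    """P(X <= k): sum of the binomial probabilities for 0, 1, ..., k successes."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

# At most one decoder ring among ten boxes, with p = 0.1 per box.
p_at_most_one = binomial_cdf(1, 10, 0.1)
print(round(p_at_most_one, 4))  # 0.7361
```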

slide-16
SLIDE 16

Binomial mean and standard deviation

Q: In many repetitions of the binomial setting, with n observations and probability of success p, what will be the average count of successes? (In other words, what is the mean of the count variable X?)

A: If a count X has the binomial distribution with n observations and probability p of success, the mean and standard deviation of X are

$$\mu = np, \qquad \sigma = \sqrt{np(1 - p)}.$$
slide-17
SLIDE 17

Coin Tossing

You toss a fair coin ten times and count the occurrences of Hs.

◮ n = 10
◮ p = 1/2

If we repeat the ten tosses many times, how many heads should occur on average?

$$\mu = np = (10)(1/2) = 5$$

And the standard deviation?

$$\sigma = \sqrt{np(1 - p)} = \sqrt{10(1/2)(1/2)} = \sqrt{5/2} \approx 1.58$$
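
The sketch below applies the two formulas to the coin tossing example and, as a cross-check, recomputes the mean and variance directly from the binomial probabilities. The helper name `binomial_mean_sd` is an assumption for illustration.

```python
import math

def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1 - p)) of a binomial count."""
    return n * p, math.sqrt(n * p * (1 - p))

mu, sigma = binomial_mean_sd(10, 0.5)
print(mu, round(sigma, 2))

# Cross-check: weight each possible count k by its probability P(X = k).
pmf = [math.comb(10, k) * 0.5**10 for k in range(11)]
mean_from_pmf = sum(k * q for k, q in enumerate(pmf))
var_from_pmf = sum((k - mean_from_pmf) ** 2 * q for k, q in enumerate(pmf))
print(mean_from_pmf, var_from_pmf)
```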
slide-18
SLIDE 18

The Normal Approximation to Binomial Distributions

Suppose that a count X has the binomial distribution with n observations and probability of success p.

When n is large, the distribution of X is approximately Normal, $N(np, \sqrt{np(1 - p)})$.

As a rule of thumb, we use the Normal approximation when n is so large that np ≥ 10 and n(1 − p) ≥ 10.
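
The rule of thumb is easy to encode. In this sketch the function name `normal_approximation_ok` and its `threshold` parameter are made up for illustration, and the rule is applied to settings that appear elsewhere in these slides.

```python
def normal_approximation_ok(n, p, threshold=10):
    """Rule of thumb: both np and n(1 - p) should be at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approximation_ok(10, 0.5))     # ten coin tosses: np = 5
print(normal_approximation_ok(100, 1 / 6))  # 100 die rolls: np is about 16.7
print(normal_approximation_ok(2500, 0.6))   # sample of 2500 adults: np = 1500
```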

slide-19
SLIDE 19

Remember This?

slide-20
SLIDE 20

One Last Example

About 60% of American adults are either overweight or obese. What is the probability that at least 1520 individuals from a random sample of 2500 adults are overweight or obese?

Given that our sample is random, we can take the 2500 members of our sample to be independent.

So we’re in the binomial setting:

◮ n = 2500
◮ p = 0.6

Using software, we find that P(X ≥ 1520) = 0.2131.
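
The slides do not say which software was used. One possible way to get this tail probability in plain Python is sketched below. It sums the binomial formula for k = 1520, ..., 2500 using exact integer arithmetic, because powers such as 0.6^1520 are too small to represent as ordinary floating-point numbers. The function name and the choice to pass p as a fraction (3/5) are assumptions of this sketch.

```python
import math

def binomial_upper_tail(k_min, n, p_num, p_den):
    """P(X >= k_min) for X binomial with n observations and p = p_num / p_den.

    Every term is kept as an exact integer over the common denominator
    p_den**n, so nothing underflows; one division at the end gives a float.
    """
    q_num = p_den - p_num  # numerator of 1 - p
    numerator = sum(
        math.comb(n, k) * p_num**k * q_num**(n - k)
        for k in range(k_min, n + 1)
    )
    return numerator / p_den**n

# p = 0.6 written as 3/5; the slide reports 0.2131 for this probability.
tail = binomial_upper_tail(1520, 2500, 3, 5)
print(round(tail, 4))
```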

slide-21
SLIDE 21

Let’s Use the Normal Approximation 1

$$\mu = np = (2500)(0.6) = 1500, \qquad \sigma = \sqrt{np(1 - p)} = \sqrt{(2500)(0.6)(0.4)} = 24.49$$

The distribution of this binomial random variable is approximated well by the Normal distribution N(1500, 24.49) (since np = 1500 ≥ 10 and n(1 − p) = 1000 ≥ 10).

slide-22
SLIDE 22

Let’s Use the Normal Approximation 2

$$P(X \ge 1520) = P\left(\frac{X - 1500}{24.49} \ge \frac{1520 - 1500}{24.49}\right) \approx P(Z \ge 0.82) = 1 - 0.7939 = 0.2061$$

The Normal approximation 0.2061 differs from the software result 0.2131 by only 0.007.
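
The table lookup can be mimicked in code. This is a sketch under stated assumptions: the standard Normal cumulative probability is computed from `math.erf` rather than read from a table, and `standard_normal_cdf` is a name chosen for this example. Rounding z to two decimals, as a table requires, should reproduce the value above. Using the unrounded z gives a slightly different tail probability.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for a standard Normal Z, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 2500, 0.6
mu = n * p                          # 1500
sigma = math.sqrt(n * p * (1 - p))  # about 24.49

z = (1520 - mu) / sigma
tail_rounded_z = 1 - standard_normal_cdf(round(z, 2))  # z rounded as in a table
tail_exact_z = 1 - standard_normal_cdf(z)              # z left unrounded

print(round(z, 4), round(tail_rounded_z, 4), round(tail_exact_z, 4))
```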