slide-1
SLIDE 1

Unit 2: Probability and distributions
Lecture 2: Normal distribution
Statistics 101

Mine Çetinkaya-Rundel
September 17, 2013

slide-2
SLIDE 2

1. Normal distribution
   - Normal distribution model
   - 68-95-99.7 Rule
   - Standardizing with Z scores
   - Percentiles
   - Recap
2. Evaluating the normal approximation
3. Application exercises
   - Finding probabilities // Quality control
   - Finding cutoff points // Hot bodies
   - Conditional probability // SAT scores
   - Finding missing parameters // Auto insurance premiums

slide-4
SLIDE 4

Normal distribution

Heights of males

“The male heights on OkCupid very nearly follow the expected normal distribution – except the whole thing is shifted to the right of where it should be. Almost universally guys like to add a couple inches.”

“You can also see a more subtle vanity at work: starting at roughly 5’ 8”, the top of the dotted curve tilts even further rightward. This means that guys as they get closer to six feet round up a bit more than usual, stretching for that coveted psychological benchmark.”

Source: http://blog.okcupid.com/index.php/the-biggest-lies-in-online-dating/

slide-6
SLIDE 6

Normal distribution

Heights of females

“When we looked into the data for women, we were surprised to see height exaggeration was just as widespread, though without the lurch towards a benchmark height.”

Source: http://blog.okcupid.com/index.php/the-biggest-lies-in-online-dating/

slide-7
SLIDE 7

Normal distribution Normal distribution model

Normal distribution

Denoted as N(µ, σ): normal with mean µ and standard deviation σ.
Unimodal and symmetric, bell-shaped curve that also follows very strict guidelines about how variably the data are distributed around the mean.
Therefore, while many variables are nearly normal, none are exactly normal.

slide-12
SLIDE 12

Normal distribution 68-95-99.7 Rule

68-95-99.7 Rule

[Figure: normal curve with axis ticks from µ − 3σ to µ + 3σ and nested shaded regions labeled 68%, 95%, and 99.7%]

For nearly normally distributed data,
  about 68% falls within 1 SD of the mean,
  about 95% falls within 2 SD of the mean,
  about 99.7% falls within 3 SD of the mean.

It is possible for observations to fall 4, 5, or more standard deviations away from the mean, but these occurrences are very rare if the data are nearly normal.
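
The percentages in the rule can be checked numerically. The following is a minimal sketch in base R, assuming only `pnorm()` (which is the standard normal CDF when called with its default arguments); the variable names are illustrative and not part of the lecture.

```r
# Area of a normal distribution within k standard deviations of the mean.
# Working on the Z scale means the result holds for any mean and SD.
k <- 1:3
within_k_sd <- pnorm(k) - pnorm(-k)
round(100 * within_k_sd, 1)   # percentages within 1, 2, and 3 SD

# Probability of landing more than 4 SD from the mean (both tails combined)
beyond_4_sd <- 2 * pnorm(-4)
beyond_4_sd
```

The first result should come out close to 68, 95, and 99.7 percent, and the second shows how small the probability of a 4 SD observation is under a normal model.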

slide-17
SLIDE 17

Normal distribution 68-95-99.7 Rule

Describing variability using the 68-95-99.7 Rule

SAT scores are distributed nearly normally with mean 1500 and standard deviation 300.

[Figure: normal curve for SAT scores with axis ticks at 600, 900, 1200, 1500, 1800, 2100, and 2400 and nested shaded regions labeled 68%, 95%, and 99.7%]

∼68% of students score between 1200 and 1800 on the SAT.
∼95% of students score between 900 and 2100 on the SAT.
∼99.7% of students score between 600 and 2400 on the SAT.

slide-19
SLIDE 19

Normal distribution 68-95-99.7 Rule

Participation question

A doctor collects a large set of heart rate measurements that approximately follow a normal distribution. He only reports 3 statistics: the mean = 110 beats per minute, the minimum = 65 beats per minute, and the maximum = 155 beats per minute. Which of the following is most likely to be the standard deviation of the distribution?

(a) 5 → 110 ± (3 × 5) = (95, 125)
(b) 15 → 110 ± (3 × 15) = (65, 155)
(c) 35 → 110 ± (3 × 35) = (5, 215)
(d) 90 → 110 ± (3 × 90) = (−160, 380)
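
The reasoning behind the four intervals is that, in a large nearly normal sample, almost all observations fall within 3 SD of the mean, so the minimum and maximum should sit roughly at mean ± 3 SD. A small base R sketch of that check follows; the variable names are made up for illustration.

```r
heart_mean <- 110
candidate_sd <- c(a = 5, b = 15, c = 35, d = 90)

# Range covered by mean +/- 3 SD for each candidate
lower <- heart_mean - 3 * candidate_sd
upper <- heart_mean + 3 * candidate_sd
data.frame(sd = candidate_sd, lower = lower, upper = upper)

# Candidate whose 3 SD range matches the reported minimum (65) and maximum (155)
best <- names(candidate_sd)[lower == 65 & upper == 155]
best
```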

slide-20
SLIDE 20

Normal distribution Standardizing with Z scores

SAT scores are distributed nearly normally with mean 1500 and standard deviation 300. ACT scores are distributed nearly normally with mean 21 and standard deviation 5. A college admissions officer wants to determine which of the two applicants scored better on their standardized test with respect to the other test takers: Pam, who earned an 1800 on her SAT, or Jim, who scored a 24 on his ACT?

[Figure: two normal curves, one for SAT scores (axis 600 to 2400) with Pam's score marked and one for ACT scores (axis 6 to 36) with Jim's score marked]

slide-21
SLIDE 21

Normal distribution Standardizing with Z scores

Standardizing with Z scores

Since we cannot just compare these two raw scores, we instead compare how many standard deviations beyond the mean each observation is.

Pam's score is $\frac{1800 - 1500}{300} = 1$ standard deviation above the mean.

Jim's score is $\frac{24 - 21}{5} = 0.6$ standard deviations above the mean.

[Figure: standard normal curve with axis from −2 to 2 and the positions of Pam and Jim marked]
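
The same comparison can be written as a short base R sketch. The helper `z_score()` is a hypothetical name introduced here for illustration; it is not a built-in function.

```r
# Number of standard deviations an observation lies above (+) or below (-) the mean
z_score <- function(observation, mean, sd) (observation - mean) / sd

z_pam <- z_score(1800, mean = 1500, sd = 300)  # SAT
z_jim <- z_score(24, mean = 21, sd = 5)        # ACT

c(Pam = z_pam, Jim = z_jim)
z_pam > z_jim   # TRUE means Pam did better relative to other test takers
```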

slide-25
SLIDE 25

Normal distribution Standardizing with Z scores

Standardizing with Z scores (cont.)

These are called standardized scores, or Z scores.
The Z score of an observation is the number of standard deviations it falls above or below the mean.

Z scores

$$Z = \frac{\text{observation} - \text{mean}}{\text{SD}}$$

Z scores are defined for distributions of any shape, but only when the distribution is normal can we use Z scores to calculate percentiles.
Observations that are more than 2 SD away from the mean (|Z| > 2) are usually considered unusual.

slide-27
SLIDE 27

Normal distribution Standardizing with Z scores

Participation question

Scores on a standardized test are normally distributed with a mean of 100 and a standard deviation of 20. If these scores are converted to standard normal Z scores, which of the following statements will be correct?

(a) Both the mean and median score will equal 0.
(b) The mean will equal 0, but the median cannot be determined.
(c) The mean of the standardized z-scores will equal 100.
(d) The mean of the standardized z-scores will equal 5.

slide-28
SLIDE 28

Normal distribution Percentiles

Percentiles

Percentile is the percentage of observations that fall below a given data point.
Graphically, percentile is the area below the probability distribution curve to the left of that observation.

[Figure: normal curve for SAT scores, axis 600 to 2400]

slide-30
SLIDE 30

Normal distribution Percentiles

Approximating percentiles

Approximately what percent of students score below 1800 on the SAT? (Hint: Use the 68-95-99.7% rule.)

A score of 1800 is 1 SD above the mean, so about 68% of scores fall between 1200 and 1800.
100 − 68 = 32% of scores fall outside that range.
32 / 2 = 16% fall in each tail, by symmetry.
68 + 16 = 84% of scores fall below 1800.

slide-33
SLIDE 33

Normal distribution Percentiles

Calculating percentiles - using computation

There are many ways to compute percentiles/areas under the curve:

R:
> pnorm(1800, mean = 1500, sd = 300)
[1] 0.8413447

Applet: http://www.socr.ucla.edu/htmls/SOCR_Distributions.html

slide-34
SLIDE 34

Normal distribution Percentiles

Calculating percentiles - using tables

                        Second decimal place of Z
  Z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
 0.0  0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
 0.1  0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
 0.2  0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
 0.3  0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
 0.4  0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
 0.5  0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
 0.6  0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
 0.7  0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
 0.8  0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
 0.9  0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
 1.0  0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
 1.1  0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
 1.2  0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
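
Each table entry is the area under the standard normal curve to the left of Z, where the row gives Z to one decimal place and the column gives the second decimal. As a rough sketch, the excerpt can be regenerated in base R with `pnorm()`; the object names below are illustrative only.

```r
# Rows: Z to one decimal place; columns: second decimal place of Z
z_rows <- seq(0, 1.2, by = 0.1)
z_cols <- seq(0, 0.09, by = 0.01)

# Each cell is P(Z < row + column), rounded to 4 decimals as in printed tables
z_table <- round(outer(z_rows, z_cols, function(row, col) pnorm(row + col)), 4)
dimnames(z_table) <- list(sprintf("%.1f", z_rows), sprintf("%.2f", z_cols))
z_table

# Look up P(Z < 1.00), the percentile of an SAT score of 1800
z_table["1.0", "0.00"]
```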

slide-36
SLIDE 36

Normal distribution Percentiles

Participation question

What percent of the standard normal distribution is above Z = 0.82? Choose the closest answer.

(a) 79.4%
(b) 20.6%
(c) 82%
(d) 18%
(e) Need to be provided the mean and the standard deviation of the distribution in order to be able to solve this problem.
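
One way to check an answer to this kind of question is to compute both tail areas directly. The sketch below assumes base R's `pnorm()`, whose defaults (mean 0, SD 1) are exactly the standard normal distribution, so no further parameters are needed; the variable names are illustrative.

```r
p_below <- pnorm(0.82)                       # area to the left of Z = 0.82
p_above <- pnorm(0.82, lower.tail = FALSE)   # area to the right of Z = 0.82

round(100 * c(below = p_below, above = p_above), 1)
```

The area to the left should agree with the table entry in row 0.8, column 0.02, and the two areas sum to 1.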

slide-38
SLIDE 38

Normal distribution Recap

Participation question

Which of the following is false?

(a) Z scores are helpful for determining how unusual a data point is compared to the rest of the data in the distribution.
(b) Majority of Z scores in a right skewed distribution are negative.
(c) Regardless of the shape of the distribution (symmetric vs. skewed) the Z score of the mean is always 0.
(d) In a normal distribution, Q1 and Q3 are more than one SD away from the mean.
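
Statement (d) can be examined numerically, since Q1 and Q3 are the 25th and 75th percentiles. This is a minimal base R sketch using `qnorm()`, the inverse of `pnorm()`; the names are illustrative.

```r
# Z scores of the first and third quartiles of a normal distribution
quartile_z <- qnorm(c(0.25, 0.75))
names(quartile_z) <- c("Q1", "Q3")
quartile_z

# Are the quartiles more than one SD from the mean?
abs(quartile_z) > 1
```

Both quartiles come out at roughly two thirds of an SD from the mean, which is worth comparing against what (d) claims.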

slide-40
SLIDE 40

Evaluating the normal approximation

Normal probability plot

A histogram and normal probability plot of a sample of 100 male heights.

[Figure: histogram of male heights (inches), axis 60 to 80, beside a normal probability plot of male heights (in.) against theoretical quantiles]

slide-41
SLIDE 41

Evaluating the normal approximation

Anatomy of a normal probability plot

Data are plotted on the y-axis of a normal probability plot, and theoretical quantiles (following a normal distribution) on the x-axis.
If there is a one-to-one relationship between the data and the theoretical quantiles, then the data follow a nearly normal distribution.
Since a one-to-one relationship would appear as a straight line on a scatter plot, the closer the points are to a perfect straight line, the more confident we can be that the data follow the normal model.
Constructing a normal probability plot requires calculating percentiles and corresponding z-scores for each observation, which is tedious. Therefore we generally rely on software when making these plots.
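
To make the construction concrete, here is a sketch in base R that builds the plot by hand and then with the built-in shortcut. The data are simulated: the sample size, mean of 70, and SD of 3 are assumptions chosen for illustration and are not the lecture's data.

```r
set.seed(1)                                # reproducible simulated data only
heights <- rnorm(100, mean = 70, sd = 3)   # hypothetical male heights (inches)

n <- length(heights)
percentiles <- ppoints(n)           # one percentile (plotting position) per observation
theoretical <- qnorm(percentiles)   # matching z-scores from the standard normal

# Data on the y-axis, theoretical quantiles on the x-axis
plot(theoretical, sort(heights),
     xlab = "Theoretical quantiles", ylab = "Heights (in.)")

# Base R's shortcut for the same plot, plus a reference line
qqnorm(heights)
qqline(heights)
```

Because the simulated heights are drawn from a normal distribution, the points should fall close to a straight line; real data are judged by how far they depart from it.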

slide-43
SLIDE 43

Evaluating the normal approximation

Below is a histogram and normal probability plot for the NBA heights from the 2008-2009 season. Do these data appear to follow a normal distribution?

[Figure: histogram of NBA heights (inches), axis 70 to 90, beside a normal probability plot of NBA heights against theoretical quantiles from −3 to 3]

Why do the points on the normal probability plot have jumps?

slide-44
SLIDE 44

Evaluating the normal approximation

Normal probability plot and skewness

Right Skew - If the plotted points appear to bend up and to the left of the normal line, that indicates a long tail to the right.
Left Skew - If the plotted points bend down and to the right of the normal line, that indicates a long tail to the left.
Short Tails - An S-shaped curve indicates shorter than normal tails, i.e. narrower than expected.
Long Tails - A curve which starts below the normal line, bends to follow it, and ends above it indicates long tails. That is, you are seeing more variance than you would expect in a normal distribution, i.e. wider than expected.

slide-46
SLIDE 46

Application exercises Finding probabilities // Quality control

Six sigma

“The term “six sigma process” comes from the notion that if one has six standard deviations between the process mean and the nearest specification limit, as shown in the graph, practically no items will fail to meet specifications.”

Source: http://en.wikipedia.org/wiki/Six_Sigma

slide-55
SLIDE 55

Application exercises Finding probabilities // Quality control

At Heinz ketchup factory the amounts which go into bottles of ketchup are supposed to be normally distributed with mean 36 oz. and standard deviation 0.11 oz. Once every 30 minutes a bottle is selected from the production line, and its contents are noted precisely. If the amount of the bottle goes below 35.8 oz. or above 36.2 oz., then the bottle fails the quality control inspection. What percent of bottles pass the quality control inspection?

[Figure: the shaded area between 35.8 and 36.2 under the normal curve, shown as the area below 36.2 minus the area below 35.8]

$Z = \frac{36.2 - 36}{0.11} = 1.82$, so $P(Z < 1.82) = 0.9656$

$Z = \frac{35.8 - 36}{0.11} = -1.82$, so $P(Z < -1.82) = 0.0344$

$0.9656 - 0.0344 = 0.9312$
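
The table-based calculation can be cross-checked in software. This is a minimal base R sketch using `pnorm()` with the stated mean and SD; the variable names are illustrative. Because it does not round Z to two decimals, the result can differ slightly from 0.9312 in the later decimal places.

```r
fill_mean <- 36     # oz
fill_sd   <- 0.11   # oz

# P(35.8 < X < 36.2) = P(X < 36.2) - P(X < 35.8)
p_below_upper <- pnorm(36.2, mean = fill_mean, sd = fill_sd)
p_below_lower <- pnorm(35.8, mean = fill_mean, sd = fill_sd)
p_pass <- p_below_upper - p_below_lower

round(100 * p_pass, 1)   # percent of bottles expected to pass inspection
```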

slide-58
SLIDE 58

Application exercises Finding cutoff points // Hot bodies

Body temperatures of healthy humans are distributed nearly normally with mean 98.2°F and standard deviation 0.73°F. What is the cutoff for the highest 10% of human body temperatures?

Mackowiak, Wasserman, and Levine (1992), A Critical Appraisal of 98.6 Degrees F, the Upper Limit of the Normal Body Temperature, and Other Legacies of Carl Reinhold August Wunderlich.

[Figure: normal curve centered at 98.2 with area 0.90 below the unknown cutoff and area 0.10 above it]

  Z     0.05    0.06    0.07    0.08    0.09
 1.0  0.8531  0.8554  0.8577  0.8599  0.8621
 1.1  0.8749  0.8770  0.8790  0.8810  0.8830
 1.2  0.8944  0.8962  0.8980  0.8997  0.9015
 1.3  0.9115  0.9131  0.9147  0.9162  0.9177

$P(X > x) = 0.10 \rightarrow P(Z < 1.28) = 0.90$

$Z = \frac{\text{obs} - \text{mean}}{\text{SD}} \rightarrow \frac{x - 98.2}{0.73} = 1.28$

$x = (1.28 \times 0.73) + 98.2 = 99.1$
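
Finding a cutoff is the reverse of finding a percentile, so software can replace the backwards table lookup. The sketch below assumes base R's `qnorm()`, which returns the value below which a given proportion of the distribution lies; the variable names are illustrative.

```r
temp_mean <- 98.2   # degrees F
temp_sd   <- 0.73   # degrees F

# The highest 10% starts where 90% of the distribution lies below
z_cutoff <- qnorm(0.90)                                  # close to the 1.28 read from the table
cutoff   <- qnorm(0.90, mean = temp_mean, sd = temp_sd)  # same as temp_mean + z_cutoff * temp_sd

round(c(z = z_cutoff, temperature = cutoff), 2)
```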

slide-64
SLIDE 64

Application exercises Conditional probability // SAT scores

SAT scores (out of 2400) are distributed normally with mean 1500 and standard deviation 300. Suppose a school council awards a certificate of excellence to all students who score at least 1900 on the SAT. What percent of the students who received this certificate scored above 2100?

P(SAT > 2100 | SAT > 1900) = P(SAT > 2100 and SAT > 1900) / P(SAT > 1900) = P(SAT > 2100) / P(SAT > 1900)

(Every score above 2100 is also above 1900, so the joint probability reduces to P(SAT > 2100).)

P(SAT > 2100) = P(Z > (2100 − 1500) / 300) = P(Z > 2) = 1 − 0.9772 = 0.0228

P(SAT > 1900) = P(Z > (1900 − 1500) / 300) = P(Z > 1.33) = 1 − 0.9082 = 0.0918

P(SAT > 2100 | SAT > 1900) = 0.0228 / 0.0918 ≈ 0.25 → 25% of students
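As a rough check on the conditional probability, the same ratio can be computed directly. This is a sketch using `statistics.NormalDist` from the Python standard library, with the mean and standard deviation stated in the exercise; the variable names are illustrative.

```python
from statistics import NormalDist

# SAT scores ~ N(mean = 1500, SD = 300), as stated in the exercise
sat = NormalDist(mu=1500, sigma=300)

# Upper-tail probabilities: P(SAT > cutoff) = 1 - P(SAT <= cutoff)
p_above_2100 = 1 - sat.cdf(2100)
p_above_1900 = 1 - sat.cdf(1900)

# Scoring above 2100 implies scoring above 1900, so the joint
# probability in the numerator is just P(SAT > 2100)
p_conditional = p_above_2100 / p_above_1900

print(f"P(SAT > 2100) = {p_above_2100:.4f}")
print(f"P(SAT > 1900) = {p_above_1900:.4f}")
print(f"P(SAT > 2100 | SAT > 1900) = {p_conditional:.2f}")
```

The tail probability for 1900 comes out slightly different from the table-based 0.0918 because the Z score is not rounded to 1.33 here, but the final answer still rounds to about 25%.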


slide-71
SLIDE 71

Application exercises Finding missing parameters // Auto insurance premiums

Suppose a newspaper article states that the distribution of auto insurance premiums for residents of California is approximately normal with a mean of $1,650. The article also states that 25% of California residents pay more than $1,800.

  • 1. What is the standard deviation of this distribution?
  • 2. What is the IQR of this distribution?
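  • 1. SD: Since 25% of residents pay more than $1,800, that value is the 75th percentile. The Z score corresponding to the upper 25% of the distribution is 0.67.

Z = (x − µ) / σ → 0.67 = (1800 − 1650) / σ → σ = (1800 − 1650) / 0.67 = $223.88

  • 2. IQR: The 25th and 75th percentiles are equally distant from the mean, and the 75th percentile ($1,800) is $150 above it, so the 25th percentile is 1650 − 150 = 1500.

IQR = 1800 − 1500 = $300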
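Both answers can be reproduced numerically. The sketch below uses `statistics.NormalDist` from the Python standard library and assumes, as the exercise does, that premiums are exactly normal; the variable names are illustrative.

```python
from statistics import NormalDist

mean = 1650
q3 = 1800  # 25% pay more than $1,800, so this is the 75th percentile

# Z score of the 75th percentile of the standard normal (about 0.67)
z_q3 = NormalDist().inv_cdf(0.75)

# Rearranging Z = (x - mean) / SD to solve for SD
sd = (q3 - mean) / z_q3

# The 25th percentile from the fitted distribution, then the IQR
premiums = NormalDist(mu=mean, sigma=sd)
q1 = premiums.inv_cdf(0.25)
iqr = q3 - q1

print(f"z = {z_q3:.2f}, SD = {sd:.2f}, Q1 = {q1:.0f}, IQR = {iqr:.0f}")
```

Because the unrounded Z score is a little larger than 0.67, the computed SD comes out slightly below the hand-calculated $223.88; the IQR is unaffected.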
