STAT 113 Working with Theoretical Distributions Colin Reimer Dawson



SLIDE 1

Analytic Approximations | Density Functions | Properties of Normal Distributions | Normal Bootstrap Distribution

STAT 113 Working with Theoretical Distributions

Colin Reimer Dawson

Oberlin College

November 2, 2017

SLIDE 3

Outline

  • Analytic Approximations
  • Density Functions
  • Properties of Normal Distributions
  • Normal Bootstrap Distribution

SLIDE 4

P-value = Proportion of Randomized Sample Statistics

P(X ≥ 270) ≈ 0.04

Figure: Randomization distribution for the number of heads in 500 coin flips, highlighting the one-tailed P-value testing H1: p > 0.5 for an observation of 270 heads. (Axes: Values, 200 to 300; Probability, 0.00 to 0.04.)
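The tail proportion in the figure can be approximated with a short simulation. This is a minimal sketch in base R, not the code behind the slide; the seed, the number of simulated samples (`n_sims`), and the variable names are assumptions chosen for illustration.

```r
set.seed(113)                      # arbitrary seed, for reproducibility
n_sims <- 10000                    # number of simulated samples (assumed)

# Each simulated statistic: number of heads in 500 fair coin flips (H0: p = 0.5)
heads <- rbinom(n_sims, size = 500, prob = 0.5)

# One-tailed P-value: proportion of simulated statistics at least as large as 270
p_sim <- mean(heads >= 270)

# Exact binomial tail probability P(X >= 270), for comparison
p_exact <- pbinom(269, size = 500, prob = 0.5, lower.tail = FALSE)

c(simulated = p_sim, exact = p_exact)
```

Both numbers should land near the 0.04 shown on the slide; the simulated value changes slightly with the seed.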

SLIDE 5

Confidence Level = Proportion of Bootstrap Samples

Figure: Bootstrap distribution for mean mercury level in fish in Florida lakes (from the FloridaLakes dataset). The middle 95% is highlighted, illustrating a 95% confidence interval. (Axes: mean mercury level, 0.4 to 0.7; Density, 2 to 8.)
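The "middle 95% of bootstrap samples" idea can be sketched in a few lines of base R. The data below are a hypothetical stand-in generated at random, not the FloridaLakes measurements, and the sample size, seed, and number of resamples are assumptions.

```r
set.seed(113)
# Hypothetical stand-in for the mercury measurements (NOT the FloridaLakes data)
mercury <- rgamma(50, shape = 2, rate = 4)

# Bootstrap: resample with replacement and recompute the mean each time
boot_means <- replicate(5000, mean(sample(mercury, replace = TRUE)))

# 95% percentile interval: cut off 2.5% of bootstrap means in each tail
ci <- quantile(boot_means, probs = c(0.025, 0.975))
ci

# Proportion of bootstrap means inside the interval (about 0.95 by construction)
covered <- mean(boot_means >= ci[[1]] & boot_means <= ci[[2]])
covered
```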

SLIDE 6

Properties of Sampling Distributions

Most (about 95%) of simple random samples have a sample mean (x̄) that is within 2 standard errors of the population mean (µ).

Therefore, about 95% of the time, the population mean will be within 2 SE of the sample mean!

A similar statement holds for some other statistics/parameters, under a particular condition. What condition?

The sampling distribution needs to be (approximately) symmetric and bell-shaped.
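A quick simulation can check the "within 2 SE about 95% of the time" claim. This is a sketch under assumed conditions: the population (Normal with mean 50 and standard deviation 10), the sample size, and the seed are all hypothetical.

```r
set.seed(113)
mu <- 50; sigma <- 10; n <- 40      # hypothetical population and sample size
se <- sigma / sqrt(n)               # standard error of the sample mean

# Draw many simple random samples and record each sample mean
xbars <- replicate(10000, mean(rnorm(n, mean = mu, sd = sigma)))

# Proportion of sample means within 2 SE of mu
# (equivalently, proportion of intervals xbar +/- 2 SE that capture mu)
prop_within <- mean(abs(xbars - mu) <= 2 * se)
prop_within
```

The two readings in the comment are the same event, which is why the statement about sample means turns into a statement about intervals around the sample mean.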

SLIDE 7

So what’s with all these bell shapes?

  • Q: Why are so many distributions “bell-shaped”?
  • A: The Central Limit Theorem
  • One of the most important results in probability: for sufficiently large samples, sample means have an approximately Normal (bell-shaped) distribution.
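The Central Limit Theorem can be seen numerically by comparing the skewness of raw values with the skewness of sample means. The sketch below assumes an exponential population and samples of size 50; the helper `skew()` is a simple moment-based measure written here for illustration, not a function from the course materials.

```r
set.seed(113)
# Simple moment-based skewness (close to 0 for a symmetric distribution)
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3

raw   <- rexp(10000, rate = 1)                        # individual values: strongly right-skewed
means <- replicate(10000, mean(rexp(50, rate = 1)))   # means of samples of size 50

raw_skew   <- skew(raw)
means_skew <- skew(means)
c(raw = raw_skew, means = means_skew)
```

The individual values are far from bell-shaped, while the sample means are much closer to symmetric, as the theorem suggests.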

SLIDE 8

Sample Means Show Up A Lot

  • Sample means are sample means (did you know this?)
  • Sample proportions are sample means (encode the binary variable as 0s and 1s)
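A tiny example makes the 0/1 encoding concrete. The eight coin flips below are made up for illustration.

```r
# Hypothetical sample of eight coin flips
flips <- c("H", "T", "H", "H", "T", "H", "T", "H")

# Encode heads as 1 and tails as 0
x <- as.numeric(flips == "H")

p_hat <- sum(flips == "H") / length(flips)  # sample proportion of heads
x_bar <- mean(x)                            # sample mean of the 0/1 variable

c(p_hat = p_hat, x_bar = x_bar)
```

Both values equal 5/8, so anything known about sample means carries over to sample proportions.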

SLIDE 9

Even More Stuff is Normal

Also...

  • Sum of two independent Normals is Normal
  • Rescaling a Normal by a constant is Normal
  • Difference of two independent Normals is Normal

So...

  • Sampling distribution for difference of sample means is approximately Normal
  • Sampling distribution for difference of sample proportions is approximately Normal
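The behavior of a difference of Normals can be checked by simulation. This sketch assumes two independent Normal variables with made-up parameters; for independent Normals, the difference has mean equal to the difference of the means and variance equal to the sum of the variances.

```r
set.seed(113)
x <- rnorm(100000, mean = 10, sd = 3)   # hypothetical N(10, 3)
y <- rnorm(100000, mean = 4,  sd = 4)   # hypothetical N(4, 4), independent of x
d <- x - y

# Theory for independent Normals: mean 10 - 4 = 6, sd sqrt(3^2 + 4^2) = 5
c(mean = mean(d), sd = sd(d))

# If d is Normal, about 95% of values fall within 2 sd of its mean
within2 <- mean(abs(d - 6) <= 2 * 5)
within2
```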

SLIDE 10

Outline

  • Analytic Approximations
  • Density Functions
  • Properties of Normal Distributions
  • Normal Bootstrap Distribution

SLIDE 11

Approximating with a Smooth Curve

Figure: Frequency Histograms of Babies’ Birth Weights (Nolan and Speed, 2000). Six panels, each plotting Frequency against Birthweight in oz (50 to 200); the frequency axis tops out at 800, 400, 200, 200, 150, and 60 across the panels.

SLIDE 12

Density

$$\text{Proportion} = \text{Area} = \text{Height} \times \text{Width}$$

$$\text{Density} = \text{Height} = \frac{\text{Proportion}}{\text{Width}}$$

This quantity (proportion divided by width) is called “density” by analogy to physics: “amount of stuff” divided by “amount of space”.
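The relationship Density = Proportion / Width can be verified against the `density` component that base R's `hist()` returns. The weights below are randomly generated stand-ins, not the Nolan and Speed data, and the variable names are assumptions.

```r
set.seed(113)
# Hypothetical birth weights in ounces (NOT the Nolan and Speed data)
weights <- rnorm(1000, mean = 120, sd = 18)

h <- hist(weights, plot = FALSE)          # compute the bins without drawing
width <- diff(h$breaks)                   # bin widths
proportion <- h$counts / length(weights)  # proportion of cases in each bin

# Density = Proportion / Width
density_by_hand <- proportion / width
all.equal(density_by_hand, h$density)

# Total area (sum of Height x Width) is 1
total_area <- sum(h$density * width)
total_area
```

Because the total area is always 1, density histograms with different bin widths share the same vertical scale, unlike frequency histograms.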

SLIDE 13

Density Histograms

Figure: Density Histograms of Babies’ Birth Weights (Nolan and Speed, 2000). Six panels, each plotting Density (0.000 to 0.030) against Birthweight in oz (50 to 200).

SLIDE 14

Density Functions

Figure: Densities of Babies’ Birth Weights (Nolan and Speed, 2000). Six panels, each plotting Density (0.000 to 0.030) against Birthweight in oz (50 to 200).

SLIDE 15

Proportion = Area Under the Density Curve

Figure: Approximating the birth weight distribution using a Normal. The shaded area is P(Weight ≥ 148 oz); the plot is annotated P = 0.067. (Axes: Birthweight in oz, 50 to 200; Density, 0.000 to 0.020.)
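An area like the shaded one can be computed with base R's `pnorm()`. The slide does not state the mean and standard deviation of the fitted Normal, so the values below are assumptions picked only because they give an area close to the annotated 0.067.

```r
# Assumed parameters (not stated on the slide), chosen for illustration
mu <- 120      # mean birth weight in oz
sigma <- 18.7  # standard deviation in oz

# Area under the Normal density curve to the right of 148 oz
p_heavy <- pnorm(148, mean = mu, sd = sigma, lower.tail = FALSE)
round(p_heavy, 3)
```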

SLIDE 16

Outline

  • Analytic Approximations
  • Density Functions
  • Properties of Normal Distributions
  • Normal Bootstrap Distribution

SLIDE 17

Normal Distributions

Normal distributions are completely specified by their mean (µ) and their standard deviation (σ). We can write N(0, 1) as shorthand for a Normal with mean 0 and standard deviation 1.

[Plot: Density curves for N(0, 1), N(2, 1), N(0, 0.5), and N(−4, 0.3). Axes: x, −6 to 6; Density, 0.0 to 1.5.]

$$\text{density}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

but we won’t use this directly.
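For readers who want to see the formula in action, the standard Normal density formula (with the factor 1/2 in the exponent) can be written out and compared with R's built-in `dnorm()`. The function name `normal_density` and the grid of x values are assumptions made for this sketch.

```r
# Normal density written out from the formula
normal_density <- function(x, mu, sigma) {
  1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - mu) / sigma)^2)
}

x <- seq(-6, 6, by = 0.5)

# Compare with R's built-in dnorm() for two of the curves in the plot
match_std    <- all.equal(normal_density(x, 0, 1),    dnorm(x, mean = 0,  sd = 1))
match_narrow <- all.equal(normal_density(x, -4, 0.3), dnorm(x, mean = -4, sd = 0.3))

# Peak height is 1 / (sigma * sqrt(2 * pi)): narrower curves are taller
peak_heights <- c(N_0_1 = normal_density(0, 0, 1),
                  N_m4_0.3 = normal_density(-4, -4, 0.3))
peak_heights
```

The peak of N(−4, 0.3) is roughly 1.33, which is consistent with a density axis that runs up to 1.5.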

SLIDE 18

Normal Distributions

[Plot: Normal curve with the horizontal axis marked at µ − 2σ, µ − σ, µ, µ + σ, and µ + 2σ, with part of the area shaded.]

Pairs: (Approximately) what proportion of the area under the curve is shaded?

In a bell-shaped (Normal) distribution, about 95% of cases lie within 2 standard deviations of the mean. So about 5% lie more than 2σ from µ.
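The 95% figure is a rounded value, which base R can confirm. The sketch uses N(0, 1), but the proportions are the same for any Normal once distances are measured in standard deviations.

```r
# Area within 2 standard deviations of the mean (shown for N(0, 1))
within_2sd <- pnorm(2) - pnorm(-2)
within_2sd            # about 0.9545

# Area beyond 2 standard deviations (both tails together)
beyond_2sd <- 1 - within_2sd
beyond_2sd

# Number of standard deviations that captures exactly 95%
z_95 <- qnorm(0.975)  # about 1.96
z_95
```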

SLIDE 19

Area Under Normal Curve

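[Plot: Normal density curve; x-axis from −2 to 2.]

Area under a curve using calculus:

$$\int_{1.5}^{\infty} \frac{1}{1 \cdot \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-0}{1}\right)^{2}}\, dx$$

but this integrand doesn’t have a closed-form antiderivative.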
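Even without a closed-form antiderivative, the area can be found numerically. The sketch below uses base R's `integrate()` on the N(0, 1) density and compares the result with `pnorm()`; the variable names are assumptions.

```r
# Numerical integration of the N(0, 1) density from 1.5 to infinity
area_numeric <- integrate(dnorm, lower = 1.5, upper = Inf)$value

# The same area from R's built-in Normal distribution function
area_pnorm <- pnorm(1.5, mean = 0, sd = 1, lower.tail = FALSE)

c(numeric = area_numeric, pnorm = area_pnorm)
```

Both approaches give an area of about 0.0668, so software can stand in for the calculus.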

SLIDE 20

StatKey to the Rescue!

SLIDE 21

R Works Too

library("mosaic")
## Area to the right of 1.5
xpnorm(1.5, mean = 0, sd = 1, lower.tail = FALSE)

If X ~ N(0, 1), then
    P(X <= 1.5) = P(Z <= 1.5) = 0.9331928
    P(X >  1.5) = P(Z >  1.5) = 0.0668072

[Plot: N(0, 1) density with a vertical line at 1.5 (z = 1.5); area to the left 0.9332, area to the right 0.0668.]

SLIDE 22

Outline

  • Analytic Approximations
  • Density Functions
  • Properties of Normal Distributions
  • Normal Bootstrap Distribution

SLIDE 23

Quantiles of a Normal Curve

Suppose that the bootstrap distribution of means for samples of 500 Atlanta commute times is N(29.11, 0.93). Find an endpoint (percentile) so that just 5% of the bootstrap means are smaller.

SLIDE 24

StatKey...

SLIDE 25

And in R ...

xqnorm(0.05, mean = 29.11, sd = 0.93)

    P(X <= 27.5802861269351) = 0.05
    P(X >  27.5802861269351) = 0.95

[1] 27.58029

[Plot: N(29.11, 0.93) density with a vertical line at 27.5803 (z = −1.645); area to the left 0.05, area to the right 0.95.]
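The same endpoint can be found without the mosaic package, using base R's `qnorm()`, or by hand from the z-score. This is a sketch; the variable names are assumptions.

```r
# Base R quantile function: the value with 5% of the distribution below it
q05 <- qnorm(0.05, mean = 29.11, sd = 0.93)

# Same endpoint via the z-score: mean + z * sd, with z = qnorm(0.05)
z <- qnorm(0.05)
q05_by_hand <- 29.11 + z * 0.93

c(qnorm = q05, by_hand = q05_by_hand, z = z)
```

The z-score of about −1.645 is the value shown in the plot annotation, and both routes give an endpoint of about 27.58.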

SLIDE 26

Goals

Confidence Intervals

If we can approximate a bootstrap distribution with a Normal, we can construct a confidence interval.
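As an illustration, the N(29.11, 0.93) approximation from the commute-time example earlier in the deck yields a 95% interval directly. This is a sketch in base R, and the choice of a 95% level is an assumption made for the example.

```r
# Middle 95% of the N(29.11, 0.93) approximation to the bootstrap distribution
ci <- qnorm(c(0.025, 0.975), mean = 29.11, sd = 0.93)
ci

# Equivalent form: statistic +/- z* x SE, with z* = qnorm(0.975)
ci_by_hand <- 29.11 + c(-1, 1) * qnorm(0.975) * 0.93
ci_by_hand
```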

P-values

If we can approximate a randomization distribution with a Normal, we can compute P-values.
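As an illustration, the coin-flip example from earlier in the deck (270 heads in 500 flips, testing H1: p > 0.5) can be redone with a Normal curve in place of the randomization distribution. The sketch assumes the usual binomial mean and standard deviation for the count of heads under the null hypothesis; the variable names are assumptions.

```r
n <- 500; p0 <- 0.5
mu <- n * p0                       # mean number of heads under H0: 250
se <- sqrt(n * p0 * (1 - p0))      # standard deviation of the count: about 11.18

# One-tailed P-value for an observed 270 heads, using the Normal curve
z <- (270 - mu) / se
p_normal <- pnorm(z, lower.tail = FALSE)
c(z = z, p_value = p_normal)
```

The result, about 0.037, is close to the proportion of roughly 0.04 read off the randomization distribution.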