STAT 113 Normal-Based Inference for Proportions, Colin Reimer Dawson

SLIDE 1

Theoretical SE Distribution of Single Proportion Confidence Interval Hypothesis Test

STAT 113 Normal-Based Inference for Proportions

Colin Reimer Dawson

Oberlin College

November 11-12, 2019

SLIDE 2

Outline

  • Theoretical SE
  • Distribution of Single Proportion
  • Confidence Interval
  • Hypothesis Test

SLIDE 3

Limits of Normal Approximation So Far

  • So far we have still needed to do simulation to calculate the standard error
  • We can avoid that with some more theory

SLIDE 4

Cases to Address

We will need standard errors to do CIs and tests for the following parameters:

  1. Single Proportion (now)
  2. Single Mean (tomorrow)
  3. Difference of Proportions (tomorrow?)
  4. Difference of Means (tomorrow?)
  5. Mean of Differences (Thursday?)

SLIDE 5

Analytic Approximations of Sampling Distributions

Param.             Stat.                      Theory SE                                       Distribution   df
$p$                $\hat{p}$                  $\sqrt{p(1-p)/n}$                               $N(0, 1)$      –
$\mu$              $\bar{x}$                  $\sqrt{s^2/n}$                                  $t$            $n - 1$
$p_A - p_B$        $\hat{p}_A - \hat{p}_B$    $\sqrt{SE_{\hat{p}_A}^2 + SE_{\hat{p}_B}^2}$    $N(0, 1)$      –
$\mu_A - \mu_B$    $\bar{x}_A - \bar{x}_B$    $\sqrt{SE_{\bar{x}_A}^2 + SE_{\bar{x}_B}^2}$    $t$            $\min(n_A, n_B) - 1$
$\mu_{diff}$       $\bar{x}_{diff}$           $\sqrt{s_{diff}^2/n_{diff}}$                    $t$            $n_{diff} - 1$
$\rho$             $r$                        $\sqrt{(1-r^2)/(n-2)}$                          $t$            $n - 2$

CI: Observed Statistic ± Standardized Quantile × SE

Standardized Test Statistic: (Observed Statistic − Null Param.) / SE
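
The two templates at the bottom of this slide can be sketched in base R as follows. The helper names `ci_from_se` and `standardized_stat` are hypothetical (they are not from the slides), and the numbers in the example calls are made up purely for illustration.

```r
# Hypothetical helpers illustrating the two templates; names are not from the slides.

# CI: observed statistic +/- standardized quantile * SE
ci_from_se <- function(stat, se, quantile) {
  c(lower = stat - quantile * se, upper = stat + quantile * se)
}

# Standardized test statistic: (observed statistic - null parameter) / SE
standardized_stat <- function(stat, null_value, se) {
  (stat - null_value) / se
}

# Illustrative (made-up) numbers: statistic 0.60, SE 0.05
ci_from_se(0.60, 0.05, qnorm(0.975))         # Normal-based 95% interval
ci_from_se(0.60, 0.05, qt(0.975, df = 24))   # t-based 95% interval with df = 24
standardized_stat(0.60, 0.50, 0.05)          # how many SEs from the null value
```

Only the quantile function changes between rows of the table: `qnorm()` for the N(0, 1) rows and `qt()` with the listed df for the t rows.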

SLIDE 7

Sampling Distribution of a Sample Proportion

[Figure: 3 × 3 grid of sampling distributions of $\hat{p}$, each plotted over the range 0 to 1.]

  • Columns: values of p (left: 0.1; middle: 0.5; right: 0.9)
    Rows: values of n (top: 10; middle: 50; bottom: 1000)
  • Larger samples and more population homogeneity (p closer to 0 or 1) make SE go down
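
A grid like the one in the figure can be approximated by simulation. The sketch below is not the code behind the slide; the seed, the number of simulated samples (`n_sims`), and the variable names are all assumptions.

```r
set.seed(113)  # arbitrary seed so the sketch is reproducible

p_values <- c(0.1, 0.5, 0.9)   # columns of the figure
n_values <- c(10, 50, 1000)    # rows of the figure
n_sims   <- 10000              # simulated samples per panel (an assumption)

# For each (n, p) pair, simulate sample proportions and record their SD,
# which is the simulated standard error of p-hat
sim_se <- sapply(p_values, function(p) {
  sapply(n_values, function(n) {
    p_hat <- rbinom(n_sims, size = n, prob = p) / n
    sd(p_hat)
  })
})
dimnames(sim_se) <- list(paste0("n=", n_values), paste0("p=", p_values))

round(sim_se, 3)
```

Reading down a column shows the SE shrinking as n grows; reading across a row shows it is largest at p = 0.5.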

SLIDE 8

Sampling Distribution of $\hat{p}$

  • Condition: Use a Normal approximation when there are at least 10 expected cases of each outcome: $np \ge 10$ and $n(1 - p) \ge 10$
  • Mean: $p$
  • Standard deviation (standard error): $SE_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
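
As a minimal sketch, the condition and the standard error formula above translate directly into base R. The function names `se_prop` and `normal_ok` are hypothetical, and the example values of p and n are the ones from the figure grid rather than real data.

```r
# Hypothetical helper names; the formulas are the ones on this slide.

# Theoretical standard error of a sample proportion: sqrt(p(1 - p) / n)
se_prop <- function(p, n) {
  sqrt(p * (1 - p) / n)
}

# Condition check: at least 10 expected cases of each outcome
normal_ok <- function(p, n) {
  n * p >= 10 && n * (1 - p) >= 10
}

se_prop(0.5, 50)      # SE when p = 0.5 and n = 50
normal_ok(0.5, 50)    # 25 expected cases of each outcome, so TRUE
normal_ok(0.1, 50)    # only 5 expected cases of one outcome, so FALSE
normal_ok(0.1, 1000)  # 100 and 900 expected cases, so TRUE
```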

SLIDE 10

CI Summary: Single Proportion

  0. Check whether conditions for distribution approximation hold: $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$
  1. If so, find the mean and SD of the theoretical distribution to replace the bootstrap distribution
      • Mean = Sample Statistic = $\hat{p}$
      • SD = Standard Error = $\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
  2. Use the confidence level and a Standard Normal to get z-scores of the endpoints
      • 95%: z = ±1.96 (≈ 2)
      • 99%: z = ±2.58
      • 90%: z = ±1.64
  3. Convert z-scores to endpoints on the original scale using the mean and standard deviation found in step 1:
     Endpoint = Sample Statistic + z · SE
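
Steps 0 to 3 can be collected into one function. This is a sketch under stated assumptions, not code from the slides: `prop_ci` is a hypothetical name, it takes a count of "successes" `x` and a sample size `n`, and the example call uses made-up numbers.

```r
# Hypothetical helper: Normal-based CI for a single proportion.
# x = number of "successes", n = sample size, conf = confidence level.
prop_ci <- function(x, n, conf = 0.95) {
  # Step 0: check the conditions (n * p-hat = x, n * (1 - p-hat) = n - x)
  if (x < 10 || n - x < 10) {
    warning("Fewer than 10 cases of some outcome; Normal approximation may be poor.")
  }
  # Step 1: mean and SD (standard error) of the approximating Normal
  p_hat <- x / n
  se <- sqrt(p_hat * (1 - p_hat) / n)
  # Step 2: z-score of the upper endpoint for this confidence level
  z <- qnorm(1 - (1 - conf) / 2)
  # Step 3: convert back to the original scale
  c(lower = p_hat - z * se, upper = p_hat + z * se)
}

# z-scores for the three confidence levels listed in step 2
round(qnorm(1 - (1 - c(0.90, 0.95, 0.99)) / 2), 3)

# Made-up example: 30 successes out of 100
prop_ci(30, 100, conf = 0.95)
```

To three decimals the 90% quantile is 1.645, which the slide lists as 1.64.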

SLIDE 11

Example: Kissing Right

Most people are right-handed, and even the right eye is dominant for most people. Developmental biologists have suggested that late-stage human embryos tend to turn their heads to the right. In a study reported in Nature (2003), German bio-psychologist Onur Güntürkün studied kissing couples in public places such as airports, train stations, beaches, and parks. They observed 124 couples, age 13-70 years. For each kissing couple observed, the researchers noted whether the couple leaned their heads to the right or to the left. Of the 124 couples, 80 leaned right.

Let’s find a 95% confidence interval for p, the proportion of all couples in the target population who would lean right.

SLIDE 12

Kissing Right: Descriptive Stats and Conditions

  • Population Parameter (what we want to estimate): $p$, proportion of all couples who would lean right
  • Sample Statistic (what we have): $\hat{p}$, proportion of couples in this study who leaned right
    $\hat{p} = 80/124 = 0.645$
  • Conditions: $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$
    $n\hat{p} = 124 \times 0.645 = 80$
    $n(1 - \hat{p}) = 124 \times 0.355 = 44$
    so we are okay to use the Normal distribution approximation.

SLIDE 13

Kissing Right: Standard Error

  • Standard Error for a proportion: $SE_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
  • Estimate with $\widehat{SE}_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
  • We have $n = 124$ and $\hat{p} = 0.645$:
    $\widehat{SE}_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\frac{0.645 \times 0.355}{124}} = 0.043$

SLIDE 14

Kissing Right: z-score and Confidence Interval

  • For a 95% interval, the z-scores of the endpoints are the 0.025 and 0.975 quantiles of a standard Normal

    library(mosaic)  # provides xqnorm()
    zEndpoints <- xqnorm(c(0.025, 0.975), mean = 0, sd = 1)

    [Figure: standard Normal density with shaded regions A: 0.025, B: 0.950, C: 0.025.]

    zEndpoints
    [1] -1.959964  1.959964

  • The confidence interval is given by Sample Stat. ± $z_{\text{endpoint}} \cdot \widehat{SE}$
  • which is 0.645 ± 1.96 · 0.043, or [0.561, 0.729]
  • Conclusion: We are 95% confident that between 56.1% and 72.9% of couples would tend to lean right when kissing.
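
The same interval can be reproduced in base R without rounding the intermediate values. This sketch uses `qnorm()` in place of `xqnorm()` so that no extra package is needed; the variable names are assumptions.

```r
x <- 80    # couples who leaned right
n <- 124   # couples observed

p_hat  <- x / n                            # sample proportion
se_hat <- sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error
z      <- qnorm(0.975)                     # upper endpoint z-score for 95%

ci <- p_hat + c(-1, 1) * z * se_hat        # lower and upper endpoints

round(c(p_hat = p_hat, se_hat = se_hat), 3)
round(ci, 3)
```

Carrying full precision through the calculation gives the same endpoints, to three decimals, as the rounded hand calculation.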

SLIDE 16

P-values: Single Proportion

Computing P-values using a Normal in place of a randomization distribution:

  0. Check whether conditions for distribution approximation hold: $np_0 \ge 10$ and $n(1 - p_0) \ge 10$
  1. If so, find the mean and SD of the Normal to replace the randomization distribution
      • Mean = Null Parameter Value = $p_0$
      • SD = Standard Error = $\sqrt{\frac{p_0(1 - p_0)}{n}}$
  2. Convert the observed statistic to its z-score within this Normal distribution:
     $z = \frac{\text{Observed Sample Statistic} - \text{Null Parameter}}{\text{Standard Error}}$
  3. The P-value is the area under the Standard Normal curve past z (or past z and −z if two-tailed)
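
These steps can be sketched as a single base R function. The name `prop_z_test`, its arguments, and the `alternative` labels are assumptions chosen for illustration, and the example call uses made-up numbers.

```r
# Hypothetical helper: Normal-based test for a single proportion.
# x = number of "successes", n = sample size, p0 = null parameter value.
prop_z_test <- function(x, n, p0,
                        alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  # Step 0: conditions are checked using the null value p0
  if (n * p0 < 10 || n * (1 - p0) < 10) {
    warning("Fewer than 10 expected cases of some outcome under the null.")
  }
  # Step 1: SD (standard error) of the Normal centered at p0
  se <- sqrt(p0 * (1 - p0) / n)
  # Step 2: z-score of the observed sample proportion
  z <- (x / n - p0) / se
  # Step 3: area under the Standard Normal curve past z
  p_value <- switch(alternative,
    two.sided = 2 * pnorm(abs(z), lower.tail = FALSE),
    greater   = pnorm(z, lower.tail = FALSE),
    less      = pnorm(z)
  )
  list(z = z, p_value = p_value)
}

# Made-up example: 60 successes in 100 trials, null value 0.5
prop_z_test(60, 100, p0 = 0.5)
```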

SLIDE 17

Example: Kissing Right

Most people are right-handed, and even the right eye is dominant for most people. Developmental biologists have suggested that late-stage human embryos tend to turn their heads to the right. In a study reported in Nature (2003), German bio-psychologist Onur Güntürkün studied kissing couples in public places such as airports, train stations, beaches, and parks. They observed 124 couples, age 13-70 years. For each kissing couple observed, the researchers noted whether the couple leaned their heads to the right or to the left. Of the 124 couples, 80 leaned right.

Let’s assess how strong the evidence is against the null hypothesis that couples are equally likely to lean right and left.

SLIDE 18

Kissing Right: Descriptive Stats and Conditions

  • Population Parameter (what our hypotheses are about): $p$, proportion of all couples who would lean right
  • Hypotheses
    $H_0: p = 0.5$ (We write $p_0$ for the null parameter)
    $H_1: p \ne 0.5$
  • Sample Statistic (what we have): $\hat{p}$, proportion of couples in this study who leaned right
    $\hat{p} = 80/124 = 0.645$
  • Conditions: $np_0 \ge 10$ and $n(1 - p_0) \ge 10$
    $np_0 = 124 \times 0.5 = 62$
    $n(1 - p_0) = 124 \times 0.5 = 62$
    so we are okay to use the Normal distribution approximation.

SLIDE 19

Kissing Right: Standard Error

  • Standard Error for a proportion: $SE_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
  • Estimate with $\widehat{SE}_{\hat{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}}$
  • We have $n = 124$ and $p_0 = 0.5$:
    $\widehat{SE}_{\hat{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{0.5 \times 0.5}{124}} = 0.045$
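
The test plugs the null value $p_0$ into the standard error formula, whereas a confidence interval plugs in $\hat{p}$. A short base R sketch makes the difference concrete for this example; the variable names are assumptions.

```r
n     <- 124
p_hat <- 80 / 124   # observed sample proportion
p0    <- 0.5        # null parameter value

se_ci   <- sqrt(p_hat * (1 - p_hat) / n)  # plug in p-hat: used for the CI
se_test <- sqrt(p0 * (1 - p0) / n)        # plug in p0: used for the test

round(c(se_ci = se_ci, se_test = se_test), 3)
```

The null-based SE is slightly larger here because $p(1 - p)$ is largest at $p = 0.5$.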

SLIDE 20

Kissing Right: z-score

  • In place of a randomization distribution, we use a Normal with mean $p_0 = 0.5$ and standard deviation equal to our estimated standard error: 0.045.
  • Find the z-score (test statistic) associated with our observed sample statistic, $\hat{p} = 0.645$:
    $\text{Test Statistic} = \frac{\text{Observed Statistic} - \text{Null Parameter}}{\widehat{SE}} = \frac{\hat{p} - p_0}{\widehat{SE}} = \frac{0.645 - 0.5}{0.045} = 3.22$

SLIDE 21

Kissing Right: P-value and Conclusion

  • Use the z-score (test statistic) associated with our sample statistic to find the P-value using a Standard Normal

    library(mosaic)  # provides xpnorm()
    ## two-tailed
    P.value <- 2 * xpnorm(3.22, mean = 0, sd = 1, lower.tail = FALSE)

    [Figure: standard Normal density (x from −4 to 4) with z = 3.22 marked.]

    P.value
    [1] 0.001281906

  • Conclusion: We have statistically significant evidence (z = 3.22, P = 0.001) that couples have a tendency to lean right more often than not when leaning in to kiss.
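
As a cross-check, the P-value can be recomputed in base R with `pnorm()`, both from the rounded z-score used on the slides and without rounding any intermediate values. This is a sketch with assumed variable names, not the author's code.

```r
# Two-tailed P-value from the rounded z-score used on the slides
z_rounded <- 3.22
p_rounded <- 2 * pnorm(z_rounded, lower.tail = FALSE)

# Same calculation carrying full precision through every step
n <- 124
p_hat <- 80 / 124
p0 <- 0.5
z_exact <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_exact <- 2 * pnorm(z_exact, lower.tail = FALSE)

c(z_rounded = z_rounded, p_rounded = p_rounded)
c(z_exact = z_exact, p_exact = p_exact)
```

Without rounding, the z-score comes out near 3.23 rather than 3.22, and the P-value still rounds to 0.001, so the conclusion is unchanged.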