
SLIDE 1

Unit 3: Foundations for inference Lecture 3: Decision errors, significance levels, sample size, and power

Statistics 101

Thomas Leininger

May 31, 2013

SLIDE 2

Visualization of the day

The Flesch/Flesch-Kincaid readability tests are designed to indicate comprehension difficulty when reading a passage of contemporary academic English.

http://www.guardian.co.uk/world/interactive/2013/feb/12/state-of-the-union-reading-level

Statistics 101 (Thomas Leininger) U3 - L3: Decision errors, significance levels, sample size, and power May 31, 2013 2 / 12

SLIDE 3

Video of the day

2013 is the International Year of Statistics

https://www.youtube.com/watch?feature=player_embedded&v=nTBZuQR7dRc

SLIDE 4

Two-sided hypothesis testing with p-values

1. Two-sided hypothesis testing with p-values
2. Significance level vs. confidence level
3. Statistical vs. Practical Significance

SLIDE 5

Two-sided hypothesis testing with p-values

From yesterday:

A poll by the National Sleep Foundation found that college students average about 7 hours of sleep per night. A sample of 169 Duke students yielded an average of 6.88 hours, with a standard deviation of 0.94 hours. Assuming that this is a random sample representative of all Duke students (bit of a leap of faith?), a hypothesis test was conducted to evaluate if Duke students on average sleep less than 7 hours per night. The p-value for this hypothesis test is 0.0485. Which of the following is correct?

If the research question was “Do the data provide convincing evidence that the average amount of sleep Duke students get per night is different than the national average?”, the alternative hypothesis would be different.

SLIDE 7

Two-sided hypothesis testing with p-values

First scenario (Duke students lower than US average)

H0 : µ = 7
HA : µ < 7

Second scenario (Duke students different than US average)

H0 : µ = 7
HA : µ ≠ 7

Hence the p-value would change as well:

[Figure: sampling distribution centered at 7.00, with the two tails beyond 6.88 and 7.12 marked]

p-value = 0.0485 × 2 = 0.097

SLIDE 8

Two-sided hypothesis testing with p-values

Recap: Hypothesis testing framework

Set the hypotheses.

Check assumptions and conditions.

Calculate a test statistic and a p-value.

Make a decision, and interpret it in context of the research question.

SLIDE 9

Two-sided hypothesis testing with p-values

Recap: Hypothesis testing for a population mean

1. Set the hypotheses
   H0 : µ = null value
   HA : µ < or > or ≠ null value

2. Check assumptions and conditions
   Independence: random sample/assignment, 10% condition when sampling without replacement
   Normality: nearly normal population or n ≥ 30, no extreme skew

3. Calculate a test statistic and a p-value (draw a picture!)
   $Z = \frac{\bar{x} - \mu}{SE}$, where $SE = \frac{s}{\sqrt{n}}$

4. Make a decision, and interpret it in context of the research question
   If p-value < α, reject H0; the data provide evidence for HA
   If p-value > α, do not reject H0; the data do not provide evidence for HA
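The recipe above can be checked numerically on the Duke sleep example from earlier in the lecture. The following is a minimal Python sketch, not part of the original slides: the helper names `normal_cdf` and `z_test_mean` are made up for illustration, and the code assumes the normal model is reasonable here (n = 169 is well above 30).

```python
import math


def normal_cdf(z):
    """Standard normal CDF, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def z_test_mean(xbar, mu0, s, n, alternative):
    """Return (Z, p-value) for H0: mu = mu0 (hypothetical helper for this example).

    alternative: "less" (HA: mu < mu0), "greater" (HA: mu > mu0),
    or "two-sided" (HA: mu != mu0).
    """
    se = s / math.sqrt(n)      # SE = s / sqrt(n)
    z = (xbar - mu0) / se      # Z = (xbar - mu0) / SE
    if alternative == "less":
        p = normal_cdf(z)              # area in the lower tail
    elif alternative == "greater":
        p = normal_cdf(-z)             # area in the upper tail
    elif alternative == "two-sided":
        p = 2 * normal_cdf(-abs(z))    # area in both tails
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p


# Duke sleep example: xbar = 6.88, s = 0.94, n = 169, null value 7
z, p_one_sided = z_test_mean(6.88, 7, 0.94, 169, "less")
_, p_two_sided = z_test_mean(6.88, 7, 0.94, 169, "two-sided")

print(f"Z = {z:.2f}")                            # about -1.66
print(f"one-sided p-value = {p_one_sided:.4f}")  # about 0.0485, as on the slide
print(f"two-sided p-value = {p_two_sided:.3f}")  # about 0.097
```

At α = 0.05 the one-sided test rejects H0 while the two-sided test does not, which is why the choice of alternative hypothesis has to be made before looking at the data.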

SLIDE 12

Significance level vs. confidence level

Two sided

[Figure: normal curve with middle area 0.95, tail areas 0.025 each, cutoffs at ±1.96]

Two sided HT with α = 0.05 is equivalent to 95% confidence interval.

One sided

[Figure: normal curve with middle area 0.90, tail areas 0.05 each, cutoff at 1.65]

One sided HT with α = 0.05 is equivalent to 90% confidence interval.

SLIDE 16

Significance level vs. confidence level

Agreement of CI and HT

Confidence intervals and hypothesis tests agree, as long as the two methods use equivalent levels of significance / confidence.

A two sided hypothesis test with significance level α is equivalent to a confidence interval with CL = 1 − α.

A one sided hypothesis test with significance level α is equivalent to a confidence interval with CL = 1 − (2 × α).

If H0 is rejected, a confidence interval that agrees with the result of the hypothesis test should not include the null value.

If H0 is not rejected, a confidence interval that agrees with the result of the hypothesis test should include the null value.
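The agreement rules can be illustrated with the Duke sleep data from the start of the lecture (x̄ = 6.88, s = 0.94, n = 169, null value 7). This is a rough Python sketch rather than anything from the slides: it assumes z-based intervals, and `z_interval` and `includes` are hypothetical helper names.

```python
import math
from statistics import NormalDist

xbar, s, n, mu0 = 6.88, 0.94, 169, 7.0   # Duke sleep example
se = s / math.sqrt(n)


def z_interval(confidence_level):
    """Two-sided z-based confidence interval for the mean (illustrative helper)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return xbar - z_star * se, xbar + z_star * se


def includes(interval, value):
    """True if value lies inside the interval (illustrative helper)."""
    lower, upper = interval
    return lower <= value <= upper


alpha = 0.05
ci_for_two_sided = z_interval(1 - alpha)       # CL = 1 - alpha = 95%
ci_for_one_sided = z_interval(1 - 2 * alpha)   # CL = 1 - 2*alpha = 90%

# The 95% interval includes 7: agrees with not rejecting H0 in the two-sided test.
print(ci_for_two_sided, includes(ci_for_two_sided, mu0))
# The 90% interval excludes 7: agrees with rejecting H0 in the one-sided test.
print(ci_for_one_sided, includes(ci_for_one_sided, mu0))
```

The two outcomes line up with the p-values reported earlier: 0.097 is above 0.05 for the two-sided test, and 0.0485 is below 0.05 for the one-sided test.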

SLIDE 18

Significance level vs. confidence level

Question

A 95% confidence interval for the average waiting time at an emergency room is (128 minutes, 147 minutes). Which of the following is false?

(a) A hypothesis test of HA : µ ≠ 120 min at α = 0.05 is equivalent to this CI.
(b) A hypothesis test of HA : µ > 120 min at α = 0.025 is equivalent to this CI.
(c) This interval does not support the claim that the average wait time is 120 minutes.
(d) The claim that the average wait time is 120 minutes would not be rejected using a 90% confidence interval.
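One way to reason about option (d) is to rescale the 95% interval to a 90% interval. The sketch below is an illustration only: the slide does not say how the interval was computed, so the code assumes it has the form point estimate ± z* × SE, and all variable names are made up for this example.

```python
from statistics import NormalDist

lower_95, upper_95 = 128.0, 147.0
center = (lower_95 + upper_95) / 2       # point estimate: 137.5 minutes
margin_95 = (upper_95 - lower_95) / 2    # margin of error: 9.5 minutes

z_95 = NormalDist().inv_cdf(0.975)       # critical value for 95% confidence
z_90 = NormalDist().inv_cdf(0.95)        # critical value for 90% confidence

se = margin_95 / z_95                    # standard error implied by the assumption
margin_90 = z_90 * se
ci_90 = (center - margin_90, center + margin_90)

print(f"90% CI: ({ci_90[0]:.1f}, {ci_90[1]:.1f})")
print("includes 120:", ci_90[0] <= 120 <= ci_90[1])
```

A 90% interval is narrower than the 95% interval built from the same data, so it sits inside (128, 147) and also excludes 120 minutes.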

SLIDE 27

Statistical vs. Practical Significance

Sample Size

Question

All else held equal, will the p-value be lower if n = 100 or n = 10,000?

(a) n = 100
(b) n = 10,000

Suppose $\bar{x} = 50$, $s = 2$, $H_0: \mu = 49.5$, and $H_A: \mu > 49.5$.

$$Z_{n=100} = \frac{50 - 49.5}{2/\sqrt{100}} = \frac{50 - 49.5}{2/10} = \frac{0.5}{0.2} = 2.5, \quad \text{p-value} = 0.0062$$

$$Z_{n=10000} = \frac{50 - 49.5}{2/\sqrt{10000}} = \frac{50 - 49.5}{2/100} = \frac{0.5}{0.02} = 25, \quad \text{p-value} \approx 0$$

As n increases: SE ↓, Z ↑, p-value ↓
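The arithmetic in this example can be reproduced with a few lines of Python. This is a small sketch under the slide's own numbers (sample mean 50, s = 2, null value 49.5) using an upper-tail test; `upper_tail_p` is a hypothetical helper name, not something from the lecture.

```python
import math


def upper_tail_p(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


xbar, s, mu0 = 50.0, 2.0, 49.5

results = {}
for n in (100, 10_000):
    se = s / math.sqrt(n)     # SE shrinks as n grows
    z = (xbar - mu0) / se     # so Z grows for the same observed difference
    p = upper_tail_p(z)       # and the p-value shrinks
    results[n] = (se, z, p)
    print(f"n = {n:>6}: SE = {se:.2f}, Z = {z:.1f}, p-value = {p:.4f}")
```

The observed difference of 0.5 is the same in both runs; only the sample size changes, which is what drives the p-value toward zero.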

SLIDE 28

Statistical vs. Practical Significance

Real differences between the point estimate and null value are easier to detect with larger samples.

However, very large samples will result in statistical significance even for tiny differences between the sample mean and the null value (effect size), even when the difference is not practically significant.

This is especially important to research: if we conduct a study, we want to focus on finding meaningful results (we want observed differences to be real, but also large enough to matter).

The role of a statistician is not just in the analysis of data, but also in planning and design of a study.

“To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.” – R.A. Fisher
