Course 02402 Overview, Hypotheses Tests Concerning Two Means


Course 02402 Introduction to Statistics
Lecture 7: Chapter 7 and 8: Hypotheses Tests Concerning Two Means (7.7-7.8, 8.1-8.5)
Per Bruun Brockhoff

DTU Informatics Building 305 - room 110 Danish Technical University 2800 Lyngby – Denmark e-mail: pbb@imm.dtu.dk

Per Bruun Brockhoff (pbb@imm.dtu.dk) Introduction to Statistics, Lecture 7 Fall 2012 1 / 40

Overview, Hypotheses Tests Concerning Two Means

1. Hypothesis test - a repetition
     Hypothesis tests and confidence intervals
2. Power and sample size
3. Hypotheses Concerning Two Means
     Example 1
     In general
     With known variance
     With unknown variance - large samples
     With unknown variance - small samples, normal assumption
     Example 1 - cont.
4. Confidence Interval for Difference between Two Means
     Example 1 - cont.
     Example 2
5. Paired t-test
     Example 2 - cont.
6. R (R note 7)

Chapter 7 and 8: Hypotheses Tests Concerning Two Means

  • Hypotheses Tests (7.7-7.8, 8.1-8.5)
  • Tests and confidence intervals
  • Hypotheses tests for two means
  • Randomization and pairing
  • R

Hypothesis test - a repetition

Hypotheses

The null hypothesis vs. the alternative hypothesis:
  H0 : µ = µ0
  H1 : µ ≠ µ0
Note that the 'burden of proof' lies with the alternative H1: H0 is retained unless the data provide sufficient evidence against it. We either choose to accept H0 or to reject H0.

Tests of Hypotheses

A couple of rules of thumb when formulating the hypotheses:
  • Use the equal sign '=' in the null hypothesis when possible
  • The alternative hypothesis should be the claim we wish to establish
  • The alternative hypothesis can be either one- or two-sided
      two-sided: '≠'
      one-sided: '<' or '>'

Hypotheses

When testing statistical hypotheses, two kinds of errors can occur:
  • Type I: Rejection of H0 when H0 is true
  • Type II: Non-rejection of H0 when H1 is true
We define
  P(Type I error) = α
  P(Type II error) = β

Example: Formulation of the Hypotheses

An ambulance company claims that on average it takes 20 minutes from a telephone call to their switchboard until an ambulance is at the location. We have some measurements:
  21.1 22.3 19.6 24.2 ...
If our goal is to show that on average it takes longer than 20 minutes, the null and the alternative hypotheses are:
  H0 : µ = 20 minutes
  H1 : µ > 20 minutes

Example

What kind of errors can occur?
  • Type I: Reject H0 when H0 is true, that is, we mistakenly conclude that it takes longer than 20 minutes for the ambulance to be on location
  • Type II: Not reject H0 when H1 is true, that is, we mistakenly conclude that it takes 20 minutes for the ambulance to be on location

Choosing the Level of Significance α

  • The level of significance α is chosen according to the probability of a Type I error we are willing to accept
  • A typical value is α = 5%
  • If we want to reduce the probability of a Type I error we choose a smaller α, e.g. α = 1%
  • A lower α makes it more difficult to reject H0

Hypotheses Tests in 4 Steps

1. Formulate the hypotheses and choose the level of significance α (choose the "risk-level")
2. Calculate, using the data, the value of the test statistic
3. Calculate the p-value using the test statistic
4. Compare the p-value and the level of significance and draw a conclusion (*)

(*) An alternative to (4) is to compare the test statistic to the critical value.
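The four steps can be sketched in R for the ambulance example. This is only an illustration: the slide truncates the data, so the sketch uses just the four measurements that are visible, and it assumes a one-sample t-test with unknown σ. The variable names are made up for the sketch.

```r
# Step 1: hypotheses H0: mu = 20 vs. H1: mu > 20, and the significance level
alpha <- 0.05
mu0 <- 20
# Only the four measurements visible on the slide (the full list is truncated there)
x <- c(21.1, 22.3, 19.6, 24.2)

# Step 2: test statistic for one mean with unknown sigma
n <- length(x)
t_obs <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Step 3: one-sided p-value from the t distribution with n - 1 degrees of freedom
p_value <- pt(t_obs, df = n - 1, lower.tail = FALSE)

# Step 4: compare the p-value with alpha
reject_H0 <- p_value < alpha

# Alternative to step 4: compare the test statistic with the critical value
t_crit <- qt(1 - alpha, df = n - 1)
reject_by_critical_value <- t_obs > t_crit

# Cross-check against R's built-in one-sample t-test
check <- t.test(x, mu = mu0, alternative = "greater")
```

Both decision rules in step 4 always lead to the same conclusion, since the p-value is below α exactly when the test statistic exceeds the critical value.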

The Connection between Hypotheses Tests and Confidence Intervals

We consider a (1 − α) · 100% confidence interval for µ (for small n and unknown σ):

$$\bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$$

The confidence interval corresponds to the acceptance region for H0 when testing the hypotheses (with two-sided alternative):
  H0 : µ = µ0
  H1 : µ ≠ µ0
That is, H0 is not rejected at level α exactly when µ0 lies inside the interval.
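A small R sketch of this correspondence, using a hypothetical sample (the numbers below are invented for illustration and do not come from the lecture). For several candidate values of µ0 it compares "µ0 lies inside the confidence interval" with "the two-sided t-test does not reject H0".

```r
# Hypothetical sample, used only for illustration
x <- c(4.9, 5.3, 5.1, 4.7, 5.6, 5.2, 4.8, 5.4)
alpha <- 0.05
n <- length(x)

# (1 - alpha) * 100% confidence interval for mu
half_width <- qt(1 - alpha / 2, df = n - 1) * sd(x) / sqrt(n)
ci <- c(mean(x) - half_width, mean(x) + half_width)

# For several candidate values mu0, check whether mu0 is inside the interval
# and whether the two-sided t-test fails to reject H0: mu = mu0
mu0_values <- c(4.5, 4.9, 5.1, 5.3, 5.8)
inside_ci <- mu0_values > ci[1] & mu0_values < ci[2]
not_rejected <- sapply(mu0_values, function(mu0) {
  t.test(x, mu = mu0, conf.level = 1 - alpha)$p.value > alpha
})

print(data.frame(mu0 = mu0_values, inside_ci, not_rejected))
```

The two logical columns should agree row by row, which is the statement on the slide in computational form.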

Power and sample size

How to affect the error probabilities in hypothesis testing?
  • Change the level of significance α
  • Take larger samples, that is, bigger n
The power of a test is defined as 1 − β (see Section 7.7)
Necessary sample size, given the wanted power:

$$n = \left( \sigma \, \frac{z_\beta + z_\alpha}{\mu_0 - \mu_1} \right)^2$$
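The sample size formula can be evaluated directly in R. The sketch below assumes that z_α and z_β denote upper-tail standard normal quantiles (the usual convention for a one-sided test with known σ); the function name and all numerical inputs are hypothetical.

```r
# Necessary sample size for a one-sided test with known sigma.
# z_alpha and z_beta are taken as upper-tail standard normal quantiles (assumption).
sample_size <- function(sigma, mu0, mu1, alpha, beta) {
  z_alpha <- qnorm(1 - alpha)
  z_beta <- qnorm(1 - beta)
  n <- (sigma * (z_beta + z_alpha) / (mu0 - mu1))^2
  ceiling(n)  # round up to a whole number of observations
}

# Hypothetical inputs: sigma = 2, H0: mu = 20, alternative of interest mu1 = 21,
# alpha = 5% and wanted power 1 - beta = 90%
n_needed <- sample_size(sigma = 2, mu0 = 20, mu1 = 21, alpha = 0.05, beta = 0.10)

# Power actually achieved with n_needed observations (one-sided test, known sigma)
achieved_power <- pnorm(sqrt(n_needed) * abs(21 - 20) / 2 - qnorm(1 - 0.05))
```

Because n is rounded up, the achieved power is slightly above the wanted power.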

Hypotheses Concerning Two Means

Example 1

In a nutrition study it is desired to investigate whether there is a difference in energy usage between different types of (moderately physically demanding) work. In the study, the energy usage of 9 secretaries and 9 nurses has been measured. The secretaries are expected to have a sedentary job, while the nurses are expected to have a more physically demanding job. The measurements are given in the following table in MJ:

  A (secretaries)    B (nurses)
        7.53             9.21
        7.48            11.51
        8.08            12.79
        8.09            11.85
       10.15             9.97
        8.40             8.79
       10.88             9.69
        6.13             9.68
        7.90             9.19

In general

We want to compare the mean values of two samples:
  • Sample 1: $n_1$, $\bar{X}_1$ and $s_1^2$
  • Sample 2: $n_2$, $\bar{X}_2$ and $s_2^2$

Formulation of Hypotheses

Null hypothesis vs. alternative hypothesis (shown here as two-sided):
  H0 : µ1 − µ2 = δ
  H1 : µ1 − µ2 ≠ δ
We either choose to accept H0 or to reject H0.
(We are often interested in the test where δ = 0.)

With known variance

Calculating the Test Statistic

When testing hypotheses concerning two means ($\mu_1$ and $\mu_2$) for data that are assumed to be normal and $\sigma_1^2$ and $\sigma_2^2$ are known, the test statistic is:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - \delta}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

If H0 is true, $Z \sim N(0, 1^2)$. From this the P-value of the test can be found.
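As a sketch of how the P-value follows from Z, the R function below computes the statistic and the P-values for the three possible alternatives. The function name and the summary numbers passed to it are hypothetical.

```r
# Two-sample Z statistic with known variances, and P-values for each alternative
two_sample_z <- function(xbar1, xbar2, sigma1_sq, sigma2_sq, n1, n2, delta = 0) {
  z <- ((xbar1 - xbar2) - delta) / sqrt(sigma1_sq / n1 + sigma2_sq / n2)
  list(
    z = z,
    p_less = pnorm(z),                                   # H1: mu1 - mu2 < delta
    p_greater = pnorm(z, lower.tail = FALSE),            # H1: mu1 - mu2 > delta
    p_two_sided = 2 * pnorm(abs(z), lower.tail = FALSE)  # H1: mu1 - mu2 != delta
  )
}

# Hypothetical summary statistics, for illustration only
res <- two_sample_z(xbar1 = 16.4, xbar2 = 20.1,
                    sigma1_sq = 16, sigma2_sq = 25,
                    n1 = 10, n2 = 10)
print(res)
```

For large samples with unknown variances, the same function can be called with the sample variances in place of `sigma1_sq` and `sigma2_sq`.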

Comparing to a Critical Value

When testing hypotheses concerning two means ($\mu_1$ and $\mu_2$) for data that are assumed to be normal and $\sigma_1^2$ and $\sigma_2^2$ are known, we have:

  Alternative hypothesis    Reject null hypothesis if
  µ1 − µ2 < δ               Z < −zα
  µ1 − µ2 > δ               Z > zα
  µ1 − µ2 ≠ δ               Z < −zα/2 or Z > zα/2

With unknown variance - large samples

Calculating the Test Statistic

When testing hypotheses concerning two means ($\mu_1$ and $\mu_2$) for data that are assumed to be normal, $\sigma_1^2$ and $\sigma_2^2$ are unknown but the samples are large, the test statistic is:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - \delta}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$

If H0 is true, $Z \sim N(0, 1^2)$. From this the P-value of the test can be found.

Comparing to a Critical Value

When testing hypotheses concerning two means ($\mu_1$ and $\mu_2$) for data that are assumed to be normal, $\sigma_1^2$ and $\sigma_2^2$ are unknown but the samples are large, we have:

  Alternative hypothesis    Reject null hypothesis if
  µ1 − µ2 < δ               Z < −zα
  µ1 − µ2 > δ               Z > zα
  µ1 − µ2 ≠ δ               Z < −zα/2 or Z > zα/2

With unknown variance - small samples, normal assumption

Calculating the Test Statistic

When testing hypotheses concerning two means ($\mu_1$ and $\mu_2$) for data that are assumed to be normal, $\sigma_1^2$ and $\sigma_2^2$ are unknown but $\sigma_1^2 = \sigma_2^2$ and the samples are small, the test statistic is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - \delta}{\sqrt{s_p^2/n_1 + s_p^2/n_2}}$$

where

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

If H0 is true, $t \sim t(n_1 + n_2 - 2)$.
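The pooled statistic can be computed by hand in R, which makes the role of the pooled variance explicit. The helper name `pooled_t` is made up for this sketch; the data are the `x1` and `x2` vectors from the R note of this lecture, whose reported output (t = -1.779, df = 18, p-value = 0.04606 for the alternative "less") gives something to compare against.

```r
# Pooled two-sample t statistic, following the two formulas above
pooled_t <- function(x1, x2, delta = 0) {
  n1 <- length(x1)
  n2 <- length(x2)
  # Pooled variance estimate s_p^2
  sp2 <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)
  t_obs <- ((mean(x1) - mean(x2)) - delta) / sqrt(sp2 / n1 + sp2 / n2)
  list(t = t_obs, df = n1 + n2 - 2, sp2 = sp2)
}

# Data from the R note of this lecture
x1 <- c(10, 13, 16, 19, 17, 15, 20, 23, 15, 16)
x2 <- c(13, 16, 20, 25, 18, 16, 27, 30, 17, 19)

res <- pooled_t(x1, x2)

# One-sided P-value for the alternative mu1 - mu2 < 0
p_less <- pt(res$t, df = res$df)
print(c(t = res$t, df = res$df, p_value = p_less))
```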

Comparing to a Critical Value

When testing hypotheses concerning two mean values, for data that are assumed to be normal, $\sigma_1^2$ and $\sigma_2^2$ are unknown and the samples are small:

  Alternative hypothesis    Reject null hypothesis if
  µ1 − µ2 < δ               t < −tα
  µ1 − µ2 > δ               t > tα
  µ1 − µ2 ≠ δ               t < −tα/2 or t > tα/2

When using Table 4, v = n1 + n2 − 2


Example 1 - cont.

Carry out a hypothesis test to check whether the mean energy usage is the same for the two kinds of jobs. Use a significance level of α = 5%.
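One way to carry out this test in R is sketched below. It assumes equal variances in the two groups (the small-sample case above) and a two-sided alternative with δ = 0; the variable names are chosen for the sketch.

```r
# Energy usage in MJ from the Example 1 table
A <- c(7.53, 7.48, 8.08, 8.09, 10.15, 8.40, 10.88, 6.13, 7.90)   # secretaries
B <- c(9.21, 11.51, 12.79, 11.85, 9.97, 8.79, 9.69, 9.68, 9.19)  # nurses
alpha <- 0.05

# H0: mu_A - mu_B = 0 against the two-sided alternative, pooled variance
test <- t.test(A, B, alternative = "two.sided", var.equal = TRUE)

# Decision via the p-value
reject_by_p <- test$p.value < alpha

# Equivalent decision via the critical value t_{alpha/2} with n1 + n2 - 2 df
t_crit <- qt(1 - alpha / 2, df = length(A) + length(B) - 2)
reject_by_crit <- abs(unname(test$statistic)) > t_crit

print(test)
```

The two decision rules give the same answer; printing `test` shows the statistic, the degrees of freedom and the p-value in one place.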

Confidence Interval for Difference between Two Means

For large samples, a (1 − α) · 100% confidence interval is calculated as:

$$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

(if $\sigma_1^2$ and $\sigma_2^2$ are known, they should be used instead of $s_1^2$ and $s_2^2$)

For small samples and unknown $\sigma_1^2$ and $\sigma_2^2$, but with $\sigma_1^2 = \sigma_2^2$, a (1 − α) · 100% confidence interval is calculated as:

$$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2} \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Using Table 4, the number of degrees of freedom is v = n1 + n2 − 2


Example 1 - cont., confidence interval
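The slide leaves the calculation to the lecture. A possible R sketch for the Example 1 data is given below, assuming the small-sample interval with pooled variance and α = 5%; it computes the interval from the formula and compares it with the interval reported by `t.test`.

```r
# Energy usage in MJ from the Example 1 table
A <- c(7.53, 7.48, 8.08, 8.09, 10.15, 8.40, 10.88, 6.13, 7.90)   # secretaries
B <- c(9.21, 11.51, 12.79, 11.85, 9.97, 8.79, 9.69, 9.68, 9.19)  # nurses
alpha <- 0.05
n1 <- length(A)
n2 <- length(B)

# Pooled variance and half-width of the interval
sp2 <- ((n1 - 1) * var(A) + (n2 - 1) * var(B)) / (n1 + n2 - 2)
half_width <- qt(1 - alpha / 2, df = n1 + n2 - 2) * sqrt(sp2) * sqrt(1 / n1 + 1 / n2)

# Interval for mu_A - mu_B from the formula
ci_manual <- (mean(A) - mean(B)) + c(-1, 1) * half_width

# The same interval as reported by t.test
ci_builtin <- as.numeric(t.test(A, B, var.equal = TRUE, conf.level = 1 - alpha)$conf.int)

print(rbind(ci_manual, ci_builtin))
```

If the interval does not contain 0, the two-sided test of δ = 0 at the same level rejects H0, in line with the connection between tests and confidence intervals.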

Example 2

In a study it is desired to compare two kinds of sleeping medicine, A and B. For 10 test persons the following results are obtained, given as prolonged sleep length (in hours):

Example 2 - cont.

  person      A       B
     1      +0.7    +1.9
     2      -1.6    +0.8
     3      -0.2    +1.1
     4      -1.2    +0.1
     5      -1.0    -0.1
     6      +3.4    +4.4
     7      +3.7    +5.5
     8      +0.8    +1.6
     9       0.0    +4.6
    10      +2.0    +3.4

Paired t-test

We now consider the situation where we want to compare two means but the data are paired.
The hypothesis test is based on the differences, $D_i$, between the paired observations:

  $D_i = X_i - Y_i$ for i = 1, 2, ..., n

Then we can calculate the mean $\bar{D}$ and the variance $S_D^2$ for D. Testing $\bar{D}$ is done the same way as when testing one mean.
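Written out, and assuming the one-sample t-test for a mean with unknown variance is the test meant by "testing one mean", the paired test statistic would be as follows. The symbols $\mu_D$ (the mean of the differences) and $\delta$ (its hypothesized value, often 0) are notation introduced here for the sketch.

```latex
\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i, \qquad
S_D^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(D_i - \bar{D}\right)^2, \qquad
t = \frac{\bar{D} - \delta}{S_D/\sqrt{n}}
```

Under $H_0 : \mu_D = \delta$ and normally distributed differences, $t \sim t(n - 1)$, so the critical values are taken with $n - 1$ degrees of freedom rather than $n_1 + n_2 - 2$.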

Example 2 - cont.

  person      A       B     D = B − A
     1      +0.7    +1.9      +1.2
     2      -1.6    +0.8      +2.4
     3      -0.2    +1.1      +1.3
     4      -1.2    +0.1      +1.3
     5      -1.0    -0.1      +0.9
     6      +3.4    +4.4      +1.0
     7      +3.7    +5.5      +1.8
     8      +0.8    +1.6      +0.8
     9       0.0    +4.6      +4.6
    10      +2.0    +3.4      +1.4

Example 2 - cont.

Carry out a hypothesis test to check whether the two sleeping medicines are equally effective. Use a significance level of α = 5%.
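A possible R sketch of this paired test is shown below, with a two-sided alternative. One value is an assumption: the A entry for person 9 is not legible in the table, and 0.0 is used because it is the value implied by B = +4.6 and D = B − A = +4.6. The sketch computes the test from the differences and compares it with `t.test(..., paired = TRUE)`.

```r
# Prolonged sleep length in hours for the 10 test persons
# (A for person 9 is taken as 0.0, the value implied by D = B - A = +4.6)
A <- c(0.7, -1.6, -0.2, -1.2, -1.0, 3.4, 3.7, 0.8, 0.0, 2.0)
B <- c(1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4)
alpha <- 0.05

# Differences between the paired observations
D <- B - A
n <- length(D)

# One-sample t-test on the differences, H0: mean difference = 0
t_obs <- mean(D) / (sd(D) / sqrt(n))
p_value <- 2 * pt(abs(t_obs), df = n - 1, lower.tail = FALSE)
reject_H0 <- p_value < alpha

# The built-in paired test gives the same statistic and p-value
paired <- t.test(B, A, paired = TRUE)
print(paired)
```

Treating the two columns as independent samples would ignore the pairing; the paired test works on the within-person differences only.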

R (R note 7)

> x1=c(10,13,16,19,17,15,20,23,15,16)
> x2=c(13,16,20,25,18,16,27,30,17,19)
> t.test(x1,x2,alt="less",conf.level=0.95,var.equal=TRUE)

        Pooled-Variance Two-Sample t-Test

data:  x1 and x2
t = -1.779, df = 18, p-value = 0.04606
alternative hypothesis: difference in means is less than 0
95 percent confidence interval:
        -Inf -0.09349972
sample estimates:
 mean of x mean of y
      16.4      20.1
