SLIDE 1

Chapter 9: Testing Hypotheses

In this chapter we will cover:

  • 1. Hypothesis tests (§9.1, 9.2 Rice)
  • 2. Inference in regression models (§14.4 Rice)
  • 3. Comparing two samples (§11.1, 11.2, 11.3)

Testing Hypotheses

  • Statistical hypothesis testing is a method of distinguishing between different possible distributions, based on observed data

  • We will set up the problem as a choice between a null hypothesis H0 and an alternative hypothesis, HA
  • To make the choice we use the observed data and some statistical test procedure

Example B: page 300

  • Suppose we are studying a subject to see if they have ESP powers
  • We would like to know if they can predict the suit of a randomly selected card better than would be expected by chance

  • 52 cards are randomly selected with replacement, and the subject is asked what the suit is. Let T be the number guessed correctly

  • The null hypothesis is that he is randomly guessing, so under H0 (i.e. assuming H0 is true) T has a Binomial(0.25, 52) distribution. So H0 : p = 0.25

  • The alternative hypothesis is that he can guess better than chance, so HA : p > 0.25

Hypotheses

  • Suppose the subject claimed to get 50% correct; then the alternative might be HA : p = 0.5. This is called a simple alternative

  • The alternative HA : p > 0.25 is called a composite hypothesis since it contains many possibilities. It is also called one-sided
  • The alternative HA : p ≠ 0.25 is called a two-sided composite hypothesis

SLIDE 2

Neyman-Pearson paradigm

  • We want to choose between H0 and HA, and we make this decision based on a statistic T which is a function of the observed data X
  • The set of values for T for which we accept H0 is called the acceptance region
  • The set of values for T for which we reject H0 is called the rejection region
  • Note that we always use the choice: accept or reject H0. We don’t say accept HA.

Errors

  • One possible error is to reject H0 when it is in fact true. This is called a type I error.
  • The probability of type I error is called α, the significance level
  • The second type of possible error is to accept H0 when it is false. This is called type II error
  • The probability of type II error is called β, and often 1 − β is called the power
  • A good test would have small α and small β

Example B

  • We want to test H0 : p = 0.25 against HA : p > 0.25 in the ESP example
  • The test statistic is T, the number of correct guesses
  • We choose an acceptance region: accept H0 if T < T0
  • What value would you choose for T0?
  • Suppose we choose T0 = 18. What would α and β be?

SLIDE 3

Example B

[Figure: null distribution of the test statistic T, the Binomial(0.25, 52) probabilities, with the acceptance region (T < 18) and the rejection region (T ≥ 18) shown.]

Example B

  • The size of the test defined by T0 = 18 is the probability of rejecting H0 when H0 is true
  • Thus

α = P(T ≥ 18 | p = 0.25) = 1 − P(T ≤ 17 | p = 0.25) = 1 − F(17) = 0.078

where F(x) is the cdf of the Binomial(0.25, 52) distribution

  • So this choice of T0 has a small probability of committing type I error
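As a check, the size α can be computed directly from the Binomial(0.25, 52) null distribution. A minimal sketch using only the Python standard library:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(T = k) for a Binomial(n, p) count."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Size of the test "reject H0 if T >= 18" when H0: p = 0.25 holds, over n = 52 draws.
alpha = sum(binom_pmf(k, 52, 0.25) for k in range(18, 53))
print(alpha)  # close to the 0.078 quoted above
```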

SLIDE 4

Example B

  • To calculate β, the probability of type II error, we need to calculate P(Accept H0 | HA)
  • The value of this will depend on the actual value of p in HA
  • Thus if p = 0.5 we can calculate

β = P(T < 18|p = 0.5)

  • This can be calculated from the cdf for a Binomial(0.5, 52) distribution. It equals 0.008.
  • Since the size of β depends on the value of p a power curve, which plots 1 − β against p is a useful tool
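The same direct computation gives β, and evaluating it over a grid of p values traces out the power curve. A sketch, keeping the cutoff T0 = 18 from above:

```python
from math import comb

def binom_cdf(x, n, p):
    """P(T <= x) for a Binomial(n, p) count."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(0, x + 1))

n = 52
# Type II error for the rule "accept H0 if T < 18" when the true p is 0.5:
beta = binom_cdf(17, n, 0.5)
print(beta)  # close to the 0.008 quoted above

# Power curve: power = 1 - beta as a function of the true p.
for p in (0.25, 0.30, 0.40, 0.50, 0.60):
    print(p, 1 - binom_cdf(17, n, p))
```

The printed grid shows the power rising steeply as the true p moves away from 0.25.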

Example B: Power curve

[Figure: power curve, plotting the power 1 − β against the true value of p.]

Example B: Power curve

  • We see that the test defined by T0 = 18 has high power against any alternative p greater than 0.5
  • It is not very powerful against p = 0.3
  • You have to choose the acceptance region to trade off good size against good power. You can't have both all the time.

SLIDE 5

Questions

  • If we choose T0 = 25 would the size α increase or decrease?
  • What would happen to the power curve?

Recommended Questions: from §9.12 look at questions 1, 2, 3, 5, 14

Regression Examples

  • Suppose we have a simple regression model

Yi = β0 + β1Xi + ei which satisfies the assumptions of Chapter 14

  • We have the least squares regression estimates for β0 and β1. These are the same as the maximum likelihood estimates for the parameters
  • We can therefore ask: what are confidence intervals for β̂0 and β̂1?

Confidence Intervals

  • Just as with the example above, to build the confidence intervals we need to know about the standard errors of the estimates
  • We have from Chapter 14 that the standard errors s_β̂i can be calculated from

Var(β̂0) = σ² Σxi² / (n Σxi² − (Σxi)²)

Var(β̂1) = n σ² / (n Σxi² − (Σxi)²)

(with sums over i = 1, . . . , n) by the formula s_β̂i = √Var(β̂i)

SLIDE 6

Confidence Intervals

  • We can then use the fact that for large enough samples the distribution of

(β̂i − βi) / s_β̂i

has a tn−p distribution, where p is the number of terms in the regression.

  • The 95% confidence interval for βi is given by

(β̂i − tn−p(0.025) s_β̂i , β̂i + tn−p(0.025) s_β̂i)

  • It is of interest to see if β1 = 0 lies inside the confidence interval

Hypothesis tests

  • The previous analysis can also be used to test the hypothesis β1 = 0
  • The Null model is then

Yi = β0 + ei

  • This is asking: is there a linear relationship between the explanatory variable and the response?
  • Freedman makes clear that this is not testing if the explanatory variable causes the change in the response.

SLIDE 7

Example

[Figure: scatter plot of Democratic Congress seats against votes, 1900−1970; votes % on the horizontal axis, seat % on the vertical axis.]

SLIDE 8

Example

  • Here is some output from a computer package which has fitted the linear regression

        Estimate   Std. Error
beta0   −50.1821   6.9424
beta1   2.0845     0.1385

  • Thus we can easily calculate the confidence interval for β0:

(−50.1821 − t35(0.025) × 6.9424, −50.1821 + t35(0.025) × 6.9424) = (−64.28, −36.09)

and for β1:

(2.0845 − 2.03 × 0.1385, 2.0845 + 2.03 × 0.1385) = (1.80, 2.36)

  • It is clear that β1 = 0 lies well outside the confidence interval
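These intervals are easy to reproduce by hand. The sketch below hard-codes the critical value t35(0.025) ≈ 2.03 quoted above rather than computing it from the t distribution (which would need a statistics library):

```python
# Confidence interval from an estimate, its standard error, and a t critical value.
def conf_int(estimate, se, tcrit):
    return (estimate - tcrit * se, estimate + tcrit * se)

tcrit = 2.03  # t35(0.025), taken from the slide
print(conf_int(-50.1821, 6.9424, tcrit))  # beta0: roughly (-64.28, -36.09)
print(conf_int(2.0845, 0.1385, tcrit))    # beta1: roughly (1.80, 2.36)

# The t statistic for H0: beta1 = 0 is the estimate over its standard error.
t_beta1 = 2.0845 / 0.1385
print(round(t_beta1, 2))  # 15.05, far beyond the critical value
```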

Example: Q40 page 562 Rice

  • Cogswell studied a method of measuring resistance to breathing in children. He looked at two groups: asthma and cystic fibrosis patients
  • He wanted to investigate if there was a relationship between the height of the patient, H, and the resistance, R
  • He tried fitting a regression model

Ri = β0 + β1Hi + ei

  • After fitting we need to consider if β1 = 0 is consistent with the data

SLIDE 9

Example: Q40 page 562 Rice

[Figure: scatter plots of resistance against height (cm) for the asthma patients and for the cystic fibrosis patients.]

Example: Q40 page 562 Rice

  • From the fit we find that for the asthma patients the fitted values are β̂0 = 27.5157 (s.e. = 6.63), β̂1 = −0.136 (s.e. = 0.053)
  • The 95% confidence interval for β1 is therefore (−0.253, −0.019). Note that 0 does not lie inside this interval.
  • For the cystic fibrosis patients the fitted values are β̂0 = 23.8 (s.e. = 9.63), β̂1 = −0.125 (s.e. = 0.0941)
  • Now the 95% confidence interval for β1 is (−0.32, 0.07). This does contain 0, so it might be that there is no relationship between height and resistance for these patients

SLIDE 10

Example: Q40 page 562 Rice

  • We can investigate β1 = 0 by a hypothesis test.
  • The test statistic for H0 : β1 = 0 against the two-sided alternative HA : β1 ≠ 0 is given by

β̂1 / s_β̂1

which, under the null, has a tn−2 distribution

  • For the asthma patients the test statistic has value −2.562, which is significant at the 5% level since −2.562 < −2.02. Thus there is evidence to reject the null β1 = 0
  • For the cystic fibrosis patients the test statistic has value −1.324, which is not significant at the 5% level since −2.073 < −1.324 < 2.073. Thus there is no evidence to reject the null β1 = 0

p-values

  • Another way of viewing a test of H0 is to consider the p-value of a test statistic S(x).
  • Suppose that large values of S(x) are evidence against H0, and that we observed the value Sobs
  • We can ask the question: what is the probability, given H0 is true, that we would observe a value of S at least as big as the one we did observe?
  • The p-value is defined as

P(S ≥ Sobs | H0)

If this is small it means that Sobs was 'unusual' if H0 is true.

  • We shall investigate p-values in the regression context
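Before the regression context, note that for a discrete statistic such as the count T in Example B the p-value can be computed directly from the null distribution. A standard-library sketch, with an assumed observed count of 18:

```python
from math import comb

def binom_upper_tail(t_obs, n, p):
    """p-value P(T >= t_obs | H0) for an upper-tailed test on a Binomial(n, p) count."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(t_obs, n + 1))

# Suppose (hypothetically) the ESP subject got 18 of the 52 suits right.
p_value = binom_upper_tail(18, 52, 0.25)
print(p_value)  # close to 0.078, so not significant at the 5% level
```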

p-values

  • If the p-value is greater than 0.05 then we say there is not strong evidence that the hypothesis is inconsistent with the data.
  • If the p-value is between 0.05 and 0.01 we say that there is reasonable evidence that the hypothesis is inconsistent with the data.
  • If the p-value is less than 0.01 then we say that there is strong evidence that the hypothesis is inconsistent with the data.

SLIDE 11

Example: Q40 page 562 Rice

  • Here, in testing for β1 = 0, the p-value is the probability

P( |β̂1 / s_β̂1| > tobs )

  • The output from the computer package gives, for the asthma patients,

        t value   Pr(>|t|)
beta0   4.146     0.000171 ***
beta1   −2.562    0.014264 *
Signif. codes: '***' 0.001, '*' 0.05

  • Thus β1 is 'significantly' different from zero at the 5% level.

Recommended questions: please look at Rice §14.8, Questions 12, 40 in JMP-IN.

Consider the following computer generated output for a regression with 45 observations:

        Estimate   s.e.      t-value    p-value
beta0   41.94      0.17114   245.116    <2e-16 ***
beta1   −0.095     0.13300   −0.714     0.479

Give a clear statement of what you have learned from this output in terms of confidence intervals and hypothesis testing. What is your predicted value of the response when the explanatory variable equals 10.3?

Comparing two samples

  • Two independent samples
  • Paired samples

Two independent samples

  • In many experiments two samples may be regarded as independent; for example a treatment group against a control group in a medical trial
  • We will concentrate on the following set-up: X1, · · · , Xn is assumed drawn from a N(µX, σ²) distribution, while Y1, · · · , Ym is independently drawn from a N(µY, σ²) distribution
  • We are interested in the difference µX − µY
  • Note the sample sizes n and m may differ, but the variance σ² is common to both samples

SLIDE 12

Two independent samples: known σ

  • To estimate µX − µY we can use the maximum likelihood estimate x̄ − ȳ
  • If σ were known then a confidence interval for µX − µY could be based on the statistic

z = ((x̄ − ȳ) − (µX − µY)) / (σ √(1/n + 1/m))

which follows a standard Normal distribution

  • The confidence interval would be of the form

(x̄ − ȳ) ± z(α/2) σ √(1/n + 1/m)

Unknown σ

  • When σ is unknown it must also be estimated
  • We use the pooled sample variance

s²p = ((n − 1) s²X + (m − 1) s²Y) / (m + n − 2)

where s²X, s²Y are the sample variances for the two samples

  • The test statistic is then

t = ((x̄ − ȳ) − (µX − µY)) / (sp √(1/n + 1/m))

which has a t distribution with m + n − 2 degrees of freedom

  • The confidence interval is then given by

(x̄ − ȳ) ± tn+m−2(α/2) sp √(1/n + 1/m)

Example

  • Two methods, A and B, were used to find the latent heat of fusion for ice. The data is given on page 390 of Rice
  • The sample sizes are n = 13 and m = 8, while the sample mean and variance for A are 80.02 and 0.024², and for B they are 79.98 and 0.031²
  • The pooled sample variance is then given by

s²p = (12 × 0.024² + 7 × 0.031²) / (13 + 8 − 2) = 0.0007178

  • The standard error is then

sp √(1/13 + 1/8) = 0.012

SLIDE 13

Example

[Figure: the Method A and Method B measurements plotted on a common scale, roughly 79.94 to 80.04.]

Example

  • The 95% confidence interval for µA − µB is then calculated using a t statistic with 13 + 8 − 2 = 19 degrees of freedom.
  • The t19(0.025) value is 2.093; thus the confidence interval is given by

(80.02 − 79.98) ± 2.093 × 0.012

which (using the unrounded sample means) gives (0.017, 0.067)

  • This does not contain 0, so it seems to tell us that method A gives significantly larger values than B, and that the mean difference lies in the above range
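The whole calculation can be packaged as a small function. A sketch using the rounded summary figures quoted above (so the mean difference is 0.04 rather than the unrounded value Rice uses), with the critical value t19(0.025) = 2.093 hard-coded from the slide:

```python
from math import sqrt

def pooled_two_sample(xbar, ybar, sx, sy, n, m, tcrit):
    """Pooled-variance two-sample t interval and statistic for mu_X - mu_Y."""
    s2p = ((n - 1) * sx ** 2 + (m - 1) * sy ** 2) / (n + m - 2)  # pooled variance
    se = sqrt(s2p) * sqrt(1 / n + 1 / m)                         # s.e. of xbar - ybar
    diff = xbar - ybar
    ci = (diff - tcrit * se, diff + tcrit * se)                  # CI for mu_X - mu_Y
    t = diff / se                                                # statistic for H0: mu_X = mu_Y
    return s2p, se, ci, t

# Latent-heat example: n = 13, m = 8, sample sds 0.024 and 0.031.
# With the rounded means the t statistic comes out near 3.3 rather than 3.47.
s2p, se, ci, t = pooled_two_sample(80.02, 79.98, 0.024, 0.031, 13, 8, 2.093)
print(round(s2p, 6), round(se, 3))  # 0.000718 0.012
```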

SLIDE 14

Hypothesis testing

  • One important null hypothesis is often

H0 : µX = µY

  • There are three common alternative hypotheses to consider:

H1 : µX ≠ µY    H2 : µX > µY    H3 : µX < µY

  • The first is called a two-sided alternative, while H2 and H3 are called one-sided

Hypothesis testing

  • The test statistic (for unknown σ) is given by

t = (x̄ − ȳ) / (sp √(1/n + 1/m))

which, under the null, has a t distribution with m + n − 2 degrees of freedom

  • The rejection regions are then given by

for H1 : |t| > tm+n−2(α/2)
for H2 : t > tm+n−2(α)
for H3 : t < −tm+n−2(α)

Example

  • To test µA = µB against the two-sided alternative µA ≠ µB we calculate

t = (x̄ − ȳ) / (sp √(1/n + 1/m)) = 3.47

  • Since the critical value for α = 0.01 is t19(0.005) = 2.861 and 3.47 > 2.861, we reject the null at this level
  • The p-value for the two-sided hypothesis is 0.0025. Thus there is very strong evidence against the null that µX = µY

SLIDE 15

Paired samples

  • Having looked at the case of independent samples, let us now look at the case where there is dependence between the samples
  • One example is that of paired samples.
  • Typically this can be one measurement 'before' and one 'after'. Thus each person in the trial has two observations, i.e. a pair.
  • They are dependent since each part of the pair refers to the same person

Paired samples

  • The data is in the form (Xi, Yi), i = 1, · · · , n, and the difference is defined to be Di = Xi − Yi
  • Assume the X's have mean µX and the Y's have mean µY. We shall define µD = µX − µY
  • Further assume that the differences Xi − Yi come from a normal distribution N(µD, σ²D)
  • If σD is unknown we will use the test statistic

t = (D̄ − µD) / s_D̄

which has a tn−1 distribution

Example (page 412 Rice)

  • To study the effect of smoking on platelet aggregation, samples of blood from 11 patients were drawn both before and after smoking
  • The data records the maximum percentage of aggregated platelets; see page 412 Rice
  • We are interested in seeing if the smoking has an effect on the mean values before and after
  • We can see that the before and after values are dependent because they are correlated
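A minimal sketch of the paired t statistic from a list of differences Di. The numbers passed in at the end are made up purely to illustrate the call; the real platelet data is on page 412 of Rice:

```python
from math import sqrt

def paired_t(differences):
    """t statistic for H0: mu_D = 0 from paired differences D_i = X_i - Y_i."""
    n = len(differences)
    dbar = sum(differences) / n
    s2 = sum((d - dbar) ** 2 for d in differences) / (n - 1)  # sample variance of the D_i
    se = sqrt(s2 / n)                                          # standard error of D-bar
    return dbar / se  # compare with a t distribution on n - 1 degrees of freedom

# Hypothetical differences, just to show the computation:
print(round(paired_t([1, 2, 3, 4, 5]), 3))  # 4.243
```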

SLIDE 16

Example

[Figure: plot of the platelet data ('after' values against 'before' values) and a plot of the differences.]

Example (page 412 Rice)

  • The 99% confidence interval for the difference in values is (2.65, 17.89). This does not contain 0
  • The p-value for the null that µX = µY against a two sided alternative is 0.0016
  • Hence we conclude that there is very strong evidence against the null that there is no difference.

Recommended questions: please look at Rice §11.6, Questions 1, 2, 5, 6, 17(a), 19(a)