Business Statistics CONTENTS A hypothesis test Hypotheses - - PowerPoint PPT Presentation



slide-1
SLIDE 1

HYPOTHESES: LOGIC AND FRAMEWORK

Business Statistics

slide-2
SLIDE 2

▪ A hypothesis test
▪ Hypotheses
▪ Rejection region and significance level
▪ Five-step procedure for hypothesis tests
▪ More on hypotheses
▪ Old exam question
▪ Further study

CONTENTS

slide-3
SLIDE 3

▪ Suppose a beverage company wants to test whether its bottles are filled with 1 liter
  ▪ more than 1 liter: not competitive
  ▪ less than 1 liter: trouble with the consumers' association

▪ They take a random sample of 9 bottles
  ▪ and find $\bar{x} = 1.02$ liter

▪ Can they claim $\mu = 1$ liter?

▪ Assume:
  ▪ population is normally distributed
  ▪ population has standard deviation $\sigma = 0.003$ liter
  ▪ so: $X \sim N(\mu = ?, \sigma = 0.003)$

A HYPOTHESIS TEST

slide-4
SLIDE 4

If all assumptions (including $\mu = 1$!) are true:

▪ The sampling distribution of the mean ($\bar{X}$)
  ▪ is normal
  ▪ has mean $\mu_{\bar{X}} = \mu_X = 1$
  ▪ has standard deviation $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{9}} = \frac{0.003}{3} = 0.001$

▪ So, there is a probability of finding a sample mean $\bar{X} = 1.02$ or even larger, given by
  ▪ $P_N(\bar{X} \ge 1.02) = P_Z\left(\frac{\bar{X} - 1}{0.001} \ge \frac{1.02 - 1}{0.001}\right) = P_Z(Z \ge 20) = 0.000 \approx 0\%$
  ▪ very very unlikely!

▪ So, you can reject the claim $\mu_X = 1$ with high confidence!

A HYPOTHESIS TEST

Or even larger? We’ll go into that soon.

slide-5
SLIDE 5

Now, suppose you had found $\bar{x} = 1.002$ liter

▪ There is a probability of finding a sample mean $\bar{X} = 1.002$ or even larger, given by
  ▪ $P_N(\bar{X} \ge 1.002) = P_Z\left(\frac{\bar{X} - 1}{0.001} \ge \frac{1.002 - 1}{0.001}\right) = P_Z(Z \ge 2) = 0.02275 \approx 2.3\%$
  ▪ not very likely, but it may certainly happen now and then

▪ So, you can reject the claim $\mu_X = 1$ with some confidence
  ▪ but you know that there is some chance of making the wrong decision

▪ Or: you can decide not to reject the claim $\mu_X = 1$
  ▪ because you know that it may still be true, despite the data

A HYPOTHESIS TEST
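
The two tail probabilities in the bottle example can be checked numerically. The following is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the function name `upper_tail_probability` is a hypothetical helper, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_probability(x_bar, mu0, sigma, n):
    """Return (z, P(X-bar >= x_bar)) assuming a normal population
    with known standard deviation sigma and hypothesized mean mu0."""
    standard_error = sigma / sqrt(n)      # sigma of the sample mean
    z = (x_bar - mu0) / standard_error    # standardized sample mean
    p = 1.0 - NormalDist().cdf(z)         # area to the right of z
    return z, p


# Bottle example from the slides: n = 9, sigma = 0.003, claimed mean 1 liter
for x_bar in (1.02, 1.002):
    z, p = upper_tail_probability(x_bar, mu0=1.0, sigma=0.003, n=9)
    print(f"x_bar = {x_bar}: z = {z:.2f}, upper-tail probability = {p:.5f}")
```

For $\bar{x} = 1.02$ the probability is zero to within floating-point precision, while for $\bar{x} = 1.002$ it is about 0.023, matching the values on the slides.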

slide-6
SLIDE 6

In general, a hypothesis is an unproven assertion

▪ In statistics:
  ▪ a hypothesis is a claim about a (population!) parameter

▪ Examples:
  ▪ the mean monthly cell phone bill in this city is 42 dollars ($\mu = 42$)
  ▪ the proportion of adults in this city with an iPhone is at least 0.68 ($\pi \ge 0.68$)
  ▪ the variance of spending on fashion for men is not smaller than that for women ($\sigma^2_{\text{men}} \ge \sigma^2_{\text{women}}$)
  ▪ the median life expectancy is the same for all three income groups ($M_1 = M_2 = M_3$)

HYPOTHESES

slide-7
SLIDE 7

Statistical hypotheses have the following aspects:

▪ A (population!) parameter
  ▪ $\mu$, $\pi$, $\sigma^2$, etc.

▪ In case of one sample: a benchmark
  ▪ $\mu = 181$, $\pi \le 0.2$, etc.

▪ In case of several samples: a comparison
  ▪ $\mu_1 = \mu_2$, $\pi_1 - \pi_2 \le 0.2$, $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$, etc.

A hypothesis test is a decision between two competing, mutually exclusive and collectively exhaustive hypotheses about the value(s) of the parameter(s)

HYPOTHESES

slide-8
SLIDE 8

▪ Examples of a hypothesis test:
  ▪ $H_0: \mu = 181$ versus $H_1: \mu \ne 181$
  ▪ $H_0: \pi \le 0.2$ versus $H_1: \pi > 0.2$

▪ Terminology
  ▪ $H_0$ is the null hypothesis (on which the test focuses)
  ▪ $H_1$ is the alternative hypothesis

▪ We focus on $H_0$
  ▪ so if we reject $H_0$, we automatically accept $H_1$
  ▪ while if we do not reject $H_0$, we “maintain” $H_0$ (but do not reject $H_1$ and do not accept $H_0$)

HYPOTHESES

slide-9
SLIDE 9

A government official wants to proudly announce that unemployment is under 4%. Which hypothesis should he test?

EXERCISE 1

slide-10
SLIDE 10

▪ Example:
  ▪ $H_0: \mu = 181$ versus $H_1: \mu \ne 181$

▪ We collect data and perform the hypothesis test

▪ Two possible outcomes:
  ▪ reject $H_0$, so accept $H_1$, and conclude $\mu \ne 181$
  ▪ do not reject $H_0$, and conclude that there is no evidence to reject $\mu = 181$

▪ Whatever the decision is, you may be wrong
  ▪ there is sampling variation
  ▪ you may always have an exceptional sample
  ▪ example: if you want to test whether a coin is fair, it may happen that you have only “heads” in your sample, even if the coin is fair!

REJECTION REGION AND SIGNIFICANCE LEVEL

slide-11
SLIDE 11

▪ Between rejecting and not rejecting, there is a boundary

▪ This boundary defines the risk you are prepared to take
  ▪ if you want to test whether a coin is fair, and you use a sample of size 20, how many “heads” will induce you to reject the null hypothesis ($\pi = 0.5$)?

▪ You will determine a rejection region
  ▪ for instance: you will reject the null hypothesis ($\pi = 0.5$) when you obtain 5 heads or fewer, or 15 heads or more

▪ You use a pre-established significance level to determine the boundaries of the rejection region

REJECTION REGION AND SIGNIFICANCE LEVEL
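
To see how much risk the example rejection region for the coin implies, one can add up the binomial probabilities of landing in that region when the coin really is fair. This is a minimal sketch using only Python's standard library; the helper names `binomial_pmf` and `rejection_probability` are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(n, p, lower, upper):
    """Probability that the number of heads is <= lower or >= upper,
    i.e. the chance of rejecting when the true proportion is p."""
    return sum(
        binomial_pmf(k, n, p)
        for k in range(n + 1)
        if k <= lower or k >= upper
    )


# Fair coin, 20 tosses, reject for 5 heads or fewer, or 15 heads or more
risk = rejection_probability(n=20, p=0.5, lower=5, upper=15)
print(f"Probability of rejecting a fair coin: {risk:.4f}")
```

Under these assumptions the region corresponds to a risk of roughly 4%, so it is consistent with a pre-established significance level of 0.05.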

slide-12
SLIDE 12

▪ So, you define a significance level
  ▪ conventional symbol $\alpha$
  ▪ often taken to be 0.05
  ▪ but 0.1, 0.01, 0.005, 0.001, etc. are also often used

▪ There is a close link between
  ▪ the confidence level ($1 - \alpha$, as used in a confidence interval)
  ▪ and the significance level ($\alpha$, as used in a hypothesis test)
  ▪ confidence level + significance level = 1

REJECTION REGION AND SIGNIFICANCE LEVEL

slide-13
SLIDE 13

▪ Suppose we have a sample and want to see if it comes from a distribution with mean $\mu_0$
  ▪ assuming normality of the population
  ▪ assuming a known value for $\sigma$
  ▪ testing $\mu = \mu_0$ at a significance level $\alpha = 5\%$

▪ We want to determine boundary values for $\bar{X}$ such that the claim $\mu = \mu_0$ becomes unlikely
  ▪ upper boundary: $P(\bar{X} \ge \bar{x}_{\text{upper}}) = 0.025$
  ▪ lower boundary: $P(\bar{X} \le \bar{x}_{\text{lower}}) = 0.025$

▪ If the value of the test statistic is in the rejection region
  ▪ so if $\bar{x}_{\text{data}} \le \mu_0 - 1.96\sigma_{\bar{X}}$ or $\bar{x}_{\text{data}} \ge \mu_0 + 1.96\sigma_{\bar{X}}$
  ▪ we reject $H_0: \mu = \mu_0$ and accept $H_1: \mu \ne \mu_0$

REJECTION REGION AND SIGNIFICANCE LEVEL

So, we distribute the $\alpha = 5\%$ equally over both sides

slide-14
SLIDE 14

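Rejection region for non-standardized statistic $\bar{X}$ ($\alpha = 0.05$)

[Figure: sampling distribution of $\bar{X}$ centered at $\mu_0$. “Reject H0” in the left tail ($\alpha/2 = 0.025$), below $\bar{x}_{\text{crit}} = \mu_0 - 1.96\sigma_{\bar{X}}$; “Do not reject H0” in the middle ($1 - \alpha = 0.95$); “Reject H0” in the right tail ($\alpha/2 = 0.025$), above $\bar{x}_{\text{crit}} = \mu_0 + 1.96\sigma_{\bar{X}}$]

REJECTION REGION AND SIGNIFICANCE LEVEL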

slide-15
SLIDE 15

▪ The rejection region in this test is defined by the boundary values $\mu_0 - 1.96\sigma_{\bar{X}}$ and $\mu_0 + 1.96\sigma_{\bar{X}}$

▪ But we can also standardize the test statistic, and focus on $Z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}$ rather than on $\bar{X}$

▪ The rejection region in this test is defined by the boundary values $-1.96$ and $1.96$

▪ If the value of your standardized test statistic is in the rejection region
  ▪ so if $z_{\text{data}} \le -1.96$ or $z_{\text{data}} \ge 1.96$
  ▪ reject $H_0: \mu = \mu_0$ and accept $H_1: \mu \ne \mu_0$

REJECTION REGION AND SIGNIFICANCE LEVEL
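
The value 1.96 comes from the standard normal quantile at $1 - \alpha/2$. As a minimal sketch, assuming Python 3.8+ with `statistics.NormalDist` and a hypothetical helper name `two_sided_critical_values`, both the standardized and the non-standardized boundaries can be computed as follows. The numbers in the usage lines are borrowed from the bottle example purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_critical_values(mu0, sigma, n, alpha=0.05):
    """Return z_crit and the (lower, upper) boundaries for the sample mean
    in a two-sided test with known population standard deviation sigma."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    standard_error = sigma / sqrt(n)
    lower = mu0 - z_crit * standard_error
    upper = mu0 + z_crit * standard_error
    return z_crit, (lower, upper)


# Illustration: mu0 = 1 liter, sigma = 0.003 liter, n = 9 bottles
z_crit, (lower, upper) = two_sided_critical_values(mu0=1.0, sigma=0.003, n=9)
print(f"z_crit = {z_crit:.3f}")
print(f"reject H0 if x_bar <= {lower:.5f} or x_bar >= {upper:.5f}")
```

Working with $Z$ has the practical advantage that the boundaries depend only on $\alpha$, not on $\mu_0$, $\sigma$ or $n$.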

slide-16
SLIDE 16

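Rejection region for standardized statistic $Z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}$ ($\alpha = 0.05$)

[Figure: standard normal distribution. “Reject H0” in the left tail ($\alpha/2 = 0.025$), below $z_{\text{crit}} = -1.96$; “Do not reject H0” in the middle ($1 - \alpha = 0.95$); “Reject H0” in the right tail ($\alpha/2 = 0.025$), above $z_{\text{crit}} = +1.96$]

REJECTION REGION AND SIGNIFICANCE LEVEL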

slide-17
SLIDE 17

Suppose we test a hypothesis on the mean $H_0: \mu = 310$ with significance level $\alpha = 0.05$. We sample data, and calculate a test statistic $t = -2.13$. What do we conclude?

EXERCISE 2

slide-18
SLIDE 18

Five-step procedure

▪ step 1: state the hypotheses and the significance level
▪ step 2: choose a sample statistic and determine the rejection region (qualitatively)
▪ step 3: determine the null distribution, and state and/or check the requirements needed
▪ step 4: calculate the value of the test statistic and its critical value(s)
▪ step 5: draw conclusions

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS

These steps are done somewhat differently in every book and course. Never mind, all elements reappear.

slide-19
SLIDE 19

Using an example about the mean body height $\mu_X$ of a population with $\sigma_X^2 = 225$ cm$^2$

▪ On the basis of a sample of size $n = 100$ with $\bar{x} = 179.1$ cm

Step 1

▪ State the hypotheses and the significance level
  ▪ null hypothesis $H_0: \mu_X = 181$
  ▪ alternative hypothesis $H_1: \mu_X \ne 181$
  ▪ significance level $\alpha = 0.05$

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS

slide-20
SLIDE 20

Step 2

▪ Choose a sample statistic and determine the rejection region (qualitatively)
  ▪ sample statistic: sample mean $\bar{X}$
  ▪ because the hypothesis is about $\mu_X$
  ▪ rejection region: reject $H_0$ when $\bar{x}$ is “too small” or “too large”
  ▪ because both situations suggest that $H_0$ is probably wrong

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS

slide-21
SLIDE 21

Step 3

▪ Determine the null distribution, and state and/or validate the requirements needed
  ▪ A. sampling distribution of $\bar{X}$ under $H_0$: $\bar{X} \sim N\left(181, \frac{225}{100}\right)$
    ▪ or even better: $Z = \frac{\bar{X} - 181}{\sqrt{225/100}} \sim N(0, 1)$
    ▪ where the sample statistic $\bar{X}$ is transformed into a standardized test statistic $Z$
  ▪ B. requirements: because $n = 100 \ge 30$, the sampling distribution of $\bar{X}$ will indeed be approximately normal
    ▪ no additional assumptions are needed

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS

slide-22
SLIDE 22

Step 4

▪ Calculate the value of the test statistic and its critical value(s)
  ▪ value of $Z$ calculated from the data is $\frac{179.1 - 181}{\sqrt{225/100}} = -1.267$
  ▪ we write this as $z_{\text{calc}} = -1.267$
  ▪ critical values of $Z$ from the table are $z_{\text{crit,lower},0.025} = -1.96$ and $z_{\text{crit,upper},0.025} = 1.96$
  ▪ rejection region for $Z$ is $R_{\text{crit}} = (-\infty, -1.96] \cup [1.96, \infty)$

[Figure: number line with the critical values $-1.96$ and $+1.96$ and the calculated value $-1.267$ between them]

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS

slide-23
SLIDE 23

Step 5

▪ Draw conclusions
  ▪ $z_{\text{calc}} \notin R_{\text{crit}}$
  ▪ the probability of obtaining a sample mean of 179.1 (or even more extreme) while the population mean is 181 exceeds 0.05
  ▪ do not reject $H_0$
  ▪ the data sampled are compatible with a population with $\mu_X = 181$
  ▪ this does not mean that you reject $H_1$, nor that you accept $H_0$
  ▪ it is the weak conclusion

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS
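
Steps 3 to 5 of the body-height example can be traced in a few lines of code. This is a minimal sketch, assuming Python 3.8+ and `statistics.NormalDist`; the function name `z_test_two_sided` is hypothetical, and the critical value is computed rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(x_bar, mu0, sigma2, n, alpha=0.05):
    """Two-sided z-test with known population variance sigma2.
    Returns (z_calc, z_crit, reject_h0)."""
    z_calc = (x_bar - mu0) / sqrt(sigma2 / n)      # step 4: test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # step 4: critical value
    reject_h0 = abs(z_calc) >= z_crit              # step 5: in R_crit?
    return z_calc, z_crit, reject_h0


# Body-height example: x_bar = 179.1, H0: mu = 181, sigma^2 = 225, n = 100
z_calc, z_crit, reject_h0 = z_test_two_sided(179.1, 181, 225, 100)
print(f"z_calc = {z_calc:.3f}, rejection region: |z| >= {z_crit:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

With these inputs the sketch reports $z_{\text{calc}} \approx -1.267$ and does not reject $H_0$, in line with the conclusion on the slide.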

slide-24
SLIDE 24

Suppose you want to test $H_0: \sigma \le 3$. What would be step 2 of the 5-step procedure?

▪ remember: choose the sample statistic and determine the rejection region (qualitatively)

EXERCISE 3

slide-25
SLIDE 25

Alternative five-step procedure, using $p$-values

▪ step 1: identical
▪ step 2: identical
▪ step 3: identical
▪ step 4’: calculate the $p$-value of the test statistic
▪ step 5’: almost identical

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS

slide-26
SLIDE 26

Step 4’

▪ Calculate the $p$-value of the test statistic
  ▪ probability of obtaining $\bar{x} = 179.1$ or even smaller: $P(\bar{X} \le 179.1) = P\left(Z \le \frac{179.1 - 181}{\sqrt{225/100}}\right) = P(Z \le -1.267) = 0.1026$
  ▪ probability of obtaining $\bar{x} = 182.9$ or even larger: $P(\bar{X} \ge 182.9) = P\left(Z \ge \frac{182.9 - 181}{\sqrt{225/100}}\right) = P(Z \ge 1.267) = 0.1026$
  ▪ $p$-value: $0.1026 + 0.1026 = 0.2052$

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS

Why 182.9? It is equally far from $\mu_0 = 181$, so equally “extreme” ... So, the $p$-value is $P\big((\bar{X} \le 179.1) \cup (\bar{X} \ge 182.9) \mid H_0\big)$

slide-27
SLIDE 27

Step 5’

▪ Draw conclusions
  ▪ $p$-value $= 0.2052 > \alpha = 0.05$
  ▪ do not reject $H_0$
  ▪ the data sampled are compatible with a population with $\mu_X = 181$
  ▪ this does not mean that you reject $H_1$, nor that you accept $H_0$
  ▪ it is the weak conclusion

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS
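
The two-sided $p$-value of steps 4’ and 5’ can be reproduced as a sum of two equal tail areas. A minimal sketch, assuming Python 3.8+ and `statistics.NormalDist`; `two_sided_p_value` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(x_bar, mu0, sigma2, n):
    """P(sample mean at least as far from mu0 as x_bar | H0: mu = mu0),
    for a normal sampling distribution with known variance sigma2."""
    z = (x_bar - mu0) / sqrt(sigma2 / n)
    one_tail = NormalDist().cdf(-abs(z))   # area beyond |z| on one side
    return 2 * one_tail                    # both sides are equally extreme


mu0, x_bar = 181, 179.1
mirror = 2 * mu0 - x_bar                   # 182.9, equally far from mu0
p_value = two_sided_p_value(x_bar, mu0, sigma2=225, n=100)
print(f"mirror value = {mirror:.1f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= 0.05 else "do not reject H0")
```

The result agrees with the slide value of 0.2052 up to the rounding of $z$ to three decimals.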

slide-28
SLIDE 28

So:

▪ Two possible implementations of the testing procedure
  ▪ standardize on the basis of the hypothesized distribution, compute $z_{\text{calc}}$ and a rejection region $R_{\text{crit}}$, and determine if $z_{\text{calc}} \in R_{\text{crit}}$ (this was the first procedure, using a critical region)
  ▪ calculate the probability of finding $z_{\text{calc}}$ or even more extreme, and determine if it is smaller than or equal to the significance level $\alpha$ (this is the second procedure, using $p$-values)

▪ Both methods yield the same conclusion

▪ You should be able to apply both methods

FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS
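
That both procedures lead to the same decision can be checked empirically for a two-sided z-test. The sketch below assumes Python 3.8+ and `statistics.NormalDist`; `decisions_agree` is a hypothetical helper, and the grid of test-statistic values is an arbitrary choice for illustration.

```python
from statistics import NormalDist


def decisions_agree(z_calc, alpha=0.05):
    """Compare the critical-region decision with the p-value decision
    for a two-sided z-test and report whether they coincide."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_region = abs(z_calc) >= z_crit         # first procedure
    p_value = 2 * std_normal.cdf(-abs(z_calc))
    reject_by_p_value = p_value <= alpha             # second procedure
    return reject_by_region == reject_by_p_value


# Check a grid of test-statistic values from -4.0 to 4.0 in steps of 0.1
grid = [k / 10 for k in range(-40, 41)]
print(all(decisions_agree(z) for z in grid))
```

The agreement is no coincidence: the $p$-value falls below $\alpha$ exactly when the test statistic lies beyond the critical value.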

slide-29
SLIDE 29

There is always a pair of hypotheses

▪ Example 1:
  ▪ you doubt whether a coin is fair
  ▪ $H_0: \pi = 0.5$
  ▪ $H_1: \pi \ne 0.5$

▪ Example 2:
  ▪ you are designing a building and you think that the mean body height is less than 181 cm
  ▪ $H_0: \mu \ge 181$
  ▪ $H_1: \mu < 181$

MORE ON HYPOTHESES

slide-30
SLIDE 30

General rules:

▪ $H_0$ and $H_1$ are complementary ($H_1' = H_0$)
▪ $H_0$ contains an =-sign ($=$, $\ge$, or $\le$)
▪ $H_1$ is the statement we would like to establish (hmmm...)

Note: some books and some programs use a different strategy

▪ $H_0$ containing only an =-sign
▪ $H_1$ being either $\ne$, $<$, or $>$
▪ so $H_0$ and $H_1$ not necessarily complementary

MORE ON HYPOTHESES

slide-31
SLIDE 31

▪ Testing a two-sided hypothesis
  ▪ also called an undirected hypothesis
  ▪ example: $H_0: \mu = 181$
  ▪ two-tailed rejection region (reject for “too small” and for “too large” values of $\bar{X}$)

▪ Testing a one-sided hypothesis
  ▪ also called a directed hypothesis
  ▪ example: $H_0: \mu \ge 181$
  ▪ two types: left-sided and right-sided
  ▪ one-tailed rejection region (reject for “too small” or for “too large” values of $\bar{X}$, depending on the direction)

MORE ON HYPOTHESES

slide-32
SLIDE 32

Possible one-sided hypotheses:

Right-sided hypothesis
▪ $H_0: \mu \le \mu_0$ vs. $H_1: \mu > \mu_0$
▪ rejection region is at the right

Left-sided hypothesis
▪ $H_0: \mu \ge \mu_0$ vs. $H_1: \mu < \mu_0$
▪ rejection region is at the left

MORE ON HYPOTHESES

It is easiest to remember this by thinking of $H_1$: if $H_1$ contains “<”, the rejection region is on the left

We often write $\mu_0$ for the mean according to the null hypothesis

slide-33
SLIDE 33

Where to locate the rejection region (grey area)

▪ for $H_0: \mu \ge \mu_0$ (left-sided)
▪ for $H_0: \mu = \mu_0$ (two-sided)
▪ for $H_0: \mu \le \mu_0$ (right-sided)

MORE ON HYPOTHESES

Not precise enough in the book: one-sided and two-sided
slide-34
SLIDE 34

What about one-sided vs. one-tailed and two-sided vs. two-tailed?

▪ Suppose the hypothesis is $H_0: \mu = \mu_0$
  ▪ two-sided, because we will reject when $\bar{X}$ is too small and when $\bar{X}$ is too large
  ▪ we might do a test on $\frac{\bar{X} - \mu}{\text{something}}$ and then reject when the test statistic falls in either side of the distribution
  ▪ but we might alternatively do a test on $\left(\frac{\bar{X} - \mu}{\text{something}}\right)^2$ and then reject when the test statistic falls in the right side of the distribution

▪ So, a two-sided hypothesis might lead to a one-tailed rejection region
  ▪ we will see this later when we study the so-called $\chi^2$-test
  ▪ no need to bother now

MORE ON HYPOTHESES

slide-35
SLIDE 35

Example of one-sided hypothesis test

Context:

▪ you are designing a building and you think that the mean body height of the population is less than 181 cm
▪ do a hypothesis test at $\alpha = 5\%$
▪ sample of size $n = 100$ has $\bar{x} = 179.1$ cm
▪ it is known that $\sigma_X^2 = 225$ cm$^2$
▪ can you be (statistically speaking) confident that $\mu < 181$?
▪ Use five-step procedure

MORE ON HYPOTHESES

slide-36
SLIDE 36

▪ Step 1
  ▪ $H_0: \mu \ge 181$, $H_1: \mu < 181$, $\alpha = 0.05$

▪ Step 2
  ▪ sample statistic: $\bar{X}$
  ▪ reject for “too small” values

▪ Step 3
  ▪ under the least extreme version of $H_0$ (that is, $\mu = 181$): $Z = \frac{\bar{X} - 181}{\sqrt{225/100}} \sim N(0, 1)$
  ▪ no further requirements (because $n \ge 30$)

▪ Step 4
  ▪ value of test statistic $z_{\text{calc}} = \frac{179.1 - 181}{\sqrt{225/100}} = -1.267$
  ▪ $z_{\text{crit},0.05} = -1.645$, so critical region $R_{\text{crit}} = (-\infty, -1.645]$

▪ Step 5
  ▪ $z_{\text{calc}} \notin R_{\text{crit}}$, so do not reject $H_0$
  ▪ conclude that there is no evidence for $\mu < 181$

MORE ON HYPOTHESES

slide-37
SLIDE 37

The same problem, using $p$-values

▪ Steps 1-3
  ▪ identical

▪ Step 4’
  ▪ $p$-value: $P(\bar{X} \le 179.1) = P\left(\frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} \le \frac{179.1 - 181}{\sqrt{225/100}}\right) = P(Z \le -1.267) = 0.1026$

▪ Step 5’
  ▪ $p$-value $= 0.1026 > \alpha = 0.05$, so do not reject $H_0$
  ▪ conclude that there is no evidence against $\mu \ge 181$

MORE ON HYPOTHESES
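
The left-sided version of the body-height test differs from the two-sided one only in where the rejection region and the $p$-value tail are placed. A minimal sketch covering both procedures, assuming Python 3.8+ and `statistics.NormalDist`; `left_sided_z_test` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def left_sided_z_test(x_bar, mu0, sigma2, n, alpha=0.05):
    """Test H0: mu >= mu0 against H1: mu < mu0 with known variance sigma2.
    Returns (z_calc, z_crit, p_value, reject_h0)."""
    std_normal = NormalDist()
    z_calc = (x_bar - mu0) / sqrt(sigma2 / n)  # computed at the boundary mu = mu0
    z_crit = std_normal.inv_cdf(alpha)         # all of alpha in the left tail
    p_value = std_normal.cdf(z_calc)           # P(Z <= z_calc)
    reject_h0 = z_calc <= z_crit               # same decision as p_value <= alpha
    return z_calc, z_crit, p_value, reject_h0


z_calc, z_crit, p_value, reject_h0 = left_sided_z_test(179.1, 181, 225, 100)
print(f"z_calc = {z_calc:.3f}, z_crit = {z_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Note that the one-sided $p$-value (about 0.1026) is half of the two-sided one for the same data, because only the left tail counts as “extreme” here.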

slide-38
SLIDE 38

Suppose we test a hypothesis on the mean $H_0: \mu = 310$ with significance level $\alpha = 0.05$. We sample data, perform the test, and find a $p$-value of 0.37. What do we conclude?

EXERCISE 4

slide-39
SLIDE 39

27 May 2014, Q1n

OLD EXAM QUESTION

slide-40
SLIDE 40

▪ Doane & Seward 5/E 9.1-9.5
▪ Tutorial exercises week 2
  ▪ hypothesis tests
  ▪ null and alternative
  ▪ testing one sample mean

FURTHER STUDY