
Hypotheses: Logic and Framework (Business Statistics) - PowerPoint PPT Presentation



  1. HYPOTHESES: LOGIC AND FRAMEWORK Business Statistics

  2. CONTENTS
     ▪ A hypothesis test
     ▪ Hypotheses
     ▪ Rejection region and significance level
     ▪ Five-step procedure for hypothesis tests
     ▪ More on hypotheses
     ▪ Old exam question
     ▪ Further study

  3. A HYPOTHESIS TEST
     ▪ Suppose a beverage company wants to test if its bottles are filled with 1 liter
       ▪ more than 1 liter: not competitive
       ▪ less than 1 liter: trouble with the consumers association
     ▪ They take a random sample of 9 bottles
       ▪ and find $\bar{x} = 1.02$ liter
     ▪ Can they claim $\mu = 1$ liter?
     ▪ Assume:
       ▪ population is normally distributed
       ▪ population has standard deviation $\sigma = 0.003$ liter
       ▪ so: $X \sim N(\mu = ?, \sigma = 0.003)$

  4. A HYPOTHESIS TEST
     If all assumptions (including $\mu = 1$!) are true:
     ▪ The sampling distribution of the mean ($\bar{X}$)
       ▪ is normal
       ▪ has mean $\mu_{\bar{X}} = \mu_X = 1$
       ▪ has standard deviation $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{9}} = \frac{0.003}{3} = 0.001$
     ▪ So, there is a probability of finding a sample mean $\bar{X} = 1.02$ or even larger, given by
       ▪ $P_N(\bar{X} \ge 1.02) = P_Z\left(\frac{\bar{X} - 1}{0.001} \ge \frac{1.02 - 1}{0.001}\right) = P_Z(Z \ge 20) = 0.000 \approx 0\%$
       ▪ very, very unlikely!
       ▪ (Or even larger? We’ll go into that soon.)
     ▪ So, you can reject the claim $\mu_X = 1$ with high confidence!

  5. A HYPOTHESIS TEST
     Now, suppose you had found $\bar{x} = 1.002$ liter
     ▪ There is a probability of finding a sample mean $\bar{X} = 1.002$ or even larger, given by
       ▪ $P_N(\bar{X} \ge 1.002) = P_Z\left(\frac{\bar{X} - 1}{0.001} \ge \frac{1.002 - 1}{0.001}\right) = P_Z(Z \ge 2) = 0.02275 \approx 2.3\%$
       ▪ not very likely, but it may certainly happen now and then
     ▪ So, you can reject the claim $\mu_X = 1$ with some confidence
       ▪ but you know that there is some chance of making the wrong decision
     ▪ Or: you can decide not to reject the claim $\mu_X = 1$
       ▪ because you know that it may still be true, despite the data
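
The two tail probabilities from the bottle example can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from Python's standard library; the helper name `upper_tail_probability` and its parameter names are illustrative assumptions, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_probability(sample_mean, mu0, sigma, n):
    """P(X-bar >= sample_mean) if the population is normal with mean mu0 and sd sigma.

    Hypothetical helper written for this illustration.
    """
    standard_error = sigma / sqrt(n)          # standard deviation of the sample mean
    z = (sample_mean - mu0) / standard_error  # standardized sample mean
    return 1.0 - NormalDist().cdf(z)          # upper-tail area under N(0, 1)


# Values from the slides: mu0 = 1 liter, sigma = 0.003 liter, n = 9 bottles
p_large = upper_tail_probability(1.02, 1.0, 0.003, 9)   # z = 20
p_small = upper_tail_probability(1.002, 1.0, 0.003, 9)  # z = 2
print(f"{p_large:.5f}  {p_small:.5f}")
```

The first probability is numerically indistinguishable from zero, while the second is small but clearly positive, which is why the second case leaves room for either decision.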

  6. HYPOTHESES
     In general, a hypothesis is an unproven assertion
     ▪ In statistics:
       ▪ a hypothesis is a claim about a (population!) parameter
     ▪ Examples:
       ▪ the mean monthly cell phone bill in this city is $42 ($\mu = 42$ dollars)
       ▪ the proportion of adults in this city with an iPhone is at least 0.68 ($\pi \ge 0.68$)
       ▪ the variance of spending on fashion for men is not smaller than that for women ($\sigma^2_{\text{men}} \ge \sigma^2_{\text{women}}$)
       ▪ the median life expectancy is the same for all three income groups ($M_1 = M_2 = M_3$)

  7. HYPOTHESES
     Statistical hypotheses have the following aspects:
     ▪ A (population!) parameter
       ▪ $\mu$, $\pi$, $\sigma^2$, etc.
     ▪ In case of one sample: a benchmark
       ▪ $\mu = 181$, $\pi \le 0.2$, etc.
     ▪ In case of several samples: a comparison
       ▪ $\mu_1 = \mu_2$, $\pi_1 - \pi_2 \le 0.2$, $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$, etc.
     A hypothesis test is a decision between two competing, mutually exclusive and collectively exhaustive hypotheses about the value(s) of the parameter(s)

  8. HYPOTHESES
     ▪ Examples of a hypothesis test:
       ▪ $H_0: \mu = 181$ versus $H_1: \mu \ne 181$
       ▪ $H_0: \pi \le 0.2$ versus $H_1: \pi > 0.2$
     ▪ Terminology
       ▪ $H_0$ is the null hypothesis (on which the test focuses)
       ▪ $H_1$ is the alternative hypothesis
     ▪ We focus on $H_0$
       ▪ so if we reject $H_0$, we automatically accept $H_1$
       ▪ while if we do not reject $H_0$, we “maintain” $H_0$ (but do not reject $H_1$ and do not accept $H_0$)

  9. EXERCISE 1 A government official wants to proudly announce that unemployment is under 4%. Which hypothesis should he test?

  10. REJECTION REGION AND SIGNIFICANCE LEVEL
      ▪ Example:
        ▪ $H_0: \mu = 181$ versus $H_1: \mu \ne 181$
      ▪ We collect data and perform the hypothesis test
      ▪ Two possible outcomes:
        ▪ reject $H_0$, so accept $H_1$, and conclude $\mu \ne 181$
        ▪ do not reject $H_0$, and conclude that there is no evidence to reject $\mu = 181$
      ▪ Whatever the decision is, you may be wrong
        ▪ there is sampling variation
        ▪ you may always have an exceptional sample
        ▪ example: if you want to test if a coin is fair, it may happen that you have only “heads” in your sample, even if the coin is fair!

  11. REJECTION REGION AND SIGNIFICANCE LEVEL
      ▪ Between rejecting and not rejecting, there is a boundary
      ▪ This boundary defines the risk you are prepared to take
        ▪ if you want to test if a coin is fair, and you use a sample of size 20, how many “heads” will induce you to reject the null hypothesis ($\pi = 0.5$)?
      ▪ You will determine a rejection region
        ▪ for instance: you will reject the null hypothesis ($\pi = 0.5$) when you obtain 5 heads or fewer, or 15 heads or more
      ▪ You use a pre-established significance level to determine the boundaries of the rejection region
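
To see how a rejection region translates into risk, one can compute the probability that a fair coin lands in the region from the example (5 heads or fewer, or 15 heads or more, in 20 tosses). This is a small sketch under the assumption that the number of heads follows a binomial distribution; the helper name `rejection_probability` is hypothetical.

```python
from math import comb


def rejection_probability(n, lower, upper, p=0.5):
    """P(X <= lower or X >= upper) for X ~ Binomial(n, p).

    Hypothetical helper written for this illustration.
    """
    def pmf(k):
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    lower_tail = sum(pmf(k) for k in range(0, lower + 1))
    upper_tail = sum(pmf(k) for k in range(upper, n + 1))
    return lower_tail + upper_tail


# Rule from the slide: 20 tosses, reject when heads <= 5 or heads >= 15
alpha_actual = rejection_probability(20, 5, 15)
print(round(alpha_actual, 4))
```

Under these assumptions the rule rejects a fair coin in roughly 4% of samples, just below the conventional 0.05 level.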

  12. REJECTION REGION AND SIGNIFICANCE LEVEL
      ▪ So, you define a significance level
        ▪ conventional symbol $\alpha$
        ▪ often taken to be 0.05
        ▪ but 0.1, 0.01, 0.005, 0.001, etc. are also used often
      ▪ There is a close link between
        ▪ the confidence level ($1 - \alpha$, as used in a confidence interval)
        ▪ and a significance level ($\alpha$, as used in a hypothesis test)
        ▪ confidence level + significance level = 1
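
As a rough illustration of the link between the two levels, the sketch below lists, for the significance levels mentioned above, the matching confidence level and the two-sided critical value of a standard normal statistic. It assumes a two-sided test on a normally distributed statistic; the helper name `two_sided_critical_value` is hypothetical.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha):
    """z such that each tail of N(0, 1) beyond +/- z has area alpha / 2.

    Hypothetical helper written for this illustration.
    """
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for alpha in (0.1, 0.05, 0.01, 0.005, 0.001):
    confidence = 1.0 - alpha  # confidence level + significance level = 1
    z_crit = two_sided_critical_value(alpha)
    print(f"alpha={alpha:<6} confidence={confidence:.3f} z_crit={z_crit:.3f}")
```

A smaller significance level gives a larger critical value, so stronger evidence is needed before the null hypothesis is rejected.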

  13. REJECTION REGION AND SIGNIFICANCE LEVEL
      ▪ Suppose we have a sample and want to see if it comes from a distribution with mean $\mu_0$
        ▪ assuming normality of the population
        ▪ assuming a known value for $\sigma$
        ▪ testing $\mu = \mu_0$ at a significance level $\alpha = 5\%$
      ▪ We want to determine boundary values for $\bar{X}$ such that the claim $\mu = \mu_0$ becomes unlikely
        ▪ upper boundary: $P(\bar{X} \ge \bar{x}_{\text{upper}}) = 0.025$
        ▪ lower boundary: $P(\bar{X} \le \bar{x}_{\text{lower}}) = 0.025$
        ▪ so, we distribute the $\alpha = 5\%$ equally over both sides
      ▪ If the value of the test statistic is in the rejection region
        ▪ so if $\bar{x}_{\text{data}} \le \mu_0 - 1.96\sigma_{\bar{X}}$ or $\bar{x}_{\text{data}} \ge \mu_0 + 1.96\sigma_{\bar{X}}$
        ▪ we reject $H_0: \mu = \mu_0$ and accept $H_1: \mu \ne \mu_0$

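  14. REJECTION REGION AND SIGNIFICANCE LEVEL
      Rejection region for non-standardized statistic $\bar{X}$ ($\alpha = 0.05$)
      [Figure: normal curve centered at $\mu_0$. Central area $1 - \alpha = 0.95$: do not reject $H_0$. Two tails of area $\alpha/2 = 0.025$ each: reject $H_0$. Critical values $\bar{x}_{\text{crit}} = \mu_0 - 1.96\sigma_{\bar{X}}$ and $\bar{x}_{\text{crit}} = \mu_0 + 1.96\sigma_{\bar{X}}$.]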

  15. REJECTION REGION AND SIGNIFICANCE LEVEL
      ▪ The rejection region in this test is defined by the boundary values $\mu_0 - 1.96\sigma_{\bar{X}}$ and $\mu_0 + 1.96\sigma_{\bar{X}}$
      ▪ But we can also standardize the test statistic, and focus on $Z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}$ rather than on $\bar{X}$
      ▪ The rejection region in this test is defined by the boundary values $-1.96$ and $1.96$
      ▪ If the value of your standardized test statistic is in the rejection region
        ▪ so if $z_{\text{data}} \le -1.96$ or $z_{\text{data}} \ge 1.96$
        ▪ reject $H_0: \mu = \mu_0$ and accept $H_1: \mu \ne \mu_0$
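
The two ways of stating the rejection region lead to the same decision. The sketch below applies both rules to the same data, assuming a normal population with known standard deviation; the helper name `two_sided_z_test` is hypothetical, and the numbers in the usage lines are borrowed from the bottle example (the sample mean 1.0005 is made up for illustration).

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (reject_raw, reject_standardized) for H0: mu = mu0.

    Hypothetical helper written for this illustration.
    """
    se = sigma / sqrt(n)                              # sigma of the sample mean
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05

    # Decision on the original scale of the sample mean
    lower, upper = mu0 - z_crit * se, mu0 + z_crit * se
    reject_raw = sample_mean <= lower or sample_mean >= upper

    # Decision on the standardized scale
    z_data = (sample_mean - mu0) / se
    reject_standardized = z_data <= -z_crit or z_data >= z_crit
    return reject_raw, reject_standardized


print(two_sided_z_test(1.02, 1.0, 0.003, 9))    # far in the upper tail
print(two_sided_z_test(1.0005, 1.0, 0.003, 9))  # close to mu0
```

Standardizing only rescales the statistic, so the critical values no longer depend on $\mu_0$ or $\sigma_{\bar{X}}$ and can be read from a single table.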

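  16. REJECTION REGION AND SIGNIFICANCE LEVEL
      Rejection region for standardized statistic $Z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}$ ($\alpha = 0.05$)
      [Figure: standard normal curve centered at 0. Central area $1 - \alpha = 0.95$: do not reject $H_0$. Two tails of area $\alpha/2 = 0.025$ each: reject $H_0$. Critical values $z_{\text{crit}} = -1.96$ and $z_{\text{crit}} = +1.96$.]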

  17. EXERCISE 2
      Suppose we test a hypothesis on the mean $H_0: \mu = 310$ with significance level $\alpha = 0.05$. We sample data, and calculate a test statistic $t = -2.13$. What do we conclude?

  18. FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS
      Five-step procedure
      ▪ step 1: state the hypotheses and the significance level
      ▪ step 2: choose a sample statistic and determine the rejection region (qualitatively)
      ▪ step 3: determine the null distribution, and state and/or check the requirements needed
      ▪ step 4: calculate the value of the test statistic and its critical value(s)
      ▪ step 5: draw conclusions
      These steps are done somewhat differently in every book and course. Never mind, all elements reappear.

  19. FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS
      Using an example about the mean body height $\mu_X$ of a population with $\sigma_X^2 = 225$ cm²
      ▪ On the basis of a sample of size $n = 100$ with $\bar{x} = 179.1$ cm
      Step 1
      ▪ State the hypotheses and the significance level
        ▪ null hypothesis $H_0: \mu_X = 181$
        ▪ alternative hypothesis $H_1: \mu_X \ne 181$
        ▪ significance level $\alpha = 0.05$

  20. FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS
      Step 2
      ▪ Choose a sample statistic and determine the rejection region (qualitatively)
        ▪ sample statistic: sample mean $\bar{X}$
          ▪ because the hypothesis is about $\mu_X$
        ▪ rejection region: reject $H_0$ when $\bar{x}$ is “too small” or “too large”
          ▪ because both situations suggest that $H_0$ is probably wrong

  21. FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS
      Step 3
      ▪ Determine the null distribution, and state and/or validate the requirements needed
        A. sampling distribution of $\bar{X}$ under $H_0$: $\bar{X} \sim N\left(181, \frac{225}{100}\right)$
           ▪ or even better: $Z = \frac{\bar{X} - 181}{\sqrt{225/100}} \sim N(0, 1)$
           ▪ where the sample statistic $\bar{X}$ is transformed into a standardized test statistic $Z$
        B. requirements: because $n = 100 \ge 30$, the sampling distribution of $\bar{X}$ will indeed be approximately normal
           ▪ no additional assumptions are needed

  22. FIVE-STEP PROCEDURE FOR HYPOTHESIS TESTS
      Step 4
      ▪ Calculate the value of the test statistic and its critical value(s)
        ▪ value of $Z$ calculated from the data is $\frac{179.1 - 181}{\sqrt{225/100}} = -1.267$
        ▪ we write this as $z_{\text{calc}} = -1.267$
        ▪ critical values of $Z$ from the table are $z_{\text{crit,lower},0.025} = -1.96$ and $z_{\text{crit,upper},0.025} = 1.96$
        ▪ rejection region for $Z$ is $R_{\text{crit}} = (-\infty, -1.96] \cup [1.96, \infty)$
      [Figure: number line marking $-1.96$, $0$ and $+1.96$, with $z_{\text{calc}} = -1.267$ between $-1.96$ and $0$.]
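
Steps 1 to 4 of the body height example can be reproduced in a few lines. This is only a sketch using the numbers given on the slides; the variable names are illustrative assumptions, and the critical value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 181 versus H1: mu != 181, significance level alpha = 0.05
mu0, alpha = 181.0, 0.05

# Data from the slides: sigma^2 = 225 cm^2, n = 100, sample mean 179.1 cm
variance, n, sample_mean = 225.0, 100, 179.1

# Step 3: under H0 the sample mean is N(181, 225/100), so Z is standard normal
standard_error = sqrt(variance / n)  # sqrt(2.25) = 1.5

# Step 4: calculated value and critical values of Z
z_calc = (sample_mean - mu0) / standard_error
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
in_rejection_region = z_calc <= -z_crit or z_calc >= z_crit

print(f"z_calc = {z_calc:.3f}, z_crit = +/-{z_crit:.3f}, "
      f"in rejection region: {in_rejection_region}")
```

The boolean simply checks whether `z_calc` falls in $(-\infty, -1.96] \cup [1.96, \infty)$, which is the comparison the final step of the procedure builds on.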
