
Gov 2000: 6. Hypothesis Testing, Matthew Blackwell, October 11, 2016



  1. Gov 2000: 6. Hypothesis Testing, Matthew Blackwell, October 11, 2016

  2. Outline
     1. Hypothesis Testing Examples
     2. Hypothesis Test Nomenclature
     3. Conducting Hypothesis Tests
     4. p-values
     5. Power Analyses
     6. Exact Inference*
     7. Wrap up

  3. Where are we? Where are we going?
     • Last few weeks = how to produce a best estimate of some population parameter, drawing on our knowledge of probability.
     • Also learned how to derive an estimated range of plausible values of the parameter in the confidence interval.
     • Now: how to use our estimates to test a particular hypothesis about the data.
     • We’ll draw heavily on our probability knowledge from earlier in the term!

  4. 1/ Hypothesis Testing Examples

  5. The lady tasting tea
     • Remember the setup: Your advisor asks you to grab a tea with milk for him before your meeting and he says that he prefers tea poured before the milk. You stop by Darwin’s and ask for a tea with milk. When you bring it to your advisor, he complains that it was prepared milk-first.
     • You are skeptical that he can really tell the difference, so you devise a test:
       ▶ Prepare 8 cups of tea, 4 milk-first, 4 tea-first.
       ▶ Present cups to advisor in a random order.
       ▶ Ask advisor to pick which 4 of the 8 were milk-first.

  6. Assuming we know the truth
     • Advisor picks out all 4 milk-first cups correctly!
     • Statistical thought experiment: how often would the advisor get all 4 correct if guessing randomly?
       ▶ Only one way to choose all 4 correct cups.
       ▶ But 70 ways of choosing 4 cups among 8.
       ▶ Choosing at random ≈ picking each of these 70 with equal probability.
     • Chance of guessing all 4 correct is 1/70 ≈ 0.014 or 1.4%.
     • ⇝ the guessing at random hypothesis might be implausible.
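The counting argument on this slide can be checked directly in R. This is a minimal sketch using only base functions; the variable names are illustrative and not from the slides. Guessing which 4 of 8 cups are milk-first is a draw from a hypergeometric distribution, so `dhyper()` gives the chance of each possible number of correct picks.

```r
# Number of ways to choose 4 cups out of 8
n_ways <- choose(8, 4)

# Probability of picking exactly the 4 milk-first cups by pure guessing
p_all_correct <- 1 / n_ways
n_ways
round(p_all_correct, 3)

# Same probability from the hypergeometric distribution:
# 4 milk-first cups, 4 tea-first cups, 4 cups chosen
p_correct <- dhyper(0:4, m = 4, n = 4, k = 4)
names(p_correct) <- 0:4   # number of milk-first cups identified
round(p_correct, 3)
```

The last vector shows the whole guessing distribution, not just the all-correct outcome, which is what a test would compare the observed result against.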

  7. Social pressure effect

  8. Social pressure effect
       load("../data/gerber_green_larimer.RData")
       social$voted <- 1 * (social$voted == "Yes")
       neigh.mean <- mean(social$voted[social$treatment == "Neighbors"])
       contr.mean <- mean(social$voted[social$treatment == "Civic Duty"])
       neigh.mean - contr.mean
       ## [1] 0.0634
     • Treatment effect of 6.341 percentage points.
     • But we know that the estimator varies from sample to sample due to random chance.
     • Could this happen by random chance if there was no treatment effect at all?

  9. Review of the difference in means
     • Treated group $Y_1, Y_2, \ldots, Y_{n_y}$ i.i.d. with population mean $\mu_y$ and population variance $\sigma^2_y$
     • Control group $X_1, X_2, \ldots, X_{n_x}$ i.i.d. with population mean $\mu_x$ and population variance $\sigma^2_x$
     • Quantity of interest: population difference in average turnout: $\mathbb{E}[Y_i] - \mathbb{E}[X_i] = \mu_y - \mu_x$
     • Estimator: sample difference in means: $\widehat{D}_n = \overline{Y}_{n_y} - \overline{X}_{n_x}$
     • We estimated the standard error of $\widehat{D}_n$ with: $\widehat{\text{se}}[\widehat{D}_n] = \sqrt{\frac{S^2_y}{n_y} + \frac{S^2_x}{n_x}}$
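As a rough illustration of the estimator and its standard error, the sketch below wraps both formulas in a small function. The function name `diff_in_means` and the simulated turnout data are assumptions made for this example; they are not the Gerber, Green, and Larimer data used in the slides.

```r
# Hypothetical helper (not from the slides): difference in means and its
# estimated standard error for a treated vector y and a control vector x.
diff_in_means <- function(y, x) {
  d_hat <- mean(y) - mean(x)                               # Ybar - Xbar
  se_hat <- sqrt(var(y) / length(y) + var(x) / length(x))  # sqrt(S_y^2/n_y + S_x^2/n_x)
  c(estimate = d_hat, se = se_hat)
}

# Simulated binary turnout outcomes, purely for illustration
set.seed(2016)
y <- rbinom(2000, size = 1, prob = 0.38)  # "treated" group
x <- rbinom(2000, size = 1, prob = 0.31)  # "control" group
diff_in_means(y, x)
```

Note that `var()` uses the n - 1 denominator, which matches the usual sample variance $S^2$.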

  10. 2/ Hypothesis Test Nomenclature

  11. What is a hypothesis test?
     • A hypothesis test is an evaluation of a particular hypothesis about the population distribution.
     • Statistical thought experiments:
       ▶ Assume we know (part of) the true DGP.
       ▶ Use tools of probability to see what types of data we should see under this assumption.
       ▶ Compare our observed data to this thought experiment.
     • Statistical proof by contradiction:
       ▶ We will “reject” the assumed DGP if the data is too unusual under it.

  12. What is a hypothesis?
     • Definition: A hypothesis is just a statement about population parameters.
     • We might have hypotheses about causal inferences:
       ▶ Does social pressure induce higher voter turnout? (mean turnout higher in social pressure group compared to Civic Duty group?)
       ▶ Do daughters cause politicians to be more liberal on women’s issues? (voting behavior different among members of Congress with daughters?)
       ▶ Do treaties constrain countries? (behavior different among treaty signers?)
     • We might also have hypotheses about other parameters:
       ▶ Is the share of Hillary Clinton supporters more than 50%?
       ▶ Are traits of treatment and control groups different?

  13. Null and alternative hypotheses
     • Definition: The null hypothesis is a proposed, conservative value for a population parameter.
       ▶ This is usually “no effect/difference/relationship.”
       ▶ We denote this hypothesis as $H_0: \theta = \theta_0$.
       ▶ $H_0$: Social pressure doesn’t affect turnout ($H_0: \mu_y - \mu_x = 0$)
     • Definition: The alternative hypothesis for a given null hypothesis is the research claim we are interested in supporting.
       ▶ Usually, “there is a relationship/difference/effect.”
       ▶ We denote this as $H_a: \theta \neq \theta_0$.
       ▶ $H_a$: Social pressure affects turnout ($H_a: \mu_y - \mu_x \neq 0$)
     • Always mutually exclusive

  14. General framework
     • A hypothesis test chooses whether or not to reject the null hypothesis based on the data we observe.
     • Rejection based on a test statistic, $T_n = T(Y_1, \ldots, Y_n)$.
       ▶ Will help us adjudicate between the null and the alternative.
       ▶ Typically: larger values of $T_n$ ⇝ null less plausible.
       ▶ A test statistic is an r.v.
     • Definition: The null/reference distribution is the distribution of $T$ under the null.
       ▶ We’ll write its probabilities as $\mathbb{P}_0(T_n \leq t)$.

  15. Test statistic example
     • By the CLT, we know that the standardized difference in means has a standard normal distribution in large samples: $T_n = \frac{\widehat{D}_n - (\mu_y - \mu_x)}{\widehat{\text{se}}[\widehat{D}_n]} \xrightarrow{d} N(0, 1)$
     • Under the null hypothesis of $H_0: \mu_y - \mu_x = 0$, we then have $T_n = \frac{\widehat{D}_n}{\widehat{\text{se}}[\widehat{D}_n]} \xrightarrow{d} N(0, 1)$
     • If $T_n$ is very far from 0 ⇝ large sample diff-in-means ⇝ no population diff-in-means is not plausible.
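One way to see the CLT claim is to simulate the test statistic many times in a world where the null is true and compare it with a standard normal. The sketch below assumes binary outcomes with a common turnout rate of 0.3 in both groups and 500 units per group; these values are made up for illustration.

```r
set.seed(6)
n_y <- 500
n_x <- 500

# Draw both groups from the same distribution, so H0: mu_y - mu_x = 0 holds,
# and compute T_n = D_hat / se_hat in each simulated sample.
sim_t <- replicate(5000, {
  y <- rbinom(n_y, size = 1, prob = 0.3)
  x <- rbinom(n_x, size = 1, prob = 0.3)
  (mean(y) - mean(x)) / sqrt(var(y) / n_y + var(x) / n_x)
})

# Under the CLT these should be near 0 and 1
mean(sim_t)
sd(sim_t)

# Simulated 2.5% and 97.5% quantiles next to the standard normal ones
quantile(sim_t, c(0.025, 0.975))
qnorm(c(0.025, 0.975))
```

If the simulated quantiles sit close to the normal ones, the standard normal is a reasonable reference distribution for $T_n$ at this sample size.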

  16. Rejection regions
     • Definition: The rejection region, $R$, contains the values of $T_n$ for which we reject the null.
       ▶ These are the areas that indicate that there is evidence against the null.
     • Two-sided alternative (our focus):
       ▶ $H_0: \mu_y - \mu_x = 0$ and $H_a: \mu_y - \mu_x \neq 0$
       ▶ Implies that $T_n \gg 0$ or $T_n \ll 0$ will be evidence against the null.
       ▶ Rejection regions: $|T_n| > c$ for some value $c$
     • How to determine these regions?

  17. Type I and Type II errors
     Type I errors: A Type I error is when we reject the null hypothesis when it is in fact true.
     • We say that the Lady is discerning when she is just guessing.
     • A false discovery (very bad, thus type I).
     Type II errors: A Type II error is when we fail to reject the null hypothesis when it is false.
     • We say that the Lady is just guessing when she is truly discerning.
     • An undetected finding (not as bad, thus type II).

  18. Test level/size
                       $H_0$ True      $H_0$ False
       Retain $H_0$    Awesome!        Type II error
       Reject $H_0$    Type I error    Good stuff!
     • Definition: The level/size of the test, or $\alpha$, is the probability of a Type I error.
       ▶ With two-sided alternative, we reject when $|T_n| > c$
       ▶ Size of test then is: $\mathbb{P}_0(|T_n| > c) = \alpha$
     • Choose a level $\alpha$ based on aversion to false discovery:
       ▶ Convention in social sciences is $\alpha = 0.05$, but nothing magical there.
       ▶ Particle physicists at CERN use $\alpha \approx \frac{1}{1{,}750{,}000}$
       ▶ Lower values of $\alpha$ guard against “flukes” but increase barriers to discovery
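When the null distribution is approximately standard normal, the size of a two-sided test with cutoff $c$ is just the area in both tails, $2\,\Phi(-c)$. The sketch below computes that area with `pnorm()`; the function name `size_two_sided` is an assumption for this example. It also shows that a cutoff of 5 standard errors gives a size close to the CERN figure quoted above.

```r
# Hypothetical helper: size of the test that rejects when |T_n| > crit,
# assuming T_n is standard normal under the null.
size_two_sided <- function(crit) {
  2 * pnorm(-abs(crit))
}

# Cutoff of 1.96 gives a size very close to the conventional 0.05
size_two_sided(1.96)

# Cutoff of 5: the reciprocal of the size is roughly 1.74 million
1 / size_two_sided(5)
```

Raising the cutoff shrinks the Type I error rate quickly, which is the trade-off in the last bullet: fewer flukes, but a much larger estimate is needed before rejecting.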

  19. 3/ Conducting Hypothesis Tests

  20. Hypothesis testing procedure
     1. Choose null and alternative hypotheses
     2. Choose a test statistic, $T_n$
     3. Choose a level, $\alpha$
     4. Determine rejection region
     5. Reject if $T_n$ in rejection region, fail to reject otherwise
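The five steps can be sketched as one small R function for the two-sided difference-in-means case. This is an illustration under stated assumptions, not the course's own code: the function name `two_sample_test`, its return format, and the simulated data are all hypothetical, and the cutoff relies on the large-sample normal approximation.

```r
# Hypothetical helper walking through the five steps for
# H0: mu_y - mu_x = 0 versus Ha: mu_y - mu_x != 0 (step 1).
two_sample_test <- function(y, x, alpha = 0.05) {
  # Step 2: test statistic T_n = D_hat / se_hat
  d_hat <- mean(y) - mean(x)
  se_hat <- sqrt(var(y) / length(y) + var(x) / length(x))
  t_n <- d_hat / se_hat

  # Steps 3 and 4: the level alpha determines the rejection region |T_n| > crit
  crit <- qnorm(1 - alpha / 2)

  # Step 5: reject if T_n falls in the rejection region
  list(statistic = t_n, critical = crit, reject = abs(t_n) > crit)
}

# Simulated binary outcomes, purely for illustration
set.seed(11)
y <- rbinom(3000, size = 1, prob = 0.38)
x <- rbinom(3000, size = 1, prob = 0.30)
two_sample_test(y, x)
```

Keeping `alpha` as an argument makes it easy to see how the decision changes with the chosen level while the test statistic stays the same.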

  21. Rejection region
     [Figure: density $\mathbb{P}_0(T)$ of $T$ under the null hypothesis, with “Reject” regions below $-c$ and above $c$ and a “Retain” region between them]
     • What’s the rejection region $|T_n| > c$ if $\alpha = 0.05$?
     • Under the null hypothesis of no effect, we want $T_n$ to be in the rejection region only 5% of the time.
       ▶ ⇝ false rejection of the null only 5% of the time.
       ▶ Can find $c$ based on the null distribution being ≈ standard normal!

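  22. Determining the rejection region
     [Figure: density $\mathbb{P}_0(T)$ of $T$ under the null hypothesis, with “Reject” regions below $-c = -z_{\alpha/2}$ and above $c = z_{\alpha/2}$, each tail with area $\alpha/2$, and a “Retain” region between them]
     • Find $z_{\alpha/2}$ such that $\mathbb{P}_0(T_n < -z_{\alpha/2}) = \mathbb{P}_0(T_n > z_{\alpha/2}) = \alpha/2$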

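  23. Determining the rejection region
     [Figure: density $\mathbb{P}_0(T)$ of $T$ under the null hypothesis, with area $1 - \alpha/2$ to the left of $c = z_{\alpha/2}$ and area $\alpha/2$ to its right; “Reject” regions below $-c = -z_{\alpha/2}$ and above $c$, “Retain” region between them]
     • Find $z_{\alpha/2}$ such that $\mathbb{P}_0(T_n < -z_{\alpha/2}) = \mathbb{P}_0(T_n > z_{\alpha/2}) = \alpha/2$
     • ⇝ find quantile $\mathbb{P}_0(T_n < z_{\alpha/2}) = 1 - \alpha/2$
       ▶ if $\alpha = 0.05$ ⇝ $z_{\alpha/2}$ = qnorm(1-0.05/2) = 1.96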
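The same `qnorm()` call works for any level. As a brief sketch, the code below computes the cutoff for a few common choices of $\alpha$ (the particular levels are picked for illustration) and then confirms that the two tails beyond $\pm z_{\alpha/2}$ add back up to $\alpha$.

```r
alphas <- c(0.10, 0.05, 0.01)

# Critical values z_{alpha/2}: the 1 - alpha/2 quantile of the standard normal
crit <- qnorm(1 - alphas / 2)
round(crit, 2)   # 1.64 1.96 2.58

# Check: lower-tail area plus upper-tail area recovers each alpha
pnorm(-crit) + (1 - pnorm(crit))
```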

  24. Final hypothesis test
     1. Hypotheses: $H_0: \mu_y - \mu_x = 0$ vs. $H_a: \mu_y - \mu_x \neq 0$
     2. Test statistic: $T_n = \widehat{D}_n / \widehat{\text{se}}[\widehat{D}_n]$
     3. Use $\alpha = 0.05$
     4. Rejection region is $|T_n| > 1.96$.
