HYPOTHESIS TESTING PART I


  1. INTRODUCTION TO DATA ANALYSIS: HYPOTHESIS TESTING PART I

  2. RECAP & OUTLOOK
     BAYESIAN PARAMETER ESTIMATION
     ▸ model $M$ captures prior beliefs about data-generating process
     ▸ prior over latent parameters
     ▸ likelihood of data
     ▸ Bayesian posterior inference using observed data $D_{\text{obs}}$
     ▸ compare posterior beliefs to some parameter value of interest
     FREQUENTIST HYPOTHESIS TESTING
     ▸ model $M$ captures a hypothetically assumed data-generating process
     ▸ fix parameter value of interest
     ▸ likelihood of data
     ▸ single out some aspect of the data as most important (test statistic)
     ▸ look at distribution of test statistic given the assumed model (sampling distribution)
     ▸ check likelihood of test statistic applied to the observed data $D_{\text{obs}}$

  3. CAVEAT! FREQUENTIST HYPOTHESIS TESTING
     ▸ there are at least three flavors of frequentist hypothesis testing
       ▸ Fisher
       ▸ Neyman-Pearson
       ▸ modern hybrid NHST [null-hypothesis significance testing]
     ▸ not every textbook is clear on these differences and/or which flavor it endorses
     ▸ there is also no unanimity of practice between or within research fields

  4. LEARNING GOALS
     ▸ understand basic idea of frequentist hypothesis testing
     ▸ understand what a p-value is
       ▸ definition, one- vs two-sided
       ▸ test statistic & sampling distribution
       ▸ relation to confidence intervals
       ▸ significance levels & $\alpha$-error

  5. p-value

  6. PRELIMINARIES
     ▸ research hypothesis: theoretically implied answer to a main question of interest for research
       ▸ e.g., truth-judgements of sentences with presupposition failure at chance level? (King of France)
       ▸ e.g., faster reactions in reaction time trials than in go/No-go trials? (Mental Chronometry)
     ▸ null hypothesis: specific assumption made for purposes of analysis
       ▸ fix parameter value in a data-generating model for technical reasons
       ▸ analogy: useful assumption in mathematical proof (e.g., in reductio ad absurdum)
     ▸ alternative hypothesis: the antagonist of the null hypothesis, specified to relate the null hypothesis to the research hypothesis

  7. P-VALUE

  8. Binomial Model

  9. BAYESIAN BINOMIAL MODEL (AS ORIGINALLY INTRODUCED)
     $\theta \sim \text{Beta}(\dots)$
     $k \sim \text{Binomial}(\theta, N)$
     [model graph with nodes $N$, $\theta$, $k$]

  10. BAYESIAN BINOMIAL MODEL (EXTENDED)
      $\theta \sim \text{Beta}(\dots)$
      $x_i \sim \text{Bernoulli}(\theta)$
      $k = \sum_{i=1}^{N} x_i$
      [model graph with nodes $\theta$, $x_i$, $N$, $k$]

  11. FREQUENTIST BINOMIAL MODEL [dotted line = “working assumption”]
      $x_i \sim \text{Bernoulli}(\theta_0)$ [likelihood of “raw” data]
      $k = \sum_{i=1}^{N} x_i$ [test statistic (derived from “raw” data)]
      FACT: The sampling distribution of $k$ is: $k \sim \text{Binomial}(\theta_0, N)$
      [model graph with nodes $\theta_0$, $x_i$, $N$, $k$]

  12. FREQUENTIST BINOMIAL MODEL
      ▸ null-hypothesis: $\theta = \theta_0$
      ▸ test statistic: $k$ derived from “raw” data $\vec{x}$
        ▸ the most important (numerical) aspect of the data for the current testing purposes
      ▸ sampling distribution: likelihood of observing a particular value of $k$ in this model
      ▸ notice: the observed data $D_{\text{obs}}$ has not yet made any appearance
        ▸ remark: sometimes summary statistics of $D_{\text{obs}}$ other than the test statistic might be used in the model
      [model graph with nodes $\theta_0$, $x_i$, $N$, $k$]

  13. FREQUENTIST BINOMIAL MODEL
      ▸ likelihood of data: random variable $\mathcal{D}^{|H_0}$
        $P(\mathcal{D}^{|H_0} = \langle x_1, \dots, x_N \rangle) = \prod_{i=1}^{N} \text{Bernoulli}(x_i, \theta_0)$
      ▸ sampling distribution: random variable $T^{|H_0}$
        $P(T^{|H_0} = k) = \text{Binomial}(k, \theta_0, N)$
      [model graph with nodes $\theta_0$, $x_i$, $N$, $k$]
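
The link between the likelihood of the raw data and the sampling distribution of the test statistic can be checked by brute force for a small number of trials. The sketch below is only an illustration: the function names and the values `theta0 = 0.3` and `n = 8` are assumptions chosen for this example, not part of the slides. It sums the Bernoulli likelihoods of all 0/1 sequences that share the same number of successes and compares the result with the Binomial formula.

```python
from itertools import product
from math import comb, isclose


def bernoulli(x, theta):
    """Probability of a single 0/1 outcome x under success probability theta."""
    return theta if x == 1 else 1.0 - theta


def raw_data_likelihood(xs, theta0):
    """Likelihood of a raw data sequence: product of Bernoulli terms."""
    lik = 1.0
    for x in xs:
        lik *= bernoulli(x, theta0)
    return lik


def binomial_pmf(k, theta0, n):
    """Binomial(k, theta0, n): probability of k successes in n trials."""
    return comb(n, k) * theta0**k * (1.0 - theta0) ** (n - k)


def sampling_distribution_by_enumeration(theta0, n):
    """Add up raw-data likelihoods of all 2**n sequences, grouped by k = sum(xs)."""
    dist = [0.0] * (n + 1)
    for xs in product((0, 1), repeat=n):
        dist[sum(xs)] += raw_data_likelihood(xs, theta0)
    return dist


# Illustrative values only (a small n keeps the enumeration cheap).
theta0, n = 0.3, 8
enumerated = sampling_distribution_by_enumeration(theta0, n)
for k, prob in enumerate(enumerated):
    # The enumerated distribution agrees with the Binomial formula.
    assert isclose(prob, binomial_pmf(k, theta0, n), rel_tol=1e-9)
```

Enumeration grows as 2^N, so this is a check of the idea, not a practical way to obtain the sampling distribution.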

  14. Binomial p-values

  15. BINOMIAL TEST
      ▸ 24/7 example: $N = 24$ and $k = 7$
      ▸ $t(D_{\text{obs}}) = 7$
      ▸ $P(T^{|H_0} = k) = \text{Binomial}(k, \theta_0, N)$
      ▸ p-value definition: $p(D_{\text{obs}}) = P(T^{|H_0} \succeq^{H_{0,a}} t(D_{\text{obs}}))$
        [annotations: “we know this” for $T^{|H_0}$ and for $t(D_{\text{obs}})$; “???” for $\succeq^{H_{0,a}}$]
      What counts as “more extreme evidence against the null hypothesis” is a context-sensitive notion that depends on both the null hypothesis and the alternative hypothesis, because only when put together do they address the research question in the background.

  16. BINOMIAL TEST
      ▸ compare two research questions
        1. Is the coin fair?
           ▸ $H_0\colon \theta = 0.5$
           ▸ $H_a\colon \theta \neq 0.5$
        2. Is the coin biased towards heads?
           ▸ $H_0\colon \theta = 0.5$
           ▸ $H_a\colon \theta < 0.5$
      ▸ we still use a point-valued null-hypothesis for technical reasons
      ▸ the alternative hypothesis is important to fix the meaning of $\succeq^{H_{0,a}}$

  17. BINOMIAL TEST
      ▸ Case 1: Is the coin fair?
        ▸ $H_0\colon \theta = 0.5$
        ▸ $H_a\colon \theta \neq 0.5$
      ▸ which values of $k$ are more extreme evidence against $H_0$?

  18. BINOMIAL TEST
      ▸ Case 1: Is the coin fair?
        ▸ $H_0\colon \theta = 0.5$
        ▸ $H_a\colon \theta \neq 0.5$
      ▸ which values of $k$ are more extreme evidence against $H_0$?
        ▸ anything that’s even less likely to occur
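
For Case 1, "anything that's even less likely to occur" can be turned into a computation: add up the probabilities, under the null hypothesis, of every value of k that is no more likely than the observed one. The following is a minimal sketch for the 24/7 example; the function names and the small tolerance used to guard against floating-point ties are assumptions made for this illustration.

```python
from math import comb


def binomial_pmf(k, theta0, n):
    """Binomial(k, theta0, n): probability of k successes in n trials."""
    return comb(n, k) * theta0**k * (1.0 - theta0) ** (n - k)


def two_sided_p_value(k_obs, n, theta0=0.5):
    """Sum the probabilities of all outcomes no more likely than k_obs under H_0."""
    p_obs = binomial_pmf(k_obs, theta0, n)
    tolerance = 1.0 + 1e-7  # assumption: treat near-ties as ties
    return sum(
        binomial_pmf(k, theta0, n)
        for k in range(n + 1)
        if binomial_pmf(k, theta0, n) <= p_obs * tolerance
    )


p_two_sided = two_sided_p_value(k_obs=7, n=24)
print(f"two-sided p-value for k = 7, N = 24: {p_two_sided:.4f}")  # 0.0639
```

With a null value of 0.5 the sampling distribution is symmetric, so the outcomes counted here are k ≤ 7 together with k ≥ 17.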

  19. BINOMIAL TEST

  20. BINOMIAL TEST
      ▸ Case 2: Is the coin biased towards heads?
        ▸ $H_0\colon \theta = 0.5$
        ▸ $H_a\colon \theta < 0.5$
      ▸ which values of $k$ are more extreme evidence against $H_0$?

  21. BINOMIAL TEST
      ▸ Case 2: Is the coin biased towards heads?
        ▸ $H_0\colon \theta = 0.5$
        ▸ $H_a\colon \theta < 0.5$
      ▸ which values of $k$ are more extreme evidence against $H_0$?
        ▸ anything even more in favor of $H_a$
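
For Case 2, with the alternative hypothesis θ < 0.5 as stated on the slide, values of k that are "even more in favor" of the alternative are those at most as large as the observed k. A minimal sketch for the 24/7 example follows; the function names are assumptions made for this illustration.

```python
from math import comb


def binomial_pmf(k, theta0, n):
    """Binomial(k, theta0, n): probability of k successes in n trials."""
    return comb(n, k) * theta0**k * (1.0 - theta0) ** (n - k)


def one_sided_p_value(k_obs, n, theta0=0.5):
    """P(T <= k_obs) under H_0, i.e. outcomes at least as favorable to theta < theta0."""
    return sum(binomial_pmf(k, theta0, n) for k in range(k_obs + 1))


p_one_sided = one_sided_p_value(k_obs=7, n=24)
print(f"one-sided p-value for k = 7, N = 24: {p_one_sided:.4f}")  # 0.0320
```

Only the lower tail is counted here, which is why this value is half of the two-sided p-value for the same data under a null value of 0.5.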

  22. BINOMIAL TEST

  23. p-value revisited

  24. P-VALUE

  25. significance and $\alpha$-errors

  26. SIGNIFICANCE LEVELS
      ▸ standardly we fix a significance level $\alpha$ before the test
      ▸ common values of $\alpha$ are:
        ▸ $\alpha = 0.05$
        ▸ $\alpha = 0.01$
        ▸ $\alpha = 0.001$
      ▸ if the p-value for the observed data passes the pre-established threshold of significance, we say that the test result was significant
      ▸ a significant test result is conventionally regarded as “strong enough” evidence against the null-hypothesis, so that we can reject the null hypothesis as a viable explanation of the data
      ▸ non-significant results are interpreted differently in different approaches (more later)
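
The decision rule can be written down in a few lines. This sketch assumes that "passing the threshold" means the p-value is no larger than α (some texts use a strict inequality instead), and the p-value of 0.03 is a hypothetical input chosen only for illustration.

```python
COMMON_ALPHAS = (0.05, 0.01, 0.001)


def is_significant(p_value, alpha):
    """Convention assumed here: the result is significant when p_value <= alpha."""
    return p_value <= alpha


def report(p_value, alphas=COMMON_ALPHAS):
    """Map each significance level to the decision it would yield."""
    return {alpha: is_significant(p_value, alpha) for alpha in alphas}


# Hypothetical p-value, for illustration only.
decisions = report(0.03)
for alpha, significant in decisions.items():
    verdict = "reject H_0" if significant else "do not reject H_0"
    print(f"alpha = {alpha}: {verdict}")
```

The table of decisions is for comparison only; in an actual test a single α is fixed before looking at the data.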

  27. $\alpha$-ERROR
      ▸ an $\alpha$-error (aka type-I error) occurs when we reject a true null hypothesis
      ▸ by definition this type of error occurs, in the long run, with a proportion of no more than $\alpha$
      ▸ it is in this way that frequentist statistics subscribes to and cherishes a regime of long-term error control on research results
      ▸ Bayesian approaches (usually) are not concerned with long-term error control
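
The long-run bound can be checked exactly for the binomial test: if the null hypothesis is true, the proportion of rejections equals the total probability of all values of k whose p-value is no larger than α. The sketch below assumes a two-sided test on N = 24 flips with a null value of 0.5 and the rejection convention p ≤ α; the function names are assumptions made for this illustration.

```python
from math import comb


def binomial_pmf(k, theta0, n):
    """Binomial(k, theta0, n): probability of k successes in n trials."""
    return comb(n, k) * theta0**k * (1.0 - theta0) ** (n - k)


def two_sided_p_value(k_obs, n, theta0):
    """Sum the probabilities of all outcomes no more likely than k_obs under H_0."""
    p_obs = binomial_pmf(k_obs, theta0, n)
    tolerance = 1.0 + 1e-7  # assumption: treat near-ties as ties
    return sum(
        binomial_pmf(k, theta0, n)
        for k in range(n + 1)
        if binomial_pmf(k, theta0, n) <= p_obs * tolerance
    )


def alpha_error_rate(n, theta0, alpha):
    """Probability, under a true H_0, of observing a k whose p-value is <= alpha."""
    return sum(
        binomial_pmf(k, theta0, n)
        for k in range(n + 1)
        if two_sided_p_value(k, n, theta0) <= alpha
    )


rate = alpha_error_rate(n=24, theta0=0.5, alpha=0.05)
print(f"alpha-error rate at alpha = 0.05: {rate:.4f}")  # 0.0227
```

Because k is discrete, the attained error rate here falls below α instead of matching it exactly, which is consistent with "no more than α".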
