Chapter 6 Hypothesis Testing: What is Hypothesis Testing?


  1. Chapter 6 Hypothesis Testing

  2. What is Hypothesis Testing?
     • … the use of statistical procedures to answer research questions
     • Typical research question (generic):
     • For hypothesis testing, research questions are statements:
     • This is the null hypothesis (assumption of “no difference”)
     • Statistical procedures seek to reject or fail to reject the null hypothesis (details to follow)

  3. • Thus far:
       – You have generated a hypothesis (e.g., the mean of group A is different from the mean of group B)
       – You have collected some data (samples in group A, samples in group B)
       – Now you want to know if these data support your hypothesis
       – Formally:
       – H0 (null hypothesis): there is no difference in the mean values of group A and group B
       – H1 (experimental hypothesis): there is a difference in the mean of group A and group B

  4. A practitioner’s point of view
     • Test statistic
       – Inferential statistics tell us how likely the observed result would be if the null hypothesis were true → by computing a test statistic.
       – Typically, if the likelihood of obtaining a value of a test statistic is < 0.05, then we can reject the null hypothesis
       – “…significant effect of …”
     • Non-significant results
       – Do not mean that the null hypothesis is true
       – Interpreted to mean that the results you are getting could be a chance finding
     • Significant result
       – Means that the observed data would be highly unlikely if the null hypothesis were true

  5. A practitioner’s point of view
     • Errors:
       – Type 1 error (false positive): we believe that there is an effect when there isn’t one
       – Type 2 error (false negative): we believe that there isn’t an effect when there is one
       – If we reject the null hypothesis only when p < 0.05, then the probability of a Type 1 error is at most 5% (alpha level)
     • Typically, we deal with two types of hypotheses
       – The mean of group A is different from the mean of group B (two-tailed test)
       – The mean of group A is larger than the mean of group B (one-tailed test)
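The difference between the two kinds of hypotheses shows up directly in the p-value. The following is a minimal sketch, assuming the test statistic is approximately standard normal (a z statistic); the function name `p_values_from_z` and the value 1.80 are illustrative and not taken from the slides.

```python
from statistics import NormalDist

def p_values_from_z(z):
    """Return (two_tailed, one_tailed) p-values for a z statistic.

    Assumption: the statistic follows a standard normal distribution
    when the null hypothesis is true.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # One-tailed (H1: mean of A is larger than mean of B):
    # probability of a statistic at least this large.
    one_tailed = 1.0 - std_normal.cdf(z)
    # Two-tailed (H1: means are different):
    # probability of a statistic at least this extreme in either direction.
    two_tailed = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return two_tailed, one_tailed

# Hypothetical test statistic, chosen only for illustration.
two_tailed, one_tailed = p_values_from_z(1.80)
print(f"two-tailed p = {two_tailed:.4f}, one-tailed p = {one_tailed:.4f}")
```

With this hypothetical statistic, the one-tailed p-value falls below .05 while the two-tailed p-value does not, so the choice of hypothesis should be made before looking at the data.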

  6. Statistical Procedures
     • Two types:
       – Parametric
         • Data are assumed to come from a distribution, such as the normal distribution, t-distribution, etc.
       – Non-parametric
         • Data are not assumed to come from a particular distribution
       – Lots of debate on assumptions testing and what to do if assumptions are not met (avoided here, for the most part)
       – A reasonable basis for deciding on the most appropriate test is to match the type of test with the measurement scale of the data (next slide)

  7. Measurement Scales vs. Statistical Tests
     • Examples (by measurement scale)
       – Nominal: M = Male, F = Female
       – Ordinal: Preference ranking
       – Interval: Likert scale responses
       – Ratio: Task completion time
     • Parametric tests most appropriate for…
       – Ratio data, interval data
     • Non-parametric tests most appropriate for…
       – Ordinal data, nominal data (although limited use for ratio and interval data)

  8. Tests Presented Here
     • Parametric
       – T-test
       – Analysis of variance (ANOVA)
       – Most common statistical procedures in HCI research

  9. T-test
     • Goal: To ascertain if the difference in the means of two groups is significant
     • Assumptions
       – Data are normally distributed (you checked for this by looking at the histograms, reporting the mean/median/standard deviation, and by running Shapiro-Wilk)
       – If data come from different groups of people → independent t-test (assumes scores are independent and variances in the populations are roughly equal … check your table of descriptive statistics)
       – If data come from the same group of people → dependent t-test
     • Practitioner’s point of view: When in doubt, consult a book! Let’s do an example in R
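The slides work their example in R, which is not reproduced in this transcript. As a stand-in, here is a minimal sketch in Python's standard library of the two t statistics named above. The data are hypothetical, the function names are illustrative, and the p-value lookup is replaced by a comparison against the standard two-tailed critical values of t at alpha = .05 (2.262 for 9 degrees of freedom, 2.101 for 18).

```python
import math
from statistics import mean, variance

def dependent_t(a, b):
    """Paired (dependent) t statistic: the same participants give both scores."""
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / math.sqrt(variance(diffs) / n)
    return t, n - 1  # statistic, degrees of freedom

def independent_t(a, b):
    """Pooled-variance t statistic: different participants in each group.

    Assumes the population variances are roughly equal.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(b) - mean(a)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical scores for ten participants (not the data behind the slides).
group_a = [5.3, 3.6, 5.2, 3.6, 4.6, 4.1, 4.0, 4.8, 5.2, 5.1]
group_b = [5.7, 4.8, 5.1, 4.5, 6.0, 6.8, 6.0, 4.6, 5.5, 5.6]

t_dep, df_dep = dependent_t(group_a, group_b)
t_ind, df_ind = independent_t(group_a, group_b)

# Two-tailed critical values of t at alpha = .05.
CRITICAL_T = {9: 2.262, 18: 2.101}
print(f"dependent:   t({df_dep}) = {t_dep:.3f}, significant: {abs(t_dep) > CRITICAL_T[df_dep]}")
print(f"independent: t({df_ind}) = {t_ind:.3f}, significant: {abs(t_ind) > CRITICAL_T[df_ind]}")
```

In practice only one of the two tests applies to a given data set; both are run on the same numbers here purely to show how the calculations differ.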

  10. Tests Presented Here
      • Parametric
        – Analysis of variance (ANOVA)
          • Used for ratio data and interval data
          • Most common statistical procedure in HCI research
      • Non-parametric
        – Chi-square test
          • Used for nominal data
        – Mann-Whitney U, Wilcoxon Signed-Rank, Kruskal-Wallis, and Friedman tests
          • Used for ordinal data

  11. Analysis of Variance
      • The analysis of variance (ANOVA) is the most widely used statistical test for hypothesis testing in factorial experiments
      • Goal → determine if an independent variable has a significant effect on a dependent variable
      • Remember, an independent variable has at least two levels (test conditions)
      • Goal (put another way) → determine if the test conditions yield different outcomes on the dependent variable (e.g., one of the test conditions is faster/slower than the other)

  12. Why Analyse the Variance?
      • Seems odd that we analyse the variance, but the research question is concerned with the overall means:
      • Let’s explain through two simple examples (next slide)

  13. • Example #1: “Significant” implies that in all likelihood the difference observed is due to the test conditions (Method A vs. Method B).
      • Example #2: “Not significant” implies that the difference observed is likely due to chance.
      • File: 06-AnovaDemo.xlsx
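To make the contrast between the two examples concrete, the sketch below computes a single-factor, within-subjects F statistic for two hypothetical data sets that share the same condition means but differ in how consistent the participants are. The numbers are invented for illustration and are not the contents of 06-AnovaDemo.xlsx; the function name is likewise an assumption.

```python
def within_subjects_anova(scores):
    """Single-factor, within-subjects ANOVA.

    scores: one row per participant, one column per test condition.
    Returns (F, df_effect, df_residual).
    """
    m = len(scores)      # number of participants
    n = len(scores[0])   # number of test conditions
    grand = sum(sum(row) for row in scores) / (m * n)
    cond_means = [sum(row[j] for row in scores) / m for j in range(n)]
    subj_means = [sum(row) / n for row in scores]

    # Partition the total variability into effect, participant, and residual parts.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_effect = m * sum((c - grand) ** 2 for c in cond_means)
    ss_subjects = n * sum((s - grand) ** 2 for s in subj_means)
    ss_residual = ss_total - ss_effect - ss_subjects

    df_effect = n - 1
    df_residual = (n - 1) * (m - 1)
    f = (ss_effect / df_effect) / (ss_residual / df_residual)
    return f, df_effect, df_residual

# Hypothetical data: columns are Method A and Method B, rows are participants.
method_a1 = [5.3, 3.6, 5.2, 3.6, 4.6, 4.1, 4.0, 4.8, 5.2, 5.1]
method_b1 = [5.7, 4.8, 5.1, 4.5, 6.0, 6.8, 6.0, 4.6, 5.5, 5.6]
method_a2 = [2.4, 6.9, 5.7, 4.4, 3.0, 5.9, 4.6, 3.3, 5.6, 3.7]
method_b2 = [6.9, 4.1, 3.5, 7.8, 5.9, 3.6, 7.6, 4.2, 6.8, 4.2]

f1, df_e1, df_r1 = within_subjects_anova(list(zip(method_a1, method_b1)))
f2, df_e2, df_r2 = within_subjects_anova(list(zip(method_a2, method_b2)))

F_CRITICAL_1_9 = 5.12  # critical F for (1, 9) degrees of freedom at alpha = .05
print(f"consistent participants:   F({df_e1},{df_r1}) = {f1:.3f}")
print(f"inconsistent participants: F({df_e2},{df_r2}) = {f2:.3f}")
```

Both data sets have the same mean for Method A and the same mean for Method B, yet only the first F exceeds the critical value. The means alone cannot settle the question; the variability around them decides whether the difference is distinguishable from chance.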

  14. Example #1 - Details
      • Note: Within-subjects design
      • Error bars show ±1 standard deviation
      • Note: SD is the square root of the variance

  15. Example #1 – ANOVA¹
      • p: probability of obtaining the observed data if the null hypothesis is true
      • Thresholds for “p”: .05, .01, .005, .001, .0005, .0001
      • Reported as … F₁,₉ = 9.80, p < .05
      ¹ ANOVA table created by StatView (now marketed as JMP, a product of SAS; www.sas.com)

  16. How to Report an F-statistic
      • Notice in the parentheses
        – Uppercase for F
        – Lowercase for p
        – Italics for F and p
        – Space on both sides of the equal sign
        – Space after the comma
        – Space on both sides of the less-than sign
        – Degrees of freedom are subscript, plain, smaller font
        – Three significant figures for the F statistic
        – No zero before the decimal point in the p statistic (except in Europe)

  17. Example #2 - Details
      • Error bars show ±1 standard deviation

  18. Example #2 – ANOVA
      • p: probability of obtaining the observed data if the null hypothesis is true
      • Note: For non-significant effects, use “ns” if F < 1.0, or “p > .05” if F > 1.0.
      • Reported as … F₁,₉ = 0.626, ns
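The reporting conventions on the last few slides can be collected into a small helper. This is a sketch only: `report_f` is a hypothetical name, the p-values passed in are illustrative inputs chosen to be consistent with the thresholds the slides report, and italics cannot be expressed in plain text, so only the spacing, subscripts, significant figures, and threshold rules are applied.

```python
# Thresholds for p, smallest first, written without a leading zero.
P_THRESHOLDS = [(0.0001, ".0001"), (0.0005, ".0005"), (0.001, ".001"),
                (0.005, ".005"), (0.01, ".01"), (0.05, ".05")]
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

def report_f(f, df_effect, df_residual, p):
    """Format an F statistic following the conventions listed on the slides."""
    df_text = f"{df_effect}".translate(SUBSCRIPT_DIGITS) + "," + \
              f"{df_residual}".translate(SUBSCRIPT_DIGITS)
    # Three significant figures; '#' keeps trailing zeros (9.8 -> "9.80").
    f_text = format(f, "#.3g").rstrip(".")
    for threshold, label in P_THRESHOLDS:
        if p < threshold:
            return f"F{df_text} = {f_text}, p < {label}"
    # Non-significant: "ns" if F < 1.0, otherwise "p > .05".
    tail = "ns" if f < 1.0 else "p > .05"
    return f"F{df_text} = {f_text}, {tail}"

# The p-values below are illustrative inputs, not values from the slides.
print(report_f(9.80, 1, 9, 0.0121))
print(report_f(0.626, 1, 9, 0.449))
print(report_f(1.21, 1, 9, 0.30))
```

Very large F values would be rendered in scientific notation by the `g` format, so a production version would need an extra case for them.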

  19. Example #2 - Reporting

  20. More Than Two Test Conditions

  21. ANOVA
      • There was a significant effect of Test Condition on the dependent variable (F₃,₄₅ = 4.95, p < .005)
      • Degrees of freedom
        – If n is the number of test conditions and m is the number of participants, the degrees of freedom are…
        – Effect → (n – 1)
        – Residual → (n – 1)(m – 1)
        – Note: single-factor, within-subjects design
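The degrees-of-freedom formulas are easy to check by hand or in code. The sketch below, with illustrative function names, applies them in both directions; it assumes the single-factor, within-subjects design stated in the note.

```python
def within_subjects_df(n_conditions, m_participants):
    """Degrees of freedom for a single-factor, within-subjects ANOVA."""
    df_effect = n_conditions - 1
    df_residual = (n_conditions - 1) * (m_participants - 1)
    return df_effect, df_residual

def design_from_df(df_effect, df_residual):
    """Invert the formulas: recover (conditions, participants) from the df pair."""
    n_conditions = df_effect + 1
    m_participants = df_residual // df_effect + 1
    return n_conditions, m_participants

print(within_subjects_df(4, 16))  # (3, 45)
print(design_from_df(3, 45))      # (4, 16)
```

Under that assumption, the reported degrees of freedom of 3 and 45 correspond to four test conditions and sixteen participants.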

  22. Post Hoc Comparisons Tests
      • A significant F-test means that at least one of the test conditions differed significantly from at least one other test condition
      • Does not indicate which test conditions differed significantly from one another
      • To determine which pairs differ significantly, a post hoc comparisons test is used
      • Examples:
        – Fisher PLSD, Bonferroni/Dunn, Dunnett, Tukey/Kramer, Games/Howell, Student-Newman-Keuls, orthogonal contrasts, Scheffé
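As one illustration of the idea, the sketch below applies a plain Bonferroni correction, the simplest of the approaches listed: the alpha level is divided by the number of pairwise comparisons, and a pair counts as significantly different only if its uncorrected p-value falls below that adjusted level. The condition labels and p-values are hypothetical, and the function name is an assumption.

```python
from itertools import combinations

def bonferroni_pairs(conditions, p_values, alpha=0.05):
    """Return the adjusted alpha and the pairs that remain significant.

    p_values maps each (condition, condition) pair to its uncorrected p-value.
    """
    pairs = list(combinations(conditions, 2))  # n(n - 1)/2 pairs
    adjusted_alpha = alpha / len(pairs)        # split alpha across all comparisons
    significant = [pair for pair in pairs if p_values[pair] < adjusted_alpha]
    return adjusted_alpha, significant

conditions = ["A", "B", "C", "D"]
# Hypothetical uncorrected p-values for the six pairwise comparisons.
p_values = {
    ("A", "B"): 0.300, ("A", "C"): 0.020, ("A", "D"): 0.001,
    ("B", "C"): 0.150, ("B", "D"): 0.004, ("C", "D"): 0.060,
}

adjusted_alpha, significant = bonferroni_pairs(conditions, p_values)
print(f"adjusted alpha = {adjusted_alpha:.4f}")
print("pairs that differ significantly:", significant)
```

Note that the A vs. C comparison would pass an uncorrected .05 threshold but not the adjusted one, which is the point of correcting for multiple comparisons.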
