User Research Statistics Quick Guide



  1. User Research Statistics Quick Guide
     Reference: Jeff Sauro and James R. Lewis, Quantifying the User Experience, 2nd ed., Chapter 3 and parts of Chapter 9
     CS464, Spring 2017

  2. Why?
     To completely answer usability questions we would need to test every member of the population. This isn’t possible, so we:
     • Test a sample of the population, then estimate what the values would be for the entire population.
       – Estimates become less accurate as the sample size gets smaller.
     • The value we really want is called a population parameter.

  3. Confidence Intervals
     • A range of values that we believe has a specific chance of containing the unknown population parameter.
     • The width of a confidence interval is twice the margin of error of a measurement.
     • Strict interpretation: we are 95% confident in the method of creating the confidence interval, not 95% confident in any particular interval.
       – So, if a 95% confidence interval is calculated as 0.70 ± 0.28, we can say that we are 95% confident that the actual population parameter is between 42% and 98%. If we draw 100 samples of the same size from the population and compute the 95% confidence interval each time, on average 95 of those 100 intervals will contain the population parameter. That also means about 5 of them won’t contain it, and we don’t know which ones.
       – You can say that any value inside the interval is plausible and any value outside the interval is not (Smithson, 2003).
       – DO NOT say there is a 95% probability that the population parameter is between 42% and 98%.
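The "95 out of 100 intervals" reading can be checked with a small simulation. The sketch below is illustrative only: the population (normal, mean 50, standard deviation 10), the sample size, and the function name `coverage_rate` are all assumptions, and the standard deviation is treated as known to keep the interval formula short.

```python
import random
import statistics
from math import sqrt

def coverage_rate(trials=2000, n=30, mu=50.0, sigma=10.0, level=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Two-sided critical value from the standard normal (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Assumption: sigma is known, so every interval has the same margin of error.
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = statistics.fmean(sample)
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials

print(coverage_rate())  # expected to land close to 0.95
```

Any single interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds over many repetitions.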

  4. Confidence Intervals
     Affected by 3 things:
     • Confidence level: e.g., 95% confident
     • Variability of the population: estimated using the standard deviation
     • Sample size: usually the only thing a researcher can control
       – Confidence interval width has an inverse square root relationship with sample size. To halve the interval width, you must quadruple the sample size: a ±20% margin of error with a sample size of 20 requires a sample size of 80 to reach ±10%.
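The inverse square root relationship can be turned into a quick planning calculation. This is a minimal sketch that assumes the variability and confidence level stay the same between studies; `required_sample_size` is a hypothetical helper name.

```python
from math import ceil

def required_sample_size(current_n, current_margin, target_margin):
    """Sample size needed to shrink the margin of error to target_margin.

    Assumes margin of error is proportional to 1 / sqrt(n), so
    n_new = n_old * (old_margin / new_margin) ** 2, rounded up.
    """
    return ceil(current_n * (current_margin / target_margin) ** 2)

# The slide's example: +/-20% at n = 20, target +/-10%.
print(required_sample_size(20, 0.20, 0.10))  # 80
```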

  5. Confidence intervals for binary response questions
     Did the user complete the task? Did the user encounter problem X?
     • Yes or No, coded as 1 or 0
     • A sample completion rate (proportion) is the number of successes divided by the sample size.
     • What is the likely range for the completion rate of the full population?
       – Compute a binomial confidence interval around the sample proportion.
     • Problem: many computation methods are very inaccurate for small sample sizes. E.g., the Laplace/Wald interval found in most statistics texts:
       – Very inaccurate with sample sizes less than around 100
       – Inaccurate when the proportion is close to 0 or to 1
       – Instead of containing the population proportion 95% of the time, a nominal 95% interval can contain it as little as 50-60% of the time.
       – More typically it contains the population proportion about 70% of the time, so a calculated 95% interval is really closer to a 70% confidence interval.
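For reference, the Wald interval criticized above is the sample proportion plus or minus z times sqrt(p(1 − p)/n). The sketch below uses a hypothetical function name and made-up completion counts; it is meant to show where the formula breaks down, not to recommend it.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, level=0.95):
    """Wald (normal approximation) interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    p = successes / n
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical study: 7 of 10 users complete the task.
print(wald_interval(7, 10))   # roughly (0.416, 0.984)

# With 10 of 10 successes the interval collapses to a single point,
# which claims certainty that a sample of 10 cannot justify.
print(wald_interval(10, 10))  # (1.0, 1.0)
```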

  6. Exact Confidence Intervals
     • Unlike Wald intervals, these work even for small sample sizes.
     • Computationally intensive.
     • Conservative:
       – A 95% exact confidence interval is guaranteed to contain the population proportion at least 95 times out of 100. In practice it contains the proportion closer to 99% of the time.
       – This makes the interval wider than needed.
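The slides do not name a specific exact method; a common one is the Clopper-Pearson interval, which is what this sketch assumes. It finds each limit by bisection on the binomial cumulative distribution, which also shows why the method is described as computationally intensive. All function names here are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _solve_p(k, n, target):
    """Find p in [0, 1] where binom_cdf(k, n, p) equals target.

    The CDF decreases as p grows, so plain bisection works.
    """
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > target:
            lo = mid  # CDF still too large, so p must be larger
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(successes, n, level=0.95):
    """Clopper-Pearson interval (assumed here to be the 'exact' method)."""
    alpha = 1 - level
    # Lower limit: p where P(X >= successes) = alpha / 2.
    lower = 0.0 if successes == 0 else _solve_p(successes - 1, n, 1 - alpha / 2)
    # Upper limit: p where P(X <= successes) = alpha / 2.
    upper = 1.0 if successes == n else _solve_p(successes, n, alpha / 2)
    return lower, upper

print(exact_interval(7, 10))  # roughly (0.35, 0.93)
```

For the same 7-of-10 data, this interval is wider than the adjusted Wald interval, which reflects the conservatism noted above.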

  7. Adjusted Wald Intervals
     • Add 2 successes and 2 failures for 95% confidence intervals and then use the Wald formula.
       – Works well for small sample sizes
       – Works well when the proportion is close to 1 or to 0
     • The number of successes and failures to add depends on the confidence desired and is derived from the critical value z of the normal distribution for that level of confidence: add z²/2 successes and z²/2 failures (1.96²/2 ≈ 1.92, hence "add 2" at 95%).
       – The critical value for 90% is 1.64
       – The critical value for 95% is 1.96
       – The critical value for 99% is 2.58
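A minimal sketch of the adjustment, assuming the standard adjusted Wald (Agresti-Coull) form in which z²/2 is added to the successes and z² to the sample size, which comes to about 2 successes and 2 failures at 95%. The function name and the clipping of the limits to [0, 1] are choices made for this example.

```python
from math import sqrt
from statistics import NormalDist

def adjusted_wald_interval(successes, n, level=0.95):
    """Adjusted Wald interval: adjust the counts, then apply the Wald formula."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    n_adj = n + z**2                          # adds z^2 trials in total
    p_adj = (successes + z**2 / 2) / n_adj    # half of them counted as successes
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clip to the valid range for a proportion (a choice made in this sketch).
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

print(adjusted_wald_interval(7, 10))   # roughly (0.39, 0.90)
print(adjusted_wald_interval(10, 10))  # lower limit well below 1, upper limit 1.0
```

Unlike the plain Wald interval, 10 successes out of 10 no longer produce a zero-width interval.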

  8. Adjusted Wald Interval and Wald Interval [figure]

  9. Confidence intervals for rating scale questions
     How difficult was this task (Likert scale)?
     • Code the scale data: e.g., from very difficult = 1 to very easy = 7 for a 7-point Likert scale.
     • Compute the mean and standard deviation.
     • Look up the critical value from the t-distribution (table lookup).
       – The t-distribution takes sample size into account.
     • Compute the t-confidence interval.

  10. t-confidence Interval
      • The interval spans 2 margins of error, centered on the mean: (mean − (margin of error)) to (mean + (margin of error))
      • Margin of error: (critical value from t-distribution) × (standard error)
      • Standard error is how much the sample mean can fluctuate given a sample size (standard deviation divided by the square root of the sample size)
        – Standard error describes the sample mean
        – Standard deviation describes the raw data
      • The confidence interval is calculated from the sample mean, the standard error, and the critical value from the t-distribution (table lookup based on sample size and desired confidence level)
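The calculation above can be sketched as follows. The ratings are made-up data, the function name is hypothetical, and the critical value is passed in by hand to mirror the table lookup: 2.262 is the two-sided 95% value for 9 degrees of freedom (sample size 10 minus 1).

```python
from math import sqrt
from statistics import mean, stdev

def t_confidence_interval(data, t_critical):
    """Mean +/- t_critical * standard error, with t_critical looked up by the caller."""
    n = len(data)
    sample_mean = mean(data)
    standard_error = stdev(data) / sqrt(n)  # sample standard deviation / sqrt(n)
    margin = t_critical * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical 7-point ratings from 10 participants.
ratings = [5, 6, 4, 7, 5, 6, 3, 6, 5, 6]
T_CRIT_95_DF9 = 2.262  # table value: 95% two-sided, 9 degrees of freedom

print(t_confidence_interval(ratings, T_CRIT_95_DF9))  # roughly (4.47, 6.13)
```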

  11. t-confidence intervals
      Excel 2013: T.INV.2T() returns the two-tailed critical value of the t-distribution.

  12. Statistical Analyses on Ordinal Data
      • Problem: rating scale data is ordinal, and many people believe it is wrong to use it for statistical analysis.
      • Many experts believe it is OK to perform statistical analysis on it (including t-tests, analysis of variance, and factor analysis); you just have to make sure you don’t draw conclusions that assume ratio or interval data.
        – Ex: The average response on design A is a 4 (e.g., “I like the design”), and on design B it is a 2 (“I don’t really like the design”). Assume a t-test indicates the difference is statistically significant.
          • You can ONLY claim there is a consistent difference between the responses.
          • You CANNOT claim that design A is twice as good as design B; this is a ratio data claim.
          • You CANNOT claim that the difference between 4 and 2 is equal to the difference between 4 and 6; this is an interval data claim.

  13. Confidence intervals for continuous questions
      How long does it take to do task X?
      • Task time data tends to be positively skewed rather than symmetrically distributed.
      • We need a better measure of the center of the distribution than the mean.
      • The median may be a better center.
      • Problems with the median:
        – Variability: with an odd number of samples it is the middle value; with an even number it is the average of the two middle values. With small sample sizes it can jump around a lot when just a few samples are added.
        – Bias: with small samples, the median of completion times tends to consistently overestimate the population median, whereas a sample mean is just as likely to overestimate as underestimate the population mean.
      • Better choice for small samples: geometric mean
        – Sauro/Lewis found that for sample sizes < 25, the geometric mean has less bias than the mean or median.
        – To compute the geometric mean:
          1. Convert the raw data to natural logs
          2. Find the mean of the transformed values
          3. Convert back by exponentiation
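The three steps for the geometric mean translate directly into code. The task times below are made-up values in seconds, chosen to include a couple of slow outliers; Python's standard library also provides `statistics.geometric_mean`, which should agree with this hand-rolled version.

```python
from math import exp, log
from statistics import mean, median

def geometric_mean(values):
    """Geometric mean via logs: log-transform, average, exponentiate."""
    logs = [log(v) for v in values]   # step 1: natural log of each value
    mean_log = sum(logs) / len(logs)  # step 2: mean of the transformed values
    return exp(mean_log)              # step 3: convert back by exponentiation

# Hypothetical task times in seconds (positively skewed).
task_times = [40, 36, 53, 56, 110, 48, 34, 44, 30, 40, 80]

print("mean:          ", mean(task_times))
print("median:        ", median(task_times))
print("geometric mean:", geometric_mean(task_times))
```

For skewed data like this, the geometric mean sits below the arithmetic mean because the slow outliers pull it up less.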

  14. Log transforming confidence intervals
      • Generate the confidence limits using the natural logs:
        – Compute the mean and standard deviation of the natural logs of the raw data; the mean of the logs is the natural log of the geometric mean.
        – Use these numbers as in the t-confidence interval to compute the confidence limits on the log scale.
        – Exponentiate these limits to get the confidence interval in the original units.
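A minimal sketch of the log-transform procedure, using made-up task times and a hand-supplied critical value (2.228 is the two-sided 95% t value for 10 degrees of freedom, matching a sample of 11). The function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import mean, stdev

def log_t_confidence_interval(times, t_critical):
    """t-interval computed on natural logs, then exponentiated back.

    Returns (lower limit, geometric mean, upper limit) in the original units.
    """
    logs = [log(t) for t in times]
    n = len(logs)
    mean_log = mean(logs)                    # natural log of the geometric mean
    standard_error = stdev(logs) / sqrt(n)   # standard error on the log scale
    margin = t_critical * standard_error
    return exp(mean_log - margin), exp(mean_log), exp(mean_log + margin)

# Hypothetical task times in seconds.
task_times = [40, 36, 53, 56, 110, 48, 34, 44, 30, 40, 80]
T_CRIT_95_DF10 = 2.228  # table value: 95% two-sided, 10 degrees of freedom

lower, center, upper = log_t_confidence_interval(task_times, T_CRIT_95_DF10)
print(lower, center, upper)
```

The resulting interval is not symmetric around the geometric mean: the upper limit is farther away than the lower one, which matches the positive skew of task times.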

  15. ln-based transform confidence intervals

  16. Median confidence intervals
      • If the sample size is > 25, compute a confidence interval around the median using the z-distribution (the standard normal distribution).
      • Similar computation to t-confidence intervals: (sample size) × (0.5) ± ((critical value from z-distribution) × (standard error))
        – 0.5 is for the median; another percentile can be used instead, e.g., 0.75 for the 75th percentile (the value higher than 75% of all the values).
        – Standard error is the square root of ((sample size) × (0.5) × (1 − 0.5)); again, 0.5 is for the median and any other percentile can be used.
        – The two results are positions in the sorted data; the values at those positions are the interval endpoints.
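The formula above yields positions in the ordered data rather than values, which the sketch below makes explicit. Rounding both positions up to the next whole number and clamping them to the data range are assumptions of this example, as is the function name.

```python
from math import ceil, sqrt
from statistics import NormalDist

def percentile_interval(data, p=0.5, level=0.95):
    """Confidence interval around a percentile (p = 0.5 gives the median).

    Computes n*p +/- z * sqrt(n*p*(1-p)) as 1-based positions in the sorted
    data and returns the values found at those positions.
    """
    ordered = sorted(data)
    n = len(ordered)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    standard_error = sqrt(n * p * (1 - p))
    # Assumption: round positions up, then clamp to the valid range 1..n.
    lower_pos = max(1, ceil(n * p - z * standard_error))
    upper_pos = min(n, ceil(n * p + z * standard_error))
    return ordered[lower_pos - 1], ordered[upper_pos - 1]

# Hypothetical data: 30 task times that happen to be 1..30 seconds,
# so the returned values equal their positions.
print(percentile_interval(range(1, 31)))  # (10, 21)
```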

  17. Using a median with binomial distribution to estimate confidence intervals
