
User Research Statistics Quick Guide
Reference: Jeff Sauro and James R. Lewis, Quantifying the User Experience, 2nd ed., Chapter 3, parts of Chapter 9
CS464, Spring 2017

Why?
To completely answer usability questions we would need to test every member of the population. This isn’t possible, so we:
• Test a sample of the population, then estimate what the values would be for the entire population.
  – Estimates become less accurate as the sample size gets smaller.
• The value we really want is called a population parameter.

Confidence Intervals
• Range of values that we believe will have a specific chance of containing the unknown population parameter.
• The width of a confidence interval is twice the margin of error of a measurement.
• Strict interpretation is that we are 95% confident in the method of creating the confidence interval, not 95% confident of any particular interval.
  – So, if a 95% confidence interval is calculated as 0.7 ± 0.28, we can say that we are 95% confident that the actual population mean is between 42% and 98%. If we run 100 tests with the same sample size from the population and compute the 95% confidence interval each time, on average 95 of those 100 intervals will contain the population mean. But that also means that about 5 of them won’t contain it, and we don’t know which ones.
  – You can say that any value inside the interval is plausible, and any value outside the interval is not (Smithson, 2003).
  – DO NOT say there is a 95% probability that the population mean is between 42% and 98%.
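The "95 of 100 intervals" reading can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population is normal with a made-up mean and standard deviation, each sample has 20 observations, and the interval is a t-based interval using the tabled critical value 2.093 for 19 degrees of freedom. The function name and all parameter values are hypothetical.

```python
import random
import statistics

def coverage(trials=2000, n=20, true_mean=50.0, true_sd=10.0, t_crit=2.093, seed=1):
    """Fraction of simulated 95% intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample from the (assumed) normal population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        # Margin of error = critical value x standard error.
        moe = t_crit * statistics.stdev(sample) / n ** 0.5
        if mean - moe <= true_mean <= mean + moe:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should land close to 0.95. Any single interval either contains the true mean or it does not, which is why the 95% describes the method rather than one interval.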

Confidence Intervals
Affected by 3 things:
• Confidence level: e.g., 95% confident
• Variability of the population: estimated using the standard deviation
• Sample size: usually the only thing a researcher can control
  – Confidence interval width has an inverse square root relationship with sample size. To halve the interval width, you must quadruple your sample size: a ±20% margin of error with a sample size of 20 means a sample size of 80 is needed to achieve a ±10% margin of error.
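The inverse square root relationship can be turned into a quick planning calculation. This is a rough sketch that assumes the confidence level and the variability stay the same and ignores the small change in the critical value as the sample grows; the function name is hypothetical.

```python
import math

def required_sample_size(n_current, moe_current, moe_target):
    """Sample size needed to shrink the margin of error from moe_current to moe_target."""
    # Margin of error is proportional to 1 / sqrt(n),
    # so n grows with the square of the ratio of the two margins.
    scaled = n_current * (moe_current / moe_target) ** 2
    # Round away floating-point noise before rounding up to a whole participant.
    return math.ceil(round(scaled, 9))

print(required_sample_size(20, 0.20, 0.10))  # 80, matching the slide's example
```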

Confidence intervals for binary response questions
Did the user complete the task? Did the user encounter problem X?
• Yes or No, coded as 1 or 0
• A sample completion rate (proportion) is the number of successes divided by the sample size
• What is the likely range for the completion rate of the full population?
  – Compute a binomial confidence interval around the sample proportion.
• Problem: Many computations are very inaccurate for small sample sizes. E.g., the Laplace/Wald interval found in most statistics texts:
  – Very inaccurate with sample sizes less than around 100
  – Inaccurate when the proportion is close to 0 or to 1
  – Instead of containing the proportion 95% of the time, its actual coverage can be as low as 50-60%.
  – More likely to contain the actual proportion about 70% of the time, so your calculated 95% interval is really a 70% confidence interval.
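For reference, the Wald interval is the sample proportion plus or minus the normal critical value times sqrt(p(1 − p)/n). The sketch below assumes that standard textbook form; the function name and the example counts are hypothetical.

```python
import math
from statistics import NormalDist

def wald_interval(successes, n, confidence=0.95):
    """Unadjusted Wald interval around a sample proportion."""
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    moe = z * math.sqrt(p * (1 - p) / n)
    return p - moe, p + moe

print(wald_interval(7, 10))   # roughly (0.416, 0.984)
print(wald_interval(10, 10))  # (1.0, 1.0): the interval collapses to a single point
```

The second call shows one way the formula breaks down near the extremes: with 10 of 10 successes the estimated variability is zero, so the interval has no width at all.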

Exact Confidence Intervals
• Unlike Wald intervals, these work even for small sample sizes.
• Computationally intensive.
• Conservative:
  – If you calculate a 95% exact confidence interval, it is guaranteed to contain the proportion at least 95 times out of 100. In fact, this interval would contain the proportion closer to 99% of the time.
  – This makes the interval wider than needed.
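The slide does not name a specific method. The sketch below assumes the Clopper-Pearson construction, which is what "exact" usually refers to for binomial proportions, and finds each bound by bisection on the binomial distribution. All function names are hypothetical, and the repeated binomial sums hint at why the method is described as computationally intensive.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_decreasing(f, lo=0.0, hi=1.0, iters=60):
    """Root of a decreasing function f on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid  # still above zero, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(successes, n, confidence=0.95):
    """Clopper-Pearson interval (assumed meaning of 'exact' here)."""
    tail = (1 - confidence) / 2
    if successes == 0:
        lower = 0.0
    else:
        # Lower bound: the p at which P(X >= successes) equals the tail area.
        lower = bisect_decreasing(
            lambda p: tail - (1 - binom_cdf(successes - 1, n, p)))
    if successes == n:
        upper = 1.0
    else:
        # Upper bound: the p at which P(X <= successes) equals the tail area.
        upper = bisect_decreasing(lambda p: binom_cdf(successes, n, p) - tail)
    return lower, upper

print(exact_interval(7, 10))  # roughly (0.348, 0.933)
```

For 7 of 10 successes this is noticeably wider than the Wald result on the low side, which is the conservatism the slide describes.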

Adjusted Wald Intervals
• Add 2 successes and 2 failures for 95% confidence intervals and then use the Wald formula.
  – Works well for small sample sizes
  – Works well when the proportion is close to 1 or to 0
• The number of successes/failures to add depends on the confidence desired, and comes from the critical value z of the normal distribution for that level of confidence: add z²/2 successes and z²/2 failures (1.96²/2 ≈ 1.92, rounded to 2, for 95%).
  – The critical value for 90% is 1.645
  – The critical value for 95% is 1.96
  – The critical value for 99% is 2.576
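A minimal sketch of the adjustment, assuming the usual form in which z²/2 is added to the number of successes and z² to the sample size before the Wald formula is applied. The function name is hypothetical, and clipping the result to the 0 to 1 range is a choice made here, not something the slide specifies.

```python
import math
from statistics import NormalDist

def adjusted_wald_interval(successes, n, confidence=0.95):
    """Wald interval computed on adjusted counts."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Add z^2/2 successes and z^2/2 failures (about 2 of each at 95%).
    n_adj = n + z**2
    p_adj = (successes + z**2 / 2) / n_adj
    moe = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Assumption: clip to [0, 1], since a proportion cannot fall outside it.
    return max(0.0, p_adj - moe), min(1.0, p_adj + moe)

print(adjusted_wald_interval(7, 10))   # roughly (0.392, 0.897)
print(adjusted_wald_interval(10, 10))  # roughly (0.679, 1.0)
```

Unlike the unadjusted formula, 10 of 10 successes still yields an interval with real width.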

[Figure: Adjusted Wald interval; Wald interval]

Confidence intervals for rating scale questions
How difficult was this task (Likert scale)?
• Code the scale data: e.g., from very difficult = 1 to very easy = 7 for a 7-point Likert scale.
• Compute the mean and standard deviation.
• Determine the critical value from the t-distribution (table lookup).
  – The t-distribution takes sample size into account.
• Compute the t-confidence interval.

t-confidence Interval
• Interval spans 2 margins of error, centered on the mean: (mean − (margin of error)) to (mean + (margin of error))
• Margin of error: (critical value from t-distribution) × (standard error)
• Standard error is how much the sample mean can fluctuate given a sample size (standard deviation divided by square root of sample size)
  – Standard error has to do with the sample mean
  – Standard deviation has to do with the raw data
• Confidence interval is calculated from the sample mean, the standard error, and the critical value from the t-distribution (table lookup based on sample size and desired confidence level)
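The pieces above combine as in the following sketch. The ratings are made-up 7-point responses, and the critical value is passed in rather than computed because the Python standard library has no t-distribution; 2.262 is the tabled two-sided 95% value for 9 degrees of freedom. The function name is hypothetical.

```python
import math
import statistics

def t_confidence_interval(data, t_crit):
    """Mean plus or minus (t critical value x standard error)."""
    n = len(data)
    mean = statistics.mean(data)
    # Standard error: sample standard deviation divided by sqrt(n).
    std_err = statistics.stdev(data) / math.sqrt(n)
    moe = t_crit * std_err
    return mean - moe, mean + moe

# Hypothetical ratings from 10 participants (1 = very difficult, 7 = very easy).
ratings = [4, 5, 6, 5, 7, 3, 5, 6, 4, 5]
print(t_confidence_interval(ratings, t_crit=2.262))  # roughly (4.17, 5.83)
```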

t-confidence intervals
Excel 2013: T.INV.2T()

Statistical Analyses on Ordinal Data
• Problem: scale data is ordinal data; many people believe it is wrong to use it for statistical analysis.
• Many experts believe it is OK to perform statistical analysis with it (including t-test, analysis of variance, factor analysis); you just have to make sure you don’t draw any conclusions that assume ratio or interval data.
  – Ex: Average response on design A is a 4 (e.g., “I like the design”), and on design B it is a 2 (“I don’t really like the design”). Assume a t-test indicates the difference is statistically significant.
    • You can ONLY claim there is a consistent difference between the responses.
    • You CANNOT claim that design A is twice as good as design B; this is a ratio data claim.
    • You CANNOT claim that the difference between the 4 and 2 is equal to what a difference between 4 and 6 would be; this is an interval claim.

Confidence intervals for continuous questions
How long does it take to do task X?
• Task time data tends to be positively skewed, not a symmetrical distribution.
• We need to choose a better center of the distribution than the mean.
• Median may be a better center.
• Problems:
  – Variability based on the number of samples: with an odd number the median is the middle value, with an even number it is the average of the 2 middle values. With small sample sizes it can jump around a lot by just adding a few more samples.
  – Bias: with small samples the median of completion times tends to consistently overestimate the population median, whereas a mean is just as likely to overestimate as underestimate the population mean.
• Better choice for small samples: Geometric mean
  – Sauro/Lewis found for sample sizes < 25, geometric mean has less bias than mean or median.
  – To compute geometric mean:
    1. Convert raw data to natural logs
    2. Find the mean of the transformed values
    3. Convert back by exponentiation
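The three steps for the geometric mean translate directly into code. The task times below are made up for illustration, and the function name is hypothetical.

```python
import math
import statistics

def geometric_mean(times):
    """Geometric mean via the log-transform steps listed above."""
    logs = [math.log(t) for t in times]  # 1. convert raw data to natural logs
    mean_log = statistics.mean(logs)     # 2. mean of the transformed values
    return math.exp(mean_log)            # 3. convert back by exponentiation

# Hypothetical task times in seconds, with one slow participant.
times = [20, 40, 80]
print(geometric_mean(times))   # about 40
print(statistics.mean(times))  # about 46.7, pulled upward by the slow time
```

Python 3.8 and later also provide `statistics.geometric_mean`, which should agree with this result.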

Log transforming confidence intervals
• Generate the confidence intervals using the natural logs
  – Compute the mean of the natural logs of the raw data (this is the natural log of the geometric mean) and the standard deviation of the natural logs.
  – Use these numbers as in the t-confidence intervals to compute the confidence interval in log units.
  – Exponentiate the interval endpoints to get the confidence interval in the original units.
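A sketch of the log-transformed interval, again with made-up task times and a critical value supplied by the caller (4.303 is the tabled two-sided 95% value for 2 degrees of freedom). The function name is hypothetical.

```python
import math
import statistics

def log_confidence_interval(times, t_crit):
    """Confidence interval around the geometric mean, built on the log scale."""
    logs = [math.log(t) for t in times]
    mean_log = statistics.mean(logs)  # natural log of the geometric mean
    std_err = statistics.stdev(logs) / math.sqrt(len(logs))
    moe = t_crit * std_err            # margin of error in log units
    # Exponentiate the endpoints to return to the original units.
    return math.exp(mean_log - moe), math.exp(mean_log), math.exp(mean_log + moe)

low, center, high = log_confidence_interval([20, 40, 80], t_crit=4.303)
print(low, center, high)  # roughly 7.1, 40, 223.8
```

The interval is symmetric in log units but not in seconds: the upper end sits much farther from the geometric mean than the lower end, which fits positively skewed time data. The extreme width here comes from using only three observations.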

ln-based transform confidence intervals

Median confidence intervals
• If the sample size is > 25, compute a confidence interval around the median using the z-distribution (the standard normal distribution).
• Similar computation to t-confidence intervals: (sample size) × (0.5) ± ((critical value from z-distribution) × (standard error))
  – 0.5 is for the median calculation; the 75th percentile (higher than 75% of all the values), or any other percentile, could be used instead
  – Standard error is the square root of: ((sample size) × (0.5) × (1 − 0.5))
    • Again, 0.5 is for the median and any other percentile can be used
  – The two results are positions in the sorted data; the values at those positions are the ends of the confidence interval.
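The formula can be sketched as follows. The slide does not say how to turn fractional positions into whole ranks, so rounding both positions up to the next integer is an assumption here, as is clamping them to the available data. The function name and data are hypothetical.

```python
import math
from statistics import NormalDist

def percentile_interval(data, percentile=0.5, confidence=0.95):
    """Confidence interval around a percentile (the median by default)."""
    n = len(data)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    center = n * percentile
    std_err = math.sqrt(n * percentile * (1 - percentile))
    # Assumption: round fractional positions up, then keep them within 1..n.
    low_pos = max(1, math.ceil(center - z * std_err))
    high_pos = min(n, math.ceil(center + z * std_err))
    ordered = sorted(data)
    # Positions are 1-based ranks, so shift by one for list indexing.
    return ordered[low_pos - 1], ordered[high_pos - 1]

# Hypothetical data: 30 task times of 1..30 seconds, so each value equals its rank.
print(percentile_interval(list(range(1, 31))))  # (10, 21)
```

With 30 observations the positions work out to 15 ± 5.37, so the interval runs from the 10th to the 21st ordered value.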

Using a median with binomial distribution to estimate confidence intervals
