User Research Statistics Quick Guide

SLIDE 1

User Research Statistics Quick Guide


CS464, Spring 2017

Reference: Jeff Sauro and James R. Lewis, Quantifying the User Experience, 2nd ed, Chapter 3, parts of Chapter 9

SLIDE 2

Why?

To completely answer usability questions we would need to test every member of the population. This isn’t possible, so we:

  • Test a sample of the population, then estimate what the values would be for the entire population.

– Estimates become less accurate as the sample size gets smaller.

  • The value we really want is called a population parameter.

SLIDE 3

Confidence Intervals

  • Range of values that we believe will have a specific chance of containing the unknown population parameter.

  • The width of a confidence interval is twice the margin of error of a measurement.

  • Strict interpretation is that we are 95% confident in the method of creating the confidence interval, not 95% confident of any particular interval.

– So, if a 95% confidence interval is calculated as 0.7 ± 0.28, we can say that we are 95% confident that the actual population mean is between 42% and 98%.

– If we run 100 tests with the same sample size from the population and compute the 95% confidence interval each time, on average 95 of those 100 intervals will contain the population mean. But that also means that 5 of them won’t contain it, and we don’t know which ones.

– You can say that any value inside the interval is plausible, and any value outside the interval is not (Smithson, 2003).

– DO NOT say there is a 95% probability that the population mean is between 42% and 98%.
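A small simulation can make the "confident in the method" interpretation concrete. This is only a sketch: the population (normal, mean 0.7, standard deviation 0.2), the sample size of 30, and the name coverage_rate are assumptions for illustration, and the standard deviation is treated as known so that a z critical value can be used.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(trials=2000, n=30, mu=0.7, sigma=0.2, level=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    margin = z * sigma / n ** 0.5               # sigma assumed known (illustration only)
    hits = 0
    for _ in range(trials):
        sample_mean = mean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials

print(coverage_rate())  # expected to land near 0.95
```

Each individual interval either contains the population mean or it does not; the 95% describes how often the procedure succeeds over many repetitions.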

SLIDE 4

Confidence Intervals

Affected by 3 things:

  • Confidence level: e.g. 95% confident
  • Variability of the population: estimated using the standard deviation
  • Sample size: usually the only thing a researcher can control

– Confidence interval width has an inverse square root relationship with sample size. To halve the interval width, you must quadruple your sample size: ±20% error with a sample size of 20 means you need a sample size of 80 to achieve ±10% error.
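The inverse square root relationship can be checked with a few lines of arithmetic. The helper below is a hypothetical sketch (required_sample_size is not a name from the slides) and assumes the confidence level and variability stay the same, so only the sample size changes.

```python
import math

def required_sample_size(current_n, current_margin, target_margin):
    """Sample size needed to shrink the margin of error, all else being equal.

    Margin of error is proportional to 1 / sqrt(n), so n scales with the
    square of the ratio between the current and the target margin.
    """
    ratio = current_margin / target_margin
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(current_n * ratio ** 2, 9))

print(required_sample_size(20, 0.20, 0.10))  # 80
```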

SLIDE 5

Confidence intervals for binary response questions

Did the user complete the task? Did the user encounter problem X?

  • Yes or No, coded as 1 or 0
  • A sample completion rate (proportion) is the number of successes divided by the sample size
  • What is the likely range for the completion rate of the full population?

– Compute a binomial confidence interval around the sample proportion.

  • Problem: Many computations are very inaccurate for small sample sizes

E.g. Laplace/Wald Interval found in most statistics texts:

– Very inaccurate with sample sizes less than around 100
– Inaccurate when the proportion is close to 0 or to 1
– Instead of containing the proportion 95% of the time, it can be as low as 50‐60% of the time.
– Typically it contains the actual proportion closer to 70% of the time, so your calculated 95% interval is really a 70% confidence interval.

SLIDE 6

Exact Confidence Intervals

  • Unlike Wald intervals, these work even for small sample sizes.
  • Computationally intensive.
  • Conservative:

– If you calculate a 95% exact confidence interval, it is guaranteed to contain the proportion at least 95 times out of 100. In fact this interval would contain the proportion closer to 99% of the time.
– This makes the interval wider than needed.
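To illustrate why the exact interval is computationally heavier, here is a rough sketch. It assumes "exact" means the Clopper-Pearson interval, which inverts the binomial tail probabilities; the function names and the bisection search are choices made for this example, not something the slides specify.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for a binomial(n, p) variable."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def exact_interval(successes, n, level=0.95):
    """Clopper-Pearson style interval found by bisection on the binomial tails."""
    alpha = 1 - level

    def bisect(tail, increasing):
        lo, hi = 0.0, 1.0
        for _ in range(60):                      # 60 halvings is ample precision
            mid = (lo + hi) / 2
            if (tail(mid) < alpha / 2) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p where P(X >= successes) equals alpha / 2
    lower = 0.0 if successes == 0 else bisect(
        lambda p: 1 - binom_cdf(successes - 1, n, p), True)
    # Upper bound: the p where P(X <= successes) equals alpha / 2
    upper = 1.0 if successes == n else bisect(
        lambda p: binom_cdf(successes, n, p), False)
    return lower, upper

print(exact_interval(7, 10))
```

Every bisection step re-sums the binomial distribution, which is the cost the slide refers to.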

SLIDE 7

Adjusted Wald Intervals

  • Add 2 successes and 2 failures for 95% confidence intervals and then use the Wald formula.

– Works well for small sample sizes
– Works well when the proportion is close to 1 or to 0

  • The number of successes/failures to add depends on the confidence desired, and is derived from the critical value z of the normal distribution for that level of confidence: add z²/2 successes and z²/2 failures (for 95%, 1.96²/2 ≈ 1.92, which rounds to 2).

– The critical value for 90% is 1.645
– The critical value for 95% is 1.96
– The critical value for 99% is 2.576

SLIDE 8

Wald Interval


Adjusted Wald Interval
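The two intervals can be sketched side by side. This example uses the standard textbook forms: Wald is p ± z·sqrt(p(1 − p)/n), and adjusted Wald adds z²/2 successes and z² trials before applying the same formula. The function names and the clamping of the bounds to the range 0 to 1 are assumptions of this sketch.

```python
from statistics import NormalDist

def z_critical(level=0.95):
    """Two-sided critical value from the normal distribution."""
    return NormalDist().inv_cdf(0.5 + level / 2)

def wald_interval(successes, n, level=0.95):
    z = z_critical(level)
    p = successes / n
    margin = z * (p * (1 - p) / n) ** 0.5
    return max(0.0, p - margin), min(1.0, p + margin)

def adjusted_wald_interval(successes, n, level=0.95):
    z = z_critical(level)
    n_adj = n + z ** 2                         # about 4 extra trials at 95%
    p_adj = (successes + z ** 2 / 2) / n_adj   # about 2 extra successes at 95%
    margin = z * (p_adj * (1 - p_adj) / n_adj) ** 0.5
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# 10 of 10 users completed the task
print(wald_interval(10, 10))           # (1.0, 1.0): a zero-width interval
print(adjusted_wald_interval(10, 10))  # lower bound drops well below 1
```

The 10-out-of-10 case shows the small-sample problem: the plain Wald interval collapses to a single point, while the adjusted version still reports uncertainty.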

SLIDE 9

Confidence intervals for rating scale questions

How difficult was this task (Likert scale)?

  • Code the scale data: e.g., from very difficult = 1 to very easy = 7 for a 7‐point Likert scale.
  • Compute mean and standard deviation
  • Determine the critical value from the t‐distribution (table lookup).

– t‐distribution takes sample size into account

  • Compute t‐confidence interval

SLIDE 10

t‐confidence Interval

  • Interval extends one margin of error on each side of the mean, so it is 2 margins of error wide:

(mean ‐ (margin of error)) to (mean + (margin of error))

  • Margin of error:

(critical value from t‐distribution) x (standard error)

  • Standard error is how much the sample mean can fluctuate given a sample size (standard deviation divided by square root of sample size)

– Standard error has to do with the sample mean
– Standard deviation has to do with the raw data

  • Confidence interval calculated from sample mean, standard error, sample size, critical value from t‐distribution (table lookup based on sample size and desired confidence level)
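The steps above can be sketched as follows. The ratings are made-up illustration data, the function name is hypothetical, and the critical value is passed in by the caller to mirror the table lookup; 2.776 is the standard two-tailed 95% value for 4 degrees of freedom (sample size 5).

```python
from statistics import mean, stdev

def t_confidence_interval(data, t_critical):
    """Confidence interval around the mean, given a t critical value."""
    n = len(data)
    sample_mean = mean(data)
    standard_error = stdev(data) / n ** 0.5   # sample standard deviation / sqrt(n)
    margin = t_critical * standard_error
    return sample_mean - margin, sample_mean + margin

ratings = [4, 5, 6, 5, 5]                     # hypothetical 7-point scale responses
low, high = t_confidence_interval(ratings, t_critical=2.776)
print(round(low, 2), round(high, 2))          # 4.12 5.88
```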

SLIDE 11

t‐confidence intervals


Excel 2013: T.INV.2T()

SLIDE 12

Statistical Analyses on Ordinal Data

  • Problem: scale data is ordinal data; many people believe it is wrong to use it for statistical analysis.

  • Many experts believe it is OK to perform statistical analysis with it (including t‐test, analysis of variance, factor analysis); you just have to make sure you don’t draw any conclusions that assume ratio or interval data.

– Ex: Average response on design A is a 4 (e.g., “I like the design”), and on design B it is a 2 (“I don’t really like the design”). Assume a t‐test indicates the difference is statistically significant.

  • You can ONLY claim there is a consistent difference between the responses.
  • You CANNOT claim that design A is twice as good as design B; this is a ratio data claim.
  • You CANNOT claim that the difference between the 4 and 2 is equal to what a difference between 4 and 6 would be; this is an interval claim.

SLIDE 13

Confidence intervals for continuous questions

How long does it take to do task X?

  • Task time data tends to be positively skewed, not a symmetrical distribution.
  • We need a better measure of the center of the distribution than the mean.
  • Median may be a better center.
  • Problems:

– Variability based on the number of samples: with an odd number it is the middle point, with an even number it is the average of the 2 middle points. With small sample sizes it can jump around a lot when just a few more samples are added.
– Bias: with small samples the median of completion times tends to consistently overestimate the population median, whereas any mean is just as likely to overestimate as underestimate the population mean.

  • Better choice for small samples: Geometric mean

– Sauro/Lewis found for sample sizes < 25, geometric mean has less bias than mean or median.
– To compute geometric mean:

1. Convert raw data to natural log
2. Find mean of transformed values
3. Convert back by exponentiation
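The three steps translate directly into code. This is a minimal sketch with a hypothetical function name; it assumes all task times are positive, since the natural log is undefined otherwise.

```python
import math

def geometric_mean(times):
    """Geometric mean via the log-transform steps listed above."""
    logs = [math.log(t) for t in times]   # 1. convert raw data to natural log
    mean_log = sum(logs) / len(logs)      # 2. find mean of transformed values
    return math.exp(mean_log)             # 3. convert back by exponentiation

print(geometric_mean([1, 10, 100]))       # approximately 10
```

For the same data the arithmetic mean is 37, which shows how much less the geometric mean is pulled by one long task time.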

SLIDE 14

Log transforming confidence intervals

  • Generate the confidence interval using the natural logs

– Compute the mean of the natural logs of the raw data (this is the natural log of the geometric mean) and the standard deviation of the natural logs.
– Use these numbers as in the t‐confidence intervals to compute the log of the confidence interval.
– Exponentiate these values to get the confidence interval.
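A sketch of the log-transformed interval, under the same assumptions as a t-confidence interval: the caller supplies the t critical value from a table, the task times are positive, and the function name and sample data are hypothetical. The value 4.303 is the standard two-tailed 95% critical value for 2 degrees of freedom.

```python
import math
from statistics import mean, stdev

def log_t_confidence_interval(times, t_critical):
    """Interval around the geometric mean, computed on the natural-log scale."""
    logs = [math.log(t) for t in times]
    mean_log = mean(logs)                        # log of the geometric mean
    standard_error = stdev(logs) / len(logs) ** 0.5
    margin = t_critical * standard_error
    # Exponentiate the log-scale bounds to return to the original units
    return math.exp(mean_log - margin), math.exp(mean_log + margin)

task_times = [math.e, math.e ** 2, math.e ** 3]  # logs are 1, 2, 3
low, high = log_t_confidence_interval(task_times, t_critical=4.303)
print(low, high)
```

Because the bounds are exponentiated, the interval is not symmetric around the geometric mean in the original units.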

SLIDE 15

ln‐based transform confidence intervals

SLIDE 16

Median confidence intervals

  • If the sample size is > 25, compute a confidence interval around the median using the z‐distribution (also called the normal distribution).

  • Similar computation to t‐confidence intervals:

(sample size) x (0.5) ± ((critical value from z‐distribution) x (standard error))

– The two results are positions in the sorted data; the values at those positions are the bounds of the interval.
– 0.5 is for the median calculation; 0.75 could be used instead for the 75th percentile (the value higher than 75% of all the values), or any other percentile
– Standard error is

square root of: ((sample size) x (0.5) x (1 ‐ 0.5))

  • Again, 0.5 is for the median and any other percentile can be used
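The rank-based calculation can be sketched as below. Rounding both positions up to the next whole rank and clamping them to the data range are conventions assumed for this example, and the function name and test data are hypothetical.

```python
import math

def percentile_confidence_interval(data, z=1.96, p=0.5):
    """Confidence interval for a percentile (median when p = 0.5).

    Computes n*p +/- z*sqrt(n*p*(1-p)) as ranks in the sorted data and
    returns the values found at those ranks.
    """
    ordered = sorted(data)
    n = len(ordered)
    standard_error = math.sqrt(n * p * (1 - p))
    lower_rank = math.ceil(n * p - z * standard_error)   # rounding up is an assumed convention
    upper_rank = math.ceil(n * p + z * standard_error)
    lower_rank = max(lower_rank, 1)                      # keep ranks inside the data
    upper_rank = min(upper_rank, n)
    return ordered[lower_rank - 1], ordered[upper_rank - 1]

times = list(range(1, 31))                    # 30 hypothetical task times: 1..30
print(percentile_confidence_interval(times))  # (10, 21)
```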

SLIDE 17

Using a median with binomial distribution to estimate confidence intervals
