cs160. cs160. valkyriesavage.com valkyriesavage.com data analysis - - PowerPoint PPT Presentation



slide-1
SLIDE 1

data analysis

July 22, 2015 Valkyrie Savage

cs160.valkyriesavage.com

slide-2
SLIDE 2

thanks for the feedback!

slide-3
SLIDE 3

Data Analysis

41057893@N02 on flickr

slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6

Start by counting

5680 trials total

  • normal: mean time 976.1 ms, mean errors 2.560
  • bubble: mean time 809.4 ms, mean errors 0.287

slide-7
SLIDE 7

Start by counting

  • 71 users completed condition normal, size 10
    mean time: 1123.43 ms, mean errors: 3.408
    median time: 1039 ms, median errors: 3
  • 70 users completed condition normal, size 25
    mean time: 826.64 ms, mean errors: 1.700
    median time: 785 ms, median errors: 1
  • 71 users completed condition bubble, size 10
    mean time: 852.75 ms, mean errors: 0.296
    median time: 804 ms, median errors: 0
  • 72 users completed condition bubble, size 25
    mean time: 766.58 ms, mean errors: 0.014
    median time: 725 ms, median errors: 0
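A minimal sketch of how per-condition counts, means, and medians like these could be tabulated. The record layout and the five sample trials are hypothetical, not the course data set.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical trial records: (user, cursor, target_size, time_ms, errors).
trials = [
    ("u1", "normal", 10, 1100, 3),
    ("u1", "normal", 10, 1000, 4),
    ("u2", "normal", 10, 1200, 2),
    ("u1", "bubble", 10, 800, 0),
    ("u2", "bubble", 10, 900, 1),
]

def summarize(records):
    """Group trials by (cursor, size) and report counts and central tendency."""
    groups = defaultdict(list)
    for user, cursor, size, time_ms, errors in records:
        groups[(cursor, size)].append((user, time_ms, errors))
    summary = {}
    for key, rows in groups.items():
        summary[key] = {
            "users": len({user for user, _, _ in rows}),
            "trials": len(rows),
            "mean_time": mean(t for _, t, _ in rows),
            "median_time": median(t for _, t, _ in rows),
            "mean_errors": mean(e for _, _, e in rows),
        }
    return summary

summary = summarize(trials)
for condition, stats in summary.items():
    print(condition, stats)
```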

slide-8
SLIDE 8

Descriptive Statistics

Continuous data:

  • Central tendency: mean, median, mode
  • Dispersion: range (max − min), standard deviation
  • Shape of distribution: skew, kurtosis

Categorical data:

  • Frequency distributions

Mean:

$$\mu = \frac{1}{N} \sum_{i=1}^{N} X_i$$

Standard deviation:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} \left( X_i - \mu \right)^2}{N}}$$
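The mean and standard deviation formulas on this slide translate directly into code. This is a sketch with made-up values. Note that it divides by N (the population form used on the slide), not by N − 1.

```python
from math import sqrt

def mean(xs):
    """Arithmetic mean: sum of the values divided by their count N."""
    return sum(xs) / len(xs)

def population_std(xs):
    """Standard deviation with N in the denominator, as on the slide."""
    mu = mean(xs)
    return sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))

# Made-up sample values chosen so the results are round numbers.
values = [2, 4, 4, 4, 5, 5, 7, 9]
mu = mean(values)
sigma = population_std(values)
print(mu, sigma)
```

Python's standard library offers the same quantities as `statistics.mean` and `statistics.pstdev`.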

slide-9
SLIDE 9

Understanding Your Data

Exploratory Data Analysis (EDA): Look at your data from different perspectives to get better intuition for it.

  • Show the raw data!
  • Use different visualizations: histograms, scatterplots, box plots, …
slide-10
SLIDE 10

1D Scatter Plot with Jitter

slide-11
SLIDE 11

1D Scatter Plot with Jitter colored by condition

slide-12
SLIDE 12

1D Scatter Plot with Jitter separated by condition

slide-13
SLIDE 13

Cleaning Data

  • Don’t discard data just because it doesn’t fit your expectation! Maybe your assumptions were wrong.
  • In online experiments, discarding extreme outliers can make sense if you believe they reflect users not following normal task protocol (e.g., multitasking in a reaction-time study).
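One possible way to surface extreme outliers for inspection is an interquartile-range fence. This is a sketch only: the rule, the cutoff `k=3.0`, and the sample times are assumptions, not the procedure used in the lecture. It flags values rather than deleting them, so the decision to discard stays with the analyst.

```python
from statistics import quantiles

def flag_extreme(times_ms, k=3.0):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]; k=3.0 is an assumed cutoff."""
    q1, _, q3 = quantiles(times_ms, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [t for t in times_ms if t < low or t > high]

# Hypothetical trial times; the last one looks like a distracted user.
times_ms = [700, 750, 800, 820, 850, 900, 950, 12000]
flagged = flag_extreme(times_ms)
print(flagged)
```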

slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16
slide-17
SLIDE 17
slide-18
SLIDE 18

Median vs. Mean

  • For normally distributed data, mean = median.
  • Many data sets gathered online are strongly skewed.
  • Outliers pull the mean to the right/left.
  • Median is more robust!
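A small illustration of that robustness, using made-up trial times: adding one extreme value moves the mean a long way while the median barely changes.

```python
from statistics import mean, median

# Made-up trial times in milliseconds.
typical = [700, 750, 800, 850, 900]
with_outlier = typical + [12000]  # one very slow trial

mean_before, median_before = mean(typical), median(typical)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)
print(mean_after, median_after)
```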

slide-19
SLIDE 19
slide-20
SLIDE 20

Power Law Distributions

From C. Shirky, Here Comes Everybody

slide-21
SLIDE 21

Power Law Distribution

Source: Ed Chi

slide-22
SLIDE 22
slide-23
SLIDE 23
slide-24
SLIDE 24
slide-25
SLIDE 25
slide-26
SLIDE 26
slide-27
SLIDE 27

Confidence interval

The margin of error is the plus-or-minus figure usually reported in newspaper or television opinion poll results. The confidence interval is the range it defines around the sample value (the two terms are often used interchangeably).

  • If the margin of error is 4 points and 47% of your sample picks an answer, you can be "sure" (at the stated confidence level, e.g. 95%) that, had you asked the entire relevant population, between 43% (47−4) and 51% (47+4) would have picked that answer.
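For a sample proportion, the plus-or-minus figure is commonly approximated as z·sqrt(p(1−p)/n). The sketch below assumes a 95% confidence level (z ≈ 1.96) and a hypothetical sample of 600 respondents, a size chosen because it yields roughly the ±4 points of the example.

```python
from math import sqrt

def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion p with n respondents."""
    return z * sqrt(p * (1 - p) / n)

p = 0.47   # 47% of the sample picked the answer
n = 600    # hypothetical sample size (an assumption, not from the slide)
moe = margin_of_error(p, n)
low, high = p - moe, p + moe
print(f"{p:.0%} ± {moe:.1%} -> [{low:.1%}, {high:.1%}]")
```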

slide-28
SLIDE 28

Sample size

1000 people in population, 95% confidence level

  • Confidence interval of ±5: need to sample 278 people
  • Confidence interval of ±1: …you need to sample 906 people

https://www.qualtrics.com/blog/determining-sample-size/
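These sample sizes can be reproduced with a standard formula: Cochran's sample size with a finite population correction. This is a sketch under the usual assumptions (z ≈ 1.96 for 95% confidence, worst-case proportion p = 0.5). It is not necessarily how the linked page computes its numbers.

```python
from math import ceil

def sample_size(population, margin, z=1.96, p=0.5):
    """Required sample size for a proportion, corrected for a finite population."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2          # infinite-population size
    return ceil(n0 / (1 + (n0 - 1) / population))    # finite population correction

n_5 = sample_size(1000, 0.05)
n_1 = sample_size(1000, 0.01)
print(n_5, n_1)
```

Tightening the interval from ±5 to ±1 raises the requirement from about a quarter of the population to almost all of it.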

slide-29
SLIDE 29

Effect Sizes: Time

  • Normal vs. Bubble cursor at target size 10:
    1123 ms vs. 852 ms: Bubble cursor 31% faster
  • Normal vs. Bubble cursor at target size 25:
    826 ms vs. 766 ms: Bubble cursor 8% faster
  • Target size for normal cursor:
    1123 ms vs. 826 ms: Larger targets 35% faster
  • Target size for Bubble cursor:
    852 ms vs. 766 ms: Larger targets 11% faster
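The percentages above appear to be the ratio of the slower time to the faster time, minus one. That reading is an inference from the numbers, not something the slide states, and the slide's figures differ from the computed ones by up to a point because of rounding.

```python
def percent_faster(slow_ms, fast_ms):
    """Speed-up of the faster condition, expressed relative to its own time."""
    return (slow_ms / fast_ms - 1) * 100

comparisons = {
    "bubble vs. normal, size 10": (1123, 852),
    "bubble vs. normal, size 25": (826, 766),
    "size 25 vs. 10, normal cursor": (1123, 826),
    "size 25 vs. 10, bubble cursor": (852, 766),
}
speedups = {name: percent_faster(slow, fast) for name, (slow, fast) in comparisons.items()}
for name, pct in speedups.items():
    print(f"{name}: {pct:.1f}% faster")
```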

slide-30
SLIDE 30

Effect Sizes: Error

  • Normal vs. Bubble cursor, target size 10:
    3.4 vs. 0.3 errors per 20 trials: normal cursor makes 1033% more errors
  • Normal vs. Bubble cursor, target size 25:
    1.7 vs. 0.3 errors per 20 trials: normal cursor makes 466% more errors
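A reduction cannot exceed 100%, so figures such as 1033% and 466% describe how many more errors the normal cursor produced relative to the bubble cursor. The sketch below computes both directions from the error counts given on this slide.

```python
def percent_more(high, low):
    """Increase of `high` over `low`, relative to `low`."""
    return (high - low) / low * 100

def percent_fewer(high, low):
    """Reduction from `high` to `low`, relative to `high` (at most 100%)."""
    return (high - low) / high * 100

more_10, fewer_10 = percent_more(3.4, 0.3), percent_fewer(3.4, 0.3)
more_25, fewer_25 = percent_more(1.7, 0.3), percent_fewer(1.7, 0.3)
print(f"size 10: normal {more_10:.0f}% more, bubble {fewer_10:.0f}% fewer")
print(f"size 25: normal {more_25:.0f}% more, bubble {fewer_25:.0f}% fewer")
```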

slide-31
SLIDE 31

break!

slide-32
SLIDE 32

Interaction Effects

The relationship between one independent variable (IV) and the dependent variable (DV) depends on the level of another IV.

slide-33
SLIDE 33

Example of Interactions

Group problem solving

  • Independent variable: Leadership

[example from Martin 04]

slide-34
SLIDE 34

Example of Interactions

Group problem solving

  • Independent variable: Leadership
  • Independent variable: Group size

[example from Martin 04]

slide-35
SLIDE 35

Example of Interactions

Group problem solving

  • Change in time due to leadership is the same regardless of group size

[example from Martin 04]

slide-36
SLIDE 36

Example of Interactions

Group problem solving

  • Change in time due to leadership is the same regardless of group size
  • Change in time due to group size is the same regardless of leadership
  • Independent variables do not interact

[example from Martin 04]

slide-37
SLIDE 37

Example of Interactions

  • Multiple IVs affect DV non-additively
  • Change in time due to leadership differs with changes in group size
  • Independent variables do interact

[example from Martin 04]
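An interaction can be read off a 2×2 table of cell means as a difference of differences. As an illustration, the sketch below applies that idea to the cursor study's mean times from slide 7 (not to the Martin example). It only describes the sample means and says nothing about significance.

```python
# Mean times in ms from the "Start by counting" slide, keyed by (cursor, target size).
mean_time_ms = {
    ("normal", 10): 1123.43,
    ("normal", 25): 826.64,
    ("bubble", 10): 852.75,
    ("bubble", 25): 766.58,
}

# Effect of cursor type at each target size.
cursor_effect_small = mean_time_ms[("normal", 10)] - mean_time_ms[("bubble", 10)]
cursor_effect_large = mean_time_ms[("normal", 25)] - mean_time_ms[("bubble", 25)]

# If the IVs did not interact, the two effects would be (nearly) equal.
interaction = cursor_effect_small - cursor_effect_large
print(cursor_effect_small, cursor_effect_large, interaction)
```

Here the bubble cursor saves about 271 ms on small targets but only about 60 ms on large ones, so the benefit of the cursor depends on target size.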

slide-38
SLIDE 38
slide-39
SLIDE 39
slide-40
SLIDE 40

Population versus Sample

slide-41
SLIDE 41

Are the Results Meaningful?

Hypothesis testing

  • Hypothesis: Manipulation of IV affects DV in some way
  • Null hypothesis: Manipulation of IV has no effect on DV
  • Null hypothesis assumed true unless statistics allow us to reject it

Statistical significance (p value)

  • Probability of seeing results at least this extreme through chance variation alone, if the null hypothesis were true
  • p < 0.05 usually considered significant (sometimes p < 0.01)
  • Means that results this extreme would occur less than 5% of the time under the null hypothesis (not that the null hypothesis has a < 5% chance of being true)

Statistical tests

  • T-test (1 factor, 2 levels)
  • Correlation
  • ANOVA (1 factor, > 2 levels, multiple factors)
  • MANOVA (> 1 dependent variable)
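One way to make the meaning of a p value concrete is an exact permutation test, which is not one of the tests listed on the slide. If the null hypothesis holds, group labels are interchangeable, so the p value is the fraction of relabelings that produce a difference at least as large as the observed one. The data below are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference of group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-9:
            extreme += 1
        count += 1
    return extreme / count

# Hypothetical trial times in ms for two cursor conditions.
normal = [980, 1010, 1050, 1100, 1150]
bubble = [790, 800, 820, 850, 870]
p = permutation_p_value(normal, bubble)
print(f"p = {p:.4f}")
```

With complete separation between the groups, only 2 of the 252 possible relabelings are as extreme as the observed split, so the null hypothesis would be rejected at the 5% level.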

slide-42
SLIDE 42

T-test

Compare means of 2 groups

  • Null hypothesis: No difference between means

Assumptions

  • Samples are normally distributed (very robust in practice)
  • Population variances are equal, for between-subjects tests (reasonably robust for differing variances)
  • Individual observations in samples are independent (important!)
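A sketch of the between-subjects t statistic under the equal-variance assumption listed above, using only the standard library and made-up data. Turning t into a p value requires the t distribution, which a statistics package would provide, so that step is left out here.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Student's two-sample t statistic and degrees of freedom (equal variances assumed)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, df

# Made-up measurements for two independent groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t, df = pooled_t(group_a, group_b)
print(t, df)
```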

slide-43
SLIDE 43

ANOVA

  • Single-factor analysis of variance (ANOVA): compare means for 3 or more levels of a single independent variable
  • Multi-way analysis of variance (n-way ANOVA): compare more than one independent variable; can find interactions between independent variables
  • Repeated-measures analysis of variance (RM-ANOVA): use when > 1 observation per subject (within-subjects experiment)
  • Multivariate analysis of variance (MANOVA): compare between more than one dependent variable

ANOVA tests whether means differ, but does not tell us which means differ; for this we must perform pairwise t-tests.
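A minimal sketch of the single-factor case: the F statistic is the variance between group means divided by the variance within groups. The three groups are made-up data, and the p value lookup (which needs the F distribution) is omitted.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Made-up data for three levels of one independent variable.
f, df_b, df_w = one_way_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({df_b},{df_w}) = {f:.2f}")
```

The result is reported in the same F(df_between, df_within) form as the course example, where 5680 trials in 4 cells give 5676 within-group degrees of freedom.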

slide-44
SLIDE 44

t-test? ANOVA? n-way ANOVA? MANOVA?

slide-45
SLIDE 45

Our Example

Two-Way ANOVA (Cursor, Size) for time:

  • Main effect for cursor: F(1,5676) = 424.9, p < 0.001, statistically significant
  • Main effect for size: F(1,5676) = 556.2, p < 0.001, statistically significant
  • Interaction cursor x size: F(1,5676) = 169.5, p < 0.001, statistically significant

slide-46
SLIDE 46

Our Example

Two-Way ANOVA (Cursor, Size) for errors:

  • Main effect for cursor: F(1,564) = 314.04, p < 0.001, statistically significant
  • Main effect for size: F(1,564) = 44.65, p < 0.001, statistically significant
  • Interaction cursor x size: F(1,564) = 43.40, p < 0.001, statistically significant

slide-47
SLIDE 47

errors in Bubble Cursor case only

F(1,2038) = 0.009, p=0.92 – NOT significant

slide-48
SLIDE 48

What does p > 0.05 mean?

Not statistically significant (at the 5% level).

Does that mean that the two conditions are equivalent? No! We did observe differences. But we can’t be confident they weren’t due to chance.

slide-49
SLIDE 49
slide-50
SLIDE 50

Draw Conclusions

  • What is the scope of the finding?
  • Are there other parameters at play? (Internal validity)
  • Does the experiment reflect real use? (External validity)

slide-51
SLIDE 51

Summary

Quantitative evaluations

  • Repeatable, reliable evaluation of interface elements
  • To control properly, usually limited to low-level issues
  • e.g., menu selection method A is faster than method B

Pros/Cons

  • Objective measurements
  • Good internal validity -> repeatability
  • But real-world implications may be difficult to foresee
  • Statistically significant results don’t imply real-world importance (e.g., 3.05 s versus 3.00 s for menu selection)

slide-52
SLIDE 52

assignments!

collegedegrees360 on flickr

slide-53
SLIDE 53

Midterm Exam

  • Midterm July 27 (Monday!!)
  • 80-minute exam: be here on time!
  • Covers lectures & studios up to now (plus readings, assignments, …)
  • Closed book. No notes, no tech.

slide-54
SLIDE 54

midterm reviews: today in section, tomorrow in studio

slide-55
SLIDE 55

GRP05 : interactive prototype

due Monday after midterm (3 August)

slide-56
SLIDE 56

PRG03

framer license details are on Piazza

slide-57
SLIDE 57

another judge : Anca Mosoiu

founder of community tech hub in oakland

slide-58
SLIDE 58

:’(

slide-59
SLIDE 59

data analysis

July 22, 2015 Valkyrie Savage

cs160.valkyriesavage.com