Experimental design: Spring 2017, Michelle Mazurek



SLIDE 1

Experimental design

Spring 2017 Michelle Mazurek

Some content adapted from Bilge Mutlu, Vibha Sazawal, Howard Seltman

SLIDE 2

Administrative

  • No class Tuesday
  • Homework 1
  • Plug for Tanu Mitra talk + grad student session

SLIDE 3

Today’s class

  • Quick HCI background
  • Defining an experiment
  • Threats to validity

SLIDE 4

QUICK HCI BACKGROUND

SLIDE 5

The old computing is about what computers can do; the new computing is about what people can do. – Ben Shneiderman

SLIDE 6

How would you define HCI?

  • What are the key goals/questions?

SLIDE 7

Some questions

  • How to DESIGN a computer system?
  • How to EVALUATE a computer system?
  • What are the PSYCHOLOGICAL THEORIES governing interaction with technology?
  • How does emerging technology create SOCIETAL CHANGE?
  • How does technology intersect with ECONOMICS and POLICY?

SLIDE 8

DEFINING AN EXPERIMENT

SLIDE 9

The goal of an experiment is …

  • “The goal of any research design is to arrive at clear answers to questions of interest [about the populations] while expending a minimum of resources.” – Ramsey and Schafer
  • Avoid threats that reduce interpretability or limit generalizability

– Control what you can’t prevent

SLIDE 10

So what is an experiment?

  • Ideally, testing causality

– Change in X causes a change in Y

  • Experimental setup:

– Multiple levels of the independent variable (“conditions”, “treatments”)
– Control for other things that might matter

SLIDE 11

What do we mean by control?

  • Two basic options:
  • Control variables: Same for every condition

– Plus: actually controlling
– Minus: hard to get them all. Generalizability?

  • Random variables: Randomly assign to conditions

– Law of large numbers: Unimportant differences will fall out in the noise
– Minus: How large is large?
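
One way to get a feel for "how large is large" is to simulate random assignment. The sketch below is illustrative only: it assumes a single uncontrolled trait (being an "expert", at a made-up rate of 30%), two conditions, and a hypothetical helper named mean_imbalance. It estimates how far apart the two groups end up on that trait, on average, at several sample sizes.

```python
import random

def mean_imbalance(n_participants, expert_rate=0.3, n_trials=1000, seed=0):
    """Average absolute gap in expert share between two randomly assigned groups.

    expert_rate and the two-condition setup are illustrative assumptions.
    """
    rng = random.Random(seed)
    half = n_participants // 2
    total_gap = 0.0
    for _ in range(n_trials):
        # 1 = expert, 0 = non-expert: a participant trait we did not control
        traits = [1 if rng.random() < expert_rate else 0
                  for _ in range(n_participants)]
        rng.shuffle(traits)  # random assignment: first half -> A, rest -> B
        group_a, group_b = traits[:half], traits[half:]
        share_a = sum(group_a) / len(group_a)
        share_b = sum(group_b) / len(group_b)
        total_gap += abs(share_a - share_b)
    return total_gap / n_trials

for n in (10, 40, 160):
    print(n, round(mean_imbalance(n), 3))
```

The printed gap should shrink as the sample grows, which is the law of large numbers at work. With only a handful of participants per condition, a sizable accidental imbalance is common.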

SLIDE 12

Third option: Constrained variables

  • Also sometimes called blocking
  • Distribute variation across conditions:

– Per condition: 1/3 novice, 1/3 intermediate, 1/3 expert

  • Pluses: Works in smaller samples
  • Minuses: What is not being controlled for?
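
Blocking can be sketched as "shuffle within each block, then deal participants out to conditions in turn". The function name blocked_assignment, the participant IDs, and the pool below are all hypothetical. This is a minimal illustration, not a prescribed procedure.

```python
import random
from collections import defaultdict

def blocked_assignment(participants, conditions, seed=None):
    """Spread each block (e.g. expertise level) evenly across conditions.

    participants: list of (participant_id, block) pairs.
    Returns a dict mapping participant_id -> condition.
    """
    rng = random.Random(seed)
    by_block = defaultdict(list)
    for pid, block in participants:
        by_block[block].append(pid)

    assignment = {}
    for pids in by_block.values():
        rng.shuffle(pids)  # random order within the block
        # Random starting condition, so leftovers do not always land on the first one
        start = rng.randrange(len(conditions))
        for i, pid in enumerate(pids):
            assignment[pid] = conditions[(start + i) % len(conditions)]
    return assignment

# Made-up pool: 4 novices, 4 intermediates, 4 experts; two conditions
pool = [(f"{level}{i}", level)
        for level in ("novice", "intermediate", "expert")
        for i in range(4)]
result = blocked_assignment(pool, ["A", "B"], seed=1)
```

Within each block, who goes where is still random. Only the block's split across conditions is fixed, so any variable you did not block on is left to chance.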

SLIDE 13

THREATS TO VALIDITY

SLIDE 14

1. Internal validity

  • Experiment was properly designed

– And can show causality!

  • Threatened by confounds

– Multiple things that vary between conditions

SLIDE 15

Avoiding confounds

  • Randomize condition assignment

– Best option whenever possible

  • Change only one variable at a time

– Use more conditions for more variables

  • Use blinding

– Expectation is a confound! (on both sides)

  • Use a control group
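
"More conditions for more variables" usually means a full factorial design: one condition per combination of levels, so each variable's effect can be separated from the others. The sketch below builds such a design and randomizes assignment to it. The factors (a password meter and a minimum length) and the helper randomize are invented for illustration.

```python
import itertools
import random

# Hypothetical factors for a password study; names and levels are made up
factors = {
    "meter": ["none", "visual"],
    "min_length": [8, 12],
}

# Full factorial: one condition per combination of levels (2 x 2 = 4 here)
conditions = [dict(zip(factors, combo))
              for combo in itertools.product(*factors.values())]

def randomize(participant_ids, n_conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: i % n_conditions for i, pid in enumerate(ids)}

# Maps each of 20 participant IDs to an index into `conditions`
assignment = randomize(range(20), len(conditions), seed=42)
```

Dealing round-robin after the shuffle keeps group sizes equal while leaving who lands in which condition to chance. Here the combination with no meter and the shorter minimum length can serve as the control group.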

SLIDE 16

Threats to internal validity

  • Learning/ordering effects (much more later)
  • Placebo effect
  • Self-selection
  • Dropouts
  • Errors in measurement
  • Bad randomization

SLIDE 17

2. External validity
  • What population does your sample represent?

– Race, gender, age, nationality, education, others

  • What environment does your sample represent?

– Carefully controlled study vs. real world

SLIDE 18

Sampling

  • Best: Truly random sample of the population

– Hard, expensive

  • Worst: Convenience

– Undergrads who want free pizza
– Other grad students in my lab
– My Facebook friends

  • Most good studies are somewhere in between
  • CS studies sometimes lean toward convenience
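
To see why a convenience sample can mislead, compare it against a simple random sample on a synthetic population. Every number below is invented for the illustration: the population size, the age ranges, and the share of undergraduates are assumptions, not data from any study.

```python
import random
import statistics

rng = random.Random(7)

# Synthetic population of (age, is_undergrad) pairs; all values are made up
population = (
    [(rng.randint(18, 22), True) for _ in range(1000)]
    + [(rng.randint(23, 80), False) for _ in range(9000)]
)

def mean_age(sample):
    """Mean of the age field over a list of (age, is_undergrad) pairs."""
    return statistics.mean(age for age, _ in sample)

# Simple random sample: every member of the population is equally likely
random_sample = rng.sample(population, 200)

# Convenience sample: only the people who are easy to recruit
undergrads = [person for person in population if person[1]]
convenience_sample = rng.sample(undergrads, 200)

print("population: ", round(mean_age(population), 1))
print("random:     ", round(mean_age(random_sample), 1))
print("convenience:", round(mean_age(convenience_sample), 1))
```

The random sample's mean age should land near the population's, while the convenience sample is stuck in the undergraduate range no matter how many people are recruited. A larger sample does not fix a biased sampling frame.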

SLIDE 19

Environment

  • Does your experiment reflect real-world conditions for the thing you are testing?
  • In medicine, taking medications in correct dosage and on time
  • In security research, security as a secondary task
  • In general: time constraints, reality of synthetic task, competing incentives, etc.

SLIDE 20

Threats to external validity

  • Poor sampling
  • Non-response/self-selection, dropout
  • Unrealistic environment

SLIDE 21

Internal vs. external

  • In general, tension between them. Why?
  • More control variables

– Internal: UP
– External: DOWN

SLIDE 22

3. Construct validity

  • Often difficult to directly measure the concept(s) of interest

– Independent and dependent variables

  • What do our metrics measure?

– Is it what we intended?

SLIDE 23

Analyzing your construct(s)

  • Is there a gold standard?

– Use it
– Or, correlate your construct with it

  • What else should it correlate with?
  • Is it reliable?

– Inter-rater, test-retest

  • Floor and ceiling effects
  • Potential for circular reasoning
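
Inter-rater reliability for categorical codes is often summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a plain-Python version for two raters. The labels and ratings are made up, and a real analysis might prefer an established statistics package.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: both raters pick the same label independently
    expected = sum(counts_a[label] * counts_b[label]
                   for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)

# Made-up codes from two hypothetical raters labeling ten excerpts
a = ["trust", "trust", "risk", "risk", "trust",
     "other", "risk", "trust", "other", "risk"]
b = ["trust", "risk", "risk", "risk", "trust",
     "other", "risk", "trust", "trust", "risk"]
print(round(cohens_kappa(a, b), 2))
```

A kappa of 1 means perfect agreement and 0 means no better than chance. Test-retest reliability asks a similar question, comparing the same measure taken at two points in time rather than by two raters.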