experimental design
Experimental design Spring 2017 Michelle Mazurek Some content - PowerPoint PPT Presentation


  1. Experimental design Spring 2017 Michelle Mazurek Some content adapted from Bilge Mutlu, Vibha Sazawal, Howard Seltman

  2. Administrative • No class Tuesday • Homework 1 • Plug for Tanu Mitra talk + grad student session

  3. Today’s class • Quick HCI background • Defining an experiment • Threats to validity

  4. QUICK HCI BACKGROUND

  5. “The old computing is about what computers can do; the new computing is about what people can do.” – Ben Shneiderman

  6. How would you define HCI? • What are the key goals/questions?

  7. Some questions • How to DESIGN a computer system? • How to EVALUATE a computer system? • What are the PSYCHOLOGICAL THEORIES governing interaction with technology? • How does emerging technology create SOCIETAL CHANGE? • How does technology intersect with ECONOMICS and POLICY?

  8. DEFINING AN EXPERIMENT

  9. The goal of an experiment is … • “The goal of any research design is to arrive at clear answers to questions of interest [about the populations] while expending a minimum of resources.” – Ramsey and Shafer • Avoid threats that reduce interpretability or limit generalizability • Control what you can’t prevent

  10. So what is an experiment? • Ideally, testing causality – Change in X causes a change in Y • Experimental setup: – Multiple levels of the independent variable (“conditions”, “treatments”) – Control for other things that might matter

  11. What do we mean by control? • Two basic options: • Control variables: Same for every condition – Plus: actually controlling – Minus: hard to get them all. Generalizability? • Random variables: Randomly assign to conditions – Law of large numbers: Unimportant differences will fall out in the noise – Minus: How large is large?
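
The random-assignment option above can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the participant IDs, the condition names, and the helper `assign_randomly` are all hypothetical. Shuffling first and then dealing round-robin keeps group sizes equal (to within one) while leaving who lands where to chance.

```python
import random


def assign_randomly(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions."""
    rng = random.Random(seed)  # seeded only so the example is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {condition: [] for condition in conditions}
    for i, participant in enumerate(shuffled):
        assignment[conditions[i % len(conditions)]].append(participant)
    return assignment


# Hypothetical participant IDs and condition names.
participants = [f"P{i:02d}" for i in range(1, 13)]
groups = assign_randomly(participants, ["control", "treatment"], seed=42)
for condition, members in groups.items():
    print(condition, members)
```

With only 12 participants, chance imbalances between groups are still quite possible, which is the "how large is large?" concern on the slide.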

  12. Third option: Constrained variables • Also sometimes called blocking • Distribute variation across conditions: – Per condition: 1/3 novice, 1/3 intermediate, 1/3 expert • Pluses: Works in smaller samples • Minuses: What is not being controlled for?
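
One way to sketch blocking, assuming each participant's expertise level is known before assignment: randomize separately within each level, so every condition ends up with the same mix. The helper `assign_blocked`, the participant IDs, and the condition names are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def assign_blocked(participants, conditions, seed=None):
    """Randomize within each block (level) so blocks are spread evenly.

    participants: list of (participant_id, level) pairs.
    """
    rng = random.Random(seed)
    by_level = defaultdict(list)
    for participant_id, level in participants:
        by_level[level].append(participant_id)

    assignment = {condition: [] for condition in conditions}
    for level in sorted(by_level):
        members = by_level[level]
        rng.shuffle(members)  # chance decides who goes where within the block
        for i, participant_id in enumerate(members):
            condition = conditions[i % len(conditions)]
            assignment[condition].append((participant_id, level))
    return assignment


# Hypothetical data: 18 participants, 6 at each expertise level.
levels = ["novice", "intermediate", "expert"]
participants = [(f"P{i:02d}", levels[i % 3]) for i in range(18)]
blocked = assign_blocked(participants, ["A", "B", "C"], seed=1)
for condition, members in blocked.items():
    print(condition, sorted(level for _, level in members))
```

Here each condition receives two participants per level, matching the 1/3, 1/3, 1/3 split on the slide. If a block's size is not a multiple of the number of conditions, this simple version always gives the extras to the first conditions; a real study would likely rotate or randomize that.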

  13. THREATS TO VALIDITY

  14. 1. Internal validity • Experiment was properly designed – And can show causality! • Threatened by confounds – Multiple things that vary between conditions

  15. Avoiding confounds • Randomize condition assignment – Best option whenever possible • Change only one variable at a time – Use more conditions for more variables • Use blinding – Expectation is a confound! (on both sides) • Use a control group
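
"Use more conditions for more variables" can be read as crossing the variables, so that any two conditions being compared differ in exactly one of them. The sketch below enumerates a full factorial design under that reading; the two factors and their levels are made up for illustration and do not come from the slides.

```python
from itertools import product

# Hypothetical factors and levels for a password study.
factors = {
    "password_policy": ["basic", "complex"],
    "strength_meter": ["absent", "present"],
}

# One condition per combination of levels (2 x 2 = 4 conditions).
conditions = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

for number, condition in enumerate(conditions, start=1):
    print(number, condition)
```

The cost is that the number of conditions multiplies with every added variable, and so does the number of participants needed.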

  16. Threats to internal validity • Learning/ordering effects (much more later) • Placebo effect • Self-selection • Dropouts • Errors in measurement • Bad randomization

  17. 2. External validity • What population does your sample represent? – Race, gender, age, nationality, education, others • What environment does your sample represent? – Carefully controlled study vs. real world

  18. Sampling • Best: Truly random sample of the population – Hard, expensive • Worst: Convenience – Undergrads who want free pizza – Other grad students in my lab – My Facebook friends • Most good studies are somewhere in between • CS studies sometimes lean toward convenience
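
The contrast between the two ends of that spectrum can be sketched as follows, assuming a complete list of the population (a sampling frame) is available, which is exactly the hard and expensive part. The frame, the IDs, and the helper `simple_random_sample` are hypothetical.

```python
import random


def simple_random_sample(frame, k, seed=None):
    """Draw k distinct members; every member of the frame is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, k)


# Hypothetical sampling frame of 1000 people.
frame = [f"student{i:04d}" for i in range(1000)]

# Convenience sample: whoever is easiest to reach (here, the first 20 listed).
convenience = frame[:20]

# Simple random sample: chosen by chance from the whole frame.
random_sample = simple_random_sample(frame, 20, seed=7)

print(convenience[:5])
print(random_sample[:5])
```

Both samples have the same size; only the random one gives every member of the population the same chance of being included.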

  19. Environment • Does your experiment reflect real-world conditions for the thing you are testing? • In medicine, taking medications in correct dosage and on time • In security research, secondary task • In general: time constraints, reality of synthetic task, competing incentives, etc.

  20. Threats to external validity • Poor sampling • Non-response/self-selection, dropout • Unrealistic environment

  21. Internal vs. external • In general, tension between them. Why? • More control variables – Internal: UP – External: DOWN

  22. 3. Construct validity • Often difficult to directly measure the concept(s) of interest – Independent and dependent variables • What do our metrics measure? – Is it what we intended?

  23. Analyzing your construct(s) • Is there a gold standard? – Use it – Or, correlate your construct with it • What else should it correlate with? • Is it reliable? – Inter-rater, test-retest • Floor and ceiling effects • Potential for circular reasoning
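
For the inter-rater reliability question, one common statistic is Cohen's kappa, which corrects the raw agreement between two raters for the agreement expected by chance. The slides do not name a specific measure, so treat this as one possible choice; the function `cohens_kappa` and the ratings below are made up for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: from each rater's own label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout, so kappa is
        # undefined (0/0). Returning 1.0 is a choice made for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two raters for ten open-ended responses.
rater_a = ["y", "y", "n", "n", "y", "n", "y", "y", "n", "n"]
rater_b = ["y", "y", "n", "y", "y", "n", "n", "y", "n", "n"]
print(round(cohens_kappa(rater_a, rater_b), 3))
```

In this example the raters agree on 8 of 10 items, chance agreement is 0.5, and kappa comes out at about 0.6, noticeably lower than the raw 80% agreement.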
