Field studies and ecological validity Michelle Mazurek 1 Todays - - PowerPoint PPT Presentation



slide-1
SLIDE 1

1

Field studies and ecological validity

Michelle Mazurek

slide-2
SLIDE 2

2

Today’s class

  • Field studies (pluses and minuses)
  • Ecological validity

– Ethics

  • Crowdsourced studies (MTurk and friends)
  • Project pitches: Next week
slide-3
SLIDE 3

3

FIELD STUDIES

slide-4
SLIDE 4

4

Why (not) a field study?

  • Better ecological validity

– Validate a lab study result

  • Because you can’t get the data any other way
  • Logistically difficult
  • Limited piloting / not easy to adjust

– One shot at your participant pool

  • Expensive (money and time)

Plan extremely carefully!

slide-5
SLIDE 5

5

PhishGuru in the real world

  • Anti-phishing training delivered when users follow a phishing link
  • Training, phishing, and legitimate emails delivered to 300 employees in a Portuguese company

slide-6
SLIDE 6

6

PhishGuru in the real world

  • Was a field study necessary here? Why?

– How could it have been designed differently?

  • What logistical problems were encountered?

– Design choices the authors later regretted?
– How did they threaten the study’s outcome?

slide-7
SLIDE 7

7

ECOLOGICAL VALIDITY

Case study: Measuring password strength

slide-8
SLIDE 8

8

Passwords research is everywhere

  • Comm ACM, 1979
  • Computers and Security, 1989
  • ACIS 2004 (Campbell and Bryant)
  • CCS 2005 (Narayanan and Shmatikov)
  • WWW 2007 (Florencio and Herley)
  • CCS 2010 (Weir et al.)
  • CHI 2011 (Komanduri et al.)
  • NDSS 2012 (Castelluccia et al.)
  • IEEE S&P 2012 (Bonneau)

slide-9
SLIDE 9

9

… but good data is hard to find

  • Small data sets
  • Experimental rather than field data
  • Self-reported surveys
  • Leaked data of questionable validity
  • Minimal-value accounts
  • No access to plaintext passwords

Are the results generalizable?

slide-10
SLIDE 10

10

Fahl et al.: Password study validity

  • Goal: Compare lab study, online study, and real passwords
  • Methods:

– Several thousand passwords (plaintext, anonymized)
– Invite same pool to online or lab study
– Security priming, or not
– Manual analysis for similarity

  • 583 online, 63 lab participants
slide-11
SLIDE 11

11

Results: Validity

Validity (%)      Online   Lab   Priming   Non-priming   Total
Highly valid        46      49     47          44          46
Somewhat valid      23      32     24          24          24
Invalid             31      18     29          32          30

  • Overall, experimental data can be useful

– Self-reporting of realistic behavior can help

  • No significant difference due to priming
  • Lab slightly but significantly better than online
slide-12
SLIDE 12

12

Critique the study design

  • What was measured?

– Comments about the manual analysis approach?

  • Priming vs. non-priming

– How are the instructions different in the 2 cases?
– Would you have given different instructions?
– Are there other conditions you would test?

slide-13
SLIDE 13

13

Implications of the results

  • Do these findings apply to other studies in the security/privacy area? How?

slide-14
SLIDE 14

14

Passwords for an entire university

  • 25,000 real, high-value passwords from CMU

  • Contextual data – logs, demographics, survey
  • What factors correlate with password strength?

– New (to passwords) statistical methods
– Find new results, confirm prior results

  • What to do when you don’t have field data?

– Comparison with leaked and study data

slide-15
SLIDE 15

15

What are CMU passwords?

  • 25,459 accounts for faculty, staff, and students

– Plus 17,104 deactivated accounts

  • Single-sign-on for email, financial, grades, registration, health, etc.
  • Password requirements:

– Minimum 8 characters
– Upper, lower, digit, symbol
– Dictionary check (241,497 words)
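The password requirements above can be sketched as a small checker. This is a minimal illustration, not CMU's actual implementation: the function name `meets_policy`, the three-word `SAMPLE_DICTIONARY`, and the exact form of the dictionary check (comparing the lowercased, letters-only form of the password against the word list) are all assumptions made for the example.

```python
import string

# Hypothetical stand-in for the 241,497-word dictionary; the real list is not in the slides.
SAMPLE_DICTIONARY = {"password", "carnegie", "pittsburgh"}


def meets_policy(password, dictionary=SAMPLE_DICTIONARY):
    """Return True if the password satisfies the requirements listed on the slide."""
    # Minimum 8 characters.
    if len(password) < 8:
        return False

    # At least one character from each class: upper, lower, digit, symbol.
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    has_digit = any(c.isdigit() for c in password)
    has_symbol = any(c in string.punctuation for c in password)
    if not (has_upper and has_lower and has_digit and has_symbol):
        return False

    # Assumption: the dictionary check strips non-letters, lowercases the
    # remainder, and rejects the password if that string is a dictionary word.
    letters_only = "".join(c for c in password if c.isalpha()).lower()
    return letters_only not in dictionary
```

Under these assumptions, `Password1!` is rejected by the dictionary check even though it contains all four character classes, while `12345678` fails on character classes alone.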

slide-16
SLIDE 16

16

Strength metric: Guessability

  • How many guesses to reach each password?

– Subject to guessing algorithm and training data

  • Result: guess number or beyond the cutoff

– Cutoff = 380 trillion guesses (runs in about 1 day)

Example:

Password          Guess number
12345678          4
Password178       1.4 × 10^6
jn%fKXsl!8@Df     Beyond cutoff
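To make the idea of a guess number concrete, here is a toy sketch: a password's guess number is its 1-based position in the ordered stream of guesses an attacker would try, and anything not reached by the cutoff is reported as beyond cutoff. The five-entry `toy_guesses` list is hypothetical; a real guessing algorithm generates its order from training data, which this sketch does not attempt to model.

```python
CUTOFF = 380 * 10**12  # 380 trillion guesses, the cutoff stated on the slide


def guess_number(password, ordered_guesses, cutoff=CUTOFF):
    """Return the 1-based position of password in the guess stream.

    Returns None if the password is not reached within `cutoff` guesses
    (the "beyond cutoff" case).
    """
    for position, guess in enumerate(ordered_guesses, start=1):
        if position > cutoff:
            break
        if guess == password:
            return position
    return None


# Hypothetical guess order, for illustration only.
toy_guesses = ["password", "123456", "qwerty", "12345678", "letmein"]

print(guess_number("12345678", toy_guesses))       # position in the toy list
print(guess_number("jn%fKXsl!8@Df", toy_guesses))  # None: never guessed
```

Because the result depends entirely on the order of `ordered_guesses`, the same password can receive very different guess numbers under different guessing algorithms or training data.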

slide-17
SLIDE 17

17

Comparing password sets

  • Examining CMU password policy

– Use conforming subset for all leaked data

  • Online studies

– MTsim: Closest match to real CMU experience
– MTcomp8: Similar password requirements

  • Leaked: plaintext

– RockYou, Yahoo!, CSDN

  • Leaked: hashed and cracked

– Gawker, StratFor

slide-18
SLIDE 18

18

Comparing sets – Guessability

Leaked hashed/cracked: Very easy to guess

[Figure: percent guessed (0% to 60%) vs. guess number (1E4 to 1E13), extensive-knowledge condition. Series: Gawker (Gcomp8), Stratfor (SFcomp8), RockYou (RYcomp8), MTsim, CMU (CMUactive), MTcomp8, CSDN (CSDNcomp8), Yahoo (Ycomp8).]
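A guessability curve of this kind plots, for each guess-number threshold, the share of a password set guessed at or before that threshold. The sketch below shows the calculation on made-up data: the `toy_guess_numbers` list is purely illustrative and does not correspond to any of the sets in the figure, and `None` stands for a password beyond the cutoff.

```python
def percent_guessed(guess_numbers, thresholds):
    """Return {threshold: percent of passwords guessed within that many guesses}.

    guess_numbers holds one entry per password: an integer guess number,
    or None if the password was not guessed before the cutoff.
    """
    total = len(guess_numbers)
    if total == 0:
        raise ValueError("need at least one password")
    curve = {}
    for threshold in thresholds:
        cracked = sum(
            1 for g in guess_numbers if g is not None and g <= threshold
        )
        curve[threshold] = 100.0 * cracked / total
    return curve


# Illustrative data only; not taken from the study.
toy_guess_numbers = [4, 1_400_000, None, 5_000, None,
                     20_000_000_000, None, None, 900, None]
thresholds = [10**4, 10**7, 10**10, 10**13]  # the x-axis ticks in the figure

for threshold, pct in percent_guessed(toy_guess_numbers, thresholds).items():
    print(f"{threshold:.0e}: {pct:.0f}% guessed")
```

Computing one such curve per password set, with the same guessing algorithm and thresholds, is what makes the sets comparable on a single plot.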

slide-19
SLIDE 19

19

Comparing sets – Guessability

Leaked plaintext: RockYou close to CMU, others much tougher

[Figure: percent guessed (0% to 60%) vs. guess number (1E4 to 1E13), extensive-knowledge condition. Series: Gawker (Gcomp8), Stratfor (SFcomp8), RockYou (RYcomp8), MTsim, CMU (CMUactive), MTcomp8, CSDN (CSDNcomp8), Yahoo (Ycomp8).]

slide-20
SLIDE 20

20

Comparing sets – Guessability

Online studies: Both close, MTcomp8 closer

[Figure: percent guessed (0% to 60%) vs. guess number (1E4 to 1E13), extensive-knowledge condition. Series: Gawker (Gcomp8), Stratfor (SFcomp8), RockYou (RYcomp8), MTsim, CMU (CMUactive), MTcomp8, CSDN (CSDNcomp8), Yahoo (Ycomp8).]

slide-21
SLIDE 21

21

Other metrics for comparison

  • Composition: length, character classes
  • Structures
  • Entropy (Shay et al., SOUPS 2010)
  • Frequency distribution
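Several of the metrics above are simple to compute per password. The sketch below covers composition (length, digits, symbols, number of character classes), a structure string, and a frequency count of structures across a set. The U/L/D/S encoding and the function names are assumptions chosen for illustration; the entropy estimate of Shay et al. is not reproduced here.

```python
from collections import Counter


def char_class(c):
    """Map a character to U (upper), L (lower), D (digit), or S (anything else)."""
    if c.isupper():
        return "U"
    if c.islower():
        return "L"
    if c.isdigit():
        return "D"
    return "S"


def composition(password):
    """Return basic composition metrics and the structure string for one password."""
    classes = [char_class(c) for c in password]
    return {
        "length": len(password),
        "digits": classes.count("D"),
        "symbols": classes.count("S"),
        "class_count": len(set(classes)),
        "structure": "".join(classes),
    }


def structure_frequencies(passwords):
    """Count how often each structure string occurs in a password set."""
    return Counter(composition(p)["structure"] for p in passwords)


print(composition("Password178"))
```

Averaging `length` or `digits` over each password set gives per-set summaries that can be compared across sets.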
slide-22
SLIDE 22

22

Comparing sets – Length

Overall: Online studies closest across metrics

(Full results in the paper)

[Figure: two bar-chart panels, Length (number of characters, axis 9.5 to 11.5) and Digits (number of digits). Sets: CMU, MTsim, MTcomp8, RockYou, Yahoo, CSDN, Stratfor, Gawker, Survey, MTbasic8, MTdictionary8, MTbasic16. Value labels shown on the plot: 12.6, 8.0, 17.9.]

slide-23
SLIDE 23

23

Discussion

  • Critique the study design

– Challenges of field studies

  • Are there lessons for other HFSP studies?
slide-24
SLIDE 24

24

Quick note on ethics

  • All three studies we discussed today have significant ethical implications

  • We’ll revisit this in a couple of weeks

– Any comments/questions in the meantime?

slide-25
SLIDE 25

33

Homework 2

  • Suggesting study designs

– We’ll talk about more options Tuesday

  • Deploy and analyze an MTurk survey

– Part, but only part, can be done with partners
– There are 11 of you … potentially one triple

  • Read directions carefully!

slide-26
SLIDE 26

34

Project pitches

  • 5 min each; slides optional

– What is the research question?
– Preliminary high-level methodology
– Ideally: Quick overview of related work / why it’s novel

  • We’ll vote to narrow down and then form teams
  • Final teams by 2/24
  • Proposals due 3/3

– Two pages; details posted on course website