Recruiting and crowdsourcing

Michelle Mazurek

Some slides adapted from Lorrie Cranor


Warmup: Diary study activity

  • In groups of 2-3
  • Plan a diary/ESM study and brainstorm potential pitfalls


Recruiting

  • Spectrum from convenience sample to true random (probabilistic) sampling

– There is convenient, and then there is convenient

  • “Snowball” sampling

– Ask people to refer their friends


HCI recruiting, in practice

  • People on campus (ugh)
  • Ask people you know to spread via social media (not great)
  • Flyering / community mailing lists (maybe?)
  • Craigslist or similar
  • Crowdsourcing services (discussed further below)
  • Web panels (discussed further below)
  • Essentially no probabilistic sampling

When is (relative) convenience OK?

  • Questions where demographics/background really don’t matter (pretty rare)
  • Interviews/experiments that require a local visit

– Not just students
– Demographic/skills blocking!

  • Study population is hard to access

CROWDSOURCED STUDIES (ALSO ONLINE IN GENERAL)


What is crowdsourcing?

  • Merriam-Webster: “The process of obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees or suppliers”
  • Academic Daren Brabham: “online, distributed problem-solving and production model.”


In our context

  • Finding study participants online
  • Service handles details of recruitment, payment, etc.
  • (Much of what’s here also applies to large-scale online studies outside a crowdsourcing service, except the payment/recruitment part)


Why crowdsource?

  • Large numbers of participants

– Without complicated logistics
– From around the country, or the world

  • Easily controlled conditions (sort of!)
  • Relatively inexpensive

Why not crowdsource?

  • No direct observation of participants
  • Limited follow-ups
  • Some participants will enter garbage (always)
  • Specific demographics participate

– Younger, more technical than general population
– Better than recruiting all students!
– Usually worse than, e.g., Craigslist recruiting


Participant problems

  • Attempted repeaters

– Especially if you pay too much

  • Entering garbage / not paying attention

– Finish as quickly as possible

  • Discussion in forums

– What about deception?

  • Terms of service may limit request types

Participant solutions

  • Collect a lot of data

– Noise distributed across conditions

  • Use cookies, IP tracking, worker IDs
  • Ensure there is no “shortcut”
  • Use attention check questions, repeats

– Carefully designed and placed
– Do NOT use “trick” questions, especially well-known ones

  • Screening and training (Mitra paper)
  • Monitor forums
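The first few of these mitigations can be sketched in code. A minimal illustration (the data format and field names are hypothetical, not from the slides) of flagging repeat worker IDs and scoring attention-check questions:

```python
from collections import Counter

def flag_repeaters(responses):
    """Return worker IDs that appear in more than one submission.

    `responses` is a list of dicts with a "worker_id" key --
    a hypothetical log format; adapt to your own infrastructure.
    """
    counts = Counter(r["worker_id"] for r in responses)
    return {wid for wid, n in counts.items() if n > 1}

def passes_attention_checks(response, checks, max_misses=0):
    """`checks` maps a question id to the answer an attentive reader gives."""
    misses = sum(1 for qid, expected in checks.items()
                 if response["answers"].get(qid) != expected)
    return misses <= max_misses
```

Per the slide, the checks themselves should be carefully designed and placed; the scoring logic is the easy part.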

Logistics: Infrastructure

  • Directly within MTurk

– Easiest, limited feature selection

  • Redirect to survey software

– UMD Qualtrics subscription
– Well coordinated, but not great for non-survey things

  • Redirect to your own server

– Best option for complicated studies
– But requires design / management
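For the redirect options, one common pattern (sketched here; the function names are illustrative, not from the slides) is to have your server issue each participant a one-time completion code, which they paste back into MTurk so that one completion maps to one payment:

```python
import secrets

issued = set()  # codes handed out but not yet redeemed

def finish_study():
    """Issue a random code when the participant reaches the final page."""
    code = secrets.token_hex(4).upper()
    issued.add(code)
    return code

def validate_submission(code):
    """Accept each code at most once, so one completion = one payment."""
    if code in issued:
        issued.remove(code)
        return True
    return False
```

In a real deployment the issued codes would live in a database keyed to worker IDs, not in memory.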


Online infrastructure more generally: What can you measure?

  • Time spent
  • Window focus
  • Copy-paste behavior
  • Device type and browser version
  • Other JavaScript things, etc.
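Once logged, these measurements can double as data-quality screens. For example, a simple heuristic (an illustration, not from the slides; the cutoff factor is an assumption to tune per study) for flagging implausibly fast completions:

```python
from statistics import median

def too_fast(durations, factor=0.33):
    """Flag submissions far faster than the median completion time.

    `durations` maps worker id -> seconds spent on the study.
    Anything under factor * median is treated as suspicious.
    """
    cutoff = factor * median(durations.values())
    return {wid for wid, t in durations.items() if t < cutoff}
```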

Other useful features

  • Screen and reject workers

– Location, quality rating, etc.

  • Send notifications (e.g. to come back for part 2)
  • Prevent repeated workers in the same task

– May need multiple tasks per study

  • On average, 100 participants / day

– Starts faster, slows down, repost
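On MTurk specifically, screening by location or quality rating is expressed as qualification requirements attached to the HIT. A sketch of what one might pass as `QualificationRequirements` to boto3's `create_hit` (the two IDs are MTurk's documented built-in system qualifications for worker locale and approval rate; double-check against the current API reference before relying on them):

```python
# Built-in MTurk system qualification type IDs (from the MTurk API docs).
WORKER_LOCALE = "00000000000000000071"
PERCENT_APPROVED = "000000000000000000L0"

def screening_requirements(country="US", min_approval=95):
    """Restrict a HIT to workers in `country` with a high approval rate."""
    return [
        {
            "QualificationTypeId": WORKER_LOCALE,
            "Comparator": "EqualTo",
            "LocaleValues": [{"Country": country}],
        },
        {
            "QualificationTypeId": PERCENT_APPROVED,
            "Comparator": "GreaterThanOrEqualTo",
            "IntegerValues": [min_approval],
        },
    ]
```

These would be passed as `mturk.create_hit(..., QualificationRequirements=screening_requirements())`.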


Kang et al., SOUPS 2014

  • Survey on privacy attitudes and behavior
  • Administered to:

– Representative Pew phone sample (775 Internet users)
– U.S. Turkers (182)
– Indian Turkers (128)


Results: Demographics

  • Turk sample younger, more male, more educated

– Indian Turk even more so


Results: U.S. general vs. U.S. Turk

  • Turkers more likely to seek anonymity
  • Turkers more likely to hide content selectively

– Except, general more likely to hide from hackers

  • Younger, more educated say more data on them is available; take more steps to hide

  • Turkers more concerned about privacy, more likely to say anonymity should be possible


Results: U.S. Turk vs. India Turk

  • Indians say more personal data is online
  • U.S. more likely to seek anonymity

– Indians more likely to hide from boss/supervisor

  • Indians less concerned about privacy, more satisfied with gov’t protection

  • Fewer Indians say anonymity should be possible

– More comfortable with monitoring to prevent terrorism


Beyond Turk

  • Prolific: New but quickly growing

– May have broader demographics

  • Crowdflower
  • crowdsource.com
  • Samasource
  • Google consumer surveys

– Only 10 questions, no experiments!
– But more probabilistic


Web panels vs. Turk

  • Panels: Qualtrics, SSI, others
  • Recruit to match request demographics
  • More expensive (priced by demographic difficulty)

– You pay panel; they pay participant

  • Can be useful to find non-Turk demographics
  • Lots of biases in who joins panel, who responds

Panels vs. Turk vs. the U.S.

  • New work specific to security/privacy questions
  • Panel did worse than Turk in many ways
  • Key problem seems to be about tech knowledge rather than about demographics per se


Resources

  • https://experimentalturk.wordpress.com/
  • http://www.behind-the-enemy-lines.com/