SLIDE 1

Motivation of Crowd Workers

Does it matter?

Babak Naderi

Quality and Usability Lab, Telekom Innovation Laboratories Technische Universität Berlin

SLIDE 2

Agenda

  • Theoretical Model
  • Measurement of Motivation
  • Influence of Task Design
    – Trapping questions in survey
    – Trapping questions in speech quality assessment
  • Tools from QUL
  • Discussion

SLIDE 3

Theoretical Model

→ Based on the Self-Determination Theory of motivation by Deci & Ryan (1985)

SLIDE 4

Measurement of Motivation

Development of Crowdsourcing Work Motivation Scale

Aim: Develop a scale targeted at crowd workers

Procedure: Pilot study + main study + validation study

Participants (main study): 405 crowd workers (US); 284 responses remained after reliability check

Preliminary item set contained 33 items

Sample items:

– I am enjoying doing tasks in MTurk very much. [Intrinsic]
– Because this job is a part of my life. [Identified]
– Because I want people (requesters or others) to like me. [Introjected]
– Because it allows me to earn money. [External]
– I don’t know why, we are provided with unrealistic working conditions. [Amotivation]
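To make the subscale structure concrete, here is a minimal scoring sketch: each worker's Likert ratings are averaged within a subscale. The item-to-subscale mapping and the 5-point response format are illustrative assumptions, not the published 16-item scoring key.

```python
# Minimal sketch of subscale scoring: average each worker's Likert
# ratings within a subscale. The mapping below is hypothetical; the
# published 16-item key is not reproduced here.
from statistics import mean

SUBSCALES = {
    "intrinsic":   [1, 2, 3],          # hypothetical item IDs
    "identified":  [4, 5, 6],
    "introjected": [7, 8, 9],
    "external":    [10, 11, 12, 13],
    "amotivation": [14, 15, 16],
}

def score_worker(responses):
    """responses: dict mapping item ID -> Likert rating (e.g. 1-5)."""
    return {name: mean(responses[i] for i in items)
            for name, items in SUBSCALES.items()}

# Example: one worker who answered every item with 4.
print(score_worker({i: 4 for i in range(1, 17)}))
```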

SLIDE 5

Measurement of Motivation

Development of Crowdsourcing Work Motivation Scale

[Figure: Demographics of the participants — age, gender (Male 43%, Female 57%), education level, household income, employment status, MTurk Master status, HIT approval rate, and perceived weekly hours working on MTurk.]

SLIDE 6

Measurement of Motivation

Development of Crowdsourcing Work Motivation Scale

40% of the data was used for training; 16 items remained in the final questionnaire.

Model fit criteria:

– Adjusted χ² by the degrees of freedom (df): χ²/df ≤ 3
– Root mean square error of approximation (RMSEA) ≤ .08
– Comparative fit index (CFI) ≥ .95
– Item reliability (IR) ≥ .4
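As a sketch of how these cut-offs would be applied in practice, the helpers below check fit statistics (as produced by any CFA/SEM tool) against the thresholds above; the numeric values in the example are placeholders.

```python
# Minimal sketch: apply the confirmatory-factor-analysis cut-offs from
# this slide to fit statistics produced elsewhere (values are placeholders).

def meets_fit_criteria(chi2, df, rmsea, cfi):
    """chi2/df <= 3, RMSEA <= .08, CFI >= .95 -- the model-level thresholds."""
    return chi2 / df <= 3 and rmsea <= 0.08 and cfi >= 0.95

def reliable_items(item_reliability, cutoff=0.4):
    """Keep items meeting the IR >= .4 criterion (dict: item -> reliability)."""
    return [item for item, r in item_reliability.items() if r >= cutoff]

print(meets_fit_criteria(chi2=250.0, df=98, rmsea=0.06, cfi=0.96))  # True
print(reliable_items({"q1": 0.55, "q2": 0.31}))                     # ['q1']
```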

SLIDE 7

Influence of Task Design

SLIDE 8

Reliability of Responses in Crowdsourcing

Effect of “Being Observed”

Two reliability check methods were used:

  • 1. Trapping Questions – noticeable to workers:

“I believe two plus five does not equal nine.”

  • 2. Inconsistency Score – unnoticeable to workers:

Some randomly selected items are asked twice in the questionnaire.
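A minimal sketch of how the noticeable check can be enforced: submissions that answer the trapping item incorrectly are dropped. The field names and the expected answer are assumptions for illustration.

```python
# Minimal sketch: filter out submissions that fail the noticeable
# trapping question. Field names and the expected answer are assumed.

TRAP_ITEM = "I believe two plus five does not equal nine."
EXPECTED = "agree"  # 2 + 5 = 7, so agreeing is the attentive answer

def passes_trap(submission):
    return submission.get("trap_answer") == EXPECTED

submissions = [
    {"worker": "w1", "trap_answer": "agree"},
    {"worker": "w2", "trap_answer": "disagree"},  # rejected as inattentive
]
print([s["worker"] for s in submissions if passes_trap(s)])  # ['w1']
```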

SLIDE 9

Reliability of Responses in Crowdsourcing

Effect of “Being Observed”

Two studies were conducted on MTurk:

Study 1: 74 questions, 256 workers, unnoticeable method
Study 2: 97 questions, 405 workers, noticeable + unnoticeable methods

Groups (based on Approval Rate):

Study 1 — Range [%]:    [0,70]    (70,80]   (80,90]   (90,95]   (95,100]
Study 1 — # workers:    11        2         29        110       70
Study 2 — Range [%]:    [0,85]    (85,90]   (90,95]   (95,98]   (98,100]
Study 2 — # workers:    58        80        92        93        78

SLIDE 10

Reliability of Responses in Crowdsourcing

Effect of “Being Observed”

Inconsistency Score (unnoticeable method):

$$IS = \sum_{i=1}^{N} \frac{w_i}{\sum_{k=1}^{N} w_k}\,(I_i - I'_i)^2, \qquad w_i = 1 + \log\frac{M}{m_i}$$

$$T = Q_3(IS) + 1.5 \cdot IQR(IS)$$

Where:

– N: number of duplicated items
– I_i: a reliability-check item
– I'_i: duplicate of item I_i
– M: number of participants
– m_i: number of participants with a deviated answer to item i
– T: cut-off threshold
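A minimal sketch of this score, following the reconstructed formulas above: each worker receives a weighted sum of squared differences between check items and their duplicates, and workers whose score exceeds T = Q3 + 1.5·IQR are flagged. The matrix layout is an assumption.

```python
# Minimal sketch of the reconstructed inconsistency score and cut-off.
# first / second: (M workers x N duplicated items) matrices holding each
# worker's answer to item I_i and to its duplicate I'_i.
import numpy as np

def inconsistency_scores(first, second):
    M, _ = first.shape
    deviated = (first != second).sum(axis=0)      # m_i per item
    m = np.maximum(deviated, 1)                   # guard against log(M/0)
    w = 1 + np.log(M / m)                         # w_i = 1 + log(M / m_i)
    w /= w.sum()                                  # normalize the weights
    return ((first - second) ** 2 * w).sum(axis=1)

def cutoff(scores):
    """T = Q3(IS) + 1.5 * IQR(IS); workers scoring above T are flagged."""
    q1, q3 = np.percentile(scores, [25, 75])
    return q3 + 1.5 * (q3 - q1)

# Example: 5 workers x 3 duplicated items on a 5-point scale.
rng = np.random.default_rng(0)
a = rng.integers(1, 6, size=(5, 3))
b = a.copy(); b[0, :] += 2                        # worker 0 is inconsistent
s = inconsistency_scores(a, b)
print(s, s > cutoff(s))
```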

SLIDE 11

Reliability of Responses in Crowdsourcing

Effect of “Being Observed”

Inconsistency Scores of Study 1 (Mdn = 1) were significantly different from the Inconsistency Scores of Study 2 (Mdn = .65), p < .001, r = −.31.
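A sketch of how such a comparison can be computed, assuming per-worker inconsistency scores for both studies are available (the arrays below are placeholders); the effect size r is recovered from the normal approximation as |Z| / sqrt(n1 + n2).

```python
# Minimal sketch: Mann-Whitney U test between the two studies' scores,
# with effect size r = Z / sqrt(n1 + n2). Data arrays are placeholders.
import numpy as np
from scipy.stats import mannwhitneyu, norm

study1 = np.random.default_rng(1).exponential(1.0, 256)   # placeholder
study2 = np.random.default_rng(2).exponential(0.65, 405)  # placeholder

u, p = mannwhitneyu(study1, study2, alternative="two-sided")
z = norm.isf(p / 2)                  # |Z| recovered from the two-sided p
r = z / np.sqrt(len(study1) + len(study2))
print(f"Mdn1={np.median(study1):.2f}, Mdn2={np.median(study2):.2f}, "
      f"U={u:.0f}, p={p:.3g}, |r|={r:.2f}")
```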

SLIDE 12

Reliability of Responses in Crowdsourcing

Effect of Trapping Question Strategies

Goal: Evaluate the effect of different trapping-question strategies.
Study: Compare MOS ratings from a crowdsourcing listening-only test with MOS ratings obtained in the laboratory.
Platform: Crowdee

SLIDE 13

Reliability of Responses in Crowdsourcing

Effect of Trapping Question Strategies

Database: ITU-T Rec. P.863 (# 501) competition by SwissQual AG; 50 degradation conditions × 4 stimuli = 200 stimuli to rate; 24 repetitions

Strategies (conditions):

– T0 – No Trapping Question (control group)
– T1 – Motivation Message
– T2 – Low Effort [to cheat]
– T3 – High Effort [to cheat]

General Job: Rate the quality of 5 stimuli in one task

Cf. Herzberg’s two-factor theory of job satisfaction.

SLIDE 14

[Screenshots: T0 – No trapping; T1 – Motivating message]

“This is an interruption. We – the team of Crowdee – would like to ensure that people work conscientiously and attentively on our tasks. Please select the answer ‘<x>’ to confirm your attention now.”

SLIDE 15

[Screenshots: T2 – Easy to cheat; T3 – Hard to cheat]

SLIDE 16

Reliability of Responses in Crowdsourcing

Effect of Trapping Question Strategies

[Figure: Study workflow — new crowd workers perform a qualification job (Q1) and a training job (T1), are assigned to a group, and then receive job T0 (No Trapping), T1 (Motivation Message), T2 (Low Effort), or T3 (High Effort) at the entrance and study levels.]

Participants: 179 workers, 87 female and 92 male; Mage = 27.9 y., SDage = 8.1 y.

SLIDE 17

Reliability of Responses in Crowdsourcing

Effect of Trapping Question Strategies

SLIDE 18

Reliability of Responses in Crowdsourcing

Effect of Trapping Question Strategies

Comparing 95% Confidence Intervals (CIs) of crowdsourcing studies with laboratory data:

Group                        # CIs lower   # CIs higher   # CIs overlapping   RMSD
T0 – No Trapping             17            6              27                  0.426
T1 – Motivating message      13            2              35                  0.375
T2 – Easy to cheat           17            3              30                  0.411
T3 – Hard to cheat           16            4              30                  0.390

* χ²(1, N = 50) = 5.15, p = .023
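One plausible way to reproduce this kind of table, as a sketch: per degradation condition, compare the 95% CI of the crowdsourcing MOS with that of the laboratory MOS, count lower/higher/overlapping cases, and compute the RMSD of the condition means. The input layout is an assumption.

```python
# Minimal sketch: per-condition CI comparison and RMSD between
# crowdsourcing and laboratory MOS. Input layout is an assumption:
# dicts mapping condition ID -> array of individual ratings.
import numpy as np
from scipy import stats

def ci95(ratings):
    """95% confidence interval of the mean (t distribution)."""
    m = ratings.mean()
    h = stats.sem(ratings) * stats.t.ppf(0.975, len(ratings) - 1)
    return m - h, m + h

def compare(cs, lab):
    lower = higher = overlap = 0
    sq_err = []
    for cond in cs:
        c_lo, c_hi = ci95(cs[cond])
        l_lo, l_hi = ci95(lab[cond])
        if c_hi < l_lo:   lower += 1     # crowd CI entirely below lab CI
        elif c_lo > l_hi: higher += 1    # crowd CI entirely above lab CI
        else:             overlap += 1
        sq_err.append((cs[cond].mean() - lab[cond].mean()) ** 2)
    return lower, higher, overlap, float(np.sqrt(np.mean(sq_err)))

# Example with two toy conditions.
rng = np.random.default_rng(3)
cs  = {c: rng.normal(3.0 + c, 0.8, 24) for c in range(2)}
lab = {c: rng.normal(3.2 + c, 0.5, 24) for c in range(2)}
print(compare(cs, lab))  # (lower, higher, overlapping, RMSD)
```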

SLIDE 19

Tools by QUL

SLIDE 20

Crowdee

  • Mobile micro-task crowdsourcing platform
  • Mobile workforce-on-demand:

– available in Germany, US, UK

  • Research tool for investigating

– motivation of crowd workers
– crowdsourcing platform optimization
– data quality analysis

http://crowdee.de
SLIDE 21

Crowdee

  • Creating Forms

– Free text, selections (radio buttons, checkboxes, …), info text

  • Recording

– Taking photos, recording audio and video, sensor data

  • Profiles

– Dynamic profile, temporarily expiring values

  • Job Orchestration

– Job filtering using conditions, and automatic assignment of profile values when an action happens

  • Mobile

– Notifications; collect data in the field!

http://crowdee.de
SLIDE 22

Turkmotion

  • HIT rating platform to support MTurk workers
  • Rate HITs for:

– How enjoyable is this task for you?
– How good is the payment for this task?

SLIDE 23

Discussion

Sustainability: what if all job providers use these methods? Should data collected in the laboratory be considered the baseline?

SLIDE 24

Thank you!

http://qu.tu-berlin.de
