Getting To Know The Crowd David Martin Neha Gupta Jacki ONeill Ben - - PowerPoint PPT Presentation



SLIDE 1

Getting To Know The Crowd

David Martin, Neha Gupta, Jacki O’Neill, Ben Hanrahan

SLIDE 2

Outline

  • Quick intro: Crowdsourcing and MTurk
  • Some remarks on use and ethics
  • Crowdworker studies in academic and other venues

  • Some interesting hidden features
  • Questions and practical issues for research
  • Alternative research possibilities
SLIDE 3

Crowdsourcing Definition

  • “the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call” (Howe, 2006)
  • Crowdsourcing work is in most cases labour
  • Encompasses multiple types of activity: invention, project work, creative activities, and microtasking
    – Experimental use in research, providing data services, training algorithms – computer vision, text analytics, visualisation, translation and…?
  • Amazon Mechanical Turk (MTurk) is the best known microtasking platform – 500k registered Turkers (probably 50k active)

SLIDE 4

MTurk: Home Page

SLIDE 5

The Work: Human Intelligence Tasks (HITs)

  • Image tagging, duplicate recognition, text digitization, translation, transcription, object classification, and content generation
  • Originally used by Amazon for quality control on their DBs
  • Hidden human work behind much of the Internet
  • Pay in cents for minutes’ work

SLIDE 6

Ethics of Crowdsourcing in and for Research

  • How to classify use?
    – Should be treated as work and subject to conditions operating in more conventional labour markets
        • Vast majority of crowdworkers see it as work
        • E.g. machine learning – image tagging
    – When used for experimentation, participants should be offered the same rights, protections and rewards as other research participants
        • E.g. psychological experiments, usability tests
        • Consent, duty of care, reimbursement, debriefing
    – Situation unclear since crowdsourcing has not been properly considered in employment law or by ethics committees
        • Direct consequence of global and hi-tech nature and misrepresentation of platform and workers

SLIDE 7

Breaking Down the Crowdworker Studies

  • Academic Literature
    – Computing
    – Law
    – Sociology
  • Advocacy and employment, legal, government organisations
    – World Bank, trade unions, citizen rights
  • Journalism
    – ‘I became a Turker’, crowdworker interviews, exposés, apologias, business digests

SLIDE 8

Advocacy, Government, NGOs etc.

  • World Bank: The global opportunity in online outsourcing
    – http://www.behind-the-enemy-lines.com/2015/05/the-world-bank-report-on-online-labor.html
  • National Employment Law Project: Rights on Demand – Ensuring workplace standards and worker security in the on-demand economy
    – http://www.nelp.org/content/uploads/Rights-On-Demand-Report.pdf
  • IG Metall: Crowdwork – zurück in die Zukunft? (Crowdwork: back to the future?)
    – https://www.igmetall.de/buch-crowdwork--zurueck-in-die-zukunft-14219.htm

SLIDE 9

Journalism

  • Critiques from academics in the press:
    – The Unregulated Work of Mechanical Turk, Nancy Folbre
    – http://economix.blogs.nytimes.com/2013/03/18/the-unregulated-work-of-mechanical-turk/?_r=0
  • Support from business writers:
    – On the New York Times Stupidity Over Amazon's Mechanical Turk, Tim Worstall
    – http://www.forbes.com/sites/timworstall/2013/03/19/on-the-new-york-times-stupidity-over-amazons-mechanical-turk/
  • I became a Turker stories:
    – “I make $1.45 a week and I love it”, Katharine Mieszkowski
    – http://www.salon.com/2006/07/24/turks_3/
  • Interviews with Turkers:
    – Amazon's Mechanical Turk workers protest: 'I am a human being, not an algorithm', Mark Harris
    – http://www.theguardian.com/technology/2014/dec/03/amazon-mechanical-turk-workers-protest-jeff-bezos?CMP=twt_gu

SLIDE 10

Academic Work on Crowdworkers

  • Legal issues and the legal position
  • Numbers and demographics relating to the Turkers and the market
  • Who are the crowdworkers, what do they do and think, what are their problems?

  • India and development
  • This is non-exhaustive… we welcome help!
SLIDE 11

Felstiner – The Legal Position

  • Lack of a tailored legal environment
    – Novelty of technology and market, and global reach, mean laws can be side-stepped and companies can use labour arbitrage
    – Amazon hands-off role as market facilitator
        • Minimal open regulations, patchy opaque enforcement
        • Saving admin burden, time and money
    – Categorised as independent contractors but law was designed for highly-paid professionals
        • More like radical outsourcing of piece-work
    – Comparison with the homeworking/piece-work struggles
    – CrowdFlower minimum wage lawsuit settled out of court
        • Principle in place but CrowdFlower was direct employer
SLIDE 12

Quants on Crowdworkers

Best source: Panos Ipeirotis

  • http://www.behind-the-enemy-lines.com/
  • 2010: US 46.8%, India 34.0%, Other 19.2%
    – Gender breakdown: US 2/3 women, 1/3 men; India the opposite
    – Similar figures from Ross et al. (2010), with fewer ‘others’
    – Figures for income unclear as focus on household income
    – Ross et al.: 1/3 <$10,000; Ipeirotis: US ~60% <$60,000, India 55% <$10,000
    – Education level (Bachelors): India ~50%, US ~35%
  • Now? Probably quite similar, although others have disappeared and India has dwindled %-wise
    – Up-to-date demographics (with API) available to explore
    – http://www.mturk-tracker.com/#/general

SLIDE 13

Fort et al. Goldmine or Coalmine?

  • Study by researchers in domain of NLP
  • 500k + registered users
  • Est. 5,950,000 HITs per week
  • Est. 15,059-42,912 ‘active’ Turkers
  • Est. 80% of tasks done by the 20% most active Turkers – 3,011-8,582

  • This raises sampling issues
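
The headcount range above follows directly from the slide's own estimates. A minimal sketch of the arithmetic, assuming the 20% share is rounded down to whole workers; the function and variable names are illustrative and not taken from Fort et al. The per-worker weekly HIT figures at the end are derived here from the slide's numbers and are not reported in the source.

```python
def top_share_headcount(active_workers: int, top_percent: int = 20) -> int:
    """Number of workers in the most active `top_percent` of the pool, rounded down."""
    return active_workers * top_percent // 100


# Estimates quoted on the slide.
ACTIVE_LOW, ACTIVE_HIGH = 15_059, 42_912
HITS_PER_WEEK = 5_950_000
TASK_SHARE_PERCENT = 80  # share of tasks attributed to the most active 20%

core_low = top_share_headcount(ACTIVE_LOW)
core_high = top_share_headcount(ACTIVE_HIGH)
core_hits = HITS_PER_WEEK * TASK_SHARE_PERCENT // 100

print(core_low, core_high)  # 3011 8582
print(core_hits)            # 4760000

# Derived, not from the source: average weekly HITs per core worker
# at each end of the headcount range.
hits_per_core_worker_min = core_hits / core_high
hits_per_core_worker_max = core_hits / core_low
print(round(hits_per_core_worker_min), round(hits_per_core_worker_max))
```

If a few thousand people complete most of the work, a study that recruits through ordinary HITs is likely to draw repeatedly from that same small core, which is the sampling concern the slide raises.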
SLIDE 14

‘Curve Balls’

  • Studies that try to translate work into play
    – Antin and Shaw – Social Desirability Bias
    – Kaufmann et al. – More than fun and money
    – Studies proceed from the premise that Turkers cannot be working for that level of pay, then fabricate an explanation
    – Hopefully naivety, lack of understanding
    – Turn us away from considering Turkers as workers

SLIDE 15

Qualitative Work

  • Ipeirotis – Turker comments
  • Kittur, Bernstein, Bederson, Quinn
  • Irani, Silberman and Co
    – Skype interviews, forum participation
    – Haikus, Turkers Bill of Rights
    – Turkopticon – sharing ratings
    – Dynamo – helping organisation/advocacy
  • Key Problems
    – Unfair rejection, slow payment, low pay, lack of communication, threat of suspension, requester scams, badly designed tasks, information asymmetry, lack/imbalance of power, lack of search tools/user configuration

SLIDE 16

Crowdsourcing and Development

  • Khanna et al. (2010) study of platform design for low-income workers in India
    – Barriers preventing workers: difficulties understanding the intent of tasks, complex instructions, user interface issues, and cultural differences
  • Kelsa+ project (Gawade et al. 2012)
    – Showed low-income workers with limited literacy in English and computers have the potential to develop skills when provided with access to resources

SLIDE 17

Qualitative Study of Turker Nation I

  • Turking is work → primarily motivated by earning money
  • Considerable variation in earnings but it is low wage work
    – Highest earners $15-16k per year (roughly equivalent to 40 hours per week at the US minimum wage of $7.25 per hour)
    – Some evidence of very rare Turkers on $30,000
  • Workers generally aspire to earning $7-10 per hour
    – Newbies do lower paid easy work to increase their reputation and ranking
    – Lower wages off-set against search time, amount of concentration required etc.
  • Turkers have preferences and skills
    – E.g. high volume grinding, writing, professional tasks, some multi-skilled
  • AMT as a compromise: problems accessing the regular job market or need to supplement income
    – Some housebound, others are in difficult circumstances
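
The earnings figures above can be checked with simple arithmetic. A minimal sketch, assuming a 52-week working year with no unpaid gaps (an assumption made here, not stated on the slide); the function names are illustrative.

```python
US_MINIMUM_WAGE = 7.25  # dollars per hour, as quoted on the slide
HOURS_PER_WEEK = 40     # as quoted on the slide
WEEKS_PER_YEAR = 52     # assumption: year-round work, no unpaid weeks


def annual_earnings(hourly_rate, hours_per_week=HOURS_PER_WEEK, weeks=WEEKS_PER_YEAR):
    """Gross yearly income for a steady hourly rate."""
    return hourly_rate * hours_per_week * weeks


def hourly_rate_needed(annual_income, hours_per_week=HOURS_PER_WEEK, weeks=WEEKS_PER_YEAR):
    """Hourly rate required to reach a given yearly income."""
    return annual_income / (hours_per_week * weeks)


print(annual_earnings(US_MINIMUM_WAGE))          # 15080.0, consistent with the $15-16k figure
print(annual_earnings(7), annual_earnings(10))   # 14560 20800, the $7-10 per hour aspiration
print(round(hourly_rate_needed(30_000), 2))      # 14.42, rate implied by $30,000 at 40 h/week
```

Under these assumptions the highest regular earners sit at about full-time minimum wage, and the rare $30,000 earners would need roughly double that rate or substantially longer hours.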

SLIDE 18

Qualitative Study of Turker Nation II

  • Turker Nation for information and community support
    – Share info on tools, techniques and tricks of the trade, earnings, learning
    – Generous in sharing information about good (and bad) HITs and requesters
    – Lots of off-market collaboration
  • Relationships are key:
    – Like anonymity and freedom to work for who they want, when they want
    – Value good courteous relationships with requesters
    – Fair pay for fair work (decent wages, fairness, timely payment…)
    – Respect works both ways → regular work from good requester highly prized
  • Turker Nation Turkers mostly behave ethically
    – Ripping requesters off is not endorsed on the forum
    – Duty to their fellow members to be honest
  • Hope that by sharing information and acting cooperatively they can have a stronger effect on regulating the market (setting standards and wages)
  • Work is invisible and work to make the turking work is doubly invisible
SLIDE 19

Qualitative Study of Indian Turkers I

  • Family and community collaboration
    – Word of mouth, Facebook groups etc.
    – Sharing accounts, market in trading accounts, training, CS companies
  • Minimum English and some keyboard skills required
    – Lower skilled do simple and intuitive tasks
    – Danger of misunderstandings
    – Higher skilled can earn a good wage by Indian levels

SLIDE 20

Qualitative Study of Indian Turkers II

  • Infrastructure challenges, bricolage and back-ups
    – Juggling devices, mobile
  • Flexibility and turk-life balance
    – Organise life around turking and are often helped by family
  • Precariousness and reputation management
    – Accounts/blocking/suspension, getting paid
    – Many of the participants no longer have accounts
  • Cultural questions
    – Some operate on a basis of accepted = allowed

SLIDE 21

The Hidden Side to MTurk: skewing the market/changing the sample? I

  • MTurk set-up ‘promotes’ rumour, misapprehension, and distrust, but intriguingly we see a lot of sharing, altruism and cooperation
    – Turkers not isolated atomic individuals
    – Use personal networks, forums (Turker Nation, MTurk forum, Reddit, Facebook groups etc.)
    – Discuss what tasks are for (interest and ethics) and how to do them as quickly as possible
    – How much is available, how often and when – rhythms and cycles of market?
    – How to influence the operation of the market by targeting and withholding labour
    – Loopholes and cheats

SLIDE 22

The Hidden Side to MTurk: skewing the market/changing the sample? II

  • Experienced Turkers use suites of tools:
    – Tools that automatically identify and grab tasks
    – Optimised browsers, shortcuts and so forth through plug-ins etc.
    – Juggling: weight of unintegrated tools and the use of scrapers causes crashes or they are blocked by Amazon
    – Adds to market speed and volatility
  • Difficult to measure the cumulative effects and impacts
  • Hidden markets and connections managed by qualification system
  • Lower the price = greater possibility of attracting Indian and lower-skilled workers; higher prices attract everyone

SLIDE 23

Questions & Practical Issues for Research

  • Sampling – particularly for a representative set of participants?
    – What is the size of the population?
    – What are their demographics, vis-à-vis the public in general?
    – Getting the subjects you want?
    – Repeat subjects?
    – Questions of validity and repeatability
  • Ensuring the task is done ‘properly’
    – The speed problem
    – The engagement problem
    – The imitation problem – false demographics, multiple/shared accounts
    – Genuine mistakes – understanding instructions etc.
    – Scamming – scripts, random data etc.
    – Spamming – bots, advertising etc.

SLIDE 24

Alternative Research Possibilities

  • Crowdsourcing can be a research playground but it can also be a topic of research
    – It’s cheap but does it really work and is it ethical?
    – Wouldn’t it just be better to construct a bespoke academic platform?
  • Within our disciplines – HCI, CSCW – there is a long history of designing for the users and to empower the users
    – Information, advocacy and design
  • So how do we design to support crowdworkers?
    – E.g. Turkopticon (Irani and Silberman), Crowdworkers (Callison-Burch), Turkbench (Hanrahan et al.), Turkmotion
    – Better information
    – Better tools: search, interfaces, optimisations, productivity
    – Positive market manipulation
    – Catering for different groups and needs?