Crowdsourcing and Human Computation Tuesdays and Thursdays - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Tuesdays and Thursdays 3pm-4:30pm 3401 Walnut St room 401B Instructor: Chris Callison-Burch Website: crowdsourcing-class.org

Crowdsourcing and Human Computation

slide-2
SLIDE 2

Collective Intelligence

“Groups of individuals doing things collectively that seem intelligent”

Human Computation

“A paradigm for utilizing human processing power to solve problems that computers cannot yet solve.”

Inter-related concepts

Crowdsourcing

“Outsourcing a job traditionally performed by an employee to an undefined, generally large group of people via open call.”

The Gig Economy

“A labor market characterized by the prevalence of short-term contracts or freelance work as opposed to permanent jobs.”

Data Mining

“Applying algorithms to extract patterns from data.”

slide-3
SLIDE 3

Francis Galton

slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6

Collective Intelligence?

Group Think

slide-7
SLIDE 7

Collective Intelligence?

Mob Mentality

slide-8
SLIDE 8

Collective Intelligence?

Fake News

slide-9
SLIDE 9

Collective Intelligence?

Misinformation campaigns

slide-10
SLIDE 10

Popular Delusions and the Madness of Crowds

  • Economic bubbles
  • Alchemy & Pseudoscience

  • Witch hunts
  • Prophecies
slide-11
SLIDE 11

Tulip Mania

“A tulip, known as "the Viceroy" displayed in a 1637 Dutch catalog. Its bulb was offered for sale between 3,000 and 4,200 guilders depending on size. A skilled craftsworker at the time earned about 300 guilders a year.”

slide-12
SLIDE 12

"Looking back, it’s clear that the Beanie Baby craze was an economic bubble, fueled by frenzied speculation and blatantly baseless optimism. Bubbles are quite common, but bubbles over toys are not."

slide-13
SLIDE 13

Wisdom of Crowds

  • Diversity of Opinion
  • Independence
  • De-centralization
  • Aggregation

Requirements for a crowd to be wise

slide-14
SLIDE 14

Groups / Crowds

  • Employees of a business
  • Participants in a poll
  • Sports fans betting on games
  • Independent stock market investors
  • Internet users linking to sites
  • Citizens in a democracy
slide-15
SLIDE 15

Ways of aggregating collective intelligence

  • Point spreads / parimutuel odds
  • Stock prices
  • Futures contracts
  • Voting
  • Computer algorithms, interfaces
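
As a rough illustration of what an aggregation step can look like, the sketch below combines independent numeric guesses into a single estimate. The guesses are made-up numbers and the function names are hypothetical; nothing here is taken from the slides beyond the idea of aggregating many independent opinions.

```python
from statistics import mean, median

def aggregate_by_median(guesses):
    """Combine independent numeric guesses using the median."""
    return median(guesses)

def aggregate_by_mean(guesses):
    """Combine independent numeric guesses using the arithmetic mean."""
    return mean(guesses)

# Illustrative guesses only; the last one is a deliberate outlier.
guesses = [1100, 1250, 1180, 1300, 950, 1210, 2400]

print("median:", aggregate_by_median(guesses))
print("mean:", aggregate_by_mean(guesses))
```

The median is less sensitive to a single wild guess than the mean, which is one reason it is a common choice when some participants may be careless.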
slide-16
SLIDE 16

2010 Haitian Earthquake

slide-17
SLIDE 17

Disaster Response

The maps are bad

Jan 12

Robert Munro

slide-18
SLIDE 18

Disaster Response

Jan 12

Robert Munro

Jan 23

Better maps from Crowdsourcing

slide-19
SLIDE 19

Disaster Response

  • Fanmi mwen nan Kafou, 24 Cote Plage, 41A bezwen manje ak dlo
  • Moun kwense nan Sakre Kè nan Pòtoprens
  • Ti ekipman Lopital General genyen yo paka minm fè 24 è
  • Fanm gen tranche pou fè yon pitit nan Delmas 31

  • My family in Carrefour, 24 Cote Plage, 41A needs food and water
  • People trapped in Sacred Heart Church, PauP
  • General Hospital has less than 24 hrs. supplies
  • Undergoing children delivery Delmas 31

The responders don’t speak Kreyol

Robert Munro

slide-20
SLIDE 20

Disaster Response

Robert Munro

Maps + Translation + Local Knowledge

(18.4957, -72.3185)

Workers collaborated to find locations:

Dalila: I need Thomassin Apo please
Apo: Kenscoff Route: Lat: 18.495746829274168, Long:-72.31849193572998
Apo: This Area after Petion-Ville and Pelerin 5 is not on Google Map. We have no streets name
Apo: I know this place like my pocket
Dalila: thank God u was here

Feedback from responders:

"just got emergency SMS, child delivery, USCG are acting, and, the GPS coordinates of the location we got from someone of your team were 100% accurate!"

Apo Dalila Haiti responders
slide-21
SLIDE 21

Disaster Response

  • Clark Craig of the Marine Corps:

–“I cannot overemphasize to you what the work of the Ushahidi/Haiti has provided. It is saving lives every day.”

  • Secretary of State Hillary Clinton:

–“The technology community has set up interactive maps to help us identify needs and target resources. And on Monday, a seven-year-old girl and two women were pulled from the rubble of a collapsed supermarket by an American search-and-rescue team after they sent a text message calling for help.”

  • Craig Fugate, FEMA Task Force:

–“[The] Crisis Map of Haiti represents the most comprehensive and up-to-date map available to the humanitarian community.”

  • Ushahidi@Tufts:

–“The World Food Program delivered food to an informal camp of 2500 people, having yet to receive food or water, in Diquini to a location that 4636 had identified for them.”

Robert Munro

slide-22
SLIDE 22

How can computer science and economics help facilitate collective intelligence?

slide-23
SLIDE 23

NASA Clickworkers (2000)

We try to have several people cover each region on Mars so that we can compute a consensus, throwing out any mistaken or frivolous entries and averaging out the inaccuracies.

Here are all the clicks we received for this region

Here is the consensus

NASA showed that public volunteers could do routine science analysis that would normally be done by a graduate student working for months on end. From November 2000 to January 2002, they had 101,000 clickworkers volunteering 14,000 work hours, 612,832 sessions, and 2,378,820 entries!
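
The slide describes the consensus step only in words, so the following is a minimal sketch of one way it could work, not NASA's actual procedure. It assumes each entry is an (x, y) click, treats clicks far from the per-axis median as mistaken or frivolous, and averages the rest. The function name, the distance threshold, and the sample clicks are all hypothetical.

```python
from statistics import mean, median

def consensus_click(clicks, max_dist=10.0):
    """Estimate one consensus point from several (x, y) clicks.

    Clicks farther than max_dist from the per-axis median point are
    discarded; the remaining clicks are averaged. Assumes at least one
    click lies within max_dist of the median point.
    """
    mx = median(x for x, _ in clicks)
    my = median(y for _, y in clicks)
    kept = [
        (x, y)
        for x, y in clicks
        if ((x - mx) ** 2 + (y - my) ** 2) ** 0.5 <= max_dist
    ]
    return mean(x for x, _ in kept), mean(y for _, y in kept)

# Four volunteers roughly agree; the fifth click is far off and gets dropped.
clicks = [(100, 200), (102, 198), (98, 202), (101, 201), (400, 50)]
print(consensus_click(clicks))
```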

slide-24
SLIDE 24

NASA Clickworkers (2000)

Mars age map produced directly from clickworker inputs.

Mars age map produced from scientists

Color guide: red=heavily cratered (old), green=medium, violet=lightly cratered (young).
slide-25
SLIDE 25
slide-26
SLIDE 26
slide-27
SLIDE 27
slide-28
SLIDE 28
slide-29
SLIDE 29
slide-30
SLIDE 30
slide-31
SLIDE 31

Postmark City: Barre
Postmark State: MA
Postmark Date: Oct-11
Postmark Year: 1886
Stamp: 1c
$0.01

slide-32
SLIDE 32
slide-33
SLIDE 33
slide-34
SLIDE 34

samasource.org

Help African Refugees

slide-35
SLIDE 35

What would you call these colors? Dolores Labs

Choose the right word

slide-36
SLIDE 36

Catch some zzzzs

thesheepmarket.com

slide-37
SLIDE 37
slide-38
SLIDE 38
slide-39
SLIDE 39

Dark Side of Crowdsourcing

Real Time with Bill Maher: The "Sharing" Economy – August 21, 2015 (HBO)
slide-40
SLIDE 40

Are Workers Treated Fairly?

slide-41
SLIDE 41

https://crowd-workers.com/

slide-42
SLIDE 42
slide-43
SLIDE 43

3.8m

task records

3k

workers

20k

requesters

slide-44
SLIDE 44

Kotaro Hara, Abigail Adams, Kristy Milland, Saiph Savage, Chris Callison-Burch, Jeffrey P. Bigham

A Data-Driven Analysis of Workers’ Earnings on Amazon Mechanical Turk
CHI-2018

ABSTRACT

A growing number of people are working as part of on-line crowd work. Crowd work is often thought to be low wage work. However, we know little about the wage distribution in practice and what causes low/high earnings in this setting. We recorded 2,676 workers performing 3.8 million tasks on Amazon Mechanical Turk. Our task-level analysis revealed that workers earned a median hourly wage of only ~$2/h, and only 4% earned more than $7.25/h. While the average requester pays more than $11/h, lower-paying requesters post much more work. Our wage calculations are influenced by how unpaid work is accounted for, e.g., time spent searching for tasks, working on tasks that are rejected, and working on tasks that are ultimately not submitted. We further explore the characteristics of tasks and working patterns that yield higher hourly wages. Our analysis informs platform design and worker tools to create a more positive future for crowd work.
slide-45
SLIDE 45

Takeaways

< $2/h

Crowd workers are underpaid and often earn below $2/h

$

Unpaid work, particularly returning tasks, has a large impact on the hourly wage

The majority of requesters reward workers below $5/h
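
To make the effect of unpaid work concrete, here is a small sketch of an hourly-wage calculation with and without unpaid time. The record format, the function name, and the numbers are assumptions made for illustration; this is not the method or data from the CHI paper.

```python
def hourly_wage(records, include_unpaid=True):
    """Compute earnings per hour from task records.

    Each record is (reward_in_dollars, seconds_worked, was_paid).
    Unpaid records (for example returned or rejected tasks) add time but
    no earnings when include_unpaid is True, and are ignored otherwise.
    """
    earned = sum(reward for reward, _, paid in records if paid)
    seconds = sum(secs for _, secs, paid in records if paid or include_unpaid)
    return earned / (seconds / 3600)

# Hypothetical session: three paid tasks and one task that was returned.
records = [
    (0.10, 120, True),
    (0.05, 90, True),
    (0.25, 300, False),  # returned task: time spent, nothing earned
    (0.02, 30, True),
]

print(round(hourly_wage(records, include_unpaid=False), 2))  # paid time only
print(round(hourly_wage(records, include_unpaid=True), 2))   # all time counted
```

In this made-up session, counting the returned task's time cuts the hourly wage by more than half, which is the kind of effect the takeaway refers to.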

slide-46
SLIDE 46

How to put crowdsourcing towards good uses

slide-47
SLIDE 47

Use of MTurk-like systems in research

  • Participant pool for user studies, polling, cognitive science experiments
  • Annotation for machine learning tasks like computer vision or NLP
  • Human Computer Interaction: worker pools are hardwired into the UI
  • New Programming Languages Concepts
  • Study markets themselves for economics research, cost-optimization

slide-48
SLIDE 48

Annotation for machine learning / artificial intelligence tasks

Is this a dog?

  • Yes
  • No

Answer: Yes
Task: Dog?
Pay: $0.01
Broker: www.mturk.com
$0.01
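
A common way to use such a task for annotation is to ask several workers the same question and keep the majority answer. The sketch below assumes that setup; the item names, the number of workers per item, and the function names are hypothetical, and the cost figure ignores any platform fee.

```python
from collections import Counter

PAY_PER_LABEL = 0.01  # dollars per answer, matching the pay shown on the slide

def label_items(judgments):
    """Map each item to (majority label, fraction of workers who agreed).

    judgments: dict of item id -> list of worker answers such as "Yes"/"No".
    """
    results = {}
    for item, answers in judgments.items():
        label, count = Counter(answers).most_common(1)[0]
        results[item] = (label, count / len(answers))
    return results

def total_cost(judgments, pay=PAY_PER_LABEL):
    """Total payment to workers for all collected answers."""
    return pay * sum(len(answers) for answers in judgments.values())

# Hypothetical answers to "Is this a dog?" from three workers per image.
judgments = {
    "img1.jpg": ["Yes", "Yes", "No"],
    "img2.jpg": ["No", "No", "No"],
}

print(label_items(judgments))
print(round(total_cost(judgments), 2))
```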

slide-49
SLIDE 49
slide-50
SLIDE 50

Human Computer Interaction

slide-51
SLIDE 51

Crowdsourcing - Rad Lab Talk - UC Berkeley Fall 2010
slide-52
SLIDE 52
slide-53
SLIDE 53

New Programming Languages Concepts

slide-54
SLIDE 54

New Programming Languages Concepts

slide-55
SLIDE 55

New Programming Languages Concepts

  • Latency
  • Cost
  • Parallelization
  • Non-determinism
  • Iterative improvement
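
The sketch below simulates what calling a person from a program might look like, to make latency, cost, parallelization, and non-determinism tangible. It is a toy simulation under stated assumptions: `simulated_worker` stands in for a real person, the 0.05 s delay, 80% accuracy, and $0.01 pay are invented, and no real crowdsourcing API is used. Iterative improvement is not covered.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

def simulated_worker(worker_id, truth=True, accuracy=0.8, latency=0.05):
    """Pretend to ask one person a yes/no question.

    The call is slow (latency) and may return the wrong answer
    (non-determinism across workers). Seeding by worker_id keeps this
    simulation repeatable.
    """
    rng = random.Random(worker_id)
    time.sleep(latency)
    return truth if rng.random() < accuracy else not truth

def ask_crowd(n_workers, pay=0.01):
    """Ask n_workers in parallel; return (answers, total cost, seconds taken)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        answers = list(pool.map(simulated_worker, range(n_workers)))
    elapsed = time.perf_counter() - start
    return answers, n_workers * pay, elapsed

answers, cost, elapsed = ask_crowd(5)
print(answers, round(cost, 2), f"{elapsed:.2f}s")
```

Because the simulated workers run in parallel, the elapsed time stays close to one worker's latency while the cost still grows with the number of workers asked.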
slide-56
SLIDE 56

Study Markets Themselves

  • What predictions of economics hold true on MTurk?
  • What incentives can we give to increase throughput, quality, worker retention?
  • What is the cost-optimal solution to a problem?
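
One simple way to frame the cost-optimal question: if every worker answers a yes/no task correctly with the same probability and workers are independent, how many workers per task are needed before majority vote reaches a target accuracy? The sketch below works that out with the binomial distribution. The equal-accuracy and independence assumptions, the $0.01 pay, and the function names are all simplifications introduced here.

```python
from math import comb

def majority_accuracy(n, p):
    """Probability that a majority of n independent workers is correct.

    Each worker is assumed correct with probability p; n should be odd
    so that ties cannot occur.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

def cheapest_redundancy(p, target, pay=0.01, max_n=51):
    """Smallest odd n whose majority vote meets the target accuracy.

    Returns (n, cost per task) or None if no n up to max_n suffices.
    """
    for n in range(1, max_n + 1, 2):
        if majority_accuracy(n, p) >= target:
            return n, n * pay
    return None

print(round(majority_accuracy(3, 0.8), 3))
print(cheapest_redundancy(p=0.8, target=0.95))
```

With 80%-accurate workers, three votes give about 89.6% accuracy and seven are needed to pass 95%, so each extra point of accuracy gets more expensive.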

slide-57
SLIDE 57

What will we cover in this class (and should you take it)?

slide-58
SLIDE 58

Topics

  • Taxonomy of crowdsourcing and human computation
  • Microtasking platforms like Mechanical Turk and Figure-eight

  • Programming concepts for human computation
  • The economics of crowdsourcing
  • Crowdsourcing and machine learning
  • Applications to human computer interaction
  • Crowdsourcing and social science
slide-59
SLIDE 59

Who should take this class

  • Anyone who wants to be on the cutting edge of this new field
  • Entrepreneurial students who want to start their own companies
  • Students from the business school who want to experiment with markets
  • Students from the social sciences who want to conduct large-scale studies with people

slide-60
SLIDE 60

Course Requirements

  • Weekly assignments: Writing and Coding
  • Presentations: Company profile, project pitch
  • Final project: Self-designed, groups of 4-5
  • Final presentation: Show off your work

slide-61
SLIDE 61

How much programming is required?

  • Programming assignments are in Python
  • We provide code that you modify
  • We want everyone to be able to participate, regardless of programming experience
  • For most assignments, you can work with a partner (turn in only one assignment - you’ll both get the same grade)

slide-62
SLIDE 62

Gun Violence Database

In 2016, the programming assignments for NETS213 formed a sequence:

  • Build a machine learning classifier to identify newspaper articles that describe gun violence
  • Have crowd-workers verify its predictions
  • Have crowd-workers extract structured information from the text of the articles
  • Analyze the data and build visualizations about gun violence in the USA

slide-63
SLIDE 63

Douglas Wiebe
Professor of Epidemiology
University of Pennsylvania Perelman School of Medicine
slide-64
SLIDE 64
slide-65
SLIDE 65

The Gun Violence Database

http://gun-violence.org/

slide-66
SLIDE 66
slide-67
SLIDE 67

Reading assignments

We will be using The Wisdom of Crowds as the course reader, and supplementing it with readings from academic papers. You will have to replicate one academic paper.

slide-68
SLIDE 68

What will you get out of this class?

  • Understanding of an emerging field of CS
  • Basic Python and machine learning skills
  • Ideas that you could transform into a startup company or academic research
  • A new way of thinking about collective decision making by companies and countries

slide-69
SLIDE 69

Who are we?

slide-70
SLIDE 70

Professor Callison-Burch

(not Professor Burch)

  • Bachelor's from Stanford
  • PhD from University of Edinburgh
  • 6 years at Johns Hopkins University
  • Joined Penn faculty in 2013
  • Research Interests: Crowdsourcing, Natural Language Processing

slide-71
SLIDE 71

Leaderboard

slide-72
SLIDE 72

TAs

slide-73
SLIDE 73
slide-74
SLIDE 74