Markov Decision Process (Fully Observable)
Figure from Dan Klein & Pieter Abbeel, UC Berkeley CS188 (http://ai.berkeley.edu) and Sarah Reeves (http://dear-theophilus.deviantart.com/).
Input:
  - World state s = <x, y>
  - Actions, with transition model P(s' | s, a)
  - Cost c per action
  - Observe: next state s' = <x', y'> and reward = f(s, a, s')

Output:
  - While learning the action & reward probabilities (reinforcement learning), construct a policy, π : S → A, that chooses the best action for each state, i.e., actions that maximize expected reward minus costs over time.
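A minimal sketch of the loop this slide describes: the agent observes state s = (x, y), acts, observes s' and the reward f(s, a, s'), and learns a policy π : S → A via Q-learning. The 4x4 gridworld, goal cell, step cost, and learning parameters below are illustrative assumptions, not from the slide.

    import random
    from collections import defaultdict

    ACTIONS = ["north", "south", "east", "west"]
    GOAL, STEP_COST, ALPHA, GAMMA, EPSILON = (3, 3), -0.04, 0.5, 0.9, 0.1

    def step(s, a):
        """Transition model P(s' | s, a): the intended move succeeds with
        probability 0.8, otherwise the agent stays put (assumed dynamics)."""
        dx, dy = {"north": (0, 1), "south": (0, -1),
                  "east": (1, 0), "west": (-1, 0)}[a]
        x, y = s
        if random.random() < 0.8:
            x, y = min(max(x + dx, 0), 3), min(max(y + dy, 0), 3)
        reward = 1.0 if (x, y) == GOAL else STEP_COST  # reward minus step cost
        return (x, y), reward

    Q = defaultdict(float)
    for episode in range(2000):
        s = (0, 0)
        while s != GOAL:
            # Epsilon-greedy: mostly exploit current Q estimates, sometimes explore.
            a = (random.choice(ACTIONS) if random.random() < EPSILON
                 else max(ACTIONS, key=lambda act: Q[s, act]))
            s2, r = step(s, a)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            Q[s, a] += ALPHA * (r + GAMMA * max(Q[s2, b] for b in ACTIONS) - Q[s, a])
            s = s2

    # Extract the learned policy pi: S -> A, greedy with respect to Q.
    policy = {(x, y): max(ACTIONS, key=lambda act: Q[(x, y), act])
              for x in range(4) for y in range(4)}
    print(policy[(0, 0)])

Note the agent never needs P(s' | s, a) explicitly here; Q-learning estimates action values directly from observed transitions, which is one way to read the slide's "while learning action & reward probabilities."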
Partially-Observable Markov Decision Process
Input:
  - Belief state: a probability distribution P(s) over world states
  - Actions, with transition model P(s' | s, a)
  - Cost c per action
  - Observe: noisy sensor reading o = f(s') and reward

Output:
  - While learning the action & reward probabilities (reinforcement learning), construct a policy, π, that chooses the best action for each belief state, i.e., actions that maximize expected reward minus costs over time.
Figure from Dan Klein & Pieter Abbeel, UC Berkeley CS188 (http://ai.berkeley.edu).
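A minimal sketch of the belief-state update this slide describes: since the agent cannot observe s' directly, it maintains P(s) and updates it by a Bayes-filter step after each action and noisy sensor reading o = f(s'). The two-state world, transition model, and sensor model below are illustrative assumptions, not from the slide.

    STATES = ["left", "right"]

    # Assumed transition model P(s' | s, a) for a single "move right" action:
    # the move mostly succeeds, but occasionally slips.
    TRANSITION = {("left", "right"): 0.8, ("left", "left"): 0.2,
                  ("right", "right"): 0.9, ("right", "left"): 0.1}

    # Assumed noisy sensor model P(o | s'): reports the truth 70% of the time.
    SENSOR = {("ping", "right"): 0.7, ("ping", "left"): 0.3,
              ("silence", "right"): 0.3, ("silence", "left"): 0.7}

    def update_belief(belief, observation):
        """One Bayes-filter step: predict through P(s' | s, a), then weight
        each state by the sensor likelihood P(o | s') and renormalize."""
        predicted = {s2: sum(TRANSITION[(s, s2)] * belief[s] for s in STATES)
                     for s2 in STATES}
        unnormalized = {s2: SENSOR[(observation, s2)] * predicted[s2]
                        for s2 in STATES}
        total = sum(unnormalized.values())
        return {s2: p / total for s2, p in unnormalized.items()}

    belief = {"left": 0.5, "right": 0.5}    # uniform prior over world states
    belief = update_belief(belief, "ping")  # act, then sense a "ping"
    print(belief)  # mass shifts toward "right" despite the noisy sensor

A POMDP policy then maps these beliefs, not raw states, to actions; that is why the output on this slide is defined over belief states rather than over s directly.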