Perspectives on Infrastructure for Crowdsourcing

SLIDE 1

Omar Alonso

Microsoft

9 February 2011

Perspectives on Infrastructure for Crowdsourcing

WSDM 2011 Workshop on Crowdsourcing for Search and Data Mining

SLIDE 2

Disclaimer

The views and opinions expressed in this talk are mine and do not necessarily reflect the official policy or position of Microsoft.

SLIDE 3

Disclaimer – II

  • Personal experience

– MTurk, CrowdFlower, Internal MS tools

  • IR focus

– Relevance evaluation, assessment, ranking, query classification, etc.
– TREC, INEX, Twitter, Facebook

  • Continuity
  • Industry perspective

SLIDE 4

Introduction

  • Crowdsourcing is hot
  • Lots of interest in the research community

– Articles showing good results
– Workshops and tutorials (ECIR’10, SIGIR’10, NAACL’10, WSDM’11, WWW’11, etc.)
– CrowdConf

  • Large companies leveraging crowdsourcing
  • Start-ups
  • VCs are putting money into it

SLIDE 5

Areas of interest

  • Social/behavioral science
  • Human factors
  • Algorithms
  • Databases
  • Distributed systems
  • Statistics

SLIDE 6

Why Mechanical Turk

  • Brand (Amazon)
  • Speed of experimentation
  • Price
  • Diversity
  • Payments
  • Lots of problems and missing features

– Still, people keep using it

SLIDE 7

Pedal to the metal

  • You read the papers
  • You tell your boss that crowdsourcing is the way to go
  • You now need to produce hundreds of thousands of labels per month
  • Easy, right?

SLIDE 8

Why not Mechanical Turk

  • Spam
  • Worker and task quality
  • No analytics
  • Need to build tools around it

SLIDE 9

Alternatives?

  • First mover advantage
  • The service hasn’t evolved that much
  • $$$
  • People are trying …

– CrowdFlower, CloudCrowd, etc.

SLIDE 10

Infrastructure thoughts

SLIDE 11

The human

  • As a worker

– I hate when instructions are not clear
– I’m not a spammer – I just don’t get what you want
– The task is boring
– Good pay is ideal but not the only condition for engagement

SLIDE 12

The human – features

  • Routing/recommendation of similar tasks based on past behavior and/or content (see the sketch after this list)
  • Requester rating based on payment performance, rejected work, and overall task difficulty. A worker should be able to rate the quality of the work and also the quality of the requester.

  • Ability to comment on a task
  • Work categorization. Similar to a job search site, all available work should be classified.
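One way to read the routing bullet above: recommend open tasks whose categories match a worker's history. A minimal Python sketch, assuming simple data shapes (`worker_history` as a list of completed task categories, `open_tasks` as (id, category) pairs); no platform exposes exactly this API, so the names here are hypothetical.

```python
# Sketch of task routing from past behavior: recommend available tasks
# whose category matches what the worker has completed before.
from collections import Counter

def recommend(worker_history, open_tasks, k=5):
    """worker_history: list of completed task categories.
    open_tasks: list of (task_id, category) pairs."""
    prefs = Counter(worker_history)  # how often each category was worked
    # Missing categories count as 0, so unfamiliar work sinks to the bottom.
    ranked = sorted(open_tasks, key=lambda t: prefs[t[1]], reverse=True)
    return ranked[:k]

# Usage:
# recommend(["relevance", "relevance", "transcription"],
#           [("t1", "tagging"), ("t2", "relevance"), ("t3", "relevance")])
```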

SLIDE 13

The experimenter

  • As an experimenter

– Balancing act: an experiment that would produce the right results and is appealing to workers
– Attrition
– I want your honest answer for the task
– I want qualified workers, and I want the system to do some of that for me

SLIDE 14

The experimenter – features

  • Ability to manage workers at different levels of expertise, including spammers and potential spam cases.

  • Abstract the task as much as possible from the quality control statistics. The developer should provide thresholds for good output.

  • Ability to mix different pools of workers based on different profiles and expertise levels.

  • Honey-pot management and incremental qualification tests based on expertise and past performance (a minimal sketch follows this list).
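A minimal sketch of the honey-pot idea above: gold questions with known answers are mixed into a batch, and a worker's accuracy on them gates whether their other answers are trusted. The `GOLD` data, `THRESHOLD` value, and helper names are assumptions for illustration, not an MTurk or CrowdFlower API.

```python
# Honey-pot quality control: workers are scored on hidden gold questions.
GOLD = {"hit-17": "relevant", "hit-42": "not relevant"}  # known answers
THRESHOLD = 0.7  # assumed minimum accuracy on gold questions

def score_worker(answers):
    """answers: dict of hit_id -> worker label, gold HITs included."""
    gold_hits = [h for h in answers if h in GOLD]
    if not gold_hits:
        return None  # no evidence yet; keep worker in a probation pool
    correct = sum(answers[h] == GOLD[h] for h in gold_hits)
    return correct / len(gold_hits)

def filter_workers(all_answers):
    """all_answers: dict of worker_id -> {hit_id: label}."""
    trusted, flagged = {}, {}
    for worker, answers in all_answers.items():
        acc = score_worker(answers)
        (trusted if acc is None or acc >= THRESHOLD else flagged)[worker] = acc
    return trusted, flagged
```

Incremental qualification can reuse the same scores: raise a worker's expertise level as accuracy stays above the threshold over more gold questions.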

SLIDE 15

The system

  • Similarities with MapReduce approaches
  • Integration of human computation into a programming language
  • I would like to program the crowd (see the sketch after this list)
  • Built-in statistics and other quality control
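What "programming the crowd" could look like: a hypothetical `crowd_map` primitive that treats human judgments like the map step of a MapReduce job, with an ordinary function as the reducer. None of these functions exist in any platform's API; they sketch the kind of high-level language the slide asks for.

```python
# Hypothetical crowd-as-MapReduce primitives; ask_crowd is a placeholder
# that a real system would back with a crowdsourcing platform.
def ask_crowd(item, question, redundancy=3):
    """Post `redundancy` HITs for one item and block until votes arrive."""
    raise NotImplementedError("backed by a crowdsourcing platform")

def crowd_map(items, question, aggregate):
    """Map each item to crowd votes, then reduce with an aggregator."""
    return {item: aggregate(ask_crowd(item, question)) for item in items}

def majority(votes):
    """Simple built-in reducer: most frequent label wins."""
    return max(set(votes), key=votes.count)

# Usage: label query/document pairs for relevance, majority vote of 3.
# labels = crowd_map(pairs, "Is this document relevant to the query?", majority)
```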

SLIDE 16

The system – features

  • Performance and high availability
  • Spam detection built in
  • Payments (including international markets)
  • Inter-rater agreement statistics library, with the ability to plug in a user-defined metric (sketched after this list)

  • Uncertainty management
  • High-level language for designing tasks
  • Analytics
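A sketch of the pluggable agreement library suggested above: a built-in default metric (pairwise percent agreement) plus a hook for a user-defined one. Function names are illustrative; a real library would also ship chance-corrected statistics such as Cohen's or Fleiss' kappa.

```python
# Agreement library with a default metric and a user-defined plug-in point.
from itertools import combinations

def percent_agreement(labels_per_item):
    """labels_per_item: list of label lists, one inner list per item."""
    agree = total = 0
    for labels in labels_per_item:
        for a, b in combinations(labels, 2):  # all pairs of judgments
            agree += (a == b)
            total += 1
    return agree / total if total else 1.0

def agreement_report(labels_per_item, metric=percent_agreement):
    """Run the built-in metric by default, or any user-supplied callable."""
    return {"metric": metric.__name__, "score": metric(labels_per_item)}

# Usage: three workers labeled two items.
# agreement_report([["rel", "rel", "not"], ["rel", "rel", "rel"]])
```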

SLIDE 17

Conclusions and questions

  • Social networking and crowdsourcing
  • Crowds, clouds and algorithms
  • What is the best way to perform human computation?
  • What is the best way to combine CPU with HPU for solving problems? (One possible combination is sketched after this list.)
  • What are the desirable integration points for a computation that involves CPU and HPU?
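One hedged reading of the CPU+HPU question above: let the machine label everything and escalate only low-confidence items to the crowd. `classify`, `ask_crowd`, and the cutoff are placeholders for illustration, not a known system.

```python
# Hybrid CPU/HPU pipeline: machine labels first, humans resolve the rest.
CONFIDENCE_CUTOFF = 0.9  # assumed threshold for trusting the machine

def classify(item):
    """Placeholder machine (CPU) classifier returning (label, confidence)."""
    raise NotImplementedError("backed by a trained model")

def ask_crowd(item):
    """Placeholder human-computation (HPU) call returning a label."""
    raise NotImplementedError("backed by a crowdsourcing platform")

def hybrid_label(items):
    labeled = {}
    for item in items:
        label, confidence = classify(item)
        if confidence < CONFIDENCE_CUTOFF:
            label = ask_crowd(item)  # HPU resolves what the CPU cannot
        labeled[item] = label
    return labeled
```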
