Crowdsourcing with Diverse Groups of Users



  1. Crowdsourcing with Diverse Groups of Users • Sara Cohen, Moran Yashinski

  2. Team Formation Problem • Example: forming an education board • Required skills: • School Principal (SP) • High School teacher (HS) • Elementary School teacher (ES) • Figure: six candidates (Bob, Alice, Chris, Denise, Sharon, Jack); the skill-set labels shown are ES; SP, ES; SP; HS, ES; HS; HS

  3. Team Formation Problem with Communication Cost • Goal: find a team that has all required skills, while minimizing communication cost • Examples of communication costs: • Distance in the social network • (An inverse of) the number of papers each two experts published together • Figure: the six candidates (Alice, Bob, Chris, Denise, Jack, Sharon) as nodes of a weighted graph, with edge weights between 1 and 5
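
As a rough illustration of how a communication cost might be evaluated for a candidate team, the sketch below computes two common aggregates, the sum of pairwise distances and the diameter. The distance values and the helper names `sum_of_distances` and `diameter` are hypothetical; they are not taken from the figure on the slide.

```python
from itertools import combinations

def sum_of_distances(team, dist):
    """Total pairwise distance between team members."""
    return sum(dist[frozenset(pair)] for pair in combinations(team, 2))

def diameter(team, dist):
    """Largest pairwise distance between team members (0 for a single member)."""
    return max((dist[frozenset(pair)] for pair in combinations(team, 2)), default=0)

# Hypothetical pairwise distances (assumed values, not read from the slide).
dist = {
    frozenset({"Alice", "Bob"}): 1,
    frozenset({"Alice", "Jack"}): 4,
    frozenset({"Bob", "Jack"}): 2,
}
team = ["Alice", "Bob", "Jack"]

total = sum_of_distances(team, dist)   # 1 + 4 + 2
widest = diameter(team, dist)          # the Alice-Jack pair
```

Keying the table by `frozenset` makes the lookup symmetric, so the order of the two names in a pair does not matter.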

  4. Research Question • What if we wanted to define diversity based on the candidates' personal properties? • Gender, income, age, religion, location, etc. • We would like to define a target diversity function for the different experts' properties • Goal: efficiently find a team that has all required skills, and is as close as possible to the desired target diversity

  5. Team Formation with Target Diversity Constraint • Target diversity based on properties • Goal: efficiently find a team that has all required skills, and is as close as possible to the desired target diversity • $\text{Distribution Cost} = |\text{Team Diversity} - \text{Target Diversity}|_1$ • Example: • Gender target diversity: $(\text{Male}, \text{Female}) = (1/3, 2/3)$ • Income target diversity: $(\text{High}, \text{Medium}, \text{Low}) = (1/2, 1/4, 1/4)$ • Figure: the six candidates annotated with their skills (SP, HS, ES), gender (Male, Female) and income (High, Middle, Low)
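
To make the distribution cost concrete, here is a minimal sketch that measures how far a team's property distribution is from the target. It assumes the distance is an L1 distance per property and that the per-property distances are simply added up; both choices, the three-member team, and the function names are assumptions for illustration only.

```python
from collections import Counter
from fractions import Fraction

def team_diversity(team, prop, values):
    """Fraction of team members holding each value of one property."""
    counts = Counter(member[prop] for member in team)
    return {v: Fraction(counts[v], len(team)) for v in values}

def distribution_cost(team, targets):
    """Assumed cost: L1 distance to the target, summed over all properties."""
    total = Fraction(0)
    for prop, target in targets.items():
        actual = team_diversity(team, prop, target.keys())
        total += sum(abs(actual[v] - target[v]) for v in target)
    return total

# Target distributions from the slide's example.
targets = {
    "gender": {"Male": Fraction(1, 3), "Female": Fraction(2, 3)},
    "income": {"High": Fraction(1, 2), "Medium": Fraction(1, 4), "Low": Fraction(1, 4)},
}
# Hypothetical team (not the one drawn on the slide).
team = [
    {"gender": "Male", "income": "High"},
    {"gender": "Female", "income": "High"},
    {"gender": "Female", "income": "Low"},
]
cost = distribution_cost(team, targets)
```

For this team the gender distribution matches the target exactly, so the whole cost comes from income: |2/3 - 1/2| + |0 - 1/4| + |1/3 - 1/4| = 1/2.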

  6. What are we going to discuss? • Research question: diversity based on personal properties ✓ • Advantages of diversity (or.. why is it interesting?) • Related work • Algorithms and computational considerations • Fixed-Parameter Tractable (Optimal) Algorithm • Greedy Approximation Algorithm • Experimental results • Conclusions

  7. Advantages of Diversity (or.. why is it interesting?) • Advantages in the workplace • Increase in productivity and creativity (innovative solutions) • Increase in workplace morale • Positive reputation and attraction of quality human resources • When crowdsourcing, it is important to consider different points of view • Defining the diversity of a team • Program committees • Adopting affirmative action

  8. Related Work • Team formation with communication cost • Goal: find a team that has all required skills, while minimizing communication cost (e.g. sum of distances, diameter) • Diversity in terms of social influence • Depends on the social influences between candidates • Low social influence is correlated with high productivity • Diversity in query answering • The goal is to maximize the diversity of the results • Diversity based on different criteria (e.g. content, novelty and coverage)

  9. What have we achieved? • Finding an optimal solution is NP-complete • Naïve algorithm • Checks all possible options and finds an optimal solution • Time complexity: $O(|C|^{|S|} \cdot |P|)$, where $C$ is the set of candidates, $S$ the set of required skills and $P$ the set of properties • Intractable in practice, as $|C|$ might be huge • Fixed-Parameter Tractable (Optimal) Algorithm • Finds an optimal solution in time $\mathrm{poly}(|C|)$ times $\exp(|S|, |P|)$ • Greedy Approximation Algorithm • Time complexity: $\mathrm{poly}(|S|, |C|)$ • Guaranteed to return a 1/2-approximation of the optimal solution
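
The naïve approach can be pictured as trying every way of assigning one candidate to each required skill and keeping the cheapest team. The sketch below does this for a single property; the candidate data, the rule that each skill is covered by a distinct person, and the L1 cost are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def l1_cost(values, target):
    """L1 distance between the team's value distribution and the target."""
    n = len(values)
    return sum(abs(Fraction(values.count(v), n) - t) for v, t in target.items())

def naive_best_team(skills, candidates, target):
    """Enumerate every skill-to-candidate assignment and keep the cheapest one."""
    pools = [[name for name, (held, _) in candidates.items() if s in held]
             for s in skills]
    best, best_cost = None, None
    for assignment in product(*pools):
        if len(set(assignment)) < len(assignment):
            continue  # assumption: one distinct person per skill
        cost = l1_cost([candidates[name][1] for name in assignment], target)
        if best_cost is None or cost < best_cost:
            best, best_cost = dict(zip(skills, assignment)), cost
    return best, best_cost

# Hypothetical candidates: name -> (skills held, gender).
candidates = {
    "a": ({"SP"}, "M"),
    "b": ({"SP", "ES"}, "F"),
    "c": ({"HS"}, "M"),
    "d": ({"HS"}, "F"),
    "e": ({"ES"}, "M"),
}
target = {"M": Fraction(2, 3), "F": Fraction(1, 3)}
best, best_cost = naive_best_team(["SP", "HS", "ES"], candidates, target)
```

The number of assignments grows as the product of the pool sizes, which is why this enumeration stops being practical once the candidate set is large.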

  10. Fixed-Parameter Tractable (Optimal) Algorithm • Finds an optimal solution • Time complexity: $\mathrm{poly}(|C|)$ times $\exp(|S|, |P|)$ • Uses preprocessed data structures to improve runtime performance • Uses the notions of Abstract (Optimal) Templates and Concrete Templates

  11. Abstract (Optimal) Templates, Concrete Templates: Example • One property (Gender): $(\text{Male}, \text{Female}) = (2/3, 1/3)$ • $S = \{SP, HS, ES\}$ • Abstract Optimal Template: Male 2, Female 1 • Achieves minimum distribution cost • There could be many Abstract Optimal Templates • Abstract Template (non-optimal): Male 3, Female 0 • Concrete Templates: • $gender(SP) = F$, $gender(HS) = M$, $gender(ES) = M$ • $gender(SP) = M$, $gender(HS) = F$, $gender(ES) = M$ • $gender(SP) = M$, $gender(HS) = M$, $gender(ES) = F$
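
The relationship between abstract and concrete templates can be sketched in a few lines. The code below assumes a gender target of (2/3, 1/3), which is consistent with the slide's optimal template of two males and one female, and an L1 cost; the function names are hypothetical and the enumeration is only meant for tiny examples.

```python
from fractions import Fraction
from itertools import permutations

def abstract_templates(num_values, team_size):
    """Yield every way to split team_size members among the property values."""
    if num_values == 1:
        yield (team_size,)
        return
    for first in range(team_size, -1, -1):
        for rest in abstract_templates(num_values - 1, team_size - first):
            yield (first,) + rest

def template_cost(counts, target):
    """Assumed L1 distance between the template's distribution and the target."""
    size = sum(counts)
    return sum(abs(Fraction(c, size) - t) for c, t in zip(counts, target))

def concrete_templates(skills, values, counts):
    """All distinct ways to attach the template's values to specific skills."""
    pool = [v for v, c in zip(values, counts) for _ in range(c)]
    return [dict(zip(skills, p)) for p in sorted(set(permutations(pool)))]

skills = ("SP", "HS", "ES")
values = ("M", "F")
target = (Fraction(2, 3), Fraction(1, 3))

templates = list(abstract_templates(len(values), len(skills)))
optimal = min(templates, key=lambda t: template_cost(t, target))
concrete = concrete_templates(skills, values, optimal)
```

Here `optimal` is the (2, 1) template, and `concrete` contains the three gender-per-skill assignments listed on the slide.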

  12. FPT Optimal Algorithm: Data Structures • Used to optimize runtime performance • Hashset ℍ to hold all the abstract templates • Avoids evaluating an abstract template more than once (very costly) • minHeap 𝕄 to efficiently return the abstract template which has minimum cost • Structure 𝕊ℙℂ • Calculated offline • Figure: three-level index, Skills (SP, HS, ES) → Properties (M, F) → Candidates (Jack, Bob, Denise, Chris, Sharon, Alice)
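
A minimal sketch of how these three structures might look in practice: a set plus a binary heap for the abstract templates, and a nested dictionary for the skill, property and candidate index. The class and function names, the candidate data and the placeholder costs are all hypothetical.

```python
import heapq
from collections import defaultdict

class TemplateQueue:
    """Set + min-heap: each abstract template is queued at most once."""
    def __init__(self):
        self.seen = set()
        self.heap = []

    def push(self, template, cost):
        if template in self.seen:
            return False              # already evaluated or queued
        self.seen.add(template)
        heapq.heappush(self.heap, (cost, template))
        return True

    def pop_min(self):
        cost, template = heapq.heappop(self.heap)
        return template, cost

def build_index(candidates):
    """Offline index: skill -> property -> value -> set of candidate names."""
    index = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))
    for name, (skills, props) in candidates.items():
        for skill in skills:
            for prop, value in props.items():
                index[skill][prop][value].add(name)
    return index

queue = TemplateQueue()
first = queue.push((3, 0), 2)     # placeholder costs, for illustration only
second = queue.push((2, 1), 0)
repeat = queue.push((3, 0), 2)    # rejected: template already seen
cheapest = queue.pop_min()

# Hypothetical candidates: name -> (skills held, property values).
index = build_index({
    "x": ({"SP", "ES"}, {"gender": "F"}),
    "y": ({"HS"}, {"gender": "M"}),
})
```

With the index built ahead of time, asking "who has skill SP and gender F" becomes a dictionary lookup rather than a scan over all candidates.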

  13. FPT Optimal Algorithm: Workflow • Calculate the Optimal Abstract Templates and insert them into ℍ and 𝕄 • Extract an Abstract Template A from 𝕄 • Create Concrete Templates from A • Check in 𝕊ℙℂ for candidates which satisfy all concrete templates (for all properties) • If found, STOP and return • Otherwise, create the NEXT Abstract Templates from A • If not in ℍ, insert into ℍ and 𝕄
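
The workflow can be read as a best-first search over abstract templates. The sketch below follows that reading for a single property, under several assumptions that the slides do not confirm: the "next" templates are obtained by moving one team member from one value to another, the cost is an L1 distance, each skill is filled by a distinct candidate, and the starting templates are supplied by the caller. All names and data are hypothetical.

```python
import heapq
from fractions import Fraction
from itertools import permutations

def cost(counts, target):
    n = sum(counts)
    return sum(abs(Fraction(c, n) - t) for c, t in zip(counts, target))

def neighbours(counts):
    """Assumed 'next' templates: move one team member to another value."""
    for i in range(len(counts)):
        for j in range(len(counts)):
            if i != j and counts[i] > 0:
                nxt = list(counts)
                nxt[i] -= 1
                nxt[j] += 1
                yield tuple(nxt)

def pick_team(skills, wanted, index, used=()):
    """Backtracking: one distinct candidate per skill with the wanted value."""
    if not skills:
        return {}
    for name in sorted(index.get((skills[0], wanted[0]), ())):
        if name not in used:
            rest = pick_team(skills[1:], wanted[1:], index, used + (name,))
            if rest is not None:
                return {skills[0]: name, **rest}
    return None

def best_first_team(skills, values, target, candidates, starts):
    index = {}                                   # (skill, value) -> names
    for name, (held, value) in candidates.items():
        for s in held:
            index.setdefault((s, value), set()).add(name)
    seen = set(starts)
    heap = [(cost(t, target), t) for t in starts]
    heapq.heapify(heap)
    while heap:
        c, template = heapq.heappop(heap)        # cheapest abstract template
        pool = [v for v, k in zip(values, template) for _ in range(k)]
        for wanted in sorted(set(permutations(pool))):   # concrete templates
            team = pick_team(skills, wanted, index)
            if team is not None:
                return team, c                   # found: stop and return
        for nxt in neighbours(template):         # otherwise expand
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(heap, (cost(nxt, target), nxt))
    return None, None

# Hypothetical data: nobody is female, so the (2, 1) template cannot be met.
candidates = {"a": ({"SP"}, "M"), "b": ({"HS"}, "M"),
              "c": ({"ES"}, "M"), "d": ({"ES"}, "M")}
target = (Fraction(2, 3), Fraction(1, 3))
team, team_cost = best_first_team(("SP", "HS", "ES"), ("M", "F"),
                                  target, candidates, [(2, 1)])
```

In this run the search first rejects the zero-cost template, then tries its neighbours in order of cost, and returns an all-male team at cost 2/3.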

  14. Greedy Approximation Algorithm • Time complexity: $\mathrm{poly}(|S|, |C|)$ • Uses sets of candidates per skill • Greedy solution: in each step, chooses an unchosen skill and a candidate with that skill which (locally) minimize the distribution cost • Figure: one set of candidates per skill (HS, ES, SP), drawn from Jack, Bob, Denise, Alice and Chris
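
One possible reading of the greedy step, sketched for a single property: at each step, try every remaining skill with every unused candidate who holds it, and keep the pair that gives the lowest cost for the team built so far. Scoring the partial team by its own L1 distance to the target is an assumption, as are the candidate data and function names; the slides describe the actual objective only as a "benefit" function.

```python
from fractions import Fraction

def partial_cost(values, target):
    """Assumed local score: L1 distance of the partial team to the target."""
    n = len(values)
    return sum(abs(Fraction(values.count(v), n) - t) for v, t in target.items())

def greedy_team(skills, candidates, target):
    team = {}                       # skill -> chosen candidate
    remaining = list(skills)
    while remaining:
        best = None
        for s in remaining:
            for name, (held, value) in sorted(candidates.items()):
                if s in held and name not in team.values():
                    values = [candidates[m][1] for m in team.values()] + [value]
                    c = partial_cost(values, target)
                    if best is None or c < best[0]:
                        best = (c, s, name)
        if best is None:
            return None             # some skill cannot be covered
        _, s, name = best
        team[s] = name
        remaining.remove(s)
    return team

# Hypothetical candidates: name -> (skills held, gender).
candidates = {
    "a": ({"SP"}, "M"),
    "b": ({"SP", "ES"}, "F"),
    "c": ({"HS"}, "M"),
    "d": ({"HS"}, "F"),
    "e": ({"ES"}, "M"),
}
target = {"M": Fraction(2, 3), "F": Fraction(1, 3)}
team = greedy_team(["SP", "HS", "ES"], candidates, target)
```

Each step scans at most every skill and every candidate, so the running time stays polynomial in both, at the price of only locally optimal choices.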

  15. Greedy Approximation Algorithm (cont.) • Optimizes a function called benefit, which is inversely proportional to the distribution cost • The benefit function is monotone and submodular, so the greedy algorithm is guaranteed to return a 1/2-approximation of the optimal solution

  16. Experimentation • Tested scalability as a function of $|C|$, $|S|$, $|P|$ and Property Range • Default values: $|S| = 8$, $|P| = 5$, $|C| = 100K$, Property Range $= 4$ • Types of synthetic datasets: • TC1 (random assignment) • Property values: assigned randomly using a uniform distribution • Skills per candidate: randomly choosing between 1 and $|S|$ skills per candidate • TC2 (random assignment with 1 skill) • Property values: assigned randomly using a uniform distribution • Skills per candidate: each candidate is given 1 random skill • TC3 (skewed distribution with 2 skills) • Property values and skills (2 skills per candidate) are assigned using a skewed distribution
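
A rough sketch of how TC1-style and TC2-style synthetic candidates might be generated. The function name, the integer encoding of skills and property values, and the choice to draw the number of skills uniformly are assumptions; TC3 is omitted because the slides do not say which skewed distribution was used.

```python
import random

def make_candidates(n, num_skills, num_props, prop_range, one_skill, seed=0):
    """Return n candidates as (set of skill ids, list of property values)."""
    rng = random.Random(seed)           # fixed seed for reproducibility
    skills = list(range(num_skills))
    data = []
    for _ in range(n):
        # TC2: exactly one skill; TC1: between 1 and num_skills skills.
        k = 1 if one_skill else rng.randint(1, num_skills)
        held = set(rng.sample(skills, k))
        # Property values drawn uniformly from 0 .. prop_range - 1.
        props = [rng.randrange(prop_range) for _ in range(num_props)]
        data.append((held, props))
    return data

# Slide defaults for skills, properties and property range; a smaller n for brevity.
tc1 = make_candidates(1000, num_skills=8, num_props=5, prop_range=4, one_skill=False)
tc2 = make_candidates(1000, num_skills=8, num_props=5, prop_range=4, one_skill=True)
```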

  17. Experimentation: Varying Number of Skills • Figure: chart with series TC1FPT, TC2FPT, TC3FPT, TC1Greedy, TC2Greedy, TC3Greedy; x-axis: number of skills (2, 4, 6, 8, 10); y-axis: logarithmic scale from 0.001 to 1000

  18. Experimentation: Varying Number of Properties • Figure: chart with series TC1FPT, TC2FPT, TC3FPT, TC1Greedy, TC2Greedy, TC3Greedy; x-axis: number of properties (3, 5, 7); y-axis: logarithmic scale from 0.001 to 1000

  19. Experimentation: Varying Number of Candidates • Figure: chart with series TC1FPT, TC2FPT, TC3FPT, TC1Greedy, TC2Greedy, TC3Greedy; x-axis: number of candidates (50,000 to 500,000 in steps of 50,000); y-axis: logarithmic scale from 0.001 to 1000

  20. Experimentation: Varying Property Range • Figure: chart with series TC1FPT, TC2FPT, TC3FPT, TC1Greedy, TC2Greedy, TC3Greedy; x-axis: property range (3, 4, 5, 6); y-axis: logarithmic scale from 0.001 to 1000

  21. Experimentation: Quality of Results (Greedy vs. FPT) • Max diff: TC1 = 0, TC2 = 0.25, TC3 = 0.5 • Average over all test cases: TC1 = 0, TC2 = 0.01, TC3 = 0.11 • Average over test cases in which greedy didn't return the optimal result: TC1 = 0, TC2 = 0.25, TC3 = 0.29

  22. Conclusions • FPT Optimal Algorithm • Always returns an optimal result • Running time increases exponentially with the number of skills, the number of properties and the property range • Increasing the number of candidates doesn't impact running time (except when the data is skewed) • Might take a long time to find the optimal solution (especially when the data is skewed) • Outperforms the Greedy Algorithm when there is little skew in the data • Greedy Approximation Algorithm • Performs well under all types of data • Returns results close to optimal

  23. Questions?
