Algorithms for Team Formation
Evimaria Terzi (Boston University)

Experiments (Guru)
• Competition-based
• Dollar-based
• ClusterHire
Boston University, evimaria@cs.bu.edu

Experiments (Freelancer)
• Competition-based
• Dollar-based
• ClusterHire

Experiments
• Performance of CliqueGreedy

            Freelancer   Guru
Nodes       1764         721
Cliques     1660         520

Roadmap
• Background
• Team formation and cluster hires
• Team formation in the presence of a social network
• Inferring abilities of experts
• Team formation in educational settings

Setting [LLT’09]
• Experts (defining the set V, with |V| = n)
  – Every expert i is associated with a set of skills X_i and a price p_i
• Tasks
  – Every task T is associated with the set of skills required for performing the task
• A social network of experts, G = (V, E)
  – Edges indicate ability to work well together

                                    Team Formation
Experts’ skills                     Known
Participation of experts in teams   Unknown
Network structure                   Known

Team formation in the presence of a social network
• Given a task and a set of experts organized in a network, find the subset of experts that can effectively perform the task
  – Task: set of required skills
  – Expert: has a set of skills
  – Network: represents strength of relationships

Coverage is NOT enough
• T = {algorithms, java, graphics, python}
• Skills: Alice {algorithms}; Bob {python}; Cynthia {graphics, java}; David {graphics}; Eleanor {graphics, java, python}
• A, E could perform the task if they could communicate
• A, B, C form an effective group that can communicate
• Communication: the members of the team must be able to efficiently communicate and work together
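The coverage half of the requirement can be checked mechanically. The snippet below is a minimal sketch using the skill sets listed on this slide; the function names `covered_skills` and `covers` are illustrative and not from the talk. It says nothing about communication, which is exactly the point of the slide.

```python
# Skill sets and task as listed on the slide.
SKILLS = {
    "Alice": {"algorithms"},
    "Bob": {"python"},
    "Cynthia": {"graphics", "java"},
    "David": {"graphics"},
    "Eleanor": {"graphics", "java", "python"},
}
TASK = {"algorithms", "java", "graphics", "python"}


def covered_skills(team, skills):
    """Union of the skills of all team members."""
    covered = set()
    for member in team:
        covered |= skills[member]
    return covered


def covers(team, task, skills):
    """True if every skill required by the task is held by some member."""
    return task <= covered_skills(team, skills)


if __name__ == "__main__":
    print(covers({"Alice", "Eleanor"}, TASK, SKILLS))
    print(covers({"Alice", "Bob", "Cynthia"}, TASK, SKILLS))
    print(covers({"Bob", "David"}, TASK, SKILLS))
```

Both {Alice, Eleanor} and {Alice, Bob, Cynthia} pass this check, so coverage alone cannot distinguish them.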

Problem definition (EffectiveTeam)
• Given a task and a social network of individuals, find the subset (team) of individuals that can effectively perform the given task.
• Thesis: Good teams are teams that have the necessary skills and can also communicate effectively

How to measure effective communication?
• Diameter of the subgraph defined by the group members: the longest shortest path between any two nodes in the subgraph
• Figure examples: diameter = 1 for a team whose members are all connected, diameter = ∞ for a team whose members are not connected

How to measure effective communication?
• MST (minimum spanning tree) of the subgraph defined by the group members: the total weight of the edges of a tree that spans all the team nodes
• Figure examples: MST = 2 for a connected team, MST = ∞ for a team whose members are not connected
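Both communication measures can be computed on the subgraph induced by the team. The sketch below is illustrative only: the edge list is a hypothetical network (the slide's figure is not preserved), all edges have unit weight, and the function names are made up for this example.

```python
import math
from collections import deque

# Hypothetical unit-weight network; NOT taken from the slides.
EDGES = {("A", "B"): 1, ("A", "C"): 1, ("B", "C"): 1, ("C", "D"): 1, ("D", "E"): 1}


def induced_adjacency(edges, team):
    """Adjacency map of the subgraph induced by the team members."""
    adj = {v: {} for v in team}
    for (u, v), w in edges.items():
        if u in adj and v in adj:
            adj[u][v] = w
            adj[v][u] = w
    return adj


def diameter(edges, team):
    """Longest shortest path (in hops) inside the induced subgraph."""
    adj = induced_adjacency(edges, team)
    worst = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(adj):  # some member is unreachable
            return math.inf
        worst = max(worst, max(dist.values()))
    return worst


def mst_cost(edges, team):
    """Weight of a minimum spanning tree of the induced subgraph (Prim)."""
    adj = induced_adjacency(edges, team)
    if not adj:
        return 0
    in_tree = {next(iter(adj))}
    total = 0
    while len(in_tree) < len(adj):
        best = None
        for u in in_tree:
            for v, w in adj[u].items():
                if v not in in_tree and (best is None or w < best[0]):
                    best = (w, v)
        if best is None:  # induced subgraph is disconnected
            return math.inf
        total += best[0]
        in_tree.add(best[1])
    return total


if __name__ == "__main__":
    for team in ({"A", "B", "C"}, {"A", "E"}):
        print(sorted(team), diameter(EDGES, team), mst_cost(EDGES, team))
```

On this assumed network, {A, B, C} is a triangle while {A, E} has no edge between its members, so both measures are infinite for {A, E}.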

Problem definition (MinDiameter)
• Given a task and a social network G of experts, find the subset (team) of experts that can perform the given task and that defines a subgraph of G with the minimum diameter.
• The problem is NP-hard

The RarestFirst algorithm
• Find the rarest skill α_rare required for the task
• S_rare: the group of people that have α_rare
• Evaluate star graphs centered at individuals from S_rare
• Report the cheapest star
• Running time: quadratic in the number of nodes
• Approximation factor: 2 × OPT
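The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: distances are BFS hop counts, the edge list is a hypothetical network, ties are broken alphabetically, and only the skill holders at the ends of each star are returned (connecting paths are omitted). The skill sets are the ones used in the deck's RarestFirst example.

```python
from collections import deque

SKILLS = {
    "A": {"graphics", "python", "java"},
    "B": {"algorithms", "graphics"},
    "C": {"python", "java"},
    "D": {"python"},
    "E": {"algorithms", "graphics", "java"},
}
TASK = {"algorithms", "java", "graphics", "python"}
# Hypothetical edges; the network drawn on the slides is not preserved.
EDGE_LIST = [("B", "E"), ("C", "E"), ("C", "D"), ("A", "D")]


def build_adjacency(nodes, edge_list):
    adj = {v: set() for v in nodes}
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def hop_distances(adj, source):
    """BFS distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def rarest_first(adj, skills, task):
    """Return (star radius, center, team) for the cheapest star, or None."""
    holders = {a: sorted(p for p in skills if a in skills[p]) for a in task}
    if any(not h for h in holders.values()):
        return None  # some required skill is held by nobody
    rare = min(sorted(task), key=lambda a: len(holders[a]))
    best = None
    for center in holders[rare]:
        dist = hop_distances(adj, center)
        team, radius, feasible = {center}, 0, True
        for a in sorted(task):
            reachable = [(dist[p], p) for p in holders[a] if p in dist]
            if not reachable:
                feasible = False
                break
            d, p = min(reachable)  # closest holder of skill a
            team.add(p)
            radius = max(radius, d)
        if feasible and (best is None or radius < best[0]):
            best = (radius, center, team)
    return best


if __name__ == "__main__":
    adjacency = build_adjacency(SKILLS, EDGE_LIST)
    print(rarest_first(adjacency, SKILLS, TASK))
```

On the assumed network, the star centered at B reaches python only at distance 2, while the star centered at E has radius 1, so E's star is reported.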

The RarestFirst algorithm (example)
• T = {algorithms, java, graphics, python}
• Skills: A {graphics, python, java}; B {algorithms, graphics}; C {python, java}; D {python}; E {algorithms, graphics, java}
• α_rare = algorithms, S_rare = {Bob, Eleanor}
• The two candidate stars shown in the figures have diameter = 2 and diameter = 1

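Analysis of RarestFirst
• Figure: skill groups $S_1, \dots, S_\ell, \dots, S_k$ around $S_{rare}$, with distances $d_1, \dots, d_\ell, \dots, d_k$ and $d_{\ell k}$
• $D = \max\{d_\ell, d_k, d_{\ell k}\}$
• Fact: $OPT \ge d_\ell$
• Fact: $OPT \ge d_k$
• $D \le d_{\ell k} \le d_\ell + d_k \le 2 \cdot OPT$, where the middle step is the triangle inequality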

Problem definition (MinMST)
• Given a task and a social network G of experts, find the subset (team) of experts that can perform the given task and that defines a subgraph of G with the minimum MST cost.
• The problem is NP-hard

The SteinerTree problem
• Graph G(V, E)
• Partition of V into V = {R, N}, where R is the set of required vertices
• Find a subgraph G’ of G such that G’ contains all the required vertices (R) and MST(G’) is minimized

The EnhancedSteiner algorithm
• T = {algorithms, java, graphics, python}
• Skills: A {graphics, python, java}; B {algorithms, graphics}; C {python, java}; D {python}; E {algorithms, graphics, java}
• Figure: the required skills (graphics, java, algorithms, python) are drawn as additional nodes next to the expert network
• MST cost = 1

Exploiting the SteinerTree problem further
• Graph G(V, E)
• Partition of V into V = {R, N}, where R is the set of required vertices
• Find a subgraph G’ of G such that G’ contains all the required vertices (R) and MST(G’) is minimized

The CoverSteiner algorithm
• T = {algorithms, java, graphics, python}
• Skills: A {graphics, python, java}; B {algorithms, graphics}; C {python, java}; D {python}; E {algorithms, graphics, java}
1. Solve SetCover
2. Solve Steiner
• MST cost = 1

How good is CoverSteiner?
• T = {algorithms, java, graphics, python}
• Skills: A {graphics, python, java}; B {algorithms, graphics}; C {python, java}; D {python}; E {algorithms, graphics, java}
1. Solve SetCover
2. Solve Steiner
• MST cost = ∞
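The two-step idea, and the way it can fail, can be sketched as below. Several parts are assumptions rather than the talk's method: step 1 uses the standard greedy SetCover heuristic with alphabetical tie-breaking, step 2 approximates the Steiner cost by an MST over shortest-path (hop) distances between the chosen experts, and both edge lists are hypothetical networks.

```python
import math
from collections import deque

SKILLS = {
    "A": {"graphics", "python", "java"},
    "B": {"algorithms", "graphics"},
    "C": {"python", "java"},
    "D": {"python"},
    "E": {"algorithms", "graphics", "java"},
}
TASK = {"algorithms", "java", "graphics", "python"}
# Two hypothetical networks; in the second one, A is isolated.
GOOD_EDGES = [("A", "B"), ("B", "E"), ("C", "E"), ("C", "D")]
BAD_EDGES = [("B", "E"), ("C", "E"), ("C", "D")]


def build_adjacency(nodes, edge_list):
    adj = {v: set() for v in nodes}
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def greedy_set_cover(skills, task):
    """Step 1: pick experts greedily by number of newly covered skills."""
    uncovered, chosen = set(task), []
    while uncovered:
        best = max(sorted(skills), key=lambda p: len(skills[p] & uncovered))
        if not skills[best] & uncovered:
            return None  # task cannot be covered
        chosen.append(best)
        uncovered -= skills[best]
    return chosen


def hop_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connection_cost(adj, terminals):
    """Step 2 (approximate): MST over hop distances between terminals."""
    terminals = list(terminals)
    dist = {t: hop_distances(adj, t) for t in terminals}
    in_tree, total = {terminals[0]}, 0
    while len(in_tree) < len(terminals):
        candidates = [(dist[u][v], v) for u in in_tree for v in terminals
                      if v not in in_tree and v in dist[u]]
        if not candidates:
            return math.inf  # chosen experts cannot be connected
        d, v = min(candidates)
        total += d
        in_tree.add(v)
    return total


if __name__ == "__main__":
    team = greedy_set_cover(SKILLS, TASK)
    for edges in (GOOD_EDGES, BAD_EDGES):
        adj = build_adjacency(SKILLS, edges)
        print(team, connection_cost(adj, team))
```

Because step 1 ignores the network, it returns the same cover in both cases. On the second assumed network that cover cannot be connected, even though {C, E} also covers the task and is joined by a single edge.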

Experiments – Cardinality of teams
• Dataset: DBLP graph (DB, Theory, ML, DM)
  – ~6000 authors
  – ~2000 features
• Features: keywords appearing in papers
• Tasks: subsets of keywords with different cardinality k

Example teams (I)
• Task: S. Brin, L. Page: The anatomy of a large-scale hypertextual Web search engine
• Team: Paolo Ferragina, Patrick Valduriez, H. V. Jagadish, Alon Y. Levy, Daniela Florescu, Divesh Srivastava, S. Muthukrishnan
• Team: P. Ferragina, J. Han, H. V. Jagadish, Kevin Chen-Chuan Chang, A. Gulli, S. Muthukrishnan, Laks V. S. Lakshmanan

Example teams (II)
• Task: J. Han, J. Pei, Y. Yin: Mining frequent patterns without candidate generation
• Team: F. Bonchi
• Team: A. Gionis, H. Mannila, R. Motwani

Extensions
• Other measures of effective communication
  – Density, number of times a team member participates as a mediator, information propagation
• Other practical restrictions
  – Incorporate ability levels
• Online team formation [ABCGL’12]

Setting
• Pool of people/experts with different skills
• People are connected through a social network
• Stream of jobs/tasks arriving online
• Jobs have some skill requirements
• Goal: Create teams on-the-fly for each job
  – Select the right team
  – Satisfy various criteria

Criteria
• Fitness
  – E.g. if fitness is success rate, maximize the expected number of successful jobs
  – Depends on people's skills and their ability to coordinate
• Efficiency
  – Do not load people very much
• Fairness
  – Everybody should be involved in roughly the same number of jobs

Basic formulation
• Stream of tasks arriving online; each task is a vector of skills (e.g. 00010101, 10011101)
• Each expert is a vector of skills (e.g. 10010010, 10001101)
• Coordination cost between experts

Basic formulation: skills and people
• n people/experts
• m skills
• Each person has some skills (e.g. skill vectors 10001101, 10010010)

Basic formulation: jobs & teams
• Stream of k jobs/tasks (e.g. skill vectors 00010101, 10011101)
• A job requires some skills
• k teams are created online
• A team must cover all job skills
• Load of p: L(p) = total number of teams having p

Coordination cost
• Coordination cost measures the compatibility of the team members
• Examples:
  – Degree of knowledge
  – Time-zone difference
  – Past collaboration
• Select teams that minimize coordination cost:
  – Steiner-tree cost
  – Diameter
  – Sum of distances

Conflicting goals
• We want to create teams online that cover each job and minimize
  – Load
  – Unfairness
  – Coordination cost
• How can we model all these requirements?

Our modeling approach
• Set a desirable coordination cost upper bound B
• Solve online an optimization program (formula missing in the source) whose annotated parts are:
  – Load of person i
  – Team j covers job j
  – Bounded coordination cost
• Must concurrently solve various combinatorial problems:
  – Set cover
  – Steiner tree
  – Online makespan minimization

Our modeling approach

Job    p1  p2  p3  p4  p5  p6  p7   Q_j
1      .   x   .   x   x   .   .    Q_1 = {p2, p4, p5}
2      x   .   .   x   .   x   .    Q_2 = {p1, p4, p6}
3      .   .   x   x   .   .   .    Q_3 = {p3, p4}
4      x   .   .   .   x   .   x    Q_4 = {p1, p5, p7}
5      .   x   x   x   x   .   .    Q_5 = {p2, p3, p4, p5}
6      .   .   x   .   x   x   .    Q_6 = {p3, p5, p6}
7      x   x   .   .   .   .   .    Q_7 = {p1, p2}
8      x   x   x   x   .   .   x    Q_8 = {p1, p2, p3, p4, p7}
9      .   .   x   x   x   .   .    Q_9 = {p3, p4, p5}
Load   4   4   5   6   5   2   2
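The load row follows directly from the definition L(p) = number of teams that contain p. A minimal sketch that recomputes it from the teams Q_1 to Q_9 listed on this slide (the function name `loads` is illustrative):

```python
from collections import Counter

# Teams Q_1 .. Q_9 as listed on the slide.
TEAMS = {
    1: {"p2", "p4", "p5"},
    2: {"p1", "p4", "p6"},
    3: {"p3", "p4"},
    4: {"p1", "p5", "p7"},
    5: {"p2", "p3", "p4", "p5"},
    6: {"p3", "p5", "p6"},
    7: {"p1", "p2"},
    8: {"p1", "p2", "p3", "p4", "p7"},
    9: {"p3", "p4", "p5"},
}
PEOPLE = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]


def loads(teams, people):
    """L(p) = number of teams that contain person p."""
    counts = Counter(p for team in teams.values() for p in team)
    return {p: counts[p] for p in people}


if __name__ == "__main__":
    result = loads(TEAMS, PEOPLE)
    print(result)                 # loads 4, 4, 5, 6, 5, 2, 2 for p1 .. p7
    print(max(result.values()))   # 6, the load of p4
```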

The ExpLoad algorithm
At each time step t, when a task arrives:
• Weight each person p by a function of the load of p at time t (formula missing in the source)
• Select team Q that
  – Covers all required skills
  – Satisfies a constraint (formula missing in the source)
  – Minimizes an objective (formula missing in the source)
• Theorem. If we can solve this problem optimally, then the competitive ratio equals a bound whose value is missing in the source. This is the best possible.
• We can solve this problem only approximately.

Setting [GLT’12]
• Experts (defining the set V, with |V| = n)
  – Every expert i is associated with a set of skills X_i and a price p_i
• Tasks
  – Every task T is associated with the set of skills required for performing the task
• A social network of experts, G = (V, E)
  – Edges indicate ability to work well together

                                    Team Formation   Skill Attribution
Experts’ skills                     Known            Unknown
Participation of experts in teams   Unknown          Known
Network structure                   Known            Irrelevant

The Skill-Attribution problem
• Input: a set of teams and the tasks they performed
  – Team T_1 = {A, B} performed task S_1 = {algorithms, databases}
  – Team T_2 = {B, C, D} performed task S_2 = {algorithms, systems, programming}
  – Team T_3 = {A, B, C} performed task S_3 = {databases, algorithms, systems}
• Question: What are the contributions of each team member?
  – Team {A, B} appears to know algorithms and databases, but who knows algorithms and who knows databases?
• Assumptions:
  – Complementarity: A team has a skill if at least one of its members has that skill
  – Parsimony: It is hard to imagine a world where all individuals have all skills

The Skill-Attribution problem
• The input introduces a set of constraints
  – Team T_1 = {A, B} performed task S_1 = {algorithms, databases}
  – Team T_2 = {B, C, D} performed task S_2 = {algorithms, systems, programming}
  – Team T_3 = {A, B, C} performed task S_3 = {databases, algorithms, systems}
• A skill assignment is consistent if for every team T_i and every skill s ∈ S_i there exists at least one expert in T_i who has s.
• A skill assignment is consistent if and only if it is consistent for every skill separately
• Focus on the single-skill attribution problem
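The consistency condition, and the fact that it decomposes skill by skill, can be written down directly. The snippet below is a minimal sketch on the three teams from this slide; the dictionary layout (skill mapped to the set of experts assumed to hold it) and the function names are illustrative choices, not part of the talk.

```python
TEAMS = {"T1": {"A", "B"}, "T2": {"B", "C", "D"}, "T3": {"A", "B", "C"}}
TASKS = {
    "T1": {"algorithms", "databases"},
    "T2": {"algorithms", "systems", "programming"},
    "T3": {"databases", "algorithms", "systems"},
}


def is_consistent_for_skill(skill, holders, teams, tasks):
    """Every team whose task used the skill contains at least one holder."""
    return all(teams[t] & holders for t in teams if skill in tasks[t])


def is_consistent(assignment, teams, tasks):
    """Consistent overall iff consistent for every skill separately."""
    all_skills = set().union(*tasks.values())
    return all(
        is_consistent_for_skill(s, assignment.get(s, set()), teams, tasks)
        for s in all_skills
    )


if __name__ == "__main__":
    # Two hypothetical assignments (skill -> experts assumed to hold it).
    good = {"algorithms": {"B"}, "databases": {"A"},
            "systems": {"C"}, "programming": {"D"}}
    bad = dict(good, databases={"C"})  # C is not a member of T1
    print(is_consistent(good, TEAMS, TASKS))
    print(is_consistent(bad, TEAMS, TASKS))
```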

Skill vectors and hitting sets
• Example (s = algorithms): Team T_1 = {A, B}, Team T_2 = {B, C}, Team T_3 = {C, D}, Team T_4 = {D, E}
• Figure: bipartite graph linking individuals A to E with teams T1 to T4
• A skill vector assigns skill s to individuals from V
• Any consistent skill vector is a hitting set for the set system (T_1, T_2, …, T_m, V)
  – Teams: subsets of individuals
  – V: universe of individuals

Minimum skill attribution (v 0.0)
• For a single skill s and input teams T_1, T_2, …, T_m, find a consistent skill attribution with the minimum number of individuals possessing s.
• Example (s = algorithms): Team T_1 = {A, B}, Team T_2 = {B, C}, Team T_3 = {C, D}, Team T_4 = {D, E}
• Minimum skill attribution: X* = {B, D}
• Minimum skill attribution is as hard as the minimum hitting set problem
• X* is a strictly parsimonious solution
• One solution is not enough: near-optimal attributions are ignored
  – X’ = {A, C, D}, X’’ = {A, C, E}, X’’’ = {B, C, D}, X’’’’ = {B, C, E}
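For an instance as small as the four-team example, the minimum attribution can be found by exhaustive search. The sketch below is a brute-force illustration only (exponential in |V|, in line with the hardness noted on the slide), and the function names are made up for this example.

```python
from itertools import combinations

TEAMS = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]
PEOPLE = ["A", "B", "C", "D", "E"]


def hitting_sets_of_size(people, teams, size):
    """All subsets of the given size that intersect every team."""
    return [set(c) for c in combinations(people, size)
            if all(team & set(c) for team in teams)]


def minimum_hitting_sets(people, teams):
    """All hitting sets of the smallest possible size (brute force)."""
    for size in range(len(people) + 1):
        found = hitting_sets_of_size(people, teams, size)
        if found:
            return found
    return []


if __name__ == "__main__":
    print(minimum_hitting_sets(PEOPLE, TEAMS))            # only {B, D}
    print(len(hitting_sets_of_size(PEOPLE, TEAMS, 3)))    # 6 sets of size 3
```

The six consistent attributions of size 3 include the four near-optimal ones listed on the slide.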

Counting all consistent skill vectors
• For a single skill s and input teams T_1, T_2, …, T_m, count for every individual in V the number of consistent skill vectors that individual participates in.
• Equivalent to counting hitting sets for input (T_1, T_2, …, T_m, V)
• #P-complete problem

The lattice of skill vectors
• Each element is a subset of V that possesses skill s
• Top: V (everyone has skill s); bottom: Ø (no one has skill s)
• Consistent subsets: the minimal sets and the supersets of a minimal set
• Inconsistent subsets: the remaining subsets, down to Ø

Counting all consistent skill vectors
• Naïve Monte-Carlo sampling over the lattice (from Ø, no one has skill s, up to V, everyone has skill s):
    C = 0
    for i = 1…N:
        sample an element from the lattice; if it is consistent, C++
    return (C/N) × 2^n
• Does not work when there are few consistent vectors
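The naïve estimator above translates almost line by line into code. The sketch below runs it on the four-team example from the hitting-set slides and compares it with an exact brute-force count, which is only feasible because |V| = 5. Function names, the sample size and the seed are illustrative assumptions.

```python
import random
from itertools import combinations

TEAMS = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]
PEOPLE = sorted(set().union(*TEAMS))


def is_hitting_set(subset, teams):
    return all(team & subset for team in teams)


def naive_monte_carlo(people, teams, num_samples, rng):
    """Estimate the number of consistent vectors as (C / N) * 2^n."""
    hits = 0
    for _ in range(num_samples):
        # Uniform sample from the lattice: include each person w.p. 1/2.
        sample = {p for p in people if rng.random() < 0.5}
        if is_hitting_set(sample, teams):
            hits += 1
    return hits / num_samples * 2 ** len(people)


def exact_counts(people, teams):
    """Exact total and per-person counts by enumerating all 2^n subsets."""
    total, per_person = 0, {p: 0 for p in people}
    for size in range(len(people) + 1):
        for combo in combinations(people, size):
            subset = set(combo)
            if is_hitting_set(subset, teams):
                total += 1
                for p in subset:
                    per_person[p] += 1
    return total, per_person


if __name__ == "__main__":
    print(exact_counts(PEOPLE, TEAMS))  # 13 consistent vectors in total
    print(naive_monte_carlo(PEOPLE, TEAMS, 20000, random.Random(0)))
```

Here 13 of the 32 subsets are consistent, so uniform sampling works well. When consistent vectors are a tiny fraction of the lattice, almost every sample is wasted, which is the weakness noted on the slide.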

The ImportanceSampling algorithm
• Assume we know the set of minimal sets that contain r: M(r) = {M_1, …, M_k}
• Sample consistent vectors from the space of hitting sets only (the supersets of the minimal sets)
• Running time: polynomial in k

ImportanceSampling Speedups
• Run ImportanceSampling for all experts simultaneously
• View the input as a bipartite graph and partition it into (almost) independent components
• Cluster together experts that participate in identical sets of teams into super-experts

[Figure: bipartite graph of experts 1 to 5 and teams T1, T2, T3, T5, T6, T7, T8, partitioned into components A and B]
ConsistentVectors(1) = ConsistentVectors(1, A) × ConsistentVectors(B)

Ranking of experts

social networks      privacy               graphs
P. Mika (1)          A. Acquisti (1)       C. Faloutsos (1)
J. Golbeck (5)       M. S. Ackerman (3)    J. Kleinberg (2)
M. Richardson (5)    L. Faith Cranor (3)   J. Leskovec (2)
P. Singla (19)       B. Berendt (5)        R. Kumar (3)
L. Zhou (7)          S. Spiekermann (5)    A. Tomkins (3)
A. Java (19)         O. Gunther (19)       L. A. Adamic (3)
L. Ding (2)          J. Grossklags (5)     E. Vee (4)
T. Finin (2)         G. Hsieh (19)         P. Ginsparg (4)
A. Joshi (2)         K. Vaniea (19)        J. Gehrke (4)
R. Agrawal (19)      N. Sadeh (19)         B. A. Huberman (3)

Team formation in educational settings [AGT’14]
• Consider a class of students
  – Different ability levels (single scores)
  – Example: GRE, TOEFL, SAT, …
• How to form study groups?

Team formation in educational settings [AGT’14]
• Classical methods
  – Ability-Based Grouping: grouping students with similar abilities together
  – Pseudo-Random Grouping: grouping students based on some arbitrary ordering (alphabetically, FCFS, …)
• Which method to use?
  – Inconclusive verdict from empirical studies (Kulik 92, Loveless 13, McPartland 87)
  – Let’s take a computational approach

Framework
• Two groups of students in a study group
  – Students below the collective ability: mostly learn from other members of the group
  – Students above the collective ability: mostly improve by teaching others
• Our focus: maximize the number of such students
