Algorithms for Team Formation
Evimaria Terzi (Boston University)
Boston University Slideshow Title Goes Here
evimaria@cs.bu.edu
Given a task and a set of experts (organized in a network), find the subset of experts that can effectively perform the task.
Task: set of required skills and potentially a budget
Expert: has a set of skills and potentially a price
Network: represents strength of relationships
2001
Organizer, Insider, Co-organizer, Security expert, Mechanic, Mechanic, Electronics expert, Explosives expert, Acrobat, Con-man, Pick-pocket thief
Collaboration networks (e.g., scientists, actors)
Organizational structure of companies
LinkedIn, Odesk, Elance
Geographical (map) of experts
∪_{c∈C} c contains many elements from U
for the task
to form collection C (C ⊆ S) such that ∪_{c∈C} c = U
mean?)
elements in U
O(2^|S| |U|)
selected set is going to be
factor F if for all instances I we have that a(I) ≤ F × a*(I)
algorithm for set cover ?
approximation factor F = O(log |s_max|)
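The slide text around this bound did not survive extraction. As a reminder, the standard greedy heuristic for set cover is known to achieve a logarithmic factor of this kind. The sketch below is a minimal illustration of that heuristic, not necessarily the exact formulation used in the talk; the function name and the toy instance are made up for illustration.

```python
def greedy_set_cover(universe, subsets):
    """Greedy heuristic: repeatedly pick the subset that covers
    the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Subset with the largest number of still-uncovered elements.
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        gain = subsets[best] & uncovered
        if not gain:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Toy instance (hypothetical, for illustration only).
U = {1, 2, 3, 4, 5, 6}
S = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6}, "s4": {1, 6}}
cover = greedy_set_cover(U, S)
print(cover)
```

Each round scans all subsets, so the running time is polynomial, in contrast to exhaustive search over all sub-collections of S.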
T = {algorithms, java, graphics, python}
Coverage: For every required skill in T there is at least one team member who has it
Alice: {algorithms}
Bob: {python}
Cynthia: {graphics, java}
David: {graphics}
Eleanor: {graphics, java, python}
Team covering T: Alice {algorithms}, Eleanor {graphics, java, python}
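The coverage condition in this example can be checked mechanically. The snippet below is a small sketch using the five experts and the task T from the slide; the function name `covers` is an illustrative choice.

```python
experts = {
    "Alice": {"algorithms"},
    "Bob": {"python"},
    "Cynthia": {"graphics", "java"},
    "David": {"graphics"},
    "Eleanor": {"graphics", "java", "python"},
}
task = {"algorithms", "java", "graphics", "python"}


def covers(team, task, experts):
    """Return True if every required skill is held by at least one team member."""
    team_skills = set().union(*(experts[name] for name in team))
    return task <= team_skills


print(covers(["Alice", "Eleanor"], task, experts))  # team shown on the slide
print(covers(["Bob", "Cynthia"], task, experts))    # nobody knows algorithms
```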
Given a task and a set of individuals, find the most efficient subset (team) of individuals that can perform the given task.
NP-hard (Set Cover Problem)
Experts (defining the set V, with |V| = n):
Every expert i is associated with a set of skills X_i and a price p_i
Tasks
Every task T is associated with a set of skills (T) required for performing the task
Team Formation
Experts’ skills: Known
Participation of experts in teams: Unknown
JAVA, Node.JS: 90$ / hour; Node.JS, SQL: 10$ / hour; HTML, JAVA: 33$ / hour; …
JAVA, C++, SQL: 18$ / hour; JAVA, HTML: 7$ / hour; HTML, Node.JS: 40$ / hour; …
Organizations, agencies: who to hire and which jobs to do?
holds
Competition-based ClusterHire
Guru
Nodes: 721, Cliques: 520
Nodes: 1764, Cliques: 1660
Experts (defining the set V, with |V| = n):
Every expert i is associated with a set of skills X_i and a price p_i
Tasks
Every task T is associated with a set of skills (T) required for performing the task
A social network of experts (G=(V,E))
Edges indicate ability to work well together
Team Formation
Experts’ skills: Known
Participation of experts in teams: Unknown
Network structure: Known
Given a task and a set of experts organized in a network, find the subset of experts that can effectively perform the task.
Task: set of required skills
Expert: has a set of skills
Network: represents strength of relationships
Communication: the members of the team must be able to efficiently communicate and work together
T = {algorithms, java, graphics, python}
Alice (A): {algorithms}
Bob (B): {python}
Cynthia (C): {graphics, java}
David (D): {graphics}
Eleanor (E): {graphics, java, python}
[Figure: social network over A, B, C, D, E]
A, E could perform the task if they could communicate
A, B, C form an effective group that can communicate
Given a task and a social network of individuals, find the subset (team) of individuals that can effectively perform the given task.
Thesis: Good teams are teams that have the necessary skills and can also communicate effectively
Diameter of the subgraph defined by the group members: the longest shortest path between any two nodes in the subgraph
[Figure: two example teams in the network over A, B, C, D, E, one with diameter = ∞ and one with diameter = 1]
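As a rough illustration of the diameter measure, the sketch below computes the longest shortest path inside the subgraph induced by a team, using breadth-first search on an unweighted graph. The edge set is hypothetical (the slide's actual graph is not recoverable from the text); it was chosen so that A, B, C are pairwise connected while A and E are not adjacent.

```python
from collections import deque
from math import inf


def team_diameter(graph, team):
    """Longest shortest path (in hops) between team members,
    walking only through team members."""
    team = set(team)
    worst = 0
    for source in team:
        # Breadth-first search restricted to the subgraph induced by the team.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v in team and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(team):
            return inf  # some member is unreachable inside the team
        worst = max(worst, max(dist.values()))
    return worst


# Hypothetical adjacency lists (assumption, not taken from the slide).
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "E"},
    "D": {"B", "E"},
    "E": {"C", "D"},
}
print(team_diameter(graph, {"A", "E"}))
print(team_diameter(graph, {"A", "B", "C"}))
```

With these assumed edges, the team {A, E} is disconnected in its induced subgraph, while {A, B, C} forms a triangle.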
MST (minimum spanning tree) of the subgraph defined by the group members: the total weight of the edges of a tree that spans all the team nodes
[Figure: two example teams in the network over A, B, C, D, E, one with MST = ∞ and one with MST = 2]
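The MST measure can be sketched in the same spirit. The code below runs Kruskal's algorithm on the subgraph induced by a team and returns infinity when the team is disconnected. The unit-weight edge list is a hypothetical stand-in for the graph in the figure.

```python
from math import inf


def team_mst_cost(edges, team):
    """Total weight of a minimum spanning tree of the subgraph
    induced by the team (Kruskal with union-find)."""
    team = set(team)
    parent = {v: v for v in team}

    def find(v):
        # Root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    cost, used = 0, 0
    induced = sorted((w, u, v) for u, v, w in edges if u in team and v in team)
    for w, u, v in induced:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            cost += w
            used += 1
    # A spanning tree on k nodes has k - 1 edges; fewer means disconnected.
    return cost if used == len(team) - 1 else inf


# Hypothetical unit-weight edges (u, v, weight); an assumption for illustration.
edges = [
    ("A", "B", 1), ("A", "C", 1), ("B", "C", 1),
    ("B", "D", 1), ("C", "E", 1), ("D", "E", 1),
]
print(team_mst_cost(edges, {"A", "E"}))
print(team_mst_cost(edges, {"A", "B", "C"}))
```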
Given a task and a social network G of experts, find the subset (team) of experts that can perform the given task and that defines a subgraph of G with the minimum diameter.
Problem is NP-hard
Find the rarest skill α_rare required for the task
S_rare: the group of people that have α_rare
Evaluate star graphs centered at individuals from S_rare
Report the cheapest star
Running time: quadratic in the number of nodes
Approximation factor: 2 (cost at most 2 × OPT)
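The steps above can be sketched as follows. This is a minimal illustration under several assumptions: the graph is unweighted, distances are hop counts in the full network, the cost of a star is the largest distance from its center to a chosen expert, and the returned team lists only the chosen experts (intermediate nodes on the connecting paths would also have to be added). The function names, the expert names, and the example network are all hypothetical.

```python
from collections import deque
from math import inf


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def rarest_first(graph, skills, task):
    """Return (center, {skill: chosen expert}, star cost)."""
    # Support of each required skill: the experts who have it.
    support = {a: [v for v in graph if a in skills[v]] for a in task}
    if any(not s for s in support.values()):
        raise ValueError("some required skill has no expert")
    rare = min(task, key=lambda a: len(support[a]))
    best = (inf, None, None)
    for center in support[rare]:
        dist = bfs_distances(graph, center)
        # For every skill, attach the closest expert who has it.
        star = {a: min(support[a], key=lambda v: dist.get(v, inf)) for a in task}
        cost = max(dist.get(v, inf) for v in star.values())
        if cost < best[0]:
            best = (cost, center, star)
    cost, center, star = best
    return center, star, cost


# Hypothetical network and skills (assumptions for illustration).
graph = {
    "Ann": {"Ben", "Cat"},
    "Ben": {"Ann", "Cat", "Dan"},
    "Cat": {"Ann", "Ben", "Eve"},
    "Dan": {"Ben", "Eve"},
    "Eve": {"Cat", "Dan"},
}
skills = {
    "Ann": {"graphics", "python", "java"},
    "Ben": {"algorithms", "graphics"},
    "Cat": {"python", "java"},
    "Dan": {"python"},
    "Eve": {"algorithms", "graphics", "java"},
}
task = {"algorithms", "java", "graphics", "python"}
center, star, cost = rarest_first(graph, skills, task)
print(center, sorted(set(star.values())), cost)
```

Only the holders of the rarest skill are tried as centers, which keeps the number of evaluated stars small.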
T = {algorithms, java, graphics, python}
Skill sets in the network over A, B, C, D, E: {graphics, python, java}, {algorithms, graphics}, {algorithms, graphics, java}, {python, java}, {python}
α_rare = algorithms, S_rare = {Bob, Eleanor}
Star with B, E, A; skills covered: algorithms, graphics, java, python
Diameter = 2
Star with E, C; skills covered: algorithms, graphics, java, python
Diameter = 1
D = max{d_ℓ, d_k, d_ℓk}
Fact: OPT ≥ d_ℓ
Fact: OPT ≥ d_k
D ≤ d_ℓk ≤ d_ℓ + d_k ≤ 2 × OPT
[Figure: S_rare and the supports S_1, …, S_ℓ, …, S_k, with distances d_1, …, d_ℓ, …, d_k and d_ℓk]
Given a task and a social network G of experts, find the subset (team) of experts that can perform the given task and that defines a subgraph of G with the minimum MST cost.
Problem is NP-hard
Graph G(V,E)
Partition of V into V = {R,N}, where R are the required vertices
Find G’ subgraph of G such that G’ contains all the required vertices (R) and MST(G’) is minimized
T = {algorithms, java, graphics, python}
Skill sets in the network over A, B, C, D, E: {graphics, python, java}, {algorithms, graphics}, {algorithms, graphics, java}, {python, java}, {python}
Required skills: python, java, graphics, algorithms
Team E, D: MST cost = 1
Team A, B: MST cost = ∞
Dataset: DBLP graph (DB, Theory, ML, DM)
~6000 authors
~2000 features
Features: keywords appearing in papers
Tasks: subsets of keywords with different cardinality k
Web search engine
Paolo Ferragina, Patrick Valduriez, H. V. Jagadish, Alon
Muthukrishnan
P. Ferragina, J. Han, H. V. Jagadish, Kevin Chen-Chuan Chang, A. Gulli, S. Muthukrishnan, Laks V. S. Lakshmanan
J. Han, J. Pei, Y. Yin: Mining frequent patterns without candidate generation
F. Bonchi, A. Gionis, H. Mannila, R. Motwani
Other measures of effective communication
density, number of times a team member participates as a mediator, information propagation
Other practical restrictions
Incorporate ability levels
Online team formation [ABCGL’12]
– Select the right team
– Satisfy various criteria
– E.g. if fitness is success rate, maximize expected number
– Depends on:
– People skills
– Ability to coordinate
– Do not load people very much
– Everybody should be involved in roughly the same number of jobs
[Figure: vectors of skills (00010101, 10010010, 10001101, 10011101), a stream of tasks arriving online, and coordination cost]
members
– Degree of knowledge
– Time-zone difference
– Past collaboration
– Steiner-tree cost
– Diameter
– Sum of distances
– Load
– Unfairness
– Coordination cost
and cover each job.
– Set cover
– Steiner tree
– Online makespan minimization
Load of person i
Team j covers job j
Bounded coordination cost
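Job   p1  p2  p3  p4  p5  p6  p7  Qj
1     -   x   -   x   x   -   -   Q1 = {p2, p4, p5}
2     x   -   -   x   -   x   -   Q2 = {p1, p4, p6}
3     -   -   x   x   -   -   -   Q3 = {p3, p4}
4     x   -   -   -   x   -   x   Q4 = {p1, p5, p7}
5     -   x   x   x   x   -   -   Q5 = {p2, p3, p4, p5}
6     -   -   x   -   x   x   -   Q6 = {p3, p5, p6}
7     x   x   -   -   -   -   -   Q7 = {p1, p2}
8     x   x   x   x   -   -   x   Q8 = {p1, p2, p3, p4, p7}
9     -   -   x   x   x   -   -   Q9 = {p3, p4, p5}
Load  4   4   5   6   5   2   2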
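The load row of this table is simply the number of teams each person participates in. A short sketch that recomputes it from the teams Q1 to Q9 (the variable names are illustrative):

```python
from collections import Counter

# Teams Q1..Q9 as listed in the table.
teams = {
    1: {"p2", "p4", "p5"},
    2: {"p1", "p4", "p6"},
    3: {"p3", "p4"},
    4: {"p1", "p5", "p7"},
    5: {"p2", "p3", "p4", "p5"},
    6: {"p3", "p5", "p6"},
    7: {"p1", "p2"},
    8: {"p1", "p2", "p3", "p4", "p7"},
    9: {"p3", "p4", "p5"},
}

# Load of a person = number of jobs whose team includes that person.
load = Counter(person for team in teams.values() for person in team)
max_load = max(load.values())
print(sorted(load.items()), max_load)
```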
At each time step t, when a task arrives:
– Covers all required skills
– Satisfies
– Minimizes
Competitive ratio = . This is the best possible.
Load of p at time t
We can solve this problem only approximately.
                                      Team Formation   Skill Attribution
Experts’ skills                       Known            Unknown
Participation of experts in teams     Unknown          Known
Network structure                     Known            Irrelevant
Input: a set of teams and the tasks they performed
Team T1={A,B} performed task S1={algorithms, databases}
Team T2={B,C,D} performed task S2={algorithms, systems, programming}
Team T3={A,B,C} performed task S3={databases, algorithms, systems}
Question: What are the contributions of each team member?
Team {A,B} appears to know algorithms and databases, but who knows algorithms and who knows databases?
Assumptions:
Complementarity: A team has a skill if at least one of its members has that skill
Parsimony: It is hard to imagine a world where all individuals have all skills
The input introduces a set of constraints
Team T1={A,B} performed task S1={algorithms, databases}
Team T2={B,C,D} performed task S2={algorithms, systems, programming}
Team T3={A,B,C} performed task S3={databases, algorithms, systems}
A skill assignment is consistent if for every team Ti and every skill s ∈ Si there exists at least one expert in Ti who has s.
A skill assignment is consistent if and only if it is consistent for every skill separately
Focus on the single-skill attribution problem
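The consistency definition, and its decomposition per skill, can be written down directly. The sketch below uses the three teams from the slide; the two candidate assignments are hypothetical, the function names are illustrative, and "system" in S2 is read as "systems" (an assumed typo).

```python
teams = {"T1": {"A", "B"}, "T2": {"B", "C", "D"}, "T3": {"A", "B", "C"}}
tasks = {
    "T1": {"algorithms", "databases"},
    "T2": {"algorithms", "systems", "programming"},  # "system" read as "systems"
    "T3": {"databases", "algorithms", "systems"},
}


def is_consistent(assignment, teams, tasks):
    """assignment maps each expert to a set of skills. Consistent means that
    every skill of every task is held by some member of the team that did it."""
    return all(
        any(skill in assignment.get(member, set()) for member in teams[t])
        for t in teams
        for skill in tasks[t]
    )


def is_consistent_for_skill(holders, skill, teams, tasks):
    """Single-skill version: holders is the set of experts assumed to have skill."""
    return all(teams[t] & holders for t in teams if skill in tasks[t])


# Two hypothetical assignments.
good = {"B": {"algorithms", "databases"}, "C": {"systems", "programming"}}
bad = {"A": {"algorithms", "databases"}}
print(is_consistent(good, teams, tasks), is_consistent(bad, teams, tasks))
```

Checking each skill on its own with `is_consistent_for_skill` gives the same verdict as the joint check, which is what justifies focusing on one skill at a time.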
A skill vector assigns skill s to individuals from V
Any consistent skill vector is a hitting set for the set system (T1,T2,…,Tm, V)
Universe of individuals: A, B, C, D, E
Teams (subsets of individuals): T1, T2, T3, T4
s = algorithms
Team T1={A,B}
Team T2={B,C}
Team T3={C,D}
Team T4={D,E}
For a single skill s, and input teams T1,T2,…,Tm, find a consistent skill attribution with the minimum number of individuals possessing s.
Individuals: A, B, C, D, E
s = algorithms
Team T1={A,B}
Team T2={B,C}
Team T3={C,D}
Team T4={D,E}
Minimum skill attribution: X* = {B,D}
Minimum skill attribution is as hard as the minimum hitting set problem
X* is a strictly parsimonious solution
One solution is not enough: near-optimal attributions are ignored
X’={A,C,D}, X’’={A,C,E}, X’’’={B,C,D}, X’’’’={B,C,E}
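For an instance this small, the minimum skill attribution can be found by exhaustive search over subsets of increasing size. The sketch below is only meant to confirm the example; it takes exponential time in general, in line with the hardness of minimum hitting set. The function name is illustrative.

```python
from itertools import combinations


def minimum_hitting_sets(universe, teams):
    """All smallest subsets of universe that intersect every team."""
    people = sorted(universe)
    for size in range(len(people) + 1):
        hits = [
            set(combo)
            for combo in combinations(people, size)
            if all(set(combo) & team for team in teams)
        ]
        if hits:
            return hits
    return []


V = {"A", "B", "C", "D", "E"}
teams = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]
print(minimum_hitting_sets(V, teams))
```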
For a single skill s, and input teams T1,T2,…,Tm, count for every individual in V the number of consistent skill vectors they participate in.
Equivalent to counting hitting sets for input (T1,T2,…,Tm, V)
#P-complete problem
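On a tiny instance the counts can be obtained by enumerating all 2^n subsets. The sketch below does this for the four-team example with teams {A,B}, {B,C}, {C,D}, {D,E}; it is an exhaustive baseline for illustration, not a method that scales. The function name is an illustrative choice.

```python
from itertools import combinations


def count_consistent_vectors(universe, teams):
    """Count hitting sets in total and, for each individual,
    the number of hitting sets that contain them."""
    people = sorted(universe)
    total = 0
    per_person = {p: 0 for p in people}
    for size in range(len(people) + 1):
        for combo in combinations(people, size):
            holders = set(combo)
            if all(holders & team for team in teams):
                total += 1
                for p in holders:
                    per_person[p] += 1
    return total, per_person


V = {"A", "B", "C", "D", "E"}
teams = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]
total, per_person = count_consistent_vectors(V, teams)
print(total, per_person)
```

Individuals who appear in many consistent vectors (here B and D) are the more plausible holders of the skill.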
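[Figure: lattice of the subsets of V that possess skill s, from Ø (no one has skill s) to V (everyone has skill s). The lattice splits into inconsistent subsets and consistent subsets; the consistent ones are the minimal sets and the supersets of a minimal set.]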
Naïve Monte-Carlo sampling over the lattice from Ø (no one has skill s) to V (everyone has skill s)
C = 0
for i = 1…N
    Sample an element from the lattice; if it is consistent, C++
return (C/N) × 2^n
Does not work when there are few consistent vectors
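A runnable version of the naïve sampler might look like the sketch below. It assumes uniform sampling over all 2^n subsets of V (each individual included independently with probability 1/2) and reuses the example teams {A,B}, {B,C}, {C,D}, {D,E}; the function name and the fixed seed are illustrative choices.

```python
import random


def estimate_consistent_vectors(universe, teams, samples, seed=0):
    """Naive Monte-Carlo estimate of the number of consistent skill vectors."""
    rng = random.Random(seed)
    people = sorted(universe)
    consistent = 0
    for _ in range(samples):
        # Uniformly random subset of V: include each person with probability 1/2.
        holders = {p for p in people if rng.random() < 0.5}
        if all(holders & team for team in teams):
            consistent += 1
    # Fraction of consistent samples times the size of the lattice, 2^n.
    return consistent / samples * 2 ** len(people)


V = {"A", "B", "C", "D", "E"}
teams = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"D", "E"}]
estimate = estimate_consistent_vectors(V, teams, samples=20000)
print(estimate)
```

Here a large fraction of the lattice is consistent, so the estimate is stable. When consistent vectors are a vanishing fraction of the 2^n subsets, almost no sample is a hit and the estimate degrades.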
Ø V
Supersets of minimal sets
minimal sets that contain r: M(r) = {M1,…,Mk}
space of hitting sets only
simultaneously
(almost) independent components
T2 T1 T3 T5 T6 T7 T8
A B ConsistentVectors(1) = ConsistentVectors(1,A)xConsistentVectors(B)
social networks privacy graphs
(Kulik 92, Loveless 13, McPartland 87)
Let’s take a computational approach
Mostly learn from other members of the group
Mostly improve by teaching others
Our Focus
Grouping strong students with not much weaker students
Similar structure with different distributions of abilities
Normal Distribution, Uniform Distribution, Pareto Distribution
Classical methods are not optimal with respect to our objective
Other Gain functions
How much do followers learn?
See the paper for more details
Gain Function Gain (leader) Gain (follower)
Traditional methods are not optimal
Different objectives lead to different team structures
Computational approaches can reveal such optimal structures
Future Work
Richer gain functions
Gain for the leaders
Non-linear gain functions
Incorporating constraints due to socio-emotional factors
SIGKDD 2014
Unity: Forming teams in large-scale community systems. CIKM 2010
formation in social networks. WWW 2012
making tasks on micro-blog services. VLDB 2012.
ACM SIGKDD 2012
Social Networks. CIKM 2011
networks: IEEE SocialCom, 2010