Applying Social Network Analysis to the Information in CVS Repositories. Luis López-Fernández, Gregorio Robles, Jesús M. González Barahona, GSyC, Universidad Rey Juan Carlos, Madrid, Spain. {llopez,grex,jgb}@gsyc.escet.urjc.es


SLIDE 1

Applying Social Network Analysis to the Information in CVS Repositories

Luis López-Fernández, Gregorio Robles, Jesús M. González Barahona
GSyC, Universidad Rey Juan Carlos, Madrid, Spain

{llopez,grex,jgb}@gsyc.escet.urjc.es

MSR 2004 (Edinburgh, UK) 25th May 2004

SLIDE 2

Background

  • There is a lot of (too much?) information about libre software projects out there
  • We’re starting to streamline the extraction of raw data (e.g., from CVS repositories)
  • We have to apply data mining and data interpretation techniques to get meaningful information
  • Let’s explore approaches which were productive in other fields

© 2004 Jesús M. González Barahona. Applying Social Network Analysis to the Information in CVS Repositories

SLIDE 3

Main aims of the study

  • To advance the understanding of the social structure of libre software projects
  • To characterize projects according to this structure
  • To relate the evolution of a project to the evolution of its social structure
  • To explore self-organization in the social structure of libre software projects

SLIDE 4

Methodology

  • Download CVS history information from the repository of a libre software project
  • Extract the information about who committed what
  • Build the committer and module networks from it
  • Analyze the resulting networks using social network analysis
  • Draw some conclusions
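The extraction step can be sketched as follows. This is only an illustration, not the tooling used in the study (the slides do not describe it). It assumes the history is available as `cvs log`-style text, and that a module is the top-level directory of each file. The function name `commits_per_module` and the sample log are made up for the example.

```python
import re
from collections import Counter

# Lines of interest in `cvs log` output: the file path and each revision's author.
WORKING_FILE = re.compile(r"^Working file: (.+)$")
AUTHOR = re.compile(r"^date: [^;]+;\s+author: ([^;]+);")

def commits_per_module(log_text):
    """Count commits per (committer, module) pair in `cvs log`-style text."""
    counts = Counter()
    module = None
    for line in log_text.splitlines():
        m = WORKING_FILE.match(line)
        if m:
            path = m.group(1)
            # Assumption: a module is the top-level directory of the file;
            # files at the repository root go to a placeholder module ".".
            module = path.split("/", 1)[0] if "/" in path else "."
            continue
        m = AUTHOR.match(line)
        if m and module is not None:
            counts[(m.group(1).strip(), module)] += 1
    return counts

# Hypothetical sample input, abbreviated to the lines the parser reads.
SAMPLE_LOG = """\
RCS file: /cvsroot/demo/httpd/main.c,v
Working file: httpd/main.c
----------------------------
revision 1.2
date: 2004/02/10 10:00:00;  author: alice;  state: Exp;  lines: +3 -1
fix
----------------------------
revision 1.1
date: 2004/01/05 09:00:00;  author: bob;  state: Exp;
initial
=============================================================================
RCS file: /cvsroot/demo/docs/index.html,v
Working file: docs/index.html
----------------------------
revision 1.1
date: 2004/02/11 11:00:00;  author: alice;  state: Exp;
docs
"""

counts = commits_per_module(SAMPLE_LOG)
```

The resulting mapping from (committer, module) to number of commits is the raw affiliation data from which both networks can be built. A log message that itself begins with "date:" would confuse this simple parser.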

SLIDE 5

The committer network

  • One side of an affiliation network
  • Each vertex: a committer (usually a developer)
  • Edge: when both committers have contributed to at least one common module
  • Weight of edges: commits by both committers to all common modules
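A minimal sketch of this projection, assuming the input is a mapping from (committer, module) to number of commits. The edge weight is read here as the sum of both committers' commits over every module they share; the slide does not spell out the formula, so that reading is an assumption. The function name and sample counts are illustrative.

```python
from collections import defaultdict
from itertools import combinations

def committer_network(counts):
    """Project (committer, module) -> commits onto the committer side.

    Returns {(a, b): weight} with a < b, one entry per pair of committers
    that share at least one module.
    """
    by_module = defaultdict(dict)
    for (committer, module), n in counts.items():
        by_module[module][committer] = n

    weights = defaultdict(int)
    for members in by_module.values():
        # Every pair of committers on the same module gets (or strengthens) an edge.
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += members[a] + members[b]
    return dict(weights)

# Hypothetical commit counts.
counts = {
    ("alice", "httpd"): 5,
    ("bob", "httpd"): 2,
    ("alice", "docs"): 1,
    ("bob", "docs"): 3,
    ("carol", "docs"): 4,
}
net = committer_network(counts)
```

In this example alice and bob share two modules, so their edge accumulates the commits from both.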

SLIDE 6

The module network

  • Other side of the same affiliation network
  • Each vertex: a module (usually a top-level directory)
  • Edge: when there is at least one common committer
  • Weight of edges: commits by common committers to both modules
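One way to write this definition down, offered as a reading of the slide rather than the authors' stated formula: let C(m) be the set of committers with at least one commit to module m, and n_{c,m} the number of commits by committer c to module m (both symbols are introduced here for illustration).

```latex
(m_i, m_j) \in E \iff C(m_i) \cap C(m_j) \neq \emptyset,
\qquad
w(m_i, m_j) = \sum_{c \,\in\, C(m_i) \cap C(m_j)} \bigl( n_{c,m_i} + n_{c,m_j} \bigr)
```

For instance, if one committer has 5 commits to module A and 1 to module B, and another has 2 commits to A and 3 to B, the edge A-B gets weight (5 + 1) + (2 + 3) = 11.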

SLIDE 7

Both are a complex mesh

Module network for the Apache project, ca. February 2004

SLIDE 8

But they can be characterized

  • Degree (number of connections per vertex)
  • Weighted degree (in our case, by commits)
  • Distance centrality (proximity to the rest of the network)
  • Betweenness centrality (shortest paths traversing a vertex)
  • Clustering coefficient (connectivity to the neighborhood)
  • Weighted clustering coefficient (in our case, by commits)
  • Community analysis (Girvan-Newman algorithm)
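Some of these measures are simple enough to sketch in plain Python. The definitions below are the common textbook ones (distance centrality taken as closeness over hop counts, clustering as the fraction of linked neighbor pairs). The slides do not give formulas, so whether the study used exactly these variants is an assumption. Betweenness, the weighted clustering coefficient, and Girvan-Newman are left out to keep the sketch short. The graph and all names are illustrative.

```python
from collections import defaultdict, deque

def adjacency(weights):
    """Turn {(u, v): weight} into a symmetric adjacency mapping."""
    adj = defaultdict(dict)
    for (u, v), w in weights.items():
        adj[u][v] = w
        adj[v][u] = w
    return adj

def degree(adj, v):
    return len(adj[v])

def weighted_degree(adj, v):
    return sum(adj[v].values())

def closeness(adj, v):
    """(number of reachable vertices) / (sum of hop distances to them)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:  # breadth-first search, ignoring edge weights
        u = queue.popleft()
        for nb in adj[u]:
            if nb not in dist:
                dist[nb] = dist[u] + 1
                queue.append(nb)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def clustering(adj, v):
    """Fraction of pairs of neighbors of v that are themselves connected."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))

# Hypothetical weighted network: a triangle a-b-c with a pendant vertex d.
adj = adjacency({("a", "b"): 11, ("a", "c"): 5, ("b", "c"): 7, ("c", "d"): 2})
metrics_c = (degree(adj, "c"), weighted_degree(adj, "c"),
             closeness(adj, "c"), clustering(adj, "c"))
```

Here vertex c is adjacent to every other vertex, so its closeness is 1.0, while only one of its three neighbor pairs (a-b) is linked, giving a clustering coefficient of 1/3.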

SLIDE 9

Apache: connection degree (committers network)

[Figure: plot with axis label "Degree". Apache, circa February 2004]

SLIDE 10

Apache and GNOME clustering coefficient (modules network)

[Figure: two plots with axis label "cc (clustering coefficient)". Apache (left), GNOME (right), circa February 2004]

SLIDE 11

Apache, GNOME, KDE weighted clustering coefficient (modules network)

[Figure: three plots with axis label "Weighted clustering coefficient". Apache (left), GNOME (center), KDE (right), circa February 2004]

SLIDE 12

Apache connection degree (modules network)

[Figure: four plots with axis label "Degree". 2001 (top left) to 2004 (bottom right)]

SLIDE 13

Apache modules community analysis (1999.01)

SLIDE 14

Apache modules community analysis (2000.01)

SLIDE 15

Apache modules community analysis (2000.09)

SLIDE 16

Apache modules community analysis (2002.01)

SLIDE 17

Apache modules community analysis (2004.02)

SLIDE 18

Conclusions

  • Methodology for studying the structure of libre software projects
  • Captures relationships both between modules and between committers
  • First step towards community analysis
  • Access to traditional social network analysis tools
  • Further work: characterization of projects
