Network Analysis of Software Repositories: The Eclipse Bugzilla Case



SLIDE 1

Knowledge Management Institute 1

Monika Schubert Graz, 2.9.2008 Network Analysis of Software Repositories

Network Analysis of Software Repositories

The Eclipse Bugzilla Case

Monika Schubert, Michel Wermelinger, Yijun Yu

Knowledge Management Institute, Graz University of Technology, monika.schubert@tugraz.at
Department of Computing, The Open University, Milton Keynes, UK

SLIDE 2

Conway's Law

“Any organization that designs a system will inevitably produce a design whose structure is a copy of the organization's communication structure” [Con1968]

[Diagram labels: Community, Software Architecture]

SLIDE 3

Research Questions

  • How can we infer social structure and hierarchies among software engineers from open source software repositories?
  • Is there a correlation between the social and the technical aspects of software development?

SLIDE 4

Eclipse Bugzilla Dataset

                        Total     SDK
Number of bugs          207743    101966
Number of developers    25741     16025
Number of components    662       18

Data provided:

  • Software component
  • Reporter
  • Assignee
  • Discussants

Distribution of Developers

[Figure: number of developers (y-axis, up to 4500) against number of bugs (x-axis, 1 to 97)]

#Developers   #Bugs
4134          1
1356          2
544           3
350           4
…             …
60            10
…             …

SLIDE 5

Related Work

  • Work on Conway's Law

– Analysing the structure of organizations and products of scientific computing projects [Ara2008]

  • Work on the Eclipse Bugzilla dataset

– Bug Prediction [Jos2007]
– Forecasting the number of changes [Her2007]
– Author-Topic Modelling [Lin2007]
– Fixing time of a Bug [Wei2007]

SLIDE 6

Analysis Concepts

  • Analysis of Community

– Folding, Co-occurrence [Was1994]
– Formal Concept Analysis [Wil2005]

  • Analysis of the Architecture

– Static and Dynamic Dependencies

  • Correlations

– Degree Centrality
– Centrality Rank [Spe1987]

In cooperation with The Open University

SLIDE 7

Social Structure and Hierarchies

  • Single entity dominance
  • Geographic clustering

[Figure: network of developers, created by folding a component-developer graph]
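Folding a component-developer graph can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fold` and `invert`, the toy data, and the choice of "number of shared components" as edge weight are all hypothetical. Two developers become connected when they appear on bugs of the same component; swapping the two node types gives the component network instead.

```python
from collections import defaultdict
from itertools import combinations


def fold(memberships):
    """Fold a two-mode graph into a one-mode co-occurrence graph.

    memberships maps each container node (e.g. a component) to the set of
    member nodes (e.g. developers) attached to it. The result maps each
    unordered member pair to the number of containers the pair shares.
    """
    weights = defaultdict(int)
    for members in memberships.values():
        # Every pair of members of the same container co-occurs once.
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += 1
    return dict(weights)


def invert(memberships):
    """Swap the two node types: member -> set of containers."""
    inverted = defaultdict(set)
    for container, members in memberships.items():
        for member in members:
            inverted[member].add(container)
    return dict(inverted)


# Hypothetical toy data: component -> developers seen on its bugs.
component_devs = {
    "JDT UI": {"alice", "bob"},
    "JDT Core": {"bob", "carol"},
    "Platform UI": {"alice", "bob", "carol"},
}

# Developers linked by shared components.
developer_network = fold(component_devs)

# Components linked by shared developers.
component_network = fold(invert(component_devs))
```

In this sketch the edge weight is kept rather than discarded, so that weak links can later be filtered out if needed.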

SLIDE 8

Social vs. Technical Aspects

[Figure, two panels, each grouping the components into JDT, Equinox, PDE and Platform]

  • Connections between components from the architecture
  • Network of components created by folding a component-developer graph (k=256)

SLIDE 9

Degree Distribution

  • Degree distribution: ranking according to the degree of each node
  • Histogram: clustering nodes into degree intervals
  • Total degree distribution: cumulative degree distribution for a given number

[Figure: socially inferred component network with k=32]
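The three views named on this slide can be sketched for a small undirected edge list as follows. The function names, the toy edges, and the bin width are hypothetical, and the cumulative view is implemented here as "share of nodes with degree at least d", which is only one possible reading of the slide's wording.

```python
from collections import Counter


def degrees(edges):
    """Degree of every node in an undirected edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


def degree_ranking(deg):
    """Nodes sorted by descending degree (ties broken by name)."""
    return sorted(deg, key=lambda node: (-deg[node], node))


def degree_histogram(deg, bin_width):
    """Count nodes per degree interval [lo, lo + bin_width)."""
    hist = Counter((d // bin_width) * bin_width for d in deg.values())
    return dict(sorted(hist.items()))


def cumulative_distribution(deg):
    """For each observed degree d, the share of nodes with degree >= d."""
    total = len(deg)
    counts = Counter(deg.values())
    result, seen = {}, 0
    for d in sorted(counts, reverse=True):
        seen += counts[d]
        result[d] = seen / total
    return dict(sorted(result.items()))


# Hypothetical toy network of four components.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
deg = degrees(edges)

ranking = degree_ranking(deg)
histogram = degree_histogram(deg, bin_width=2)
cumulative = cumulative_distribution(deg)
```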

SLIDE 10

Degree Distribution

[Figure panels: k=32, k=1024, static undirected, dynamic undirected]

SLIDE 11

Rank Correlation

dynamic undirected          static undirected           k=1024                      k=32
1  PlatformResources        1  PlatformUI               1  PlatformUI               1  PlatformUI
2  PlatformUI               2  PlatformResources        2  JDTUI                    2  PlatformSWT
3  JDTUI                    3  JDTUI                    3  PlatformResources        3  JDTUI
4  PlatformTeam             4  PDEUI                    3  PlatformText             4  PDEUI
5  PDEUI                    5  PlatformTeam             5  JDTCore                  5  JDTCore
6  EquinoxFramework         6  PlatformText             5  PDEUI                    6  PlatformResources
7  PlatformText             7  PlatformUser Assistance  5  PlatformUpdate           7  PlatformTeam
8  PlatformUser Assistance  8  JDTAnt                   8  EquinoxFramework         8  PlatformText
9  JDTAnt                   9  JDTCore                  8  JDTAnt                   9  JDTDebug
9  PlatformSWT              10 JDTDebug                 8  JDTDebug                 10 PlatformUpdate
11 JDTDebug                 10 PlatformUpdate           8  PDEBuild                 11 PlatformUser Assistance
12 JDTCore                  12 PlatformSWT              8  PlatformDoc              12 EquinoxFramework
13 PlatformUpdate           13 PDEBuild                 8  PlatformSearch           13 PDEBuild
14 PlatformSearch           14 EquinoxFramework         8  PlatformSWT              14 PlatformDoc
15 PlatformDoc              14 PlatformSearch           8  PlatformTeam             15 PlatformSearch
16 PDEBuild                 16 PlatformDoc              8  PlatformUser Assistance  16 JDTAnt

SLIDE 12

Rank Correlation

Spearman:

  • Compared all social-inferred and code-inferred graphs with each other

Results:

– Up to 0.7368 correlation, between k=1024 and static undirected

  • Compared the social-inferred graphs with random graphs

Results:

– Up to 0.1114 correlation
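A Spearman comparison of two graphs can be sketched as the Pearson correlation of the nodes' degree ranks. The sketch below assigns tied values their mean rank, which is the usual textbook treatment; the slides do not say which tie handling was used, so that choice, the function names, and the toy degrees are assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values ascending (1 = smallest); ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of equal values starting at i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation with mean ranks for ties."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical degrees of the same four components in two graphs.
components = ["A", "B", "C", "D"]
social_degree = {"A": 5, "B": 3, "C": 3, "D": 1}
code_degree = {"A": 4, "B": 2, "C": 3, "D": 1}

rho = spearman(
    [social_degree[c] for c in components],
    [code_degree[c] for c in components],
)
```

The same function applied to a socially inferred graph and a randomly generated one would give the kind of baseline value reported in the second result.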

SLIDE 13

Contributions

  • Provide a large-scale study of the relationship between social systems and the software architecture
  • Explore evidence that speaks for and against Conway's Law

SLIDE 14

Discussion Points

  • Conway's Law is incomplete

– What is a communication structure?
– What is the structure of a product or source code?

  • Degree centrality versus graph structure

– The degree centrality is an indication of the importance of a node
– The graph structure is represented by the edges

  • Rank correlation

– How do the tied ranks affect the interpretation?
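To make the tied-ranks question concrete, the sketch below measures how many components share a rank. The rank list is read from the k=1024 column of the Rank Correlation slide; the helper names and the toy degree list are hypothetical. The slide's ranks look like competition ranking (tied items share the lowest rank of their group), which `competition_ranks` reproduces for illustration.

```python
from collections import Counter


def competition_ranks(values):
    """'1224' ranking: tied items share the lowest rank of their group."""
    first_position = {}
    for position, value in enumerate(sorted(values), start=1):
        first_position.setdefault(value, position)
    return [first_position[value] for value in values]


def tie_groups(ranks):
    """Sizes of groups sharing a rank, keeping only real ties (size > 1)."""
    return {r: c for r, c in sorted(Counter(ranks).items()) if c > 1}


# Ranks as listed in the k=1024 column of the Rank Correlation slide.
k1024_ranks = [1, 2, 3, 3, 5, 5, 5, 8, 8, 8, 8, 8, 8, 8, 8, 8]

groups = tie_groups(k1024_ranks)
tied_share = sum(groups.values()) / len(k1024_ranks)

# Hypothetical degree list showing how ties produce shared ranks.
example_ranks = competition_ranks([10, 7, 7, 4])
```

With most components sharing a rank, the ordering within the tied groups carries no information, which is why the tie handling matters when reading the correlation values.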

SLIDE 15

Monika Schubert

Graz University of Technology, monika.schubert@tugraz.at

SLIDE 16

References

[Con1968] Conway, M. E. (1968). How do committees invent? Datamation, 14(4):28-31.

[Her2007] Herraiz, I.; Gonzalez-Barahona, J. M.; Robles, G. (2007). Forecasting the number of changes in Eclipse using time series analysis. In 'Proceedings of the 29th International Conference on Software Engineering Workshops', IEEE Computer Society.

[Jos2007] Joshi, H.; Zhang, C.; Ramaswamy, S.; Bayrak, C. (2007). Local and Global Recency Weighting Approach to Bug Prediction. In 'Proceedings of the Fourth International Workshop on Mining Software Repositories', IEEE Computer Society.

[Lin2007] Linstead, E.; Rigor, P.; Bajracharya, S.; Lopes, C.; Baldi, P. (2007). Mining Eclipse Developer Contributions via Author-Topic Models. In 'Proceedings of the Fourth International Workshop on Mining Software Repositories', IEEE Computer Society.

[Was1994] Wasserman, S.; Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge University Press.

[Wei2007] Weiss, C.; Premraj, R.; Zimmermann, T.; Zeller, A. (2007). How Long will it Take to Fix This Bug? In Harald Gall & Michele Lanza, ed., 'Proceedings of the Fourth International Workshop on Mining Software Repositories'.

[Wil2005] Wille, R. (2005). Formal Concept Analysis as Mathematical Theory of Concepts and Concept Hierarchies. Formal Concept Analysis, 1-33.