Empirical Project Monitor and Results from 100 OSS Development Projects


Empirical Project Monitor and Results from 100 OSS Development Projects. Masao Ohira, Empirical Software Engineering Research Laboratory, Nara Institute of Science and Technology (ohira@empirical.jp)


SLIDE 1

Empirical Project Monitor and Results from 100 OSS Development Projects

Masao Ohira

Empirical Software Engineering Research Laboratory, Nara Institute of Science and Technology

ohira@empirical.jp

SLIDE 2

12/21/2003 CDEKA Workshop@UC Irvine, December 16-18, 2003

EASE Project

Empirical software development environment for tens of thousands of projects

  • Massive data collection
  • Intensive data analysis
  • Feedback for software process improvement in organizations/communities (not only a single developer/project)

Diagram labels: collection, analysis, improvement

SLIDE 3

Empirical Environment

Diagram components:

  • Widely used development support tools (Project x, Project y, Project z, ...): Versioning (CVS), Mailing (Mailman), Issue tracking (GNATS), other tool data
  • Format Translator (one shown for each of the four data sources)
  • Process data archive (XML format)
  • Product data archive (CVS format)
  • Code clone detection, Component search, Metrics measurement, Project categorization, Cooperative filtering
  • GUI
  • Managers, Developers
  • EPM (developing)

SLIDE 4

EPM: Empirical Project Monitor

  • A partial implementation of Empirical Environment
  • Collect, measure, and show various data for project control
  • Data source from tools used in software development
      • Versioning system (e.g. CVS)
      • Mailing list manager (e.g. Mailman)
      • Issue tracking tool (e.g. GNATS)

SLIDE 5

Architecture of EPM

Diagram components:

  • Data sources: CVS, Mailman, GNATS (ShareSource™), other tool data, etc.
  • Collected histories: versioning history, mail history, problem history
  • Standardized empirical SE data (in XML)
  • PostgreSQL (Repository)
  • Analysis tools: metrics value, prediction/schedule
  • Users: developer, manager
  • Measurement of intra and inter projects

SLIDE 6

Characteristics of EPM

  • Use open source development tools → Easy to introduce
  • Small overhead of data collection
  • Most data from versioning history
  • Communication through e-mail, and recording issues by tracking tool
  • Easy to transform other data formats to the standardized empirical SE data format
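
The idea of translating tool-specific records into one standardized XML format can be sketched as follows. This is only an illustration: the slides do not show EPM's actual schema, so the element names (`events`, `event`, `actor`, `timestamp`), the attribute names, and the helper functions below are all hypothetical, and the sample records are made up.

```python
import xml.etree.ElementTree as ET


def to_event(source, kind, actor, timestamp, **details):
    """Build one <event> element. Element and attribute names are hypothetical."""
    event = ET.Element("event", {"source": source, "type": kind})
    ET.SubElement(event, "actor").text = actor
    ET.SubElement(event, "timestamp").text = timestamp
    # Tool-specific fields become child elements, in a stable (sorted) order.
    for name, value in sorted(details.items()):
        ET.SubElement(event, name).text = str(value)
    return event


def translate_commit(record):
    """Translator for a versioning-history record (e.g. from CVS)."""
    return to_event("cvs", "commit", record["author"], record["date"],
                    file=record["file"], revision=record["revision"])


def translate_mail(record):
    """Translator for a mail-history record (e.g. from Mailman)."""
    return to_event("mailman", "post", record["sender"], record["date"],
                    subject=record["subject"])


# Made-up sample records, for illustration only.
events = ET.Element("events")
events.append(translate_commit(
    {"author": "alice", "date": "2003-12-01T10:00:00",
     "file": "src/main.c", "revision": "1.4"}))
events.append(translate_mail(
    {"sender": "bob", "date": "2003-12-01T11:30:00",
     "subject": "Build fails on Solaris"}))

print(ET.tostring(events, encoding="unicode"))
```

Under this assumed design, supporting another tool would only require one more small translator function that calls `to_event`; the analysis side would keep reading the same event structure.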

SLIDE 7

Application Area of EPM

  • Large project
      • Share project status immediately
      • Reduce project management load
      • Reduce risk of data tampering
  • Small project
      • Apply with small cost
      • Apply to various projects, including XP and distributed development

SLIDE 8

Data collection from OSS Development Projects

  • SourceForge.net
      • Hosted projects: 72,853 (Dec. 15)
      • Registered users: 753,428 (Dec. 15)
  • A variety of collaboration tools: SourceForge Collaborative Development System (CDS) web tools
      • Project Web Server
      • Tracker: Tools for Managing Support
      • Mailing lists and discussion forums
      • MySQL Database Services
      • Project CVS Services
      • etc.
  • Available data source for EPM

SLIDE 9

Overview of Collected Data

100 Active projects @ SF.net

Data sources for EPM

  • CVS data (only 40 projects)
  • Mailing Lists data
  • Issue (Bug) reports data

Project info. in a summary page

  • number of developers
  • period of a project
  • development status
  • intended audience
  • programming language
  • number of bugs
  • number of CVS commits
  • etc.
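
The project information listed above could be held in a simple record type. The sketch below is an assumption about how such a record might look, not the format used in the study: the class name, field names, and the sample values are hypothetical, and the "period of a project" is computed here as days since registration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ProjectSummary:
    """One project's summary-page information (field names are hypothetical)."""
    name: str
    developers: int
    registered: date
    development_status: str
    intended_audience: str
    programming_language: str
    bugs: int
    cvs_commits: int

    def period_days(self, as_of: date) -> int:
        """Project period, taken here as days from registration to `as_of`."""
        return (as_of - self.registered).days


# Made-up example values, for illustration only.
example = ProjectSummary(
    name="example-project",
    developers=5,
    registered=date(2002, 1, 1),
    development_status="Beta",
    intended_audience="Developers",
    programming_language="C",
    bugs=42,
    cvs_commits=1200,
)
print(example.period_days(date(2002, 1, 31)))
```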

SLIDE 10

SourceForge.net

  • information related to the project
  • links to available data source for EPM

SLIDE 11

Summary of 100 OSS projects@SF.net: Evolution?

[Figure: Current Developers plotted against Registered Day of Projects; date axis ticks run from Aug-99 to Jan-04]

SLIDE 12

Result of CVS Product Data: Lines of Code (history of software growth)
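
One rough way to approximate a growth history from versioning data is to sum the `lines: +N -M` fields that `cvs log` prints for each revision. The sketch below makes several assumptions: the log text is already available as a string, the sample log is made up, and the function name is hypothetical. The initial revision of a file carries no `lines:` field, so this gives net growth relative to the first revisions rather than absolute lines of code, and it is not necessarily how the charts in this talk were produced.

```python
import re
from itertools import accumulate

# Matches revision lines such as:
#   date: 2003/10/15 12:30:00;  author: bob;  state: Exp;  lines: +40 -5
REVISION_RE = re.compile(
    r"date: (\d{4})[/-](\d{2})[/-]\d{2}.*?lines: \+(\d+) -(\d+)")


def cumulative_growth(log_text):
    """Return [(month, cumulative net lines added)] in chronological order."""
    net = {}
    for year, month, added, removed in REVISION_RE.findall(log_text):
        key = f"{year}-{month}"
        net[key] = net.get(key, 0) + int(added) - int(removed)
    months = sorted(net)
    return list(zip(months, accumulate(net[m] for m in months)))


# Made-up log excerpt, for illustration only.
sample_log = """\
date: 2003/10/02 09:00:00;  author: alice;  state: Exp;
date: 2003/10/15 12:30:00;  author: bob;  state: Exp;  lines: +40 -5
date: 2003/11/01 08:10:00;  author: alice;  state: Exp;  lines: +10 -20
date: 2003/11/20 17:45:00;  author: bob;  state: Exp;  lines: +7 -0
"""

growth = cumulative_growth(sample_log)
print(growth)  # [('2003-10', 35), ('2003-11', 32)]
```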

SLIDE 13

Result of CVS Process Data: Check in/out (history of developer’s activities)
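
A history of developer activity can be approximated by counting check-ins per author and month from `cvs log` output. This is a minimal sketch under stated assumptions: the sample log is made up, the function name is hypothetical, and only check-ins are counted because `cvs log` lists committed revisions and does not record checkouts.

```python
import re
from collections import Counter

# Matches the start of a revision line such as:
#   date: 2003/10/15 12:30:00;  author: bob;  state: Exp;  lines: +40 -5
ENTRY_RE = re.compile(
    r"date: (\d{4})[/-](\d{2})[/-]\d{2}[^;]*;\s+author: ([^;]+);")


def checkins_per_author_month(log_text):
    """Count check-ins keyed by (month, author)."""
    return Counter(
        (f"{year}-{month}", author.strip())
        for year, month, author in ENTRY_RE.findall(log_text)
    )


# Made-up log excerpt, for illustration only.
sample_log = """\
date: 2003/10/02 09:00:00;  author: alice;  state: Exp;
date: 2003/10/15 12:30:00;  author: bob;  state: Exp;  lines: +40 -5
date: 2003/11/01 08:10:00;  author: alice;  state: Exp;  lines: +10 -20
date: 2003/11/20 17:45:00;  author: bob;  state: Exp;  lines: +7 -0
date: 2003/11/21 09:05:00;  author: bob;  state: Exp;  lines: +3 -1
"""

counts = checkins_per_author_month(sample_log)
for (month, author), n in sorted(counts.items()):
    print(month, author, n)
```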

SLIDE 14

How can we use such a large amount of data?

SLIDE 15

Gross Classification using EVIDII

EVIDII: Interactive interfaces that visualize relationships among three sets of data

(original application domain: face-to-face communication support between clients and designers)

SLIDE 16

Demo: organizing dynamic community?

  • Project X
  • Project info.: numbers of developers, LOC, development terms, etc.

SLIDE 17

Scenario: organizing a dynamic community / providing feedback for improvement

1. Comparing other projects with a target project

2. Finding similarities and differences between them

DynC approach

3-a. Notifying related project leaders of the existence of communities

4-a. Asking them for help/advice for improvement

EASE approach

3-b. Identifying factors of the similarities and differences

4-b. Providing suggestions for improvement
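
Steps 1 and 2 (comparing other projects with a target project and finding the most similar ones) can be illustrated with a simple distance measure. This is not EVIDII's method, which is an interactive visualization; it is a swapped-in technique (min-max normalization followed by Euclidean distance) shown only to make the comparison idea concrete. The feature names, the function names, and the sample values are hypothetical.

```python
import math


def normalize(projects):
    """Scale every feature to [0, 1] across all projects (min-max scaling)."""
    keys = sorted(next(iter(projects.values())))
    lo = {k: min(p[k] for p in projects.values()) for k in keys}
    hi = {k: max(p[k] for p in projects.values()) for k in keys}
    return {
        name: [
            (p[k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0
            for k in keys
        ]
        for name, p in projects.items()
    }


def rank_by_similarity(projects, target):
    """Return the other project names, most similar to `target` first."""
    vectors = normalize(projects)
    distances = {
        name: math.dist(vectors[target], vec)
        for name, vec in vectors.items()
        if name != target
    }
    return sorted(distances, key=distances.get)


# Made-up project info (developers, LOC, development term in months).
projects = {
    "target": {"developers": 5, "loc": 20000, "months": 24},
    "a": {"developers": 6, "loc": 22000, "months": 20},
    "b": {"developers": 40, "loc": 300000, "months": 60},
}

ranking = rank_by_similarity(projects, "target")
print(ranking)  # ['a', 'b']
```

Normalizing first keeps a large-valued feature such as LOC from dominating the distance; a different weighting of features would be an equally reasonable choice.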

SLIDE 18

Summary and Future Work

  • EPM: Empirical Project Monitor
  • Data Collection from 100 OSS projects (only 40 CVS data…)
  • Two scenarios using EVIDII
  • More data collection (mails and bug issues) and analysis using EPM/EVIDII