[PPT] - Mining Version Histories to Guide Software Changes by T. Zimmermann, PowerPoint Presentation

SLIDE 1

Mining Version Histories to Guide Software Changes

by T. Zimmermann, P. Weißgerber, S. Diehl, A. Zeller

in IEEE Transactions on Software Engineering,

Vol. 31, No. 6, June 2005

SLIDE 2

The Idea

Can we make similar suggestions for software changes?

SLIDE 3

Extending Eclipse Preferences

 Extend Eclipse IDE with a new preference

 Preferences are stored in a field fKeys[]

SLIDE 4

Extending Eclipse Preferences

 What else do you need to change?

 Which of the 27,000 files?

 Which of the 20,000 classes?

 Which of the 200,000 methods?

 Program analysis

 fKeys[] and initDefaults() use the same variables

 Usage does not induce change

 Usage can be detected only within the source code

 Eclipse has 12,000 non-Java files

SLIDE 5

Learning from History

 Programmer who changed fKeys[] also changed …

SLIDE 6

From CVS to Transactions

 The CVS archive for Eclipse has more than 47,000 transactions

SLIDE 7

ROSE in a Nutshell

SLIDE 8

Changes -> Transactions -> Rules



Entity – a triple (c, i, p),

 where c – syntactic category; i – identifier; p – parent entity



Example: (method, initDefaults(), (class, Comp, (file, Comp.java, …)))



Operations on entities: add_to, del_from, alter



Transaction – the set of changes simultaneously submitted by a developer to a version archive
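The definitions above can be sketched as a small data model. This is a minimal illustration, not ROSE's actual implementation: the class names `Entity` and `Change`, the `path()` helper, and the sample transaction are assumptions made for this example. The elided outermost parent ("…") from the slide is left unset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """A syntactic entity: the triple (c, i, p) from the slide."""
    category: str                      # c: syntactic category, e.g. "method"
    identifier: str                    # i: name of the entity
    parent: Optional["Entity"] = None  # p: enclosing entity, None if not given

    def path(self) -> str:
        """Render the chain of parents, outermost first (hypothetical helper)."""
        prefix = self.parent.path() + " > " if self.parent else ""
        return f"{prefix}{self.category}:{self.identifier}"


@dataclass(frozen=True)
class Change:
    """One operation applied to one entity."""
    operation: str  # one of "add_to", "del_from", "alter"
    entity: Entity


# The example from the slide; the elided outer parent is left as None.
comp_file = Entity("file", "Comp.java")
comp_class = Entity("class", "Comp", comp_file)
init_defaults = Entity("method", "initDefaults()", comp_class)

# A transaction is the set of changes submitted together (sample data).
transaction = frozenset({
    Change("alter", init_defaults),
    Change("add_to", comp_class),
})

print(init_defaults.path())
```

Frozen dataclasses are hashable, so changes can be stored in sets, which matches the "set of changes" wording of the transaction definition.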

SLIDE 9

Getting Syntactic Entities

SLIDE 10

Light-Weight Analysis with ROSE

SLIDE 11

Light-Weight Analysis with ROSE

ROSE analyzes C/C++, Java, Python, TeX, and Texinfo files

We get modified methods, variables, and subsections

SLIDE 12

Changes -> Transactions -> Rules

 ROSE retrieves changes and transactions from CVS [Berliner’90]

 CVS provides only file versioning

 Per-file changes are grouped into transactions

 Files -> Transactions: sliding window approach [Fogel’02]

 Two subsequent changes by the same author, at most 200 seconds apart, belong to the same transaction

 Branches and merges in CVS

 ROSE ignores changes that affect more than 30 entities
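The sliding window grouping can be sketched as follows. This is an illustration under stated assumptions, not ROSE's code: it uses only the two criteria named on the slide (same author, 200 seconds), the names `FileChange`, `group_into_transactions`, and `drop_large` are hypothetical, and the size filter counts changed files as a stand-in for entities.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileChange:
    author: str
    timestamp: int  # commit time in seconds
    path: str


def group_into_transactions(changes: List[FileChange],
                            window: int = 200) -> List[List[FileChange]]:
    """Group per-file changes into transactions with a sliding time window.

    A change joins its author's current transaction when it was committed at
    most `window` seconds after the previous change in that transaction.
    """
    open_tx = {}        # author -> transaction currently being extended
    transactions = []
    for change in sorted(changes, key=lambda c: c.timestamp):
        current = open_tx.get(change.author)
        if current and change.timestamp - current[-1].timestamp <= window:
            current.append(change)      # the window slides with each change
        else:
            current = [change]          # start a new transaction
            open_tx[change.author] = current
            transactions.append(current)
    return transactions


def drop_large(transactions, max_items: int = 30):
    """Discard transactions that touch more than `max_items` items."""
    return [tx for tx in transactions if len(tx) <= max_items]


# Made-up sample log.
log = [
    FileChange("alice", 0, "a.c"),
    FileChange("alice", 150, "b.c"),
    FileChange("alice", 340, "c.c"),   # 190 s after b.c: same transaction
    FileChange("bob", 100, "d.c"),
    FileChange("alice", 1000, "e.c"),  # 660 s after c.c: new transaction
]
grouped = [[c.path for c in tx] for tx in group_into_transactions(log)]
print(grouped)  # [['a.c', 'b.c', 'c.c'], ['d.c'], ['e.c']]
```

Because the window is measured from the previous change rather than from the first one, a transaction can span more than 200 seconds in total, which is what distinguishes a sliding window from a fixed one.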

SLIDE 13

Changes -> Transactions -> Rules

 Rules are mined from transactions

 Rules are mined with the Apriori algorithm [Agrawal’94]

 The generated rules have the form:

antecedent(s) => consequent(s)

 The rules have a probabilistic interpretation



Evidence: support count (# of transactions) and confidence (the strength of the correspondence)
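Support count and confidence can be made concrete with a small sketch. Note that this is deliberately not the Apriori algorithm: it only counts rules with a single antecedent and a single consequent, which is enough to show how the two measures are computed. The function name `mine_pair_rules`, the thresholds, and the sample history (including `plugin.properties` and `README`) are made up for illustration.

```python
from collections import Counter
from itertools import permutations


def mine_pair_rules(transactions, min_support=1, min_confidence=0.5):
    """Return {(antecedent, consequent): (support, confidence)}.

    support    = number of transactions that contain both items
    confidence = support / number of transactions that contain the antecedent
    """
    item_count = Counter()
    pair_count = Counter()
    for tx in transactions:
        items = set(tx)
        item_count.update(items)
        pair_count.update(permutations(items, 2))  # all ordered pairs

    rules = {}
    for (antecedent, consequent), support in pair_count.items():
        confidence = support / item_count[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules[(antecedent, consequent)] = (support, confidence)
    return rules


# Made-up history of four transactions.
history = [
    {"fKeys[]", "initDefaults()", "plugin.properties"},
    {"fKeys[]", "initDefaults()"},
    {"fKeys[]", "README"},
    {"initDefaults()"},
]
rules = mine_pair_rules(history, min_support=2, min_confidence=0.6)
for (a, c), (support, confidence) in sorted(rules.items()):
    print(f"{a} => {c}  support={support} confidence={confidence:.2f}")
```

In this sample, fKeys[] changes in three transactions and co-changes with initDefaults() in two of them, so the rule fKeys[] => initDefaults() has support 2 and confidence 2/3.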

SLIDE 14

Evolutionary Coupling

SLIDE 15

Evolutionary Coupling

SLIDE 16

Evolutionary Coupling

Support: How much evidence (= simultaneous changes)?

Confidence: How relevant is the coupling for the participants?

SLIDE 17

Evolutionary Coupling

Support: How much evidence (= simultaneous changes)?

Confidence: How relevant is the coupling for the participants?

SLIDE 18

Applying Rules

 The programmer performs a change, called “a situation”

 ROSE suggests further changes by applying matching rules

 A rule matches when the situation equals its antecedent

 The suggestion = union of the consequents of all the matching rules

 The number of rules depends on the support count and confidence thresholds
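Rule application as described above can be sketched in a few lines. The rule representation, the function name `suggest`, the sample rules with their numbers, and the ranking by confidence are assumptions for this illustration; the slide only states that the suggestion is the union of the consequents of the matching rules.

```python
def suggest(rules, situation):
    """Union of the consequents of all rules whose antecedent equals the situation.

    `rules` is a list of (antecedent, consequent, support, confidence) tuples,
    where antecedent and consequent are frozensets of entity names.
    Returns the suggested entities, strongest evidence first (assumed ranking).
    """
    situation = frozenset(situation)
    best = {}
    for antecedent, consequent, support, confidence in rules:
        if antecedent != situation:
            continue  # the rule does not match this situation
        for entity in consequent - situation:
            # Keep the strongest evidence seen for each suggested entity.
            best[entity] = max(best.get(entity, (0.0, 0)), (confidence, support))
    return sorted(best, key=lambda e: best[e], reverse=True)


# Made-up rules for illustration.
rules = [
    (frozenset({"fKeys[]"}), frozenset({"initDefaults()"}), 7, 0.9),
    (frozenset({"fKeys[]"}), frozenset({"plugin.properties"}), 4, 0.5),
    (frozenset({"initDefaults()"}), frozenset({"fKeys[]"}), 7, 0.7),
]
print(suggest(rules, {"fKeys[]"}))  # ['initDefaults()', 'plugin.properties']
```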

SLIDE 19

Multi-Dimensional Rules

 If something new is added to the software, it has no history from which to predict related changes

 E.g., the developer adds a “Foo” constant to Comp.java

 ROSE can still make predictions by using the “operation” dimension

SLIDE 20

Examples of Rules

 GCC arrays that define the cost of different assembler operations for Intel CPUs

 The arrays have been altered 9 times; 9 out of 11 times, the change is triggered by a change in the type:

SLIDE 21

Examples of Rules

 Python and C files: detecting evolutionary couplings in different programming languages

 It would require cross-language program analysis to detect this coupling

SLIDE 22

Examples of Rules

 POSTGRES documentation

SLIDE 23

ROSE Server and Client

 The ROSE server determines coupling and rules

 The ROSE client guides the programmer along related changes

SLIDE 24

Evaluation

 How good are rules at predicting changes?

 Training period: ROSE infers rules from the past

 Evaluation period: ROSE applies the mined rules

 In the evaluation period, every transaction T is checked:

 Navigation: given one change in T, does ROSE point to further changes in T?

 Error Prevention: given all but one change from T, does ROSE point to the missing change?

 Closure: given all changes of T, does ROSE stay silent?
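The three checks can be read as three ways of turning one transaction T into queries, each pairing a situation with the items ROSE is expected to suggest. The sketch below illustrates that reading; the function names and the choice to skip single-item transactions are assumptions, not details taken from the slides.

```python
def navigation_queries(transaction):
    """One query per item: given that single item, the rest of T is expected."""
    items = set(transaction)
    if len(items) < 2:
        return []
    return [({e}, items - {e}) for e in sorted(items)]


def prevention_queries(transaction):
    """One query per item: given all but that item, exactly that item is expected."""
    items = set(transaction)
    if len(items) < 2:
        return []
    return [(items - {e}, {e}) for e in sorted(items)]


def closure_query(transaction):
    """Given all of T, nothing should be suggested."""
    return (set(transaction), set())


T = {"a", "b", "c"}  # made-up transaction
for situation, expected in navigation_queries(T):
    print("navigation:", sorted(situation), "->", sorted(expected))
for situation, expected in prevention_queries(T):
    print("prevention:", sorted(situation), "->", sorted(expected))
print("closure:", sorted(closure_query(T)[0]), "-> nothing")
```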

SLIDE 25

Evaluating Additional Questions

 Granularity

 Files and functions

 Maintenance

 No additions or deletions

 Multiple Dimensions

 What is the benefit of add_to and del_from?

 History

 How much history? Usefulness over time? Quality of recommendations depending on the development cycle and releases?

 Recent Changes

 Relevance of old changes

SLIDE 26

Projects Used for Evaluation

SLIDE 27

Precision vs. Recall

 Recall: How many of the relevant entities are returned?

 Precision: How many of the returned entities are relevant?
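For a single query, the two measures can be computed as in the sketch below. The function name `precision_recall` and the sample sets are made up, and returning 1.0 for an empty set is an assumed convention for this example.

```python
def precision_recall(suggested, expected):
    """Precision and recall of one set of suggestions against the expected items."""
    suggested, expected = set(suggested), set(expected)
    hits = len(suggested & expected)
    # Assumed convention: an empty set counts as perfect for that measure.
    precision = hits / len(suggested) if suggested else 1.0
    recall = hits / len(expected) if expected else 1.0
    return precision, recall


# Four suggestions, two of which are among the three items actually changed.
p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "e"})
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.67
```

Suggesting more entities tends to raise recall and lower precision, which is the trade-off named in the slide title.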

SLIDE 28

Precision vs. Feedback / Support Count vs. Confidence

SLIDE 29

Results: Navigation, Prevention, Closure

SLIDE 30

Navigation, Prevention, Closure

 The programmer has changed one single entity. Can ROSE suggest other entities that should be changed?

 The programmer has changed all necessary entities but one. Does ROSE find the missing one?

 The programmer made all necessary changes. How often does ROSE still suggest a missing change?

SLIDE 31

Results for Fine Granularity

SLIDE 32

Results: Navigation

 Given one initial item, ROSE makes predictions in 66 percent of all queries

 On average, the predictions contain 33 percent of all items changed

 For those queries for which ROSE makes recommendations, in 7 percent of the cases, a correct location is within ROSE’s topmost three suggestions

SLIDE 33

Results: Prevention and Closure

 In 3 percent of the queries where one item is missing, ROSE issues a correct warning

 A warning predicts 75 percent of the items that need to be changed

 ROSE’s warnings about missing items should be taken seriously …

 Only 2 percent of all transactions cause a false alarm (!)

SLIDE 34

Results for Coarse Granularity

SLIDE 35

Results for Maintenance

 ROSE shows the best predictive power for changes to existing entities

SLIDE 36

Threats to Validity

 Kinds of version histories and software projects

 8 projects; 100,000 transactions

 Transactions do not record the order

 CVS limitation

 Quality of transactions?  User studies?

SLIDE 37

Summary



For stable systems like GCC, ROSE gives precise suggestions (recommendations in 63% of transactions; precision of 30%; in 90% of all recommendations, the 3 topmost suggestions contain a correct entity)

For rapidly changing systems like KOFFICE, the most useful suggestions are at the file level (because predicting new functions is out of reach for any approach)

The predictive power of ROSE is best during maintenance phases

In about 2-7% of all erroneous transactions, ROSE correctly detects the missing change (only 2% of all transactions cause a false alarm)

ROSE detects coupling between non-program entities (e.g., docs, manuals, mappings)

SLIDE 38

Future Work

 Taxonomies: identify patterns of changes

 Sequence rules: detect rules across multiple transactions

 Further data sources: log messages, bug databases

 Refactoring: ROSE does not recognize renamings of methods or files

 Program analysis: can improve the overall approach

 Rule presentation: visualization of rules can help

SLIDE 39