Failure is a four-letter word
Andreas Zeller • Thomas Zimmermann • Christian Bird
PROMISE 2011, Banff, Canada
Software failures
Defect distributions
Failure causes
Cost of consequence
Back to basics
A B C
Basic actions
- /**
- pppppppppp
Hypotheses
- 1. We can predict defects from programmer actions.
- 2. We can isolate defect-prone programmer actions.
- 3. We can prevent defects by restricting programmer actions.
Eclipse bug data
[PROMISE 2007]
Table 1: Features of the Eclipse datasets.
Release        Total chars    Total files    Files with defects
Eclipse 2.0    44,914,520     6,728          975 (14%)
Eclipse 2.1    56,068,650     7,887          854 (11%)
Eclipse 3.0    76,193,482     10,593         1,568 (15%)
Eclipse characters
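Table 1 reports total characters per release. The slides do not show how characters were extracted, so the following is only a minimal sketch of how per-file character counts might be collected; the function name `char_counts` and the sample text are assumptions made for illustration.

```python
from collections import Counter


def char_counts(source: str) -> Counter:
    """Count how often each character occurs in one source file's text."""
    return Counter(source)


# Hypothetical file content, used only for illustration.
sample = "if (p != null) { int i; while (p[i] < 0) i++; return i; }"
counts = char_counts(sample)

# Occurrences of two individual characters in the sample.
print(counts["i"], counts["p"])
```

A `Counter` returns 0 for characters that never occur, which keeps later per-character comparisons across files simple.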
Precision
Recall
Table 3: Recall for various training/testing combinations.
Training set   Eclipse 2.0    Eclipse 2.1    Eclipse 3.0    Average
Eclipse 2.0    0.32           0.27           0.27           0.28
Eclipse 2.1    0.03           0.18           0.14           0.11
Eclipse 3.0    0.19           0.16           0.20           0.18
Average        0.18           0.20           0.20           0.19
Hypotheses
- 1. We can predict defects from programmer actions. ✔
- 2. We can isolate defect-prone programmer actions.
- 3. We can prevent defects by restricting programmer actions.
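Precision and recall, the two measures reported for the prediction models, follow the standard definitions. A minimal sketch, assuming each file is labelled as predicted defect-prone or not and as actually defective or not; the label lists below are made up for illustration and are not Eclipse data.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equally long lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)      # true positives
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)  # false positives
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for five files.
predicted = [True, False, True, False, False]
actual = [True, True, False, True, False]
precision, recall = precision_recall(predicted, actual)
print(precision, recall)
```

Precision is the share of files flagged as defect-prone that really had defects; recall is the share of defective files that were flagged.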
Defect correlations
IROP
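A correlation between character frequency and defect counts could be computed along the following lines. This is a sketch under stated assumptions: the slides do not say which correlation coefficient was used, so Pearson is assumed here, and the file contents and defect counts are invented toy values.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_characters(files, defects):
    """Rank characters by how strongly their per-file count correlates with defects."""
    counts = [Counter(text) for text in files]
    alphabet = set().union(*counts)
    scores = {ch: pearson([c[ch] for c in counts], defects) for ch in alphabet}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): three "files" and their defect counts.
files = ["iii", "i", "zz"]
defects = [3, 1, 0]
ranking = rank_characters(files, defects)
print(ranking)
```

Ranking every character this way and reporting only the top few is exactly the kind of multiple-comparison procedure that produces spurious hits.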
Hypotheses
- 1. We can predict defects from programmer actions. ✔
- 2. We can isolate defect-prone programmer actions. ✔
- 3. We can prevent defects by restricting programmer actions.
Explicit causes
IROP keyboard
Coding standards
if (p != null) { int i; while (p[i] < 0) i++; return i; }
when (q != null) { num n; as (q[n] < 0) n++; handback n; }
100% semantics preserving
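The renaming on the slide can be mimicked with a simple token substitution. A minimal sketch, assuming the banned letters are I, R, O and P in either case and using only the replacements visible on the slide; the helper names `uses_banned_letters` and `rename` are hypothetical.

```python
import re

# Letters named on the slides; banning both cases is an assumption.
BANNED = set("irop")

# Replacements taken from the slide's before/after example.
RENAMES = {
    "if": "when",
    "int": "num",
    "while": "as",
    "return": "handback",
    "p": "q",
    "i": "n",
}


def uses_banned_letters(text: str) -> bool:
    """True if the text contains any of the banned letters."""
    return any(ch in BANNED for ch in text.lower())


def rename(code: str) -> str:
    """Replace each identifier or keyword that has an entry in RENAMES."""
    return re.sub(
        r"[A-Za-z_]\w*",
        lambda m: RENAMES.get(m.group(0), m.group(0)),
        code,
    )


original = "if (p != null) { int i; while (p[i] < 0) i++; return i; }"
rewritten = rename(original)
print(rewritten)
print(uses_banned_letters(original), uses_banned_letters(rewritten))
```

The result only compiles in a language that accepts the new keywords, which is part of the joke.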
New habits
We can sun tete set majusculet, and t text says jus as swelm as antecedently. Let us jus ban tem!
Hypotheses
- 1. We can predict defects from programmer actions. ✔
- 2. We can isolate defect-prone programmer actions. ✔
- 3. We can prevent defects by restricting programmer actions. ✔
FAQs and threats
- 1. How about external validity?
- 2. Are the correlations significant?
- 3. Are the measures appropriate?
Future work
- Automatic renamings
- Abstraction
- vs. success / fame)
- Generalization
Failure is a four-letter word
Why all this is wrong
Correlation vs. Causation
Machine Learning works
Cherry Picking
Fix Causes, not Symptoms
Actionable Findings
Our Inspiration
http://xkcd.com/882/
Use Book in Class
Use Paper in Class
Failure is a Four-Letter Word
– A Parody in Empirical Research –
Andreas Zeller*, Saarland University, Saarbrücken, Germany, zeller@cs.uni-saarland.de
Thomas Zimmermann, Microsoft Research, Washington, USA, tzimmer@microsoft.com
Christian Bird, Microsoft Research, Washington, USA, cbird@microsoft.com
ABSTRACT
Background: The past years have seen a surge of techniques predicting failure-prone locations based on more or less complex metrics. Few of these metrics are actionable, though.
- available. Specifically, our contributions include:
http://www.st.cs.uni-saarland.de/softevo/irop/