Using program slicing data to predict code faults

David Bowes
University of Hertfordshire
February 10, 2010

Outline
◮ Using program slicing data to predict code faults
◮ Calculating the Slicing metrics for a 'module'
◮ Relating slicing metrics to 'fault' data
◮ Conclusion

Why?
◮ Defect prediction: 70% using machine learning
◮ Slicing metrics are rarely used for defect prediction
◮ Slicing metrics have some relationship to cohesion
◮ Slicing metrics do not tend to be a proxy for LOC

Code example

    public class Fib {
        int start=1;//may be err?

        public static void main(String[] args) {
            Fib f = new Fib();
            for (int i = 1; i < 10; i++) {
                System.out.println(i+" "+f.fib(i));
            }
        }

        public int fib(int n) {
            int a = 0, b = 1;
            int c = start, d = 1;//fix me?
            while (c < n) {
                System.out.printf(" debug %d\r\n", d);
                d = a + b;
                a = b;
                b = d;
                c++;
            }
            return b;
        }
    }

Slicing Metrics
Weiser, Ott and Thuss defined a set of slice-based metrics including:
◮ Tightness: The number of statements which are in every slice. High tightness values suggest that the code is cohesive.
◮ Overlap: Indicates how many statements in a slice are found only in that slice
◮ Coverage: Compares the length of slices to the length of the entire program
◮ Min Coverage: The length of the shortest slice as a proportion of the program length
◮ Max Coverage: The length of the longest slice as a proportion of the program length
New metric (Counsell et al.):
◮ NHD
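The metric definitions above can be sketched in code. The slides give only prose, so this sketch assumes the ratio forms commonly attributed to Ott and Thuss: each slice is a set of statement numbers, tightness and the coverage metrics are normalised by module length, and overlap is the mean share of each slice taken up by the statements common to all slices. The class name `SliceMetrics` and its method names are hypothetical, and slices are assumed to be non-empty.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Slice-based metrics for one module; each slice is a set of statement numbers. */
final class SliceMetrics {

    /** Statements that appear in every slice. */
    static Set<Integer> intersection(List<Set<Integer>> slices) {
        Set<Integer> common = new HashSet<>(slices.get(0));
        for (Set<Integer> slice : slices) {
            common.retainAll(slice);
        }
        return common;
    }

    /** Share of the module's statements that are in every slice. */
    static double tightness(List<Set<Integer>> slices, int moduleLength) {
        return (double) intersection(slices).size() / moduleLength;
    }

    /** Mean share of each slice that is taken up by the common statements. */
    static double overlap(List<Set<Integer>> slices) {
        int common = intersection(slices).size();
        double sum = 0.0;
        for (Set<Integer> slice : slices) {
            sum += (double) common / slice.size();
        }
        return sum / slices.size();
    }

    /** Mean slice length as a proportion of the module length. */
    static double coverage(List<Set<Integer>> slices, int moduleLength) {
        double sum = 0.0;
        for (Set<Integer> slice : slices) {
            sum += (double) slice.size() / moduleLength;
        }
        return sum / slices.size();
    }

    /** Length of the shortest slice as a proportion of the module length. */
    static double minCoverage(List<Set<Integer>> slices, int moduleLength) {
        int shortest = Integer.MAX_VALUE;
        for (Set<Integer> slice : slices) {
            shortest = Math.min(shortest, slice.size());
        }
        return (double) shortest / moduleLength;
    }

    /** Length of the longest slice as a proportion of the module length. */
    static double maxCoverage(List<Set<Integer>> slices, int moduleLength) {
        int longest = 0;
        for (Set<Integer> slice : slices) {
            longest = Math.max(longest, slice.size());
        }
        return (double) longest / moduleLength;
    }
}
```

For a made-up 10-statement module with slices {1,2,3,4,5,6}, {1,2,3,7} and {1,2,3,4,8,9}, three statements are common to all slices, so tightness is 0.3, min coverage 0.4 and max coverage 0.6. Under these definitions a module with a single slice always has overlap 1 and identical tightness, coverage, min coverage and max coverage.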

Which variables to choose?
Previous studies exploring the efficacy of slice-based metrics have tended to use different sets of variables in specifying the slices:

Category                 Studies   Description
Formal ins (Vi)          6         Input parameters for the function specified in the module declaration
Formal outs (Vo)         8         The set of return variables
Global variables (Vg)    9         The set of variables which are used or may be affected by the module
printfs (Vp)             7         Variables which appear as formal outs in the list of parameters in an output statement (e.g. printf)


What impact does the choice of variables have?
◮ Studied barcode, an open source barcode printing utility.
◮ http://ar.linux.it/software/barcode/barcode.html
◮ For 15 variants of variables:

Categories selected
(one + per category
of Vi, Vo, Vg, Vp)     Overlap   Tightness   Coverage   Min C   Max C
+ + + +                0.649     0.481       0.691      0.523   0.901
+ + +                  0.643     0.482       0.705      0.524   0.901
+ + +                  0.712     0.551       0.717      0.588   0.898
+ + +                  0.759     0.563       0.712      0.587   0.892
+ + +                  0.745     0.519       0.671      0.543   0.845
+ +                    0.728     0.560       0.743      0.590   0.898
+ +                    0.772     0.518       0.653      0.538   0.820
+ +                    0.839     0.672       0.764      0.694   0.885
+ +                    0.767     0.521       0.653      0.544   0.761
+ +                    0.728     0.560       0.743      0.590   0.898
+ +                    0.820     0.591       0.688      0.610   0.792
+                      0.944     0.823       0.856      0.832   0.885
+                      1.000     0.612       0.612      0.612   0.612
+                      0.851     0.538       0.639      0.547   0.717
+                      0.749     0.464       0.597      0.496   0.778
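The count of 15 variants is what one gets from every non-empty combination of the four variable categories (2^4 - 1 = 15): one with all four, four with three, six pairs and four singles. A minimal sketch of that enumeration follows; the class name `VariableVariants` is hypothetical and the order produced is simply bitmask order, not necessarily the order used in the study.

```java
import java.util.ArrayList;
import java.util.List;

/** Enumerates every non-empty combination of the four slicing-variable categories. */
final class VariableVariants {

    static final String[] CATEGORIES = {"Vi", "Vo", "Vg", "Vp"};

    static List<List<String>> nonEmptySubsets() {
        List<List<String>> variants = new ArrayList<>();
        // Each mask from 1 to 15 selects a different non-empty subset.
        for (int mask = 1; mask < (1 << CATEGORIES.length); mask++) {
            List<String> variant = new ArrayList<>();
            for (int bit = 0; bit < CATEGORIES.length; bit++) {
                if ((mask & (1 << bit)) != 0) {
                    variant.add(CATEGORIES[bit]);
                }
            }
            variants.add(variant);
        }
        return variants;
    }

    public static void main(String[] args) {
        for (List<String> variant : nonEmptySubsets()) {
            System.out.println(String.join("+", variant));
        }
    }
}
```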

Relating slicing metrics to 'fault' data: Getting data
Technique:
◮ Find a bug fix
◮ Assume before (α) was defective and after (β) was less defective.
◮ Do the metrics of α predict a change to less defective state β? [1]

[1] This technique produces balanced data so accuracy can be used to compare results.
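One reading of the technique above is that every bug fix contributes two instances, the pre-fix version α labelled defective and the post-fix version β labelled less defective, which is why the classes come out balanced. The sketch below illustrates that reading only; the class names `LabelledVersion` and `BugFixLabelling` and the array layout are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** One version of a module: its slicing metrics and the label assumed for it. */
final class LabelledVersion {
    final double[] metrics;
    final boolean defective;

    LabelledVersion(double[] metrics, boolean defective) {
        this.metrics = metrics;
        this.defective = defective;
    }
}

final class BugFixLabelling {

    /** Each fix is {metrics before the fix (alpha), metrics after the fix (beta)}. */
    static List<LabelledVersion> fromFixes(List<double[][]> fixes) {
        List<LabelledVersion> data = new ArrayList<>();
        for (double[][] fix : fixes) {
            data.add(new LabelledVersion(fix[0], true));   // alpha: assumed defective
            data.add(new LabelledVersion(fix[1], false));  // beta: assumed less defective
        }
        return data;
    }

    /** Fraction of instances labelled defective; 0.5 means the classes are balanced. */
    static double defectiveShare(List<LabelledVersion> data) {
        int defective = 0;
        for (LabelledVersion version : data) {
            if (version.defective) {
                defective++;
            }
        }
        return (double) defective / data.size();
    }
}
```

Because each fix adds exactly one instance to each class, the defective share is 0.5 regardless of how many fixes are collected, so a classifier that always guesses one class scores 50% accuracy.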

Whack it into Weka
◮ For each variant of slicing variable:
◮ format the data for Weka
◮ use a Naive Bayes classifier
◮ 10-fold cross-validation
◮ report accuracy
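The "format the data for Weka" step can be sketched as writing the metrics in Weka's ARFF text format: a relation name, one numeric attribute per metric, a nominal class attribute, then one comma-separated row per instance. The slides do not show the actual file layout, so the class name `ArffWriter`, the attribute names and the class labels `defective` and `fixed` are all assumptions. In Weka itself the remaining steps would typically use `weka.classifiers.bayes.NaiveBayes` with `Evaluation.crossValidateModel` and 10 folds; that part is left out here so the sketch needs only the standard library.

```java
import java.util.List;
import java.util.Locale;

/** Writes slicing metrics as ARFF text (attribute names and labels are illustrative). */
final class ArffWriter {

    static String toArff(String relation, String[] attributes,
                         List<double[]> rows, List<String> labels) {
        StringBuilder arff = new StringBuilder();
        arff.append("@relation ").append(relation).append("\n\n");
        for (String attribute : attributes) {
            arff.append("@attribute ").append(attribute).append(" numeric\n");
        }
        // Nominal class attribute; the two labels are assumed names.
        arff.append("@attribute class {defective,fixed}\n\n");
        arff.append("@data\n");
        for (int i = 0; i < rows.size(); i++) {
            for (double value : rows.get(i)) {
                // Locale.ROOT keeps the decimal point regardless of system locale.
                arff.append(String.format(Locale.ROOT, "%.3f", value)).append(',');
            }
            arff.append(labels.get(i)).append('\n');
        }
        return arff.toString();
    }

    public static void main(String[] args) {
        String[] attributes = {"overlap", "tightness", "coverage", "min_coverage", "max_coverage"};
        List<double[]> rows = List.of(new double[] {0.649, 0.481, 0.691, 0.523, 0.901});
        System.out.print(toArff("slicing_metrics", attributes, rows, List.of("defective")));
    }
}
```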

Results using diff data
[Figure: Predicting defects using slicing metrics using diff data. Accuracy (y-axis, 0 to 0.45) for the diffs data against slicing variables (x-axis): a:all, b:no Vp, c:no Vg, d:no Vo, e:no Vi, f:i+o, g:g+p, h:i+p, i:o+g, j:i+g, k:o+p, l:i, m:o, n:g, o:p.]

Results
[Figure: Accuracy measure for predicting defectiveness from slicing metrics. Accuracy (y-axis, 0 to 0.5) for three data sets (comments, diffs, sliding window) against slicing variables (x-axis): a:all, b:no Vp, c:no Vg, d:no Vo, e:no Vi, f:i+o, g:g+p, h:i+p, i:o+g, j:i+g, k:o+p, l:i, m:o, n:g, o:p.]
