Improving Bug Prediction Accuracy by Regularization and Hyperparameter Optimization
Haidar Osman, Mohammad Ghafari, Oscar Nierstrasz
Number of Bugs, Class (Buggy or Clean)
Confusion Matrix, Prediction Error, Cost Effectiveness
Source Code Metrics, Change Metrics, Organizational Metrics
Package, Class
Filters, Wrappers
KNN, SVM, Linear Regression, Poisson Regression
SVM hyperparameters: Complexity, Kernel, Exponent, Gamma, Sigma, Omega
KNN hyperparameters: Search Algorithm, Evaluation, # Neighbors
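Tuning a hyperparameter such as the number of neighbors can be sketched as a grid search scored by cross-validated RMSE. The sketch below is only an illustration in plain Python: the slides do not state which search procedure or tooling was used, the helper names (knn_predict, cv_rmse, tune_k), the candidate values, and the fold scheme are assumptions, and the data is synthetic rather than taken from the study.

```python
import math
import random


def knn_predict(train_X, train_y, x, k):
    """Predict the mean target of the k training points closest to x."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def cv_rmse(X, y, k, folds=5):
    """Mean RMSE of k-NN across interleaved cross-validation folds."""
    scores = []
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        test = [i for i in range(len(X)) if i % folds == f]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        preds = [knn_predict(train_X, train_y, X[i], k) for i in test]
        scores.append(rmse([y[i] for i in test], preds))
    return sum(scores) / folds


def tune_k(X, y, candidates=(1, 3, 5, 9, 15)):
    """Grid search: return the candidate k with the lowest cross-validated RMSE."""
    results = {k: cv_rmse(X, y, k) for k in candidates}
    return min(results, key=results.get), results


# Synthetic stand-in data (NOT the study's datasets): two made-up metrics per
# class and a noisy bug count that depends on the first metric only.
random.seed(0)
X = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(200)]
y = [0.5 * a + random.gauss(0, 1.0) for a, _ in X]

best_k, results = tune_k(X, y)
print("best k:", best_k)
for k, score in results.items():
    print(f"k={k:2d}  cross-validated RMSE={score:.3f}")
```

The same loop extends to more than one hyperparameter by iterating over combinations of candidate values instead of a single list.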
[Figure: number of buggy and clean classes per system (axis: Number of Classes; visible system labels: JDT Core, PDE UI). Bar labels in extraction order: 629, 1'620, 195, 1'288, 798, 62, 242, 129, 209, 199.]
SVM: prediction error (RMSE) without and with hyperparameter tuning

System            Not tuned  Tuned
Eclipse JDT Core  0.8        0.81
Eclipse PDE UI    0.65       0.63
Equinox           1.05       0.94
Lucene            0.41       0.42
Mylyn             0.53       0.53
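The prediction error in these charts is reported as RMSE. For reference, the standard definition is given below; the symbols are assumptions chosen for illustration and do not appear in the slides: y_i is the actual number of bugs in class i, \hat{y}_i is the predicted number, and n is the number of classes evaluated.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}
```

Lower values mean predictions closer to the actual bug counts, so a tuned model improves on the default when its RMSE is smaller.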
IBK: prediction error (RMSE) without and with hyperparameter tuning

System            Not tuned  Tuned
Eclipse JDT Core  0.99       0.79
Eclipse PDE UI    0.8        0.62
Equinox           1.19       1.03
Lucene            0.55       0.43
Mylyn             0.66       0.52
[Diagram: Filter, then Train]
[Diagram: feature subset, then repeated Train steps]
[Diagram: Train with Lasso, Ridge, or Elastic regularization]
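Lasso, Ridge, and Elastic Net differ only in the penalty added to the least-squares loss during training. A minimal sketch of that idea, using coordinate descent in plain Python, is shown below. It is not the implementation behind the reported results: the function names, the parameters lam (penalty strength) and alpha (mix between L1 and L2), the objective scaling, and the toy data are all assumptions, and the features are assumed to be centered so that no intercept is fitted.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam, alpha, iterations=200):
    """Minimize (1/(2n))*||y - Xw||^2 + lam*(alpha*||w||_1 + (1-alpha)/2*||w||^2).

    alpha=1 gives Lasso, alpha=0 gives Ridge, lam=0 gives plain least squares.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(iterations):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][m] * w[m] for m in range(p) if m != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            # L1 part soft-thresholds the coefficient; L2 part shrinks it.
            w[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
    return w


# Toy data (made up): y depends strongly on feature 0 and weakly on feature 1.
X = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
y = [2.1, 1.9, -1.9, -2.1]

for name, lam, alpha in [("None", 0.0, 0.0), ("Ridge", 0.5, 0.0),
                         ("Lasso", 0.5, 1.0), ("Elastic", 0.5, 0.5)]:
    print(name, elastic_net(X, y, lam, alpha))
```

On this toy data the L1 term in Lasso and Elastic Net sets the weak coefficient to exactly zero, while Ridge only shrinks both coefficients, which is why L1-based regularization doubles as a form of feature selection.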
Linear Regression: prediction error by regularization method

System            None  Ridge  Lasso  Elastic
Eclipse JDT Core  0.96  0.8    0.82   0.81
Eclipse PDE UI    0.59  0.57   0.58   0.58
Equinox           0.98  0.92   1.01   1.01
Lucene            0.4   0.38   0.38   0.38
Mylyn             0.52  0.51   0.51   0.51
Prediction error by regularization method

System            None  Ridge  Lasso  ElasticNet
Eclipse JDT Core  1.82  0.91   0.89   0.86
Eclipse PDE UI    0.69  0.6    0.6    0.6
Equinox           1.37  1.02   0.92   0.91
Lucene            0.59  0.4    0.4    0.4
Mylyn             0.71  0.54   0.54   0.53