How to Make Best Use of Cross-Company Data for Web Effort Estimation?
Leandro L. Minku University of Leicester, UK
Leandro Minku, Federica Sarro, Emilia Mendes and Filomena Ferrucci. How to Make Best Use of Cross-Company Data for Web Effort Estimation? Proceedings of the 9th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM’15) (best paper award)
Software effort estimation: estimating the effort (e.g., person-hours) required to develop software projects.
Web effort estimation: estimating the effort (e.g., person-hours) required to develop web projects, based on project features such as team expertise, number of web pages and number of images.
[17] E. Mendes. Practitioner’s Knowledge Representation. Springer-Verlag, 2014, DOI: 10.1007/978-3-642-54157-5_2.
[Figure: training projects -> learning algorithm -> model; new project -> model -> prediction]
Machine learning models can be used to perform effort estimations for a new project based on data describing past projects.
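As a rough illustration of the idea above, the sketch below estimates the effort of a new project from the most similar past projects. The k-nearest-neighbour learner, the two features and all project values are hypothetical choices made for the example; the slides do not prescribe a particular learning algorithm here.

```python
from math import dist
from statistics import mean


def knn_estimate(past_projects, new_project, k=2):
    """Estimate effort as the mean effort of the k most similar past projects."""
    ranked = sorted(past_projects, key=lambda p: dist(p[0], new_project))
    return mean(effort for _, effort in ranked[:k])


# Hypothetical past projects: ((web pages, images), effort in person-hours).
past_projects = [
    ((10, 5), 40.0),
    ((12, 8), 50.0),
    ((50, 30), 200.0),
    ((55, 40), 240.0),
]

# The two closest past projects have efforts 40 and 50 person-hours.
estimate = knn_estimate(past_projects, new_project=(11, 6), k=2)
print(estimate)
```

Any regression learner could take the place of `knn_estimate`; the point is only that the model is trained on past projects and then queried with the features of a new one.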
[Figure: WC training projects -> learning algorithm -> model; new project -> model -> prediction]
Early studies suggested that general-purpose models (e.g., COCOMO) needed to be calibrated to specific companies.
Problems of using only within-company (WC) data:
data may be prohibitive.
collected, they may be
a consistent manner.
[1] B. Boehm. Software Engineering Economics. Prentice-Hall, Englewood Cliffs, NJ, 1981.
[13] B. Kitchenham and N. Taylor. Software cost models. ICL Technical Journal, pages 73–102, 1984.
[16] P. Kok, B. Kitchenham, and J. Kirawkowski. The mermaid approach to software cost estimation. In ESPRIT, pages 296–314. 1990.
[Figure: CC training projects -> learning algorithm -> CC model; new WC project -> CC model -> prediction]
CC models are alternatives to WC models. [CC term used loosely.]
E.g., ISBSG (www.isbsg.org) and PROMISE (http://openscience.us/repo/).
Problem: CC data may have different characteristics from WC data, leading to poorly performing models.
TEAK, NN filtering, Dycom) have been achieving more promising results.
models in 6 out of 8 data sets.
data sets.
[15] E. Kocaguneli, T. Menzies, and E. Mendes. Transfer learning in effort estimation. Empirical Software Engineering, pages 1–31, 2014.
[33] B. Turhan and E. Mendes. A comparison of cross- versus single-company effort prediction models for web projects. In Euromicro Conference on Software Engineering and Advanced Applications, pages 285–292, 2014.
[28] L. L. Minku and X. Yao. How to make best use of cross-company data in software effort estimation? In ICSE, pages 446–456, 2014.
Our study is geared towards enabling Web development companies to make more efficient managerial decisions, by investigating Dycom.
[17] E. Mendes. Practitioner’s Knowledge Representation. Springer-Verlag, 2014, DOI: 10.1007/978-3-642-54157-5_2.
projects from a single company?
dataset for Web effort estimation?
previously used for CC Web effort estimation?
There is a relationship between the effort of two companies A and B:
Effort estimation models can be built by learning (1) CC models and (2) mapping functions based on a limited amount of WC data.
Mapping function
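To make the idea of a mapping function concrete, here is a minimal sketch in which the CC model's estimate is rescaled to the WC context by a single factor fitted on a few WC projects. The linear form of the mapping, the least-squares fit through the origin, and the names `fit_scale`, `cc_model` and `mapped_model` are assumptions for illustration, not the formulation shown on the slide.

```python
def fit_scale(cc_model, wc_projects):
    """Least-squares factor b (through the origin) so that b * cc_model(x) ~ y."""
    preds = [cc_model(x) for x, _ in wc_projects]
    num = sum(p * y for p, (_, y) in zip(preds, wc_projects))
    den = sum(p * p for p in preds)
    return num / den


def cc_model(pages):
    # Hypothetical CC model: 4 person-hours per web page.
    return 4.0 * pages


# A handful of hypothetical WC projects: (web pages, effort in person-hours).
wc_projects = [(10, 20.0), (20, 40.0), (30, 60.0)]

b = fit_scale(cc_model, wc_projects)


def mapped_model(pages):
    """CC estimate mapped to the WC context."""
    return b * cc_model(pages)


print(b, mapped_model(40))
```

In this toy data the WC company needs half the effort the CC model predicts, so three WC projects are enough to recover the factor; the CC model itself is reused unchanged.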
[Figure: Dycom architecture. CC data are split into high-, medium- and low-productivity subsets, used to build CC Model 0, CC Model 1 and CC Model 2. These give Mapped Model 0, Mapped Model 1 and Mapped Model 2, which are combined with a WC model built from WC data in a weighted ensemble.]
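The weighted ensemble in this architecture can be sketched as a weighted average over the mapped models and the WC model. How the weights are set is not recoverable from the slide, so the fixed weights below, the toy linear models and the helper names `linear` and `ensemble_predict` are all hypothetical.

```python
def linear(rate):
    """Return a toy model predicting `rate` person-hours per web page."""
    return lambda pages: rate * pages


def ensemble_predict(models, weights, x):
    """Weighted average of the models' predictions for project x."""
    total = sum(weights)
    return sum(w * m(x) for m, w in zip(models, weights)) / total


# Hypothetical mapped models (one per CC productivity subset) and a WC model.
mapped_models = [linear(2.0), linear(3.0), linear(5.0)]
wc_model = linear(4.0)

models = mapped_models + [wc_model]
weights = [1, 2, 3, 4]  # assumed weights; normalised inside ensemble_predict

prediction = ensemble_predict(models, weights, 10)
print(prediction)
```

Normalising by the sum of the weights keeps the prediction on the same scale as the individual models, whatever the absolute values of the weights are.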
[Equation: piecewise definition indexed by i, with cases including "if no WC training example has been received yet" and "if (x, y) is the first WC training example".]
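One plausible reading of the cases above is an online update of a per-model mapping factor as WC examples arrive. The sketch below is an assumption loosely modelled on the description of Dycom in [28]: the expressions for each case, the smoothing step for later examples, and the names `update_factor` and `lr` are not taken from the slide.

```python
def update_factor(b, n_seen, cc_prediction, y, lr=0.5):
    """Update the mapping factor after a WC example with true effort y.

    n_seen is the number of WC examples received before this one.
    """
    ratio = y / cc_prediction
    if n_seen == 0:
        return ratio  # first WC example: take the observed ratio directly
    # Later examples: smooth towards the newly observed ratio (assumed rule).
    return lr * ratio + (1 - lr) * b


def cc_model(pages):
    # Hypothetical CC model: 4 person-hours per web page.
    return 4.0 * pages


b = 1.0  # no WC example received yet: CC estimates are used unchanged
wc_stream = [(10, 20.0), (20, 60.0)]  # hypothetical (web pages, effort) pairs
for n_seen, (pages, effort) in enumerate(wc_stream):
    b = update_factor(b, n_seen, cc_model(pages), effort)

print(b)
```

With this rule the factor starts at 1, jumps to the first observed ratio (0.5), and then moves halfway towards each later ratio, so the mapping can track changes in the WC company over time.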
8 WC data sets from the Tukutuku database.
[23] E. Mendes, N. Mosley, and S. Counsell. Investigating web size metrics for early web cost estimation. JSS, 77(2):157–172, 2005.
for Web projects from a single company?
Dycom almost always performed better than the mean.
Dycom performed similarly to or better than the median most of the time. NN-filtering performed worse than the median in five cases.
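For readers unfamiliar with such baselines, the sketch below shows how mean and median predictors can be built and scored. It assumes the baselines predict the mean or median effort of previously seen projects and uses mean absolute error as the measure; neither detail is stated on the slides, and the effort values are made up.

```python
from statistics import mean, median


def mae(actual, predicted):
    """Mean absolute error between actual and predicted efforts."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical efforts in person-hours.
train_efforts = [10.0, 20.0, 30.0, 200.0]
test_efforts = [15.0, 25.0]

mean_baseline = mean(train_efforts)      # pulled up by the 200-hour outlier
median_baseline = median(train_efforts)  # robust to the outlier

mae_mean = mae(test_efforts, [mean_baseline] * len(test_efforts))
mae_median = mae(test_efforts, [median_baseline] * len(test_efforts))
print(mae_mean, mae_median)
```

A learned model is only worth using if its error is lower than that of such trivial predictors, which is why comparisons against them are informative.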
modified leave-one-out procedure.
compared to a WC dataset for Web effort estimation?
Dycom frequently performed similarly to or better than the WC model. Other approaches that try to make CC data more similar to WC data did not perform better than the WC model.
techniques previously used for CC Web effort estimation?
Dycom performed similarly to or better than NN-filtering in all cases except one.
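As background on the comparison, NN-filtering is commonly described as keeping only the CC projects that are nearest to the company's own projects and training on that subset. The sketch below follows that general description; the Euclidean distance, the value of k, the function name `nn_filter` and the data are assumptions, not the setup used in the study.

```python
from math import dist


def nn_filter(cc_projects, wc_projects, k=1):
    """Keep the k CC projects nearest to each WC project (by features)."""
    selected = set()
    for wc_features, _ in wc_projects:
        ranked = sorted(
            range(len(cc_projects)),
            key=lambda i: dist(cc_projects[i][0], wc_features),
        )
        selected.update(ranked[:k])
    return [cc_projects[i] for i in sorted(selected)]


# Hypothetical projects: ((web pages, images), effort in person-hours).
cc_projects = [
    ((10, 5), 40.0),
    ((100, 80), 900.0),
    ((12, 6), 55.0),
    ((300, 200), 2500.0),
]
wc_projects = [((11, 5), 30.0), ((13, 7), 35.0)]

filtered = nn_filter(cc_projects, wc_projects, k=1)
print(filtered)
```

Note that filtering selects CC projects with similar features but leaves their effort values untouched, which is a different strategy from mapping CC estimates to the WC context.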
projects from a single company?
from a single company when using Dycom -- it was almost always better than mean, median or random guess.
dataset for Web effort estimation?
while using only half of WC data.
previously used for CC Web effort estimation?
except for one.
companies similar to the ones in this study and who have just a few WC projects.
be implemented so that empirical studies on site can be performed.
understanding of the relationship between efforts
improve productivity.
be investigated in future research.
data sets.
investigated.
sometimes did not perform as well as a WC model.
Dycom vs. mean vs. median vs. WC model vs. NN-filtering