
Calculation and Optimization of Thresholds for Sets of Software Metrics

Steffen Herbold, Jens Grabowski, Stephan Waack
Georg-August-Universität Göttingen, Institute of Computer Science

Optimization of Metric Sets with Thresholds 2

Contents

• Motivation
• Software Metrics
• Classification with Thresholds
• Optimization of Sets of Metrics
• Applications
• Case Studies
• Summary and Outlook


What is Software Engineering?

• Software engineering is the
  - development,
  - maintenance, and
  - deployment
  of high-quality software in consideration of
  - scientific methods,
  - economic principles,
  - planned development models, and
  - quantifiable goals.

Source: B. Kahlbrandt: Software-Engineering: Objektorientierte Software-Entwicklung mit der Unified Modeling Language, Springer Verlag (1998)


What is Software Quality?

[Figure: usage quality, maintenance quality, project quality, and process quality shown over a time axis marked Project Start, Start of Operation, and System Retirement]

Quality Assessment using ISO 9126

External and Internal Quality

Characteristic     Sub-characteristics
Functionality      Suitability, Accuracy, Interoperability, Security
Reliability        Maturity, Fault-Tolerance, Recoverability
Usability          Understandability, Learnability, Operability, Attractiveness
Efficiency         Time Behaviour, Resource Utilisation
Maintainability    Analysability, Changeability, Stability, Testability
Portability        Adaptability, Installability, Co-Existence, Replaceability

Sub-characteristics are assessed by means of metrics.


What do Software Metrics Measure?

• "You cannot control what you cannot measure" (Tom DeMarco)
• "To measure is to know" (Clerk Maxwell)

Engine power    100 PS
Fuel usage      5.8 l
Max. speed      176 km/h
Weight          1458 kg

???


Properties of Software Metrics

• Modes of measurement
  - internal
  - external
• Objects of measurement
  - products
  - processes
  - resources
• We perform an internal measurement of products by means of static analysis of source code.

Metrics for Methods and Classes

Levels of abstraction: Modules/Files, Classes, Methods

• Methods
  - Number of Statements (NST)
  - McCabe's Cyclomatic Number (VG)
  - Nested Block Depth (NBD)
  - Number of Function Calls (NFC)
• Classes
  - Coupling Between Objects (CBO)
  - Response For a Class (RFC)
  - Weighted Methods per Class (WMC)
  - Number of Overridden Methods (NORM)
  - Lines Of Code (LOC)
  - Number Of Methods (NOM)
  - Number of Static Methods (NSM)


Software Metrics for Methods

• Number of Statements (NST)
• McCabe's Cyclomatic Number (VG)
  - Number of branches in the control flow
• Nested Block Depth (NBD)
  - Max. depth of nested statement blocks
• Number of Function Calls (NFC)
  - Number of methods invoked by the method under investigation
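
The slide summarises VG informally as the number of branches. As a worked illustration, McCabe's cyclomatic number of a control-flow graph with E edges, N nodes, and P connected components is V(G) = E - N + 2P. The sketch below evaluates that formula for two hand-written graphs. The edge-list encoding and the function name `cyclomatic_number` are assumptions made for this example and do not come from the talk.

```python
def cyclomatic_number(edges, components=1):
    """McCabe's V(G) = E - N + 2P for a control-flow graph given as (src, dst) edges."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# Control flow of a function with a single if/else:
# entry -> then -> exit and entry -> else -> exit
if_else = [("entry", "then"), ("entry", "else"), ("then", "exit"), ("else", "exit")]

# The same graph with one additional back edge (a loop)
with_loop = if_else + [("else", "entry")]

print(cyclomatic_number(if_else))    # 4 edges - 4 nodes + 2 = 2
print(cyclomatic_number(with_loop))  # 5 edges - 4 nodes + 2 = 3
```

Each additional decision point adds one to the result, which is why the value is often read as a count of branches plus one.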

Metrics for Classes 1(2)

• Coupling Between Objects (CBO)
  - Number of associations with other classes
• Response For a Class (RFC)
  - Number of methods that can be called when the methods of the class under investigation are invoked
• Weighted Methods per Class (WMC)
  - Sum of complexities of all methods
  - Complexity: McCabe's Cyclomatic Number (VG)
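
As a small illustration of the WMC definition, the sketch below sums per-method VG values for one class. The method names and VG values are hypothetical.

```python
def weighted_methods_per_class(method_complexities):
    """WMC as the sum of the per-method complexities (here: VG values)."""
    return sum(method_complexities.values())


# Hypothetical class with three methods and their VG values
vg_per_method = {"parse": 7, "validate": 4, "render": 2}

print(weighted_methods_per_class(vg_per_method))  # 13
```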


Metrics for Classes 2(2)

• Number of Overridden Methods (NORM)
  - Number of redefined methods inherited from a superclass
• Lines Of Code (LOC)
  - Lines of source code without empty lines and comments
• Number Of Methods (NOM)
• Number of Static Methods (NSM)

Quality Assessment using ISO 9126 (revisited)

External and Internal Quality

Characteristic     Sub-characteristics
Functionality      Suitability, Accuracy, Interoperability, Security
Reliability        Maturity, Fault-Tolerance, Recoverability
Usability          Understandability, Learnability, Operability, Attractiveness
Efficiency         Time Behaviour, Resource Utilisation
Maintainability    Analysability, Changeability, Stability, Testability
Portability        Adaptability, Installability, Co-Existence, Replaceability

Sub-characteristics are assessed by means of metrics.


Thresholds

• Mechanism to classify values
• Metrics with upper and lower bound
  - Only upper bounds are considered


Thresholds for Methods

Name of Metric                     Programming Language   Threshold
McCabe's Cyclomatic Number (VG)    C                      24
                                   C++/C#                 10
Nested Block Depth (NBD)           C/C++/C#               5
Number of Function Calls (NFC)     C/C++/C#               5
Number of Statements (NST)         C/C++/C#               50
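
A minimal sketch of how such thresholds classify a method, using the C++/C# values from the table above. The talk does not say whether a value equal to its threshold counts as a violation. The code assumes only values strictly above the threshold do, and the sample metric values are hypothetical.

```python
# Upper-bound thresholds for C++/C# methods, as listed in the table
THRESHOLDS = {"VG": 10, "NBD": 5, "NFC": 5, "NST": 50}


def violations(metric_values, thresholds=THRESHOLDS):
    """Return the names of metrics whose value exceeds its threshold."""
    return [name for name, limit in thresholds.items() if metric_values[name] > limit]


def classify(metric_values, thresholds=THRESHOLDS):
    """'good' if no threshold is exceeded, 'bad' otherwise (assumed convention)."""
    return "bad" if violations(metric_values, thresholds) else "good"


# Hypothetical measurements of two methods
small_method = {"VG": 3, "NBD": 2, "NFC": 4, "NST": 18}
deep_method = {"VG": 12, "NBD": 6, "NFC": 5, "NST": 40}

print(classify(small_method), violations(small_method))  # good []
print(classify(deep_method), violations(deep_method))    # bad ['VG', 'NBD']
```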

Thresholds for Java Classes

Name of Metric                         Threshold
Weighted Methods per Class (WMC)       100
Coupling Between Objects (CBO)         5
Response For a Class (RFC)             100
Number of Overridden Methods (NORM)    3
Lines of Code (LOC)                    500
Number of Methods (NOM)                20
Number of Static Methods (NSM)         4


Thresholds and Rectangles

[Figure: entities plotted over the axes Metric 1 and Metric 2; Threshold 1 and Threshold 2 bound a rectangle]


General Idea

• Rectangles = sets of thresholds
• Rectangles are computed using machine learning
• Data-driven method
  - Based on previous measurements (or manual classification) of software
  - Measurements (or classification) partition the software into good and bad software

Optimization of Metric Sets

• Given:
  - Set of metrics: M = {m_1, …, m_n}
  - Software system: S = {s_1, s_2, …}
  - s_i = classes, methods or functions
  - Metric values m_1(s_i), …, m_n(s_i)
  - Classification f(s_i) ∈ {good, bad}
• Sought-after:
  - Subset M* ⊆ M (including thresholds) with
    f_M*(s_i) ≈ f(s_i) and |M*| minimal


Calculation of Thresholds

• Calculate thresholds for all subsets
  - {m_1}, {m_1, m_2}, {m_1, m_3}, …, {m_1, …, m_n}
  - 2^n subsets
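
The enumeration can be sketched as follows. The talk states that thresholds are computed with machine learning but does not show the algorithm. The code substitutes a simple tightest-fit rule (per metric, the largest value observed on a "good" entity) purely as a stand-in. The function names and sample data are hypothetical. Excluding the empty set, n metrics yield 2^n - 1 subsets.

```python
from itertools import combinations


def all_subsets(metrics):
    """Yield every non-empty subset of the metric names (2^n - 1 of them)."""
    for size in range(1, len(metrics) + 1):
        yield from combinations(metrics, size)


def tightest_thresholds(entities, labels, subset):
    """Stand-in learner: per metric, the largest value seen on a 'good' entity."""
    good = [e for e, label in zip(entities, labels) if label == "good"]
    return {m: max(e[m] for e in good) for m in subset}


# Hypothetical measurements of four methods and their given classification
entities = [
    {"VG": 3, "NBD": 2},
    {"VG": 9, "NBD": 4},
    {"VG": 15, "NBD": 3},
    {"VG": 8, "NBD": 7},
]
labels = ["good", "good", "bad", "bad"]

for subset in all_subsets(["VG", "NBD"]):
    print(subset, tightest_thresholds(entities, labels, subset))
```

Because the number of subsets grows exponentially, this exhaustive approach is only practical for small metric sets such as the four and seven metrics used in the talk.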

Selection of the Best Subset

• Determine classification error ε
  - deviation of the classification by the metric subset from that of the input set
  - probability of f_M*(s_i) ≠ f(s_i) (i.e., wrong classification)
• Select smallest subset with sufficiently small ε
  - ε ≤ δ for a selected error limit δ
  - start with δ = 1%
  - increase δ by 0.5% until a subset is found
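
The selection rule can be sketched as below. It assumes the classification error of every candidate subset is already known, in percent. The candidate errors are hypothetical, and breaking ties between equally small subsets by the lower error is an assumption, since the talk does not specify a tie-break.

```python
def select_subset(candidates, start=1.0, step=0.5, limit=100.0):
    """Pick the smallest subset with error <= delta, relaxing delta stepwise.

    candidates maps a tuple of metric names to its classification error in percent.
    Returns (subset, error, delta at which the subset was accepted).
    """
    delta = start
    while delta <= limit:
        admissible = [(len(s), err, s) for s, err in candidates.items() if err <= delta]
        if admissible:
            # Fewest metrics first, then lowest error (assumed tie-break)
            _, err, subset = min(admissible)
            return subset, err, delta
        delta += step
    raise ValueError("no subset meets the error limit")


# Hypothetical errors (in percent) for three candidate subsets
candidates = {("VG",): 1.3, ("NBD",): 2.6, ("VG", "NBD"): 1.2}

print(select_subset(candidates))  # (('VG',), 1.3, 1.5)
```

With δ = 1% no candidate qualifies, so δ is raised to 1.5%, at which point both `("VG",)` and `("VG", "NBD")` are admissible and the smaller one is chosen.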


Application Overview

• Size reduction of sets of metrics
  - Higher efficiency
• Simplification of classification
  - Better interpretation of classification
• Calculation of domain-specific thresholds
  - Automated quality assessment in organizations


Size Reduction of Sets of Metrics

• Given
  - Set of metrics M with corresponding thresholds
• Classify software by means of M
• Calculate optimal subset M* ⊆ M

M* is more efficient than M

Simplification of the Classification

• Goal:
  - Using thresholds instead of a more complex classifier f_complex such as
    - allowing certain violations of thresholds
    - decision trees
• Classify software S with classifier f_complex
• Select appropriate set of metrics M
• Calculate an optimal subset M* ⊆ M


Classification with one Violation

[Figure: entities plotted over the axes Metric 1 and Metric 2 with Threshold 1 and Threshold 2; classification with one violation allowed]
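
A minimal sketch of such a classifier: an entity still counts as good if it exceeds at most a given number of thresholds. The threshold values, the sample points, and the function name are hypothetical. Treating only values strictly above a threshold as violations is an assumption.

```python
def classify_with_violations(metric_values, thresholds, allowed=1):
    """'good' if at most `allowed` thresholds are exceeded, otherwise 'bad'."""
    exceeded = sum(1 for name, limit in thresholds.items() if metric_values[name] > limit)
    return "good" if exceeded <= allowed else "bad"


thresholds = {"metric_1": 10, "metric_2": 5}  # hypothetical threshold values
points = [
    {"metric_1": 4, "metric_2": 3},    # inside both thresholds
    {"metric_1": 14, "metric_2": 3},   # one violation
    {"metric_1": 14, "metric_2": 8},   # two violations
]

print([classify_with_violations(p, thresholds) for p in points])
# ['good', 'good', 'bad']
print([classify_with_violations(p, thresholds, allowed=0) for p in points])
# ['good', 'bad', 'bad']
```

With `allowed=0` the function reduces to plain threshold classification, where the good region is a single rectangle. With one violation allowed, the good region is no longer a rectangle, which is why it has to be approximated when plain thresholds are used instead.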

Approximation with Thresholds

[Figure: entities plotted over the axes Metric 1 and Metric 2 with Threshold 1 and Threshold 2]


Domain-specific Thresholds

• Assumption
  - No formal classifier available
• Expert provides base data
  - Manual classification of parts of a software product
  - Selection of a metric set M that may reproduce the manual classification
• Calculation of an optimal subset M* ⊆ M


Data Pool

• Based on 8 open source projects

Name                              Version       Programming Language   Size
Apache Webserver                  2.2.10        C                      6718 methods
kdebase                           12/05/2008    C++                    21404 methods
kdelibs                           12/05/2008    C++                    37444 methods
AspectDNG                         1.0.3         C#                     2759 methods
NetTopologySuite                  1.7.1.RC1     C#                     3059 methods
SharpDevelop                      2.2.1.2648    C#                     15700 methods
Eclipse Java Development Tools    3.2           Java                   4833 classes
Eclipse Platform Project          3.2           Java                   5399 classes

Case Study: Optimization of Metric Sets 1(2)

• C functions

               VG    NBD   NFC   NST
  Input        24    5     5     50

  Optimized: one remaining threshold, 5
  0.78% error, 75% size reduction

• C++ methods and C# methods

               VG    NBD   NFC   NST
  Input        10    5     5     50

  Optimized: one remaining threshold, 5
  0.59% error (C#), 0.06% error (C++), 75% size reduction


Case Study: Optimization of Metric Sets 2(2)

• Java classes

               WMC   CBO   RFC   NORM  LOC   NOM   NSM
  Input        100   5     100   3     500   20    4

  Optimized: three remaining thresholds, 5, 3, and 4
  0.27% error, 57% size reduction

Case Study: Usage of a Different Classifier 1(2)

• C functions, one violation is allowed

               VG    NBD   NFC   NST
  Input        24    5     5     50

  Optimized: one remaining threshold, 50
  0.84% error, 75% size reduction

• C++ methods, one violation is allowed

               VG    NBD   NFC   NST
  Input        10    5     5     50

  Optimized: one remaining threshold, 10
  0.87% error, 75% size reduction

• C# methods, one violation is allowed

               VG    NBD   NFC   NST
  Input        10    5     5     50

  Optimized: one remaining threshold, 9
  1.26% error, 75% size reduction


Case Study: Usage of a Different Classifier 2(2)

• Java classes, one violation is allowed

               WMC   CBO   RFC   NORM  LOC   NOM   NSM
  Input        100   5     100   3     500   20    4

  Optimized: four remaining thresholds, 98, 3, 20, and 4
  1.71% error, 42% size reduction

• Java classes, two violations are allowed

               WMC   CBO   RFC   NORM  LOC   NOM   NSM
  Input        100   5     100   3     500   20    4

  Optimized: two remaining thresholds, 99 and 110
  2.21% error, 71% size reduction

Results of Case Studies

• Successful size reduction of metric sets
  - 42%-75% smaller sets
• Error in the range of statistical noise
• Complex classifications can be replaced by thresholds
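
The reported size reductions can be checked from the number of metrics before and after optimization, using reduction = 1 - |M*| / |M|. The sketch below recomputes them from the set sizes shown in the case-study slides. The function name and case labels are assumptions made for this example.

```python
def size_reduction(input_size, optimized_size):
    """Relative reduction of the metric set in percent."""
    return 100.0 * (1 - optimized_size / input_size)


# (number of input metrics, number of remaining metrics) per case study
cases = {
    "methods, 4 -> 1": (4, 1),
    "Java classes, 7 -> 3": (7, 3),
    "Java classes, one violation, 7 -> 4": (7, 4),
    "Java classes, two violations, 7 -> 2": (7, 2),
}

for name, (n_in, n_out) in cases.items():
    print(f"{name}: {size_reduction(n_in, n_out):.1f}%")
```

The results are 75.0%, 57.1%, 42.9%, and 71.4%, so the percentages on the slides appear to be truncated to whole numbers rather than rounded (42.9% is reported as 42%).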


Summary

• Optimization of metric sets with thresholds for quality assessment
  - Simple method with high effectiveness
• Data-driven method for the calculation of thresholds
  - Based on machine learning algorithms
• Complex classifications are replaceable by thresholds
  - Leads to a better interpretability of assessment results
• Case studies show that a small metric set is sufficient
  - Low effort for data collection


Outlook

• Disjunctive normal forms instead of simple thresholds
• Rating instead of classification
  - critical, suspect, unproblematic
• Metric sets on other levels of abstraction
  - modules, projects
• Inclusion of metrics for processes and resources
  - number of errors, test effort

Example of a disjunctive normal form: (m_1 ∧ m_2) ∨ (m_1 ∧ m_3 ∧ m_4)

Thank you for your attention

Jens Grabowski, grabowski@informatik.uni-goettingen.de

For further details on the talk:

  • S. Herbold, J. Grabowski, S. Waack. Calculation and Optimization of Thresholds for Sets of Software Metrics. Accepted for publication in: Empirical Software Engineering, An International Journal. Springer, 2011.