Using Metrics to Identify Design Patterns in Object-Oriented Software, G. Antoniol


slide-1
SLIDE 1

Using Metrics to Identify Design Patterns in Object-Oriented Software

EPM

G. Antoniol

slide-2
SLIDE 2

Outline

  • Motivations;
  • Design patterns concepts;
  • Design patterns recovery approach;
  • Experimental results;
  • Conclusions.

slide-3
SLIDE 3

Motivations

  • Program comprehension;
  • Aid in maintenance;
  • Design-implementation compliance check;
  • Design quality evaluation.

slide-4
SLIDE 4

Design Patterns

  • Reusable, well-known solutions to common design problems;
  • Categories:
      structural;
      creational;
      behavioral.

[Figure: Proxy pattern. Subject declares op(); RealSub and Proxy each provide op(); Proxy is linked to RealSub by the Refers association, and Proxy's op() calls RealSub->op().]
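The Proxy structure from the figure can be sketched in a few lines of Python. This is only an illustration of the class roles named on the slide (Subject, RealSub, Proxy, op, and the Refers link); the return value and the `_real` attribute name are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Subject(ABC):
    """Common interface shared by the real object and its proxy."""

    @abstractmethod
    def op(self) -> str:
        ...


class RealSub(Subject):
    """The object that does the actual work."""

    def op(self) -> str:
        return "RealSub.op"


class Proxy(Subject):
    """Stands in for a RealSub and forwards op() to it."""

    def __init__(self, real: RealSub) -> None:
        self._real = real  # the Refers association (attribute name is hypothetical)

    def op(self) -> str:
        # Delegation: Proxy.op() calls RealSub.op()
        return self._real.op()


proxy = Proxy(RealSub())
result = proxy.op()
```

A client holding `proxy` only sees the `Subject` interface, which is why the recovery process later looks for two subclasses of a common parent plus one association between them.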

slide-5
SLIDE 5

Pattern/Motif/Architecture

  • Pattern also means intent, consequences, …
  • A motif is just an instance in code of a trait, act, or feature: something we observe that is similar in nature to a DP
  • The pattern encompasses a lot more than structure or behaviour
  • Actually we observe the micro-architecture and related motifs pointing to a DP
  • Motif intent may never be known

slide-6
SLIDE 6

Approach

  • Metrics-based multi-stage recovery process for structural design patterns;
  • Conservative recovery process;
  • Uniform representation of design and code using a design description language;
  • Experimented on industrial and public-domain C++ applications.

slide-7
SLIDE 7

Old Metric Based Recovery Process


slide-8
SLIDE 8

A META-Level Design/Code AOL Representation

  • AOL: a design description language;
  • Captures OO design concepts from design and/or code;
  • UML-based;
  • StP/OMT-to-AOL translator;
  • C++-to-AOL translator.

Example (Proxy pattern in AOL):

    CLASS Subject OPERATIONS PUBLIC op();
    CLASS RealSub OPERATIONS PUBLIC op();
    CLASS Proxy OPERATIONS PUBLIC op();
    GENERALIZATION Subject SUBCLASSES RealSub, Proxy;
    RELATION Refers ROLES CLASS Proxy MULT one, CLASS RealSub MULT one

slide-9
SLIDE 9

Computed Metrics

  • Private, public, protected methods and attributes
  • Relation number:
      association
      aggregation
      inheritance
  • Depth of the inheritance tree, number of children ...
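As a rough illustration of how such metrics could be computed, the sketch below counts children, parents, and associations per class and derives the inheritance depth. The list-of-pairs model, the choice to count an association on its source class only, and the single-inheritance assumption are all hypothetical simplifications, not the tool's actual representation.

```python
from collections import Counter

# Hypothetical design model mirroring the Proxy example:
# (child, parent) inheritance pairs and (source, target) association pairs.
classes = ["Subject", "RealSub", "Proxy"]
inheritance = [("RealSub", "Subject"), ("Proxy", "Subject")]
associations = [("Proxy", "RealSub")]

kids = Counter(parent for _, parent in inheritance)
fathers = Counter(child for child, _ in inheritance)
assoc = Counter(source for source, _ in associations)  # assumption: counted on the source side

# One metric vector per class: (kids nbr, father nbr, association nbr)
metrics = {c: (kids[c], fathers[c], assoc[c]) for c in classes}

parent_of = dict(inheritance)  # assumption: single inheritance


def depth(cls):
    """Depth of a class in the inheritance tree (a root has depth 0)."""
    d = 0
    while cls in parent_of:
        cls = parent_of[cls]
        d += 1
    return d
```

Here `metrics["Subject"]` is `(2, 0, 0)` and `metrics["Proxy"]` is `(0, 1, 1)`, the kind of per-class vectors the metrics-based stage compares against pattern thresholds.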

slide-10
SLIDE 10

Multi-Stage Pattern Recovery

Given a design pattern

$$p = \langle (e_1, \ldots, e_k), \mathcal{R} \rangle$$

  • Pattern recovery: check the constraints on all dispositions of n classes taken k by k.
  • Complexity: $O(n^k)$
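To see why exhaustive checking is expensive, the following sketch enumerates every disposition (ordered selection without repetition) of k classes out of n. The class names and sizes are made up for the illustration.

```python
from itertools import permutations
from math import factorial

classes = ["A", "B", "C", "D", "E"]  # hypothetical design with n = 5 classes
k = 3                                # pattern size, e.g. Proxy

# Every ordered assignment of k distinct classes to the k pattern roles
candidates = list(permutations(classes, k))
count = len(candidates)

# n! / (n - k)! dispositions, which is bounded by n ** k
expected = factorial(len(classes)) // factorial(len(classes) - k)
```

With n = 5 and k = 3 there are already 60 tuples to check (at most 5^3 = 125); for designs with hundreds of classes the count grows polynomially with exponent k, which is what motivates pruning the search space before checking exact constraints.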

slide-11
SLIDE 11

Metrics-based stage

  • Software metrics allow the search space to be pruned effectively by means of class metrics and class-pair shortest paths;
  • First kind of constraint:
      single-class-level metrics;
  • Output:
      a set of class candidate sets for each pattern searched.

slide-12
SLIDE 12

Metrics-based Stage: continued

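For a pattern $p = \langle (e_1, \ldots, e_k), \mathcal{R} \rangle$, the metric thresholds are

$$M_p = \langle \mathbf{m}_1, \ldots, \mathbf{m}_k \rangle$$

and the candidate set for constituent $e_i$ is

$$C_i = \{\, x \mid x \in D,\ \mathbf{m}_i \in M_p,\ \forall m_{i,j} \in \mathbf{m}_i : m_j[x] \ge m_{i,j} \,\}$$

Example: Proxy. <Sub, RealSub, Proxy>, k=3.

Suppose collected metrics are: kids nbr, father nbr, association nbr

$$M_{Proxy} = \langle [2,0,0],\ [0,1,0],\ [0,1,1] \rangle$$

$\langle c_1, c_2, c_3 \rangle \in D(\mathrm{Proxy})$ only if

$$\#kids(c_1) \ge 2, \quad \#inh(c_2) \ge 1, \quad \#inh(c_3) \ge 1, \quad \#assoc(c_3) \ge 1$$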
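The metrics filter can be sketched as a per-role threshold test. The thresholds below read the Proxy example as "at least 2 kids" for the Subject role, "at least 1 father" for RealSub, and "at least 1 father and 1 association" for Proxy; the design classes and their measured values are invented for the illustration.

```python
# Metric order: (kids nbr, father nbr, association nbr)
M_PROXY = [(2, 0, 0), (0, 1, 0), (0, 1, 1)]  # roles: Subject, RealSub, Proxy

# Hypothetical measured metrics for a small design
design = {
    "Shape":  (2, 0, 0),
    "Circle": (0, 1, 0),
    "Lazy":   (0, 1, 1),
    "Util":   (0, 0, 3),
}


def candidate_sets(design, thresholds):
    """For each role, keep the classes whose metrics all reach the role's minima."""
    return [
        {name for name, values in design.items()
         if all(v >= t for v, t in zip(values, role))}
        for role in thresholds
    ]


C = candidate_sets(design, M_PROXY)
```

In this toy design only `Shape` can play Subject, `Circle` and `Lazy` can play RealSub, and only `Lazy` can play Proxy; `Util` is discarded for every role without any structural check.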

slide-13
SLIDE 13

Metrics-based Stage: Shortest Path

  • Second kind of constraint: topology based;
  • Class-pair shortest paths;
  • Given a class: the remaining pattern constituents must be reachable in a number of steps constrained by the pattern structure;
  • Output: reduced candidate sets $R_j$

slide-14
SLIDE 14

Topology Constraint


slide-15
SLIDE 15

Shortest Path Equations

Let $C_{\min}$ be the smallest candidate set. Given $y \in C_{\min}$, build for every $i$:

$$R_i(y) = \{\, x \in C_i \mid \mathrm{ShPath}(x, y) = \mathrm{ShPath}(e_i, e_{\min}) \,\}$$

$$R = \{\, \langle r_1(y), \ldots, y, \ldots, r_n(y) \rangle \mid y \in C_{\min} \wedge r_i(y) \in R_i(y) \,\}$$
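A minimal sketch of the shortest-path pruning, assuming relations are treated as undirected edges and distances are computed by breadth-first search. The graphs, class names, and the extra rule that one class cannot fill two roles are assumptions made for the example.

```python
from collections import deque
from itertools import product


def undirected(edges):
    """Build an adjacency map, treating every relation as an undirected edge."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def sh_path(graph, src, dst):
    """Breadth-first shortest path length; None when dst is unreachable."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def reduce_candidates(design, pattern, roles, C):
    """Keep tuples whose distances to y equal the corresponding pattern distances."""
    i_min = min(range(len(C)), key=lambda i: len(C[i]))  # index of the smallest set
    reduced = []
    for y in sorted(C[i_min]):
        per_role = []
        for i, role in enumerate(roles):
            if i == i_min:
                per_role.append([y])
            else:
                want = sh_path(pattern, role, roles[i_min])
                per_role.append([x for x in sorted(C[i])
                                 if sh_path(design, x, y) == want])
        # Assumption: a class may not be bound to two roles at once
        reduced += [t for t in product(*per_role) if len(set(t)) == len(t)]
    return reduced


roles = ["Subject", "RealSub", "Proxy"]
pattern = undirected([("RealSub", "Subject"), ("Proxy", "Subject"), ("Proxy", "RealSub")])
design = undirected([("Circle", "Shape"), ("Lazy", "Shape"),
                     ("Lazy", "Circle"), ("Far", "Circle")])
C = [{"Shape"}, {"Circle", "Far"}, {"Lazy"}]
reduced = reduce_candidates(design, pattern, roles, C)
```

`Far` passes the metric test in this toy setup but sits two steps away from `Shape`, while the pattern requires one step, so only `("Shape", "Circle", "Lazy")` survives.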

slide-16
SLIDE 16

Exact Structural Constraints

Exact design pattern relations are verified on the reduced candidate sets. Let $\mathcal{R}_s \subseteq \mathcal{R}$ be the pattern's subset of structural relations:

$$S = \{\, x = \langle x_1, \ldots, x_n \rangle \mid x \in R \wedge \forall r \in \mathcal{R}_s : r(e_p, \ldots, e_t) \Rightarrow r(x_p, \ldots, x_t) \,\}$$
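The exact check can be read as: bind each pattern role to a candidate class, then require every structural relation of the pattern to hold between the bound classes. The sketch below assumes binary relations stored as (kind, from, to) triples; the relation names and the design data are hypothetical.

```python
roles = ("Subject", "RealSub", "Proxy")

# Structural relations of the Proxy pattern, expressed on roles
pattern_relations = [
    ("inherits", "RealSub", "Subject"),
    ("inherits", "Proxy", "Subject"),
    ("assoc", "Proxy", "RealSub"),
]

# Hypothetical relations found in the design
design_relations = {
    ("inherits", "Circle", "Shape"),
    ("inherits", "Square", "Shape"),
    ("inherits", "Lazy", "Shape"),
    ("assoc", "Lazy", "Circle"),
}

# Hypothetical output of the earlier pruning stages
reduced = [("Shape", "Circle", "Lazy"), ("Shape", "Square", "Lazy")]


def exact_matches(reduced, roles, pattern_relations, design_relations):
    """Keep candidates for which every pattern relation holds on the bound classes."""
    matches = []
    for candidate in reduced:
        binding = dict(zip(roles, candidate))
        if all((kind, binding[a], binding[b]) in design_relations
               for kind, a, b in pattern_relations):
            matches.append(candidate)
    return matches


S = exact_matches(reduced, roles, pattern_relations, design_relations)
```

The second candidate is rejected because `Lazy` has no association to `Square`, even though both tuples satisfy the inheritance relations.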

slide-17
SLIDE 17

Delegation Stage

  • A class may implement an operation by calling a method of another class (to which it is associated), thus delegating the responsibility to it;
  • Method calls of candidate classes are analyzed to verify delegation;
  • Often not applicable to design, since method calls are not always documented.

slide-18
SLIDE 18

Limitations of the approach

  • Code-to-AOL translation is affected by inherent ambiguities in the language implementation of associations and aggregations (see C++);
  • Heuristics example:
      associations: object pointer/reference data member or method formal parameter;
      aggregations: object instance or array data member, class template argument;
  • Solution: also search for soft patterns, where aggregations are substituted by associations.

slide-19
SLIDE 19

Experiment: Public-Domain Sw

  • Design pattern recovery from code;
  • 6 C++ applications, ranging from 5 to 127 KLOC, for a total size of 328 KLOC;
  • Not explicitly designed using patterns;
  • Reduction effectiveness of the three stages:
      metrics-based: 3-4 orders of magnitude;
      structural: 1-2 orders of magnitude;
      delegation: 2-3 times.

slide-20
SLIDE 20

Public-Domain SW Results

  • Few pattern instances: Adapter and Proxy most frequent
      25 /100KLOC identified, 9 /100KLOC actual;
  • Recovery retrieval effectiveness: characterized using precision and recall;
  • Recall is always 100% (conservative approach);
  • Precision:
      no delegation: ave = 20.1%, range = 3.4-63.3%;
      delegation: ave = 55.4%, range = 18.2-100%;
  • No soft patterns retrieved were found to be actual instances;
  • Time: no delegation: 1s/KLOC, delegation: 1.16s/KLOC.
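For readers unfamiliar with the two measures, precision is the fraction of retrieved candidates that are actual pattern instances, and recall is the fraction of actual instances that were retrieved. The sketch below uses made-up candidate identifiers, not the experiment's data.

```python
def precision_recall(retrieved, actual):
    """Return (precision, recall) for a set of retrieved candidates."""
    retrieved, actual = set(retrieved), set(actual)
    true_positives = retrieved & actual
    precision = len(true_positives) / len(retrieved) if retrieved else 1.0
    recall = len(true_positives) / len(actual) if actual else 1.0
    return precision, recall


# Hypothetical run: five candidates reported, two of them real instances
retrieved = {"c1", "c2", "c3", "c4", "c5"}
actual = {"c2", "c5"}
p, r = precision_recall(retrieved, actual)
```

A conservative process never discards a real instance, so recall stays at 1.0 while the false positives show up only as lower precision (0.4 in this toy run).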

slide-21
SLIDE 21

Experiment: Industrial Software

  • Assessment of design pattern use;
  • Telecommunication software: 8 complete C++ components with code and OMT design, for a total of about 200 KLOC;
  • Pattern recovery on design:
      delegation constraints not applicable;
      only 4 of 8 systems contained pattern instances;
      two kinds of patterns: 32 Adapter, 6 Bridge.

slide-22
SLIDE 22

Experiment: Industrial Sw: cont.

  • Pattern recovery on code:
      actual pattern instances present in 2 of 8 systems;
      precision: no delegation: 16%, delegation: 80%;
  • Design-code pattern compliance:
      no intersection between code and design pattern instances;
      motivations: no delegation used in design, reused/COTS classes in code not modeled in design, design not maintained.

slide-23
SLIDE 23

Precision/Recall Trade Off

  • On large software a high number of false positives may be generated;
  • Non-admissible approaches may be considered;
  • Metrics such as nesting level, cyclomatic complexity, and coupling may help to reduce the candidate sets;
  • A compromise between precision and recall may be necessary.

slide-24
SLIDE 24

Graph Similarity


slide-25
SLIDE 25

Graph

  • A DP structure can be represented as a graph (or a multi-graph)
  • A class diagram is just a graph
  • Graph isomorphism is actually not the point, as a DP is small and class diagrams are usually large
  • We need something different: define a graph similarity

slide-26
SLIDE 26

Google

  • The Web is a huge graph
  • Certain nodes are more important than others
      a good hub points to many good authorities
        • a good compilation of resources – EPM web
      a good authority is pointed to by many good hubs
        • National Cancer Institute or NSERC or … – mono topic but authoritative
  • Goal: assign a hub score and an authority score
      it is a kind of graph weighting itself
      but how …
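One common answer to "but how" is mutual reinforcement: repeatedly set each authority score to the sum of the hub scores pointing at it, and each hub score to the sum of the authority scores it points to, normalizing after each step. The sketch below is a plain-Python illustration of that idea on an invented three-link graph; it is not taken from the slides.

```python
def hits(edges, iterations=50):
    """Iteratively compute hub and authority scores for a directed graph."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the nodes pointing to it
        auth = {n: sum(hub[s] for s, t in edges if t == n) for n in nodes}
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub: sum of the authority scores of the nodes it points to
        hub = {n: sum(auth[t] for s, t in edges if s == n) for n in nodes}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth


# Hypothetical links: a resource portal and a blog pointing to two institutions
edges = [("portal", "nci"), ("portal", "nserc"), ("blog", "nci")]
hub, auth = hits(edges)
best_hub = max(hub, key=hub.get)
best_authority = max(auth, key=auth.get)
```

In this toy graph the portal ends up as the best hub and `nci`, which both pages point to, as the best authority.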

slide-27
SLIDE 27

Graph to Graphs

Suppose we have two graphs Ga and Gb; each can be described by its adjacency matrix, A and B respectively; then:

$$Z_{k+1} = \frac{B Z_k A^T + B^T Z_k A}{\| B Z_k A^T + B^T Z_k A \|_F}, \qquad k = 0, 1, 2, \ldots$$

The series Zodd and Zeven (the limits of $Z_{2k+1}$ and $Z_{2k}$) have different properties

We are interested in the series $Z_{2k}$, which has norm 1, i.e., its entries are between 0 and 1

slide-28
SLIDE 28

Graph similarity

  • Set $Z_0$ to a matrix of all 1s
  • Compute:

$$Z_{k+1} = \frac{B Z_k A^T + B^T Z_k A}{\| B Z_k A^T + B^T Z_k A \|_F}, \qquad k = 0, 1, 2, \ldots$$

  • After a few iterations $Z_{2k}$ contains the node-by-node similarity of the two graphs
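The iteration can be tried out with plain Python lists. The sketch below follows the update rule with a Frobenius normalization and an even number of steps; the helper names and the toy graphs (a three-node path compared with itself) are assumptions chosen so the result is easy to check by hand.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def similarity(A, B, steps=20):
    """Z after an even number of steps; Z[i][j] scores node i of Gb against node j of Ga."""
    if steps % 2:
        raise ValueError("use an even number of steps")
    At, Bt = transpose(A), transpose(B)
    Z = [[1.0] * len(A) for _ in B]  # Z0: all ones, one row per node of Gb
    for _ in range(steps):
        left = matmul(matmul(B, Z), At)
        right = matmul(matmul(Bt, Z), A)
        Z = [[l + r for l, r in zip(lrow, rrow)] for lrow, rrow in zip(left, right)]
        norm = sum(v * v for row in Z for v in row) ** 0.5  # Frobenius norm
        if norm == 0:
            break  # graphs without edges: nothing to compare
        Z = [[v / norm for v in row] for row in Z]
    return Z


# Toy example: the directed path 1 -> 2 -> 3 compared with itself
path = [[0, 1, 0],
        [0, 0, 1],
        [0, 0, 0]]
Z = similarity(path, path)
```

For this path graph the scores concentrate on the diagonal (each node is most similar to itself, with a value close to 1/sqrt(3)), while off-diagonal entries shrink toward zero as the number of steps grows.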

slide-29
SLIDE 29

DP recognition

  • A DP is just a graph
  • A class diagram is a graph
  • Consider that it is actually a multi-graph
      association, inheritance, aggregation
  • A DP is usually pretty small (a few nodes), thus the computation is really fast

slide-30
SLIDE 30

References

  • Giuliano Antoniol et al. Object-oriented design patterns recovery. Journal of Systems and Software 59(2): 181-196 (2001)
  • Vincent Blondel et al. A Measure of Similarity Between Graph Vertices: Applications to Synonym Extraction and Web Searching
  • Nikolaos Tsantalis et al. "Design Pattern Detection Using Similarity Scoring," IEEE Transactions on Software Engineering, vol. 32, no. 11, pp. 896-909, November 2006
  • Laura A. Zager, George C. Verghese: Graph similarity scoring and matching. Appl. Math. Lett. 21(1): 86-94 (2008)
  • Laura A. Zager: Graph similarity. MIT dissertation, June 2005