SLIDE 1

Combining Clustering with Pattern Matching for Architecture Recovery of OO Systems

Mircea Trifu, Markus Bauer

WSR 2004, Bad Honnef

SLIDE 2

May 4, 2004

Overview

► Context and Problem
► Related Work
► Our Approach
    Gathering Architectural Clues
    Adapting Dependency Measures
    Clustering
► Evaluation
► Future Work
► Summary

SLIDE 3

Context and Problem

► Software systems age over time
    Structures erode, knowledge about the system fades
    Evolution of systems becomes difficult and expensive
► Problem: Recover a system's architecture
    to achieve a better understanding of the system
    to identify spots where the structure needs improvement
► Solution: Develop methods and tools that automate the task of architecture extraction

SLIDE 4

Related Work

► Pattern-based approaches
    identify structures by graph- or pattern-matching techniques
    detect structural problems [Ciupke2001], design patterns [Prechelt1996, Antoniol1998], user-defined architectural structures [Sartipi2001]
    mainly recognize "micro structures" (method or class level)
    do not cover quality properties for the subsystems (coupling, cohesion, …)
► Approaches based on clustering
    group a system's entities based on their syntactic dependencies
    used mainly for reverse engineering systems written in procedural [Mancoridis1999, Koschke2000] and OO languages [Rayside2000], [Trifu, Bauer2001], [eAbreu2000]
    neglect the role the system's entities play in the architecture
    often produce system decompositions that mean little to developers

SLIDE 5

Our Approach

► Combine pattern-based approaches and clustering
    Pattern-based, adaptive clustering
► Pattern matching
    collects hints about the role syntactic elements and their relationships play in the system's architecture
► Cluster analysis
    groups elements into subsystem candidates based on relationships
    makes use of these hints

[Figure: processing pipeline. Source code → Fact Extraction → Source model → Architectural Clue Gathering → Annotated model → Couplings Adaptation → System graph → Compaction → Unclustered graph → Clustering → Partition]

SLIDE 6

Pattern Matching

► Exploit architectural patterns
    Architectures employ patterns
    Detection of architectural patterns is difficult (structures erode!)
    Architectural patterns use fine-grained patterns (fingerprints, clues); those are easier to detect
    Fingerprints have predefined roles
    Roles provide a means to rate dependencies

[Figure: patterns at different levels of granularity. Method Type; Layered Architecture, Repository Architecture, Client-Server Architecture, Microkernel, Broker, MVC, PAC, Framework, Library; Adapter Pattern, Façade Pattern, Proxy Pattern, Composite Pattern, Strategy Pattern, Abstract Factory Pattern, Template Method Pattern]

SLIDE 7

Architectural clues

► Architectural clues can be detected automatically
    Classification of methods
        ► What role does a method have? (delegation, accessor, ...)
        ► What status does it have with respect to inheritance? (new, (re-)implementation, extension, …)
        ► How is it used? (initializer, interface, implementation, ...)
    Detection of library code
        ► Usage count on the interface
    Detection of design patterns (GoF)
        ► Adapter, Facade, Proxy, Composite, Strategy, Abstract Factory, Template Method
► Result: annotated structural model
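
The clue-gathering step can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Method` summary record, the one-statement rules for "delegation" and "accessor", and the 30% usage threshold for library candidates are hypothetical and are not taken from the slides or from the ACT tool.

```python
from dataclasses import dataclass


@dataclass
class Method:
    """Hypothetical, highly simplified method summary from fact extraction."""
    name: str
    statements: int            # number of statements in the method body
    field_accesses: int = 0    # reads/writes of the owning class's fields
    forwarded_calls: int = 0   # calls passed straight on to another object


def classify_role(m: Method) -> str:
    """Assign a coarse role to a method (illustrative rules only)."""
    if m.statements == 1 and m.forwarded_calls == 1:
        return "delegation"
    if m.statements == 1 and m.field_accesses == 1:
        return "accessor"
    return "other"


def is_library_candidate(client_count: int, total_classes: int,
                         threshold: float = 0.3) -> bool:
    """Flag a class as library code when a large share of the system's
    classes use its interface (threshold is an assumed value)."""
    return total_classes > 0 and client_count / total_classes >= threshold
```

A call such as `classify_role(Method("getName", statements=1, field_accesses=1))` yields `"accessor"`. The resulting labels would be attached to the structural model as annotations.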

SLIDE 8

Construction of the System Graph

► Source code model
    Weighted (multi-)graph
    Classes = nodes
    Dependencies = edges
        ► Inheritance
        ► Aggregation
        ► Association
        ► Variable accesses
        ► Method calls
        ► Indirect coupling
► Weights are influenced by the detected clues (according to their standard roles)

[Figure: annotated model fragment (A:Class, B:Class, met:Method, par:Parameter, :MethodType, :LibraryMarker; links: methods, params, dataType, ancestor) and a system graph over nodes A, B, C, D, E]
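
One possible way to represent such a weighted multi-graph is sketched below in Python. The `SystemGraph` class, its method names, and the string labels for the dependency kinds are hypothetical; only the six dependency kinds themselves come from the slide.

```python
from collections import defaultdict

# The six dependency kinds listed on the slide (labels are assumed spellings).
DEPENDENCY_KINDS = (
    "inheritance", "aggregation", "association",
    "variable_access", "method_call", "indirect_coupling",
)


class SystemGraph:
    """Directed, weighted multi-graph: classes are nodes, dependencies are edges."""

    def __init__(self):
        self.nodes = set()
        # (source, target) -> list of (kind, weight); parallel edges are allowed
        self.edges = defaultdict(list)

    def add_dependency(self, src, dst, kind, weight=1.0):
        """Add one dependency edge; the weight may already reflect a detected clue."""
        if kind not in DEPENDENCY_KINDS:
            raise ValueError(f"unknown dependency kind: {kind}")
        self.nodes.update((src, dst))
        self.edges[(src, dst)].append((kind, weight))

    def weights_by_kind(self, src, dst):
        """Sum the weights of all parallel edges from src to dst, per kind."""
        totals = defaultdict(float)
        for kind, weight in self.edges.get((src, dst), []):
            totals[kind] += weight
        return dict(totals)
```

Keeping parallel edges separate until a later step means the clue-dependent weights can be adjusted per dependency before anything is merged.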

SLIDE 9

Examples: Calls, Indirect Coupling

► Calls
    Calls between classes A and B
    Adjust weights for the calls (according to the clues detected)
    Use metrics to aggregate the information about calls between A and B

    Context      Weight
    Library      0.5
    Standard     1
    Composite    5

► Indirect coupling
    Elements that are frequently used together belong together
    IndCoupling(A, B) = Σ over classes C of a term involving noMethods(C) (the exact summand is not recoverable from the slide text)

[Figure: calls between classes A and B; indirect coupling of A and B via classes C1 and C2]
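
The call weighting can be sketched in Python as follows. The three context weights (Library 0.5, Standard 1, Composite 5) are the ones given on the slide; summing the weighted calls is an assumed aggregation, since the slide does not name the metric. The `co_usage` function is a simple Jaccard-style stand-in for the idea that elements used together belong together; it is not the IndCoupling formula from the slide.

```python
# Context weights taken from the slide's Context/Weight table.
CALL_WEIGHTS = {"library": 0.5, "standard": 1.0, "composite": 5.0}


def weighted_calls(call_contexts):
    """Aggregate calls from A to B, given the context label of each call.

    Summation is an assumption; the slide only says metrics are used.
    """
    return sum(CALL_WEIGHTS[context] for context in call_contexts)


def co_usage(clients_of_a, clients_of_b):
    """Stand-in for indirect coupling: share of clients that use both A and B."""
    a, b = set(clients_of_a), set(clients_of_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

With these weights, a single call inside a detected Composite structure counts as much as five ordinary calls, which pulls the participating classes toward the same cluster, while calls into library code count only half.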

SLIDE 10

Compaction and Clustering

► Compaction:
    Transform the multi-graph into a standard graph

$$dsim(A, B) = \sum_{i=1}^{7} w_i \cdot c_i$$

$$sim(A, B) = \max\bigl(dsim(A, B),\ dsim(B, A)\bigr)$$

► Clustering:
    Employ mature standard algorithms
    Goal: Group the nodes of the graph
    Right now: a modified MST algorithm

[Figure: example graphs over nodes A, B, C, D, E]
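
The two steps can be sketched in Python under stated assumptions. The compaction follows the slide's formulas: a directed similarity dsim as a weighted sum of coupling values, and a symmetric sim as the maximum of both directions (a missing direction is treated as 0, which is an assumption). The clustering part is a plain Kruskal-style stand-in that joins nodes along the strongest edges down to a similarity threshold; it is not the authors' modified MST algorithm, and the function names and the threshold parameter are hypothetical.

```python
def dsim(couplings, weights):
    """Directed similarity: weighted sum of the coupling values c_i."""
    return sum(w * c for w, c in zip(weights, couplings))


def compact(directed, weights):
    """Collapse directed coupling vectors {(a, b): [c_1, ...]} into an
    undirected graph {frozenset({a, b}): sim} using the max of both directions."""
    sim = {}
    for (a, b), couplings in directed.items():
        if a == b:
            continue  # ignore self-dependencies
        key = frozenset((a, b))
        sim[key] = max(sim.get(key, 0.0), dsim(couplings, weights))
    return sim


def mst_clusters(nodes, sim, threshold):
    """Join nodes along edges in descending similarity (as in Kruskal's
    maximum spanning tree construction), stopping below the threshold."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for pair, s in sorted(sim.items(), key=lambda kv: kv[1], reverse=True):
        if s < threshold:
            break
        a, b = tuple(pair)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())
```

Taking the maximum of the two directions means that a strong one-way dependency is enough to keep two classes close, even if nothing points back.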

SLIDE 11

Evaluation

► ACT: tool prototype in Java
► Comparing traditional vs. adaptive clustering using Java AWT as case study
    Package structure and CRP structure vs. clustering
    Cohesion and coupling properties

MoJo value                 MoJo(Pack,-)   MoJo(CRP,-)
Non-adaptive clustering    196            224
Adaptive clustering        164            84

Metric value               AvgCohesion    AvgCoupling
Non-adaptive clustering    0.688          0.006
Adaptive clustering        0.804          0.007
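
To put the MoJo figures in proportion, the short Python calculation below computes the relative reduction from non-adaptive to adaptive clustering. It uses only the four MoJo values from the slide and assumes the column order MoJo(Pack,-), MoJo(CRP,-); MoJo is a distance, so lower values mean the clustering is closer to the reference decomposition. The helper name `relative_reduction` is made up for this example.

```python
def relative_reduction(before, after):
    """Fraction by which a distance value dropped."""
    return (before - after) / before


# (non-adaptive, adaptive) MoJo values as read from the slide
mojo = {"MoJo(Pack,-)": (196, 164), "MoJo(CRP,-)": (224, 84)}

for name, (non_adaptive, adaptive) in mojo.items():
    print(f"{name}: {relative_reduction(non_adaptive, adaptive):.1%}")
# prints:
# MoJo(Pack,-): 16.3%
# MoJo(CRP,-): 62.5%
```

On these numbers the adaptive variant moves much closer to the CRP structure than to the package structure.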

SLIDE 12

Some Details...

► Semantically related entities have been grouped together:
    Menu, MenuItem, MenuContainer, MenuShortcut
    TextComponent, TextArea, TextField
► Successful separation of classes from different abstraction levels and with different roles

Percentage of classes      Events   Peers
Non-adaptive clustering    4.54     59.25
Adaptive clustering        88.63    81.48

► Comparable results for a second case study: SSHTools

SLIDE 13

Future Work

► Consider other types of syntactic interactions
    Cast expressions
► Identify additional clues
    Observer pattern; CORBA, COM calls, ...
► Experiment with different clustering algorithms
► Experiment with more case studies
    Perform a more detailed comparison with other approaches
    Collect more evidence about clue usage
    Tune the thresholds and weight values
► Integrate the technique in our software assessment tool suite

SLIDE 14

Summary

Our work contributes:

► A new approach for architecture extraction
    combining the strengths of pattern-based and clustering approaches
    evaluating fingerprints of architecture information
► Useful metrics to express dependencies
    Call metrics, indirect coupling
► A powerful way to "correctly" cluster:
    framework-application settings
    layered architectures
    library code

SLIDE 15

Questions and Comments