 
Combining Clustering with Pattern Matching for Architecture Recovery of OO Systems
Mircea Trifu, Markus Bauer
WSR 2004, Bad Honnef

Overview
► Context and Problem
► Related Work
► Our Approach
  • Gathering Architectural Clues
  • Adapting Dependency Measures
  • Clustering
► Evaluation
► Future Work
► Summary
May 4, 2004

Context and Problem
► Software systems age over time
  • Structures erode, knowledge about the system fades
  • Evolution of systems becomes difficult and expensive
► Problem: Recover a system's architecture
  • to achieve a better understanding of the system
  • to identify spots where the structure needs improvement
► Solution: Develop methods and tools that automate the task of architecture extraction

Related Work
► Pattern based approaches
  • identify structures by graph or pattern matching techniques
  • detect structural problems [Ciupke2001], design patterns [Prechelt1996, Antoniol1998], user defined architectural structures [Sartipi2001]
  • mainly recognize "micro structures" (method or class level)
  • do not cover quality properties of the subsystems (coupling, cohesion, …)
► Approaches based on clustering
  • group a system's entities based on their syntactic dependencies
  • used mainly for reverse engineering systems written in procedural [Mancoridis1999, Koschke2000] and OO languages [Rayside2000], [Trifu, Bauer2001], [eAbreu2000]
  • neglect the role the system's entities play in the architecture
  • often produce system decompositions that mean little to developers

Our Approach
► Combine pattern based approaches and clustering
  • Pattern based, adaptive clustering
► Pattern matching
  • collects hints about the role syntactic elements and their relationships play in the system's architecture
► Cluster analysis
  • groups elements into subsystem candidates based on relationships
  • makes use of these hints
[Figure: processing pipeline. Source code → Fact Extraction → Source model → Architectural Clue Gathering → Annotated model → Couplings Adaptation → System graph → Compaction → Unclustered graph → Clustering → Partition]

Pattern Matching
► Exploit architectural patterns
  • Architectures employ patterns
  • Detection of architectural patterns is difficult (structures erode!)
  • Architectural patterns use fine grained patterns (fingerprints, clues), which are easier to detect
  • Fingerprints have predefined roles
  • Roles provide a means to rate dependencies
[Figure: fine grained clues (Library, Method Type, Adapter Pattern, Façade Pattern, Proxy Pattern, Composite Pattern, Strategy Pattern, Abstract Factory Pattern, Template Method Pattern) linked to architectural patterns (Layered Architecture, Repository Architecture, Client-Server Architecture, Microkernel, Broker, MVC, PAC, Framework)]

Architectural clues
► Architectural clues can be detected automatically
  • Classification of methods
    ► What role does a method have? (delegation, accessor, ...)
    ► What status does it have with respect to inheritance? (new, (re-)implementation, extension, …)
    ► How is it used? (initializer, interface, implementation, ...)
  • Detection of library code
    ► Usage count on the interface
  • Detection of design patterns (GoF)
    ► Adapter, Facade, Proxy, Composite, Strategy, Abstract Factory, Template Method
► Result: annotated structural model

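The slide names only "usage count on the interface" as the library clue. The sketch below is one possible reading, not the authors' implementation: a class is marked as library code when at least `threshold` distinct other classes call it. The function name, the input format, and the threshold are assumptions made for illustration.

```python
from collections import defaultdict


def detect_library_classes(calls, threshold):
    """Return the classes used by at least `threshold` distinct client classes.

    calls: iterable of (caller_class, callee_class) pairs (hypothetical format).
    threshold: assumed cut-off; the slides do not give a value.
    """
    clients = defaultdict(set)
    for caller, callee in calls:
        if caller != callee:  # self-calls say nothing about external usage
            clients[callee].add(caller)
    return {cls for cls, users in clients.items() if len(users) >= threshold}


calls = [("A", "Util"), ("B", "Util"), ("C", "Util"), ("A", "B")]
library = detect_library_classes(calls, threshold=3)
```

Classes detected this way would receive a library marker in the annotated model, so that later steps can rate dependencies on them lower.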
Construction of the System Graph
► Source code model
  • Weighted (multi-)graph
  • Classes = nodes
  • Dependencies = edges
    ► Inheritance
    ► Aggregation
    ► Association
    ► Variable accesses
    ► Method calls
    ► Indirect coupling
► Weights are influenced by the detected clues (according to their standard roles)
[Figure: source code model instance with objects A:Class, B:Class, met:Method, par:Parameter, :MethodType and :LibraryMarker, linked by ancestor, methods, params and dataType; next to it an example system graph with nodes A to E]

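A minimal sketch of how such a weighted multigraph could be represented, assuming classes are identified by name. The class names `Edge` and `SystemGraph` and the edge-kind strings are hypothetical; only the list of dependency kinds comes from the slide.

```python
from dataclasses import dataclass

# Dependency kinds listed on the slide (string names are assumptions).
EDGE_KINDS = (
    "inheritance", "aggregation", "association",
    "variable_access", "method_call", "indirect_coupling",
)


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str
    weight: float = 1.0


class SystemGraph:
    """Directed multigraph: several edges may connect the same pair of classes."""

    def __init__(self):
        self.nodes = set()
        self.edges = []

    def add_dependency(self, source, target, kind, weight=1.0):
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown dependency kind: {kind}")
        self.nodes.update((source, target))
        self.edges.append(Edge(source, target, kind, weight))

    def between(self, source, target):
        """All parallel edges from source to target."""
        return [e for e in self.edges if e.source == source and e.target == target]


graph = SystemGraph()
graph.add_dependency("A", "B", "inheritance")
graph.add_dependency("A", "B", "method_call", weight=5.0)
```

Keeping parallel edges separate until compaction lets each dependency kind carry its own clue-adjusted weight.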
Examples: Calls, Indirect Coupling
► Calls
  • Calls between classes A and B
  • Adjust weights for the calls (according to the clues detected)

      Context     Weight
      Library     0.5
      Standard    1
      Composite   5

  • Use metrics to aggregate the information about calls between A and B
► Indirect coupling
  • Elements that are frequently used together belong together
  • [Formula IndCoupling(A, B): a sum over classes C involving noMethods and methods(C), with a factor 1/2; the exact form is not recoverable from the extracted slide]

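The context weights above can be applied as in the following sketch. The weights (0.5, 1, 5) are taken from the slide; aggregating by a plain sum is an assumption, since the slide does not say which metric combines the calls between A and B. The function name and input format are hypothetical.

```python
# Context weights from the slide.
CONTEXT_WEIGHTS = {"library": 0.5, "standard": 1.0, "composite": 5.0}


def call_coupling(calls, source, target):
    """Sum the clue-adjusted weights of all calls from source to target.

    calls: iterable of (caller_class, callee_class, context) triples,
    where context is a key of CONTEXT_WEIGHTS (assumed input format).
    """
    return sum(
        CONTEXT_WEIGHTS[context]
        for caller, callee, context in calls
        if caller == source and callee == target
    )


calls = [
    ("A", "B", "standard"),
    ("A", "B", "composite"),
    ("A", "B", "library"),
    ("B", "A", "standard"),
]
a_to_b = call_coupling(calls, "A", "B")  # 1.0 + 5.0 + 0.5 = 6.5
```

A call made in a Composite context thus counts ten times as much as a call into library code, which pulls pattern participants together and keeps library classes from attracting their clients.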
Compaction and Clustering
► Compaction:
  • Transform the multi-graph into a standard graph
      $dsim(A, B) = \sum_{i=1}^{7} w_i \cdot c_i$
      $sim(A, B) = \max(dsim(A, B), dsim(B, A))$
► Clustering:
  • Employ mature standard algorithms
  • Goal: Group the nodes of the graph
  • Right now: a modified MST algorithm
[Figure: example graph with nodes A to E as multigraph, after compaction, and after clustering]

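The two steps can be sketched as follows. Compaction follows the slide's formulas: a weighted sum of coupling values per direction, then the maximum over both directions. For clustering, the slides only say "a modified MST algorithm" without giving the modification, so the sketch substitutes a plain textbook variant: build a maximum spanning tree with Kruskal's algorithm and cut tree edges whose similarity falls below a threshold. All function names, the input format, and the threshold are assumptions.

```python
def dsim(couplings, weights):
    """Directed similarity: weighted sum of the coupling values c_i."""
    return sum(w * c for w, c in zip(weights, couplings))


def compact(directed, weights):
    """Collapse directed coupling vectors into one undirected similarity per pair.

    directed: {(a, b): [c_1, ..., c_n]} (assumed input format).
    Returns {frozenset({a, b}): sim(a, b)} with sim = max of both directions.
    """
    sim = {}
    for (a, b), couplings in directed.items():
        if a == b:
            continue  # ignore self-dependencies
        key = frozenset((a, b))
        sim[key] = max(sim.get(key, 0.0), dsim(couplings, weights))
    return sim


def mst_clusters(nodes, sim, threshold):
    """Maximum spanning tree (Kruskal), then cut edges weaker than threshold."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    tree = []
    for pair, s in sorted(sim.items(), key=lambda kv: -kv[1]):
        a, b = tuple(pair)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((a, b, s))

    # Rebuild components using only the sufficiently strong tree edges.
    parent = {n: n for n in nodes}
    for a, b, s in tree:
        if s >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())


weights = [1.0, 1.0]  # two coupling kinds for brevity; the slide sums over seven
directed = {
    ("A", "B"): [3.0, 0.0],
    ("B", "A"): [1.0, 1.0],
    ("B", "C"): [0.5, 0.0],
    ("C", "D"): [2.0, 2.0],
}
sim = compact(directed, weights)
clusters = mst_clusters({"A", "B", "C", "D"}, sim, threshold=1.0)
```

With these sample values the weak B–C link is cut, leaving the subsystem candidates {A, B} and {C, D}.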
Evaluation
► ACT: Tool-Prototype in Java
► Comparing traditional vs. adaptive clustering using Java AWT as case study
  • Package structure and CRP structure vs. clustering
  • Cohesion and coupling properties

                            MoJo(Pack,-)  MoJo(CRP,-)  AvgCohesion  AvgCoupling
  Non-adaptive clustering   196           224          0.688        0.006
  Adaptive clustering       164           84           0.804        0.007

[Chart: MoJo values (left axis) and metric values (right axis) for non-adaptive and adaptive clustering]

Some Details...
► Semantically related entities have been grouped together:
  • Menu, MenuItem, MenuContainer, MenuShortcut
  • TextComponent, TextArea, TextField
► Successful separation of classes from different abstraction levels and with different roles
[Chart: percentage of classes for the categories Events and Peers, non-adaptive vs. adaptive clustering; values as extracted: 4.54, 59.25, 88.63, 81.48]
► Comparable results for second case study: SSHTools

Future Work
► Consider other types of syntactic interactions
  • Cast expressions
► Identify additional clues
  • Observer pattern; CORBA, COM calls, ...
► Experiment with different clustering algorithms
► Experiment with more case studies
  • Perform a more detailed comparison with other approaches
  • Collect more evidence about clue usage
  • Tune the thresholds and weight values
► Integrate the technique in our software assessment tool suite

Summary
Our work contributes:
► A new approach for architecture extraction
  • combining the strengths of pattern based and clustering approaches
  • evaluating fingerprints of architecture information
► Useful metrics to express dependencies
  • Call metrics, indirect coupling
► A powerful way to "correctly" cluster:
  • framework-application settings
  • layered architectures
  • library code

Questions and Comments