slide-1
SLIDE 1

What architects should know about reverse engineering and reengineering

Rainer Koschke University of Bremen, Germany

Arbeitsgruppe Softwaretechnik, Fachbereich Mathematik und Informatik, Universität Bremen

8th of November 2005


2005-11-10

Architecture Reconstruction

I am a professor of software engineering at the University of Bremen in Germany (http://www.informatik.uni-bremen.de/~koschke/). My research interests are primarily in the fields of software engineering and program analysis. My current research includes architecture recovery, feature location, program analysis, clone detection, and reverse engineering. I am one of the founders of the Bauhaus research project (http://www.bauhaus-stuttgart.de), founded in 1997 to develop methods and tools that support software maintainers in their daily work through reconstructed architectural and source code views. I teach reengineering and software engineering. I hold a doctoral degree in computer science from the University of Stuttgart, Germany. I am the current Chair of the IEEE TCSE committee on reverse engineering (http://www.tcse.org/revengr/) and the initiator and maintainer of the IEEE TCSE online bibliography on reengineering (http://www.iste.uni-stuttgart.de/ps/reengineering/index.html). I should also note that, beyond teaching software engineering and reengineering at the University of Bremen, I am about to create a spin-off with two of my former colleagues at Stuttgart that specializes in tools and services for architecture reconstruction. The company, Bauhaus Software Technologies (http://www.bauhaus-tec.com), builds on about 8 years of research in the Bauhaus project.

slide-2
SLIDE 2

Computer Science Building in Stuttgart

The building at the time of delivery

Rainer Koschke What architects should know about reverse engineering and reengineering 8 Nov. 2005 2


Before I get to the topic, let me give you some background on where I come from. About one year ago, I was a postdoc at the University of Stuttgart. Only one year before I left Stuttgart to become a professor at the University of Bremen, we moved into a new computer science building. Here, you see a photograph of it. Doesn’t it have a beautiful architecture? It is a really hot building: I remember the 40 degrees Celsius (104 degrees Fahrenheit) we had during summertime. We have these very large windows to let all the light in. OK, the parts you can actually open look like a mixture of a safe and a loophole. Needless to say, we needed to cut the budget and had to cancel the air conditioning. Well, you cannot have everything. OK, the building is not necessarily functional, but at least it is beautiful. That’s what the architect says. But anyway, it is a really beautiful building. The architect didn’t allow us to build a shield in front of the building because it would compromise the beauty of this building. He does not have an office in this building.

slide-3
SLIDE 3

Design models

The envisioned building


Like every other complex engineering product, it underwent a thorough design process. Several models have been created. Here, you see just two of them. They really helped us to imagine how life would be in this building. OK, I admit, we didn’t foresee the problem with this summer heat. But anyway, it is a really beautiful building.

slide-4
SLIDE 4

Different architectural views

Plans for the implementation


After more coarse-grained models, several other architectural views were created. Here, you see the mapping of functions onto levels and rooms. To your right, you see the network wires for the rooms and the power outlets. It is a negligible detail of the implementation that we lost our detailed plans about the power system. Such is life. But anyway, it is a really beautiful building.

slide-5
SLIDE 5

Computer Science Building in Stuttgart

After delivery: Use and Adjustments


When we finally moved into the building, we soon started to adapt it to our needs. I personally rode my bicycle to the university, but my room was on the first floor. So, I built a ramp to my office. My colleagues preferred a balcony. One department demolished the walls between their offices to make way for an open-plan office.

Only recently, now in Bremen, I checked the web cam on the university campus again and found this: ...

slide-6
SLIDE 6

Views in software maintenance

The building after several years of “maintenance”. . .


After a pretty short time the building has undergone a series of changes. It used to be a really beautiful building. I heard rumors that the architects are now planning to build a third floor on top of the existing building to increase the capacity. Guess what they will do. They will get the original plans from the drawer and start planning based on that information. But unlike in software, the construction workers will immediately refuse to enter this building. Sometimes, I envy people in the construction business. In software, it is much more difficult to assess the actual state of the architecture.

slide-7
SLIDE 7

Software Architecture in Textbooks

Modern canonical compiler according to Shaw and Garlan (1993)

[Figure: compiler phases Lexical Analysis, Syntactic Analysis, Semantic Analysis, Optimization, and Code Generation, leading from Text to Code; shared data structures: AST and Symbol Table]


Alright. I should talk about software. But haven’t I?

In software, one of our most powerful weapons is abstraction. Software architecture is an abstraction. Here is one example presented by Mary Shaw in her book on software architecture. It is the reference architecture of modern compilers. It shows the individual steps in the transition from source text to object code. These steps are connected through two data structures: the abstract syntax tree and the associated symbol table. It is interesting to note that the picture does not show a connection between the symbol table and the abstract syntax tree, although there should be one. Moreover, in really modern compilers, you would have an additional intermediate representation that abstracts from the source language; the back end and optimization would then not access the symbol table and abstract syntax tree. But anyway, isn’t that a beautiful architecture?

slide-8
SLIDE 8

Software Architecture in Practice

Modern canonical compiler according to real life


Now if you look at the implementation of modern compilers, you find a somewhat different picture at first sight. Here, you see the files of a relatively new C compiler for microprocessors. The edges represent various kinds of dependencies among these files. I turned them pale to reduce the visual clutter. The interesting question now is “how does this picture relate to the architecture I just showed?” If you do not have an immediate answer to this question, you must reconstruct this connection.

slide-9
SLIDE 9

Reverse Engineering

Definition

Reverse engineering is the process of analyzing a subject system to identify the system’s components and their interrelationships and create representations of the system in another form or a higher level of abstraction.

– Chikofsky and Cross II. (1990)


So, what is architecture reconstruction after all? Architecture reconstruction is some form of reverse engineering that reconstructs architectural views from an existing system. What do I mean by “reverse engineering”? Most of you know the definition by Elliot Chikofsky and James Cross. I don’t want to bore you with definitions. I just want to point out the common elements in the definition of reverse engineering and software architecture. We have components and interrelationships. And we have abstraction. Now that we know what architecture reconstruction means, we are interested in how it is perceived. To this end, I went to the WICSA Software Architecture Body of Knowledge. There is a WICSA Wiki entry that notes the following about existing tools and techniques for architecture reconstruction.

slide-10
SLIDE 10

WICSA Software Architecture Body of Knowledge

WICSA Wiki: Tools and/or techniques to recover architecture

In the past, most tools and techniques focused on recovering architectural information that can be extracted by statically analyzing the source code. Typically they extract module dependencies (abstracting from dependencies such as method invocation/procedure call) and relationships such as inheritance. – http://wwwp.dnsalias.org/wiki/RecoveryToolsTechniques


According to this body of knowledge, architecture reconstruction tools focus on extracting static information. They typically extract module dependencies. While extraction is certainly a part of reverse engineering, reverse engineering is not just extraction. As I said, reverse engineering is about creating abstraction. Yet, even the extraction is more difficult than some people tend to think. Apparently, these people are somewhat misled by the simpler commercial tools available, which in fact often do little more than semantic analysis (I am using this term in the sense of compiler construction).

slide-11
SLIDE 11

Interrelationships

Graph of Direct Calls


Here, for instance, you see direct calls between modules. Extracting such a call graph is relatively easy. And, hence, most tools give you this kind of information. Yet, such call graphs are typically incomplete because the tools do not handle such nasty things as indirect calls through function pointers or dispatching calls in object-oriented programs.
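To make the gap concrete, here is a minimal sketch of a purely syntactic call extractor. It is an illustration only, not the tool used for the figures: it uses Python's standard `ast` module on a made-up snippet (`helper`, `other` and `main` are hypothetical names), and a function value stored in a variable stands in for a C function pointer.

```python
import ast

# Hypothetical program under analysis.
SOURCE = '''
def helper():
    return 1

def other():
    return 2

def main():
    helper()            # direct call: visible syntactically
    f = other           # function value stored in a variable
    f()                 # indirect call: the syntax only shows the name "f"
'''

def direct_calls(source):
    """Map each top-level function to the known functions it calls by name."""
    tree = ast.parse(source)
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    graph = {name: set() for name in defined}
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            # Only calls whose target is the plain name of a known function count.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                if node.func.id in defined:
                    graph[func.name].add(node.func.id)
    return graph

call_graph = direct_calls(SOURCE)
```

The resulting graph contains the edge from `main` to `helper` but not the one from `main` to `other`; recovering the latter requires knowing what `f` may refer to, which is the job of pointer analysis.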

slide-12
SLIDE 12

Interrelationships

Graph of Direct and Indirect Calls


If you want to get the complete picture, you need pointer analysis. This figure now has this additional information. As you can see, pointer analysis does matter. Unfortunately, pointer analysis is a complex and costly analysis and, hence, many tools out there ignore such calls. Furthermore, as we heard yesterday during the WCRE workshop on software security, there are many tricky ways to manipulate control flow through malicious attacks. These things are extremely difficult to detect. You may argue that these kinds of problems occur only in malicious contexts. Yet, stupidity is able to simulate malice. But even in a friendly environment and with advanced pointer analysis, you will still miss dependencies. Code can be used by calling it, or code can be used by simply copying the source text. Such copying creates code clones. Code cloning is not covered by call graphs; yet, clones do create subtle dependencies.

slide-13
SLIDE 13

Interrelationships

Graph of Direct and Indirect Calls and Cloning


Here, this figure now shows the additional dependencies via duplicated code. Let me hide the other dependencies so that you can apprehend the frequency of cloning.

slide-14
SLIDE 14

Interrelationships

Clone Relation


Note that two fragments are considered clones here if they are at least 100 tokens long. So, the copying is quite substantial.
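As a rough illustration of what a minimum clone length in tokens means, the following sketch compares two fragments window by window. It is a deliberately naive stand-in, not the clone detector behind the figure: the tokenizer is a simple regular expression, the function names `tokens` and `clone_pairs` are made up, and the example lowers the threshold to 10 tokens so that a one-line function suffices.

```python
import re

def tokens(text):
    """Split source text into identifiers, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", text)

def clone_pairs(a, b, min_tokens=100):
    """Return (i, j) token offsets where a and b share min_tokens equal tokens."""
    ta, tb = tokens(a), tokens(b)
    windows = {}
    # Index every window of min_tokens tokens in the first fragment.
    for i in range(len(ta) - min_tokens + 1):
        windows.setdefault(tuple(ta[i:i + min_tokens]), []).append(i)
    pairs = []
    # Look up every window of the second fragment in that index.
    for j in range(len(tb) - min_tokens + 1):
        for i in windows.get(tuple(tb[j:j + min_tokens]), []):
            pairs.append((i, j))
    return pairs

original = "int bar(int x) { int y = x * 2; return y + 1; }"
copied = "int mybar(int x) { int y = x * 2; return y + 1; }"
found = clone_pairs(original, copied, min_tokens=10)
```

With the threshold of 100 tokens used in the talk, these two short fragments would not be reported at all, which is why the clones shown in the figure represent substantial copying.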

slide-15
SLIDE 15

Architecture Anecdote

[Figure: layered architecture with layers n, n-1, and n-2; labels: foo, bar, mybar, copy]


We once analyzed the code of a company to demonstrate the capabilities of our tools. We found a complete function copy: a literal copy, no change at all except for the function’s name. We presented that to the development group and asked what that was supposed to mean. The group leader himself finally admitted that he had copied this function. He explained that he had a layered architecture and the function was in a layer that should not be accessed. For this reason, he simply copied the function. Oh, well.

slide-16
SLIDE 16

Statement

We go beyond obvious interrelationships. Reverse engineering is all about . . . making the invisible visible and raising the level of abstraction.


To sum it up, yes, reverse engineering attempts to find dependencies. But we strive for the non-obvious interrelationships. Our task is to uncover the invisible and to raise the level of abstraction.

slide-17
SLIDE 17

WICSA Software Architecture Body of Knowledge

WICSA Wiki: Tools and/or techniques to recover architecture

In the past, most tools and techniques focused on recovering architectural information that can be extracted by statically analyzing the source code. Typically they extract module dependencies (abstracting from dependencies such as method invocation/procedure call) and relationships such as inheritance. – http://wwwp.dnsalias.org/wiki/RecoveryToolsTechniques


Let me give you another example to make my point clear. The Wiki entry states that we extract inheritance relations. “Extracting the inheritance relationship is easy”, many people believe. Every UML tool solves that problem. But is it really easy? Syntactically it is, semantically it is not. What these UML tools give you is just syntactic inheritance. Imagine someone collapsed all your classes into one. Syntactically, you have just one class, where in fact you have many logical classes.

slide-18
SLIDE 18

Reverse Engineering Inheritance Hierarchies

Technique by Snelting and Tip (1998) reconstructs a precise inheritance hierarchy from the code irrespective of the syntactic inheritance:

1. for every variable (local, global, heap, pointer) determine which class attributes (data and function members) it actually uses
   → yields binary relation
2. apply formal concept analysis to this binary relation
   → resulting concept lattice describes the actual inheritance hierarchy


There is a technique by Snelting and Tip (1998) that reconstructs a precise inheritance hierarchy from the code irrespective of the syntactic inheritance. The technique determines the required class attributes (i.e., data and function members) for every local, global, or heap variable that has a class type and every pointer that points to a class type. So, on the one hand, you have all variables and, on the other hand, the class attributes they require. That obviously constitutes a binary relation. This binary relation is analyzed by a precise mathematical technique called formal concept analysis. The result is a lattice that describes the actual inheritance relation. The technique combines pointer analysis and formal concept analysis, two relatively complex techniques. That is what I call reverse engineering.
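The formal concept analysis step can be sketched on a toy relation. The sketch below is not the implementation by Snelting and Tip: it skips the pointer analysis entirely, the usage relation `USES` (variables `a` to `d`, members `name`, `salary`, `grade`) is invented for illustration, and the concepts are computed by a naive closure that only suits tiny inputs.

```python
from itertools import combinations

# Hypothetical usage relation: variable -> class members it actually uses.
USES = {
    "a": {"name"},
    "b": {"name", "salary"},
    "c": {"name", "salary"},
    "d": {"name", "grade"},
}

def concepts(relation):
    """Return all formal concepts (extent, intent) of a small binary relation."""
    all_attrs = frozenset().union(*relation.values())
    intents = {all_attrs} | {frozenset(v) for v in relation.values()}
    # Close the set of intents under intersection.
    changed = True
    while changed:
        changed = False
        for x, y in combinations(list(intents), 2):
            z = x & y
            if z not in intents:
                intents.add(z)
                changed = True
    result = []
    for intent in intents:
        # The extent is every variable that uses at least the members in intent.
        extent = frozenset(o for o, attrs in relation.items() if intent <= attrs)
        result.append((extent, intent))
    # Largest extents first, i.e. from the top of the lattice downwards.
    return sorted(result, key=lambda c: -len(c[0]))

lattice = concepts(USES)
```

Read as a class hierarchy, the concept with intent `{name}` covers all four variables and plays the role of a base class, while the concepts with intents `{name, salary}` and `{name, grade}` play the role of two subclasses, regardless of how the classes were declared syntactically.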

slide-19
SLIDE 19

WICSA Software Architecture Body of Knowledge

WICSA Wiki: Tools and/or techniques to recover architecture

In the past, most tools and techniques focused on recovering architectural information that can be extracted by statically analyzing the source code. Typically they extract module dependencies (abstracting from dependencies such as method invocation/procedure call) and relationships such as inheritance. – http://wwwp.dnsalias.org/wiki/RecoveryToolsTechniques


Back again to the quote from the WICSA Wiki. Somewhat between the lines, I read some disregard for static analysis. I am not sure whether that is really meant, but at least I know that some people in my own community believe that. As a matter of fact, there are many dynamic techniques used in reverse engineering. Yet, whether static or dynamic is not really the question. What the WICSA Wiki entry probably wants to express is that there is more to architecture than just static structure. We also need behavior. For static structure, static analysis is appropriate and sufficient. Many people believe dynamic analysis is the only way to get at behavior, whereas in fact dynamic analysis is just one way to get to behavioral information. Optimizing compilers know quite a lot about the behavior of a program. And they do perform static analysis.

slide-20
SLIDE 20

Dynamic analysis

Dynamic analysis is easy only at first sight; has disadvantages:

  • conclusions only valid w.r.t. input → at the level of architecture, we are generally interested in every possible behavior
  • huge amount of data to be analyzed

more difficult than testing:

  • white-box testing may assume knowledge on system
  • black-box testing has a specification
  • in reverse engineering, we have neither specification nor prior knowledge


For many people, dynamic analysis appears as a quick win. But dynamic analysis is easy only at first sight. Dynamic analysis is related to testing and shares the same disadvantages. All the conclusions you draw are valid only with respect to the given input. When it comes to architecture, however, we are generally interested in all possible behavior. And dynamic analysis can create a huge amount of data if you instrument a program blindly. In fact, in the reverse engineering context, gathering dynamic information is even more difficult than in testing. In white-box testing, you may assume prior knowledge on the system. In black-box testing, you can start from a specification. In reverse engineering, we have neither specification nor prior knowledge.

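The dependence on input is easy to demonstrate. The following sketch records caller/callee pairs at run time with Python's standard `sys.settrace` hook; the traced functions (`parse`, `report_error`, `process`) and the helper `observed_calls` are made up for illustration and are not taken from any tool mentioned in the talk.

```python
import sys

def parse(text):
    return text.split()

def report_error(text):
    return "empty input"

def process(text):
    # Which helper runs depends on the input.
    return parse(text) if text else report_error(text)

def observed_calls(func, *args):
    """Run func(*args) and record the (caller, callee) pairs seen at run time."""
    edges = set()

    def tracer(frame, event, arg):
        if event == "call" and frame.f_back is not None:
            edges.add((frame.f_back.f_code.co_name, frame.f_code.co_name))
        return None  # no per-line tracing needed

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return edges

run_a = observed_calls(process, "a b c")
run_b = observed_calls(process, "")
```

The first run observes a call from `process` to `parse` but never sees `report_error`; the second run observes the opposite. Neither trace alone describes every possible behavior of `process`.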
slide-21
SLIDE 21

Static Analysis

                      compiler domain     reverse engineering
scope                 compilation unit    whole program; system of programs
interaction           batch               interactive
handling uncertainty  pessimistic         optimistic/speculative??

additional problems:

  • preprocessor directives
  • multi-language systems
  • incomplete code


Yet, static analysis is not simple either. Fortunately, we can leverage many advanced techniques from the compiler community. But unlike compilers, we generally need whole-program analyses. Such analyses are pretty expensive, and our programs tend to be huge, often written in multiple and awkward languages. In reverse engineering, the user is usually in the loop. So, our analyses are typically interactive. Another difference from compilers is the question of how conservative we must be in reverse engineering. We all agree that a compiler should make conservative assumptions in case of uncertainty. Because of exaggerated pessimism, a compiler may refuse to make a certain optimization. The cost, then, is a somewhat slower program. In our context, the costs are much higher. Pessimism may result in so many false positives that the results become useless to an analyst. For me, it is still an open question how much optimism we can and should afford.

Then, there are additional problems a compiler does not care about. Consider, for instance, Ira Baxter’s (Semantic Designs) ambition to analyze the non-preprocessed code. The syntax tree may be completely different depending on how certain macros are expanded. Yet, if you do code transformation in the reengineering context, you must analyze all possible configurations of the source text. A compiler doesn’t care about such nasty things.
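To give a feeling for why configurations are a problem, the sketch below enumerates every combination of macros for a tiny C-like fragment and shows the text a compiler would see in each case. This brute-force enumeration is an assumption made for illustration; it is not how Semantic Designs analyzes non-preprocessed code, it handles only `#ifdef`, `#else` and `#endif`, and the macros `USE_LONG` and `DEBUG` are invented.

```python
from itertools import product
import re

# Hypothetical C-like source with two configuration macros.
SOURCE = """\
#ifdef USE_LONG
long
#else
int
#endif
counter;
#ifdef DEBUG
log(counter);
#endif
"""

def preprocess(source, defined):
    """Resolve #ifdef/#else/#endif for one set of defined macros."""
    out, active = [], [True]
    for line in source.splitlines():
        m = re.match(r"#ifdef\s+(\w+)", line)
        if m:
            active.append(active[-1] and m.group(1) in defined)
        elif line.startswith("#else"):
            taken = active.pop()
            active.append(active[-1] and not taken)
        elif line.startswith("#endif"):
            active.pop()
        elif active[-1]:
            out.append(line)
    return " ".join(out)

def all_variants(source):
    """Map every macro configuration to the code a compiler would see."""
    macros = sorted(set(re.findall(r"#ifdef\s+(\w+)", source)))
    variants = {}
    for flags in product([False, True], repeat=len(macros)):
        defined = {m for m, on in zip(macros, flags) if on}
        variants[frozenset(defined)] = preprocess(source, defined)
    return variants

variants = all_variants(SOURCE)
```

Two independent macros already yield four different programs, and the declared type of `counter` differs between them. With n independent macros the number of variants grows as 2^n, so enumerating them does not scale to real systems.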

slide-22
SLIDE 22

Analyzability

Statement

Analyzability is a key quality of software systems. Analyzability falls victim to flexibility these days.


This dilemma leads us to the question of software quality. There are many quality factors for an architecture. One of them is analyzability. Current movements in Java, for instance, such as loading classes at runtime or modifications and control flow through reflection, remind me of the days of self-modifying code. They make me nervous. How will we maintain these dynamic systems?

slide-23
SLIDE 23

WICSA Software Architecture Body of Knowledge

WICSA Wiki: Tools and/or techniques to recover architecture

In the past, most tools and techniques focused on recovering architectural information that can be extracted by statically analyzing the source code. Typically they extract module dependencies (abstracting from dependencies such as method invocation/procedure call) and relationships such as inheritance. Many types of relationships cause dependencies among software, each of which has an impact on determining the architecture of the system; so much work in this area remains to be done. – http://wwwp.dnsalias.org/wiki/RecoveryToolsTechniques


The last statement in the WICSA Wiki entry states that there are many types of relationships that cause dependencies among software, each of which has an impact on determining the architecture of the system; so much work in this area remains to be done. Architecture is a summary of different views. These views address different concerns and, hence, consist of different entities and relationships.

slide-24
SLIDE 24

View and Viewpoints IEEE P1471 (2000)

Definition

A view is a representation of a whole system from the perspective of a related set of concerns.

Definition

A viewpoint specifies the kind of information that can be put in a view.

[Figure: call graph view and its viewpoint diagram; label: function calls]


Architecture reconstruction creates architectural views for existing systems. But what is a view at all? One of the achievements of the IEEE P1471 is the definition of views and viewpoints. A view is a representation of a whole system from the perspective of a related set of concerns. Here, for instance, you see a part of the call graph of jikes, the IBM compiler for Java. Isn’t that a beautiful picture? The professor next to my door in Bremen is a pioneer in computer arts. He is creating arts through algorithms. In my group, we are not just using software to create arts; software itself becomes art. Such views are formalized through viewpoints. A viewpoint specifies the kind of information that can be put in a

  • view. A call graph viewpoint can be modeled by this UML diagram, for instance.

Reverse engineers: a viewpoint is nothing else than what we call the schema.
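To make the schema idea concrete, here is a minimal sketch of a viewpoint as a schema that a view has to conform to. The class names `Viewpoint` and `View`, the node type `function`, and the relation `calls` are assumptions made for illustration; they are not taken from IEEE P1471 or from any particular tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Viewpoint:
    """A schema: which node types and which typed relations a view may contain."""
    name: str
    node_types: frozenset
    edge_types: frozenset  # triples (source node type, relation, target node type)


@dataclass
class View:
    """A concrete graph that must conform to its viewpoint."""
    viewpoint: Viewpoint
    nodes: dict = field(default_factory=dict)  # node name -> node type
    edges: set = field(default_factory=set)    # triples (source, relation, target)

    def add_node(self, name, node_type):
        if node_type not in self.viewpoint.node_types:
            raise ValueError(f"node type {node_type!r} is not part of {self.viewpoint.name}")
        self.nodes[name] = node_type

    def add_edge(self, source, relation, target):
        signature = (self.nodes[source], relation, self.nodes[target])
        if signature not in self.viewpoint.edge_types:
            raise ValueError(f"edge {signature!r} is not part of {self.viewpoint.name}")
        self.edges.add((source, relation, target))


# Hypothetical call graph viewpoint: functions that call functions.
CALL_GRAPH = Viewpoint(
    name="call graph",
    node_types=frozenset({"function"}),
    edge_types=frozenset({("function", "calls", "function")}),
)

view = View(CALL_GRAPH)
view.add_node("main", "function")
view.add_node("reverse", "function")
view.add_edge("main", "calls", "reverse")
```

An edge such as `("main", "inherits", "reverse")` would be rejected, because the assumed call graph viewpoint does not declare that relation.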

SLIDE 25

Viewpoints in Forward Engineering

Zachman
Perry and Wolfe
4+1
Siemens
Clements et al.
. . .


Viewpoints are very popular in forward engineering. You likely know these. Zachman was one of the first authors on viewpoints. He proposed 6 × 6 different viewpoints. Perry and Wolfe proposed a simplified version of these views, distinguishing only three viewpoints. Then you have the 4+1 viewpoints by Philippe Kruchten, you have the four Siemens viewpoints, et cetera. The number of viewpoints is confusing, in particular, because many of them are very similar. Recently, the book by Clements and colleagues brought some order to this sea of viewpoints.

SLIDE 26

Viewpoints Categorization by Clements, Bachmann, Bass, Garlan, Ivers, Little, Nord, and Stafford (2002)

M: module

decomposition, use, generalization, layers

CC: component & connectors

pipe and filter, shared data, publish and subscribe, client server, peer-to-peer, communicating processes

A: allocation

deployment, implementation, work assignment


Here, you see their categories of viewpoints. Module viewpoints show static structure and describe the decomposition, layering, and generalization of modules and their use dependencies. A module is a code unit that implements a set of responsibilities. Component-and-connector viewpoints express runtime behavior described in terms of components and connectors. A component is one of the principal processing units of the executing system; a connector is an interaction mechanism for the components. Allocation viewpoints describe mappings of software units to elements of the environment (the hardware, the file systems, or the development team). The interesting question now is which viewpoints do we as reverse engineers address?

SLIDE 27

Literature Scope

Org.    Title     Years
IEEE    WCRE      1995–2003
IEEE    ICSM      1995–2003
IEEE    IWPC      1998–2003
IEEE    CSMR      1997–2004
IEEE    ToSE      1995–2004
IEEE    WICSA     2001, 2004
IEEE    VISOFT    2002
ACM     ICSE      1987–2004
ACM     PASTE     1998–1999, 2000–2001
ACM     TOSEM     1992–2003
ACM     SIGSOFT   1990–2003
Wiley   JSME      2001–2004


I conducted a comprehensive literature survey to answer that question. I browsed virtually all conference proceedings and journals related to reverse engineering systematically. WICSA is, of course, among these.

SLIDE 28

Cat.  Style                  Content                             #Publ.
M     decomposition          part-of                             43
M     feature location       implements                          16
M     design patterns        element participates-in pattern     12
M     class diagrams         association, aggregation            10
M     conformance            conforms-to, deviates-from          7
M     interfaces             requires, provides                  3
M     use cases              implemented-by                      2
M     configuration          varies-with                         2
M     class hierarchies      inherits, attribute-of, method-of   2
CC    object interaction     interacts-with                      12
CC    process interaction    interacts-with                      10
CC    component interaction  interacts-with                      3
CC    conceptual viewpoint   implemented-by                      3
CC    object traces          applied operations                  2
A     responsibility         responsible-for                     1
A     build process          generated-by                        1
A     files                  described-in, stored-in             1
–     view integration       element corresponds-to Element      5


I related reverse engineering papers that address architectural views to the categorization by Clements and colleagues. The vast majority of research papers are devoted to finding modules or subsystems, generally through software clustering. These techniques typically extract dependencies and try to find cohesive elements.

There are also a number of papers that try to locate functionality in code. Such techniques establish traceability links between requirements, architecture, and code. ICSM with its many satellite workshops this year has seen quite a few new papers on that issue. Further momentum comes from techniques that try to identify aspects. Design patterns are also searched for quite frequently. It is interesting to note that we saw a similar movement in the 1990s. It was called plan or cliché recognition, and it looked for code idioms. Moreover, various heuristics are investigated to distinguish aggregation, composition, and general association when reverse engineering UML class diagrams from source code. And there are a few more papers on checking architecture conformance. Typically, however, these are structural checks such as the reflexion model.

Then, there are papers that try to find component-and-connector viewpoints. Interestingly enough, dynamic analyses do not dominate this area of research, although you do find many more of them than for module viewpoints. The papers differ in the types of runtime entities for which they reveal interaction patterns. Finally, you get only very few papers that try to embed software into its environment. One of them clusters modules according to their ownership, the so-called ownership architecture. Mike Godfrey describes ways to model and reconstruct the build architecture. Relating software to its environment is either not a popular topic or too difficult. In addition, there are a few papers that try to find links between different viewpoints. Such viewpoints could be considered meta viewpoints as they map views onto each other. That is why they do not fit into the categorization by Clements and colleagues.

SLIDE 29


module viewpoint

decomposition: Andritsos and Tzerpos (2003); Anquetil and Lethbridge (1998); Baniassad and Murphy (1998); Bauer and Trifu (2004); Bojic and Velasevic (2000); Chiricota, Jourdan, and Melançon (2003); van Deursen and Kuipers (1999); Embley and Woodfield (1988); Girard and Koschke (1997); Girard, Koschke, and Schied (1997); Koschke and Eisenbarth (2000); Krikhaar (1997); Lakhotia and Gravley (1995); Lindig and Snelting (1997); Lung (1998); Tzerpos and Holt (1999); Mahdavi, Harman, and Hierons (2003); Mancoridis and Holt (1996); Mancoridis, Mitchell, Rorres, Chen, and Gansner (1998); Maqbool and Babri (2004); Mitchell and Mancoridis (2001); Müller and Klashinsky (1985); Müller, Tilley, Orgun, Corrie, and Madhavji (1992); de Oca and Carver (1998); Rayside, Reuss, Hedges, and Kontogiannis (2000); Saeed, Maqbool, Babri, Hassan, and Sarwar (2003); Sartipi, Kontogiannis, and Mavaddat (2000a); Sartipi and Kontogiannis (2001, 2003); Sartipi, Kontogiannis, and Mavaddat (2000b); Sartipi (2001); Schwanke (1992); Shokoufandeh, Mancoridis, and Maycock (2002); Siff and Reps (1997, 1999); Tonella (2001); Tzerpos and Holt (2000); Tzerpos (1997); Wen and Tzerpos (2003); Abreu, Pereira, and Sousa (2000); Gall and Klösch (1995); Han, Hofmeister, and Nord (2003); Mendonca and Kramer (1998)

feature location: Chan, Liang, and Michail (2003); Chen and Rajlich (2000); Deprez and Lakhotia (2000); Egyed (2001); Eisenbarth, Koschke, and Simon (2001b,a, 2002a, 2001c, 2003); Lukoit, Wilde, Stowell, and Hennessey (2000); Marcus and Maletic (2003); Murphy, Lai, Walker, and Robillard (2001a); Pashov, Riebisch, and Philippow (2004); Wilde and Scully (1995); Wilde, Buckellew, Page, and Rajlich (2001); Zhao, Zhang, Liu, Sun, and Yang (2004)

design patterns: Antoniol, Fiutem, and Cristoforetti (1998); Asencio, Cardman, Harris, and Laderman (2002); Balanyi and Ferenc (2003); Heuzeroth, Holl, Högström, and Löwe (2003); Keller, Schauer, Robitaille, and Page (1999); Kramer and Prechelt (1996); Michail (2000); Niere, Schäfer, Wadsack, Wendehals, and Welsh (2002); Niere, Wadsack, and Wendehals (2003); Seemann and von Gudenberg (1998); Tonella and Antoniol (1999, 2001)


class diagrams: Egyed (2002); Jackson and Waingold (1999, 2001); Milanova, Rountev, and Ryder (2002); Richner and Ducasse (2002, 1999); Riva and Rodriguez (2002); Subramaniam and Byrne (1996); Tonella and Potrich (2003); Yeh and Kuo (2002)

conformance: Aldrich, Chambers, and Notkin (2002); Gannod and Murthy (2003); Koschke and Simon (2003); Murphy, Notkin, and Sullivan (1995, 2001b); Rötschke and Krikhaar (2002); Tvedt, Costa, and Lindvall (2002)

interfaces: Mancoridis (1996); Whaley, Martin, and Lam (2002); Viljamaa (2003)

use cases: Lucca, Fasolino, and Carlini (2000); El-Ramly, Stroulia, and Sorenson (2002)

configuration: Krone and Snelting (1994); Snelting (1996)

class hierarchies: Dekel and Gil (2003); Snelting and Tip (1998)

SLIDE 30


component & connector

object interaction: Pauw, Jensen, Mitchell, Sevitsky, Vlissides, and Yang (2001); Briand, Labiche, and Miao (2003); Jerding, Stasko, and Ball (1997); Jerding and Rugaber (1997); Kollmann and Gogolla (2001); Krikhaar, Feijs, de Jong, and Medema (1999); Systä, Koskimies, and Müller (2001); Systä (2000); Systä (1999); Souder, Mancoridis, and Salah (2001); Yan, Garlan, Schmerl, Aldrich, and Kazman (2004); Wu, Hassan, and Holt (2002)

process interaction: Chase, Christey, Harris, and Yeh (1998b,a); Fiutem, Tonella, Antoniol, and Merlo (1996); Harris, Reubenstein, and Yeh (1995, 1996); Holtzblatt, Piazza, Reubenstein, Roberts, and Harris (1997); Pinzger and Gall (2002); Tonella, Fiutem, Antoniol, and Merlo (1996); Han et al. (2003); Mendonca and Kramer (1998)

component interaction: Ivkovic and Godfrey (2002); Marburger and Herzberg (2001); Moe and Carr (2001)

conceptual viewpoint: Biggerstaff, Mitbander, and Webster (1993); Gall, Jazayeri, Klösch, Lugmayr, and Trausmuth (1996); Han et al. (2003)

object traces: Eisenbarth, Koschke, and Vogel (2005, 2002b)

allocation

responsibility: Bowman and Holt (1999)

build process: Tu and Godfrey (2001)

files: Han et al. (2003)

view integration: Chase, Harris, Roberts, and Yeh (1996); Issarny, Saridakis, and Zarras (1998); Kazman and Carriere (1998); Waters and Abowd (1999); Yeh, Harris, and Chase (1997)


SLIDE 31

Statement

There is more to architecture reconstruction than just static structure.

Statement

Homework for reverse engineers: describe your reconstructed views through viewpoints for the reasons of
  formalization and comparison
  composability and interoperability


There is more to architecture reconstruction than just static structure. This statement is intentionally ambiguous. I hope that I have demonstrated that there is quite some work that goes beyond just structure. On the other hand, it is true that most work is devoted to static structure. The forward engineers have done their homework and cleared up the mess of views. They have also introduced the notion of a viewpoint. We have already started to model the data we work on or produce. We simply use the term schema instead of viewpoint. But we still have a long way to go. Phillip Newcomb will report on an OMG initiative to provide standard means to model reverse engineering data. The advantage of viewpoints or schemas is obvious: we could compare our work better, and they would help us compose techniques and make our tools interoperable.

SLIDE 32

Statement

We need a viewpoint catalog: concern → viewpoints → (re-)construction techniques


My vision is a catalog of viewpoints, some kind of cookbook, where you could look up concerns, the viewpoints that address these concerns, and the associated construction or reconstruction techniques.

SLIDE 33

Composing Architecture Reconstruction Through Views

[Figure: input view 1, input view 2, and input view 3 are combined into an output view]

– van Deursen, Hofmeister, Koschke, Moonen, and Riva (2004)


My vision is to compose reconstruction techniques like a puzzle. You model the view that is reconstructed as a viewpoint. Likewise, you model the data on which the reconstruction is based. And you have various techniques, each specified in the same way. The modelling allows you to decide whether two techniques are compatible and whether you can compose them into a reconstruction pipeline.
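The compatibility check described here can be sketched in a few lines. This is only an illustration under stated assumptions: each technique is reduced to the names of its input viewpoints and its output viewpoint, and the class `Technique`, the function `compose`, and all viewpoint names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technique:
    name: str
    inputs: frozenset  # names of the viewpoints the technique consumes
    output: str        # name of the viewpoint it produces


def compose(available, steps):
    """Check the steps in order; return the viewpoints available at the end."""
    available = set(available)
    for step in steps:
        missing = step.inputs - available
        if missing:
            raise ValueError(f"{step.name} lacks input views: {sorted(missing)}")
        available.add(step.output)
    return available


# Hypothetical viewpoint names, chosen only for this illustration.
feature_location = Technique(
    "feature location", frozenset({"source", "features"}), "specific functions")
slicing = Technique(
    "program slicing", frozenset({"source", "specific functions"}), "sliced SDG")
clustering = Technique(
    "software clustering", frozenset({"sliced SDG"}), "modules")

result = compose({"source", "features"}, [feature_location, slicing, clustering])
```

Running the steps in a different order, for instance clustering first, raises an error because the required input view does not exist yet. A real catalog would compare full schemas rather than names.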

SLIDE 34

Example

Task: Extract and reuse an implementation for a given set of features.

Which code implements these features specifically? → feature location

What is needed? → program slicing

How do you modularize the extracted code? → software clustering

How do you use the new modules? → protocol reconstruction


I’d like to demonstrate what such a combination of techniques could look like. Assume that we want to extract and reuse an implementation for a given set of features. To this end, you need to know which code implements these features specifically, what is needed by that code, how you modularize the extracted code, and how you use the new modules. For each of these questions, we have developed techniques. Please note that the following examples are small and simple only so that they fit on a slide; the techniques have been applied to large systems successfully.

SLIDE 35

Step 1 (Eisenbarth et al., 2003)

Which code implements these features specifically?

[Figure, built up in steps: the program is instrumented; test cases t1, t2, and t3 are run on the instrumented program and yield the profiles t1, t2, and t3 over the executed units u1 to u4; the test cases are related to the features f1, f2, and f3; formal concept analysis (Birkhoff, 1940) yields a lattice whose nodes are labeled f1 u1 t1, f2 u2 t2, and f3 u3 u4 t3, from which feature-specific units, common units, common features, and feature implications are read off; static program slicing follows]


Feature location attempts to identify the pieces of the code that specifically implement a given set of features. Our technique for feature location is a combination of dynamic and static analysis using formal concept analysis to factor out the specific code from the code executed for test cases that invoke the features. For details of the feature location technique, see our paper (Eisenbarth et al., 2003).
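The role of formal concept analysis can be illustrated with a naive sketch. It assumes that each test case invokes exactly one feature and that a profile is simply the set of units executed by that test case; the unit `u0` and the concrete profile contents are invented for the example. The sketch enumerates all concepts by brute force, which only works for tiny inputs, and it leaves out the static analysis part of the published technique.

```python
from itertools import combinations


def common_units(scenarios, profile, all_units):
    """Units executed by every scenario in the set (all units for the empty set)."""
    units = set(all_units)
    for scenario in scenarios:
        units &= profile[scenario]
    return frozenset(units)


def scenarios_executing(units, profile):
    """Scenarios whose profile contains all of the given units."""
    return frozenset(s for s, executed in profile.items() if units <= executed)


def concepts(profile):
    """All formal concepts (scenario set, unit set), found by closing every subset."""
    all_units = set().union(*profile.values())
    found = set()
    for size in range(len(profile) + 1):
        for subset in combinations(sorted(profile), size):
            intent = common_units(subset, profile, all_units)
            found.add((scenarios_executing(intent, profile), intent))
    return found


def specific_units(scenario, profile):
    """Units executed by this scenario and by no other."""
    return {u for u in profile[scenario]
            if scenarios_executing({u}, profile) == {scenario}}


# Hypothetical profiles: t1, t2, t3 invoke features f1, f2, f3, respectively.
profile = {
    "t1": {"u0", "u1"},
    "t2": {"u0", "u2"},
    "t3": {"u0", "u3", "u4"},
}
lattice = concepts(profile)
```

In this invented example, `specific_units("t3", profile)` yields `u3` and `u4`, while `u0` ends up in the concept shared by all test cases, that is, among the common units.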

SLIDE 36

Step 2 (Weiser, 1984)

What is needed?

procedure Foo (sum, prod : out int; n : in int) is
   i : int;
begin
   i := 1;
   sum := 0;
   prod := 1;
   while i <= n loop
      sum := sum + i;
      prod := prod * i;
      i := i + 1;
   end loop;
end Foo;

[Figure: reconstruction pipeline so far. Elements shown: source, instrumentation, instrumented code, profiler, calls, features, formal concept analysis, relevant/specific functions, control flow analysis, indirect references, control flow graph, dependency analysis (control, data flow), system dependency graph (SDG), slicing]


Once we have identified the specific code, we want to extract it. The specific code requires supporting code, which we need to extract as well. But we want to extract only the code that is really necessary. Program slicing is a technique that gives you this code by following control and data dependencies. Program slicing works for large programs. There are even commercial tools available, for instance CodeSurfer by GrammaTech. I illustrate program slicing with a simple example. Let us assume we only need to compute parameter sum of procedure Foo.

SLIDE 37

Step 2

What is needed?

procedure Foo (sum, prod : out int; n : in int) is
   i : int;
begin
   i := 1;
   sum := 0;
   prod := 1;             -- not in the slice for sum
   while i <= n loop
      sum := sum + i;
      prod := prod * i;   -- not in the slice for sum
      i := i + 1;
   end loop;
end Foo;


Program slicing determines all statements necessary to compute this value. These are the statements on which sum is control or data dependent. The unnecessary code is grayed out in the example.
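A backward slice for the Foo example can be sketched as reachability over dependence edges. The statement numbering and the hand-written dependence table below are assumptions made for this illustration; a real slicer derives them from the control flow graph and handles procedures, pointers, and parameters such as n.

```python
# Statements of procedure Foo, numbered only for this sketch.
STATEMENTS = {
    1: "i := 1",
    2: "sum := 0",
    3: "prod := 1",
    4: "while i <= n loop",
    5: "sum := sum + i",
    6: "prod := prod * i",
    7: "i := i + 1",
}

# Hand-derived control and data dependences: statement -> statements it depends on.
DEPENDS_ON = {
    1: set(),
    2: set(),
    3: set(),
    4: {1, 7},              # the loop condition reads i
    5: {2, 5, 1, 7, 4},     # reads sum and i, controlled by the loop
    6: {3, 6, 1, 7, 4},     # reads prod and i, controlled by the loop
    7: {1, 7, 4},           # reads i, controlled by the loop
}


def backward_slice(criterion, depends_on):
    """Collect every statement reachable backwards from the criterion."""
    in_slice, worklist = set(), list(criterion)
    while worklist:
        statement = worklist.pop()
        if statement in in_slice:
            continue
        in_slice.add(statement)
        worklist.extend(depends_on[statement])
    return sorted(in_slice)


# The value of sum at the end of Foo is defined by statements 2 and 5.
sum_slice = backward_slice({2, 5}, DEPENDS_ON)
for number in sum_slice:
    print(STATEMENTS[number])
```

The two assignments to prod (statements 3 and 6) are never reached and therefore stay outside the slice.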

SLIDE 38

Step 3

How do you modularize the extracted code?

[Figure: dendrogram over the entities a to k with a similarity axis from 0.0 to 0.9. The pipeline diagram is extended by: sliced SDG, functions, clustering, modules]


Now that we extracted the code, we might need to refactor it a bit. Software clustering is a technique that groups related elements together. You simply need to define a similarity function that determines how related two entities are and then a clustering algorithm clusters these elements for you. The available techniques differ in their underlying similarity function and in the algorithm used for clustering. There are various algorithms. Some of them give you just flat groups, some of them even decomposition trees – also known as dendrograms – as shown in this example.
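As a minimal sketch of this idea, the following code combines one possible similarity function (Jaccard similarity over the things an entity references) with one possible algorithm (single-linkage agglomerative clustering). Both choices, the threshold, and the example data are assumptions for illustration and not the specific techniques from the literature.

```python
def jaccard(a, b):
    """Similarity of two sets: shared elements divided by all elements."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(features, threshold):
    """Merge the two most similar clusters until no pair reaches the threshold.

    Returns the final clusters and the merge history, which corresponds to
    the inner nodes of a dendrogram.
    """
    clusters = [frozenset([name]) for name in sorted(features)]
    merges = []

    def similarity(c1, c2):
        # Single linkage: similarity of the closest pair of members.
        return max(jaccard(features[x], features[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        best, i, j = max(
            (similarity(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        if best < threshold:
            break
        merged = clusters[i] | clusters[j]
        merges.append((best, merged))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


# Hypothetical input: each function with the global variables it references.
features = {
    "a": {"x", "y"},
    "b": {"x", "y"},
    "c": {"y", "z"},
    "d": {"w"},
}
clusters, merges = cluster(features, threshold=0.3)
```

With these invented inputs, a and b are merged first (similarity 1.0), c joins them at similarity 1/3, and d stays on its own.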

SLIDE 39

Step 4 (Eisenbarth, Koschke, and Vogel, 2005)

int main() {
    int i = 0;
    Stack *s1 = init();
    Stack *s2 = readFromFile();
    reverse(s2, s1);
    do {
        pop(s1);
        i = i + 1;
    } while (!is_empty(s1));
}

void reverse(Stack *from, Stack *to)
{
    while (!is_empty(from))
        push(to, pop(from));
}

typedef ... Stack;
Stack *create();
void init(Stack *stack);
void push(Stack *stack, Item i);
Item pop(Stack *stack);
Item top(Stack *stack);
int is_empty(Stack *stack);
void release(Stack *stack);


Finally, we want to know how the extracted component can actually be used. That is, we want to know the underlying protocol of the component. There is active research on how to tackle that problem. One way is to look for examples of usage of a component that can be derived from the code. A component can be instantiated multiple times at runtime, so we need an extraction that is based on each instance. Let us assume we want to know how this stack component is supposed to be used (again, this example is simple enough to fit on a slide). You can declare many variables of this abstract data type. One aspect you are interested in is the allowable sequence of operations on a particular instance, or object. There is a technique called object tracing that gives you the sequences of operations for individual objects. There are static and dynamic angles from which we can tackle this problem. Static object tracing retrieves this information from the code itself. It returns an object process graph, which is a finite description of every possible sequence of operations for a particular object, in our example for variable s1. In essence, an object process graph is a projection of the control flow graph that contains only the operations applied to the object and the predicates on which these are dependent. Details can be found in our publications (Eisenbarth et al., 2002b, 2005). We have recently developed a dynamic analysis as well.
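The projection idea can be sketched as follows, under strong simplifying assumptions: the control flow graph is intraprocedural, the call to reverse is treated as a single node, and the node names are invented for this example. The published technique is interprocedural and has to deal with pointers, which this sketch ignores.

```python
def project(cfg, keep):
    """Project a control flow graph onto the nodes in keep.

    cfg maps each node to its successors. A kept node is connected to every
    kept node that can be reached from it via dropped nodes only.
    """
    def next_kept(node):
        seen, stack, result = set(), list(cfg.get(node, [])), set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if current in keep:
                result.add(current)
            else:
                stack.extend(cfg.get(current, []))
        return result

    return {node: sorted(next_kept(node)) for node in keep}


# Hypothetical control flow graph of main from the stack example.
cfg = {
    "entry": ["i = 0"],
    "i = 0": ["init(s1)"],
    "init(s1)": ["readFromFile(s2)"],
    "readFromFile(s2)": ["reverse(s2, s1)"],
    "reverse(s2, s1)": ["pop(s1)"],
    "pop(s1)": ["i = i + 1"],
    "i = i + 1": ["is_empty(s1)"],
    "is_empty(s1)": ["pop(s1)", "exit"],
    "exit": [],
}

# Keep only entry, exit, and the operations applied to s1.
relevant = {"entry", "init(s1)", "reverse(s2, s1)", "pop(s1)", "is_empty(s1)", "exit"}
graph_s1 = project(cfg, relevant)
```

The resulting graph for s1 no longer mentions the counter i or the stack s2, but it keeps the loop between pop and is_empty.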

SLIDE 40

Static Object Tracing

[Figure: the code of main and reverse from the previous slide next to the object process graph for s1. Nodes shown: ENTRY, init, CALL reverse, is_empty, push, pop, RETURN, grouped by the functions main and reverse]


This figure now shows you the object process graph for variable s1.
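The notes mention a dynamic counterpart to static object tracing. As a minimal sketch of that idea (not the analysis from the cited publications), the stack operations below are instrumented by hand so that every instance logs the operations applied to it. The `Stack` layout, the fixed capacities, and the helpers `record` and `format_trace` are assumptions made for this illustration; only the operation names come from the slide.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ITEMS 64    /* assumed capacity, for illustration only */
#define MAX_TRACE 256   /* assumed maximum trace length */

/* Hypothetical traced stack: each instance keeps its own operation log. */
typedef struct Stack {
    int items[MAX_ITEMS];
    int size;
    const char *trace[MAX_TRACE];  /* operation names in call order */
    int trace_len;
} Stack;

/* Append one operation name to the trace of this instance. */
static void record(Stack *s, const char *op) {
    if (s->trace_len < MAX_TRACE)
        s->trace[s->trace_len++] = op;
}

Stack *init(void) {
    Stack *s = calloc(1, sizeof *s);
    assert(s != NULL);
    record(s, "init");
    return s;
}

void push(Stack *s, int value) {
    record(s, "push");
    assert(s->size < MAX_ITEMS);
    s->items[s->size++] = value;
}

int pop(Stack *s) {
    record(s, "pop");
    assert(s->size > 0);   /* defensive check: pop requires a non-empty stack */
    return s->items[--s->size];
}

int is_empty(Stack *s) {
    record(s, "is_empty");
    return s->size == 0;
}

/* Same shape as reverse() on the slide. */
void reverse(Stack *from, Stack *to) {
    while (!is_empty(from))
        push(to, pop(from));
}

/* Write the trace of s into buf as "op1 op2 ...". Requires cap > 0. */
void format_trace(const Stack *s, char *buf, size_t cap) {
    buf[0] = '\0';
    for (int i = 0; i < s->trace_len; i++) {
        if (i > 0)
            strncat(buf, " ", cap - strlen(buf) - 1);
        strncat(buf, s->trace[i], cap - strlen(buf) - 1);
    }
}
```

Replaying the slide's main with s2 filled by two explicit push calls (standing in for readFromFile) gives s1 the trace init push push pop is_empty pop is_empty. A dynamic trace like this covers a single execution, whereas the object process graph describes every possible sequence.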

slide-41
SLIDE 41

Protocol Reconstruction

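[Figure: the object process graphs of instance 1 through instance n, each over the operations init, push, pop, and is_empty, are merged by black-box unification into a protocol. A second part, labeled white-box analysis, shows the states empty and non-empty with transitions labeled init, push, pop [!is_empty], pop [is_empty], and is_empty.]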

Once you have identified the object process graphs for all instances of a component, you can unify them. Assuming that each of them represents correct behavior, the unification represents at least a subset of the protocol. A user can complete the protocol through his or her domain knowledge. Additionally, you can look into the component’s implementation itself. There is a technique that collects the predicates on the paths to locations at which an exception is raised. These predicates constitute a precondition. Such information is typically available in code written in a defensive programming style.
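The unification step can be illustrated with a deliberately coarse sketch. It assumes that each instance is represented by a linear trace of operations rather than a full object process graph, and it approximates the protocol as a successor relation ("which operation may directly follow which"). This is a stand-in chosen for brevity, not the unification technique from the cited work, and it does not track the empty and non-empty states. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Operations of the stack component, plus a pseudo-operation before init. */
typedef enum { OP_START, OP_INIT, OP_PUSH, OP_POP, OP_IS_EMPTY, OP_COUNT } Op;

/* Protocol approximated as a successor relation over operations. */
typedef struct {
    bool follows[OP_COUNT][OP_COUNT];  /* follows[a][b]: b was seen right after a */
} Protocol;

void protocol_clear(Protocol *p) {
    for (int a = 0; a < OP_COUNT; a++)
        for (int b = 0; b < OP_COUNT; b++)
            p->follows[a][b] = false;
}

/* Merge one observed trace (assumed to show correct usage) into the protocol. */
void protocol_unify(Protocol *p, const Op *trace, size_t len) {
    Op prev = OP_START;
    for (size_t i = 0; i < len; i++) {
        assert(trace[i] > OP_START && trace[i] < OP_COUNT);
        p->follows[prev][trace[i]] = true;
        prev = trace[i];
    }
}

/* Return the index of the first step the protocol does not cover,
   or len if the whole trace is covered. */
size_t protocol_first_violation(const Protocol *p, const Op *trace, size_t len) {
    Op prev = OP_START;
    for (size_t i = 0; i < len; i++) {
        if (trace[i] <= OP_START || trace[i] >= OP_COUNT)
            return i;
        if (!p->follows[prev][trace[i]])
            return i;
        prev = trace[i];
    }
    return len;
}
```

Because the result only reflects what the observed instances did, a reported violation may simply mean that a legal usage was never seen. This matches the remark that the unification yields a subset of the protocol, which a user then completes.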

slide-42
SLIDE 42

All Steps

[Figure: overview of all reconstruction steps. Labels: source, instrumentation, instrumented code, profiler, features, calls, formal concept analysis, relevant/specific functions, control flow analysis, control flow graph, pointer analysis, indirect references, dependency analysis (control, data flow), system dependency graph (SDG), slicing, sliced SDG, functions, clustering, modules, object tracing, object process graph, protocol reconstruction, protocol.]

Finally, we put all this together. This overview shows the sequence of steps to solve our original problem. The solution is a successful interplay of various existing techniques.

slide-43
SLIDE 43

Dagstuhl Software Architecture: Recovery and Modelling 2003

Brainstorming session on open research problems

So, what remains to be done beyond what I have already said? In 2003, Arie van Deursen, Rick Kazman, and I organized the Dagstuhl seminar Software Architecture: Recovery and Modelling 2003. In one of the sessions we brainstormed open research problems. Here is an excerpt of this session.

slide-44
SLIDE 44

Open Research Problems

clustering
    “semantic” clustering
    mapping results of clustering to as-intended architectures

combination of techniques
    reconciling different recovery techniques
    exchange format with semantics

identifying concerns and the code fragments affected by them

adoption
    “make money with reverse engineering in industry”
    “how to become a millionaire using reverse engineering?”

management issues
    realistic process models for applying reverse engineering tools
    cost models for reverse engineering

architecture conformance checking

analysis of dynamically reconfigurable systems

architectural refactoring

We have proposed many clustering techniques in our research. They are all more or less based on structural information. Some of them additionally take the similarity of identifiers into account. Usually, they try to cluster according to the notions of cohesion and coupling. But these are not necessarily the criteria humans use for grouping. Humans cluster things because they are semantically related. Yet, it is difficult to extract this kind of semantics from code. We cannot perform miracles. Moreover, it is an open issue how we relate these clusters to a model of the intended architecture if we have one. At WCRE 2005, I am presenting a paper that combines clustering and the reflexion method along these lines.

The combination of techniques I have already mentioned. Viewpoints would be a stepping stone. Identification of aspects is another issue. As opposed to normal feature location, aspects are by definition cross-cutting concerns and cannot be mapped onto a single element. Then we have the adoption problem. It would certainly be nice to become a millionaire. But I would already be happy if just a few more people actually used our research tools. There are also open management issues such as suitable process models and cost estimation. I think that is a completely underdeveloped area. The only one who seems to care is Harry Sneed, who is presenting a paper on this issue at this conference. Architecture conformance checking I have already mentioned. And last but not least, how do we analyze these dynamically reconfigurable systems that will flood us in the near future? There is one more item that I am adding myself. It does not stem from the brainstorming session. We all know the book by Fowler on refactoring. Could something similar be written for architectural refactorings? What are the bad smells at that level? What are the transformations to cure them? In the end, are architectural refactorings any different from code refactorings?

slide-45
SLIDE 45

Hitchhiker’s Guide to the Galaxy: Reverse Engineering

Random laughed. “OK,” she said. “Let’s try and go to Earth. Let’s go to Earth at some point on its, er...” “Probability axis?” “Yes. Where it hasn’t been blown up. OK. So you’re the Guide. How do we get a lift?” “Reverse engineering.” “What?” “Reverse engineering. To me the flow of time is irrelevant. You decide what you want. I then merely make sure that it has already happened.” “You’re joking.” “Anything is possible.”

Many open questions. Who can answer these? Well, there is a book in this universe that has most of the answers. Well, at least it prevents you from panicking. In Chapter Seventeen of the Hitchhiker’s Guide to the Galaxy, long after the Earth was demolished, there is this conversation between Random and the new Guide. Arthur and Random want to get back to Earth. In this conversation, the bird, or Guide if you will, suggests reverse engineering as a way to solve the problem. Reverse engineering goes back into the past and changes some details so that the present is as wished. Anything is possible. Well, maybe in fairy tales. We are restricted by reality. And there are likely two things that remain real: the fact that reverse engineering will continue to be necessary and the fact that it will never be perfect in recovering all necessary information. Inasmuch as forward engineering is an experience-based and creative undertaking, reverse engineering, too, requires creativity and experience beyond the ability to analyze, because it creates abstractions, something computers aren’t really good at. Only basic steps will ever be automated in reverse engineering.

slide-46
SLIDE 46

IEEE International Conference on Software Maintenance

http://icsm2006.cs.drexel.edu/

Abreu, F., G. Pereira, and P. Sousa. 2000. A coupling-guided cluster analysis approach to reengineer the modularity of object-oriented systems. In European conference on software maintenance and reengineering. IEEE Computer Society Press.

Aldrich, Jonathan, Craig Chambers, and David Notkin. 2002. ArchJava: Connecting software architecture to implementation. In International conference on software engineering, 187–196. ACM Press.

Andritsos, Periklis, and Vassilios Tzerpos. 2003. Software clustering based on information loss minimization. In Working conference on reverse engineering, 334–343. IEEE Computer Society Press.

Anquetil, Nicolas, and Timothy Lethbridge. 1998. Extracting concepts from file names: a new file clustering criterion. In International conference on software engineering, 84–93. ACM Press.

slide-47
SLIDE 47

Antoniol, G., R. Fiutem, and L. Cristoforetti. 1998. Design pattern recovery in object-oriented software. In International workshop on program comprehension. IEEE Computer Society Press.

Asencio, A., S. Cardman, D. Harris, and E. Laderman. 2002. Relating expectations to automatically recovered design patterns. In Working conference on reverse engineering, 87–96. IEEE Computer Society Press.

Balanyi, Zsolt, and Rudolf Ferenc. 2003. Mining design patterns from C++ source code. In International conference on software maintenance, 305–314. IEEE Computer Society Press.

Baniassad, Elisa L. A., and Gail C. Murphy. 1998. Conceptual module querying for software reengineering. In International conference on software engineering, 64–73. ACM Press.

Bauer, Markus, and Mircea Trifu. 2004. Architecture-aware adaptive clustering of OO systems. In European conference on software maintenance and reengineering, 3–12. IEEE Computer Society Press.

Biggerstaff, Ted J., Bharat G. Mitbander, and Dallas Webster. 1993. The concept assignment problem in program understanding. In International conference on software engineering, 482–498. ACM Press.

Bojic, Dragan, and Dusan Velasevic. 2000. A use-case driven method of architecture recovery for program understanding and reuse reengineering. In European conference on software maintenance and reengineering. IEEE Computer Society Press.

Bowman, Ivan T., and Richard C. Holt. 1999. Reconstructing ownership architectures to help understand software systems. In International workshop on program comprehension. IEEE Computer Society Press.

slide-48
SLIDE 48

Briand, L.C., Y. Labiche, and Y. Miao. 2003. Towards the reverse engineering of UML sequence diagrams. In Working conference on reverse engineering, 57–66. IEEE Computer Society Press.

Chan, Keith, Zhi Cong Leo Liang, and Amir Michail. 2003. Design recovery of interactive graphical applications. In International conference on software engineering, 114–124. ACM Press.

Chase, Melissa P., Steven M. Christey, David R. Harris, and Alexander S. Yeh. 1998a. Managing recovered function and structure of legacy software components. In Working conference on reverse engineering. IEEE Computer Society Press.

———. 1998b. Recovering software architecture from multiple source code analyses. In Program analysis for software technology, 43–50. ACM Press.

Chase, Melissa P., David R. Harris, Susan N. Roberts, and Alexander S. Yeh. 1996. Analysis and presentation of recovered software architectures. In Working conference on reverse engineering. IEEE Computer Society Press.

Chen, Kunrong, and Václav Rajlich. 2000. Case study of feature location using dependence graph. In International workshop on program comprehension. IEEE Computer Society Press.

Chikofsky, Elliot J., and James H. Cross II. 1990. Reverse Engineering and Design Recovery: A Taxonomy. IEEE Software 7(1): 13–17.

Chiricota, Yves, Fabien Jourdan, and Guy Melançon. 2003. Software components capture using graph clustering. In International workshop on program comprehension, 217–226. IEEE Computer Society Press.

Clements, Paul, Felix Bachmann, Len Bass, David Garlan, James Ivers, Reed Little, Robert Nord, and Judith Stafford. 2002. Documenting software architecture. Boston: Addison-Wesley.

slide-49
SLIDE 49

Dekel, Uri, and Yossi Gil. 2003. Revealing class structure with concept lattices. In Working conference on reverse engineering, 353–362. IEEE Computer Society Press.

Deprez, Jean-Christophe, and Arun Lakhotia. 2000. A formalism to automate mapping from program features to code. In International workshop on program comprehension. IEEE Computer Society Press.

van Deursen, Arie, Christine Hofmeister, Rainer Koschke, Leon Moonen, and Claudio Riva. 2004. Symphony: View-driven software architecture reconstruction. In IEEE/IFIP working conference on software architecture, 122–132. IEEE Computer Society Press.

van Deursen, Arie, and Tobias Kuipers. 1999. Identifying objects using cluster and concept analysis. In International conference on software engineering, 246–255. IEEE Computer Society Press.

Egyed, Alexander. 2001. A scenario-driven approach to traceability. In International conference on software engineering, 123–132. ACM Press.

———. 2002. Automated abstraction of class diagrams. ACM Transactions on Software Engineering and Methodology 11(4): 449–491.

Eisenbarth, Thomas, Rainer Koschke, and Daniel Simon. 2001a. Aiding program comprehension by static and dynamic feature analysis. In International conference on software maintenance, 602–611. IEEE Computer Society Press.

———. 2001b. Derivation of feature component maps by means of concept analysis. In European conference on software maintenance and reengineering, 176–180. IEEE Computer Society Press.

———. 2001c. Feature-driven program understanding using concept analysis of execution traces. In International workshop on program comprehension, 300–309. IEEE Computer Society Press.

slide-50
SLIDE 50

———. 2002a. Incremental location of combined features for large-scale programs. In International conference on software maintenance, 273–282. IEEE Computer Society Press.

———. 2003. Locating features in source code. IEEE Computer Society Transactions on Software Engineering 29(3).

Eisenbarth, Thomas, Rainer Koschke, and Gunther Vogel. 2002b. Static trace extraction. In Working conference on reverse engineering. IEEE Computer Society Press.

———. 2005. Static object trace extraction for programs with pointers. Journal of Systems and Software.

El-Ramly, Mohammad, Eleni Stroulia, and Paul Sorenson. 2002. Mining system-user interaction traces for use case models. In International workshop on program comprehension, 21–30. IEEE Computer Society Press.

Embley, D. W., and S. N. Woodfield. 1988. Assessing the quality of abstract data types written in Ada. In International conference on software engineering, 144–153. ACM Press.

Fiutem, Roberto, Paolo Tonella, Giulio Antoniol, and Ettore Merlo. 1996. A cliche-based environment to support architectural reverse engineering. In International conference on software maintenance. IEEE Computer Society Press.

Gall, Harald, Mehdi Jazayeri, René Klösch, Wolfgang Lugmayr, and Georg Trausmuth. 1996. Architecture recovery in ARES. In Joint proceedings of the second international software architecture workshop 111–115. ACM Press.

Gall, Harald, and René Klösch. 1995. Finding objects in procedural programs: an alternative approach. In Working conference on reverse engineering, 208–217. IEEE Computer Society Press.

Gannod, Gerald C., and Shilpa Murthy. 2003. Verification of recovered software architectures. In International workshop on program comprehension, 258–267. IEEE Computer Society Press.

slide-51
SLIDE 51

Girard, Jean-Francois, and Rainer Koschke. 1997. Finding components in a hierarchy of modules: a step towards architectural understanding. In International conference on software maintenance. IEEE Computer Society Press.

Girard, Jean-Francois, Rainer Koschke, and Georg Schied. 1997. Comparison of abstract data type and abstract state encapsulation detection techniques for architectural understanding. In Working conference on reverse engineering. IEEE Computer Society Press.

Han, Minmin, Christine Hofmeister, and Robert L. Nord. 2003. Reconstructing software architecture for J2EE web applications. In Working conference on reverse engineering, 67–76. IEEE Computer Society Press.

Harris, D.R., H.B. Reubenstein, and A.S. Yeh. 1996. Recognizers for extracting architectural features from source code. In Working conference on reverse engineering, 252–261. IEEE Computer Society Press.

Harris, R., Howard B. Reubenstein, and Alexander S. Yeh. 1995. Reverse engineering to the architectural level. In International conference on software engineering, 186–195. ACM Press.

Heuzeroth, Dirk, Thomas Holl, Gustav Högström, and Welf Löwe. 2003. Automatic design pattern detection. In International workshop on program comprehension, 94–103. IEEE Computer Society Press.

Holtzblatt, L.J., R.L. Piazza, H.B. Reubenstein, S.N. Roberts, and D.R. Harris. 1997. Design recovery for distributed systems. IEEE Computer Society Transactions on Software Engineering 23(7): 461–472.

IEEE P1471. 2000. IEEE recommended practice for architectural description of software-intensive systems—std. 1471-2000.

Issarny, Valérie, Titos Saridakis, and Apostolos Zarras. 1998. Multi-view description of software architectures. In ISAW ’98, proceedings of the third international workshop on software archite 81–84.

slide-52
SLIDE 52

Ivkovic, Igor, and Michael W. Godfrey. 2002. Architecture recovery of dynamically linked applications: A case study. In International workshop on program comprehension, 178–184. IEEE Computer Society Press.

Jackson, Daniel, and Allison Waingold. 1999. Lightweight extraction of object models from bytecode. In International conference on software engineering, 194–202. ACM Press.

———. 2001. Lightweight extraction of object models from bytecode. IEEE Computer Society Transactions on Software Engineering 27(2): 159–169.

Jerding, Dean, and Spencer Rugaber. 1997. Using visualization for architectural localization and extraction. In Working conference on reverse engineering. IEEE Computer Society Press.

Jerding, Dean F., John T. Stasko, and Thomas Ball. 1997. Visualizing interactions in program executions. In International conference on software engineering, 360–370. IEEE Computer Society Press.

Kazman, Rick, and S. Jeromy Carriere. 1998. View extraction and view fusion in architectural understanding. In Proceedings of the fifth international conference on software reuse. IEEE Computer Society Press.

Keller, Rudolf K., Reinhard Schauer, Sébastien Robitaille, and Patrick Page. 1999. Pattern-based reverse-engineering of design components. In International conference on software engineering, 226–235. ACM Press.

Kollmann, Ralf, and Martin Gogolla. 2001. Capturing dynamic program behaviour with UML collaboration diagrams. In European conference on software maintenance and reengineering, 58–67. IEEE Computer Society Press.

slide-53
SLIDE 53

Koschke, Rainer, and Thomas Eisenbarth. 2000. A framework for experimental evaluation of clustering techniques. In International workshop on program comprehension. IEEE Computer Society Press.

Koschke, Rainer, and Daniel Simon. 2003. Hierarchical reflexion models. In Working conference on reverse engineering, 36–45. IEEE Computer Society Press.

Kramer, Christian, and Lutz Prechelt. 1996. Design recovery by automated search for structural design patterns in object-oriented software. In Working conference on reverse engineering. IEEE Computer Society Press.

Krikhaar, R., L. Feijs, R. de Jong, and J. Medema. 1999. Architecture comprehension tools for a PBX system. In European conference on software maintenance and reengineering. IEEE Computer Society Press.

Krikhaar, Rene. 1997. Reverse architecting approach for complex systems. In International conference on software maintenance. IEEE Computer Society Press.

Krone, Maren, and Gregor Snelting. 1994. On the inference of configuration structures from source code. In International conference on software engineering, 49–57. ACM Press.

Lakhotia, A., and J.M. Gravley. 1995. Toward experimental evaluation of subsystem classification recovery techniques. In Working conference on reverse engineering, 262–271. IEEE Computer Society Press.

Lindig, Christian, and Gregor Snelting. 1997. Assessing modular structure of legacy code based on mathematical concept analysis. In International conference on software engineering, 349–359. IEEE Computer Society Press.

slide-54
SLIDE 54

Lucca, Giuseppe Antonio Di, Anna Rita Fasolino, and Ugo De Carlini. 2000. Recovering use case models from object-oriented code: A thread-based approach. In Working conference on reverse engineering. IEEE Computer Society Press.

Lukoit, Kazimiras, Norman Wilde, Scott Stowell, and Tim Hennessey. 2000. Tracegraph: Immediate visual location of software features. In International conference on software maintenance. IEEE Computer Society Press.

Lung, Chung-Horng. 1998. Software architecture recovery and restructuring through clustering techniques. In Proceedings of the third international workshop on software architecture, 101–104. ACM Press.

Mahdavi, Kiarash, Mark Harman, and Robert Mark Hierons. 2003. A multiple hill climbing approach to software module clustering. In International conference on software maintenance, 315–324. IEEE Computer Society Press.

Mancoridis, S., B.S. Mitchell, C. Rorres, Y. Chen, and E.R. Gansner. 1998. Using automatic clustering to produce high-level system organizations of source code. In International workshop on program comprehension. IEEE Computer Society Press.

Mancoridis, Spiros. 1996. Toward a generic framework for computing subsystem interfaces. In Joint proceedings of the second international software architecture workshop 106–110. ACM Press.

Mancoridis, Spiros, and Richard C. Holt. 1996. Recovering the structure of software systems using tube graph interconnection clustering. In International conference on software maintenance. IEEE Computer Society Press.

Maqbool, O., and H. A. Babri. 2004. The weighted combined algorithm: A linkage algorithm for software clustering. In European conference on software maintenance and reengineering, 15–24. IEEE Computer Society Press.

slide-55
SLIDE 55

Marburger, André, and Dominikus Herzberg. 2001. E-cares research project: Understanding complex legacy telecommunication systems. In European conference on software maintenance and reengineering, 139–147. IEEE Computer Society Press.

Marcus, Andrian, and Jonathan I. Maletic. 2003. Recovering documentation-to-source-code traceability links using latent semantic indexing. In International conference on software engineering, 125–134. IEEE Computer Society Press.

Mendonca, Nabor C., and Jeff Kramer. 1998. Developing an approach for the recovery of distributed software architectures. In International workshop on program comprehension. IEEE Computer Society Press.

Michail, Amir. 2000. Data mining library reuse patterns using generalized association rules. In International conference on software engineering, 167–176. ACM Press.

Milanova, A., A. Rountev, and B. Ryder. 2002. Constructing precise object relation diagrams. In International conference on software maintenance, 586–595. IEEE Computer Society Press.

Mitchell, Brian S., and Spiros Mancoridis. 2001. Craft: A framework for evaluating software clustering results in the absence of benchmark decompositions. In Working conference on reverse engineering, 93–102. IEEE Computer Society Press.

Moe, Johan, and David A. Carr. 2001. Understanding distributed systems via execution trace data. In International workshop on program comprehension, 60–69. IEEE Computer Society Press.

Müller, H. A., and K. Klashinsky. 1985. Rigi—a system for programming-in-the-large. In International conference on software engineering, 80–86. ACM Press.

Murphy, Gail C., Albert Lai, Robert J. Walker, and Martin P. Robillard. 2001a. Separating features in source code: An exploratory study. In International conference on software engineering, 275–284. ACM Press.

slide-56
SLIDE 56

Murphy, Gail C., David Notkin, and Kevin Sullivan. 1995. Software reflexion models: Bridging the gap between source and high-level models. In Proceedings of the third ACM SIGSOFT symposium on the foundations of softw 18–28. New York, NY: ACM Press.

Murphy, Gail C., David Notkin, and Kevin J. Sullivan. 2001b. Software reflexion models: Bridging the gap between design and implementation. IEEE Computer Society Transactions on Software Engineering 27(4).

Müller, H. A., S. R. Tilley, M. A. Orgun, B. D. Corrie, and N. H. Madhavji. 1992. A reverse engineering environment based on spatial and visual software interconnection models. In Proceedings of the fifth ACM SIGSOFT symposium on software development en 88–98. ACM Press.

Niere, Jörg, Wilhelm Schäfer, Jörg P. Wadsack, Lothar Wendehals, and Jim Welsh. 2002. Towards pattern-based design recovery. In International conference on software engineering, 338–348. ACM Press.

Niere, Jörg, Jörg P. Wadsack, and Lothar Wendehals. 2003. Handling large search space in pattern-based reverse engineering. In International workshop on program comprehension, 274–283. IEEE Computer Society Press.

de Oca, Carlos Montes, and Doris L. Carver. 1998. A visual representation model for software subsystem decomposition. In Working conference on reverse engineering. IEEE Computer Society Press.

Pashov, Ilian, Matthias Riebisch, and Ilka Philippow. 2004. Supporting architectural restructuring by analyzing feature models. In European conference on software maintenance and reengineering, 25–34. IEEE Computer Society Press.

Pauw, Wim De, E. Jensen, N. Mitchell, G. Sevitsky, J. Vlissides, and J. Yang. 2001. Visualizing the execution of Java programs. In Proceedings of the international seminar on software visualization, LNCS 2269, 151–162. Springer-Verlag Berlin.

slide-57
SLIDE 57

Pinzger, Martin, and Harald Gall. 2002. Pattern-supported architecture recovery. In International workshop on program comprehension, 53–62. IEEE Computer Society Press.

Rayside, Derek, Steve Reuss, Erik Hedges, and Kostas Kontogiannis. 2000. The effect of call graph construction algorithms for object-oriented programs on automatic clustering. In International workshop on program comprehension. IEEE Computer Society Press.

Richner, T., and S. Ducasse. 2002. Using dynamic information for the iterative recovery of collaborations and roles. In International conference on software maintenance, 34–43. IEEE Computer Society Press.

Richner, Tamar, and Stéphane Ducasse. 1999. Recovering high-level views of object-oriented applications from static and dynamic information. In International conference on software maintenance. IEEE Computer Society Press.

Riva, Claudio, and Jordi Vidal Rodriguez. 2002. Combining static and dynamic views for architecture reconstruction. In European conference on software maintenance and reengineering, 47–56. IEEE Computer Society Press.

Rötschke, T., and R. Krikhaar. 2002. Architecture analysis tools to support evolution of large industrial systems. In International conference on software maintenance, 182–191. IEEE Computer Society Press.

Saeed, M., O. Maqbool, H.A. Babri, S.Z. Hassan, and S.M. Sarwar. 2003. Software clustering techniques and the use of combined algorithm. In European conference on software maintenance and reengineering, 301–310. IEEE Computer Society Press.

Sartipi, Kamran. 2001. Alborz: A query-based tool for software architecture recovery. In International workshop on program comprehension, 115–117. IEEE Computer Society Press.

slide-58
SLIDE 58

Sartipi, Kamran, and Kostas Kontogiannis. 2001. A graph pattern matching approach to software architecture recovery. In International conference on software maintenance, 408–417. IEEE Computer Society Press.

———. 2003. On modeling software architecture recovery as graph matching. In International conference on software maintenance, 224–234. IEEE Computer Society Press.

Sartipi, Kamran, Kostas Kontogiannis, and Farhad Mavaddat. 2000a. Architectural design recovery using data mining techniques. In European conference on software maintenance and reengineering. IEEE Computer Society Press.

———. 2000b. A pattern matching framework for software architecture recovery and restructuring. In International workshop on program comprehension. IEEE Computer Society Press.

Schwanke, Robert W. 1992. An intelligent tool for re-engineering software modularity. In International conference on software engineering. ACM Press.

Seemann, Jochen, and Jürgen Wolff von Gudenberg. 1998. Pattern-based design recovery of Java software. In Proceedings of the ACM SIGSOFT sixth international symposium on foundations 10–16. ACM Press.

Shaw, Mary, and David Garlan. 1993. Advances in software engineering and knowledge engineering, chap. An Introduction to Software Architecture. River Edge, NJ: World Scientific Publishing Company.

Shokoufandeh, A., S. Mancoridis, and M. Maycock. 2002. Applying spectral methods to software clustering. In Working conference on reverse engineering, 3–12. IEEE Computer Society Press.

Siff, M., and T. Reps. 1997. Identifying modules via concept analysis. In International conference on software maintenance, 170–179. IEEE Computer Society Press.

Siff, Michael, and Thomas Reps. 1999. Identifying modules via concept analysis. IEEE Computer Society Transactions on Software Engineering 25(6):749–768.

Snelting, Gregor. 1996. Reengineering of configurations based on mathematical concept analysis. ACM Transactions on Software Engineering and Methodology 5(2):146–189.

Snelting, Gregor, and Frank Tip. 1998. Reengineering class hierarchies using concept analysis. In Proceedings of the ACM SIGSOFT sixth international symposium on foundations 99–110. ACM Press.

Souder, Tim, Spiros Mancoridis, and Maher Salah. 2001. Form: A framework for creating views of program executions. In International conference on software maintenance, 612–621. IEEE Computer Society Press.

Subramaniam, Gokul V., and Eric J. Byrne. 1996. Deriving an object model from legacy Fortran code. In International conference on software maintenance. IEEE Computer Society Press.

Systä, Tarja. 1999. On the Relationships between Static and Dynamic Models in Reverse Engineering Java Software. In Proceedings of the 6th working conference on reverse engineering, 304–313. Atlanta, GA, USA: IEEE Computer Society Press.

Systä, Tarja. 2000. Understanding the behavior of Java programs. In Working conference on reverse engineering, 214–223. IEEE Computer Society Press.

Systä, Tarja, Kai Koskimies, and Hausi Müller. 2001. Shimba—an environment for reverse engineering Java software systems. Software—Practice and Experience, Wiley 31(4):371–394.

Tonella, P., R. Fiutem, G. Antoniol, and E. Merlo. 1996. Augmenting pattern-based architectural recovery with flow analysis: Mosaic—a case study. In Working conference on reverse engineering. IEEE Computer Society Press.

Tonella, Paolo. 2001. Concept analysis for module restructuring. IEEE Computer Society Transactions on Software Engineering 27(4):351–363.

Tonella, Paolo, and Giulio Antoniol. 1999. Object oriented design pattern inference. In International conference on software maintenance. IEEE Computer Society Press.

———. 2001. Inference of object-oriented design patterns. Journal of Software Maintenance and Evolution: Research and Practice 13(5):309–330.

Tonella, Paolo, and Alessandra Potrich. 2003. Reverse engineering of the interaction diagrams from C++ code. In International conference on software maintenance, 159–168. IEEE Computer Society Press.

Tu, Qiang, and Michael W. Godfrey. 2001. The build-time software architecture view. In International conference on software maintenance, 398–407. IEEE Computer Society Press.

Tvedt, R., P. Costa, and M. Lindvall. 2002. Does the code match the design? A process for architecture evaluation. In International conference on software maintenance, 393–403. IEEE Computer Society Press.

Tzerpos, Vassilios. 1997. The orphan adoption problem in architecture maintenance. In Working conference on reverse engineering. IEEE Computer Society Press.

Tzerpos, Vassilios, and Richard C. Holt. 1999. MoJo: A distance metric for software clustering. In Working conference on reverse engineering, 187–196. IEEE Computer Society Press.

———. 2000. On the stability of software clustering algorithms. In International workshop on program comprehension. IEEE Computer Society Press.

Viljamaa, Jukka. 2003. Reverse engineering framework reuse interfaces. In Proceedings of the 9th European software engineering conference held jointly 217–226. ACM Press.

Waters, Robert, and Gregory D. Abowd. 1999. Architectural synthesis: Integrating multiple architectural perspectives. In Working conference on reverse engineering. IEEE Computer Society Press.

Weiser, Mark. 1984. Program slicing. IEEE Computer Society Transactions on Software Engineering 10(4).

Wen, Zhihua, and Vassilios Tzerpos. 2003. An optimal algorithm for MoJo distance. In International workshop on program comprehension, 227–236. IEEE Computer Society Press.

Whaley, John, Michael C. Martin, and Monica S. Lam. 2002. Automatic extraction of object-oriented component interfaces. In Proceedings of the international symposium on software testing and analysis.

Wilde, Norman, Michelle Buckellew, Henry Page, and Vaclav Rajlich. 2001. A case study of feature location in unstructured legacy Fortran code. In European conference on software maintenance and reengineering, 68–77. IEEE Computer Society Press.

Wilde, Norman, and Michael Scully. 1995. Software reconnaissance: Mapping from features to code. Journal on Software Maintenance and Evolution 7:49–62.

Wu, Jingwei, Ahmed E. Hassan, and Richard C. Holt. 2002. Using graph patterns to extract scenarios. In International workshop on program comprehension, 239–248. IEEE Computer Society Press.

Yan, Hong, David Garlan, Bradley Schmerl, Jonathan Aldrich, and Rick Kazman. 2004. DiscoTect: A system for discovering architectures from running systems. In International conference on software engineering, 470–479. ACM Press.

Yeh, Alexander S., David R. Harris, and Melissa P. Chase. 1997. Manipulating recovered software architecture views. In International conference on software engineering, 184–194. ACM Press.

Yeh, Dowming, and Wen-Yuan Kuo. 2002. Reverse engineering aggregation relationship based on propagation of operations. In European conference on software maintenance and reengineering, 223–231. IEEE Computer Society Press.

Zhao, Wei, Lu Zhang, Yin Liu, Jiasu Sun, and Fuqing Yang. 2004. SNIAFL: Towards a static non-interactive approach to feature location. In International conference on software engineering, 293–303. IEEE Computer Society Press.
