Program Analysis



  1. Program Analysis
     Extracting static and dynamic information from a software system

     Program Analysis
     • Extracting information, in order to present abstractions of, or answer questions about, a software system
     • Static Analysis: Examines the source code
     • Dynamic Analysis: Examines the system as it is executing

     What are we looking for?
     • Depends on our goals and the system
     • In almost any language, we can find out information about variable usage
     • In an OO environment, we can find out which classes use other classes, which are a base of an inheritance structure, etc.
     • We can also find potential blocks of code that can never be executed in running the program (dead code)
     • Typically, the information extracted is in terms of entities and relationships

     Entities
     • Entities are individuals that live in the system, and attributes associated with them.
     • Some examples:
       • Classes, along with information about their superclass, their scope, and where in the code they exist.
       • Methods/functions and what their return type or parameter list is, etc.
       • Variables and what their types are, and whether or not they are static, etc.

     Relationships
     • Relationships are interactions between the entities in the system.
     • Relationships include:
       • Classes inheriting from one another.
       • Methods in one class calling the methods of another class, and methods within the same class calling one another.
       • A method referencing an attribute.

     Information format
     • Many different formats in use
     • Simple but effective: RSF
         inherit TRIANGLE SHAPE
     • TA is an extension of RSF that includes a schema
         $INSTANCE SHAPE Class
     • GXL is an XML-like extension of TA. A blow-up factor of 10 or more makes it rather cumbersome
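The RSF line `inherit TRIANGLE SHAPE` above is a relation name followed by two entities. As a minimal sketch, assuming whitespace-separated triples with one fact per line, such a factbase could be loaded into per-relation sets of pairs. The function name `parse_rsf` and the facts other than `inherit TRIANGLE SHAPE` are hypothetical and used only for illustration.

```python
from collections import defaultdict


def parse_rsf(text):
    """Group RSF-style triples 'relation source target' by relation name."""
    relations = defaultdict(set)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            # Skip blank or malformed lines instead of guessing their meaning.
            continue
        relation, source, target = parts
        relations[relation].add((source, target))
    return relations


# Only the first fact comes from the slide; the others are made up.
facts = parse_rsf("""
inherit TRIANGLE SHAPE
inherit SQUARE SHAPE
call draw area
""")

print(sorted(facts["inherit"]))
```

Keeping each relation as a set of pairs makes the later relational operations (union, composition, inverse) straightforward to express.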

  2. Static Analysis
     • Involves parsing the source code
     • Usually creates an Abstract Syntax Tree
     • Borrows heavily from compiler technology but stops before code generation
     • Requires a grammar for the programming language
     • Can be very difficult to get right

     CppETS
     • CppETS is a benchmark for C++ extractors
     • It consists of a collection of C++ programs that pose various problems commonly found in parsing and reverse engineering
     • Static analysis research tools typically get about 60% of the problems right

     Example program
         #include <iostream.h>
         class Hello {
         public: Hello(); ~Hello();
         };
         Hello::Hello()
         { cout << "Hello, world.\n"; }
         Hello::~Hello()
         { cout << "Goodbye, cruel world.\n"; }
         main() {
           Hello h;
           return 0;
         }

     Example Q&A
     • How many member methods are in the Hello class?
       Two, the constructor Hello::Hello() and destructor Hello::~Hello()
     • Where are these member methods used?
       The constructor is called implicitly when an instance of the class is created. The destructor is called implicitly when the execution leaves the scope of the instance.

     Static analysis in IDEs
     • Eclipse displays compilation warnings and errors on the fly, e.g. unused variables
     • EiffelStudio automatically creates BON diagrams of the static structure of Eiffel systems
     • Rational Rose, as well as some Eclipse plugins, do the same with UML and Java
     • Reverse engineers have many other uses for static facts

     Static analysis pipeline [diagram]
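The parse-then-extract idea above can be tried without a C++ front end. The following is a rough sketch that uses Python's standard `ast` module on a Python analogue of the `Hello` class, not on the C++ program itself. The fact names `Method` and `contain` and the function `extract_entities` are assumptions chosen to resemble the TA-style facts in these slides.

```python
import ast

# A Python analogue of the slide's Hello class (assumption: __init__ and
# __del__ stand in for the C++ constructor and destructor).
SOURCE = '''
class Hello:
    def __init__(self):
        print("Hello, world.")

    def __del__(self):
        print("Goodbye, cruel world.")
'''


def extract_entities(source):
    """Parse the source into an AST and emit (relation, a, b) facts."""
    tree = ast.parse(source)
    facts = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            facts.append(("$INSTANCE", node.name, "Class"))
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    facts.append(("$INSTANCE", item.name, "Method"))
                    facts.append(("contain", node.name, item.name))
    return facts


facts = extract_entities(SOURCE)
methods = [b for (rel, a, b) in facts if rel == "contain" and a == "Hello"]
print(len(methods), methods)
```

The source is never executed here: the answer to "how many member methods are in the Hello class?" comes purely from the syntax tree.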

  3. Dynamic Analysis
     • Provides information about the run-time behaviour of software systems, e.g.
       • Component interactions
       • Event traces
       • Concurrent behaviour
       • Code coverage
       • Memory management
     • Can be done with a profiler or a debugger

     Instrumentation
     • Augments the subject program with code that transmits events to a monitoring application, or writes relevant information to an output file
     • A profiler can be used to examine the output file and extract relevant facts from it
     • Instrumentation affects the execution speed and storage space requirements of the system

     Instrumentation process [diagram]

     Dynamic analysis pipeline [diagram]

     Non-instrumented approach
     • One can also use debugger log files to obtain dynamic information
     • Disadvantage: Limited amount of information provided
     • Advantage: Less intrusive approach, more accurate performance measurements

     Dynamic analysis issues
     • Ensuring good code coverage is a key concern
     • A comprehensive test suite is required to ensure that all paths in the code will be exercised
     • Results may not generalize to future executions
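Instrumentation as described above can be illustrated on a very small scale. This is a minimal sketch, assuming the "monitoring application" is just an in-memory list: a decorator wraps each subject function with code that records enter and exit events. The names `instrument`, `EVENTS`, `area` and `describe` are hypothetical.

```python
import functools

# Stand-in for the monitoring application or output file.
EVENTS = []


def instrument(func):
    """Wrap func so that every call appends enter/exit events to EVENTS."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        EVENTS.append(("enter", func.__name__))
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even if func raises, so the trace stays balanced.
            EVENTS.append(("exit", func.__name__))
    return wrapper


@instrument
def area(width, height):
    return width * height


@instrument
def describe(width, height):
    return f"area={area(width, height)}"


result = describe(2, 3)
for event in EVENTS:
    print(event)
```

The extra list appends on every call are a small-scale version of the cost noted above: instrumentation changes both the running time and the storage needs of the subject program.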

  4. Static vs. Dynamic
     • Static
       • Reasons over all possible behaviours (general results)
       • Conservative
       • Challenge: Choose good abstractions
     • Dynamic
       • Observes a small number of behaviours (specific results)
       • Precise and fast
       • Challenge: Select representative test cases

     SWAGKit
     • SWAGKit is used to generate software landscapes from source code
     • Based on a pipeline architecture with three phases
       • Extract (cppx, bfx, javex)
       • Manipulate (prep, linkplus, layoutplus)
       • Present (lsedit)
     • Currently usable for programs written in C/C++ and Java

     The SWAGKit Pipeline [diagram]

     CPPX
     • C/C++ fact extractor based on gcc
     • Extracts facts from one source file at a time
     • Facts represent program information in TA format, e.g.
         $INSTANCE x integer
     • Can pass normal gcc parameters using the -g option
     • In the assignment, we will see two other fact extractors, bfx and javex. They extract facts from compiled code, C and Java respectively.

     Prep
     • Prep is a series of scripts written in Grok
     • Function is to “clean up” facts from cppx so they are in a form which can be usable by the rest of the pipeline.

     Grok
     • A simple scripting language
     • A relational algebraic calculator
     • Powerful in manipulating binary relations
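The static versus dynamic contrast above can be made concrete with a small experiment. The sketch below, under the assumption that a toy Python program is an acceptable stand-in for a real subject system, collects the callees of one function twice: statically from the syntax tree (every call that could happen) and dynamically with `sys.setprofile` during a single run (only the calls that did happen). The subject program and the helper names `static_callees` and `dynamic_callees` are hypothetical.

```python
import ast
import sys

SOURCE = '''
def log(msg):
    return msg

def fast_path(x):
    return x + 1

def slow_path(x):
    return log(x) * 2

def handle(x):
    if x >= 0:
        return fast_path(x)
    return slow_path(x)
'''


def static_callees(source, func_name):
    """All functions named in call expressions inside func_name."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            return {
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            }
    return set()


def dynamic_callees(source, func_name, arg):
    """Functions actually called by func_name during one execution."""
    namespace = {}
    exec(compile(source, "<subject>", "exec"), namespace)
    observed = set()

    def profiler(frame, event, _arg):
        # Only Python-level calls made from code in the subject program.
        if event == "call" and frame.f_code.co_filename == "<subject>":
            caller = frame.f_back
            if caller is not None and caller.f_code.co_name == func_name:
                observed.add(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        namespace[func_name](arg)
    finally:
        sys.setprofile(None)
    return observed


static_result = static_callees(SOURCE, "handle")
dynamic_result = dynamic_callees(SOURCE, "handle", 1)
print(sorted(static_result), sorted(dynamic_result))
```

With the single test input `1`, the dynamic run never reaches `slow_path`, which is the "select representative test cases" challenge in miniature; the static result includes it whether or not any execution takes that branch.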

  5. Grok Script (1)
         cat := {"Garfield", "Fluffy"}
         mouse := {"Mickey", "Nancy"}
         cheese := {"Roquefort", "Swiss"}
         animals := cat + mouse
         food := mouse + cheese
         animalsWhichAreFood := animals ^ food
         animalsWhichAreNotFood := animals - food
         animalsWhichAreFood
         animals - food
         #food
         mouse <= food

     Grok Script (2)
         chase := cat X mouse
         chase
         eat := chase + mouse X cheese
         eat

     Grok Scripts (3)
         {"Mickey"} . eat
         eat . {"Mickey"}
         eater := dom eat
         food := rng eat
         chasedBy := inv chase
         topOfFoodChain := dom eat - rng eat
         bottomOfFoodChain := rng eat - dom eat
         bothEatAndChase := eat ^ chase
         eatButNotChase := eat - chase
         chaseButNotEat := chase - eat
         secondOrderEat := eat o eat
         anyOrderEat := eat+

     A more real example
     Factbase rawFacts.rsf:
         contain a.c f1
         contain a.c f2
         contain b.c f3
         contain b.c f4
         call f1 f2
         call f2 f3
         call f3 f4
     We need to compute call relations between files

     A bigger real example
     Input: A nested partition of a set of objects
     Output: A flattened version of the original partition
         containFacts := $1
         getdb containFacts
         d := dom contain
         r := rng contain
         e := ent contain
         roots := d - r
         leaves := r - d
         toKeep := roots + leaves
         toDelete := e - toKeep
         cc := contain+
         delset toDelete
         delrel contain
         contain := cc
         relToFile contain $2

     linkplus
     • Function is to link all facts into one large graph
     • Combines facts residing in separate files
     • Resolves inter-compilation unit relationships
     • Merges header files together
     • Does some cleanup to shrink final graph
     • Usage: linkplus list-of-files-to-link
     • Produces out.ln.ta
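The slides state the goal for rawFacts.rsf (call relations between files) but not the solution. One plausible formulation, which is an assumption and not taken from the slides, is to lift `call` through `contain`, which in Grok notation would read roughly as `contain o call o inv contain`. The sketch below models binary relations as Python sets of pairs; the helper names `compose`, `inverse` and `transitive_closure` are hypothetical stand-ins for Grok's `o`, `inv` and postfix `+`.

```python
def compose(r, s):
    """Relational composition: (a, d) whenever (a, b) in r and (b, d) in s."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}


def inverse(r):
    """Swap the two columns of a binary relation."""
    return {(b, a) for (a, b) in r}


def transitive_closure(r):
    """Smallest transitive relation containing r (one or more steps)."""
    closure = set(r)
    while True:
        extra = compose(closure, r) - closure
        if not extra:
            return closure
        closure |= extra


# Facts copied from rawFacts.rsf on the slide.
contain = {("a.c", "f1"), ("a.c", "f2"), ("b.c", "f3"), ("b.c", "f4")}
call = {("f1", "f2"), ("f2", "f3"), ("f3", "f4")}

# File X calls file Y if X contains a function that calls a function in Y.
file_call = compose(compose(contain, call), inverse(contain))
cross_file_call = {(a, b) for (a, b) in file_call if a != b}
reachable = transitive_closure(call)

print(sorted(file_call))
print(sorted(cross_file_call))
```

Filtering out pairs where both files are the same is a design choice: it leaves only the inter-file dependency from `a.c` to `b.c`, which is usually what a landscape view is meant to show.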

  6. layoutplus
     • Adds
       • Clustering of facts based on contain.rsf (created manually or from a clustering algorithm)
       • Layout information so that graph can be displayed
       • Schema information
     • Usage: layoutplus contain.rsf out.ln.ta
     • Produces out.ls.ta

     lsedit
     • View software landscape produced by previous parts of the pipeline
     • Can make changes to landscape and save them
     • Usage: lsedit out.ls.ta
