 
Defining, Transforming, and Exchanging High-Level Schemas
A guided journey through the outback

Presented by Michael W. Godfrey
Software Architecture Group (SWAG)
Dept of Comp Sci, Univ of Waterloo

This presentation is available from http://plg.uwaterloo.ca/~migod/papers/

WCRE 00 -- High Level Schemas

What is a High-Level Schema?

My answer: Any schema above the statement level

I see two distinct levels of abstraction:
1. Programming language entity level
  – Entities are (shared) fcns, vars, types, classes, …
2. Architectural level
  – Entities are modules, subsystems, classes, interfaces, …

Previous Work

• Lots of
  – motivational work
  – ad hoc extractor snarfing
  – experimental translation mechanisms
• Examples (many others exist)
  – CORUM I and II
  – GRAX
  – TAXForm (TA eXchange FORMat) using Acacia, Rigiparse
  – Rigi using VisualAge C++
  – Dali using Sniff+

My (selfish) goals

• I would like to be able to use other extractors …
  – Want to perform architectural analyses of systems written in languages other than C
  – Want to implement BEAGLE (a tool for exploring software evolution)
• … but extractors differ in languages modelled, level of detail, robustness, bugs, data format, …
  – I want to be able to convert data between tools.
  – Need agreement (awareness) from tool creators

TAXForm Utopia

[Diagram. Labels: System Artifacts; Dali Extractor (SNiFF+), PBS Extractor (cfx), Rigi Extractor (rigiparse); Dali to TAXForm Converter, cfx to TAXForm Converter, Rigi to TAXForm Converter; TAXForm Repository; TAXForm to Rigi Converter, Bunch / TAXForm Converter; Rigi, SHriMP Viewer, Bunch Clustering Tool, PBS Viewer, High-Level and Abstraction Tools.]

Transforming Between Schemas

[Diagram. Labels: Universal; Procedural, Object-Oriented; PL/I, C, C++, Java; Acacia C, PBS C, Rigi C.]
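
One way to read the TAXForm Utopia picture: each tool needs only a converter to and from one shared format, rather than one converter per pair of tools. The following is a minimal sketch of that idea, assuming facts are simple (relation, source, target) triples. The registries, the format names `triples` and `arrows`, and all function names are hypothetical; they are not part of TAXForm or of any tool named in these slides.

```python
from typing import Callable, Dict, List, Tuple

# A fact is a (relation, source, target) triple, e.g. ("calls", "f", "g").
Fact = Tuple[str, str, str]

# Hypothetical registries: one reader and one writer per tool format.
READERS: Dict[str, Callable[[str], List[Fact]]] = {}
WRITERS: Dict[str, Callable[[List[Fact]], str]] = {}


def read_triples(text: str) -> List[Fact]:
    """Parse whitespace-separated 'relation source target' lines."""
    facts: List[Fact] = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # silently skip malformed lines in this sketch
            facts.append((parts[0], parts[1], parts[2]))
    return facts


def write_triples(facts: List[Fact]) -> str:
    """Write facts back as 'relation source target' lines."""
    return "\n".join(" ".join(fact) for fact in facts)


def write_arrows(facts: List[Fact]) -> str:
    """A made-up second format: 'source -relation-> target'."""
    return "\n".join(f"{src} -{rel}-> {dst}" for rel, src, dst in facts)


READERS["triples"] = read_triples
WRITERS["triples"] = write_triples
WRITERS["arrows"] = write_arrows


def convert(text: str, src_format: str, dst_format: str) -> str:
    """Convert by going through the shared fact list (the hub)."""
    return WRITERS[dst_format](READERS[src_format](text))


print(convert("calls f g\nuses f v", "triples", "arrows"))
```

With N formats this needs at most N readers and N writers; direct pairwise conversion would need on the order of N × (N − 1) converters.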
TAXForm: Procedural schema

[Diagram. Entities: Source file, Data Type, Procedure, Data. Relations: defines, uses type, uses data, uses procedure.]

TAXForm: High level schema

[Diagram. Entities: File, Module, Subsystem. Relations: contains, depends-on, uses.]

Back to my (selfish) goals

• Would like to concentrate on procedural and OO languages.
  – Others are interested in COBOL, JCL etc.
• I am interested in high-level info (f calls g)
  – but not in ASGs, code-level metrics
• Need to agree on
  – Syntax
  – Level of granularity and detail
  – What to do in case of X, e.g., X = “missing files”

My schema wish list
[influenced by Acacia’s C and C++ data models]

Top-level programming language entities:
  – functions, variables, constants, type definitions (procedural languages)
  – methods, class member data, static methods and member data (object-oriented languages)

Entity containers:
  – files, modules, classes, packages

My schema wish list

Entity attributes:
  – Name, unique identifier (UID -- see next section)
  – UID of container, UID of containing file (if container is not a file)
  – Signature/data type
  – Line number information (see below)
  – Declared scope/visibility, static or not, final or not
  – Definition or declaration (see below)

Entity container attributes:
  – name, UID
  – relative path (if a file)
  – version identifier (if provided)
  – UID of container (if not a file), UID of containing file (if not a file)

My schema wish list

Relationships:
  – Function calls, variable uses
  – Line number information (see below)
  – Container use/inclusion (by other containers)
  – Inheritance (various kinds)
  – “Friendship”, various template relationships

Relationship attributes:
  – Line number information (see below)
  – Scope/permission of inheritance
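
The wish list above can be read as a small data model. The sketch below is one possible rendering in Python, not a proposed standard: the class names, field names, and the string values used for `kind` are all assumptions chosen for illustration, and line information is reduced to a single optional line number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Container:
    """An entity container: file, module, class, or package."""
    uid: str
    name: str
    kind: str                             # e.g. "file", "module", "class", "package"
    relative_path: Optional[str] = None   # only meaningful if kind == "file"
    version: Optional[str] = None         # if provided
    container_uid: Optional[str] = None   # enclosing container, if not a file
    file_uid: Optional[str] = None        # containing file, if not a file


@dataclass
class Entity:
    """A top-level programming language entity (function, variable, ...)."""
    uid: str
    name: str
    kind: str                             # e.g. "function", "variable", "method"
    container_uid: str
    file_uid: Optional[str] = None        # needed when the container is not a file
    signature: Optional[str] = None       # signature or data type
    line: Optional[int] = None
    visibility: Optional[str] = None      # declared scope/visibility
    is_static: bool = False
    is_final: bool = False
    is_definition: bool = True            # False means declaration only


@dataclass
class Relationship:
    """A relationship between two entities or two containers."""
    kind: str                             # e.g. "calls", "uses", "inherits"
    source_uid: str
    target_uid: str
    line: Optional[int] = None
    inheritance_scope: Optional[str] = None  # only for inheritance


# "f calls g", with both functions defined in the same (hypothetical) file.
main_c = Container(uid="c1", name="main.c", kind="file", relative_path="src/main.c")
f = Entity(uid="e1", name="f", kind="function", container_uid=main_c.uid, line=10)
g = Entity(uid="e2", name="g", kind="function", container_uid=main_c.uid, line=42)
f_calls_g = Relationship(kind="calls", source_uid=f.uid, target_uid=g.uid, line=12)
```

Referring to containers and files by UID rather than by object keeps each record flat, which is what makes it straightforward to write out in a textual exchange format.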
Problems

Some technical problems:
  – UID generation? (name-mangling?)
  – Line numbering (ranges)?
  – Incomplete information?
    • ill-formed code, gcc/K&R-isms
    • missing header files
    • resolving entity use to dfn/dcl (esp. with polymorphism, overloading)
  – Pre or post preprocessing?

Problems

We’ve had these conversations before …

“Getting academics to agree on anything is like herding cats.”

Example Extractors/Systems

Others:
• Rigi [UVictoria]
• SPOOL [UMontréal]
• Datrix [Bell Canada]
• MOOSE [UBern]
• SHORE [SD&M]
• Neuhold [UVienna]
• VisualAge C++ [IBM]
• … [many others]

Included here:
• PBS [UWloo]
• Acacia [AT&T]
• cxref, ctags, cscope
• TA++ [UOttawa]
• BAUHAUS [UStuttgart]
• GUPRO [UKoblenz]

Dimensions of Variation

• Intended use
  – Level of schema (entity level, architectural level, or mixed)
  – Amount of detail
• Languages modelled
  – Multi-lingual
  – Common super schemas
  – Explicit model “cross-overs” (e.g., JCL, embedded SQL)
• Hidden assumptions
  – Known limitations
• Notation/approach to store factbase
  – Support for translations and transformations
• What’s particularly novel and noteworthy

PBS C Language Entities

[Diagram not recoverable from the extracted text.]

PBS [Holt et al. @ UWaterloo]

• Portable Bookshelf is a reverse engineering tool for creating software architecture models of large systems:
  – Guinea pigs: Mozilla, Linux, Apache, VIM, Mitel, TOBEY, …
• Consists of fact extractor, fact manipulation engine (“grok”), and visualization tool (“landscape”)

source code → cfx → entity-level facts → grok → architectural facts → landscape viewer
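
The step from entity-level facts to architectural facts in the PBS pipeline can be pictured as "lifting" a relation along containment: if f calls g, then the file containing f depends on the file containing g, and likewise for modules. The sketch below shows that idea with plain Python sets. It is not grok's own language or API, and every relation and entity name in it is made up for illustration.

```python
from typing import Set, Tuple

# A binary relation is a set of (source, target) pairs.
Rel = Set[Tuple[str, str]]


def compose(r: Rel, s: Rel) -> Rel:
    """Relational composition: (a, d) whenever (a, b) in r and (b, d) in s."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}


def inverse(r: Rel) -> Rel:
    """Swap the direction of every pair."""
    return {(b, a) for (a, b) in r}


def lift(contains: Rel, rel: Rel) -> Rel:
    """Lift rel from contained items to their containers, dropping self-loops."""
    lifted = compose(compose(contains, rel), inverse(contains))
    return {(a, b) for (a, b) in lifted if a != b}


# Entity-level facts (hypothetical).
calls: Rel = {("main", "parse"), ("parse", "lex"), ("lex", "next_char")}
file_contains: Rel = {
    ("main.c", "main"),
    ("parse.c", "parse"),
    ("lex.c", "lex"),
    ("lex.c", "next_char"),
}
module_contains: Rel = {
    ("driver", "main.c"),
    ("frontend", "parse.c"),
    ("frontend", "lex.c"),
}

file_deps = lift(file_contains, calls)        # file depends-on file
module_deps = lift(module_contains, file_deps)  # module uses module

print(sorted(file_deps))
print(sorted(module_deps))
```

Calls that stay inside one file or one module disappear at the next level up, which is why the lifted view is so much smaller than the extracted facts.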
PBS C Language E/R View

[Diagram not recoverable from the extracted text.]

PBS Architectural Schema

[Diagram not recoverable from the extracted text.]

Acacia [Chen, Gansner et al. @ AT&T]

• History:
  – CIA → CIAO → Acacia
• Consists of
  – C and C++ extractors
  – SQL-like query engine
  – visualization with auto-layout

Acacia C++/C Schemas

• Entity attributes:
  – Hex UID, name, kind (file, function, type, var, macro), filename, datatype (string), typeclass (enum, struct, etc.), linenum info for def/dec, def/dec/undef, param list, template info, scope, storage spec (static, const, inline, inline virtual, etc.), signature
• Relationship attributes:
  – Linenum info, rel. kind (refers, contains, inherits, instantiates, typedef, etc.), relationship scope

ctags, cxref, cscope

• These are “open source” Unix tools that perform extractions:
  – ctags extracts only entity info
    • e.g., file, name, line num, kind, etc.
    • works with C, C++, Eiffel, Fortran, and Java.
    • Used for fast context switching while editing source code with vim/emacs
  – cxref generates cross-reference table for C systems.
    • Often used for webifying source code (e.g., Linux, Mozilla).
  – cscope used for program comprehension of C systems (e.g., who calls f, who uses v)
    • Older commercial Unix tool, recently open sourced.

Acacia Queries

• SQL-like queries for entities and relationships produce “;” delimited textual output:

  % ksh cdef -u fu closeTagFile
  26f53ece;closeTagFile;function;entry.h;void;regular;83;0;83;dec;00000000;(const boolean);;extern;;;;
  76e7ae31;closeTagFile;function;entry.c;void;regular;551;553;563;def;00000000;(const boolean);;extern;;;;

  % ksh cref -u - - m - file2='osdeps.h'
  <all entity1 attrs> ; <all entity2 attrs> ; <rel attrs>
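
Because the Acacia query output is plain “;”-delimited text, a downstream tool can consume it with very little code. The sketch below splits one record from the `cdef` output shown on the slide. The names given to the first five fields are guesses based on the entity attribute list (UID, name, kind, filename, datatype), and the position of the def/dec flag is inferred from the two sample records only; neither is taken from Acacia's documentation.

```python
from typing import Dict, List, Optional

# ASSUMPTION: names for the leading fields, guessed from the slide's
# attribute list. The remaining fields are kept unnamed.
LEADING_FIELDS: List[str] = ["uid", "name", "kind", "file", "datatype"]

# ASSUMPTION: in both sample records the def/dec flag is the tenth field.
DEF_DEC_INDEX = 9


def parse_record(line: str) -> Dict[str, object]:
    """Split one ';'-delimited entity record into a dictionary."""
    fields = line.strip().split(";")
    record: Dict[str, object] = dict(zip(LEADING_FIELDS, fields))
    flag: Optional[str] = fields[DEF_DEC_INDEX] if len(fields) > DEF_DEC_INDEX else None
    record["def_or_dec"] = flag
    record["raw"] = fields  # keep everything so no information is lost
    return record


sample = (
    "76e7ae31;closeTagFile;function;entry.c;void;regular;551;553;563;"
    "def;00000000;(const boolean);;extern;;;;"
)
rec = parse_record(sample)
print(rec["name"], rec["kind"], rec["file"], rec["def_or_dec"])
```

Keeping the raw field list alongside the named fields means a converter can pass unknown attributes through unchanged instead of dropping them.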
TA++ [Lethbridge et al. @ UOttawa]

• TKSee aids programming comprehension
  – i.e., what programmers do all day
  – TA++ is the data modelling language
• Want “full story” from the source code:
  – Want pre-preprocessing view of code for all platforms and environments (text editor’s view)
  – … but most extractors use a compiler front end and preprocess toward a particular target and option set
    • Some extractors keep some macro info

TA++ Entities

[Diagram not recoverable from the extracted text.]

TA++ Relationships

[Diagram not recoverable from the extracted text.]

TA++ Combined E/R Model

[Diagram not recoverable from the extracted text.]

BAUHAUS [Koschke et al. @ UStuttgart]

• Software architecture recovery system
  – Parse code, look for hidden/decayed abstractions, then redesign
  – Uses various heuristics to perform “clustering”
  – Works both at entity level and subsystem level
• Built from many tools …
  – … including Rigi viewer and a customized C parser/extractor that (optionally) dumps RSF
• Example WoSEF problem:
  – Cannot derive full includes hierarchy from Bauhaus extracted facts; this was a design decision, as the researchers were not interested in this information

BAUHAUS Entities

[Diagram not recoverable from the extracted text.]