
Domain-Specific Languages for Program Analysis, Mark Hills, OOPSLE



  1. Domain-Specific Languages for Program Analysis
     Mark Hills
     OOPSLE 2015: Open and Original Problems in Software Language Engineering
     March 6, 2014, Montreal, Canada
     http://www.rascal-mpl.org

  2. Overview
     • A Starting Example: DCFlow
     • Other Early-Stage Ideas
       • Summary extraction from documentation
       • Trace processing
     • Discussion

  3. Say you need a control flow graph…
     [Figure: two control flow graphs. Each runs from entry through x := 3 to a branch on x, whose true and false edges lead to y := 10 or y := 15, and then to exit.]

  4. Building control flow graph extractors
     • First, define how to represent control flow graphs
     • Then, pick a language (hopefully we can reuse the first part for different languages, but maybe not…)
     • Next, define the control flow rules, using your favorite language (such as Rascal, of course…)
     • Finally, define something that uses the graph; this makes sure the data structure is rich enough to be useful as well…
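
The first and last steps above can be sketched in a few lines. This is only an illustration in Python, not the Rascal code behind the talk: the `CFG` class, the `reachable` client, and the node labels are all hypothetical names, and which branch assigns which value is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class CFG:
    """A control flow graph: node labels plus labeled directed edges."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, target, label) triples

    def add_edge(self, src, dst, label=""):
        self.nodes.update((src, dst))
        self.edges.add((src, dst, label))

    def successors(self, node):
        return {dst for (src, dst, _) in self.edges if src == node}


def reachable(cfg, start):
    """A small client of the graph: every node reachable from start."""
    seen, work = set(), [start]
    while work:
        node = work.pop()
        if node not in seen:
            seen.add(node)
            work.extend(cfg.successors(node))
    return seen


# Hand-built graph for: x := 3; if x then y := 10 else y := 15
cfg = CFG()
cfg.add_edge("entry", "x := 3")
cfg.add_edge("x := 3", "x")
cfg.add_edge("x", "y := 10", "true")
cfg.add_edge("x", "y := 15", "false")
cfg.add_edge("y := 10", "exit")
cfg.add_edge("y := 15", "exit")
```

Writing even a trivial client such as `reachable` is a quick check that the representation exposes what an analysis needs (here, successor lookup).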

  5. What if we want to work with another language?
     • May be able to reuse base CFG definition (but maybe not)
     • Cannot reuse flow definition (unless CFG def is the same and features have identical semantics: the flow rules are specific to the features being defined)
     • Cannot easily reuse analysis (since CFG definition and semantics differ)

  6. What if we want to work with another language? (continued)
     So, we write the entire thing over again (and again, and again…)

  7. DCFlow: Declarative Control Flow
     • Declarative DSL for defining control flow rules
     • Generates Rascal code to build intraprocedural control flow graphs with reusable library of CFG concepts
     • Provides basic visualization to allow graphs to be rendered in GraphViz dot
     • Provides ignore mechanism to indicate which language constructs we are not trying to define
     • IDE provides basic checking to aid user (with more coming)

  8. DCFlow Architecture
     [Figure: architecture diagram. Components: DCFlow Definition; DCFlow Translator (Rascal); CFG Builder Modules (Rascal); Language-Specific Functions (Rascal); Source Program (Input Language); DCFlow Libraries (Rascal); CFG Construction (Rascal); Control Flow Graphs (Rascal); CFG Visualization (Rascal); GraphViz Visualizations (GraphViz, dot).]

  9. Building up an example: plus
     • What should plus do?
       binaryOperation(Expr left, Expr right, plus())

  10. Building up an example: plus (continued)
     • Run left, then run right, then add them together
       rule EXP::add = left --> right --> self;

  11. Building up an example: plus (continued)
     • That’s it!
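
One way to read the rule `left --> right --> self` is sketched below in Python. This is an interpretation, not DCFlow's actual semantics or output: it assumes `-->` links the last node of one operand to the first node of the next, and that `self` is a node for the addition itself. The tuple-based AST and the `flow` and `render` names are hypothetical.

```python
def render(expr):
    """Label for an expression: leaves are strings, additions are tuples."""
    if isinstance(expr, str):
        return expr
    _, left, right = expr
    return f"({render(left)} + {render(right)})"


def flow(expr):
    """Return (first, last, edges) for a tiny expression AST.

    A leaf is a string; an addition is ("plus", left, right).
    Labels double as node identities, so repeated leaves would collide.
    """
    if isinstance(expr, str):
        return expr, expr, []
    _, left, right = expr
    l_first, l_last, l_edges = flow(left)
    r_first, r_last, r_edges = flow(right)
    self_node = render(expr)
    # left --> right --> self
    edges = l_edges + [(l_last, r_first)] + r_edges + [(r_last, self_node)]
    return l_first, self_node, edges


first, last, edges = flow(("plus", "a", ("plus", "b", "c")))
```

For the nested example, control starts at `a`, runs through `b` and `c`, reaches the inner addition, and ends at the outer addition.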

  12. Something more complex: while loops
     • What should while do?
       \while(Expr cond, list[Stmt] body)

  13. Something more complex: while loops (continued)
     • The exp is the first and last thing we should do
     • A footer is useful as a target for break and continue
     • We need a back-edge, and it would be nice to label others

  14. Something more complex: while loops (continued)
       rule STATEMENT::whileStat = create(footer), ^exp -conditionTrue-> body -backedge-> exp, exp -conditionFalse-> $footer;
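
The edges this rule describes can be sketched in Python as follows. The reading of the notation is a guess based on the bullets in the while-loop example: `create(footer)` is taken to add a synthetic footer node, `^exp` to mark the condition as the entry point, and `$footer` to refer to the created node. The `while_flow` function, the `loop_id` parameter, and the footer naming are hypothetical.

```python
def while_flow(cond, body, loop_id=0):
    """Return (entry, exit, edges) for a while loop.

    cond is the label of the condition; body is a list of statement labels
    that run in sequence. Edges are (source, target, label) triples.
    """
    footer = f"footer#{loop_id}"  # synthetic node giving the loop one exit
    edges = []
    if body:
        edges.append((cond, body[0], "conditionTrue"))
        # Straight-line flow through the body statements.
        edges.extend((a, b, "") for a, b in zip(body, body[1:]))
        edges.append((body[-1], cond, "backedge"))
    else:
        # Empty body: the loop spins on the condition alone.
        edges.append((cond, cond, "backedge"))
    edges.append((cond, footer, "conditionFalse"))
    return cond, footer, edges


entry, exit_node, loop_edges = while_flow("i < 10", ["s := s + i", "i := i + 1"])
```

Because the footer is a real node, whatever follows the loop needs only one incoming connection, no matter how many paths leave the loop.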

  15. Design Decisions
     • Focus on abstract syntax trees (should almost work on Rascal concrete syntax, but there are some differences)
     • Leverage reified types for generation and checking
     • Try to ensure added features are general; don’t want to add something just because PHP or Java needs it
     • Make sure generated code is understandable: it should look close to what you would write yourself

  16. How about for other domains?
     • Idea 1: Program tracing
       • Internal DSL: goal is to build this as a library in Rascal
       • Allow filter functions to keep or discard events of interest
       • Use closures to support registration of handlers for specific events or event patterns
       • What we have now: rudimentary tracing for PHP programs using Rascal and xdebug (running over TCP sockets)
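
The filter-and-handler design could look roughly like the sketch below. It is written in Python rather than Rascal, and everything in it is hypothetical: the `Tracer` class, its method names, and the dictionary-shaped events are illustrations and do not reflect the xdebug protocol or the existing prototype.

```python
class Tracer:
    """Dispatches trace events to handlers, after applying filters."""

    def __init__(self):
        self._filters = []   # predicates; an event must pass all of them
        self._handlers = []  # (pattern, handler) pairs

    def keep_if(self, predicate):
        self._filters.append(predicate)

    def on(self, pattern, handler):
        """Register handler for events whose fields match pattern."""
        self._handlers.append((pattern, handler))

    def emit(self, event):
        """Return True if the event was kept, False if it was discarded."""
        if not all(keep(event) for keep in self._filters):
            return False
        for pattern, handler in self._handlers:
            if all(event.get(key) == value for key, value in pattern.items()):
                handler(event)
        return True


def make_call_counter():
    """Build a handler that closes over its own counts dictionary."""
    counts = {}

    def handler(event):
        counts[event["name"]] = counts.get(event["name"], 0) + 1

    return counts, handler


tracer = Tracer()
tracer.keep_if(lambda e: e.get("file") != "vendor.php")  # discard library code
counts, count_call = make_call_counter()
tracer.on({"kind": "call"}, count_call)

for event in [
    {"kind": "call", "name": "main", "file": "app.php"},
    {"kind": "call", "name": "helper", "file": "vendor.php"},
    {"kind": "return", "name": "main", "file": "app.php"},
    {"kind": "call", "name": "main", "file": "app.php"},
]:
    tracer.emit(event)
```

The closure keeps the counting state private to the handler, so registering a second counter with a different pattern would not interfere with the first.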

  17. How about for other domains?
     • Idea 2: Summary extraction
       • Libraries make it harder to analyze code; we may not know what these libraries actually do
       • Extract function/procedure/method summaries from existing documentation: basic info such as signatures, types, maybe ability to attach more advanced info
       • No work on this yet, still deciding what makes sense; currently works for PHP by extracting a very generic HTML representation and using Rascal to match over it
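
As a rough illustration of the idea, the Python sketch below pulls a function signature out of a documentation fragment. The HTML snippet is invented for the example and is not the markup of any real manual; `extract_summary`, `TextExtractor`, and the `SIGNATURE` pattern are hypothetical, and the pattern only covers simple "type name ( type $param, ... )" signatures.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text content of an HTML fragment, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# return-type, function name, then a parenthesized parameter list
SIGNATURE = re.compile(r"(?P<ret>\w+)\s+(?P<name>\w+)\s*\((?P<params>[^)]*)\)")


def extract_summary(html):
    """Return a dict with name, return type, and parameters, or None."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())  # normalize whitespace
    match = SIGNATURE.search(text)
    if match is None:
        return None
    params = []
    for part in match.group("params").split(","):
        part = part.strip()
        if part:
            ptype, _, pname = part.rpartition(" ")
            params.append((ptype, pname))
    return {"name": match.group("name"),
            "returns": match.group("ret"),
            "params": params}


# Invented documentation fragment, for illustration only.
doc = ('<div class="sig"><span>int</span> <b>strlen</b> '
       '( <span>string</span> <code>$string</code> )</div>')
summary = extract_summary(doc)
```

Flattening the markup to text first mirrors the "very generic HTML representation" approach: the matching step does not depend on which tags the documentation happens to use.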

  18. Related work
     • “Extensible intraprocedural flow analysis at the abstract syntax tree level”, Söderberg, Ekman, Hedin, Magnusson
       • Uses attribute grammars to represent control flow
       • Reference attributes represent edges
       • Collection attributes represent inverse relations (e.g., pred)
       • Higher-order attributes allow building new AST nodes (e.g., entry and exit)

  19. Related work
     • Spoofax: NaBL, language for incremental type checking
     • DHAL and variants for data flow analysis
     • Related conceptually: use domain-specific languages for specific analysis-related tasks
     • Direct language support: Rascal, TXL, Spoofax, ASF+SDF, etc.

  20. Discussion

  21. Discussion: Some possible topics…
     • What opportunities are there for creating DSLs for program analysis? Which parts of the process would be best for this?
     • Which is best: internal or external? What circumstances drive this?
     • Is this even a good idea? Why not just use Rascal (or something else, if you must…)?

  22. Which design decisions are important?
     • Focus on abstract syntax trees (should almost work on Rascal concrete syntax, but there are some differences)
     • Leverage reified types for generation and checking
     • Try to ensure added features are general; don’t want to add something just because PHP or Java needs it
     • Make sure generated code is understandable: it should look close to what you would write yourself
