refinement based context sensitive points to analysis for
play

Refinement-Based Context-Sensitive Points-To Analysis for Java Manu - PowerPoint PPT Presentation

Refinement-Based Context-Sensitive Points-To Analysis for Java Manu Sridharan, Rastislav Bodk UC Berkeley PLDI 2006 1 What Does Refinement Buy You? Increased scalability: enable new clients Memory: orders of magnitude savings Time:


  1. Refinement-Based Context-Sensitive Points-To Analysis for Java Manu Sridharan, Rastislav Bodík UC Berkeley PLDI 2006 1

  2. What Does Refinement Buy You? Increased scalability: enable new clients • Memory: orders of magnitude savings • Time: answer for a variable comes back in 1 second • ) Suitable for IDE Cast Safety Client Precision : 2

  3. Approach: Focus on the Client Demand-driven: only do requested work Client-driven refinement: stop when client satisfied Example: • client asks: “can x point to o?” • we refine until we answer NO (the good answer) or we time out 3

  4. Context-Sensitive Analysis Costly Context-sensitive analysis (def): • Compute result as if all calls inlined • But, collapse recursive methods Exponential blowup (code growth) 4

  5. Why Not Existing Technique? Most analyses approximate same way in all code • E.g., k-CFA • Precision lost, esp. for data structures Our analysis focuses precision where it matters • Fully precise in the limit • Only small amount of code analyzed precisely • First refinement algorithm for Java 5

  6. Points-To Analysis Overview Compute objects each variable can point to For each var x, points-to set pt(x) Model objects with abstract locations 1: x = new Foo() yields pt(x) = { o 1 } Flow-insensitive: statements in any order 6

  7. Points-To Analysis as CFL-Reachability 1) Assignments x = new Obj(); // o 1 y = new Obj(); // o 2 z = x; o 1 x z a 2) Method calls ( 1 [ f ) 1 id(p) { return p; } ] f a = id(x); d c p id ret id b = id(y); [ g ) 2 ( 2 3) Heap accesses o 2 y b c.f = x; c.g = y; d = c.f; pt(x) = { o | o flowsTo x } flowsTo : path exists flowsTo : balanced call and field parens flowsTo : balanced call parens 7

  8. Summary of Formulation Graph represents program Compute reachability with two filters • Language of balanced call parens • Language of balanced field parens 8

  9. Single path problem … … t 9 t 7 ] j [ p t 8 ) 8 t 6 … t 5 ) 5 ( 7 ( 1 ) 1 o t 0 t 1 t 2 x [ h [ f t 12 ] k [ f ( 1 ) 1 [ h ] g o 2 t 10 Problem: show path is unbalanced t 11 Goal: reduce number of visited edges Insight: enough to find one unbalanced paren 9

  10. Approximation via Match Edges o t 0 t 1 t 2 t 3 t 4 x [ g ] h [ f [ h ] j ] f [ f [ g [ h ] h ] j ] f Match edges connect matched field parens • From source of open to sink of close • Initially, all pairs connected Use match edges to skip subpaths 10

  11. Refining the Approximation o t 0 t 1 t 3 t 4 x [ g [ f ] j ] f [ f [ g [ h ] h ] j ] f Refine by removing some match edges • Exposes more of original path for checking Soundness: Traverse match edge ) assume field parens balanced on skipped path Remove where unbalanced parens expected • Explore deeper levels of pointer indirection 11

  12. Refinement With Both Languages ( 2 ) 3 ( 1 ) 1 o t 0 t 1 t 2 t 3 t 4 t 5 t 6 x [ g ] g ] f [ f Fields: [ f [ g ] g ] f Calls: ( 1 ) 1 ( 2 ) 3 Match edges enable approximation of calls • Only can check calls on match-free subpaths Match edge removal ) more call checking • Key point: refine heap and calls together 12

  13. Evaluation 13

  14. Experimental Configuration Implemented in Soot framework Tested on large benchmarks x 2 clients • SPECjvm98, Dacapo suite • Downcast checking, factory method props Refine context-insensitive result Timeout for long-running queries 14

  15. Precision: Cast Checking 15

  16. Scalability: Time and Memory Average query time less than 1 second • Interactive performance (for IDE) • At most 13 minutes for casts, 4 minutes for factory client Very low memory usage: at most 35MB • Of this, 30MB for context-insensitive result • Compare with >2GB for 1-ObjSens analysis 16

  17. Demand-Driven vs. Exhaustive Demand advantage: no caching required • Hence, low memory overhead • No engineering of efficient sets • Good for changing code; just re-compute Demand advantage: faster for many clients • Often only care about some variables Demand disadvantage: slower querying all vars • At most 90 minutes for all app. vars • But, still good precision, memory 17

  18. Conclusions Novel refinement-based analysis • More precise for tested clients • Interactive performance for queries • Low memory: could scale even more • Relatively easy to implement Insight: refine heap and calls together • Useful for other balanced-paren analyses? 18

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend