SLIDE 1

Refinement-Based Context-Sensitive Points-To Analysis for Java

Manu Sridharan, Rastislav Bodík
UC Berkeley
PLDI 2006

SLIDE 2

What Does Refinement Buy You?

Increased scalability: enable new clients

  • Memory: orders of magnitude savings
  • Time: answer for a variable comes back in 1 second
  • ⇒ Suitable for IDE

Precision:

[Chart: results for the cast safety client]

SLIDE 3

Approach: Focus on the Client

Demand-driven: only do requested work

Client-driven refinement: stop when client satisfied

Example:

  • client asks: “can x point to o?”
  • we refine until we answer NO (the good answer) or we time out
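
The refine-until-NO-or-timeout loop from this slide can be sketched as below. This is a minimal illustration, not the authors' implementation: the class `ClientDrivenLoop`, the `Answer` enum, the integer "refinement level", and the `mayPointTo` callback are all hypothetical names introduced here to stand in for one analysis pass.

```java
import java.util.function.IntPredicate;

final class ClientDrivenLoop {
    enum Answer { NO, MAYBE }

    /**
     * Hypothetical driver: mayPointTo.test(level) stands in for one analysis pass at the
     * given refinement level and returns true if "x may point to o" still holds.
     */
    static Answer ask(IntPredicate mayPointTo, int maxLevel, long budgetNanos) {
        long start = System.nanoTime();
        for (int level = 0; level <= maxLevel; level++) {
            if (System.nanoTime() - start > budgetNanos) {
                break; // time out: give up and keep the imprecise answer
            }
            if (!mayPointTo.test(level)) {
                return Answer.NO; // client satisfied, stop refining
            }
        }
        return Answer.MAYBE;
    }

    public static void main(String[] args) {
        // Toy query that can only be refuted at refinement level 2.
        System.out.println(ask(level -> level < 2, 5, 1_000_000_000L));
    }
}
```

In the toy run the query is refuted at level 2, so the loop stops there and never pays for levels 3 to 5.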

SLIDE 4

Context-Sensitive Analysis Costly

Context-sensitive analysis (def):

  • Compute result as if all calls inlined
  • But, collapse recursive methods

Exponential blowup (code growth)
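
To make the code-growth claim concrete, the sketch below counts how many method bodies exist once every call is inlined. The program shape is an assumption made for illustration: a hypothetical chain of methods m0, m1, ... in which each method calls the next one at two call sites, with no recursion. The class and method names are likewise invented here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InlineBlowup {
    /** Number of method bodies after fully inlining `method` (acyclic call graph assumed). */
    static long inlinedCopies(Map<String, List<String>> callSites, String method) {
        long copies = 1; // the body of `method` itself
        for (String callee : callSites.getOrDefault(method, Collections.<String>emptyList())) {
            copies += inlinedCopies(callSites, callee); // one inlined copy per call site
        }
        return copies;
    }

    /** Hypothetical program: m0 .. m<depth>, where each m<i> calls m<i+1> at two call sites. */
    static Map<String, List<String>> chain(int depth) {
        Map<String, List<String>> callSites = new HashMap<>();
        for (int i = 0; i < depth; i++) {
            callSites.put("m" + i, Arrays.asList("m" + (i + 1), "m" + (i + 1)));
        }
        return callSites;
    }

    public static void main(String[] args) {
        for (int depth : new int[] {5, 10, 20}) {
            long bodies = inlinedCopies(chain(depth), "m0");
            System.out.println((depth + 1) + " methods -> " + bodies + " inlined bodies");
        }
    }
}
```

For this chain the count is 2^(depth+1) - 1, so 11 source methods already expand to 2047 bodies.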

SLIDE 5

Why Not Existing Technique?

Most analyses approximate same way in all code

  • E.g., k-CFA
  • Precision lost, esp. for data structures

Our analysis focuses precision where it matters

  • Fully precise in the limit
  • Only small amount of code analyzed precisely
  • First refinement algorithm for Java

SLIDE 6

Points-To Analysis Overview

Compute objects each variable can point to

For each var x, points-to set pt(x)

Model objects with abstract locations

1: x = new Foo() yields pt(x) = { o1 }

Flow-insensitive: statements in any order
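
A minimal sketch of a flow-insensitive points-to computation for allocations and copies only. It is an illustration under stated assumptions, not the analysis from the talk: the statement encoding as `{kind, dst, src}` string triples and the class name `PointsToSketch` are invented here, and fields and calls are left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class PointsToSketch {
    /**
     * Each statement is {kind, dst, src}: {"new", "x", "o1"} models x = new Foo() at
     * abstract location o1, and {"copy", "z", "x"} models z = x.
     */
    static Map<String, Set<String>> solve(List<String[]> stmts) {
        Map<String, Set<String>> pt = new HashMap<>();
        boolean changed = true;
        while (changed) { // repeat until no set grows; statement order is irrelevant
            changed = false;
            for (String[] s : stmts) {
                Set<String> dst = pt.computeIfAbsent(s[1], k -> new TreeSet<>());
                if (s[0].equals("new")) {
                    changed |= dst.add(s[2]);
                } else {
                    Set<String> src = pt.getOrDefault(s[2], Collections.<String>emptySet());
                    changed |= dst.addAll(src);
                }
            }
        }
        return pt;
    }

    public static void main(String[] args) {
        List<String[]> program = Arrays.asList(
            new String[] {"copy", "z", "x"},   // z = x (listed first on purpose)
            new String[] {"new", "x", "o1"},   // x = new Foo()
            new String[] {"new", "y", "o2"});  // y = new Foo()
        System.out.println(solve(program));
    }
}
```

Because the loop runs to a fixed point, pt(z) ends up as { o1 } even though the copy is listed before the allocation, which is what "statements in any order" means in practice.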

SLIDE 7

Points-To Analysis as CFL-Reachability

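1) Assignments

    x = new Obj(); // o1
    y = new Obj(); // o2
    z = x;

2) Method calls

    id(p) { return p; }
    a = id(x);
    b = id(y);

3) Heap accesses

    c.f = x;
    c.g = y;
    d = c.f;

[Graph: nodes o1, o2, x, y, z, a, b, p_id, ret_id, c, d; call edges labeled (1 )1 (2 )2; field edges labeled [f [g ]f]

pt(x) = { o | o flowsTo x }

  • Assignments only: flowsTo means a path exists
  • With method calls: flowsTo means a path with balanced call parens
  • With heap accesses: flowsTo means a path with balanced call and field parens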
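
The effect of the balanced-call-paren filter on the id() example can be sketched as a small graph search. This is an illustration under stated assumptions rather than the talk's algorithm: the integer label encoding, the node names `p` and `ret`, and the class `CallParenSketch` are invented here, recursion is assumed absent, and field parens are not modeled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CallParenSketch {
    /** label 0 = assignment, +i = "(i" (enter via call site i), -i = ")i" (return to site i). */
    static final class Edge {
        final String to;
        final int label;
        Edge(String to, int label) { this.to = to; this.label = label; }
    }

    private final Map<String, List<Edge>> graph = new HashMap<>();

    void edge(String from, String to, int label) {
        graph.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, label));
    }

    /** Nodes that `src` flows to; with contextSensitive set, call parens must match. */
    Set<String> flowsTo(String src, boolean contextSensitive) {
        Set<String> result = new TreeSet<>();
        visit(src, new ArrayDeque<Integer>(), contextSensitive, new HashSet<String>(), result);
        return result;
    }

    private void visit(String node, Deque<Integer> calls, boolean cs,
                       Set<String> seen, Set<String> result) {
        if (!seen.add(node + "@" + calls)) return; // (node, call stack) already explored
        result.add(node);
        for (Edge e : graph.getOrDefault(node, Collections.<Edge>emptyList())) {
            Deque<Integer> next = new ArrayDeque<>(calls);
            if (cs && e.label > 0) {
                next.push(e.label);
            } else if (cs && e.label < 0 && !next.isEmpty()) {
                if (next.pop() != -e.label) continue; // ")j" after "(i": unrealizable path
            }
            visit(e.to, next, cs, seen, result);
        }
    }

    /** The id() example from the slide; node names p and ret are this sketch's spelling. */
    static CallParenSketch slideExample() {
        CallParenSketch g = new CallParenSketch();
        g.edge("o1", "x", 0);   // x = new Obj()
        g.edge("o2", "y", 0);   // y = new Obj()
        g.edge("x", "z", 0);    // z = x
        g.edge("x", "p", 1);    // a = id(x): argument enters at call site 1
        g.edge("y", "p", 2);    // b = id(y): argument enters at call site 2
        g.edge("p", "ret", 0);  // return p
        g.edge("ret", "a", -1); // result returns to call site 1
        g.edge("ret", "b", -2); // result returns to call site 2
        return g;
    }

    public static void main(String[] args) {
        CallParenSketch g = slideExample();
        System.out.println(g.flowsTo("o1", false)); // context-insensitive: reaches a and b
        System.out.println(g.flowsTo("o1", true));  // context-sensitive: reaches a, not b
    }
}
```

Without the filter, o1 appears to flow to both a and b because the two calls to id() share p and ret. With it, the path o1 → x → p → ret → b is rejected since it enters through (1 and leaves through )2.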

SLIDE 8

Summary of Formulation

Graph represents program

Compute reachability with two filters

  • Language of balanced call parens
  • Language of balanced field parens

SLIDE 9

Single path problem

Problem: show path is unbalanced

Goal: reduce number of visited edges

Insight: enough to find one unbalanced paren

[Diagram: a long path through temporaries t0 ... t12 ending at x, with edges labeled by call parens such as (1 )1 )5 (7 )8 and field parens such as [f [h ]j [p ]g ]k]
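
The "one unbalanced paren is enough" insight can be sketched as an early-exit scan over a path's edge labels. Assumptions made for illustration: labels are strings such as "[f" or ")1", field and call parens are tracked on separate stacks, and only a close paren that contradicts the innermost open paren of its kind counts as a mismatch. Leftover open parens and closes with no open partner are not reported. The class name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class UnbalancedParen {
    /**
     * Scans edge labels in path order and returns the index of the first close paren
     * that contradicts the innermost open paren of the same kind, or -1 if none does.
     * Scanning stops at the first mismatch, so later edges are never visited.
     */
    static int firstMismatch(String... labels) {
        Deque<String> fields = new ArrayDeque<>(); // open field parens: "[f"
        Deque<String> calls = new ArrayDeque<>();  // open call parens: "(1"
        for (int i = 0; i < labels.length; i++) {
            char kind = labels[i].charAt(0);
            String name = labels[i].substring(1);
            if (kind == '[') {
                fields.push(name);
            } else if (kind == '(') {
                calls.push(name);
            } else {
                Deque<String> open = (kind == ']') ? fields : calls;
                if (!open.isEmpty() && !open.pop().equals(name)) {
                    return i; // one unbalanced paren is enough to reject the path
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical path: "]j" closes "[h", so the scan stops at index 4.
        System.out.println(firstMismatch("[f", "(1", ")1", "[h", "]j", "]f", "]g"));
    }
}
```

On the hypothetical path in `main`, the scan stops at index 4 and the last two edges are never inspected.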

SLIDE 10

Approximation via Match Edges

Match edges connect matched field parens

  • From source of open to sink of close
  • Initially, all pairs connected

Use match edges to skip subpaths

[Diagram: path through t0, t1, t2, t3, t4 to x, with field edges labeled [f [g [h ]h ]j ]f]

SLIDE 11

Refining the Approximation

Refine by removing some match edges

  • Exposes more of original path for checking

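Soundness: traversing a match edge ⇒ assume field parens are balanced on the skipped path

Remove match edges where unbalanced parens are expected

  • Explore deeper levels of pointer indirection

[Diagram: the path through t0, t1, t3, t4 to x after refinement; field edge labels shown are [f [g ]j ]f, against [f [g [h ]h ]j ]f for the full path]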
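
Match edges and their removal can be sketched on a single path of field parens. This is a simplified illustration, not the paper's algorithm: it works on one path instead of a graph, ignores call parens, and treats "refining field f" as removing every match edge for f. The class `MatchEdgeSketch` and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

final class MatchEdgeSketch {
    /**
     * May the field parens on this single path be balanced? For a field not in
     * `refined`, a match edge lets us jump from an open paren "[f" to any later "]f"
     * and assume the skipped subpath is balanced. Refined fields are checked exactly.
     */
    static boolean mayBeBalanced(String[] path, Set<String> refined) {
        return check(path, 0, new ArrayDeque<String>(), refined);
    }

    private static boolean check(String[] path, int i, Deque<String> open, Set<String> refined) {
        if (i == path.length) return open.isEmpty();
        String field = path[i].substring(1);
        if (path[i].charAt(0) == '[') {
            if (refined.contains(field)) { // no match edge: walk the subpath
                Deque<String> next = new ArrayDeque<>(open);
                next.push(field);
                return check(path, i + 1, next, refined);
            }
            for (int j = i + 1; j < path.length; j++) { // match edge: skip to a "]field"
                if (path[j].equals("]" + field) && check(path, j + 1, open, refined)) {
                    return true;
                }
            }
            return false;
        }
        // Close paren met while walking: it must match the innermost open paren.
        if (open.isEmpty() || !open.peek().equals(field)) return false;
        Deque<String> next = new ArrayDeque<>(open);
        next.pop();
        return check(path, i + 1, next, refined);
    }

    public static void main(String[] args) {
        String[] path = {"[f", "[g", "[h", "]h", "]j", "]f"}; // labels from the slide
        Set<String> refined = new HashSet<>();
        System.out.println(mayBeBalanced(path, refined)); // true: [f ... ]f is skipped
        refined.add("f"); // refine: remove match edges for f
        System.out.println(mayBeBalanced(path, refined)); // false: [g has no ]g to match
    }
}
```

With no field refined, the match edge from [f to ]f hides the inner mismatch and the path is conservatively accepted. Refining f exposes the subpath, where [g has no matching ]g, so the answer becomes NO.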

SLIDE 12

Refinement With Both Languages

Match edges enable approximation of calls

  • Calls can only be checked on match-free subpaths

Match edge removal ⇒ more call checking

  • Key point: refine heap and calls together

[Diagram: path through t0 ... t6 to x. Calls along the path: (1 )1 (2 )3. Fields along the path: [f [g ]g ]f]

SLIDE 13

Evaluation

SLIDE 14

Experimental Configuration

Implemented in Soot framework

Tested on large benchmarks × 2 clients

  • SPECjvm98, DaCapo suite
  • Downcast checking, factory method props

Refine context-insensitive result

Timeout for long-running queries

SLIDE 15

Precision: Cast Checking

SLIDE 16

Scalability: Time and Memory

Average query time less than 1 second

  • Interactive performance (for IDE)
  • At most 13 minutes for casts, 4 minutes for factory client

Very low memory usage: at most 35MB

  • Of this, 30MB for context-insensitive result
  • Compare with >2GB for 1-ObjSens analysis

SLIDE 17

Demand-Driven vs. Exhaustive

Demand advantage: no caching required

  • Hence, low memory overhead
  • No engineering of efficient sets
  • Good for changing code; just re-compute

Demand advantage: faster for many clients

  • Often only care about some variables

Demand disadvantage: slower when querying all variables

  • At most 90 minutes for all application variables
  • But, still good precision, memory

SLIDE 18

Conclusions

Novel refinement-based analysis

  • More precise for tested clients
  • Interactive performance for queries
  • Low memory: could scale even more
  • Relatively easy to implement

Insight: refine heap and calls together

  • Useful for other balanced-paren analyses?