
Points-To Analysis
Meeting 21, CSCI 5535, Spring 2010
Guest Lecture: Manu Sridharan
4/5/2010

What is points-to analysis?
• Informally: analysis determining what locations (objects) pointers can point to

Program:
    main() {
      x = &a;
      y = &b;
      z = x;
    }

Result: pt(x) = {a}, pt(y) = {b}, pt(z) = {a}

Importance of points-to analysis
• Verification
  • Is { *x = 10 } *y = 3 { *x = 10 } valid?
  • If x and y cannot point to the same location, then yes
• Optimization
  • Can a = *x; *y = 4; b = *x be optimized to a = *x; *y = 4; b = a?
  • If x and y cannot point to the same location, then yes
• Control Flow
  • Can l.add() invoke ArrayList.add()?
  • If l can point to an ArrayList object, then yes

Lecture overview
• What we'll cover
  • Definition / complexity of several points-to analysis variants
  • Andersen's analysis via CFL-reachability
  • Some issues with handling method calls
  • Refinement-based analysis
• What we'll skip (lack of time)
  • Shape analysis (Evan will cover later in semester)
  • Many other optimizations, control-flow analysis of functional languages, …

Formal definition (for C)
• Points-to analysis: Given a program and two variables p and q, points-to analysis checks if p can point to q in some program execution [Chakaravarthy03]
  • malloc creates a fresh, unnamed variable
• Alias analysis: check if p1 and p2 can point to q simultaneously in some execution
  • We'll focus on points-to
• For now, assume no procedure calls

Soundness and precision
• If p can point to q in some execution, a sound analysis will always report it
  • Analysis may be over-approximate, reporting p can point to q even if it cannot
• A precise analysis is sound but not over-approximate
  • Yields exact answer given program semantics
• Precise analysis for C is undecidable
  • Or, for any Turing-complete language
  • foo(TuringMachine M, TMInput i) { run M on i; p = &q; }
• Bottom line: to obtain decidability and efficiency, must approximate program semantics
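The pt(...) result for the three-statement main() example can be reproduced with a small sketch. It simply executes address-of and copy statements in order, which gives the exact answer here only because the program has a single execution with no branches, loops, or dereferences. The tuple encoding of statements and the names run and program are assumptions made for illustration; they do not come from the lecture.

```python
def run(statements):
    """Execute straight-line pointer statements in order.

    Each statement is (lhs, op, rhs), where op is "addr" for lhs = &rhs
    or "copy" for lhs = rhs. Returns a points-to set for each pointer.
    """
    env = {}
    for lhs, op, rhs in statements:
        if op == "addr":      # lhs = &rhs
            env[lhs] = rhs
        elif op == "copy":    # lhs = rhs
            env[lhs] = env[rhs]
        else:
            raise ValueError(f"unknown statement kind: {op}")
    # With one execution, each pointer has exactly one possible target.
    return {var: {target} for var, target in env.items()}


# main() { x = &a; y = &b; z = x; }
program = [("x", "addr", "a"), ("y", "addr", "b"), ("z", "copy", "x")]
pt = run(program)
print(pt)  # {'x': {'a'}, 'y': {'b'}, 'z': {'a'}}
```

For programs with branches or loops, enumerating executions like this is no longer possible in general, which is where the approximations come in.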

Approximation 1: Path Insensitivity
• Treat all branches as non-deterministic
  • Given if (c) then p; else q;, always assume either p or q can execute
• Must still respect execution order (flow sensitive)
• Complexity
  • With dynamic memory (malloc), undecidable
    • See [Ramalingam94, Chakaravarthy03]
  • Without dynamic memory, PSPACE-complete [MD00]
    • Even with just one procedure!
• Bottom line: need to approximate more

Approximation 2: Flow Insensitivity
• Assume statements can execute in any order
  • With possible repetition
  • Assume control-flow graph is complete
• Complexity
  • With dynamic memory (malloc), decidability unknown (!)
  • Without dynamic memory, NP-Hard [Horwitz97]
• Bottom line: need even more approximation

Simultaneity
• Semantics of pointer accesses

[Figure: points-to graphs for a pointer write (*x = y) and a pointer read (x = *y)]

• Note: black arrows must occur simultaneously
• Issue: Some relations cannot arise simultaneously
  • Statement set (flow insensitive): {a=&c; b=a; c=&b; b=&a; *b=c}
  • b points to c: a=&c; b=a
  • a points to b: c=&b; b=&a; *b=c
  • But not both!

Approximation 3: Andersen's
• Assumes discovered points-to relations can all occur simultaneously
  • Hence, less precise handling of pointer accesses
• Challenge: express as approximate semantics?
  • Breaks up multi-level derefs
    • **x = y becomes temp = *x, *temp = y
    • Again, imprecision due to simultaneity reasoning (**x does two derefs atomically)
  • Heap abstraction? Other? (I don't know)
• Complexity: $O(N^3)$; much better!

Andersen's for Java: The Basics
• Four statement types
  • new: x = new Obj()
  • assign: x = y
  • getfield: x = y.f
  • putfield: x.f = y
• Single abstract location for each new
  • Represents objects allocated by all executions
  • For more precise treatment, shape analysis

ANDERSEN'S ANALYSIS IN CFL-REACHABILITY
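A minimal sketch of Andersen-style analysis for C-like statements, run on the statement set from the Simultaneity slide, illustrates the imprecision described above. It uses the usual inclusion-constraint formulation solved by naive fixpoint iteration, not an optimized worklist algorithm. The statement encoding and the function name andersen are assumptions made for this example.

```python
from collections import defaultdict


def andersen(statements):
    """Flow-insensitive, inclusion-based points-to analysis (naive fixpoint).

    Each statement is (kind, lhs, rhs):
      ("addr",  p, q)  for p = &q
      ("copy",  p, q)  for p = q
      ("load",  p, q)  for p = *q
      ("store", p, q)  for *p = q
    """
    pt = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in statements:
            if kind == "addr":
                updates = [(lhs, {rhs})]
            elif kind == "copy":
                updates = [(lhs, set(pt[rhs]))]
            elif kind == "load":
                updates = [(lhs, set(pt[t])) for t in list(pt[rhs])]
            elif kind == "store":
                updates = [(t, set(pt[rhs])) for t in list(pt[lhs])]
            else:
                raise ValueError(f"unknown statement kind: {kind}")
            for var, new in updates:
                if not new <= pt[var]:
                    pt[var] |= new
                    changed = True
    return dict(pt)


# {a=&c; b=a; c=&b; b=&a; *b=c}
stmts = [
    ("addr", "a", "c"),
    ("copy", "b", "a"),
    ("addr", "c", "b"),
    ("addr", "b", "a"),
    ("store", "b", "c"),
]
pt_sets = andersen(stmts)
print(sorted(pt_sets["a"]))  # ['b', 'c']
print(sorted(pt_sets["b"]))  # ['a', 'b', 'c']
```

The result contains both "b points to c" and "a points to b", and it also derives "b points to b" by combining the two. Because all discovered relations are treated as holding at once, the analysis cannot tell that the first two never hold together.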

CFL-Reachability

[Figure: example graph with labeled edges and its grammar; symbols lost in extraction]

Points-to analysis graph:
• Nodes represent variables / abstract locations
• Edges represent statements
Points-to analysis paths:
• flowsTo-path from o to x: $o \in pt(x)$
• alias-path from x to y: $pt(x) \cap pt(y) \neq \emptyset$

More on CFL-Reachability
• Several variants
  • All-pairs: find all pairs of nodes connected by valid paths
  • Single-source: find all nodes to which source is connected by valid path
• General algorithm $O(N^3)$
  • N is number of nodes
  • Faster algorithms for special cases (see [RHS95])
  • Specialized algorithm needed to scale pointer analysis
• For more details, see [Reps98]

Andersen's Analysis in CFL-Reachability

    x = new Obj(); // o1
    z = new Obj(); // o2
    w = x;
    y = x;
    y.f = z;
    v = w.f;

[Figure: graph for the program above, edges labeled by statement type]

    flowsTo => new (assign)*
    flowsTo => new (pf[f] alias gf[f] | assign)*     (pf[f] … gf[f]: balanced parens)

What about alias?
• Want: $x\ alias\ y \Leftrightarrow \exists o.\ o\ flowsTo\ x \wedge o\ flowsTo\ y$
• Problem: need all edges in same direction
• Solution: $alias \Rightarrow \overline{flowsTo}\ flowsTo$
  • $\overline{flowsTo}$ is inverse of flowsTo
  • Must add inverse edges to graph (e.g., $\overline{assign}$)
  • See [SB06] for full grammar

Importance of Handling Method Calls
• Used pervasively, esp. in Java-like languages
• Often deeply related to objects and pointers

    class ArrayList {
      Object[] elems;
      int i;
      public ArrayList() {
        this.elems = new Object[10];   // allocation
      }
      public void add(Object o) {
        this.elems[i++] = o;           // pointer write
      }
      public Object get(int i) {
        return this.elems[i];          // pointer read
      }
    }

METHOD CALLS
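The flowsTo and alias relations for the six-statement Java example can be computed with a short sketch. It does not run a generic CFL-reachability solver. Instead it uses the equivalent fixpoint formulation over points-to sets, with one heap cell per (abstract object, field) pair, so a putfield and a getfield on the same field of aliased variables are matched through the shared object. The statement encoding and the names andersen_java and alias are assumptions made for illustration.

```python
from collections import defaultdict


def andersen_java(statements):
    """Field-sensitive Andersen-style analysis for the four Java statement types.

    Statements:
      ("new", x, site)        for x = new Obj()  // site
      ("assign", x, y)        for x = y
      ("putfield", x, f, y)   for x.f = y
      ("getfield", x, y, f)   for x = y.f
    Returns (pt, heap): variable -> objects, and (object, field) -> objects.
    """
    pt = defaultdict(set)
    heap = defaultdict(set)
    changed = True

    def add(target, new):
        nonlocal changed
        if not new <= target:
            target.update(new)
            changed = True

    while changed:
        changed = False
        for stmt in statements:
            kind = stmt[0]
            if kind == "new":
                _, x, site = stmt
                add(pt[x], {site})
            elif kind == "assign":
                _, x, y = stmt
                add(pt[x], set(pt[y]))
            elif kind == "putfield":
                _, x, f, y = stmt
                for o in list(pt[x]):
                    add(heap[(o, f)], set(pt[y]))
            elif kind == "getfield":
                _, x, y, f = stmt
                for o in list(pt[y]):
                    add(pt[x], set(heap[(o, f)]))
            else:
                raise ValueError(f"unknown statement kind: {kind}")
    return pt, heap


def alias(pt, x, y):
    """x alias y iff some abstract object flows to both."""
    return bool(pt[x] & pt[y])


program = [
    ("new", "x", "o1"),
    ("new", "z", "o2"),
    ("assign", "w", "x"),
    ("assign", "y", "x"),
    ("putfield", "y", "f", "z"),
    ("getfield", "v", "w", "f"),
]
pt, heap = andersen_java(program)
print(pt["v"])              # {'o2'}
print(alias(pt, "y", "w"))  # True
```

Here o2 flows to v because y and w alias (both hold o1), which corresponds to the pf[f] alias gf[f] step in the grammar.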

Precise Handling of Method Calls
• Idea: analyze as if all method calls inlined
  • Yields separate copies of local variables / new expressions for each possible call
  • Known as a context-sensitive analysis
• Problem: how to handle recursion
  • Full inlining yields an infinite program
  • But, analysis definitions still work fine!
    • Require variables p and q up front; forces choice of inlined copy
    • Flow-insensitive: find finite sequence from infinite statement set

Decidability with Context Sensitivity
• Precise path-insensitive + dynamic memory still undecidable
  • Already undecidable with just one method
• Flow-insensitive + dynamic memory + precise calls: undecidable
  • Recall that with one method, decidability unknown
  • Via small modification of [Reps00] proof
• Even Andersen-style analysis + precise calls is undecidable (details coming up)
• No dynamic memory: not well-studied
  • Note that stack frames are a form of dynamic memory

Andersen's and Calls, Simplified
• Four statement types (ignore fields for now)
  • new: x = new Obj()
  • assign: x = y
  • call: x = m(p1, p2, …)
  • return: return x
• Idea: use balanced parentheses to match calls and returns
  • Parens labeled by call site
  • Grammar filters out unrealizable paths (method call returning to wrong site)
• Classic use of CFL-reachability [RHS95]

Matching Calls and Returns: Example

    id(p) { return p; }
    x = new Obj(); // o1
    y = new Obj(); // o2
    a = id(x);
    b = id(y);

[Figure: graph for the program above with call and return edges labeled by call site, and the balanced-parentheses grammar; symbols lost in extraction]

Andersen's and Calls: The Details
• Must allow for partially balanced call parens
  • E.g., to handle makeObj() { return new Obj(); }
• Handle fields and calls simultaneously via intersected languages
  • Enhance N production (previous slide) to include all field accesses
  • Points-to analysis must find paths that are both S paths (for calls) and flowsTo paths (for fields)
  • Also need barred edges, etc.; details in [SB06]

Andersen's and Calls: Decidability
• Analysis requires solving reachability over intersection of two CFLs (S and flowsTo)
  • But, CFLs are not closed under intersection
  • In our case, problem is undecidable
    • Proof via reduction from PCP [Reps00]
• Standard approach for decidability: approximate recursion
  • Collapse SCCs in call graph (change (i into assign)
  • Yields imprecise handling of recursive calls / returns
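The effect of matching calls and returns on the id(p) example can be sketched with a reachability search that carries a stack of open call sites. This is a simplified illustration, not the algorithm from [RHS95] or [SB06]. It ignores fields, assumes a non-recursive program (so the stack stays bounded without collapsing SCCs), and allows a return on an empty stack to model partially balanced paths. The edge encoding, the call-site numbers 1 and 2, and the function name reachable are assumptions.

```python
from collections import defaultdict


def reachable(edges, source, match_calls):
    """Return the nodes reachable from source.

    edges: list of (src, dst, label). label is None for new/assign edges,
    ("call", i) for an open paren at call site i, or ("ret", i) for the
    matching close paren. With match_calls=False, labels are ignored.
    """
    succ = defaultdict(list)
    for src, dst, label in edges:
        succ[src].append((dst, label))

    seen = set()
    reached = set()
    work = [(source, ())]
    while work:
        node, stack = work.pop()
        if (node, stack) in seen:
            continue
        seen.add((node, stack))
        reached.add(node)
        for dst, label in succ[node]:
            new_stack = stack
            if match_calls and label is not None:
                kind, site = label
                if kind == "call":
                    new_stack = stack + (site,)
                elif stack:
                    if stack[-1] != site:
                        continue  # unrealizable: returns to the wrong call site
                    new_stack = stack[:-1]
                # A return on an empty stack is allowed (partially balanced).
            work.append((dst, new_stack))
    return reached


# id(p) { return p; }  x = new Obj(); y = new Obj(); a = id(x); b = id(y);
edges = [
    ("o1", "x", None),
    ("o2", "y", None),
    ("x", "p", ("call", 1)),
    ("p", "a", ("ret", 1)),
    ("y", "p", ("call", 2)),
    ("p", "b", ("ret", 2)),
]
print("b" in reachable(edges, "o1", match_calls=False))  # True
print("b" in reachable(edges, "o1", match_calls=True))   # False
```

Without matching, o1 appears to flow to b through the shared parameter p. With matching, the path that enters id at call site 1 and returns at call site 2 is rejected, so o1 reaches only a.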
