 
Interprocedural Analysis with Data-Dependent Calls
Craig Chambers, CSE 501

In languages with function pointers, first-class functions, or dynamically dispatched messages, callee(s) at call site depend on data flow

Could make worst-case assumptions: call all possible functions/methods...
• ... with matching name (if name is given at call site)
• ... with matching type (if type is given & trustable)
• ... that have had their address taken, & escape (if known)

Could do analysis to compute possible callees/receiver classes
• intraprocedural analysis OK
• interprocedural analysis better
• context-sensitive interprocedural analysis even better


Circularity dilemma

Problem:
1. to compute possible callees, decide to do interprocedural analysis
2. to do interprocedural analysis, need a call graph
3. to construct a call graph, need to compute possible callee functions
1. to compute possible callees, ...

[Figure: cycle of dependences among call graph, possible callees, and interprocedural analysis]

How to break vicious cycle?


A solution: optimistic iterative analysis

Set up a standard optimistic interprocedural analysis, use iteration to relax initial optimistic solution into a sound fixed-point solution [e.g., for function ptrs/values]

A simple flow-insensitive, context-insensitive analysis:
• for each (formal, local, result, global, instance) variable, maintain set of possible functions that could be there
  • initially: empty set for all variables
• for each call site, set of callees derived from set associated with applied function expression
  • initially: no callees

    worklist := { main }
    while worklist not empty
      remove p from worklist
      process p:
        perform intra analysis propagating fn sets from formals
        foreach call site s in p:
          add call edges for any new reachable callees
          add fns of actuals to callees' formals
          if new callee(s) reached or callee(s)' formals changed,
            put callee(s) back on worklist
        if result changed, put caller(s) back on worklist


Example

    proc main() {
      proc p(pa) { return pa(d); }
      return b(p);
    }
    proc b(ba) {
      proc q(qa) { return d(d); }
      c(q);
      return ba(d);
    }
    proc c(ca) {
      return ca(ca);
    }
    proc d(da) {
      proc r(ra) { return da; }
      return c(r);
    }
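The worklist algorithm can be tried out on the example program with a small Python sketch. Everything below is an illustrative assumption rather than the course's implementation: the tuple-based mini-IR, the names `PROGRAM`, `call`, `names_in`, and `analyze`, and the choice to keep one function set per formal. Because nested procedures such as `r` read an enclosing formal (`da`), the sketch also re-queues any reached procedure that reads a formal whose set grew.

```python
from collections import deque

# Hypothetical mini-IR: an expression is a name (procedure or formal)
# or a tuple ("call", function_expr, argument_expr).
def call(f, a):
    return ("call", f, a)

# name -> (formal, [statement expressions], return expression)
PROGRAM = {
    "main": (None, [], call("b", "p")),
    "p": ("pa", [], call("pa", "d")),
    "b": ("ba", [call("c", "q")], call("ba", "d")),
    "q": ("qa", [], call("d", "d")),
    "c": ("ca", [], call("ca", "ca")),
    "d": ("da", [], call("c", "r")),
    "r": ("ra", [], "da"),
}

def names_in(expr):
    """All names mentioned in an expression."""
    if isinstance(expr, str):
        return {expr}
    return names_in(expr[1]) | names_in(expr[2])

def analyze(program, entry="main"):
    # Optimistic start: every set is empty, no call edges exist.
    fn_sets = {f: set() for f, _, _ in program.values() if f}
    results = {name: set() for name in program}
    edges = {name: set() for name in program}
    callers = {name: set() for name in program}
    # Which procedures read each formal (needed for nested procs).
    readers = {f: set() for f in fn_sets}
    for name, (_, stmts, ret) in program.items():
        for e in stmts + [ret]:
            for n in names_in(e) & set(fn_sets):
                readers[n].add(name)
    reached = {entry}
    worklist = deque([entry])

    def evaluate(proc, expr):
        if isinstance(expr, str):
            return {expr} if expr in program else set(fn_sets[expr])
        _, f, a = expr
        callees, actuals = evaluate(proc, f), evaluate(proc, a)
        out = set()
        for g in callees:
            edges[proc].add(g)          # add call edge
            callers[g].add(proc)
            formal = program[g][0]
            grew = formal is not None and not actuals <= fn_sets[formal]
            if grew:                    # add fns of actuals to formal
                fn_sets[formal] |= actuals
                worklist.extend(readers[formal] & reached)
            if g not in reached or grew:
                reached.add(g)
                worklist.append(g)      # callee back on worklist
            out |= results[g]
        return out

    while worklist:
        p = worklist.popleft()
        _, stmts, ret = program[p]
        for s in stmts:
            evaluate(p, s)
        value = evaluate(p, ret)
        if not value <= results[p]:     # result changed
            results[p] |= value
            worklist.extend(callers[p]) # callers back on worklist
    return fn_sets, results, edges

fn_sets, results, edges = analyze(PROGRAM)
```

Under these assumptions the fixed point gives `ca` the set {q, r}, so the call `ca(ca)` in `c` gets edges to both `q` and `r`, and every procedure's result set is {d}.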
Context-sensitive analyses

Can get more precision through context-sensitive interprocedural analysis

k-CFA (control flow analysis) [Shivers 88 etc.]
• analyze Scheme programs
• context key: sequence of k enclosing call sites
  • k = 0 ⇒ context-insensitive
  • k = 1 ⇒ reanalyze for each call site (but not transitively)
− loses precision beyond k recursive calls
− cost is exponential in k, even if no gain in precision

An alternative:
• context key: set of possible functions for arguments
+ avoid weaknesses of k-CFA:
  • only expend effort if possibly beneficial
  • never hits an arbitrary cut-off
• worst-case cost proportional to $(2^{|\mathit{Functions}|})^{\mathit{MaxNumberOfArgs}}$


Static analysis of OO programs

Problem: dynamically dispatched message sends
• direct cost: extra run-time checking to select target method
• indirect cost: hard to inline, construct call graph, do interprocedural analysis

Smaller problem: run-time class/subclass tests (e.g. instanceof, checked casts)
• direct cost: extra tests


Class analysis

Solution to both problems: static class analysis
• compute set of possible classes of objects computed by each expression

Knowing set of possible classes of message receivers enables message lookup at compile-time (static binding, devirtualization)

Benefits of knowing set of possible target methods:
• can construct call graph & do interprocedural analysis
• if single callee, then can inline, if profitable
• if small number of callees, then can insert type-case

Knowing classes of arguments to run-time class/subclass tests enables constant-folding of tests, plus cast checking tools

Many different algorithms for performing class analysis
• different trade-offs between precision and cost


Intraprocedural class analysis

Propagate sets of bindings of variables to sets of classes through CFG
e.g. {x → {Int}, y → {Vector, String}}

Flow functions:
• $CA_{x := \text{new } C}(in) = in - \{x \to *\} \cup \{x \to \{C\}\}$
• $CA_{x := y}(in) = in - \{x \to *\} \cup \{x \to in(y)\}$
• $CA_{x := \ldots}(in) = in - \{x \to *\} \cup \{x \to \bot\}$
• $CA_{\text{if } x \text{ instanceof } C \text{ goto } L1 \text{ else } L2}(in) =$
    $in - \{x \to *\} \cup \{x \to in(x) \cap \mathit{Subclasses}(C)\}$ (for L1)
    $in - \{x \to *\} \cup \{x \to in(x) - \mathit{Subclasses}(C)\}$ (for L2)

Use info at sends, type tests
• x := y.foo(z)
• if x instanceof C goto L1 else L2

Compose class analysis with inlining, etc.
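As a rough illustration of the flow functions for intraprocedural class analysis, the Python sketch below treats an environment as a dictionary from variable names to frozen sets of class names. The hierarchy, the names `SUBCLASSES`, `UNKNOWN`, `bind`, and the `flow_*` helpers are all hypothetical, and the sketch assumes that the "no information" value ($\bot$ in the flow functions) can be modeled as the set of all classes.

```python
# Hypothetical hierarchy: class -> set of its subclasses, itself included.
SUBCLASSES = {
    "Shape": frozenset({"Shape", "Rectangle", "Square", "Circle"}),
    "Rectangle": frozenset({"Rectangle", "Square"}),
    "Square": frozenset({"Square"}),
    "Circle": frozenset({"Circle"}),
}
# Assumption: "no information" is modeled as the set of all classes.
UNKNOWN = SUBCLASSES["Shape"]

def bind(env, x, classes):
    """in - {x -> *} U {x -> classes}, without mutating env."""
    out = dict(env)
    out[x] = frozenset(classes)
    return out

def flow_new(env, x, c):        # x := new C
    return bind(env, x, {c})

def flow_copy(env, x, y):       # x := y
    return bind(env, x, env[y])

def flow_unknown(env, x):       # x := ...
    return bind(env, x, UNKNOWN)

def flow_instanceof(env, x, c): # if x instanceof C goto L1 else L2
    on_true = bind(env, x, env[x] & SUBCLASSES[c])    # for L1
    on_false = bind(env, x, env[x] - SUBCLASSES[c])   # for L2
    return on_true, on_false

def fold_instanceof(env, x, c):
    """True/False if the test can be constant-folded, else None."""
    if env[x] <= SUBCLASSES[c]:
        return True
    if not (env[x] & SUBCLASSES[c]):
        return False
    return None

env = flow_unknown({}, "s")
on_true, on_false = flow_instanceof(env, "s", "Rectangle")
env2 = flow_copy(flow_new(env, "t", "Square"), "u", "t")
```

On the L1 branch `s` is narrowed to {Rectangle, Square}, so a later `s instanceof Shape` test folds to true there, while on the L2 branch a test against `Square` folds to false.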
Limitations of intraprocedural analysis

Don't know classes of
• formals
• results of non-inlined messages
• contents of instance variables

Don't know complete set of classes in program
⇒ can't learn much from static type declarations

Improve information by:
• looking at dynamic profiles
• specializing methods for particular receiver/argument classes
• performing interprocedural class analysis
  • flow-sensitive & -insensitive methods
  • context-sensitive & -insensitive methods


Profile-guided class prediction

Can exploit dynamic profile information if static info lacking

Monitor receiver class distributions for each send

Recompile program, inserting run-time class tests for common receiver classes
• on-line (e.g. in Self [Hölzle & Ungar 96]) or off-line (e.g. in Vortex)

[Figure: bar chart of frequency (0% to 100%) by receiver class for "area": Rectangle, Circle, Triangle, Ellipse]

Before:
    i := s.area();
After:
    i := (if s.class == R
          then Rect::area(s)
          else s.area());


Specialization

To get better static info, specialize source method w.r.t. inheriting receiver class
+ compiler knows statically the class of the receiver formal

    class Rectangle {
      ...
      int area() { return length() * width(); }
      int length() { ... }
      int width() { ... }
    };
    class Square extends Rectangle {
      int size;
      int length() { return size; }
      int width() { return size; }
    };

If specialize Rectangle::area as Square::area, can inline-expand length() & width() sends


What to specialize?

In Sather, Trellis: specialize for all inheriting receiver classes
• in Trellis, reuse superclass's code if no change

In Self: same, but specialize at run-time
• Self compiles everything at run-time, incrementally as needed
• will only specialize for (classes × messages) actually used at run-time

In Vortex: use profile-derived weighted call graph to guide specialization
• only specialize if high frequency & provides benefit
• can specialize on args, too
• can specialize for sets of classes w/ same behavior
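The Before/After rewrite from profile-guided class prediction can be mimicked in Python as a sketch. The classes, the profile counts, the 50% threshold, and the helper names `predict_classes` and `make_predicted_send` are all made up for illustration; a real compiler would emit the class test and the statically bound call directly instead of building a closure.

```python
import math

class Shape:
    def area(self):
        raise NotImplementedError

class Rect(Shape):
    def __init__(self, w, h):
        self.w, self.h = w, h
    def area(self):
        return self.w * self.h

class Circle(Shape):
    def __init__(self, r):
        self.r = r
    def area(self):
        return math.pi * self.r ** 2

def predict_classes(profile, threshold=0.5):
    """Classes whose share of the receiver profile meets the threshold
    (threshold value is an arbitrary assumption)."""
    total = sum(profile.values())
    ranked = sorted(profile.items(), key=lambda kv: -kv[1])
    return [cls for cls, n in ranked if n / total >= threshold]

def make_predicted_send(selector, predicted):
    # Bind the target methods once, as a compiler would at recompile time.
    targets = [(cls, getattr(cls, selector)) for cls in predicted]
    def send(receiver):
        for cls, method in targets:
            if type(receiver) is cls:        # s.class == R
                return method(receiver)      # Rect::area(s), bound statically
        return getattr(receiver, selector)() # fall back to s.area()
    return send

# Hypothetical receiver-class counts observed for the "area" send.
profile = {Rect: 75, Circle: 15}
area_send = make_predicted_send("area", predict_classes(profile))
```

With these counts only `Rect` is predicted, so `area_send(Rect(2, 3))` takes the class-test path and any other receiver falls through to the ordinary dynamic send. The statically bound branch is the one that becomes a candidate for inlining.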
Flow-insensitive interprocedural static class analysis

Simple idea: examine complete class hierarchy, put upper limit of possible callees of all messages
• can now benefit from type declarations, instanceof's

Class Hierarchy Analysis (CHA) [Dean et al. 96, ...]

    class Shape {
      abstract int area();
    };
    class Rectangle extends Shape {
      ...
      int area() { return length() * width(); }
      int length() { ... }
      int width() { ... }
    };
    class Square extends Rectangle {
      int size;
      int length() { return size; }
      int width() { return size; }
    };

    Rectangle r = ...;
    ... r.area() ...


Improvements

Add optimistic pruning of unreachable classes
• optimistically track which classes are instantiated during analysis
• don't make call arc to any method not inherited by an instantiated class
• fill in skipped arcs as classes become reachable
• O(N)
Rapid Type Analysis [Bacon & Sweeney 96]: in C++

Add intraprocedural analysis
[Diwan et al. 96]: in Modula-3, w/o optimistic pruning, w/ flow-sensitive interprocedural analysis after flow-insensitive call graph construction

Type-inference-style analysis à la Steensgaard
• compute set of classes for each "type variable"
• use unification to merge type variables
• can blend with propagation, too
[DeFouw et al. 98, Grove & Chambers 01]: in Vortex


Flow-sensitive interprocedural static class analysis

Extend static class analysis to examine entire program
• infer argument & result class sets for all methods
• infer contents of instance variables and arrays

The standard problem: constructing the interprocedural call graph

[Figure: cycle of dependences among call graph, receiver classes, and interprocedural analysis]


And the standard solution: optimistic iteration

Compute call graph and class sets simultaneously, through optimistic iterative refinement

Use worklist-based algorithm, with procedures on the worklist

Initialize call graph & class sets to empty
Initialize worklist to main

To process procedure off worklist:
• analyze, given class sets for formals:
  • perform method lookup at call sites
  • add call graph edges based on lookup
  • update callee(s) formals' sets based on actuals' class sets
• if a callee method's argument set changes, add it to worklist
• if result set changes, add caller methods to worklist
• if contents of an instance variable or array changes, add all accessing methods to worklist