
Why Data Flow Models?



  1. Dependence and Data Flow Models
     (c) 2007 Mauro Pezzè & Michal Young, Ch 6, slides 1-4

     Why Data Flow Models?
     • Models from Chapter 5 emphasized control
       • Control flow graph, call graph, finite state machines
     • We also need to reason about dependence
       • Where does this value of x come from?
       • What would be affected by changing this?
       • ...
     • Many program analyses and test design techniques use data flow information
       – Often in combination with control flow
       • Example: “Taint” analysis to prevent SQL injection attacks
       • Example: Dataflow test criteria (Ch.13)

     Learning objectives
     • Understand basics of data-flow models and the related concepts (def-use pairs, dominators…)
     • Understand some analyses that can be performed with the data-flow model of a program
       – The data flow analyses to build models
       – Analyses that use the data flow models
     • Understand basic trade-offs in modeling data flow
       – Variations and limitations of data-flow models and analyses, differing in precision and cost

     Def-Use Pairs (1)
     • A def-use (du) pair associates a point in a program where a value is produced with a point where it is used
     • Definition: where a variable gets a value
       – Variable declaration (often the special value “uninitialized”)
       – Variable initialization
       – Assignment
       – Values received by a parameter
     • Use: extraction of a value from a variable
       – Expressions
       – Conditional statements
       – Parameter passing
       – Returns

  2. (Ch 6, slides 5-8)

     Def-Use Pairs (3)

         /** Euclid's algorithm */
         public class GCD {
             public int gcd(int x, int y) {
                 int tmp;              // A: def x, y, tmp
                 while (y != 0) {      // B: use y
                     tmp = x % y;      // C: def tmp; use x, y
                     x = y;            // D: def x; use y
                     y = tmp;          // E: def y; use tmp
                 }
                 return x;             // F: use x
             }
         }

     Figure 6.2, page 79

     Def-Use Pairs

         ...
         if (...) {
             x = ... ;             // Definition: x gets a value
             ...
         }
         y = ... + x + ... ;       // Use: the value of x is extracted
         ...

     [Figure: the def-use path runs from the definition x = ... to the use y = ... + x + ...]

     Def-Use Pairs (3)
     • A definition-clear path is a path along the CFG from a definition to a use of the same variable without* another definition of the variable between
       – If, instead, another definition is present on the path, then the latter definition kills the former
     • A def-use pair is formed if and only if there is a definition-clear path between the definition and the use
     *There is an over-simplification here, which we will repair later.

     Definition-Clear or Killing

         x = ...        // A: def x
         q = ...
         x = y;         // B: kill x, def x
         z = ...
         y = f(x);      // C: use x

     • A (x = ...): Definition: x gets a value
     • B (x = y): Definition: x gets a new value, old value is killed
     • C (y = f(x)): Use: the value of x is extracted
     • Path A..C is not definition-clear
     • Path B..C is definition-clear
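The killing rule on the "Definition-Clear or Killing" slide can be sketched for straight-line code as follows. This is only an illustration under stated assumptions: the `Stmt` class, the `duPairs` method, and the labels "q" and "z" for the two unlabelled statements are hypothetical names, not part of the slides, and the sketch handles a single statement sequence rather than a full control flow graph.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of straight-line code; all names are illustrative. */
final class StraightLineDefUse {
    static final class Stmt {
        final String label;       // statement label, e.g. "A"
        final String def;         // variable defined here, or null
        final List<String> uses;  // variables whose values are read here

        Stmt(String label, String def, String... uses) {
            this.label = label;
            this.def = def;
            this.uses = List.of(uses);
        }
    }

    /** Returns def-use pairs formatted as "variable: defLabel -> useLabel". */
    static List<String> duPairs(List<Stmt> program) {
        // For each variable, the label of its most recent (unkilled) definition.
        Map<String, String> liveDef = new HashMap<>();
        List<String> pairs = new ArrayList<>();
        for (Stmt s : program) {
            // Uses are read before this statement's own definition takes effect.
            for (String v : s.uses) {
                String d = liveDef.get(v);
                if (d != null) {
                    pairs.add(v + ": " + d + " -> " + s.label);
                }
            }
            // A new definition kills the previous definition of the same variable.
            if (s.def != null) {
                liveDef.put(s.def, s.label);
            }
        }
        return pairs;
    }

    /** The five-statement fragment from the slide. */
    static List<Stmt> slideExample() {
        return List.of(
            new Stmt("A", "x"),        // x = ...
            new Stmt("q", "q"),        // q = ...
            new Stmt("B", "x", "y"),   // x = y;     kills the definition at A
            new Stmt("z", "z"),        // z = ...
            new Stmt("C", "y", "x"));  // y = f(x);
    }

    public static void main(String[] args) {
        System.out.println(duPairs(slideExample()));
    }
}
```

On the slide's fragment, the only pair reported for x is B -> C: the definition at A is killed at B before the use at C. The use of y at B yields no pair because y has no definition inside the fragment.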

  3. (Ch 6, slides 9-12)

     (Direct) Data Dependence Graph
     • A direct data dependence graph is:
       – Nodes: as in the control flow graph (CFG)
       – Edges: def-use (du) pairs, labelled with the variable name
     [Figure callout: Dependence edges show this x value could be the unchanged parameter or could be set at line D]
     (Figure 6.3, page 80)

     Control dependence (1)
     • Data dependence: Where did these values come from?
     • Control dependence: Which statement controls whether this statement executes?
       – Nodes: as in the CFG
       – Edges: unlabelled, from entry/branching points to controlled blocks

     Dominators
     • Pre-dominators in a rooted, directed graph can be used to make this intuitive notion of “controlling decision” precise.
     • Node M dominates node N if every path from the root to N passes through M.
       – A node will typically have many dominators, but except for the root, there is a unique immediate dominator of node N which is closest to N on any path from the root, and which is in turn dominated by all the other dominators of N.
       – Because each node (except the root) has a unique immediate dominator, the immediate dominator relation forms a tree.
     • Post-dominators: Calculated in the reverse of the control flow graph, using a special “exit” node as the root.

     Dominators (example)
     [Figure: control flow graph with nodes A, B, C, D, E, F, G]
     • A pre-dominates all nodes; G post-dominates all nodes
     • F and G post-dominate E
     • G is the immediate post-dominator of B
       – C does not post-dominate B
     • B is the immediate pre-dominator of G
       – F does not pre-dominate G
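The dominator definition above can be checked mechanically with a simple iterative computation. The sketch below is not the slides' method: it uses the common fixed-point formulation dom(root) = {root} and dom(n) = {n} ∪ (intersection of dom(p) over all predecessors p). The edge list in `slideGraph` is an assumption reconstructed from the statements on the example slide (A→B, B→C, B→E, C→D, E→F, D→G, F→G), since the figure itself did not survive; class and method names are hypothetical. It assumes every node is reachable from the root.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative dominator computation; names and example edges are assumptions. */
final class Dominators {
    /** Maps each node to the set of nodes that dominate it (including itself). */
    static Map<String, Set<String>> dominators(Map<String, List<String>> succ, String root) {
        Set<String> nodes = new TreeSet<>(succ.keySet());
        succ.values().forEach(nodes::addAll);

        // Predecessor lists derived from the successor lists.
        Map<String, List<String>> pred = new HashMap<>();
        for (String n : nodes) pred.put(n, new ArrayList<>());
        succ.forEach((n, ss) -> ss.forEach(s -> pred.get(s).add(n)));

        // Start from "every node dominates every node" and shrink to a fixed point.
        Map<String, Set<String>> dom = new HashMap<>();
        for (String n : nodes) dom.put(n, new TreeSet<>(nodes));
        dom.put(root, new TreeSet<>(Set.of(root)));

        boolean changed = true;
        while (changed) {
            changed = false;
            for (String n : nodes) {
                if (n.equals(root)) continue;
                Set<String> next = new TreeSet<>(nodes);
                for (String p : pred.get(n)) next.retainAll(dom.get(p));
                next.add(n);
                if (!next.equals(dom.get(n))) {
                    dom.put(n, next);
                    changed = true;
                }
            }
        }
        return dom;
    }

    /** Reverses every edge; post-dominators are dominators of the reversed graph. */
    static Map<String, List<String>> reverse(Map<String, List<String>> succ) {
        Map<String, List<String>> rev = new HashMap<>();
        succ.forEach((n, ss) -> {
            rev.computeIfAbsent(n, k -> new ArrayList<>());
            for (String s : ss) rev.computeIfAbsent(s, k -> new ArrayList<>()).add(n);
        });
        return rev;
    }

    /** Assumed edges for the example graph on the slide. */
    static Map<String, List<String>> slideGraph() {
        return Map.of(
            "A", List.of("B"), "B", List.of("C", "E"),
            "C", List.of("D"), "E", List.of("F"),
            "D", List.of("G"), "F", List.of("G"),
            "G", List.<String>of());
    }

    public static void main(String[] args) {
        System.out.println("pre:  " + dominators(slideGraph(), "A"));
        System.out.println("post: " + dominators(reverse(slideGraph()), "G"));
    }
}
```

With these assumed edges, the pre-dominators of G come out as A, B and G (so F does not pre-dominate G), and the post-dominators of B are B and G (so C does not post-dominate B), matching the bullets on the example slide. Note that this formulation treats every node as dominating itself.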

  4. (Ch 6, slides 13-16)

     Control dependence (2)
     • We can use post-dominators to give a more precise definition of control dependence:
       – Consider again a node N that is reached on some but not all execution paths.
       – There must be some node C with the following property:
         • C has at least two successors in the control flow graph (i.e., it represents a control flow decision);
         • C is not post-dominated by N;
         • there is a successor of C in the control flow graph that is post-dominated by N.
       – When these conditions are true, we say node N is control-dependent on node C.
     • Intuitively: C was the last decision that controlled whether N executed

     Control Dependence
     [Figure: control flow graph with nodes A, B, C, D, E, F, G]
     • Execution of F is not inevitable at B
     • Execution of F is inevitable at E
     • F is control-dependent on B, the last point at which its execution was not inevitable

     Data Flow Analysis
     Computing data flow information

     Calculating def-use pairs
     • Definition-use pairs can be defined in terms of paths in the program control flow graph:
       – There is an association (d,u) between a definition of variable v at d and a use of variable v at u iff
         • there is at least one control flow path from d to u
         • with no intervening definition of v.
       – v_d reaches u (v_d is a reaching definition at u).
       – If a control flow path passes through another definition e of the same variable v, v_e kills v_d at that point.
     • Even if we consider only loop-free paths, the number of paths in a graph can be exponentially larger than the number of nodes and edges.
     • Practical algorithms therefore do not search every individual path. Instead, they summarize the reaching definitions at a node over all the paths reaching that node.
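The three conditions for control dependence can be applied directly once post-dominator sets are known. The following is a minimal sketch, assuming the example graph has the edges A→B, B→C, B→E, C→D, E→F, D→G, F→G (reconstructed from the slide text, since the figure is missing) and taking the post-dominator sets as given input data rather than computing them. `ControlDependence`, `controlDependentOn`, and the two data methods are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative check of the control dependence conditions; names are assumptions. */
final class ControlDependence {
    /**
     * Returns every node N that is control-dependent on node c:
     * c has at least two successors, c is not post-dominated by N,
     * and some successor of c is post-dominated by N.
     */
    static Set<String> controlDependentOn(String c,
                                          Map<String, List<String>> succ,
                                          Map<String, Set<String>> postDom) {
        Set<String> result = new TreeSet<>();
        List<String> successors = succ.getOrDefault(c, List.of());
        if (successors.size() < 2) return result;        // not a decision point
        for (String n : postDom.keySet()) {
            if (postDom.get(c).contains(n)) continue;    // c is post-dominated by n
            for (String s : successors) {
                if (postDom.get(s).contains(n)) {        // n is inevitable after s
                    result.add(n);
                    break;
                }
            }
        }
        return result;
    }

    /** Assumed successor lists for the example graph. */
    static Map<String, List<String>> slideSuccessors() {
        return Map.of(
            "A", List.of("B"), "B", List.of("C", "E"),
            "C", List.of("D"), "E", List.of("F"),
            "D", List.of("G"), "F", List.of("G"),
            "G", List.<String>of());
    }

    /** Post-dominator sets for the assumed graph (each node post-dominates itself). */
    static Map<String, Set<String>> slidePostDominators() {
        return Map.of(
            "A", Set.of("A", "B", "G"), "B", Set.of("B", "G"),
            "C", Set.of("C", "D", "G"), "D", Set.of("D", "G"),
            "E", Set.of("E", "F", "G"), "F", Set.of("F", "G"),
            "G", Set.of("G"));
    }

    public static void main(String[] args) {
        System.out.println(controlDependentOn("B", slideSuccessors(), slidePostDominators()));
    }
}
```

Under these assumptions B is the only decision node, and C, D, E and F are control-dependent on it, which includes the slide's observation that F is control-dependent on B. G is excluded because it post-dominates B: its execution is already inevitable at B.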

  5. (Ch 6, slides 17-20)

     Exponential paths (even without loops)
     [Figure: chain of nodes A, B, C, D, E, F, G, V, with two alternative routes between each pair of neighbouring nodes]
     • 2 paths from A to B
     • 4 from A to C
     • 8 from A to D
     • 16 from A to E
     • ...
     • 128 paths from A to V
     Tracing each path is not efficient, and we can do much better.

     DF Algorithm
     • An efficient algorithm for computing reaching definitions (and several other properties) is based on the way reaching definitions at one node are related to the reaching definitions at an adjacent node.
     • Suppose we are calculating the reaching definitions of node n, and there is an edge (p,n) from an immediate predecessor node p.
       – If the predecessor node p can assign a value to variable v, then the definition v_p reaches n. We say the definition v_p is generated at p.
       – If a definition v_p of variable v reaches a predecessor node p, and if v is not redefined at that node (in which case we say the v_p is killed at that point), then the definition is propagated on from p to n.

     Equations of node E (y = tmp)

         public class GCD {
             public int gcd(int x, int y) {
                 int tmp;              // A: def x, y, tmp
                 while (y != 0) {      // B: use y
                     tmp = x % y;      // C: def tmp; use x, y
                     x = y;            // D: def x; use y
                     y = tmp;          // E: def y; use tmp
                 }
                 return x;             // F: use x
             }
         }

     Calculate reaching definitions at E in terms of its immediate predecessor D:
     • Reach(E) = ReachOut(D)
     • ReachOut(E) = (Reach(E) \ {y_A}) ∪ {y_E}

     Equations of node B (while (y != 0))
     (same GCD code as above)
     This line has two predecessors: before the loop, and the end of the loop.
     • Reach(B) = ReachOut(A) ∪ ReachOut(E)
     • ReachOut(A) = gen(A) = {x_A, y_A, tmp_A}
     • ReachOut(E) = (Reach(E) \ {y_A}) ∪ {y_E}
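The gen/kill equations above can be solved for all nodes of the GCD example with a worklist iteration. This is a minimal sketch, not the book's implementation: the class name, the encoding of a definition as a string such as "x_A", and the edge list (A→B, B→C, B→F, C→D, D→E, E→B, read off the code's loop structure) are assumptions made for illustration. One difference from the slide's equations is flagged in the comments: a node here kills every incoming definition of the variable it defines, including its own earlier one, which gives the same sets because the node's own definition is added back by gen.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative worklist solver for reaching definitions; names are assumptions. */
final class ReachingDefinitions {
    /**
     * succ must contain a key for every node (use an empty list for exit nodes).
     * defs maps a node to the variables it defines. Returns Reach(n) for each node.
     */
    static Map<String, Set<String>> reach(Map<String, List<String>> succ,
                                          Map<String, List<String>> defs) {
        Map<String, List<String>> pred = new HashMap<>();
        for (String n : succ.keySet()) pred.put(n, new ArrayList<>());
        for (Map.Entry<String, List<String>> e : succ.entrySet())
            for (String s : e.getValue()) pred.get(s).add(e.getKey());

        Map<String, Set<String>> in = new HashMap<>();
        Map<String, Set<String>> out = new HashMap<>();
        for (String n : succ.keySet()) {
            in.put(n, new TreeSet<>());
            out.put(n, new TreeSet<>());
        }

        Deque<String> work = new ArrayDeque<>(succ.keySet());
        while (!work.isEmpty()) {
            String n = work.remove();
            // Reach(n) = union of ReachOut(p) over all predecessors p.
            Set<String> reachN = new TreeSet<>();
            for (String p : pred.get(n)) reachN.addAll(out.get(p));
            in.put(n, reachN);

            // ReachOut(n) = (Reach(n) \ kill(n)) ∪ gen(n).
            // Assumption: kill(n) is every definition of a variable defined at n.
            List<String> defined = defs.getOrDefault(n, List.of());
            Set<String> newOut = new TreeSet<>();
            for (String d : reachN)
                if (!defined.contains(d.substring(0, d.indexOf('_')))) newOut.add(d);
            for (String v : defined) newOut.add(v + "_" + n);

            // Revisit successors only when this node's output changed.
            if (!newOut.equals(out.get(n))) {
                out.put(n, newOut);
                for (String s : succ.get(n)) if (!work.contains(s)) work.add(s);
            }
        }
        return in;
    }

    /** Assumed control flow edges of the GCD method, nodes labelled A..F as in the slides. */
    static Map<String, List<String>> gcdSuccessors() {
        return Map.of(
            "A", List.of("B"), "B", List.of("C", "F"), "C", List.of("D"),
            "D", List.of("E"), "E", List.of("B"), "F", List.<String>of());
    }

    /** Variables defined at each node, taken from the comments in the GCD code. */
    static Map<String, List<String>> gcdDefs() {
        return Map.of(
            "A", List.of("x", "y", "tmp"), "C", List.of("tmp"),
            "D", List.of("x"), "E", List.of("y"));
    }

    public static void main(String[] args) {
        System.out.println(new java.util.TreeMap<>(reach(gcdSuccessors(), gcdDefs())));
    }
}
```

Under these assumptions, the definitions of x that reach F are x_A and x_D, which agrees with the earlier observation that the returned x could be the unchanged parameter or could be set at line D. Each node is revisited only when a predecessor's output set changes, so no individual path is ever traced.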
