Interprocedural Analysis: Sharir-Pnueli's Call-strings Approach
Deepak D'Souza, Department of Computer Science and Automation, Indian Institute of Science, Bangalore. 06 October 2010
Call strings approach
For a given program P and analysis ((D, ≤), fMN, d0), the join over all interprocedurally valid paths (JVP) at point N is defined to be:
$$\bigsqcup_{\rho \in IVP(r_1, N)} f_\rho(d_0).$$
Idea: collect the data values that reach each point, tagged with the call-string of the associated path. The tags tell us which values pass to a given return site. Now we can set up equations that capture the JVP values.
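[Figure: control-flow graph of the example program. A main procedure calls p at call site c1, and p calls itself recursively at call site c2. Statements shown: read a,b; t := a*b; call p; print t; a == 0; a := a−1; ret. Program points are labelled A to Q.]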
Call-string along an interprocedurally valid path
The call-string associated with an IVP path ρ, denoted CM(ρ), is the sequence of pending calls in ρ.
A path ρ in IVP(r1, I) for the example program:
[Figure: a path ρ through the example program, drawn across call depths 1, 2 and 3.]
The associated call-string CM(ρ) is c1.
For ρ′ = ABCOFGHLF, CM(ρ′) = c1c2.
Denote the set of all call-strings for the given program by Γ.
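As a minimal sketch of how CM(ρ) can be computed, the function below scans a path and keeps a stack of pending calls. The event encoding (`"call"`, `"ret"`, `"step"` paired with a label) and the name `call_string` are assumptions made for illustration; they are not notation from the slides.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, str]  # hypothetical encoding: (kind, label)


def call_string(path: List[Event]) -> Optional[Tuple[str, ...]]:
    """Return CM(path) as a tuple of pending call sites.

    Returns None when a return does not match the most recent pending
    call, i.e. when the path is not interprocedurally valid.
    """
    pending: List[str] = []
    for kind, label in path:
        if kind == "call":
            pending.append(label)
        elif kind == "ret":
            if not pending or pending[-1] != label:
                return None
            pending.pop()
        # any other kind (e.g. "step") leaves the call-string unchanged
    return tuple(pending)


# Assumed event abstraction of ρ′ = ABCOFGHLF: c1 and then c2 are pending.
rho_prime = [("step", "A"), ("step", "B"), ("call", "c1"),
             ("step", "F"), ("step", "G"), ("call", "c2"), ("step", "F")]
print(call_string(rho_prime))  # ('c1', 'c2')
```

A matched return pops the most recent call, so a path that calls c1, calls c2 and then returns from c2 is left with the call-string c1.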
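Tagging with call-strings
Classify the paths reaching N according to their call-strings. For each call-string γ maintain the data value
$$d = \bigsqcup_{\rho \in CM^{-1}(\gamma)} f_\rho(d_0),$$
where ρ ranges over the paths in IVP(r1, N) with CM(ρ) = γ.
Thus elements of the tagged lattice D∗ are maps ξ : Γ → D, and the ordering ξ1 ≤ ξ2 is the pointwise extension of ≤ in D.
Tagged JVP value:
$$\xi^*_N : \gamma \mapsto \bigsqcup_{\rho \in CM^{-1}(\gamma)} f_\rho(d_0).$$
JVP value:
$$d_N = \bigsqcup_{\gamma \in \Gamma} \xi^*_N(\gamma).$$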
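The step from tagged values back to the JVP value can be sketched as a join over all tags. The sketch assumes a simple lattice D of fact sets with union as join, and represents ξ as a dictionary in which a missing call-string stands for ⊥; both choices, and the name `jvp_value`, are illustrative assumptions.

```python
from functools import reduce
from typing import Dict, FrozenSet, Optional, Tuple

CallString = Tuple[str, ...]
Value = FrozenSet[str]            # assumed D: sets of facts, join = union
Tagged = Dict[CallString, Value]  # a call-string that is absent maps to ⊥


def jvp_value(xi: Tagged) -> Optional[Value]:
    """d_N = join of ξ_N(γ) over all γ; None stands for ⊥ (no path reaches N)."""
    if not xi:
        return None
    return reduce(lambda a, b: a | b, xi.values())


xi_N: Tagged = {
    (): frozenset({"x"}),
    ("c1",): frozenset({"y"}),
    ("c1", "c2"): frozenset({"x"}),
}
print(sorted(jvp_value(xi_N)))  # ['x', 'y']
```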
Example: Tagging
E.g., the path ABCOFGHLFKJ has the associated call-string c1c2.
[Figure: the example program with tables of tagged data values ξ(γ) for call-strings γ such as ε, c1, c1c2 and c1c2c2. Caption: Tagged data values at J for the availability of a*b analysis.]
Data-flow analysis with tagged data values
Let D∗ = Γ → D.
Pointwise ordering on D∗: ξ ≤′ ξ′ iff ξ(γ) ≤ ξ′(γ) for each call-string γ.
(D∗, ≤′) is also a complete lattice.
The initial value ξ0 is given by
$$\xi_0(\gamma) = \begin{cases} d_0 & \text{if } \gamma = \epsilon \\ \bot & \text{otherwise.} \end{cases}$$
Transfer functions for non-call/ret nodes:
$$f^*_{MN} = \lambda \xi.\; f_{MN} \circ \xi.$$
The transfer functions f∗MN are monotonic (distributive) if the fMN are monotonic (distributive).
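A minimal sketch of the tagged lattice operations, under stated assumptions: D is taken to be fact sets ordered by inclusion, ξ is a dictionary whose missing call-strings stand for ⊥, and every fMN is assumed to map ⊥ to ⊥ so that missing tags can simply be skipped. The names `initial`, `leq` and `lift` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Tuple

CallString = Tuple[str, ...]
Value = FrozenSet[str]            # assumed D (without ⊥): fact sets ordered by ⊆
Tagged = Dict[CallString, Value]  # a call-string that is absent maps to ⊥


def initial(d0: Value) -> Tagged:
    """ξ0: d0 at the empty call-string ε, ⊥ everywhere else."""
    return {(): d0}


def leq(xi1: Tagged, xi2: Tagged) -> bool:
    """Pointwise order ≤′: every non-⊥ entry of ξ1 is below the entry of ξ2."""
    return all(g in xi2 and v <= xi2[g] for g, v in xi1.items())


def lift(f: Callable[[Value], Value]) -> Callable[[Tagged], Tagged]:
    """f* = λξ. f ∘ ξ, assuming f(⊥) = ⊥ so absent tags stay absent."""
    return lambda xi: {g: f(v) for g, v in xi.items()}


gen_ab = lambda v: v | {"a*b"}    # hypothetical transfer function
xi_B = initial(frozenset())
xi_C = lift(gen_ab)(xi_B)
print(leq(xi_B, xi_C))  # True
print(leq(xi_C, xi_B))  # False
```

Because `lift` applies the same function under every tag, it inherits monotonicity from the function it wraps.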
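Transfer functions f∗MN by example
(Non-call/ret node) $\xi_C = f_{BC} \circ \xi_B$.
(Call node)
$$\xi_F(\gamma) = \begin{cases} \xi_C(\gamma') & \text{if } \gamma = \gamma' \cdot c_1 \\ \bot & \text{otherwise.} \end{cases}$$
(Return site) $\xi_P(\gamma) = \xi_J(\gamma \cdot c_1)$.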
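The call and return equations can be sketched as operations on tag dictionaries: a call appends the call site to every tag, and a return keeps only the tags that end in that site and strips it. The dictionary representation (missing call-string = ⊥) and the names `call_edge` and `return_edge` are assumptions for illustration.

```python
from typing import Dict, FrozenSet, Tuple

CallString = Tuple[str, ...]
Value = FrozenSet[str]
Tagged = Dict[CallString, Value]  # a call-string that is absent maps to ⊥


def call_edge(xi_caller: Tagged, site: str) -> Tagged:
    """ξ_F(γ) = ξ_C(γ′) if γ = γ′·site, ⊥ otherwise."""
    return {g + (site,): v for g, v in xi_caller.items()}


def return_edge(xi_exit: Tagged, site: str) -> Tagged:
    """ξ_P(γ) = ξ_J(γ·site): only values tagged with this site come back."""
    return {g[:-1]: v for g, v in xi_exit.items() if g and g[-1] == site}


xi_J: Tagged = {
    ("c1",): frozenset({"from c1"}),
    ("c1", "c2"): frozenset({"from c2"}),
}
print(return_edge(xi_J, "c1"))  # only the value tagged c1, now under ε
print(call_edge({(): frozenset({"d0"})}, "c1"))
```

The filter in `return_edge` is what keeps values that entered through c2 from flowing back to the return site of c1.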
Correctness claims
Claim: Let the LFP of the analysis ((D∗, ≤′), f∗MN, ξ0) be ξ∗. Then
$$x^*_N = \bigsqcup_{\gamma \in \Gamma} \xi^*_N(\gamma)$$
is an over-approximation of the JVP at N. When the fMN are distributive, x∗N coincides with the JVP at N.
Exercise
Use Kildall's algorithm to compute the ξ table values for the example program, for |γ| ≤ 4. Start with the initial value d0 = 0.
[Figure: the example program, annotated step by step with the first ξ table entries for the call-strings ε and c1.]
Convergence of iteration
The lattice (D∗, ≤′) is infinite for recursive programs, since Γ is then infinite. However, it is possible to bound the length of the call-strings in Γ that we need to consider. Let k be the number of call sites in P.
Claim: For any path p in IVP(r1, N) with a prefix q such that |CM(q)| > k|D|² = M, there is a path p′ in IVP(r1, N) with |CM(q′)| ≤ M for each prefix q′ of p′, and fp(d0) = fp′(d0).
[Figure: Paths with bounded call-strings. A path p whose call-string length exceeds M, and a path p′ that stays within M.]
The proof follows shortly.
Ensuring convergence
Go over to a finite lattice. Consider only call-strings of length ≤ M (call this set ΓM).
IVPΓM(r1, N) = paths from r1 to N such that for each prefix q, |CM(q)| ≤ M.
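To get a feel for the size of the bound, the sketch below computes M = k|D|² and a crude upper bound on the number of tags (all strings over k call sites of length at most M). The sample values k = 2 and |D| = 3, and all function names, are assumptions chosen only for illustration.

```python
from typing import Tuple

CallString = Tuple[str, ...]


def call_string_bound(num_call_sites: int, lattice_size: int) -> int:
    """M = k * |D|^2."""
    return num_call_sites * lattice_size ** 2


def in_gamma_m(gamma: CallString, bound: int) -> bool:
    """γ is in Γ_M iff |γ| ≤ M."""
    return len(gamma) <= bound


def max_tags(num_call_sites: int, bound: int) -> int:
    """Upper bound on |Γ_M|: all strings over k call sites of length ≤ M."""
    return sum(num_call_sites ** i for i in range(bound + 1))


M = call_string_bound(num_call_sites=2, lattice_size=3)  # assumed k and |D|
print(M)                                   # 18
print(max_tags(2, M))                      # 524287
print(in_gamma_m(("c1", "c2", "c2"), M))   # True
```

Even for this tiny setting the tag space is large, so the bound guarantees termination rather than efficiency.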
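Data-flow analysis for JVP over IVPΓM
(Non-call/ret node) $\xi_C = f_{BC} \circ \xi_B$.
(Call node)
$$\xi_F(\gamma) = \begin{cases} \xi_C(\gamma') & \text{if } \gamma = \gamma' \cdot c_1 \text{ and } \gamma \in \Gamma_M \\ \bot & \text{otherwise.} \end{cases}$$
(Return site) $\xi_P(\gamma) = \xi_J(\gamma \cdot c_1)$.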
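A minimal worklist sketch of iterating these bounded equations to a fixpoint, in the style of Kildall's algorithm. Everything concrete here is an assumption: the lattice D is fact sets with union as join, a missing tag stands for ⊥, transfer functions are assumed to map ⊥ to ⊥, and the small recursive program (points A, B, F, G, K, H, D with call sites c1 and c2) is hypothetical rather than the example program from the slides. The bound of 4 is a small value chosen for illustration, not k|D|².

```python
from collections import deque
from typing import Callable, Dict, FrozenSet, List, Tuple

CallString = Tuple[str, ...]
Value = FrozenSet[str]                    # assumed D: fact sets, join = union
Tagged = Dict[CallString, Value]          # a call-string that is absent maps to ⊥
Edge = Tuple[str, str, Callable[[Tagged], Tagged]]


def step(f: Callable[[Value], Value]) -> Callable[[Tagged], Tagged]:
    """Non-call/ret edge: apply f under every tag."""
    return lambda xi: {g: f(v) for g, v in xi.items()}


def call(site: str, bound: int) -> Callable[[Tagged], Tagged]:
    """Call edge: append the site, dropping tags that would leave Γ_M."""
    return lambda xi: {g + (site,): v for g, v in xi.items() if len(g) < bound}


def ret(site: str) -> Callable[[Tagged], Tagged]:
    """Return edge: keep tags ending in the site and strip it."""
    return lambda xi: {g[:-1]: v for g, v in xi.items() if g and g[-1] == site}


def join_tagged(x: Tagged, y: Tagged) -> Tagged:
    """Pointwise join of two tagged values."""
    out = dict(x)
    for g, v in y.items():
        out[g] = out[g] | v if g in out else v
    return out


def kildall(edges: List[Edge], root: str, xi0: Tagged) -> Dict[str, Tagged]:
    """Propagate tagged values along edges until nothing changes."""
    xi: Dict[str, Tagged] = {root: xi0}
    work = deque([root])
    while work:
        m = work.popleft()
        for src, dst, f in edges:
            if src != m:
                continue
            old = xi.get(dst, {})
            new = join_tagged(old, f(xi[m]))
            if new != old:
                xi[dst] = new
                work.append(dst)
    return xi


bound = 4
same = lambda v: v
edges: List[Edge] = [
    ("A", "B", step(lambda v: v | {"read"})),
    ("B", "F", call("c1", bound)),          # main calls p
    ("F", "G", step(same)),
    ("G", "F", call("c2", bound)),          # p calls itself
    ("F", "K", step(lambda v: v | {"t"})),
    ("K", "H", ret("c2")),                  # return to the recursive call site
    ("H", "K", step(same)),
    ("K", "D", ret("c1")),                  # return to main
]
result = kildall(edges, "A", {(): frozenset()})
print(sorted(result["F"]))
print(result["D"])
```

Termination relies on the finite tag set and the finite fact universe; without the length check in `call`, the recursive edge would keep producing longer tags.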
Bounding call-string size
Claim: For any path p in IVP(r1, N) such that |CM(q)| > M = k|D|² for some prefix q of p, there is a path p′ in IVPΓM(r1, N) with fp′(d0) = fp(d0).
It is sufficient to prove:
Subclaim: For any path p in IVP(r1, N) with a prefix q such that |CM(q)| > M, we can produce a shorter path p′ in IVP(r1, N) with fp′(d0) = fp(d0).
This suffices because repeated application of the subclaim yields ever shorter paths and so must stop, and if |p| ≤ M then p ∈ IVPΓM.
Proving subclaim: Path decomposition
A path ρ in IVP(r1, n) can be decomposed as
$$\rho = \rho_1 (c_1, r_{p_2}) \rho_2 (c_2, r_{p_3}) \rho_3 \cdots (c_{j-1}, r_{p_j}) \rho_j,$$
where each ρi (i < j) is a valid and complete path from rpi to ci, and ρj is a valid and complete path from rpj to n.
Thus c1, ..., cj−1 are the unfinished calls at the end of ρ.
[Figure: a path drawn across call depths 1 to 4, with labels c1, c2 and r2, and the segments ρ2 and ρ3 marked.]
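Proving subclaim
Let p0 be the first prefix of p where |CM(p0)| > M.
Let the decomposition of p0 be ρ1(c1, rp2)ρ2(c2, rp3)ρ3 · · · (cj−1, rpj)ρj.
Tag each unfinished call ci in p0 by $(c_i,\ f_{q \cdot c_i}(d_0),\ f_{q \cdot c_i \cdot q' \cdot e_{i+1}}(d_0))$, where q·ci is the prefix of p up to and including the call ci, ei+1 is the corresponding return of ci in p, and q′ is the segment between the call and that return.
If there is no return for ci in p, tag it with $(c_i,\ f_{q \cdot c_i}(d_0),\ \bot)$.
The number of distinct such tags is k · |D|² = M. Since p0 has more than M unfinished calls, there are two calls qc and qcq′c (with q′ here denoting the segment between the two calls) that have the same tag values.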
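The pigeonhole step can be sketched as a search for the first repeated tag in the list of tags of the unfinished calls. The tag encoding (call site, value after the call, value after the return, with `None` standing for ⊥) and the name `first_repeated_tag` are illustrative assumptions.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Tag = Tuple[Hashable, Hashable, Hashable]  # (call site, value at call, value at return)


def first_repeated_tag(tags: List[Tag]) -> Optional[Tuple[int, int]]:
    """Return indices (i, j), i < j, of two unfinished calls with equal tags.

    With more than k * |D|^2 unfinished calls such a pair must exist;
    None is returned only when all tags are distinct.
    """
    seen: Dict[Tag, int] = {}
    for j, tag in enumerate(tags):
        if tag in seen:
            return seen[tag], j
        seen[tag] = j
    return None


# Hypothetical tags for four unfinished calls, in path order.
tags = [("c1", "d0", "d1"), ("c2", "d1", "d2"),
        ("c2", "d0", "d2"), ("c2", "d1", "d2")]
print(first_repeated_tag(tags))  # (1, 3)
```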
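Proving subclaim: tag values are ⊥
[Figure: a path p whose call-string length exceeds M, containing two calls c with the same tag and no matching returns, shown with the shorter path p′.]
Proving subclaim: tag values are not ⊥
[Figure: a path p whose call-string length exceeds M, containing two calls c with the same tag and their matching returns e, shown with the shorter path p′.]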
Example
[Figure: example program with procedures p1 and p2. Statements shown: read a,b; call p2; call p1 (twice); t := a*b; a := 0; return. Nodes carry labels such as c1, c2, r1, e1, e2, n1, n2 and m2.]