 
Type-Based Analysis and Applications

Jens Palsberg
Purdue University, Department of Computer Science
www.cs.purdue.edu/people/palsberg

Supported by an NSF CAREER award.

Terminology

A type-based analysis assumes that the program type checks, and the analysis takes advantage of that.

- What is an example of a type-based analysis?
- What are the advantages of type-based analysis?
- Is type-based analysis competitive with other approaches to static analysis?
- Which tools use type-based analysis?
- What is the current spectrum of type-based analyses?

Static Analysis: Past Successes

- Static analysis for optimizing compilers
- Static analysis for software engineering: program understanding, debugging, testing, reverse engineering

Venues: Static Analysis Symposium; International Symposium on Software Testing and Analysis

Static Analysis: Future Challenges

- Verification of key properties of software:
    - real-time properties
    - security-related behavior
    - power consumption
- Highly efficient static analysis for run-time compilation
- Scalable static analysis

[Figure: Program -> (Model Extraction) -> Model -> (Model Checking) -> Properties]

The Questions

A type-based analysis assumes that the program type checks, and the analysis takes advantage of that.

Can the types help with:
- defining more complicated analyses?
- reasoning about the correctness of an analysis?
- making the static analyses more efficient?

Example: Flow Analysis for the λ-Calculus

Four well-known static analyses:
1. 0-CFA (does not rely on types)
2. type and effect system (type based)
3. sparse flow graph (type based)
4. types as discriminators (type based)

Example: Flow Analysis for the λ-Calculus

\[ e ::= x \mid \lambda^l x.e \mid e_1\,e_2 \]

What are the possible results of evaluating an expression?
A flow set is a set of labels of λ-abstractions.
Goal: compute a flow set that conservatively estimates the possible results.
Conservative = if $v$ is a possible value of $e$, then the label of $v$ must be in the flow set of $e$.

Running example:
\[ F = (\lambda^1 f.\lambda^2 x.f\,x)\,(\lambda^3 a.a)\,(\lambda^4 b.b) \]
\[ F \rightarrow_\beta (\lambda^2 x.(\lambda^3 a.a)\,x)\,(\lambda^4 b.b) \rightarrow_\beta (\lambda^3 a.a)\,(\lambda^4 b.b) \rightarrow_\beta \lambda^4 b.b \]

So, a flow analysis must produce a flow set for $F$ that contains the label 4.

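0-CFA

Idea: flow graph.

Nodes: $n ::= e$ ($e$ occurs in the program $E$)

Edges [Heintze & McAllester 1997]:
\[ \lambda^l x.e \rightarrow \lambda^l x.e \quad (1) \]
\[ \frac{e_1 \rightarrow \lambda^l x.e \qquad e_1\,e_2 \text{ occurs in } E}{x \rightarrow e_2} \quad (2) \]
\[ \frac{e_1 \rightarrow \lambda^l x.e \qquad e_1\,e_2 \text{ occurs in } E}{e_1\,e_2 \rightarrow e} \quad (3) \]
\[ \frac{e_1 \rightarrow e_2 \qquad e_2 \rightarrow e_3}{e_1 \rightarrow e_3} \quad (4) \]

Idea: the flow set for $e$ is the set of labels of abstractions $\lambda^l x.e'$ such that there is an edge $e \rightarrow \lambda^l x.e'$ in the flow graph.

Complexity: $O(n^3)$ time.
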
Running Example

For $F$, we can use Rules (1)–(4) to generate the edges:
\[ f \rightarrow \lambda^3 a.a \]
\[ (\lambda^1 f.\lambda^2 x.f\,x)\,(\lambda^3 a.a) \rightarrow \lambda^2 x.f\,x \]
\[ F \rightarrow f\,x \rightarrow a \rightarrow x \rightarrow \lambda^4 b.b \]
so by transitivity (Rule (4)), we have $F \rightarrow \lambda^4 b.b$,
so the flow set for $F$ is $\{4\}$.

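The edge rules (1)–(4) can be tried out directly on the running example. The following is a minimal sketch, not the slides' own implementation: the tuple encoding of terms, the helper names (`var`, `lam`, `app`, `zero_cfa`, `flow_set`), and the naive fixed-point loop are all assumptions made for illustration, and the loop makes no attempt to meet the $O(n^3)$ bound. Subterms are identified structurally, which is adequate here because every bound variable in $F$ has a distinct name.

```python
from itertools import product

# Hypothetical term encoding (not from the slides): nested tuples.
def var(x): return ("var", x)
def lam(label, x, body): return ("lam", label, x, body)
def app(e1, e2): return ("app", e1, e2)

def subterms(e):
    """Yield every subterm of e, including e itself."""
    yield e
    if e[0] == "lam":
        yield from subterms(e[3])
    elif e[0] == "app":
        yield from subterms(e[1])
        yield from subterms(e[2])

def zero_cfa(program):
    """Close the edge set under Rules (1)-(4) by naive iteration."""
    terms = set(subterms(program))
    edges = {(t, t) for t in terms if t[0] == "lam"}       # Rule (1)
    apps = [t for t in terms if t[0] == "app"]
    changed = True
    while changed:
        new = set()
        for (_, e1, e2) in apps:
            for (src, dst) in edges:
                if src == e1 and dst[0] == "lam":
                    _, _, x, body = dst
                    new.add((var(x), e2))                  # Rule (2)
                    new.add((app(e1, e2), body))           # Rule (3)
        for (a, b), (c, d) in product(edges, edges):       # Rule (4)
            if b == c:
                new.add((a, d))
        changed = not new <= edges
        edges |= new
    return edges

def flow_set(edges, e):
    """Labels of abstractions that e has an edge to."""
    return {dst[1] for (src, dst) in edges if src == e and dst[0] == "lam"}

# Running example: F = (λ1 f. λ2 x. f x) (λ3 a. a) (λ4 b. b)
M = lam(1, "f", lam(2, "x", app(var("f"), var("x"))))
F = app(app(M, lam(3, "a", var("a"))), lam(4, "b", var("b")))
EDGES = zero_cfa(F)
print(flow_set(EDGES, F))
```

Under this encoding, `flow_set(EDGES, var("f"))` and `flow_set(EDGES, var("x"))` give the flow sets of the two bound variables of the first abstraction.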
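A Simple Type System

Types: $t ::= \alpha \mid t \rightarrow t$ ($\alpha$ is a type variable)

The type rules:
\[ A \vdash x : t \quad (A(x) = t) \quad (5) \]
\[ \frac{A[x : s] \vdash e : t}{A \vdash \lambda^l x.e : s \rightarrow t} \quad (6) \]
\[ \frac{A \vdash e_1 : s \rightarrow t \qquad A \vdash e_2 : s}{A \vdash e_1\,e_2 : t} \quad (7) \]
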
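Running Example

For $F$, we can use Rules (5)–(7) to construct a type derivation which contains the judgments:
\[ \emptyset \vdash \lambda^1 f.\lambda^2 x.f\,x : ((\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha)) \rightarrow ((\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha)) \]
\[ \emptyset[f : (\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha)] \vdash \lambda^2 x.f\,x : (\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha) \]
\[ \emptyset \vdash \lambda^3 a.a : (\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha) \]
\[ \emptyset \vdash \lambda^4 b.b : \alpha \rightarrow \alpha \]
\[ \emptyset \vdash F : \alpha \rightarrow \alpha \]
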
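The judgments above can be re-checked mechanically. The sketch below is an illustration under stated assumptions, not part of the slides: Rules (5)–(7) do not say how to choose the parameter type $s$ in Rule (6), so the checker takes that choice as an input table `PARAM_TYPE` (a hypothetical name) indexed by abstraction label, filled in with the types used in the derivation for $F$. The tuple encoding of terms and types is likewise assumed.

```python
# Hypothetical encodings (not from the slides): terms and types as tuples.
def var(x): return ("var", x)
def lam(label, x, body): return ("lam", label, x, body)
def app(e1, e2): return ("app", e1, e2)

ALPHA = "α"
def arrow(s, t): return ("->", s, t)

def type_of(env, e, param_type):
    """Build the type of e following Rules (5)-(7).

    env maps variables to types; param_type maps each abstraction
    label to the type assumed for its bound variable.
    """
    if e[0] == "var":                                   # Rule (5)
        return env[e[1]]
    if e[0] == "lam":                                   # Rule (6)
        _, label, x, body = e
        s = param_type[label]
        t = type_of({**env, x: s}, body, param_type)
        return arrow(s, t)
    if e[0] == "app":                                   # Rule (7)
        _, e1, e2 = e
        t1 = type_of(env, e1, param_type)
        t2 = type_of(env, e2, param_type)
        if not (isinstance(t1, tuple) and t1[1] == t2):
            raise TypeError("ill-typed application")
        return t1[2]
    raise ValueError("unknown term")

# Running example: F = (λ1 f. λ2 x. f x) (λ3 a. a) (λ4 b. b)
A2A = arrow(ALPHA, ALPHA)
PARAM_TYPE = {1: arrow(A2A, A2A), 2: A2A, 3: A2A, 4: ALPHA}
M = lam(1, "f", lam(2, "x", app(var("f"), var("x"))))
F = app(app(M, lam(3, "a", var("a"))), lam(4, "b", var("b")))
F_TYPE = type_of({}, F, PARAM_TYPE)
print(F_TYPE == A2A)
```

Supplying the parameter types by hand keeps the sketch short; inferring them automatically would need unification, which is outside what the slide shows.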
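A Type and Effect System

Idea: if we have the judgment $A \vdash e : s \xrightarrow{\varphi} t$, then the flow set for $e$ is $\varphi$.

Annotated types: $t ::= \alpha \mid t \xrightarrow{\varphi} t$ ($\varphi$ is a flow set)

Revised type rules:
\[ A \vdash x : t \quad (A(x) = t) \quad (8) \]
\[ \frac{A[x : s] \vdash e : t}{A \vdash \lambda^l x.e : s \xrightarrow{\varphi} t} \quad (l \in \varphi) \quad (9) \]
\[ \frac{A \vdash e_1 : s \xrightarrow{\varphi} t \qquad A \vdash e_2 : s}{A \vdash e_1\,e_2 : t} \quad (10) \]
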
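Running Example

For $F$, we can use Rules (8)–(10) to construct a type derivation which contains the judgments:
\[ \emptyset \vdash \lambda^1 f.\lambda^2 x.f\,x : ((\alpha \xrightarrow{\{4\}} \alpha) \xrightarrow{\{3\}} (\alpha \xrightarrow{\{4\}} \alpha)) \xrightarrow{\{1\}} ((\alpha \xrightarrow{\{4\}} \alpha) \xrightarrow{\{2\}} (\alpha \xrightarrow{\{4\}} \alpha)) \]
\[ \emptyset[f : (\alpha \xrightarrow{\{4\}} \alpha) \xrightarrow{\{3\}} (\alpha \xrightarrow{\{4\}} \alpha)] \vdash \lambda^2 x.f\,x : (\alpha \xrightarrow{\{4\}} \alpha) \xrightarrow{\{2\}} (\alpha \xrightarrow{\{4\}} \alpha) \]
\[ \emptyset \vdash \lambda^3 a.a : (\alpha \xrightarrow{\{4\}} \alpha) \xrightarrow{\{3\}} (\alpha \xrightarrow{\{4\}} \alpha) \]
\[ \emptyset \vdash \lambda^4 b.b : \alpha \xrightarrow{\{4\}} \alpha \]
\[ \emptyset \vdash F : \alpha \xrightarrow{\{4\}} \alpha \]
so the flow set for $F$ is $\{4\}$.
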
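Sparse Flow Graphs

Idea: sparse flow graph; no transitive closure [Heintze & McAllester 97].

Nodes: $n ::= e \mid \mathrm{dom}(n) \mid \mathrm{ran}(n)$ ($e$ occurs in the program $E$)

Edges:
\[ \frac{\lambda^l x.e \text{ occurs in } E}{x \rightarrow \mathrm{dom}(\lambda^l x.e)} \quad (11) \]
\[ \frac{\lambda^l x.e \text{ occurs in } E}{\mathrm{ran}(\lambda^l x.e) \rightarrow e} \quad (12) \]
\[ \frac{e_1\,e_2 \text{ occurs in } E}{e_1\,e_2 \rightarrow \mathrm{ran}(e_1)} \quad (13) \]
\[ \frac{e_1\,e_2 \text{ occurs in } E}{\mathrm{dom}(e_1) \rightarrow e_2} \quad (14) \]
\[ \frac{n_1 \rightarrow n_2 \qquad n \rightarrow \mathrm{ran}(n_1)}{\mathrm{ran}(n_1) \rightarrow \mathrm{ran}(n_2)} \quad (15) \]
\[ \frac{n_1 \rightarrow n_2 \qquad n \rightarrow \mathrm{dom}(n_2)}{\mathrm{dom}(n_2) \rightarrow \mathrm{dom}(n_1)} \quad (16) \]
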
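Running Example

For $F$, we can use Rules (11)–(16) to generate the edges:
\[ f \rightarrow \mathrm{dom}(\lambda^1 f.\lambda^2 x.f\,x) \rightarrow \lambda^3 a.a \]
\[ (\lambda^1 f.\lambda^2 x.f\,x)\,(\lambda^3 a.a) \rightarrow \mathrm{ran}(\lambda^1 f.\lambda^2 x.f\,x) \rightarrow \lambda^2 x.f\,x \]
\[ F \rightarrow \mathrm{ran}((\lambda^1 f.\lambda^2 x.f\,x)\,(\lambda^3 a.a)) \rightarrow \mathrm{ran}(\mathrm{ran}(\lambda^1 f.\lambda^2 x.f\,x)) \]
\[ \rightarrow \mathrm{ran}(\lambda^2 x.f\,x) \rightarrow f\,x \rightarrow \mathrm{ran}(f) \]
\[ \rightarrow \mathrm{ran}(\mathrm{dom}(\lambda^1 f.\lambda^2 x.f\,x)) \rightarrow \mathrm{ran}(\lambda^3 a.a) \rightarrow a \rightarrow \mathrm{dom}(\lambda^3 a.a) \]
\[ \rightarrow \mathrm{dom}(\mathrm{dom}(\lambda^1 f.\lambda^2 x.f\,x)) \rightarrow \mathrm{dom}(f) \]
\[ \rightarrow x \rightarrow \mathrm{dom}(\lambda^2 x.f\,x) \rightarrow \mathrm{dom}(\mathrm{ran}(\lambda^1 f.\lambda^2 x.f\,x)) \]
\[ \rightarrow \mathrm{dom}((\lambda^1 f.\lambda^2 x.f\,x)\,(\lambda^3 a.a)) \rightarrow \lambda^4 b.b \]
so the flow set for $F$ is $\{4\}$.

Simply typed $\Rightarrow$ finite and sparse graph, and same result as 0-CFA.
Simple, small types $\Rightarrow$ $O(n^2)$ time.
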
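The sparse graph for the running example can be rebuilt with a few lines of code. This is a hedged sketch, assuming a tuple encoding of terms and reading the side premises of Rules (15) and (16) as "the node $\mathrm{ran}(n_1)$ (respectively $\mathrm{dom}(n_2)$) is already the target of some edge"; the function names `sparse_flow_graph` and `flow_set` are hypothetical. The flow set is obtained by reachability, since the graph is deliberately not transitively closed. The fixed-point loop is only expected to terminate for simply typed programs, matching the finiteness claim on the slide.

```python
# Hypothetical term encoding (not from the slides): nested tuples.
def var(x): return ("var", x)
def lam(label, x, body): return ("lam", label, x, body)
def app(e1, e2): return ("app", e1, e2)

def subterms(e):
    """Yield every subterm of e, including e itself."""
    yield e
    if e[0] == "lam":
        yield from subterms(e[3])
    elif e[0] == "app":
        yield from subterms(e[1])
        yield from subterms(e[2])

def sparse_flow_graph(program):
    """Edges generated by Rules (11)-(16); dom/ran nodes are tagged tuples."""
    edges = set()
    for t in set(subterms(program)):
        if t[0] == "lam":
            _, _, x, body = t
            edges.add((var(x), ("dom", t)))            # Rule (11)
            edges.add((("ran", t), body))              # Rule (12)
        elif t[0] == "app":
            _, e1, e2 = t
            edges.add((t, ("ran", e1)))                # Rule (13)
            edges.add((("dom", e1), e2))               # Rule (14)
    while True:
        targets = {dst for (_, dst) in edges}
        new = set()
        for (n1, n2) in edges:
            if ("ran", n1) in targets:                 # Rule (15)
                new.add((("ran", n1), ("ran", n2)))
            if ("dom", n2) in targets:                 # Rule (16)
                new.add((("dom", n2), ("dom", n1)))
        if new <= edges:
            return edges
        edges |= new

def flow_set(edges, e):
    """Labels of abstractions reachable from e (e itself included)."""
    succ = {}
    for src, dst in edges:
        succ.setdefault(src, set()).add(dst)
    seen, stack = {e}, [e]
    while stack:
        n = stack.pop()
        for m in succ.get(n, ()):
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return {n[1] for n in seen if n[0] == "lam"}

# Running example: F = (λ1 f. λ2 x. f x) (λ3 a. a) (λ4 b. b)
M = lam(1, "f", lam(2, "x", app(var("f"), var("x"))))
F = app(app(M, lam(3, "a", var("a"))), lam(4, "b", var("b")))
EDGES = sparse_flow_graph(F)
print(flow_set(EDGES, F))
```

Each query walks the graph on demand, which is the trade-off the sparse representation makes: fewer stored edges in exchange for a reachability search per flow-set lookup.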
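Types as Discriminators

Flow set for $e$ = which abstractions in $E$ have the same type as $e$?

For the running example $F$, we have:
\[ \emptyset \vdash \lambda^1 f.\lambda^2 x.f\,x : ((\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha)) \rightarrow ((\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha)) \]
\[ \emptyset[f : (\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha)] \vdash \lambda^2 x.f\,x : (\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha) \]
\[ \emptyset \vdash \lambda^3 a.a : (\alpha \rightarrow \alpha) \rightarrow (\alpha \rightarrow \alpha) \]
\[ \emptyset \vdash \lambda^4 b.b : \alpha \rightarrow \alpha \]
\[ \emptyset \vdash F : \alpha \rightarrow \alpha \]

There is exactly one abstraction in $F$ with type $\alpha \rightarrow \alpha$, namely $\lambda^4 b.b$,
so the flow set for $F$ is $\{4\}$.
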
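Once the types are known, this analysis is a table lookup. The sketch below assumes the abstraction types listed in the judgments above and encodes them as tuples; `ABSTRACTION_TYPES` and `flow_set_by_type` are hypothetical names introduced for illustration.

```python
# Hypothetical type encoding (not from the slides).
ALPHA = "α"
def arrow(s, t): return ("->", s, t)

A2A = arrow(ALPHA, ALPHA)

# Type of each labelled abstraction, copied from the judgments for F.
ABSTRACTION_TYPES = {
    1: arrow(arrow(A2A, A2A), arrow(A2A, A2A)),   # λ1 f. λ2 x. f x
    2: arrow(A2A, A2A),                           # λ2 x. f x
    3: arrow(A2A, A2A),                           # λ3 a. a
    4: A2A,                                       # λ4 b. b
}

def flow_set_by_type(t):
    """Labels of all abstractions whose type equals t."""
    return {label for label, s in ABSTRACTION_TYPES.items() if s == t}

print(flow_set_by_type(A2A))              # query for F : α -> α
print(flow_set_by_type(arrow(A2A, A2A)))  # query for an expression of type (α -> α) -> (α -> α)
```

The second query returns both labels 2 and 3, because the two abstractions share a type. This shows how the approach trades precision for simplicity: any expression of that type gets the same flow set, regardless of which abstraction can actually reach it.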
Advantages of type-based analysis

Simplicity
\[ \frac{A[x : s] \vdash e : t}{A \vdash \lambda^l x.e : s \xrightarrow{\varphi} t} \quad (l \in \varphi) \]

Efficiency
Types can make almost anything go faster!

Correctness
Type soundness: well-typed programs cannot go wrong [Milner 78].
Similar story for type and effect systems.

Competitiveness

Main approaches to static analysis:
- data flow analysis
- constraint-based analysis
- abstract interpretation

Important similarities!

    Correctness                Algorithm
    type rules                 constraints
    type and effect system     constraints

Define abstract domains in terms of types.
The types are a lingua franca when comparing analyses.

Tools that use type-based analysis

Work on programs written in C++, Java, Modula 3, and Standard ML.

Method Inlining

Object-oriented virtual call site e.m(...): inline?

Approach: types as discriminators, taking advantage of subtyping.

StaticType(e) = the static type of the expression e,
SubTypes(t) = the set of declared subtypes of type t,
StaticLookup(C, m) = definition (if any) of a method with name m that one finds when starting a static method lookup in the class C.

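The three helpers can be combined into an inlining test. The slide stops before stating the decision rule, so the rule used below is an assumption: a call site is treated as inlinable when static lookup from every subtype of the receiver's static type finds the same single method definition. The class hierarchy, the table names `SUPERCLASS` and `DECLARES`, and the choice to count a type as a subtype of itself are all hypothetical, chosen only to make the sketch runnable.

```python
# Hypothetical class hierarchy (not from the slides).
SUPERCLASS = {"Shape": None, "Circle": "Shape", "Square": "Shape", "Unit": "Square"}
DECLARES = {
    "Shape": {"area", "name"},
    "Circle": {"area"},
    "Square": {"area"},
    "Unit": set(),
}

def sub_types(t):
    """SubTypes(t): t together with all declared (transitive) subclasses.

    Including t itself is an assumption of this sketch.
    """
    result = {t}
    changed = True
    while changed:
        changed = False
        for c, parent in SUPERCLASS.items():
            if parent in result and c not in result:
                result.add(c)
                changed = True
    return result

def static_lookup(c, m):
    """StaticLookup(C, m): the definition found by walking up from class c."""
    while c is not None:
        if m in DECLARES[c]:
            return (c, m)
        c = SUPERCLASS[c]
    return None

def inline_target(static_type, m):
    """Return the unique method a call could dispatch to, or None.

    Assumed rule: inline only if every subtype resolves to one definition.
    """
    targets = {static_lookup(c, m) for c in sub_types(static_type)}
    targets.discard(None)
    return next(iter(targets)) if len(targets) == 1 else None

print(inline_target("Shape", "name"))    # one definition across all subtypes
print(inline_target("Shape", "area"))    # several overriding definitions
print(inline_target("Square", "area"))   # Unit inherits Square's definition
```

In this toy hierarchy, a call to `name` on a `Shape` receiver resolves to a single definition, while a call to `area` on the same receiver does not, because `Circle` and `Square` override it.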