CSE 505: Programming Languages
Lecture 13
Safely Extending STLC: Sums, Products, Bools
Zach Tatlock
Fall 2013

Review

$$\begin{array}{rcl}
e & ::= & \lambda x.\, e \mid x \mid e\ e \mid c \\
v & ::= & \lambda x.\, e \mid c \\
\tau & ::= & \mathsf{int} \mid \tau \to \tau \\
\Gamma & ::= & \cdot \mid \Gamma, x : \tau
\end{array}$$

$$\frac{e_1 \to e_1'}{e_1\ e_2 \to e_1'\ e_2}
\qquad
\frac{e_2 \to e_2'}{v\ e_2 \to v\ e_2'}
\qquad
(\lambda x.\, e)\ v \to e[v/x]$$

$e[e'/x]$: capture-avoiding substitution of $e'$ for free $x$ in $e$

$$\Gamma \vdash c : \mathsf{int}
\qquad
\Gamma \vdash x : \Gamma(x)
\qquad
\frac{\Gamma, x : \tau_1 \vdash e : \tau_2}{\Gamma \vdash \lambda x.\, e : \tau_1 \to \tau_2}
\qquad
\frac{\Gamma \vdash e_1 : \tau_2 \to \tau_1 \quad \Gamma \vdash e_2 : \tau_2}{\Gamma \vdash e_1\ e_2 : \tau_1}$$

Preservation: If $\cdot \vdash e : \tau$ and $e \to e'$, then $\cdot \vdash e' : \tau$.
Progress: If $\cdot \vdash e : \tau$, then $e$ is a value or $\exists e'$ such that $e \to e'$.

Adding Stuff

Time to use STLC as a foundation for understanding other common language constructs

We will add things via a principled methodology thanks to a proper education
◮ Extend the syntax
◮ Extend the operational semantics
  ◮ Derived forms (syntactic sugar), or
  ◮ Direct semantics
◮ Extend the type system
◮ Extend soundness proof (new stuck states, proof cases)

In fact, extensions that add new types have even more structure

Pairs (CBV, left-right)

$$\begin{array}{rcl}
e & ::= & \ldots \mid (e, e) \mid e.1 \mid e.2 \\
v & ::= & \ldots \mid (v, v) \\
\tau & ::= & \ldots \mid \tau * \tau
\end{array}$$

$$\frac{e_1 \to e_1'}{(e_1, e_2) \to (e_1', e_2)}
\qquad
\frac{e_2 \to e_2'}{(v_1, e_2) \to (v_1, e_2')}$$

$$\frac{e \to e'}{e.1 \to e'.1}
\qquad
\frac{e \to e'}{e.2 \to e'.2}$$

$$(v_1, v_2).1 \to v_1
\qquad
(v_1, v_2).2 \to v_2$$

Small-step can be a pain
◮ Large-step needs only 3 rules
◮ Will learn more concise notation later (evaluation contexts)
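The six small-step rules for pairs above can be transcribed almost line for line into a step function. The following is only a sketch under stated assumptions: the datatype `exp` and the names `is_value`, `step`, and `eval` are hypothetical, and the language is cut down to integer constants and pairs (no functions or variables), so no substitution is needed.

```ocaml
(* Hypothetical mini-language: integer constants plus pairs only. *)
type exp =
  | Const of int
  | Pair of exp * exp
  | Fst of exp            (* e.1 *)
  | Snd of exp            (* e.2 *)

let rec is_value = function
  | Const _ -> true
  | Pair (e1, e2) -> is_value e1 && is_value e2
  | Fst _ | Snd _ -> false

(* One small step, CBV and left-to-right. None means no rule applies
   (the term is a value or is stuck). *)
let rec step e =
  match e with
  | Const _ -> None
  (* Left component steps first. *)
  | Pair (e1, e2) when not (is_value e1) ->
      Option.map (fun e1' -> Pair (e1', e2)) (step e1)
  (* Left component is a value, so the right one may step. *)
  | Pair (v1, e2) ->
      Option.map (fun e2' -> Pair (v1, e2')) (step e2)
  (* Primitive reductions on a pair value. *)
  | Fst (Pair (v1, v2)) when is_value v1 && is_value v2 -> Some v1
  | Snd (Pair (v1, v2)) when is_value v1 && is_value v2 -> Some v2
  (* Congruence rules for projections. *)
  | Fst e1 -> Option.map (fun e1' -> Fst e1') (step e1)
  | Snd e1 -> Option.map (fun e1' -> Snd e1') (step e1)

(* Iterate step until no rule applies. *)
let rec eval e =
  match step e with
  | None -> e
  | Some e' -> eval e'
```

In this sketch `step (Fst (Const 5))` returns `None` even though `Fst (Const 5)` is not a value: that is exactly the kind of new stuck state the type system has to rule out.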
Pairs continued

$$\frac{\Gamma \vdash e_1 : \tau_1 \quad \Gamma \vdash e_2 : \tau_2}{\Gamma \vdash (e_1, e_2) : \tau_1 * \tau_2}
\qquad
\frac{\Gamma \vdash e : \tau_1 * \tau_2}{\Gamma \vdash e.1 : \tau_1}
\qquad
\frac{\Gamma \vdash e : \tau_1 * \tau_2}{\Gamma \vdash e.2 : \tau_2}$$

Canonical Forms: If $\cdot \vdash v : \tau_1 * \tau_2$, then $v$ has the form $(v_1, v_2)$
Progress: New cases using Canonical Forms are $v.1$ and $v.2$
Preservation: For primitive reductions, inversion gives the result directly

Records

Records are like $n$-ary tuples except with named fields
◮ Field names are not variables; they do not $\alpha$-convert

$$\begin{array}{rcl}
e & ::= & \ldots \mid \{l_1 = e_1; \ldots; l_n = e_n\} \mid e.l \\
v & ::= & \ldots \mid \{l_1 = v_1; \ldots; l_n = v_n\} \\
\tau & ::= & \ldots \mid \{l_1 : \tau_1; \ldots; l_n : \tau_n\}
\end{array}$$

$$\frac{e_i \to e_i'}{\{l_1 = v_1, \ldots, l_{i-1} = v_{i-1}, l_i = e_i, \ldots, l_n = e_n\} \to \{l_1 = v_1, \ldots, l_{i-1} = v_{i-1}, l_i = e_i', \ldots, l_n = e_n\}}$$

$$\frac{e \to e'}{e.l \to e'.l}
\qquad
\frac{1 \le i \le n}{\{l_1 = v_1, \ldots, l_n = v_n\}.l_i \to v_i}$$

$$\frac{\Gamma \vdash e_1 : \tau_1 \quad \ldots \quad \Gamma \vdash e_n : \tau_n \quad \text{labels distinct}}{\Gamma \vdash \{l_1 = e_1, \ldots, l_n = e_n\} : \{l_1 : \tau_1, \ldots, l_n : \tau_n\}}$$

$$\frac{\Gamma \vdash e : \{l_1 : \tau_1, \ldots, l_n : \tau_n\} \quad 1 \le i \le n}{\Gamma \vdash e.l_i : \tau_i}$$

Records continued

Should we be allowed to reorder fields?
◮ $\cdot \vdash \{l_1 = 42; l_2 = \mathsf{true}\} : \{l_2 : \mathsf{bool}; l_1 : \mathsf{int}\}$ ??
◮ Really a question about, “when are two types equal?”

Nothing wrong with this from a type-safety perspective, yet many languages disallow it
◮ Reasons: Implementation efficiency, type inference

Return to this topic when we study subtyping

Sums

What about ML-style datatypes:

    type t = A | B of int | C of int * t

1. Tagged variants (i.e., discriminated unions)
2. Recursive types
3. Type constructors (e.g., type ’a mylist = ... )
4. Named types

For now, just model (1) with (anonymous) sum types
◮ (2) is in a later lecture, (3) is straightforward, and (4) we’ll discuss informally
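Returning to the record typing rules from the Records slides: the two rules (construction with distinct labels, and projection) can be sketched as a small checker. This is an illustration under assumptions, not the course's implementation: the names `ty`, `exp`, `distinct`, and `typeof` are hypothetical, `bool` is included only so the reordering example can be written down, and terms are closed so no context $\Gamma$ is threaded through.

```ocaml
(* Hypothetical types and expressions: ints, bools, records. *)
type ty =
  | TInt
  | TBool
  | TRecord of (string * ty) list    (* fields kept in written order *)

type exp =
  | Int of int
  | Bool of bool
  | Record of (string * exp) list
  | Proj of exp * string             (* e.l *)

(* True when no label occurs twice ("labels distinct"). *)
let rec distinct = function
  | [] -> true
  | l :: rest -> not (List.mem l rest) && distinct rest

(* typeof returns None when no typing rule applies. *)
let rec typeof e =
  match e with
  | Int _ -> Some TInt
  | Bool _ -> Some TBool
  | Record fields ->
      let labels = List.map fst fields in
      let tys = List.map (fun (_, ei) -> typeof ei) fields in
      (* Every field must type-check and labels must be distinct. *)
      if distinct labels && List.for_all Option.is_some tys
      then Some (TRecord (List.map2 (fun l t -> (l, Option.get t)) labels tys))
      else None
  | Proj (e1, l) ->
      (* The label must be one of the record type's fields. *)
      (match typeof e1 with
       | Some (TRecord fs) -> List.assoc_opt l fs
       | _ -> None)
```

Because this sketch compares field lists with structural equality, `{l1 : int; l2 : bool}` and `{l2 : bool; l1 : int}` come out as different types. That is one possible answer to "when are two types equal?"; a checker that sorted fields by label would give the other.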
Sums syntax and overview

$$\begin{array}{rcl}
e & ::= & \ldots \mid A(e) \mid B(e) \mid \mathsf{match}\ e\ \mathsf{with}\ A\,x.\ e \mid B\,x.\ e \\
v & ::= & \ldots \mid A(v) \mid B(v) \\
\tau & ::= & \ldots \mid \tau_1 + \tau_2
\end{array}$$

◮ Only two constructors: A and B
◮ All values of any sum type built from these constructors
◮ So $A(e)$ can have any sum type allowed by $e$’s type
◮ No need to declare sum types in advance
◮ Like functions, will “guess the type” in our rules

Sums operational semantics

$$\mathsf{match}\ A(v)\ \mathsf{with}\ A\,x.\ e_1 \mid B\,y.\ e_2 \;\to\; e_1[v/x]$$

$$\mathsf{match}\ B(v)\ \mathsf{with}\ A\,x.\ e_1 \mid B\,y.\ e_2 \;\to\; e_2[v/y]$$

$$\frac{e \to e'}{A(e) \to A(e')}
\qquad
\frac{e \to e'}{B(e) \to B(e')}$$

$$\frac{e \to e'}{\mathsf{match}\ e\ \mathsf{with}\ A\,x.\ e_1 \mid B\,y.\ e_2 \;\to\; \mathsf{match}\ e'\ \mathsf{with}\ A\,x.\ e_1 \mid B\,y.\ e_2}$$

match has binding occurrences, just like pattern-matching
(Definition of substitution must avoid capture, just like functions)

What is going on

Feel free to think about tagged values in your head:
◮ A tagged value is a pair of:
  ◮ A tag A or B (or 0 or 1 if you prefer)
  ◮ The (underlying) value
◮ A match:
  ◮ Checks the tag
  ◮ Binds the variable to the (underlying) value

This much is just like OCaml and related to homework 2

Sums Typing Rules

Inference version (not trivial to infer; can require annotations)

$$\frac{\Gamma \vdash e : \tau_1}{\Gamma \vdash A(e) : \tau_1 + \tau_2}
\qquad
\frac{\Gamma \vdash e : \tau_2}{\Gamma \vdash B(e) : \tau_1 + \tau_2}$$

$$\frac{\Gamma \vdash e : \tau_1 + \tau_2 \quad \Gamma, x : \tau_1 \vdash e_1 : \tau \quad \Gamma, y : \tau_2 \vdash e_2 : \tau}{\Gamma \vdash \mathsf{match}\ e\ \mathsf{with}\ A\,x.\ e_1 \mid B\,y.\ e_2 : \tau}$$

Key ideas:
◮ For constructor-uses, “other side can be anything”
◮ For match, both sides need same type
  ◮ Don’t know which branch will be taken, just like an if.
  ◮ In fact, can drop explicit booleans and encode with sums: E.g., $\mathsf{bool} = \mathsf{int} + \mathsf{int}$, $\mathsf{true} = A(0)$, $\mathsf{false} = B(0)$
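The boolean encoding mentioned above can be tried out directly in OCaml. This is a sketch with assumed names: OCaml has no anonymous sum types, so a declared two-constructor variant `sum` stands in for $\tau_1 + \tau_2$, and `boolean`, `tru`, `fls`, `ite`, and `not_` are hypothetical names chosen to avoid clashing with the built-in `bool`.

```ocaml
(* A declared variant standing in for the anonymous sum tau1 + tau2. *)
type ('a, 'b) sum = A of 'a | B of 'b

(* bool = int + int, true = A(0), false = B(0). *)
type boolean = (int, int) sum
let tru : boolean = A 0
let fls : boolean = B 0

(* if-then-else as a match. The branches are thunks so that only the
   chosen one is evaluated; both must have the same result type 'r,
   mirroring the typing rule for match. *)
let ite (c : boolean) (t : unit -> 'r) (f : unit -> 'r) : 'r =
  match c with
  | A _ -> t ()
  | B _ -> f ()

(* Negation written with the encoded conditional. *)
let not_ (c : boolean) : boolean =
  ite c (fun () -> fls) (fun () -> tru)
```

The payload `0` is never inspected; only the tag carries information, which is why any inhabited type would do in place of `int`.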
Sums Type Safety

Canonical Forms: If $\cdot \vdash v : \tau_1 + \tau_2$, then there exists a $v_1$ such that either $v$ is $A(v_1)$ and $\cdot \vdash v_1 : \tau_1$ or $v$ is $B(v_1)$ and $\cdot \vdash v_1 : \tau_2$
◮ Progress for $\mathsf{match}\ v\ \mathsf{with}\ A\,x.\ e_1 \mid B\,y.\ e_2$ follows, as usual, from Canonical Forms
◮ Preservation for $\mathsf{match}\ v\ \mathsf{with}\ A\,x.\ e_1 \mid B\,y.\ e_2$ follows from the type of the underlying value and the Substitution Lemma
◮ The Substitution Lemma has new “hard” cases because we have new binding occurrences
◮ But that’s all there is to it (plus lots of induction)

What are sums for?

◮ Pairs, structs, records, aggregates are fundamental data-builders
◮ Sums are just as fundamental: “this or that not both”
◮ You have seen how OCaml does sums (datatypes)
◮ Worth showing how C and Java do the same thing
  ◮ A primitive in one language is an idiom in another

Sums in C

    type t = A of t1 | B of t2 | C of t3
    match e with A x -> ...

One way in C:

    struct t {
      enum {A, B, C} tag;
      union {t1 a; t2 b; t3 c;} data;
    };
    ... switch(e->tag){ case A: t1 x=e->data.a; ...

◮ No static checking that tag is obeyed
◮ As fat as the fattest variant (avoidable with casts)
◮ Mutation costs us again!

Sums in Java

    type t = A of t1 | B of t2 | C of t3
    match e with A x -> ...

One way in Java (t4 is the match-expression’s type):

    abstract class t {abstract t4 m();}
    class A extends t { t1 x; t4 m(){...}}
    class B extends t { t2 x; t4 m(){...}}
    class C extends t { t3 x; t4 m(){...}}
    ... e.m() ...

◮ A new method in t and subclasses for each match expression
◮ Supports extensibility via new variants (subclasses) instead of extensibility via new operations (match expressions)
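For comparison with the C and Java idioms above, here is the OCaml side made concrete. The slides leave `t1`, `t2`, `t3`, and `t4` abstract; choosing `int`, `string`, `bool`, and `string` for them is purely an assumption for illustration, as are the function names `describe` and `size`.

```ocaml
(* t1 = int, t2 = string, t3 = bool are illustrative assumptions. *)
type t = A of int | B of string | C of bool

(* One match expression; its result type (t4 on the Java slide)
   is string in this sketch. *)
let describe (e : t) : string =
  match e with
  | A x -> "A " ^ string_of_int x
  | B x -> "B " ^ x
  | C x -> "C " ^ string_of_bool x

(* A second operation is just another function with its own match;
   the definition of t is untouched. *)
let size (e : t) : int =
  match e with
  | A _ -> 1
  | B s -> String.length s
  | C _ -> 1
```

Adding an operation costs one new function here, whereas the Java encoding needs a new method in every class. Adding a fourth variant goes the other way: the OCaml compiler would warn that both matches are no longer exhaustive, while Java needs only one new subclass.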
Pairs vs. Sums

You need both in your language
◮ With only pairs, you clumsily use dummy values, waste space, and rely on unchecked tagging conventions
◮ Example: replace $\mathsf{int} + (\mathsf{int} \to \mathsf{int})$ with $\mathsf{int} * (\mathsf{int} * (\mathsf{int} \to \mathsf{int}))$

Pairs and sums are “logical duals” (more on that later)
◮ To make a $\tau_1 * \tau_2$ you need a $\tau_1$ and a $\tau_2$
◮ To make a $\tau_1 + \tau_2$ you need a $\tau_1$ or a $\tau_2$
◮ Given a $\tau_1 * \tau_2$, you can get a $\tau_1$ or a $\tau_2$ (or both; your “choice”)
◮ Given a $\tau_1 + \tau_2$, you must be prepared for either a $\tau_1$ or $\tau_2$ (the value’s “choice”)
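The pairs-only replacement for $\mathsf{int} + (\mathsf{int} \to \mathsf{int})$ can be written out to see where the clumsiness comes from. This is a sketch: the type names, the tag convention (0 for the integer side, 1 for the function side), the dummy values, and the argument `10` are all assumptions made for the example.

```ocaml
(* With sums: int + (int -> int). *)
type with_sum = A of int | B of (int -> int)

(* With only pairs: int * (int * (int -> int)).
   Unchecked convention: tag 0 means "the int is real",
   tag 1 means "the function is real". *)
type with_pairs = int * (int * (int -> int))

let dummy_fun : int -> int = fun x -> x   (* wasted slot when tag = 0 *)
let dummy_int : int = 0                   (* wasted slot when tag = 1 *)

let encode (s : with_sum) : with_pairs =
  match s with
  | A n -> (0, (n, dummy_fun))
  | B f -> (1, (dummy_int, f))

(* Same operation on both representations: return the int,
   or apply the function to 10. *)
let use_sum (s : with_sum) : int =
  match s with
  | A n -> n
  | B f -> f 10

let use_pairs ((tag, (n, f)) : with_pairs) : int =
  if tag = 0 then n else f 10
```

Nothing in the type `with_pairs` stops a value such as `(2, (0, dummy_fun))`, or code that reads the wrong slot for a given tag, whereas the `match` on `with_sum` only ever exposes the component that is actually present.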