SLIDE 1

The Correctness of a Code Generator for a Functional Language

Nathanaël Courant, September 8, 2017

Nathanaël Courant, Verified code generation, September 8, 2017, 1 / 21

SLIDE 2

Context

PVS: interactive proof assistant
Specification language almost completely executable
Can be used to write and formally verify programs
Slow → generate efficient executable code (PVS2C)
Objective: formally verify PVS2C
  Idealized version
Related work: CompCert, CakeML

SLIDE 3

A Small Functional Language
Evaluation with Reference Counting
A Small Imperative Language
Formalization in PVS

SLIDE 4

Expressions and contexts

let a = t[i] in let b = t[j] in let t′ = t[i → b] in t′[j → a]

e ::= n | x | nil | f(x1, …, xn) | let (x : int*ⁿ) = e1 in e2 | ifnz x then e1 else e2 | x[y] | x[y → z] | newint(n) | newref(n) | pop(e1) | ref(k)
K ::= □ | pop(K1) | let (x : int*ⁿ) = K1 in e1
v ::= n | nil | ref(k)
r ::= x | f(x1, …, xn) | x[y] | x[y → z] | newint(n) | newref(n) | ifnz x then e1 else e2 | let x = v in e | pop(v)

SLIDE 5

Decomposition theorem

Fill context with expression: replace the hole □ by the expression, denoted Ke
K = let x = □ in f(x), e = 42 → Ke = let x = 42 in f(x)

Theorem (Decomposition theorem)

If e is not a value, there exists a unique decomposition e = Kr with K a context, r a redex.
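The decomposition can be illustrated with a small Python sketch. It is not the PVS formalization: the tuple encoding of expressions and the names `HOLE`, `is_value`, `decompose` and `fill` are assumptions made for this example, and only the context forms pop(K1) and let x = K1 in e1 are searched.

```python
# Sketch: decomposition e = K[r] on a tuple encoding of expressions.
# The encoding and all names are assumptions made for illustration.
HOLE = ("hole",)

def is_value(e):
    # v ::= n | nil | ref(k)
    return e[0] in ("int", "nil", "ref")

def decompose(e):
    """Return (K, r) such that fill(K, r) == e; e must not be a value."""
    if is_value(e):
        raise ValueError("values have no decomposition")
    if e[0] == "let" and not is_value(e[2]):
        # let x = K1 in e2: keep searching inside the bound expression
        k1, r = decompose(e[2])
        return ("let", e[1], k1, e[3]), r
    if e[0] == "pop" and not is_value(e[1]):
        # pop(K1): keep searching inside the argument
        k1, r = decompose(e[1])
        return ("pop", k1), r
    # Every other non-value is itself a redex, so the context is the hole.
    return HOLE, e

def fill(k, e):
    """Replace the hole of context k by expression e."""
    if k == HOLE:
        return e
    if k[0] == "let":
        return ("let", k[1], fill(k[2], e), k[3])
    if k[0] == "pop":
        return ("pop", fill(k[1], e))
    raise ValueError("not a context")
```

Because each branch of `decompose` is chosen by the shape of e alone, the search can only follow one path, which is the intuition behind uniqueness.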

SLIDE 6

Evaluation state

State: (e, S, M)
e: an expression
S: the stack, maps variables → values
M: the store, maps references → arrays of values

SLIDE 7

Reduction of FL

Defined on redexes
Context-preserving: if (e, S, M) → (e′, S′, M′), then (Ke, S, M) → (Ke′, S′, M′)
Deterministic
Examples:
  (x, S, M) → (S(x), S, M)
  (x[y → z], S, M) → (new(M), S, M[new(M) → M(S(x))[S(y) → S(z)]])
  (let x = v in e, S, M) → (pop(e), push(x, v, S), M)
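The three example rules can be read as a single step function. The Python sketch below is illustrative only: the tuple encoding, the helpers `lookup` and `new_ref`, and the choice of a fresh reference number for new(M) are assumptions, not part of the formalization.

```python
# Sketch of the three example rules; encoding and helper names are assumptions.
# Stack S: list of (variable, value) pairs, most recent binding first.
# Store M: dict from reference number to a tuple of values.

def lookup(S, x):
    for name, value in S:
        if name == x:
            return value
    raise KeyError(x)

def new_ref(M):
    # Hypothetical choice for new(M): a reference number not yet used in M.
    return max(M, default=-1) + 1

def reduce_redex(r, S, M):
    kind = r[0]
    if kind == "var":        # (x, S, M) -> (S(x), S, M)
        return lookup(S, r[1]), S, M
    if kind == "update":     # x[y -> z]: copy the array, never mutate it
        _, x, y, z = r
        ref, index, value = lookup(S, x), lookup(S, y), lookup(S, z)
        old = M[ref[1]]
        updated = old[:index[1]] + (value,) + old[index[1] + 1:]
        k = new_ref(M)
        return ("ref", k), S, {**M, k: updated}
    if kind == "let":        # let x = v in e -> pop(e) with x pushed on S
        _, x, v, e = r
        return ("pop", e), [(x, v)] + S, M
    raise NotImplementedError(kind)
```

In this sketch the update rule always allocates a new array, which matches the purely functional reading of x[y → z]; input stacks and stores are never modified in place.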

SLIDE 9

Reference counts

Count number of times each reference appears
#(S, x): number of times x appears in S
Invariant: C(ref(k)) = 1_{ref(k) ∈ e} + #(S, ref(k)) + Σ_{ref(s) ∈ M} #(M(ref(s)), ref(k))
Keep track of reference counts
Free memory when count becomes 0
Destructive updates when possible
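The right-hand side of the invariant can be recomputed directly from a state. The following Python sketch assumes a tuple encoding of expressions and values, a list of (variable, value) pairs for the stack, and a dict of tuples for the store; the names `occurs_in` and `expected_count` are hypothetical.

```python
# Sketch: recompute the right-hand side of the invariant for one reference.
# Encodings are assumptions: references are ("ref", k) tuples, the stack is
# a list of (variable, value) pairs, the store maps numbers to value tuples.

def occurs_in(e, ref):
    """True if ref appears anywhere inside the expression e."""
    if e == ref:
        return True
    return isinstance(e, tuple) and any(occurs_in(sub, ref) for sub in e)

def expected_count(ref, e, S, M):
    in_expr = 1 if occurs_in(e, ref) else 0               # indicator term
    in_stack = sum(1 for _, v in S if v == ref)           # #(S, ref)
    in_store = sum(arr.count(ref) for arr in M.values())  # sum over arrays in M
    return in_expr + in_stack + in_store
```

A checker for the invariant would compare `expected_count` with the stored count C for every reference in the store.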

SLIDE 10

Marked variables

Variables can be marked + new constructor release
Mark last occurrence of variables in each execution path
Done statically: mark(X, e) marks each variable in e that is not in X

mark(X, x) =
  x, left unmarked, if x ∈ X
  x, marked, otherwise

mark(X, let x = e1 in e2) =
  let x = mark(X ∪ vars(e2), e1) in mark(X, e2) if x ∈ vars(e2)
  let x = mark(X ∪ vars(e2), e1) in release(x, mark(X, e2)) otherwise

Example: let x = f(y) in ifnz z then g(x, y) else release(y, f(x))
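The two cases of mark given above can be sketched in Python. This covers only variables and let; the other constructs are deliberately left out because their rules are not shown here. The tuple encoding, the representation of a marked variable as `("var", x, True)`, and reading vars(e) as the free variables of e are all assumptions.

```python
# Sketch of the two mark cases shown above; other constructs are omitted.
# Assumptions: input variables are ("var", x), output variables are
# ("var", x, marked), and vars(e) is read as the free variables of e.

def free_vars(e):
    kind = e[0]
    if kind == "var":
        return {e[1]}
    if kind == "let":
        _, x, e1, e2 = e
        return free_vars(e1) | (free_vars(e2) - {x})
    raise NotImplementedError(kind)

def mark(X, e):
    kind = e[0]
    if kind == "var":
        x = e[1]
        return ("var", x, x not in X)   # marked exactly when x is not in X
    if kind == "let":
        _, x, e1, e2 = e
        marked_e1 = mark(X | free_vars(e2), e1)
        marked_e2 = mark(X, e2)
        if x in free_vars(e2):
            return ("let", x, marked_e1, marked_e2)
        # x is never used in e2, so it is released right away
        return ("let", x, marked_e1, ("release", x, marked_e2))
    raise NotImplementedError(kind)
```

For example, in let x = y in y the first y stays unmarked because y is still needed by the body, the second y is marked as the last occurrence, and a release of x is inserted since the body never uses x.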

SLIDE 11

Correctness

Invariants preserved, with states (e, S, M) and (e′, S′, M′, C′):
Reference count is accurate
A variable no longer live in e′ is not bound to a reference in S′
The expression e′ is correctly marked
All subterms release(x, e2) of e′ have x marked
There is a function f that maps the references in M′ with count > 0 to references in M so that:
  e is obtained by removing release and unmarking variables in f(e′)
  For each variable x live in e′, S(x) = f(S′(x))
  For each reference s in M′ with C′(s) > 0, M(f(s)) = f(M′(s))

SLIDE 13

Syntax

{int z; {z := x[y]; result := x[z]}}

e ::= n | x | nil | f(x1, …, xn) | x[y] | x[y → z] | newint(n) | newref(n)
s ::= x := e | ifnz x then s1 else s2 | skip | {s1; s2} | release x | {int*ⁿ x; s}
decl ::= int*ⁿ x
function ::= (name, decl∗, s)
program ::= function∗

SLIDE 14

Evaluation state

Fixed program
State: (R, S, M, C)
S is the stack
M is the store
C is the reference count
R is the call stack: stack of (function, program counter, local depth)

SLIDE 15

Reduction of IL

Extract statement at current program counter
Reduce statement
For assignments, except calls: like RL but store result

SLIDE 16

IL code generation

Return variable: translate(e, x) sets x to the result of e
translate(y[z → w], x) = x := y[z → w]
translate(ifnz y then e1 else e2, x) = ifnz y then translate(e1, x) else translate(e2, x)
translate(let (y : int*ⁿ) = e1 in e2, x) = {int*ⁿ y; {translate(e1, y); {skip; translate(e2, x)}}}
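The three translation cases can be turned into a short Python sketch that produces IL statements as text. The tuple encoding, the helper `show`, and the fallback that turns any other simple expression into a single assignment are assumptions made for illustration.

```python
# Sketch of translate(e, x) for the three cases shown above. The encoding
# and the fallback for other simple expressions are assumptions.

def show(e):
    """Print a simple expression (no let, no ifnz) in the slide's syntax."""
    kind = e[0]
    if kind == "var":
        return e[1]
    if kind == "index":
        return f"{e[1]}[{e[2]}]"
    if kind == "update":
        return f"{e[1]}[{e[2]} → {e[3]}]"
    raise NotImplementedError(kind)

def translate(e, x):
    """Return an IL statement, as text, that sets x to the result of e."""
    kind = e[0]
    if kind == "ifnz":
        _, y, e1, e2 = e
        return f"ifnz {y} then {translate(e1, x)} else {translate(e2, x)}"
    if kind == "let":
        _, y, ty, e1, e2 = e
        # {ty y; {translate(e1, y); {skip; translate(e2, x)}}}
        return ("{" + f"{ty} {y}; " + "{" + translate(e1, y)
                + "; {skip; " + translate(e2, x) + "}}}")
    # Assumed fallback: a simple expression becomes one assignment to x.
    return f"{x} := {show(e)}"
```

Passing the return variable x down through both branches of ifnz and into the body of let is what lets every path end with an assignment to the same target.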

SLIDE 17

Correctness

Reconstruct redex and context from call stack
Reconstruct mapping for the stack variables
Store and counts are the same

SLIDE 18

Program swap(t, i, j)

⁰{int a; ¹a := t[i]; ²skip; ³{int b; ⁴b := t[j]; ⁵skip; ⁶{int* t′; ⁷t′ := t[i → b]; ⁸skip; ⁹result := t′[j → a]¹⁰}¹¹}¹²}¹³
→ (pop⁴(□), let b = 1 in let t′ = t[i → b] in t′[j → a])

main()
⁰{int z; ¹z := +(y, 1); ²skip; ³result := swap(x, y, z)⁴}⁵
→ pop(□)

Stack: (b → 1, a → 0, j → 1, i → 0, t → r, result → undef, z → 1, y → 0, x → r, result → undef, result → undef)
Store: (r → 0, 1)
Count: (r → 2)

SLIDE 19

Expression: pop⁵(let b = 1 in let t′ = t[i → b] in t′[j → a])
Stack: (b → 1, a → 0, j → 1, i → 0, t → r, result → undef, z → 1, y → 0, x → r, result → undef, result → undef)
Store: (r → 0, 1)
Count: (r → 2)

SLIDE 20

Expression: pop⁵(let b = 1 in let t′ = t[i → b] in t′[j → a])
Stack: (a → 0, j → 1, i → 0, t → r, z → 1, y → 0, x → r)
Store: (r → 0, 1)
Count: (r → 2)

SLIDE 22

PVS formalization

Proof of the FL → RL translation, both with and without typing
Actually a 4th language: RL with several steps at once
≈ 7k lines of definitions or theorems, proof files ≈ 310k lines long
Code available at https://github.com/SRI-CSL/PVSCodegen
Done before my internship: most definitions of FL, without proofs

SLIDE 23

Problems caused by PVS

PVS is slow: proofs take a long time to check
PVS has bugs: could prove false
Proofs of TCCs get misplaced after typechecking again
→ but this allowed those bugs to be detected

SLIDE 24

The proved theorems

bisimulation_lemma: THEOREM
  NOT tS`state`error AND NOT trS`state`error AND
  defs_well_typed(D, tS`def_types) AND
  state_matches?(typed_to_topstate(tS), typed_to_topstate(trS))
  IMPLIES
  state_matches?(typed_reduce(D)(tS),
                 typed_reduce_n(D)(top_releases_ct(trS`state`redex) + 1, trS))

SLIDE 25

The proved theorems

bisimulation_lemma_i: THEOREM
  NOT trS1`state`error AND NOT trS2`state`error AND
  defs_well_typed(D, trS1`def_types) AND
  iastate_matches(trS1, trS2)
  IMPLIES
  EXISTS (n: posnat):
    iastate_matches(typed_reduce_n(D)(n, trS1), typed_iareduce(D)(trS2))

SLIDE 26

The proved theorems

bisimulation_lemma: LEMMA
  FORALL (D, (trS | defs_well_typed(D, trS`def_types)), iS):
    NOT iS`error AND state_matches(D, trS, iS)
    IMPLIES
      (state_matches(D, trS, reduce(iS)) AND
       max_inst_steps(reduce(iS)) < max_inst_steps(iS))
      OR state_matches(D, typed_iareduce(D)(trS), reduce(iS))

SLIDE 27

Conclusion

Verified code generation from a functional language to an imperative one
Includes garbage collection and destructive updates
Article to be submitted to CPP 2018
Future work: verify the IL to C translation, add first-class functions and closures, try to completely verify PVS2C
