 
The Correctness of a Code Generator for a Functional Language
Nathanaël Courant
September 8, 2017

Nathanaël Courant, Verified code generation, September 8, 2017
Context
- PVS: interactive proof assistant
- Specification language almost completely executable
- Can be used to write and formally verify programs
- Execution within PVS is slow → generate efficient executable code (PVS2C)
- Objective: formally verify PVS2C
  - Idealized version
- Related work: CompCert, CakeML
- A Small Functional Language
- Evaluation with Reference Counting
- A Small Imperative Language
- Formalization in PVS
Expressions and contexts

Example: let a = t[i] in let b = t[j] in let t′ = t[i ↦ b] in t′[j ↦ a]

Expressions:
  e ::= n | x | nil | f(x₁, …, xₙ)
      | let (x : int*ⁿ) = e₁ in e₂
      | ifnz x then e₁ else e₂
      | x[y] | x[y ↦ z]
      | newint(n) | newref(n)
      | pop(e₁) | ref(k)

Values:
  v ::= n | nil | ref(k)

Contexts:
  K ::= □ | let (x : int*ⁿ) = K₁ in e₁ | pop(K₁)

Redexes:
  r ::= x | f(x₁, …, xₙ)
      | x[y] | x[y ↦ z]
      | newint(n) | newref(n)
      | ifnz x then e₁ else e₂
      | let x = v in e | pop(v)
Decomposition theorem
- Filling a context with an expression: replace the hole □ by the expression, denoted Ke
- Example: K = let x = □ in f(x), e = 42 → Ke = let x = 42 in f(x)

Theorem (Decomposition theorem). If e is not a value, there exists a unique decomposition e = Kr with K a context and r a redex.
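The decomposition can be illustrated with a small sketch. The tagged-tuple encoding of expressions and the names `HOLE`, `is_value`, `decompose` and `fill` are assumptions made for this illustration only; they are not taken from the PVS development.

```python
# Expressions as tagged tuples (an encoding assumed for this sketch):
#   values:  ("int", n), ("nil",), ("ref", k)
#   others:  ("var", x), ("call", f, args), ("let", x, e1, e2), ("pop", e1), ...
HOLE = ("hole",)


def is_value(e):
    """Values are integers, nil and references."""
    return e[0] in ("int", "nil", "ref")


def decompose(e):
    """Return (K, r) such that fill(K, r) == e, or None if e is a value."""
    if is_value(e):
        return None
    # The only context formers are `let x = K1 in e1` and `pop(K1)`:
    # descend while the evaluated position is not yet a value.
    if e[0] == "let" and not is_value(e[2]):
        inner, r = decompose(e[2])
        return ("let", e[1], inner, e[3]), r
    if e[0] == "pop" and not is_value(e[1]):
        inner, r = decompose(e[1])
        return ("pop", inner), r
    # Everything else (including let/pop applied to a value) is a redex.
    return HOLE, e


def fill(K, e):
    """Replace the hole of context K by expression e."""
    if K == HOLE:
        return e
    if K[0] == "let":
        return ("let", K[1], fill(K[2], e), K[3])
    if K[0] == "pop":
        return ("pop", fill(K[1], e))
    raise ValueError("not a context")
```

Because `decompose` is a function that follows the single evaluated position of `let` and `pop`, it returns at most one decomposition, which matches the uniqueness claim of the theorem on this encoding.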
Evaluation state
(e, S, M):
- e an expression,
- S the stack: maps variables → values,
- M the store: maps references → arrays of values.
Reduction of FL
- Defined on redexes
- Context-preserving: if (e, S, M) → (e′, S′, M′), then (Ke, S, M) → (Ke′, S′, M′)
- Deterministic
- Examples:
    (x, S, M) → (S(x), S, M)
    (x[y ↦ z], S, M) → (new(M), S, M[new(M) ↦ M(S(x))[S(y) ↦ S(z)]])
    (let x = v in e, S, M) → (pop(e), push(x, v, S), M)
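The three example rules can be sketched as follows. The encoding is an assumption of this illustration: the stack is a list of (variable, value) pairs with the most recent binding first, the store is a dictionary from integer references to lists of values, and `new` picks the next unused integer. Other redexes are left out.

```python
def lookup(S, x):
    """S(x): value of the most recent binding of x on the stack."""
    for name, value in S:
        if name == x:
            return value
    raise KeyError(x)


def new(M):
    """A reference not yet used in the store (assumed: next integer)."""
    return max(M, default=-1) + 1


def reduce_redex(r, S, M):
    """One reduction step on a redex; returns (e', S', M')."""
    tag = r[0]
    if tag == "var":
        # (x, S, M) -> (S(x), S, M)
        return lookup(S, r[1]), S, M
    if tag == "update":
        # (x[y |-> z], S, M) -> (new(M), S, M[new(M) |-> M(S(x))[S(y) |-> S(z)]])
        _, x, y, z = r
        k = new(M)
        array = list(M[lookup(S, x)[1]])      # copy: the old array is untouched
        array[lookup(S, y)[1]] = lookup(S, z)
        return ("ref", k), S, {**M, k: array}
    if tag == "let":
        # (let x = v in e, S, M) -> (pop(e), push(x, v, S), M)
        _, x, v, e = r
        return ("pop", e), [(x, v)] + S, M
    raise NotImplementedError(tag)
```

Note that the update rule here is non-destructive: it always allocates a fresh array, which is what the reference-counting semantics later avoids when the count allows it.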
Evaluation with Reference Counting
Reference counts
- Count the number of times each reference appears
- #(S, x): number of times x appears in S
- Invariant:

\[
C(\mathrm{ref}(k)) = \sum_{\mathrm{ref}(k) \in e} 1 \;+\; \#(S, \mathrm{ref}(k)) \;+\; \sum_{\mathrm{ref}(s) \in M} \#(M(\mathrm{ref}(s)), \mathrm{ref}(k))
\]

- Keep track of reference counts
- Free memory when count becomes 0
- Destructive updates when possible
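As a rough illustration of the invariant, the sketch below recomputes the count of a reference from an expression, a stack and a store, and compares it with a count table. The data encoding (tagged tuples, a list of pairs for the stack, dictionaries for the store and the counts) and the function names are assumptions of this sketch.

```python
def occurrences(e, target):
    """Number of occurrences of `target` inside a nested tuple/list term."""
    if e == target:
        return 1
    if isinstance(e, (tuple, list)):
        return sum(occurrences(sub, target) for sub in e)
    return 0


def expected_count(k, e, S, M):
    """Right-hand side of the invariant for the reference ref(k)."""
    target = ("ref", k)
    in_expression = occurrences(e, target)
    in_stack = sum(1 for _, value in S if value == target)
    in_store = sum(array.count(target) for array in M.values())
    return in_expression + in_stack + in_store


def counts_accurate(e, S, M, C):
    """Check C(ref(k)) against the invariant for every reference in the store."""
    return all(C.get(k, 0) == expected_count(k, e, S, M) for k in M)


# A state with one array that is bound to two stack variables, t and x.
stack = [("a", ("int", 0)), ("t", ("ref", 0)), ("x", ("ref", 0))]
store = {0: [("int", 0), ("int", 1)]}
print(counts_accurate(("var", "t"), stack, store, {0: 2}))
```

With two stack variables bound to the same array and no other occurrence, the recomputed count is 2, so a table that records 2 for that reference satisfies the check.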
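Marked variables
- Variables can be marked + new constructor release
- Mark last occurrence of variables in each execution path
- Done statically: mark(X, e) marks each variable in e that is not in X (marked variables are written here as $\overline{x}$)

\[
\mathrm{mark}(X, x) =
\begin{cases}
x & \text{if } x \in X \\
\overline{x} & \text{otherwise}
\end{cases}
\]

\[
\mathrm{mark}(X, \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2) =
\begin{cases}
\mathtt{let}\ x = \mathrm{mark}(X \cup \mathrm{vars}(e_2), e_1)\ \mathtt{in}\ \mathrm{mark}(X, e_2) & \text{if } x \in \mathrm{vars}(e_2) \\
\mathtt{let}\ x = \mathrm{mark}(X \cup \mathrm{vars}(e_2), e_1)\ \mathtt{in}\ \mathtt{release}(\overline{x}, \mathrm{mark}(X, e_2)) & \text{otherwise}
\end{cases}
\]

Example:

\[
\mathtt{let}\ x = f(y)\ \mathtt{in}\ \mathtt{ifnz}\ \overline{z}\ \mathtt{then}\ g(\overline{x}, \overline{y})\ \mathtt{else}\ \mathtt{release}(\overline{y}, f(\overline{x}))
\]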
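A minimal sketch of the marking pass follows. The variable and let cases follow the two rules given on the slide. The cases for function calls and for ifnz are not given there; the versions below are assumptions chosen so that the last occurrence on each execution path is marked and a variable used only in the other branch is released. The tuple encoding and the helper names are also assumptions of this sketch.

```python
def var(x, marked=False):
    return ("var", x, marked)


def vars_of(e):
    """Names of the variables occurring in e (let-bound names excluded)."""
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "call":                       # ("call", f, [var, ...])
        return {a[1] for a in e[2]}
    if tag == "let":                        # ("let", x, e1, e2)
        return vars_of(e[2]) | (vars_of(e[3]) - {e[1]})
    if tag == "ifnz":                       # ("ifnz", var, e1, e2)
        return {e[1][1]} | vars_of(e[2]) | vars_of(e[3])
    return set()                            # integer literals, nil


def mark(X, e):
    """Mark every variable occurrence of e whose name is not in X."""
    tag = e[0]
    if tag == "var":
        return var(e[1], marked=e[1] not in X)
    if tag == "let":
        _, x, e1, e2 = e
        m1 = mark(X | vars_of(e2), e1)      # variables needed by e2 stay unmarked
        m2 = mark(X, e2)
        if x not in vars_of(e2):            # x is never used: release it at once
            m2 = ("release", x, m2)
        return ("let", x, m1, m2)
    if tag == "call":                       # assumed rule: mark last occurrence
        _, f, args = e
        out = []
        for i, a in enumerate(args):
            later = {b[1] for b in args[i + 1:]}
            out.append(var(a[1], marked=a[1] not in (X | later)))
        return ("call", f, out)
    if tag == "ifnz":                       # assumed rule: release per branch
        _, c, e1, e2 = e
        v1, v2 = vars_of(e1), vars_of(e2)

        def branch(body, own, other):
            m = mark(X, body)
            for y in sorted(other - own - X):
                m = ("release", y, m)
            return m

        cond = var(c[1], marked=c[1] not in (X | v1 | v2))
        return ("ifnz", cond, branch(e1, v1, v2), branch(e2, v2, v1))
    return e
```

Applied with an empty X to `let x = f(y) in ifnz z then g(x, y) else f(x)`, this sketch leaves `y` unmarked in `f(y)`, marks `z`, `x` and `y` in the then branch, and wraps the else branch in a release of `y`.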
Correctness
Invariants preserved, with states (e, S, M) and (e′, S′, M′, C′):
- Reference count is accurate
- A variable no longer live in e′ is not bound to a reference in S′
- The expression e′ is correctly marked
- All subterms release(x, e₂) of e′ have x marked
- There is a function f that maps the references in M′ with count > 0 to references in M so that:
  - e is obtained by removing release and unmarking variables in f(e′),
  - for each variable x live in e′, S(x) = f(S′(x)),
  - for each reference s in M′ with C′(s) > 0, M(f(s)) = f(M′(s))
A Small Imperative Language
Syntax

Example: { int z; { z := x[y]; result := x[z] }}

Statements:
  s ::= x := e
      | ifnz x then s₁ else s₂
      | skip
      | { s₁; s₂ }
      | release x
      | { int*ⁿ x; s }

Expressions:
  e ::= n | x | nil | f(x₁, …, xₙ)
      | x[y] | x[y ↦ z]
      | newint(n) | newref(n)

Declarations and programs:
  decl ::= int*ⁿ x
  function ::= (name, decl*, s)
  program ::= function*
Evaluation state
- Fixed program
- State (R, S, M, C):
  - S is the stack,
  - M is the store,
  - C is the reference count,
  - R is the call stack: stack of (function, program counter, local depth)
Reduction of IL
- Extract statement at current program counter
- Reduce statement
- For assignments, except calls: like RL but store result
IL code generation
- Return variable: translate(e, x) sets x to the result of e

  translate(y[z ↦ w], x) = x := y[z ↦ w]

  translate(ifnz y then e₁ else e₂, x) =
      ifnz y then translate(e₁, x) else translate(e₂, x)

  translate(let (y : int*ⁿ) = e₁ in e₂, x) =
      { int*ⁿ y; { translate(e₁, y); { skip; translate(e₂, x) }}}
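The three translation rules can be sketched as a recursive function. The encoding is an assumption of this illustration: expressions that are neither let nor ifnz are kept as opaque strings tagged `"prim"`, and translating them to a single assignment generalizes the array-update rule shown on the slide. The statement constructors and the printer `show` are likewise hypothetical.

```python
def translate(e, x):
    """IL statement that stores the result of expression e in variable x."""
    tag = e[0]
    if tag == "ifnz":                       # ("ifnz", y, e1, e2)
        _, y, e1, e2 = e
        return ("ifnz", y, translate(e1, x), translate(e2, x))
    if tag == "let":                        # ("let", y, type, e1, e2)
        _, y, ty, e1, e2 = e
        body = ("seq", translate(e1, y), ("seq", ("skip",), translate(e2, x)))
        return ("block", ty, y, body)
    if tag == "prim":                       # assumed: any other expression
        return ("assign", x, e[1])
    raise ValueError(tag)


def show(s):
    """Render an IL statement in the concrete syntax used on the slides."""
    tag = s[0]
    if tag == "assign":
        return f"{s[1]} := {s[2]}"
    if tag == "skip":
        return "skip"
    if tag == "seq":
        return f"{{ {show(s[1])}; {show(s[2])} }}"
    if tag == "block":
        return f"{{ {s[1]} {s[2]}; {show(s[3])} }}"
    if tag == "ifnz":
        return f"ifnz {s[1]} then {show(s[2])} else {show(s[3])}"
    raise ValueError(tag)


# The running example: let a = t[i] in let b = t[j] in
# let t2 = t[i -> b] in t2[j -> a], with `result` as the return variable.
swap_body = ("let", "a", "int", ("prim", "t[i]"),
             ("let", "b", "int", ("prim", "t[j]"),
              ("let", "t2", "int*", ("prim", "t[i -> b]"),
               ("prim", "t2[j -> a]"))))
print(show(translate(swap_body, "result")))
```

Each let becomes a declaration block whose body first computes the bound expression into the new variable, then executes a `skip`, then continues with the translation of the let body.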
Correctness
- Reconstruct redex and context from call stack,
- Reconstruct mapping for the stack variables,
- Store and counts the same
Program (the numbers are program counters):

  swap(t, i, j)
   0  { int a;
   1    a := t[i];
   2    skip;
   3    { int b;
   4      b := t[j];
   5      skip;
   6      { int* t′;
   7        t′ := t[i ↦ b];
   8        skip;
   9        result := t′[j ↦ a]
  10      }
  11    }
  12  }
  13

  main()
   0  { int z;
   1    z := +(y, 1);
   2    skip;
   3    result := swap(x, y, z)
   4  }
   5

IL state:
  Stack: (b ↦ 1, a ↦ 0, j ↦ 1, i ↦ 0, t ↦ r, result ↦ undef,
          z ↦ 1, y ↦ 0, x ↦ r, result ↦ undef,
          result ↦ undef)
  Store: (r ↦ ⟨0, 1⟩)
  Count: (r ↦ 2)

Reconstructed from the call stack:
  swap → (pop⁴(□), let b = 1 in let t′ = t[i ↦ b] in t′[j ↦ a])
  main → pop(□)

Combined expression, with the stack, store and count unchanged:
  Expression: pop⁵(let b = 1 in let t′ = t[i ↦ b] in t′[j ↦ a])

After reconstructing the mapping for the stack variables:
  Stack: (a ↦ 0, j ↦ 1, i ↦ 0, t ↦ r, z ↦ 1, y ↦ 0, x ↦ r)
  Expression: pop⁵(let b = 1 in let t′ = t[i ↦ b] in t′[j ↦ a])
  Store: (r ↦ ⟨0, 1⟩)
  Count: (r ↦ 2)
Formalization in PVS
PVS formalization
- Proof of the FL → RL translation, both with and without typing
- Actually a 4th language: RL with several steps at once
- ≈ 7k lines of definitions and theorems; proof files ≈ 310k lines long
- Code available at https://github.com/SRI-CSL/PVSCodegen
- Done before my internship: most definitions of FL, without proofs
Problems caused by PVS
- PVS is slow: proofs take a long time to check
- PVS has bugs: it was possible to prove false
- Proofs of TCCs get misplaced after typechecking again
  → but this made it possible to detect those bugs
The proved theorems

  bisimulation_lemma: THEOREM
    NOT tS`state`error AND NOT trS`state`error AND
    defs_well_typed(D, tS`def_types) AND
    state_matches?(typed_to_topstate(tS), typed_to_topstate(trS))
    IMPLIES
      state_matches?(typed_reduce(D)(tS),
                     typed_reduce_n(D)(top_releases_ct(trS`state`redex) + 1, trS))
The proved theorems

  bisimulation_lemma_i: THEOREM
    NOT trS1`state`error AND NOT trS2`state`error AND
    defs_well_typed(D, trS1`def_types) AND
    iastate_matches(trS1, trS2)
    IMPLIES
      EXISTS (n: posnat):
        iastate_matches(typed_reduce_n(D)(n, trS1), typed_iareduce(D)(trS2))
The proved theorems

  bisimulation_lemma: LEMMA
    FORALL (D, (trS | defs_well_typed(D, trS`def_types)), iS):
      NOT iS`error AND state_matches(D, trS, iS)
      IMPLIES
        (state_matches(D, trS, reduce(iS)) AND
         max_inst_steps(reduce(iS)) < max_inst_steps(iS))
        OR state_matches(D, typed_iareduce(D)(trS), reduce(iS))