Deductive Program Verification with Why3, Jean-Christophe Filliâtre

SLIDE 1

Deductive Program Verification with Why3

Jean-Christophe Filliâtre, CNRS
Tallinn, January 15, 2013
http://why3.lri.fr/tallinn-2013/

SLIDE 2

definition

program + specification → verification conditions → proof

SLIDE 3

this is not new

program + specification → verification conditions → proof

  • A. M. Turing. Checking a large routine. 1949.

[Figure: flowchart from Turing's paper, with the labels STOP, r′ = 1, u′ = 1, v′ = u, TEST r − n, s′ = 1, u′ = u + v, s′ = s + 1, r′ = r + 1, TEST s − r]

SLIDE 4

this is not new

program + specification → verification conditions → proof

  • Tony Hoare. Proof of a program: FIND. Commun. ACM, 1971.

SLIDE 5

proving

program + specification → verification conditions → proof

a lot of theorem provers

  • SMT solvers: CVC3, Z3, Yices, Alt-Ergo, etc. (the SMT revolution)
  • TPTP provers: Vampire, Eprover, SPASS, etc.
  • proof assistants: Coq, PVS, Isabelle, etc.
  • dedicated provers, e.g. Gappa

SLIDE 6

which logic?

program + specification → verification conditions → proof

  • too rich: we can’t use automated theorem provers
  • too poor: we can’t model programming languages and we can’t specify programs

typically, a compromise

  • first-order logic
  • a bunch of theories: arithmetic, arrays, bit vectors, etc.

SLIDE 7

programs

program + specification → verification conditions → proof

extracting verification conditions for a realistic programming language is a lot of work

as in a compiler, we rather translate to some intermediate language from which we extract VCs

SLIDE 8

the Why tool

developed since 2001 at ProVal (LRI / INRIA)

rewritten from scratch, started Feb 2010 ⇒ Why3

authors: F. Bobot, JCF, C. Marché, G. Melquiond, A. Paskevich

  • open source software (LGPL)

http://why3.lri.fr/

a similar tool: Boogie (Microsoft Research)

SLIDE 9

applications

  • Java programs: Krakatoa (Marché Paulin Urbain)
  • C programs: Caduceus (Filliâtre Marché) formerly, Jessie plug-in of Frama-C (Marché Moy) today
  • Ada programs: Hi-Lite (Adacore)
  • algorithms
  • probabilistic programs (Barthe et al.)
  • cryptographic programs (Vieira)

SLIDE 10

overview

[Figure: architecture of the tool chain]

  • inputs: KML-annotated Java program, ACSL-annotated C program, ALFA-annotated Ada program
  • front-ends: Krakatoa, Frama-C, Hi-Lite, Jessie
  • Why3: VC generator, theories, verification conditions, transformations, encodings
  • interactive provers (Coq, PVS, Isabelle/HOL, etc.)
  • automated provers (Alt-Ergo, CVC3, Z3, Simplify, Yices, etc.)
  • more automated provers (Eprover, SPASS, Vampire, Gappa, etc.)

SLIDE 11

overview of Why3

[Figure: pipeline with the labels file.why, file.mlw, WhyML, VCgen, Why, transform/translate, print/run, and the provers Coq, Alt-Ergo, CVC3, Z3, etc.]

SLIDE 12

Part I the logic of Why3

SLIDE 13

in a nutshell

logic of Why3 = polymorphic first-order logic, with

  • (mutually) recursive algebraic data types
  • (mutually) recursive function/predicate symbols
  • (mutually) inductive predicates
  • let-in, match-with, if-then-else

formal definition in

Expressing Polymorphic Types in a Many-Sorted Language (FroCos 2011)

SLIDE 14

Demo 1: the logic of Why3

SLIDE 15

declarations

  • types
      • abstract: type t
      • alias: type t = list int
      • algebraic: type list α = Nil | Cons α (list α)
  • function / predicate
      • uninterpreted: function f int : int
      • defined: predicate non_empty (l: list α) = l ≠ Nil
  • inductive predicate
      • inductive trans t t = ...
  • axiom / lemma / goal
      • goal G: ∀ x: int. x ≥ 0 → x*x ≥ 0

SLIDE 16

theories

logic declarations organized in theories

a theory T1 can be

  • used (use) in a theory T2
  • cloned (clone) in another theory T2

[Figure: three theory ... end boxes]

SLIDE 17

theories

logic declarations organized in theories

a theory T1 can be

  • used (use) in a theory T2
      • symbols of T1 are shared
      • axioms of T1 remain axioms
      • lemmas of T1 become axioms
      • goals of T1 are ignored
  • cloned (clone) in another theory T2

[Figure: three theory ... end boxes]

SLIDE 18

theories

logic declarations organized in theories

a theory T1 can be

  • used (use) in a theory T2
  • cloned (clone) in another theory T2
      • declarations of T1 are copied or substituted
      • axioms of T1 remain axioms or become lemmas/goals
      • lemmas of T1 become axioms
      • goals of T1 are ignored

[Figure: three theory ... end boxes]

SLIDE 19

under the hood

a technology to talk to provers

central concept: task

  • a context (a list of declarations)
  • a goal (a formula)

[Figure: a task with its goal]

SLIDES 20 to 24

workflow

[Figure, built up over five slides: three theory ... end boxes; goal boxes are added one by one, with the labels T1, T2 (presumably transformations) and P (presumably a printer) appearing on the way to the provers Alt-Ergo, Z3, Vampire]

SLIDE 25

transformations

  • eliminate algebraic data types and match-with
  • eliminate inductive predicates
  • eliminate if-then-else, let-in
  • encode polymorphism, encode types
  • etc.

efficient: results of transformations are memoized

SLIDE 26

driver

a task's journey is driven by a file

  • transformations to apply
  • prover’s input format
      • syntax
      • predefined symbols / axioms
  • prover’s diagnostic messages

more details: Why3: Shepherd your herd of provers (Boogie 2011)

SLIDE 27

example: Z3 driver (excerpt)

printer "smtv2"
valid "^unsat"
invalid "^sat"
transformation "inline_trivial"
transformation "eliminate_builtin"
transformation "eliminate_definition"
transformation "eliminate_inductive"
transformation "eliminate_algebraic"
transformation "simplify_formula"
transformation "discriminate"
transformation "encoding_smt"
prelude "(set-logic AUFNIRA)"
theory BuiltIn
  syntax type int "Int"
  syntax type real "Real"
  syntax predicate (=) "(= %1 %2)"
  meta "encoding : kept" type int
end

SLIDE 28

API

Why3 has an OCaml API

  • to build terms, declarations, theories, tasks
  • to call provers

defensive API

  • well-typed terms
  • well-formed declarations, theories, and tasks

SLIDE 29

plug-ins

Why3 can be extended via three kinds of plug-ins

  • parsers (new input formats)
  • transformations (to be used in drivers)
  • printers (to add support for new provers)

SLIDE 30

API and plug-ins

[Figure: your code on top of the Why3 API, surrounded by plug-ins]

  • parsers: WhyML, TPTP, etc.
  • transformations: eliminate algebraic, encode polymorphism, etc.
  • printers: Simplify, Alt-Ergo, SMT-lib, etc.

SLIDE 31

Summary

  • numerous theorem provers are supported
      • Coq, SMT, TPTP, Gappa
  • user-extensible system
      • input languages
      • transformations
      • output syntax
  • efficient
      • e.g. transformations are memoized

more details:

  • Why3: Shepherd your herd of provers. (Boogie 2011)

SLIDE 32

Part II program verification

SLIDE 33

Demo 2: an historical example

  • A. M. Turing. Checking a Large Routine. 1949.

[Figure: flowchart from Turing's paper, with the labels STOP, r′ = 1, u′ = 1, v′ = u, TEST r − n, s′ = 1, u′ = u + v, s′ = s + 1, r′ = r + 1, TEST s − r]

SLIDE 34

Demo 2: an historical example

  • A. M. Turing. Checking a Large Routine. 1949.

[Figure: flowchart from Turing's paper, with the labels STOP, r′ = 1, u′ = 1, v′ = u, TEST r − n, s′ = 1, u′ = u + v, s′ = s + 1, r′ = r + 1, TEST s − r]

u ← 1
for r = 0 to n − 1 do
  v ← u
  for s = 1 to r do
    u ← u + v

demo (access code)
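
The two nested loops can be tried out in Python (not WhyML). This is only a sketch: the runtime assertions stand in for loop invariants that Why3 would prove once and for all, and both the invariants and the helper variable fact_r are assumptions of this example, not taken from the slides.

```python
def factorial_by_additions(n):
    """Compute n! with additions only, following the two nested loops above."""
    assert n >= 0          # assumed precondition, implicit on the slide
    u = 1
    fact_r = 1             # checking aid only: holds r! for the invariant
    for r in range(n):                 # r = 0 .. n-1
        assert u == fact_r             # outer invariant: u = r!
        v = u
        for s in range(1, r + 1):      # s = 1 .. r
            assert u == s * v          # inner invariant: u = s * r!
            u = u + v
        fact_r = fact_r * (r + 1)      # u is now (r+1) * r! = (r+1)!
    assert u == fact_r                 # on exit: u = n!
    return u


print(factorial_by_additions(5))  # prints 120
```

The multiplication appears only in the checking aid; the computation itself uses additions, as in Turing's routine.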

SLIDE 35

Demo 3: another historical example

f(n) = n − 10          if n > 100,
f(n) = f(f(n + 11))    otherwise.

demo (access code)

SLIDE 36

Demo 3: another historical example

f(n) = n − 10          if n > 100,
f(n) = f(f(n + 11))    otherwise.

demo (access code)

e ← 1
while e > 0 do
  if n > 100 then
    n ← n − 10
    e ← e − 1
  else
    n ← n + 11
    e ← e + 1
return n

demo (access code)
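
Both versions of function 91 can be run side by side in Python (not WhyML). A sketch under assumptions: the lexicographic pair used as a termination measure below is one candidate that works, it is not taken from the slides, and the assertion only checks at run time what Why3 would prove statically.

```python
def f91_rec(n):
    """Recursive definition from the slide."""
    return n - 10 if n > 100 else f91_rec(f91_rec(n + 11))


def f91_iter(n):
    """Non-recursive version; e counts the calls to f that are still pending."""
    e = 1
    while e > 0:
        before = (101 - n + 10 * e, e)   # candidate variant (an assumption)
        if n > 100:
            n, e = n - 10, e - 1
        else:
            n, e = n + 11, e + 1
        # the pair decreases lexicographically at every iteration
        assert (101 - n + 10 * e, e) < before
    return n


print(f91_rec(42), f91_iter(42))  # prints 91 91
```

In the first branch the first component is unchanged and e decreases; in the second branch the first component decreases by one. Both components stay non-negative where they decrease, which is what makes the order well-founded.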

SLIDE 37

Recapitulation

  • pre/postcondition

let foo x y z
  requires { P }
  ensures { Q }
= ...

  • loop invariant

while ... do
  invariant { I }
  ...
done

for i = ... do
  invariant { I(i) }
  ...
done

SLIDE 38

Recapitulation

termination of a loop (resp. a recursive function) is ensured by a variant

variant {t} with R

  • R is a well-founded order relation
  • t decreases for R at each step (resp. each recursive call)

by default, t is of type int and R is the relation

y ≺ x  ≝  y < x ∧ 0 ≤ x
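
The default relation can be written down and checked at run time in Python (not WhyML). A sketch only: Euclid's algorithm is a hypothetical example chosen here, it does not appear in the slides, and the assertion replaces the verification condition that Why3 would generate for the variant.

```python
def precedes(y, x):
    """Default relation of the slide: y ≺ x iff y < x and 0 <= x."""
    return y < x and 0 <= x


def gcd(a, b):
    """Euclid's algorithm (hypothetical example); the variant is b."""
    assert a >= 0 and b >= 0        # assumed precondition
    while b != 0:
        old_b = b
        a, b = b, a % b
        assert precedes(b, old_b)   # the variant decreases for ≺
    return a


print(gcd(12, 18))  # prints 6
```

The condition 0 ≤ x is what rules out infinite descending chains: precedes(-2, -1) is False even though -2 < -1.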

SLIDE 39

Remark

as shown with function 91, proving termination may require establishing behavioral properties as well

another example:

  • Floyd’s cycle detection (Hare and Tortoise algorithm)

SLIDE 40

Data structures

up to now, we have only used integers

let us consider more complex data structures

  • arrays
  • algebraic data types

SLIDE 41

Arrays

Why3 standard library provides arrays

use import array.Array

that is

  • a polymorphic type array α
  • an access operation, written a[e]
  • an assignment operation, written a[e1] ← e2
  • operations create, append, sub, copy, etc.

SLIDE 42

Demo 4: two-way sort

sort an array of Boolean, using the following algorithm

let two_way_sort (a: array bool) =
  let i = ref 0 in
  let j = ref (length a - 1) in
  while !i < !j do
    if not a[!i] then incr i
    else if a[!j] then decr j
    else begin
      let tmp = a[!i] in
      a[!i] ← a[!j];
      a[!j] ← tmp;
      incr i;
      decr j
    end
  done

[Figure: the array during the loop: False ... | ? ... ? | ... True, with arrows i and j marking the two ends of the unknown middle part]

demo (access code)
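
The same loop in Python (not WhyML), with the picture above turned into a runtime assertion. A sketch only: the invariant is an assumption read off the figure, and it is checked on each run rather than proved.

```python
def two_way_sort(a):
    """Sort a list of booleans in place, False before True."""
    i, j = 0, len(a) - 1
    while i < j:
        # invariant read off the figure: a[0:i] is all False, a[j+1:] is all True
        assert not any(a[:i]) and all(a[j + 1:])
        if not a[i]:
            i += 1
        elif a[j]:
            j -= 1
        else:
            a[i], a[j] = a[j], a[i]   # a[i] is True and a[j] is False: swap them
            i += 1
            j -= 1


votes = [True, False, True, False, False]
two_way_sort(votes)
print(votes)  # prints [False, False, False, True, True]
```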

SLIDE 43

Exercise 1: Dutch national flag

an array contains elements of the following enumerated type

type color = Blue | White | Red

sort it, in such a way we have the following final situation:

... Blue ... | ... White ... | ... Red ...

SLIDE 44

Exercise: Dutch national flag

let dutch_flag (a: array color) (n: int) =
  let b = ref 0 in
  let i = ref 0 in
  let r = ref n in
  while !i < !r do
    match a[!i] with
    | Blue → swap a !b !i; incr b; incr i
    | White → incr i
    | Red → decr r; swap a !r !i
    end
  done

exercise: exo_flag.mlw
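
To experiment before writing the specification, the loop can be run in Python (not WhyML). A sketch under assumptions: n is taken to be the length of the array, and the invariant asserted below is one candidate, since finding and proving it is the point of the exercise.

```python
from enum import IntEnum


class Color(IntEnum):
    BLUE = 0
    WHITE = 1
    RED = 2


def dutch_flag(a):
    """Sort a list of Color values in place: blue, then white, then red."""
    b, i, r = 0, 0, len(a)
    while i < r:
        # candidate invariant: a[0:b] is blue, a[b:i] is white, a[r:] is red
        assert all(c == Color.BLUE for c in a[:b])
        assert all(c == Color.WHITE for c in a[b:i])
        assert all(c == Color.RED for c in a[r:])
        if a[i] == Color.BLUE:
            a[b], a[i] = a[i], a[b]
            b += 1
            i += 1
        elif a[i] == Color.WHITE:
            i += 1
        else:
            r -= 1
            a[r], a[i] = a[i], a[r]
```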

SLIDE 45

Remark

as for termination, proving safety (such as absence of array access out of bounds) may be arbitrarily difficult

an example:

  • Knuth’s algorithm for N first primes (TAOCP vol. 1)

SLIDE 46

Demo 5: Boyer-Moore’s majority

given a multiset of N votes

A A A C C B B C C C B C C

determine the majority, if any

SLIDE 47

an elegant solution

due to Boyer & Moore (1980)

linear time

uses only three variables

SLIDES 48 to 61

principle

A A A C C B B C C C B C C

the votes are scanned from left to right, one per slide; after each vote, the candidate cand and the counter k are

  vote  1 (A): cand = A, k = 1
  vote  2 (A): cand = A, k = 2
  vote  3 (A): cand = A, k = 3
  vote  4 (C): cand = A, k = 2
  vote  5 (C): cand = A, k = 1
  vote  6 (B): cand = A, k = 0
  vote  7 (B): cand = B, k = 1
  vote  8 (C): cand = B, k = 0
  vote  9 (C): cand = C, k = 1
  vote 10 (C): cand = C, k = 2
  vote 11 (B): cand = C, k = 1
  vote 12 (C): cand = C, k = 2
  vote 13 (C): cand = C, k = 3

then we check if C indeed has majority (in that case, it has)

SLIDE 62

Fortran

SLIDE 63

Why3

let mjrty (a: array candidate) =
  let n = length a in
  let cand = ref a[0] in
  let k = ref 0 in
  for i = 0 to n-1 do
    if !k = 0 then begin cand := a[i]; k := 1 end
    else if !cand = a[i] then incr k
    else decr k
  done;
  if !k = 0 then raise Not_found;
  try
    if 2 * !k > n then raise Found;
    k := 0;
    for i = 0 to n-1 do
      if a[i] = !cand then begin
        incr k;
        if 2 * !k > n then raise Found
      end
    done;
    raise Not_found
  with Found →
    !cand
  end

demo (access code)

SLIDE 64

specification

  • precondition

let mjrty (a: array candidate)
  requires { 1 ≤ length a }

  • postcondition in case of success

  ensures { 2 * numof a result 0 (length a) > length a }

  • postcondition in case of failure

  raises { Not_found → ∀ c: candidate. 2 * numof a c 0 (length a) ≤ length a }

SLIDE 65

annotations

each loop is given a loop invariant

for i = 0 to n-1 do
  invariant { 0 ≤ !k ≤ i ∧
              numof a !cand 0 i ≥ !k ∧
              2 * (numof a !cand 0 i - !k) ≤ i - !k ∧
              ∀ c: candidate. c ≠ !cand → 2 * numof a c 0 i ≤ i - !k }
  ...

for i = 0 to n-1 do
  invariant { !k = numof a !cand 0 i ∧ 2 * !k ≤ n }
  ...
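
The algorithm and its two invariants can be exercised in Python (not WhyML). A sketch only: the assertions check the invariants on each run instead of proving them, the last conjunct of the first invariant is read as ranging over candidates other than cand, and the names NotFound, numof and first_invariant are choices of this example.

```python
class NotFound(Exception):
    """Raised when no candidate has an absolute majority."""


def numof(a, c, lo, hi):
    """Number of occurrences of c in a[lo:hi]."""
    return sum(1 for x in a[lo:hi] if x == c)


def first_invariant(a, cand, k, i):
    """Invariant of the first loop, evaluated on the prefix a[0:i]."""
    m = numof(a, cand, 0, i)
    return (0 <= k <= i and m >= k and 2 * (m - k) <= i - k
            and all(2 * numof(a, c, 0, i) <= i - k
                    for c in set(a) if c != cand))


def mjrty(a):
    n = len(a)
    assert n >= 1                               # precondition
    cand, k = a[0], 0
    for i in range(n):
        assert first_invariant(a, cand, k, i)
        if k == 0:
            cand, k = a[i], 1
        elif cand == a[i]:
            k += 1
        else:
            k -= 1
    assert first_invariant(a, cand, k, n)       # invariant at loop exit
    if k == 0:
        raise NotFound
    if 2 * k > n:
        return cand
    k = 0
    for i in range(n):
        assert k == numof(a, cand, 0, i) and 2 * k <= n
        if a[i] == cand:
            k += 1
            if 2 * k > n:
                return cand
    raise NotFound


print(mjrty(list("AAACCBBCCCBCC")))  # prints C
```

Early returns play the role of the Found exception of the WhyML code.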

SLIDE 66

proof

the verification condition expresses

  • safety
      • array access within bounds
      • termination
  • validity of annotations
      • invariants are initialized and preserved
      • postconditions are established

automatically discharged by SMT solvers

SLIDE 67

Ghost code

may be inserted for the purpose of specification and/or proof

rules are:

  • regular code does not see ghost data
  • ghost code may read regular data (but can’t modify it)

in particular, ghost code may be removed without observable modification

SLIDE 68

Demo 7: ring buffer

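a circular buffer is implemented within an array

type buffer α = {
  mutable first: int;
  mutable len  : int;
          data : array α;
}

len elements are stored, starting at index first

[Figure: array holding x1 x2 ... xlen, with an arrow labelled first under x1]

they may wrap around the array bounds

[Figure: array holding ... xlen at its beginning and x1 x2 at its end, with an arrow labelled first under x1]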

SLIDE 69

Demo 7: ring buffer

we add an extra ghost field to model the buffer contents

type buffer α = {
  mutable first: int;
  mutable len  : int;
          data : array α;
  ghost mutable sequence: list α;
}

SLIDE 70

Demo 7: ring buffer

ghost code is added to set this ghost field accordingly

example:

let push (b: buffer α) (x: α) : unit =
  ghost b.sequence ← b.sequence ++ Cons x Nil;
  let i = b.first + b.len in
  let n = Array.length b.data in
  b.data[if i ≥ n then i - n else i] ← x;
  b.len ← b.len + 1

SLIDE 71

Demo 7: ring buffer

we link the array contents and the ghost field with a type invariant

type buffer α = ...
  invariant {
    let size = Array.length self.data in
    0 ≤ self.first < size ∧
    0 ≤ self.len ≤ size ∧
    self.len = L.length self.sequence ∧
    ∀ i: int. 0 ≤ i < self.len →
      (self.first + i < size →
         nth i self.sequence = Some self.data[self.first + i]) ∧
      (0 ≤ self.first + i - size →
         nth i self.sequence = Some self.data[self.first + i - size])
  }
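
The buffer, its ghost field and the type invariant can be mimicked in Python (not WhyML). A sketch under assumptions: the invariant is checked at run time after each operation rather than proved, push assumes the buffer is not full, and pop is a hypothetical companion operation that the slides do not show.

```python
class RingBuffer:
    def __init__(self, capacity):
        assert capacity > 0
        self.first = 0
        self.len = 0
        self.data = [None] * capacity
        self.sequence = []               # ghost field: the logical contents

    def invariant(self):
        """Runtime version of the type invariant above."""
        size = len(self.data)
        if not (0 <= self.first < size and 0 <= self.len <= size):
            return False
        if self.len != len(self.sequence):
            return False
        for i in range(self.len):
            j = self.first + i
            if j >= size:                # wrap around the array bounds
                j -= size
            if self.sequence[i] != self.data[j]:
                return False
        return True

    def push(self, x):
        assert self.len < len(self.data)     # assumed precondition: not full
        self.sequence = self.sequence + [x]  # ghost update
        i = self.first + self.len
        n = len(self.data)
        self.data[i - n if i >= n else i] = x
        self.len += 1
        assert self.invariant()

    def pop(self):
        """Hypothetical companion of push, not shown on the slides."""
        assert self.len > 0                  # assumed precondition: not empty
        self.sequence = self.sequence[1:]    # ghost update
        x = self.data[self.first]
        self.first = self.first + 1 if self.first + 1 < len(self.data) else 0
        self.len -= 1
        assert self.invariant()
        return x
```

The regular code never reads sequence, so deleting the ghost lines and the invariant checks leaves the observable behaviour unchanged.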

SLIDE 72

Demo 7: ring buffer

such a type invariant

  • is assumed at function entry
  • must be ensured for values returned or modified

SLIDE 73

Demo 7: ring buffer

alternatively, we could have introduced a logical function mapping the buffer to a list

function buffer_model (b: buffer α) : list α
(* + suitable axioms *)

but ghost code

  • is more compact
  • results in simpler proof (it provides explicit witnesses)

SLIDE 74

Other data structures

a key idea of Hoare logic:

any types and symbols from the logic can be used in programs

note: we already used type int this way

SLIDE 75

Algebraic data types

we can do so with algebraic data types

in the library, we find

type bool = True | False             (in bool.Bool)
type option α = None | Some α        (in option.Option)
type list α = Nil | Cons α (list α)  (in list.List)

SLIDE 76

Demo 7: same fringe

given two binary trees, do they contain the same elements when traversed in order?

[Figure: two binary trees of different shapes, with node labels 8 3 1 5 4 and 4 1 3 8 5]

SLIDE 77

Demo 7: same fringe

type elt

type tree =
  | Empty
  | Node tree elt tree

function elements (t: tree) : list elt =
  match t with
  | Empty → Nil
  | Node l x r → elements l ++ Cons x (elements r)
  end

let same_fringe (t1 t2: tree) : bool
  ensures { result=True ↔ elements t1 = elements t2 }
= ...

SLIDE 78

Demo 7: same fringe

one solution: look at the left branch as a list, from bottom up

[Figure: the left branch of a tree, with nodes x1, x2, ..., xn and subtrees t1, t2, ..., tn]

SLIDE 79

Demo 7: same fringe

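one solution: look at the left branch as a list, from bottom up

[Figure: the left branch of a tree, with nodes x1, x2, ..., xn and subtrees t1, t2, ..., tn; below it, the two example trees with node labels 1 3 8 5 4 and 1 4 3 8 5]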

demo (access code)
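
The idea of reading the left branch as a list, bottom up, can be sketched in Python (not WhyML). Assumptions of this example: a tree is None or a tuple (left, x, right), the name enum and the triple representation are made up here, and the two sample trees are hypothetical shapes with the labels 1 3 4 5 8, since the drawing of the slide is lost.

```python
def elements(t):
    """In-order list of elements, like the logical function elements."""
    if t is None:
        return []
    left, x, right = t
    return elements(left) + [x] + elements(right)


def enum(t, e):
    """Put the left branch of t in front of e, bottom-most node first.

    An enumerator is None or a triple (x, right_subtree, rest)."""
    while t is not None:
        left, x, right = t
        e = (x, right, e)
        t = left
    return e


def same_fringe(t1, t2):
    e1, e2 = enum(t1, None), enum(t2, None)
    while e1 is not None and e2 is not None:
        x1, r1, rest1 = e1
        x2, r2, rest2 = e2
        if x1 != x2:
            return False
        e1, e2 = enum(r1, rest1), enum(r2, rest2)
    return e1 is None and e2 is None


def leaf(x):
    return (None, x, None)


tree_a = ((leaf(1), 3, (leaf(4), 5, None)), 8, None)
tree_b = ((None, 1, leaf(3)), 4, (leaf(5), 8, None))
print(same_fringe(tree_a, tree_b))  # prints True
```

Unlike comparing elements t1 with elements t2, this stops at the first difference and never builds the two full lists.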

SLIDE 80

Exercise 2: inorder traversal

type elt

type tree = Null | Node tree elt tree

inorder traversal of t, storing its elements in array a

let rec fill (t: tree) (a: array elt) (start: int) : int =
  match t with
  | Null → start
  | Node l x r →
      let res = fill l a start in
      if res ≠ length a then begin
        a[res] ← x;
        fill r a (res + 1)
      end else
        res
  end

exercise: exo_fill.mlw
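
A Python transcription (not WhyML) helps to see what the function returns before specifying it. A sketch under assumptions: a tree is None or a tuple (left, x, right), the precondition on start is a guess, and the test is read as res ≠ length a, since writing a[res] when res equals the length would be out of bounds.

```python
def fill(t, a, start):
    """Store the elements of t in a, in order, from index start.

    Returns the index following the last element stored."""
    assert 0 <= start <= len(a)      # assumed precondition
    if t is None:                    # Null
        return start
    left, x, right = t               # Node left x right
    res = fill(left, a, start)
    if res != len(a):
        a[res] = x
        return fill(right, a, res + 1)
    return res                       # the array is full: stop here


def leaf(x):
    return (None, x, None)


sample = ((leaf(1), 3, leaf(4)), 5, leaf(8))
small = [0, 0, 0]
print(fill(sample, small, 0), small)  # prints 3 [1, 3, 4]
```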

SLIDE 81

Part III Modeling

SLIDE 82

Back on arrays

in the library, we find

type array α model { length: int; mutable elts: map int α }

two meanings

  • in programs, an abstract data type: type array α
  • in the logic, an immutable record type: type array α = { length: int; elts: map int α }

SLIDE 83

Back on arrays

one cannot define operations over type array α (it is abstract) but one may declare them

examples:

val ([]) (a: array α) (i: int) : α
  reads {a}
  requires { 0 ≤ i < length a }
  ensures { result = a[i] }

val ([]←) (a: array α) (i: int) (v: α) : unit
  writes {a}
  requires { 0 ≤ i < length a }
  ensures { a.elts = M.set (old a.elts) i v }

SLIDE 84

Modeling

one can model this way many data structures (be they implemented or not)

examples: stacks, queues, priority queues, graphs, etc.

SLIDE 85

Example: hash tables

type key
type t 'a
val create: int -> t 'a
val clear: t 'a -> unit
val add: t 'a -> key -> 'a -> unit
exception Not_found
val find: t 'a -> key -> 'a

SLIDE 86

Example: hash tables

type key

type t α model { mutable contents: map key (list α) }

val add (h: t α) (k: key) (v: α) : unit
  writes {h}
  ensures { h[k] = Cons v (old h)[k] }
  ensures { ∀ k': key. k' ≠ k → h[k'] = (old h)[k'] }

...
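
The model and the contract of add can be played with in Python (not WhyML). A sketch under assumptions: the two postconditions are checked at run time, the second one is read as ranging over keys other than k, an unbound key is taken to map to the empty list, and the class and method names are made up for this example.

```python
class HashModel:
    """Model of the table: contents maps each key to the list of its bindings."""

    def __init__(self):
        self.contents = {}

    def get(self, k):
        """h[k] in the notation of the slide; empty list for an unbound key (assumed)."""
        return self.contents.get(k, [])

    def add(self, k, v):
        old = {key: list(vs) for key, vs in self.contents.items()}  # (old h)
        self.contents[k] = [v] + self.get(k)
        # first postcondition: h[k] = Cons v (old h)[k]
        assert self.get(k) == [v] + old.get(k, [])
        # second postcondition: the other keys are unchanged
        assert all(self.get(k2) == vs for k2, vs in old.items() if k2 != k)


h = HashModel()
h.add("x", 1)
h.add("y", 7)
h.add("x", 2)
print(h.get("x"))  # prints [2, 1]
```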

SLIDE 87

Limitation

it is also possible to implement hash tables

type t α = {
  mutable size: int;
  mutable data: array (list (key, α));
}
invariant ...

but it is (currently) not possible to prove that it implements the model from the previous slide

SLIDE 88

Another example: 32-bit arithmetic

let us model signed 32-bit arithmetic

two possibilities:

  • ensure absence of arithmetic overflow
  • model machine arithmetic faithfully (i.e. with overflows)

a constraint: we do not want to lose arithmetic capabilities of SMT solvers

SLIDE 89

32-bit arithmetic

we introduce a new type for 32-bit integers

type int32

the integer value is given by

function toint int32 : int

within annotations, we only use type int

an expression x : int32 appears, in annotations, as toint x

SLIDE 90

32-bit arithmetic

we define the range of 32-bit integers

function min_int: int = -2147483648
function max_int: int = 2147483647

when we use them...

axiom int32_domain: ∀ x: int32. min_int ≤ toint x ≤ max_int

... and when we build them

val ofint (x:int) : int32
  requires { min_int ≤ x ≤ max_int }
  ensures { toint result = x }

SLIDE 91

32-bit arithmetic

then each program expression such as

x + y

is translated into

ofint (toint x + toint y)

this ensures the absence of arithmetic overflow (but we get a large number of additional verification conditions)
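
The translation can be imitated in Python (not WhyML). A sketch under assumptions: the sum is read as ofint applied to toint x + toint y, the precondition of ofint becomes a runtime check that raises OverflowError, where Why3 would instead generate a verification condition, and the class Int32 and the function add are names chosen for this example.

```python
from dataclasses import dataclass

MIN_INT = -2147483648
MAX_INT = 2147483647


@dataclass(frozen=True)
class Int32:
    value: int      # the mathematical integer returned by toint


def toint(x):
    return x.value


def ofint(x):
    """Runtime stand-in for the precondition min_int <= x <= max_int."""
    if not MIN_INT <= x <= MAX_INT:
        raise OverflowError(f"{x} does not fit in 32 bits")
    return Int32(x)


def add(x, y):
    """The program expression x + y after translation."""
    return ofint(toint(x) + toint(y))


print(toint(add(ofint(40), ofint(2))))  # prints 42
```

The sum itself is computed on unbounded integers, which mirrors the constraint stated earlier: the provers keep reasoning on plain integer arithmetic, and only the range check is added.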

SLIDE 92

Demo 8: Binary Search

let us consider searching for a value in a sorted array using binary search

let us show the absence of arithmetic overflow

demo (access code)

SLIDE 93

Binary Search

we found a bug

the computation

let m = (!l + !u) / 2 in

may provoke an arithmetic overflow (for instance with a 2-billion elements array)

a possible fix is

let m = !l + (!u - !l) / 2 in
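
The two midpoint computations can be compared in Python (not WhyML), where integers are unbounded, by checking every intermediate result against the 32-bit range. A sketch under assumptions: checked is a hypothetical helper, and binary_search is a generic version written for this example (returning -1 when the value is absent), not the code of the demo.

```python
MAX_INT = 2147483647


def checked(x):
    """Hypothetical helper: reject any intermediate result outside 32 bits."""
    if not -MAX_INT - 1 <= x <= MAX_INT:
        raise OverflowError(x)
    return x


def mid_naive(l, u):
    return checked(checked(l + u) // 2)


def mid_safe(l, u):
    return checked(l + checked(checked(u - l) // 2))


def binary_search(a, v, mid=mid_safe):
    """Index of v in the sorted list a, or -1 if v does not occur."""
    l, u = 0, len(a) - 1
    while l <= u:
        m = mid(l, u)
        if a[m] < v:
            l = m + 1
        elif a[m] > v:
            u = m - 1
        else:
            return m
    return -1


print(mid_safe(1_500_000_000, 2_000_000_000))  # prints 1750000000
```

With the same bounds, mid_naive raises OverflowError because l + u exceeds 2147483647, although the midpoint itself fits in 32 bits.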

SLIDE 94

modeling the heap

SLIDE 95

Principle

the second key idea of Hoare logic is

one can statically identify the various memory locations (absence of aliasing)

in particular, memory locations are not first-class values

to handle programs with pointers, one has to model the memory heap

SLIDE 96

Memory model

consider for instance C programs with pointers of type int*

a possible model is

type pointer
val memory: ref (map pointer int)

the C expression

*p

is translated into the Why3 expression

!memory[p]

SLIDE 97

Memory model

there are more subtle models

such as the component-as-array model (Burstall / Bornat)

each structure field is modeled as a separate map

the C type

struct List { int head; struct List *next; };

is modeled as

type pointer
val head: ref (map pointer int)
val next: ref (map pointer pointer)
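
The component-as-array model can be illustrated in Python (not WhyML) with one dictionary per structure field. A sketch under assumptions: pointers are plain integers, NULL is encoded as 0, and the functions cons and list_sum are hypothetical helpers written for this example.

```python
NULL = 0   # assumed encoding of the null pointer, not given on the slide


def cons(head, next_, p, value, tail):
    """Initialise the cell at address p; head and next_ are the two field maps."""
    head[p] = value
    next_[p] = tail
    return p


def list_sum(head, next_, p):
    """Sum of the head fields along the chain starting at p."""
    total = 0
    while p != NULL:
        total += head[p]     # C: p->head
        p = next_[p]         # C: p = p->next
    return total


head, next_ = {}, {}
cells = cons(head, next_, 1, 10,
             cons(head, next_, 2, 20,
                  cons(head, next_, 3, 30, NULL)))
print(list_sum(head, next_, cells))  # prints 60
```

Writing to head leaves next_ untouched by construction, so an update of a head field cannot change the shape of the list. With a single memory map, that fact would need a proof.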

SLIDE 98

Memory models

such models are used in aforementioned tools for C, Java, and Ada

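[Figure: the architecture overview of slide 10, without the provers]

  • inputs: KML-annotated Java program, ACSL-annotated C program, ALFA-annotated Ada program
  • front-ends: Krakatoa, Frama-C, Hi-Lite, Jessie
  • Why3: VC generator, theories, verification conditions, transformations, encodings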

SLIDE 99

conclusion

SLIDE 100

Things not covered in this lecture

  • how aliases are excluded
  • how verification conditions are computed
  • how formulas are sent to provers
  • how floating-point arithmetic is modeled
  • etc.

SLIDE 101

Conclusion

we saw three different ways of using Why3

  • as a logical language (a convenient front-end to many theorem provers)
  • as a programming language to prove algorithms (currently 78 examples in our gallery)
  • as an intermediate language (for the verification of C, Java, Ada, etc.)
