Decision Procedures for Algebraic Data Types with Abstractions



slide-1
SLIDE 1

Decision Procedures for Algebraic Data Types with Abstractions

Philippe Suter, Mirco Dotta and Viktor Kuncak

slide-2
SLIDE 2

Verification of functional programs

Outcome: a proof, or a counterexample (input, trace)

slide-3
SLIDE 3

sealed abstract class Tree
case class Node(left: Tree, value: Int, right: Tree) extends Tree
case class Leaf() extends Tree

object BST {
  def add(tree: Tree, element: Int): Tree = tree match {
    case Leaf() ⇒ Node(Leaf(), element, Leaf())
    case Node(l, v, r) if v > element ⇒ Node(add(l, element), v, r)
    case Node(l, v, r) if v < element ⇒ Node(l, v, add(r, element))
    case Node(l, v, r) if v == element ⇒ tree
  } ensuring (result ≠ Leaf())
}

(tree = Node(l, v, r) ∧ v > element ∧ result ≠ Leaf()) ⇒ Node(result, v, r) ≠ Leaf()

We know how to generate verification conditions for functional programs

slide-4
SLIDE 4

Proving verification conditions

(tree = Node(l, v, r) ∧ v > element ∧ result ≠ Leaf()) ⇒ Node(result, v, r) ≠ Leaf()

  • D.C. Oppen, Reasoning about Recursively Defined Data Structures, POPL ’78
  • G. Nelson, D.C. Oppen, Simplification by Cooperating Decision Procedures, TOPLAS ’79

Previous work gives decision procedures that can handle certain verification conditions

slide-5
SLIDE 5

sealed abstract class Tree
case class Node(left: Tree, value: Int, right: Tree) extends Tree
case class Leaf() extends Tree

object BST {
  def add(tree: Tree, element: Int): Tree = tree match {
    case Leaf() ⇒ Node(Leaf(), element, Leaf())
    case Node(l, v, r) if v > element ⇒ Node(add(l, element), v, r)
    case Node(l, v, r) if v < element ⇒ Node(l, v, add(r, element))
    case Node(l, v, r) if v == element ⇒ tree
  } ensuring (content(result) == content(tree) ∪ { element })

  def content(tree: Tree) : Set[Int] = tree match {
    case Leaf() ⇒ ∅
    case Node(l, v, r) ⇒ content(l) ∪ { v } ∪ content(r)
  }
}
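A runnable approximation of this slide in plain Scala, offered as a sketch rather than as the authors' tool input. The slide's `∅`, `∪` and `⇒` are written with standard library operations, the wrapper name `ContentSketch` is hypothetical, and `ensuring` is Scala's `Predef` runtime assertion, so the postcondition is only checked on the inputs that are actually executed; nothing is proved.

```scala
object ContentSketch {
  sealed abstract class Tree
  case class Node(left: Tree, value: Int, right: Tree) extends Tree
  case class Leaf() extends Tree

  // Abstraction function: the set of values stored in the tree.
  def content(tree: Tree): Set[Int] = tree match {
    case Leaf()        => Set.empty[Int]
    case Node(l, v, r) => content(l) ++ Set(v) ++ content(r)
  }

  // Same insertion as on the slide; the postcondition is checked at run time.
  def add(tree: Tree, element: Int): Tree = {
    tree match {
      case Leaf()                       => Node(Leaf(), element, Leaf())
      case Node(l, v, r) if v > element => Node(add(l, element), v, r)
      case Node(l, v, r) if v < element => Node(l, v, add(r, element))
      case Node(_, _, _)                => tree // remaining case: v == element
    }
  } ensuring (result => content(result) == content(tree) + element)
}
```

For example, folding `add` over `List(4, 2, 7)` starting from `Leaf()` yields a tree whose `content` is `Set(2, 4, 7)`.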

slide-6
SLIDE 6

Complex verification condition

t1 = Node(t2, e1, t3) ∧ content(t4) = content(t2) ∪ { e2 } ∧ content(Node(t4, e1, t3)) ≠ content(t1) ∪ { e2 }

where

def content(tree: Tree) : Set[Int] = tree match {
  case Leaf() ⇒ ∅
  case Node(l, v, r) ⇒ content(l) ∪ { v } ∪ content(r)
}

The formula mixes: Set Expressions, a Recursive Function, Algebraic Data Types

slide-7
SLIDE 7

Our contribution

Decision procedures for extensions of algebraic data types with certain recursive functions

slide-8
SLIDE 8

Formulas we aim to prove

Quantifier-free Formula:

t1 = Node(t2, e1, t3) ∧ content(t4) = content(t2) ∪ { e2 } ∧ content(Node(t4, e1, t3)) ≠ content(t1) ∪ { e2 }

where (Generalized Fold Function):

def content(tree: Tree) : Set[Int] = tree match {
  case Leaf() ⇒ ∅
  case Node(l, v, r) ⇒ content(l) ∪ { v } ∪ content(r)
}

Domain with a Decidable Theory: Set[Int]

slide-9
SLIDE 9

General form of our recursive functions

def α(tree: Tree) : C = tree match {
  case Leaf() ⇒ empty
  case Node(l, v, r) ⇒ combine(α(l), v, α(r))
}

empty : C
combine : (C, E, C) → C

Example:

def content(tree: Tree) : Set[Int] = tree match {
  case Leaf() ⇒ ∅
  case Node(l, v, r) ⇒ content(l) ∪ { v } ∪ content(r)
}
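The general form above can be sketched as one higher-order function in plain Scala. This is an illustration under assumptions: the names `FoldSketch` and `alpha` are hypothetical, the element type E is fixed to `Int`, and `size` and `height` are instances chosen here to match the abstractions named elsewhere in the deck.

```scala
object FoldSketch {
  sealed abstract class Tree
  case class Node(left: Tree, value: Int, right: Tree) extends Tree
  case class Leaf() extends Tree

  // The fold is fully determined by `empty: C` and `combine: (C, E, C) => C`.
  def alpha[C](empty: C, combine: (C, Int, C) => C)(tree: Tree): C = tree match {
    case Leaf()        => empty
    case Node(l, v, r) => combine(alpha(empty, combine)(l), v, alpha(empty, combine)(r))
  }

  // content: empty = ∅, combine = left ∪ { v } ∪ right
  def content(t: Tree): Set[Int] =
    alpha[Set[Int]](Set.empty[Int], (l, v, r) => l ++ Set(v) ++ r)(t)

  // size: number of Node constructors
  def size(t: Tree): Int = alpha[Int](0, (l, _, r) => l + 1 + r)(t)

  // height: length of the longest path from the root to a Leaf
  def height(t: Tree): Int = alpha[Int](0, (l, _, r) => 1 + math.max(l, r))(t)
}
```

Only `empty` and `combine` change between abstractions; the recursion scheme stays the same.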

slide-10
SLIDE 10

Scope of our result - Examples

Tree content abstraction, as a:
  • Set
  • Multiset
  • List
Tree size, height, min
Invariants (sortedness, …)

[Kuncak, Rinard ’07] [Piskac, Kuncak ’08] [Plandowski ’04] [Papadimitriou ’81] [Nelson, Oppen ’79]

slide-11
SLIDE 11

How do we prove such formulas?

Quantifier-free Formula:

t1 = Node(t2, e1, t3) ∧ content(t4) = content(t2) ∪ { e2 } ∧ content(Node(t4, e1, t3)) ≠ content(t1) ∪ { e2 }

where (Generalized Fold Function):

def content(tree: Tree) : Set[Int] = tree match {
  case Leaf() ⇒ ∅
  case Node(l, v, r) ⇒ content(l) ∪ { v } ∪ content(r)
}

Domain with a Decidable Theory: Set[Int]

slide-12
SLIDE 12

Separate the Conjuncts

t1 = Node(t2, e1, t3) ∧ content(t4) = content(t2) ∪ { e2 } ∧ content(Node(t4, e1, t3)) ≠ content(t1) ∪ { e2 }

becomes

t1 = Node(t2, e1, t3) ∧ t5 = Node(t4, e1, t3) ∧
c4 = c2 ∪ { e2 } ∧ c5 ≠ c1 ∪ { e2 } ∧
c1 = content(t1) ∧ … ∧ c5 = content(t5)

slide-13
SLIDE 13

[Figure: example trees t1 and t5 built from the subtrees t2, t3, t4, and the sets c2, c3, c4 obtained by applying content to them.]

slide-14
SLIDE 14

Overview of the decision procedure

tree constraints from the input formula:
t1 = Node(t2, e1, t3) ∧ t5 = Node(t4, e1, t3)

mappings from the input formula:
ci = content(ti), i ∈ { 1, …, 5 }

set constraints from the input formula:
c4 = c2 ∪ { e2 } ∧ c5 ≠ c1 ∪ { e2 }

additional derived constraints:
c1 = c2 ∪ { e1 } ∪ c3 ∧ c5 = c4 ∪ { e1 } ∪ c3

resulting formula:
c4 = c2 ∪ { e2 } ∧ c5 ≠ c1 ∪ { e2 } ∧ c1 = c2 ∪ { e1 } ∪ c3 ∧ c5 = c4 ∪ { e1 } ∪ c3

The resulting formula is in the decidable theory of sets, so it can be handed to a Decision Procedure for Sets
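To see why the resulting set formula has no model, here is a brute-force search over a small finite universe. It is only a sanity check written for this note, not the decision procedure for sets used by the authors; `SetFormulaSketch` and `findModel` are hypothetical names. Since c1, c4 and c5 are fixed by the equalities, it suffices to enumerate c2, c3, e1 and e2.

```scala
object SetFormulaSketch {
  // Search for values satisfying all four conjuncts of the resulting formula.
  def findModel(universe: Set[Int]): Option[(Set[Int], Set[Int], Int, Int)] = {
    val subsets  = universe.subsets().toList
    val elements = universe.toList
    val models = for {
      c2 <- subsets
      c3 <- subsets
      e1 <- elements
      e2 <- elements
      c1 = c2 ++ Set(e1) ++ c3 // derived: c1 = c2 ∪ { e1 } ∪ c3
      c4 = c2 + e2             // input:   c4 = c2 ∪ { e2 }
      c5 = c4 ++ Set(e1) ++ c3 // derived: c5 = c4 ∪ { e1 } ∪ c3
      if c5 != c1 + e2         // input:   c5 ≠ c1 ∪ { e2 }
    } yield (c2, c3, e1, e2)
    models.headOption
  }
}
```

`findModel(Set(0, 1, 2))` returns `None`: substituting the equalities gives c5 = c2 ∪ { e2 } ∪ { e1 } ∪ c3 = c1 ∪ { e2 }, which contradicts the disequality, so the original verification condition is valid.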

slide-15
SLIDE 15

What we have seen is a simple, correct algorithm. But is it complete?

slide-16
SLIDE 16

A verifier based on such a procedure

val c1 = content(t1)
val c2 = content(t2)
if (t1 ≠ t2) {
  if (c1 == ∅) {
    assert(c2 ≠ ∅)
    x = c2.chooseElement
  }
}

c1 = content(t1) ∧ c2 = content(t2) ∧ t1 ≠ t2 ∧ c1 = ∅ ∧ c2 = ∅

Warning: possible assertion violation

slide-17
SLIDE 17

Source of incompleteness

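c1 = content(t1) ∧ c2 = content(t2) ∧ t1 ≠ t2 ∧ c1 = ∅ ∧ c2 = ∅

The set constraints c1 = ∅ ∧ c2 = ∅ are satisfiable on their own, but only Leaf has empty content, so they force t1 = t2 = Leaf and contradict t1 ≠ t2.

Models for the formula in the logic of sets must not contradict the disequalities over trees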
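The claim that only `Leaf` maps to the empty set can be checked on a bounded enumeration of trees. The sketch below is an illustration, not part of the original procedure; the names `EmptyContentSketch`, `trees` and `emptyContentTrees` are hypothetical, and the depth bound and value list are arbitrary choices.

```scala
object EmptyContentSketch {
  sealed abstract class Tree
  case class Node(left: Tree, value: Int, right: Tree) extends Tree
  case class Leaf() extends Tree

  def content(tree: Tree): Set[Int] = tree match {
    case Leaf()        => Set.empty[Int]
    case Node(l, v, r) => content(l) ++ Set(v) ++ content(r)
  }

  // All trees with at most `depth` levels of Node, with values drawn from `values`.
  def trees(depth: Int, values: List[Int]): List[Tree] =
    if (depth == 0) List(Leaf())
    else {
      val smaller = trees(depth - 1, values)
      Leaf() :: (for (l <- smaller; v <- values; r <- smaller) yield Node(l, v, r))
    }

  // The bounded inverse image of ∅ under content.
  def emptyContentTrees(depth: Int, values: List[Int]): List[Tree] =
    trees(depth, values).filter(t => content(t).isEmpty)
}
```

With `depth = 2` and values `List(1, 5)` the enumeration contains 19 trees, and `emptyContentTrees` returns just `List(Leaf())`, so two distinct trees cannot both have empty content.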

slide-18
SLIDE 18

How to make the algorithm complete

  • Case analysis for each tree variable:
    – Is it Leaf?
    – Is it not Leaf?

c1 = content(t1) ∧ c2 = content(t2) ∧ t1 ≠ t2 ∧ c1 = ∅ ∧ c2 = ∅

extended with one of the following cases:

∧ t1 = Leaf ∧ t2 = Leaf
∧ t1 = Leaf ∧ t2 = Node(t3, e, t4)
∧ t1 = Node(t3, e, t4) ∧ t2 = Leaf
∧ t1 = Node(t3, e1, t4) ∧ t2 = Node(t5, e2, t6)

This gives a complete decision procedure for the content function that maps to sets
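The case analysis can be sketched as an enumeration of Leaf/Node choices per tree variable, followed by a check of the example formula in each case. This is a hand-written illustration for this one formula; `CaseSplitSketch`, `Shape`, `caseSplit` and `satisfiable` are hypothetical names, and the two facts used in `satisfiable` (a Node has non-empty content, two Leafs are equal) are encoded by hand rather than derived.

```scala
object CaseSplitSketch {
  sealed trait Shape
  case object IsLeaf extends Shape
  case object IsNode extends Shape

  // Every way of deciding, per tree variable, whether it is Leaf or a Node.
  def caseSplit(vars: List[String]): List[Map[String, Shape]] =
    vars.foldRight(List(Map.empty[String, Shape])) { (v, acc) =>
      for (shape <- List[Shape](IsLeaf, IsNode); m <- acc) yield m + (v -> shape)
    }

  // Example formula: t1 ≠ t2 ∧ content(t1) = ∅ ∧ content(t2) = ∅
  def satisfiable(m: Map[String, Shape]): Boolean = {
    val bothLeaf = m("t1") == IsLeaf && m("t2") == IsLeaf
    val emptyContents = bothLeaf // a Node always has non-empty content
    val distinctTrees = !bothLeaf // Leaf = Leaf, so two Leafs cannot differ
    emptyContents && distinctTrees
  }
}
```

`caseSplit(List("t1", "t2"))` produces the four cases listed on the slide, and `satisfiable` rejects each of them, so the spurious warning from the previous slide disappears.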

slide-19
SLIDE 19

What about other content functions?

Tree content abstraction, as a: Set Multiset List Tree size, height, min Invariants (sortedness,…)

slide-20
SLIDE 20

Sufficient Surjectivity

How and when we can have a complete algorithm

slide-21
SLIDE 21

Choice of trees is constrained by sets

tree constraints from the input formula:
t1 = Node(t2, e1, t3) ∧ t5 = Node(t4, e1, t3)

mappings from the input formula:
ci = content(ti), i ∈ { 1, …, 5 }

set constraints from the input formula:
c4 = c2 ∪ { e2 } ∧ c5 ≠ c1 ∪ { e2 }

additional derived constraints:
c1 = c2 ∪ { e1 } ∪ c3 ∧ c5 = c4 ∪ { e1 } ∪ c3

resulting formula (input to the Decision Procedure for Sets):
c4 = c2 ∪ { e2 } ∧ c5 ≠ c1 ∪ { e2 } ∧ c1 = c2 ∪ { e1 } ∪ c3 ∧ c5 = c4 ∪ { e1 } ∪ c3

slide-22
SLIDE 22

Inverse images

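  • When we have a model for c1, c2, …, how can we pick distinct values for t1, t2, …?

ci = content(ti) ⇔ ti ∈ content⁻¹(ci)

[Figure: α maps trees to collections; α⁻¹ maps a collection back to the set of trees that abstract to it.]

The cardinality of α⁻¹(ci) is what matters.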

slide-23
SLIDE 23

‘Surjectivity’ of set abstraction

[Figure: content⁻¹(∅) contains a single tree; content⁻¹({ 1, 5 }) contains infinitely many trees built from the elements 1 and 5.]

|content⁻¹(∅)| = 1
|content⁻¹({ 1, 5 })| = ∞
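The two cardinalities can be approximated by counting trees up to a depth bound. The sketch below is illustrative only; `InverseImageSketch`, `trees` and `inverseImageSize` are hypothetical names, and a bounded count can suggest, but not establish, that the inverse image of { 1, 5 } is infinite.

```scala
object InverseImageSketch {
  sealed abstract class Tree
  case class Node(left: Tree, value: Int, right: Tree) extends Tree
  case class Leaf() extends Tree

  def content(tree: Tree): Set[Int] = tree match {
    case Leaf()        => Set.empty[Int]
    case Node(l, v, r) => content(l) ++ Set(v) ++ content(r)
  }

  // All trees with at most `depth` levels of Node, with values drawn from `values`.
  def trees(depth: Int, values: List[Int]): List[Tree] =
    if (depth == 0) List(Leaf())
    else {
      val smaller = trees(depth - 1, values)
      Leaf() :: (for (l <- smaller; v <- values; r <- smaller) yield Node(l, v, r))
    }

  // Number of trees of depth <= `depth` whose content is exactly `c`.
  // A tree with content `c` can only contain elements of `c`.
  def inverseImageSize(c: Set[Int], depth: Int): Int =
    trees(depth, c.toList).count(t => content(t) == c)
}
```

For the empty set the count stays at 1 whatever the bound, while for `Set(1, 5)` it keeps growing as the depth bound increases.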

slide-24
SLIDE 24

In-order traversal

[Figure: a tree holding the values 2, 1, 7, 4, mapped by inorder to the list [ 1, 2, 4, 7 ].]
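In-order traversal is another instance of the fold scheme, with lists as the collection. A minimal sketch follows; the wrapper name `InorderSketch` is hypothetical, and the concrete tree `example` is an assumption, since the exact shape drawn on the slide is not recoverable from the text (this is one shape whose traversal gives the list shown).

```scala
object InorderSketch {
  sealed abstract class Tree
  case class Node(left: Tree, value: Int, right: Tree) extends Tree
  case class Leaf() extends Tree

  // empty = Nil, combine = left ++ [v] ++ right
  def inorder(tree: Tree): List[Int] = tree match {
    case Leaf()        => Nil
    case Node(l, v, r) => inorder(l) ::: (v :: inorder(r))
  }

  // Assumed shape: root 2, left child 1, right child 7 whose left child is 4.
  val example: Tree =
    Node(Node(Leaf(), 1, Leaf()), 2, Node(Node(Leaf(), 4, Leaf()), 7, Leaf()))
}
```

Under that assumption, `inorder(example)` evaluates to `List(1, 2, 4, 7)`.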

slide-25
SLIDE 25

‘Surjectivity’ of in-order traversal

[Figure: inorder⁻¹([ 1, 5 ]) contains two trees; inorder⁻¹([ ]) contains only the empty tree.]

|inorder⁻¹(list)| = (number of trees of size n = length(list))
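The inverse image of in-order traversal can be enumerated directly: choose which list position becomes the root, then recurse on both sides. The sketch below uses hypothetical names (`InorderInverseSketch`, `inverse`, `catalan`); the link to Catalan numbers is a standard fact about binary tree shapes that the slide does not state explicitly.

```scala
object InorderInverseSketch {
  sealed abstract class Tree
  case class Node(left: Tree, value: Int, right: Tree) extends Tree
  case class Leaf() extends Tree

  def inorder(tree: Tree): List[Int] = tree match {
    case Leaf()        => Nil
    case Node(l, v, r) => inorder(l) ::: (v :: inorder(r))
  }

  // All trees whose in-order traversal is exactly `list`.
  def inverse(list: List[Int]): List[Tree] =
    if (list.isEmpty) List(Leaf())
    else for {
      i <- list.indices.toList          // position of the root element
      l <- inverse(list.take(i))        // everything before it goes left
      r <- inverse(list.drop(i + 1))    // everything after it goes right
    } yield Node(l, list(i), r)

  // n-th Catalan number via C(k) = C(k-1) * 2 * (2k - 1) / (k + 1).
  def catalan(n: Int): BigInt =
    (1 to n).foldLeft(BigInt(1))((c, k) => c * 2 * (2 * k - 1) / (k + 1))
}
```

`inverse(List(1, 5))` has two elements, matching the two trees on the slide, and in general the number of results equals `catalan(list.length)`, which grows quickly with the length of the list.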

slide-26
SLIDE 26

[Plot: |inorder⁻¹(list)| against length(list).]

More trees map to longer lists

slide-27
SLIDE 27

An abstraction function α (e.g. content, inorder) is sufficiently surjective if and only if, for each number p > 0, there exist, computable as a function of p:
  • a finite set of shapes Sp
  • a closed formula Mp in the collection theory such that Mp(c) implies |α⁻¹(c)| > p
such that, for every term t, Mp(α(t)) or š(t) ∈ Sp.

  • Pick p sufficiently large.
  • Guess which trees have a problematic shape.
  • Guess their shape and their elements.
  • By construction values for all other trees can be found.

slide-28
SLIDE 28

For a conjunction of n disequalities over tree terms, if for each term we can pick a value from a set of trees of size at least n+1, then we can pick values that satisfy all disequalities. We can make sure there will be sufficiently many trees to choose from.

Generalization of the Independence of Disequations Lemma
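The counting argument behind the lemma can be sketched as a greedy assignment: when a variable is processed, at most n values are forbidden by its disequalities, so a candidate set with at least n+1 elements always has a value left. The code is an illustration with hypothetical names (`DisequalitySketch`, `pick`); it works over any value type and assumes no trivial disequality of a variable with itself.

```scala
object DisequalitySketch {
  // Assign each variable a value from its candidate set so that every
  // disequality (x, y), read as x ≠ y, holds. Returns None if some variable
  // runs out of candidates, which cannot happen when every candidate set
  // has more elements than there are disequalities.
  def pick[A](candidates: Map[String, Set[A]],
              disequalities: List[(String, String)]): Option[Map[String, A]] =
    candidates.keys.toList.sorted.foldLeft(Option(Map.empty[String, A])) { (accOpt, x) =>
      accOpt.flatMap { acc =>
        // Values already taken by variables that x must differ from.
        val forbidden = disequalities.collect {
          case (`x`, y) if acc.contains(y) => acc(y)
          case (y, `x`) if acc.contains(y) => acc(y)
        }.toSet
        candidates(x).find(v => !forbidden(v)).map(v => acc + (x -> v))
      }
    }
}
```

With three pairwise disequalities over t1, t2, t3 and four candidates per variable, `pick` always returns an assignment; with a single shared candidate it returns `None`.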

slide-29
SLIDE 29

Sufficient surjectivity holds in practice

Theorem: For every sufficiently surjective abstraction our procedure is complete.

Theorem: The following abstractions are sufficiently surjective: set content, multiset content, list (any-order), tree height, tree size, minimum, sortedness

A complete decision procedure for all these cases!

slide-30
SLIDE 30

Related Work

  • G. Nelson, D.C. Oppen, Simplification by Cooperating Decision Procedures, TOPLAS ’79
  • V. Sofronie-Stokkermans, Locality Results for Certain Extensions of Theories with Bridging Functions, CADE ’09

Some implemented systems: ACL2, Isabelle, Coq, Verifun, Liquid Types

slide-31
SLIDE 31
  • Reasoning about functional programs reduces to proving formulas
  • Decision procedures always find a proof or a counterexample
  • Previous decision procedures handle recursion-free formulas
  • We introduced decision procedures for formulas with recursive fold functions

Decision Procedures for Algebraic Data Types with Abstractions

slide-32
SLIDE 32

Thank you !

slide-33
SLIDE 33

Extra Slides

slide-34
SLIDE 34

Decision procedure for data structure hierarchy

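[Figure: hierarchy of data structures: tree, bag (multiset) and set, connected by the functions mcontent, setof, msize and ssize, with example sizes 7 and 3.]

Supports all natural operations on trees, multisets, sets, and homomorphisms between them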
slide-35
SLIDE 35

When we are not complete

  • When α⁻¹ does not grow
  • The only natural example we found so far: when there is no abstraction!
    – Map trees into trees by mirroring them, or
    – Reversing the list

slide-36
SLIDE 36

Sortedness

slide-37
SLIDE 37

End of extra slides

Stop clicking

slide-38
SLIDE 38

An abstraction function α is sufficiently surjective if and only if, for each number p > 0, there exist, computable as a function of p:
  • a finite set of shapes Sp
  • a closed formula Mp in the collection theory such that Mp(c) implies |α⁻¹(c)| > p
such that, for every term t, Mp(α(t)) or š(t) ∈ Sp.

[Figure: a tree and its shape š.]

slide-39
SLIDE 39

An abstraction function α is sufficiently surjective if and only if, for each number p > 0, there exist, computable as a function of p:
  • a finite set of shapes Sp
  • a closed formula Mp in the collection theory such that Mp(c) implies |α⁻¹(c)| > p
such that, for every term t, Mp(α(t)) or š(t) ∈ Sp.

  • This definition implies:

lim inf_{p→∞, š(t) ∉ Sp} |α⁻¹(α(t))| = ∞
slide-40
SLIDE 40

lim inf_{p→∞, š(t) ∉ Sp} |α⁻¹(α(t))| = ∞

slide-41
SLIDE 41

To copy-paste

|c1| ∧ ∨ ∪ ≠ ⊢ ∈ ∉ ⇒ → α |α⁻¹| š ⇔ ∅

slide-42
SLIDE 42

[Figure: example trees t1 and t5 with subtrees t2, t3, t4, and their images c1, …, c5 under content.]

slide-43
SLIDE 43

Trees Trees Trees

slide-44
SLIDE 44

Overview of the Decision Procedure

t1 = Node(t2, e1, t3) ∧ t5 = Node(t4, e1, t3)

unification

t1 = Node(t2, e1, t3) ∧ t5 = Node(t4, e1, t3) ∧ t1 ≠ t2 ∧ t1 ≠ t3 ∧ … ∧ e1 = e2

ci = content(ti), i ∈ { 1, …, 5 }

def content(tree: Tree) : Set[Int] = tree match {
  case Leaf() ⇒ ∅
  case Node(l, v, r) ⇒ content(l) ∪ { v } ∪ content(r)
}

c1 = content(t1)
   = content(Node(t2, e1, t3))
   = content(t2) ∪ { e1 } ∪ content(t3)
   = c2 ∪ { e1 } ∪ c3

slide-45
SLIDE 45

Ghost Variables?

slide-46
SLIDE 46
object BST {
  def contains(tree: Tree, element: Int): Boolean = tree match {
    case Leaf() => false
    case Node(l, v, r) if v > element => contains(l, element)
    case Node(l, v, r) if v < element => contains(r, element)
    case Node(l, v, r) if v == element => true
  } ensuring (result <=> element ∈ tree.content)
}

Requires stating and proving an invariant such as:

∀ (l : Leaf) . l.content = ∅
∀ (n : Node) . n.content = n.left.content ∪ { n.value } ∪ n.right.content

slide-47
SLIDE 47

sealed abstract class Tree { val content: Set[Int] }
case class Node(content: Set[Int], left: Tree, value: Int, right: Tree) extends Tree
case class Leaf() extends Tree { val content = ∅ }

object BST {
  def add(tree: Tree, element: Int): Tree = tree match {
    case Leaf() => Node({ element }, Leaf(), element, Leaf())
    case Node(_, l, v, r) if v > element => Node(tree.content ∪ { element }, add(l, element), v, r)
    case Node(_, l, v, r) if v < element => Node(tree.content ∪ { element }, l, v, add(r, element))
    case Node(_, l, v, r) if v == element => tree
  } ensuring (result.content == tree.content ∪ { element })
}

slide-48
SLIDE 48
  • Essentially duplicates the code
slide-49
SLIDE 49

Our Approach: No Ghosts!

slide-50
SLIDE 50
  • In a functional setting, specification variables are just another view on the same data
  • Idea: provide the view explicitly, in the PL
slide-51
SLIDE 51
slide-52
SLIDE 52

Completeness

In general, we need a way to encode:

ti ≠ tj ∧ tk ≠ tl ∧ … ∧ ci = α(ti) ∧ cj = α(tj) ∧ …

in the domain theory.

slide-53
SLIDE 53

Sufficient Surjectivity

  • For each tree t in the formula, guess its shape in Sp, or write Mp(t)
  • Populate the shapes with fresh variables
  • Trees with different shapes are different by construction.
  • For the other ones, create a disjunction of disequalities over their elements

[Figure: a tree shape populated with the fresh variables f1, f2, f3, f4.]

slide-54
SLIDE 54

Sufficient Surjectivity

  • All the trees such that Mp(t) can be made distinct and still map to the same collection

Independence of Disequations Lemma: For a conjunction of n disequalities of tree terms, if for each term we can pick a value from a set of trees of size at least n+1, then we can pick values that satisfy all disequalities.

slide-55
SLIDE 55

Sufficient Surjectivity

[Figure: trees holding the elements 5 and 1, and their shape š.]