
FDCC: a Combined Approach for Solving Constraints over Finite Domains and Arrays. Sébastien Bardin (1), Arnaud Gotlieb (2). (1) CEA LIST (Paris, France) (2) INRIA (Rennes, France) - Certus V&V Center, Simula (Oslo, Norway). CPAIOR 2012


SLIDE 1

FDCC: a Combined Approach for Solving Constraints over Finite Domains and Arrays

Sébastien Bardin (1), Arnaud Gotlieb (2)

(1) CEA LIST (Paris, France) (2) INRIA (Rennes, France) - Certus V&V Center, Simula (Oslo, Norway)

CPAIOR 2012

Bardin, S., Gotlieb, A. 1/ 19

SLIDE 2

Overview

Goal: an efficient CP(FD) approach for array+FD constraints

◮ go beyond standard filtering-based techniques (element)
◮ motivation = software verification

Approach: combine global symbolic deduction mechanisms with local filtering, in order to achieve better deductive power than either technique taken in isolation

Results: an original “greybox” combination for array+FD constraints

◮ identify which information should be shared
◮ propose ways of taming communication cost

a prototype and encouraging experiments (random instances)

◮ greater solving power (beats perfect blackbox combination)
◮ low overhead

easy to adapt to any CP(FD) solver (small API)

SLIDE 3

Motivations

// precondition(a,b,c)
int foo(int a, int b, int c) {
    int tmp, result;
    tmp = a + b;
    if (tmp <= c)
        result = tmp;   // "if"-path
    else
        result = c;     // "else"-path
    return result;
}
// postcondition(a,b,c,result)

Find inputs exercising each program path

“if”-path: (a, b, c) ⊨ a + b ≤ c iff foo(a,b,c) goes through the if-path
“else”-path: (a, b, c) ⊨ a + b > c iff foo(a,b,c) goes through the else-path

SLIDE 5

Motivations


Applications: test coverage, bug finding

Find input satisfying precondition Φpre, but not postcondition Ψpost

“if”-path: Φpre(a, b, c) ∧ a + b ≤ c ∧ ¬Ψpost(a, b, c, a + b)
“else”-path: Φpre(a, b, c) ∧ a + b > c ∧ ¬Ψpost(a, b, c, c)
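
The two path formulas can be explored with a brute-force sketch in Python. The precondition `pre`, the postcondition `post`, and the small input range are hypothetical placeholders chosen only for illustration (the slides leave Φpre and Ψpost abstract); a real tool would hand each formula to a constraint solver instead of enumerating.

```python
from itertools import product

def pre(a, b, c):
    # Hypothetical precondition: all inputs are non-negative.
    return a >= 0 and b >= 0 and c >= 0

def post(a, b, c, result):
    # Hypothetical (deliberately too strong) postcondition: result < c.
    return result < c

def find_counterexample(path, values=range(-2, 3)):
    """Return inputs satisfying pre and the path condition but violating post."""
    for a, b, c in product(values, repeat=3):
        if path == "if":
            on_path, result = a + b <= c, a + b
        else:
            on_path, result = a + b > c, c
        if pre(a, b, c) and on_path and not post(a, b, c, result):
            return (a, b, c)
    return None

print(find_counterexample("if"))
print(find_counterexample("else"))
```

A returned triple is a test input that follows the chosen path and exposes a violation of the (assumed) contract; `None` means no violation exists in the enumerated range.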

SLIDE 7

Motivations (2)

Constraint solving is becoming prominent in formal verification, especially software verification

It underlies several approaches, either for test generation or invariant computation

[abstract model checking, bounded model checking] [symbolic execution, weakest precondition calculus]

Verification reduces to solving Verification Conditions (VCs)

We consider quantifier-free conjunctive fragments

◮ interesting by themselves [symbolic execution, test data generation]
◮ basic building block of solvers handling disjunctions and quantifiers

SLIDE 9

CP(FD) and Verification

Most verification techniques are based on SMT

Yet, CP(FD) is a natural and interesting alternative, since basic data types naturally range over finite domains

Potentially interesting for

◮ bounded (non-linear) integer arithmetic
◮ modular arithmetic [Gotlieb-Leconte-Marre 10]
◮ bitvectors [Bardin-Herrmann-Perroud 10]
◮ floating-point arithmetic [Botella-Gotlieb-Michel 06]

A few CP-based verification tools exist [+ encouraging case-studies]

◮ Inka [Gotlieb-Botella-Rueher 00], GATeL [Marre-Blanc 05]
◮ Osmose [Bardin-Herrmann 08], Jaut [Charreteur-Botella-Gotlieb 09]

But CP(FD) lacks an efficient handling of array constraints

SLIDE 10

The theory of arrays

The standard theory of arrays is defined by

◮ three sorts: arrays A, elements of arrays E, indexes I
◮ function select(T, i) : A × I → E
◮ function store(T, i, e) : A × I × E → A
◮ = and ≠ over E and I

Semantics (read-over-write)

(FC) i = j → select(T, i) = select(T, j)
(RoW-1) i = j → select(store(T, i, e), j) = e
(RoW-2) i ≠ j → select(store(T, i, e), j) = select(T, j)
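
As a concrete reading of the read-over-write axioms, here is a minimal Python sketch that models arrays as immutable-style dictionaries. The dictionary encoding and the sample array `T` are assumptions made for illustration; the theory itself places no bound on sizes or domains.

```python
def store(T, i, e):
    """Return a new array equal to T except that index i maps to e."""
    updated = dict(T)
    updated[i] = e
    return updated

def select(T, i):
    """Return the element stored at index i."""
    return T[i]

T = {0: 10, 1: 20, 2: 30}   # a small concrete array, chosen arbitrarily

# RoW-1: reading the index just written yields the written element.
assert select(store(T, 1, 99), 1) == 99
# RoW-2: reading any other index sees through the store.
assert select(store(T, 1, 99), 2) == select(T, 2)
# store is functional: the original array is unchanged.
assert select(T, 1) == 20
```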

SLIDE 11

The theory of arrays (2)

Why does array theory matter so much in verification?

◮ for modelling arrays and vectors [of course!]
◮ basis for more advanced containers: maps, hash tables, memory heap

A few remarks about the theory

◮ no constraint on array size or domains of indexes / elements [need to combine with constraints on E and I]
◮ no equality / disequality between arrays
◮ yet, difficult to solve [NP-hard for the ∧-fragment]

SLIDE 12

CP and arrays : local filtering

arrays represented by pairs (index, element) [explicit arrays of logical variables]

constraints on domains of indexes / elements (and size)

select: well-known constraint element [Van Hentenryck-Carillon 88, Brand 01]

store: more recent work [Charreteur-Botella-Gotlieb 09]

Element(ARRAY,I,E) :-
  ( integer(I) ?
      ARRAY[I] == E, success
  ;
      D(E) ← D(E) ∩ ⋃_{i ∈ D(I)} D(ARRAY[i]),
      D(I) ← {i ∈ D(I) | D(E) ∩ D(ARRAY[i]) ≠ ∅},
      wait(...)
  )
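
The non-instantiated branch of Element can be sketched in Python over plain sets. This is an illustration under assumptions, not the SICStus or COMET propagator: domains are Python sets, indexes are 0-based, every index in `dom_i` is assumed to be within bounds, and the suspension step `wait(...)` is left out.

```python
def filter_element(array_doms, dom_i, dom_e):
    """One filtering step for E = ARRAY[I] over set-based domains.

    array_doms: list of sets, the domain of each cell
    dom_i, dom_e: sets, the domains of the index and of the element
    Returns the narrowed (dom_i, dom_e); an empty set signals inconsistency.
    """
    # E can only take values offered by some cell that I may still point to.
    reachable = set()
    for i in dom_i:
        reachable |= array_doms[i]
    new_e = dom_e & reachable
    # I can only point to cells that share a value with E.
    new_i = {i for i in dom_i if array_doms[i] & new_e}
    return new_i, new_e

cells = [{1, 2}, {5}, {7, 8}]
print(filter_element(cells, {0, 1, 2}, {5, 6, 7}))
```

In this run, value 6 is dropped from the element domain because no cell offers it, and index 0 is dropped because its cell shares no value with the element.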

SLIDE 13

CP and arrays : local filtering


Update(A,I,E,A’) :-
  ( integer(I) ?
      A’[I] == E, ∀ k ≠ I do A’[k] == A[k], success
  ;
      D(E) ← D(E) ∩ ⋃_{i ∈ D(I)} D(A’[i]),
      D(I) ← {i ∈ D(I) | D(E) ∩ D(A’[i]) ≠ ∅},
      ∀ k ∉ D(I) do A’[k] == A[k],
      ∀ k ∈ D(I) do D(A’[k]) ← D(A’[k]) ∩ (D(A[k]) ∪ D(E)),
      ...
  )
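
The non-instantiated branch of Update can be sketched the same way. Again this is only an illustration under assumptions (set-based domains, 0-based in-bounds indexes, a single filtering pass, the elided `...` rules omitted), not the propagator of [Charreteur-Botella-Gotlieb 09].

```python
def filter_update(a_doms, dom_i, dom_e, a2_doms):
    """One filtering step for A' = store(A, I, E) over set-based domains."""
    # E and I are filtered against A' exactly as in element.
    reachable = set()
    for i in dom_i:
        reachable |= a2_doms[i]
    new_e = dom_e & reachable
    new_i = {i for i in dom_i if a2_doms[i] & new_e}
    new_a, new_a2 = [], []
    for k, (dk, dk2) in enumerate(zip(a_doms, a2_doms)):
        if k not in new_i:
            # Cell k cannot be the updated one: A'[k] == A[k].
            common = dk & dk2
            new_a.append(common)
            new_a2.append(common)
        else:
            # Cell k is either untouched (value from A[k]) or updated (value from E).
            new_a.append(set(dk))
            new_a2.append(dk2 & (dk | new_e))
    return new_a, new_i, new_e, new_a2

A = [{1}, {2}, {3}]
A2 = [{1, 9}, {2, 9}, {3, 9}]
print(filter_update(A, {0, 1}, {9}, A2))
```

Here index 2 is outside the domain of I, so the cell A'[2] is forced to agree with A[2] and loses the value 9.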

SLIDE 15

CP and arrays : local filtering (2)

Fine for “simple” array constraints

◮ either small arrays or very few updates
◮ fixed-value indexes (or at least no wide-domain indexes)

Insufficient for many array constraints from program verification

◮ large arrays, many updates, (wide-range) variable indexes [see formulas from SMT-LIB]

e = select(T, i) ∧ f = select(T, j) ∧ e ≠ f ∧ i = j
T array of size 100, domains: 0..100

× fd: needs labelling [no answer in 60 min in COMET]

SLIDE 17

CP and arrays : local filtering (2)


i ∈ 1..5 ∧ j ∈ 6..10 ∧ a ≠ select(store(store(T, j, a), i, b), j)

× fd: needs labelling, cannot establish select(store(T, j, a), j) = a

SLIDE 19

Our approach

SLIDE 25

Examples

e = select(T, i) ∧ f = select(T, j) ∧ e ≠ f ∧ i = j
T array of size 100, domains: 0..100

cc: unsat quickly (axiom FC)
× fd: needs labelling [no answer in 60 min in COMET]
fdcc: unsat quickly through cc

SLIDE 29

Examples

i ∈ 1..5 ∧ j ∈ 6..10 ∧ a ≠ select(store(store(T, j, a), i, b), j)

× cc: no deduction, since i ≠ j cannot be inferred
× fd: needs labelling, cannot establish select(store(T, j, a), j) = a
fdcc: fd deduces i ≠ j (domain-check), cc can then deduce a ≠ select(store(T, j, a), j), then a ≠ a and unsat
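
The domain-check used by fd in this example can be sketched in a few lines of Python. The function names follow the small API listed later in the deck; their bodies are assumptions (domains given as any iterable of integers), not the actual SICStus implementation.

```python
def is_fd_diff(dom_x, dom_y):
    """x != y is entailed when the two domains share no value."""
    return not (set(dom_x) & set(dom_y))

def is_fd_eq(dom_x, dom_y):
    """x == y is entailed when both domains are the same singleton."""
    return len(set(dom_x)) == 1 and set(dom_x) == set(dom_y)

dom_i = range(1, 6)    # i in 1..5
dom_j = range(6, 11)   # j in 6..10
print(is_fd_diff(dom_i, dom_j))
```

Since 1..5 and 6..10 are disjoint, the check reports that i ≠ j holds, which is the fact cc needs in order to apply RoW-2.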

SLIDE 33

Examples

e = select(T, i) ∧ f = select(T, j) ∧ g = select(T, k) ∧ e ≠ f ∧ e ≠ g ∧ f ≠ g
T array of size 2, domain of indexes 1..2

× cc: deduces allDifferent(i,j,k), does not output unsat (domains not taken into account)
× fd: needs labelling [labels indexes first!!]
fdcc: cc deduces allDifferent(i,j,k), then fd deduces unsat
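
Once cc has passed allDifferent(i,j,k) to fd, a simple counting argument is enough to refute it. The Python sketch below implements only that pigeonhole test, which is a necessary condition and much weaker than a full allDifferent propagator; the function name is hypothetical.

```python
def all_different_feasible(domains):
    """Necessary condition for allDifferent: enough distinct values overall."""
    values = set()
    for d in domains:
        values |= set(d)
    return len(values) >= len(domains)

index_domain = {1, 2}                      # indexes range over 1..2
print(all_different_feasible([index_domain] * 3))
```

Three indexes cannot take pairwise different values among two, so the check fails and the formula is unsat.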

SLIDE 35

Communication framework

Communication between fd and cc can be costly

◮ especially checking (dis-)equalities of variables through their domains: |V|² pairs to be checked

How to tame communication costs?

◮ a communication policy allowing tight control over expensive communications
◮ a reduction of the number of pairs of variables to consider (critical pairs)

Other: labelling is only transmitted to fd

SLIDE 36

Communication framework (2)

Communication policy

◮ cheap communications (cc → fd) made asynchronously
◮ expensive ones (fd → cc) made on request (supervisor)

Critical pairs

◮ focus on pairs whose (dis-)equality will surely lead to new deductions in cc [see axioms]
◮ the set of critical pairs is defined by
  • a. ∀ v ≙ select(T, i) and v′ ≙ select(T, j): pairs (v, v′) and (i, j)
  • b. ∀ v ≙ select(store(T, i, e), j): pairs (i, j) and (e, v)
◮ still quadratic (in #select)
◮ another reduction: focus only on type (b.)
  • linear in #select, captures the specificity of array axioms
  • manageable in practice, still brings interesting deductions
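
The two kinds of critical pairs can be collected with a short Python sketch. The term encoding (tuples tagged `"select"` and `"store"`) and the function name are assumptions made for illustration; the deck does not describe the prototype's data structures.

```python
from itertools import combinations

def critical_pairs(defs, only_type_b=False):
    """defs: list of (v, term) with term = ("select", array, index);
    array is a name or a tuple ("store", T, i, e)."""
    pairs = set()
    # Type (b): v = select(store(T, i, e), j) gives (i, j) and (e, v).
    for v, (_, arr, j) in defs:
        if isinstance(arr, tuple):
            _, _, i, e = arr
            pairs.add((i, j))
            pairs.add((e, v))
    if only_type_b:
        return pairs
    # Type (a): two selects on the same array give (v, v') and (i, j).
    for (v, (_, arr1, i)), (w, (_, arr2, j)) in combinations(defs, 2):
        if arr1 == arr2:
            pairs.add((v, w))
            pairs.add((i, j))
    return pairs

defs = [
    ("e", ("select", "T", "i")),
    ("f", ("select", "T", "j")),
    ("v", ("select", ("store", "T", "k", "x"), "m")),
]
print(sorted(critical_pairs(defs)))
print(sorted(critical_pairs(defs, only_type_b=True)))
```

Type (a) needs a loop over pairs of definitions, hence the quadratic growth, while type (b) visits each definition once.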

SLIDE 37

Communication framework (3)

SLIDE 38

Implementation

implemented in SICStus Prolog (≈ 1.7 KLOC)

FD solver

◮ uses the SICStus clpfd library
◮ adds our own array select and store operations [Charreteur-Botella-Gotlieb 09]
◮ simple labelling heuristics such as first-fail

cc and supervisor

◮ built on our own [straightforward implementations]

cliques

◮ search reduced to 3-cliques
◮ clique detection launched when a new disequality is added

SLIDE 39

Experiments on random instances

Shape of random formulas

◮ 40 variables, 3-6 arrays of size 20, domain = 0..50
◮ four kinds of formulas (easy / hard array, arith / no arith)
◮ length 10-60

Properties to be evaluated

◮ ability to solve as many formulas as possible
◮ comparison with fd and cc [including overhead]
◮ comparison with blackbox combinations (hybrid and best)

Experiment 1: evaluates on 369 formulas, balanced over the 4 classes and sat / unsat

Experiment 2: evaluates on 100 formulas for each length between 10 and 60 [performance w.r.t. complexity threshold]

SLIDE 40

Results

First experiment

solving power: fdcc solves more formulas than fd, cc, or best

◮ 22 formulas (out of 369) solved only by fdcc
◮ 5x fewer TO than fd and 3x fewer TO than best

affordable overhead over cc and fd [when they succeed]

◮ at worst 4x slower, on average 1.1x - 1.5x slower

robustness: results hold for all 4 classes and sat / unsat

Second experiment

◮ fdcc again better than fd and cc
◮ maximal benefits on hard-to-solve formulas [close to complexity threshold]

SLIDE 41

Results (2)

Total (369)

          S     U    TO      T
cc       29   115   225  13545
fd      154   151    64   3995
fdcc    181   175    13    957
best    154   175    40   2492
hybrid  154   175    40   2609

S: # sat answers, U: # unsat answers, TO: # time-outs (60 sec), T: time in sec.
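
The figures in this table can be cross-checked with a few lines of Python. The numbers are copied from the table; the dictionary layout is just a convenient encoding.

```python
results = {            # solver: (sat, unsat, time-outs, seconds)
    "cc":     (29, 115, 225, 13545),
    "fd":     (154, 151, 64, 3995),
    "fdcc":   (181, 175, 13, 957),
    "best":   (154, 175, 40, 2492),
    "hybrid": (154, 175, 40, 2609),
}
for name, (s, u, to, t) in results.items():
    assert s + u + to == 369      # every formula is sat, unsat or timed out

to_fdcc = results["fdcc"][2]
for name in ("fd", "best"):
    # Ratio of time-outs relative to fdcc.
    print(name, round(results[name][2] / to_fdcc, 1))
```

The ratios come out at about 4.9 for fd and 3.1 for best, consistent with the "5x" and "3x" time-out reductions quoted for the first experiment.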

SLIDE 42

Results (2)

              AEUF-I (79)              AEUF-II (90)
          S    U   TO     T        S    U   TO     T
cc       26   37   16   987        2   30   58  3485
fd       39   26   14   875       35   18   37  2299
fdcc     40   37    2   144       51   30    9   635
best     39   37    3   202       35   30   25  1529
hybrid   39   37    3   242       35   30   25  1561

           AEUF+LIA-I (100)         AEUF+LIA-II (100)
          S    U   TO     T        S    U   TO     T
cc        1   21   78  4689        0   27   73  4384
fd       50   47    3   199       30   60   10   622
fdcc     52   48    0    24       38   60    2   154
best     50   48    2   139       30   60   10   622
hybrid   50   48    2   159       30   60   10   647

S: # sat answers, U: # unsat answers, TO: # time-outs (60 sec), T: time in sec.

SLIDE 43

Results (2)

[Figure: three plots against formula length (10 to 60). #(unsolved formulas): TO_CCFD, TO_CC, TO_FD. Gain with FDCC: Miracle, Gain. #(solved formulas): CCFD, CC, FD.]

SLIDE 44

Conclusion

Results

an original decision procedure for arrays that combines ideas from symbolic reasoning and finite-domain constraint solving

◮ identify which information should be shared
◮ propose ways of taming communication cost

a prototype and encouraging experiments (random instances)

◮ greater solving power (beats even best)
◮ low overhead

easy to adapt to any CP(FD) solver

Future work

◮ experiments on real-life problems
◮ extend the approach to handle memory heaps (new, delete)

SLIDE 45

About VCs

Logical connectors

◮ ∧ : to express paths
◮ ∨ : to embed several paths in one formula [alternative: enumerate them in the verification tool]
◮ ∀, ∃ : advanced preconditions / postconditions / contracts

First-order theories for data types

◮ basic data types: integers, bitvectors
◮ collections: arrays, maps

We consider here quantifier-free conjunctive fragments

◮ interesting by themselves [symbolic execution, test data generation]
◮ basic building block of solvers handling disjunctions and quantifiers

SLIDE 46

Why a dedicated combination framework ?

Or: more direct approaches, and why we do not choose them

Standard combination scheme between arrays and CP(FD) [Nelson-Oppen (NO)]

◮ solving arrays is already NP-hard
◮ NO is heavy on non-convex theories like arrays or integers
◮ FD constraints do not fit well into NO assumptions [infinite model]

Remove all store functions by introducing ∨

◮ CP(FD) is not well-adapted for handling case-splits

Simple concurrent black-box combination [first success wins]

◮ we want to outperform it in solving power
◮ while still allowing easy re-use of any CP(FD) engine

SLIDE 47

The two solvers cc and fd

A semi-decision procedure cc for (pure) arrays

◮ global symbolic reasoning
◮ polynomial-time (no case-split)
◮ correct but not complete [may output “maybe”]
◮ based on the standard congruence closure algorithm + rules for array axioms

A CP(FD) solver fd for arrays and FD-constraints

◮ local domain filtering
◮ correct [complete with labelling]
◮ based on the existing CP(FD) constraints
◮ re-use of existing CP(FD) solvers through a small API
  • is_fd_eq(x,y) and is_fd_diff(x,y)
  • if store and select are not available, give access to internal domains (set, get, ∪, ∩, ∈, ∅?)

SLIDE 48

Details : cc for arrays

Based on the standard congruence closure algorithm [union-find]

◮ each equivalence class has a witness
◮ each variable has a parent, the “higher parent” is the witness
◮ basic operations: witness(var) and union(var1,var2)
◮ clever implementations in O(n) [ranking, path compression]

Extensions for arrays

(FC-1) i = j → select(T, i) = select(T, j)
(FC-2) select(T, i) ≠ select(T, j) → i ≠ j
(RoW-1) i = j → select(store(T, i, e), j) = e
(RoW-2) i ≠ j → select(store(T, i, e), j) = select(T, j)
(RoW-3) select(store(T, i, e), j) ≠ e → i ≠ j
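
A minimal union-find with the two basic operations named above can be sketched in Python as follows. The class and the term encoding are assumptions for illustration, and FC-1 is applied by hand to a single pair of selects rather than through a full congruence closure loop.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def witness(self, x):
        """Return the representative of x's class, compressing the path."""
        self.parent.setdefault(x, x)
        self.rank.setdefault(x, 0)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:      # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.witness(x), self.witness(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:  # union by rank
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

# e = select(T, i), f = select(T, j), i = j, together with e != f.
uf = UnionFind()
uf.union("e", ("select", "T", "i"))
uf.union("f", ("select", "T", "j"))
uf.union("i", "j")
# FC-1: equal indexes on the same array give equal selects.
if uf.witness("i") == uf.witness("j"):
    uf.union(("select", "T", "i"), ("select", "T", "j"))
unsat = uf.witness("e") == uf.witness("f")   # clashes with e != f
print(unsat)
```

After the FC-1 step, e and f fall into the same class, which contradicts the disequality e ≠ f without any labelling.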

SLIDE 49

Details : cc for arrays

Extensions for arrays

(FC-1) i = j → select(T, i) = select(T, j)
(FC-2) select(T, i) ≠ select(T, j) → i ≠ j

FC handled by congruence closure [standard extension]

SLIDE 50

Details : cc for arrays

Extensions for arrays

(RoW-1) i = j → select(store(T, i, e), j) = e
(RoW-3) select(store(T, i, e), j) ≠ e → i ≠ j

RoW-1 and RoW-3 handled through reduction to FC

◮ add select(store(T, i, e), i) = e for each store(T, i, e)
◮ then rely on FC

SLIDE 51

Details : cc for arrays

Extensions for arrays

(RoW-2) i ≠ j → select(store(T, i, e), j) = select(T, j)

RoW-2: mechanism of delayed evaluation

◮ for each select(store(T, i, e), j), put (T, i, e, j) in a watch list
◮ when i ≠ j is proved, deduce select(store(T, i, e), j) = select(T, j)
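
The delayed evaluation of RoW-2 can be sketched as a small watch list in Python. The class name and term encoding are hypothetical, and index names are compared syntactically; a real cc would compare their witnesses instead.

```python
class RowWatcher:
    """Delay RoW-2 until the disequality i != j is known."""
    def __init__(self):
        self.watch = []          # pending (T, i, e, j) tuples
        self.deduced = []        # equalities produced so far

    def add_select_store(self, T, i, e, j):
        self.watch.append((T, i, e, j))

    def on_disequality(self, x, y):
        """Called when x != y is proved; fires the matching watched terms."""
        still_waiting = []
        for (T, i, e, j) in self.watch:
            if {i, j} == {x, y}:
                lhs = ("select", ("store", T, i, e), j)
                rhs = ("select", T, j)
                self.deduced.append((lhs, rhs))
            else:
                still_waiting.append((T, i, e, j))
        self.watch = still_waiting

# Watch select(store(store(T, j, a), i, b), j) from the second example.
w = RowWatcher()
inner = ("store", "T", "j", "a")
w.add_select_store(inner, "i", "b", "j")
w.on_disequality("i", "j")
print(w.deduced)
```

Once fd reports i ≠ j, the watched term is rewritten to select(store(T, j, a), j), and the entry leaves the watch list.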
