
Enhancing Symbolic Execution for Coverage-Oriented Testing

Sébastien Bardin, Nikolai Kosmatov, Mickaël Delahaye

CEA LIST, Software Safety Lab (Paris-Saclay, France)

Bardin et al. CFV 2015 1/ 40


Context : white-box software testing

Testing process :
- generate a test input
- run it and check for errors
- estimate coverage : if enough stop, else loop

Coverage criteria [decision, MCDC, mutants, etc.] play a major role :
- definition = systematic way of deriving test requirements
- used to generate tests, decide when to stop, assess the quality of testing
- beware : infeasible test requirements [waste generation effort, imprecise coverage ratios]
- beware : lots of different coverage criteria


Context : Dynamic Symbolic Execution

Dynamic Symbolic Execution [dart, cute, exe, sage, pex, klee, . . . ]

- very powerful approach to (white-box) test generation
- many tools and many successful case studies since the mid-2000s

Symbolic Execution [King, 70s]
- consider a program P on input v, and a given path σ
- a path predicate ϕσ for σ is a formula s.t. v ⊨ ϕσ ⇒ P(v) follows σ
- can be used for bounded-path testing !
- old idea, recent renewed interest [requires powerful solvers]

Dynamic Symbolic Execution [Korel+, Williams+, Godefroid+]
- interleave dynamic and symbolic executions
- drive the search towards feasible paths for free
- give hints for relevant under-approximations [robustness]


Context : Dynamic Symbolic Execution (2)

input : a program P
output : a test suite TS covering all feasible paths of Paths≤k(P)

- pick a path σ ∈ Paths≤k(P)
- compute a path predicate ϕσ of σ [wpre, spost]
- solve ϕσ for satisfiability [smt solver]
- if SAT(s) : get a new pair <s, σ>
- loop until no more paths to cover


The problem

DSE is GREAT for automating structural testing

- very powerful approach to (white-box) test generation
- many tools and many successful case studies since the mid-2000s

Yet, no real support for structural coverage criteria [except path coverage and branch coverage]

Would be useful :
- when required to produce tests achieving some criterion
- for producing “good” tests for an external oracle [functional correctness, security, performance, etc.]

Recent efforts [Active Testing, Augmented DSE, Mutation DSE] :
- limited or unclear expressiveness
- explosion of the search space [APex : 272x avg, up to 2,000x]


Our goals and results

Goals :
- extend DSE to a large set of structural coverage criteria
- support these criteria in a unified way
- support these criteria in an efficient way
- detect (some) infeasible test requirements

Results :
- generic low-level encoding of coverage criteria [ICST 14]
- efficient variant of DSE for coverage criteria [ICST 14]
- sound and quasi-complete detection of infeasibility [ICST 15]


Outline

- Introduction
- Labels
- Efficient DSE for Labels
- Infeasible label detection
- The GACC criterion
- Conclusion


Focus : Labels

Annotate programs with labels

◮ predicate attached to a specific program instruction

Label (loc, ϕ) is covered if a test execution

◮ reaches the instruction at loc
◮ satisfies the predicate ϕ

Good for us

◮ can easily encode a large class of coverage criteria [see after]
◮ in the scope of standard program analysis techniques


Simulation of standard coverage criteria

Decision Coverage (DC) :

statement_1 ;
if (x==y && a<b) {...};
statement_3 ;

−→

statement_1 ;
// l1: x==y && a<b
// l2: !(x==y && a<b)
if (x==y && a<b) {...};
statement_3 ;


Simulation of standard coverage criteria

Condition Coverage (CC) :

statement_1 ;
if (x==y && a<b) {...};
statement_3 ;

−→

statement_1 ;
// l1: x==y
// l2: !(x==y)
// l3: a<b
// l4: !(a<b)
if (x==y && a<b) {...};
statement_3 ;


Simulation of standard coverage criteria

Multiple-Condition Coverage (MCC) :

statement_1 ;
if (x==y && a<b) {...};
statement_3 ;

−→

statement_1 ;
// l1: x==y && a<b
// l2: x==y && a>=b
// l3: x!=y && a<b
// l4: x!=y && a>=b
if (x==y && a<b) {...};
statement_3 ;


Simulation of standard coverage criteria

OBJ : generic specification mechanism for coverage criteria
- IC, DC, FC, CC, MCC, GACC
- large part of Weak Mutations
- Input Domain Partition
- Run-Time Error

Out of scope :
- strong mutations, MCDC
- (side-effect weak mutations)


Focus : Simulation of Weak Mutations

- mutant M = syntactic modification of program P
- weakly covering M = finding t such that P(t) ≠ M(t) just after the mutation


From weak mutants to labels (1)


From weak mutants to labels (2)

One label per mutant

Mutation inside a statement :

lhs := e → lhs := e’
◮ add label : e ≠ e′

lhs := e → lhs’ := e
◮ add label : &lhs ≠ &lhs′ ∧ (lhs ≠ e ∨ lhs′ ≠ e)

Mutation inside a decision :

if (cond) → if (cond’)
◮ add label : cond ⊕ cond′

Beware : no side-effect inside labels

Theorem. For any finite set O of side-effect-free mutation operators, WM_O can be simulated by labels.

LTest overview


plugin of the Frama-C analyser for C programs

◮ open-source
◮ sound, industrial strength
◮ among others : VA, WP, specification language

based on PathCrawler for test generation


Outline

- Introduction
- Labels
- Efficient DSE for Labels
- Infeasible label detection
- The GACC criterion
- Conclusion


Direct instrumentation

Covering label l ⇔ Covering branch True

+ sound & complete instrumentation
× complexification of the search space [#paths, shape of paths]
× dramatic overhead [theory & practice] [APex : avg 272x, max 2000x]


Direct instrumentation is not good enough

Non-tightness 1 :
× P’ has exponentially more paths than P

Non-tightness 2 :
× paths in P’ are too complex
◮ at each label, requires to cover p or to cover ¬p
◮ π′ covers up to N labels


Our approach

The DSE⋆ algorithm [ICST 14]
- Tight instrumentation P⋆ : totally prevents “complexification”
- Iterative Label Deletion : discards some redundant paths
- both techniques can be implemented in black-box


DSE⋆ : Tight Instrumentation

Covering label l ⇔ Covering exit(0)

- sound & complete instrumentation
- no complexification of the search space



DSE⋆ : Tight Instrumentation (2)


DSE⋆ : Iterative Label Deletion

Observations :
- we need to cover each label only once
- yet, DSE explores paths of P⋆ ending in already-covered labels
- burden DSE with “useless” paths w.r.t. label coverage

Solution : Iterative Label Deletion
- keep a cover status for each label
- symbolic execution ignores paths ending in covered labels
- dynamic execution updates cover status [truly requires DSE]

Iterative Label Deletion is relatively complete w.r.t. label coverage



DSE⋆ : Iterative Label Deletion (2)


Summary

The DSE⋆ algorithm :
- Tight instrumentation P⋆ : totally prevents “complexification”
- Iterative Label Deletion : discards some redundant paths
- relative completeness w.r.t. label coverage
- both techniques can be implemented in black-box


Experiments

Benchmark : standard (test generation) benchmarks [Siemens, Verisec, Mediabench]
- 12 programs (50-300 loc), 3 criteria (CC, MCC, WM)
- 26 pairs (program, coverage criterion)
- 1,270 test requirements

Performance overhead :

           DSE    DSE’       DSE⋆
Min        ×1     ×1.02      ×0.49
Median     ×1     ×1.79      ×1.37
Max        ×1     ×122.50    ×7.15
Mean       ×1     ×20.29     ×2.15
Timeouts          5∗

∗ : TO are discarded for overhead computation

cherry picking : 94s vs TO [1h30]


Experiments

Benchmark : standard (test generation) benchmarks [Siemens, Verisec, Mediabench]
- 12 programs (50-300 loc), 3 criteria (CC, MCC, WM)
- 26 pairs (program, coverage criterion)
- 1,270 test requirements

Coverage :

          Random   DSE    DSE⋆
Min       37%      61%    62%
Median    63%      90%    95%
Max       100%     100%   100%
Mean      70%      87%    90%

vs DSE : +39% coverage on some examples


Experiments

Benchmark : standard (test generation) benchmarks [Siemens, Verisec, Mediabench]
- 12 programs (50-300 loc), 3 criteria (CC, MCC, WM)
- 26 pairs (program, coverage criterion)
- 1,270 test requirements

Conclusion :
- DSE⋆ performs significantly better than DSE’
- the overhead of handling labels is kept reasonable
- high coverage, better than DSE


Outline

- Introduction
- Labels
- Efficient DSE for Labels
- Infeasible label detection
- The GACC criterion
- Conclusion


Infeasibility detection

Basic ideas :
- rely on existing sound verification methods
- label (loc, ϕ) infeasible ⇔ assertion (loc, ¬ϕ) is invariant
- grey-box combination of existing approaches [VA ⊕ WP]


Overview of the approach

- labels as a unifying criterion
- label infeasibility ⇔ assertion validity
- state-of-the-art verification for assertion checking
- only soundness is required (verif)
◮ label encoding not required to be perfect [MCDC and strong mutation ok]



Focus : checking assertion validity

Two broad categories of sound assertion checkers :

State-approximation computation [forward abstract interp., cegar]
◮ compute an invariant of the program
◮ then, analyze all assertions (labels) in one go

Goal-oriented checking [pre≤k, weakest precond., cegar]
◮ perform a dedicated check for each assertion
◮ a single check is usually easier, but there are many of them

Focus on Value analysis (VA) and Weakest Precondition (WP) :
- correspond to our implementation
- well-established approaches

[the paper is more generic]


Focus : checking assertion validity (2)

                              VA    WP
sound for assert validity     ✓     ✓
blackbox reuse                ✓     ✓
local precision               ×     ✓
calling context               ✓     ×
calls / loop effects          ✓     ×
global precision              ×     ×
scalability wrt. #labels      ✓     ×
scalability wrt. code size    ×     ✓

hypothesis : VA is interprocedural


VA and WP may fail

int main() {
  int a = nondet(0..20);
  int x = nondet(0..1000);
  return g(x,a);
}

int g(int x, int a) {
  int res;
  if (x+a >= x) res = 1;
  else res = 0;
  //l1: res == 0
}


VA and WP may fail

int main() {
  int a = nondet(0..20);
  int x = nondet(0..1000);
  return g(x,a);
}

int g(int x, int a) {
  int res;
  if (x+a >= x) res = 1;
  else res = 0;
  //@assert res != 0
  // both VA and WP fail
}



Proposal : VA ⊕ WP (1)

Goal = get the best of the two worlds
- idea : VA passes to WP the global information it lacks

Which information, and how to transfer it ?
- VA computes (internally) some form of invariants
- WP naturally takes assumptions into account [//@ assume]
- solution : VA exports its invariants in the form of WP-assumptions

Should work for any VA and WP engine


VA⊕WP succeeds !

int main() {
  int a = nondet(0..20);
  int x = nondet(0..1000);
  return g(x,a);
}

int g(int x, int a) {
  int res;
  if (x+a >= x) res = 1;
  else res = 0;
  //l1: res == 0
}


VA⊕WP succeeds !

int main() {
  int a = nondet(0..20);
  int x = nondet(0..1000);
  return g(x,a);
}

int g(int x, int a) {
  //@assume 0 <= a <= 20
  //@assume 0 <= x <= 1000
  int res;
  if (x+a >= x) res = 1;
  else res = 0;
  //@assert res != 0
  // VA ⊕ WP succeeds
}



Proposal : VA ⊕ WP (2)

Exported invariants :
- numerical constraints (sets, intervals, congruence)
- only names appearing in the program (params, lhs, vars)
- in practice : exhaustive export has very low overhead

Soundness : ok as long as VA is sound
Exhaustivity of the export only affects deductive power


Summary

                              VA    WP    VA ⊕ WP
sound for assert validity     ✓     ✓     ✓
blackbox reuse                ✓     ✓     ✓
local precision               ×     ✓     ✓
calling context               ✓     ×     ✓
calls / loop effects          ✓     ×     ✓
global precision              ×     ×     ×
scalability wrt. #labels      ✓     ×     ✓
scalability wrt. code size    ×     ✓     ?


Detection power

Reuse the same benchmarks [Siemens, Verisec, Mediabench] :
- 12 programs (50-300 loc), 3 criteria (CC, MCC, WM)
- 26 pairs (program, coverage criterion)
- 1,270 test requirements, 121 infeasible ones

         #Inf   VA #d   VA %d   WP #d   WP %d   VA⊕WP #d   VA⊕WP %d
Total    121    84      69%     73      60%     118        98%
Min                     0%              0%      2          67%
Max      29     29      100%    15      100%    29         100%
Mean     4.7    3.2     63%     2.8     82%     4.5        95%

#d : number of detected infeasible labels
%d : ratio of detected infeasible labels

- VA ⊕ WP achieves almost perfect detection
- detection speed is reasonable [≤ 1s/obj.]


Impact on test generation

report more accurate coverage ratio

Coverage ratio reported by DSE⋆ :

Detection method   None      VA ⊕ WP   Perfect*
Total              90.5%     99.2%     100.0%
Min                61.54%    91.7%     100.0%
Max                100.00%   100.0%    100.0%
Mean               91.10%    99.2%     100.0%

* preliminary, manual detection of infeasible labels


Impact on test generation

Optimisation : speedup test generation (take care !)

Ideal speedup (DSE⋆-opt vs DSE⋆) :

Min    0.96×
Max    592.54×
Mean   49.04×

In practice (RT(1s) + LUncov + DSE⋆, speedup wrt. DSE⋆ alone) :

Min    0.1×
Max    55.4×
Mean   3.8×

RT : random testing


Outline

- Introduction
- Labels
- Efficient DSE for Labels
- Infeasible label detection
- The GACC criterion
- Conclusion


The GACC criterion

MCDC : coverage criterion used in aeronautics
- demanding, industrially relevant
- requires to show that any atomic condition ci alone can influence its branching condition C
  . 2 tests t1, t2 for any such ci
  . t1(ci) ≠ t2(ci), t1(C) ≠ t2(C), and ∀j ≠ i, t1(cj) = t2(cj)

Example : show that a alone can influence a ∧ (b ∨ c) :
t1 : (a = 1, b = 1, c = 0), t2 : (a = 0, b = 1, c = 0)


The GACC criterion

GACC (global active clause coverage, a.k.a. shortcut MCDC)
- a weaker interpretation of MCDC, still demanding and industrially relevant
- the notion of “influences alone” is different
  . 2 tests t1, t2 for any such ci
  . t1(ci) ≠ t2(ci) [nothing on t1(C), t2(C)]
  . for both t1, t2, modifying ci alone switches C [nothing on the t1(cj), t2(cj)]

Example : show that a alone can influence a ∧ (b ∨ c) :
t1 : (a = 1, b = 1, c = 0), t2 : (a = 0, b = 1, c = 1)
- ok for GACC, not for MCDC


The GACC criterion

GACC can be encoded into labels [Tillmann et al. 2010]
[no exact encoding is known for MCDC]

ϕi  : ci = f ∧ C(c1, ..., ci−1, t, ci+1, ..., cn) ≠ C(c1, ..., ci−1, f, ci+1, ..., cn)
ϕ′i : ci = t ∧ C(c1, ..., ci−1, t, ci+1, ..., cn) ≠ C(c1, ..., ci−1, f, ci+1, ..., cn)



The GACC criterion (2)

Performance overhead :

          DSE’      DSE⋆ norm.   DSE⋆ opt
Min       1.44×     1.41×        1.38×
Med       3.76×     1.81×        1.44×
Max       130.79×   59.40×       3.14×
Mean      21.99×    10.55×       1.85×
Timeouts            1

Coverage :

          Random   DSE    DSE⋆ norm.   DSE⋆ opt
Min       47%      62%    64%          72%
Med       55%      76%    88%          96%
Max       100%     100%   100%         100%
Mean      60%      78%    85%          91%


Outline

- Introduction
- Labels
- Efficient DSE for Labels
- Infeasible label detection
- The GACC criterion
- Conclusion


Conclusion

Goals : extend DSE to a large set of structural coverage criteria

- generic low-level encoding of coverage criteria [ICST 14]
- efficient variant of DSE for coverage criteria [ICST 14]
- sound and quasi-complete detection of infeasibility [ICST 15]

Next :
- more criteria (MCDC, strong mutations)
- scale up infeasibility detection
- scale up DSE, better handling of functions [big challenge]
