Random Testing and Model Checking: Building a Common Framework for Nondeterministic Exploration


SLIDE 1

WODA 2008 July 21, 2008

Random Testing and Model Checking: Building a Common Framework for Nondeterministic Exploration

Alex Groce and Rajeev Joshi

Jet Propulsion Laboratory, California Institute of Technology

SLIDE 2

Background & Motivation

LaRS (Laboratory for Reliable Software) at JPL has been building, verifying, and testing flash file systems for space mission use. This work grows out of that experience.

SLIDE 3

Background & Motivation

MSAP

  • Two flash file systems, one RAM file system, one critical parameter storage module
  • Approach: random testing [ICSE’07, ASE’08]

MSL (Mars Science Laboratory)

  • One flash file system, one RAM file system, one low-level flash interface (critical parameter storage)
  • Approach: model checking/random testing
SLIDE 4

Random Testing

I think we all know what random testing is:

  • Operations and parameters generated at random to test a program
  • Possibly with some bias or feedback to help with the problem of irrelevant/redundant operations
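That loop structure can be sketched in C. This is a minimal, hypothetical example – the two operations and the 10-file invariant are illustrative stand-ins, not the JPL harness:

```c
#include <stdlib.h>

/* Hypothetical system under test: a file table capped at 10 entries.
   (Stand-in for a real flash file system API.) */
static int nfiles = 0;
static void op_create(void) { if (nfiles < 10) nfiles++; }
static void op_delete(void) { if (nfiles > 0) nfiles--; }

/* Minimal random-testing driver: pick operations at random and
   check an invariant after every step. */
int run_random_test(unsigned seed, int steps) {
    srandom(seed);
    for (int i = 0; i < steps; i++) {
        if (random() % 2) op_create(); else op_delete();
        if (nfiles < 0 || nfiles > 10)   /* property violated */
            return 0;
    }
    return 1;   /* no violation observed */
}
```

Bias or feedback would replace the uniform `random() % 2` draw with weights that steer the walk away from irrelevant or redundant operations.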
SLIDE 5

Model Checking and Dynamic Analysis

(Software) model checking

  • (In principle exhaustive) exploration of a program’s state space

Dynamic analysis (what we’re here for today)

  • Analysis of a running program
  • Usually instrumentation or execution in a virtual environment – e.g. Valgrind, Daikon
  • Testing is a dynamic analysis: the program is executed in order to learn about its behaviors
  • We’re looking at the kind of model checking that is essentially a dynamic analysis

SLIDE 6

Many Software Model Checkers

CBMC, BLAST, SLAM, JPF2, SPIN, CMC, CRunner, MAGIC, VeriSoft, Bogor

SLIDE 7

Two Approaches

  • Analysis of a derived transition system (“static”) – e.g. CBMC, BLAST, SLAM, MAGIC, Bogor
  • Execution of actual code (dynamic: like testing) – e.g. SPIN, CMC, VeriSoft, CRunner, JPF2 – our focus in this talk

SLIDE 8

Model Checking as State-Based Testing

Model-checking by executing the program

  • Backtracking search for all states
  • State already visited? Backtrack and try a different operation
  • Done with a test? Backtrack and try a different operation
  • Will explore, as a side-effect, many executions (like random testing) – but the goal is to explore states

(Diagram: a search tree over operation sequences such as mkdir /a, mkdir /b, mkdir /c, rmdir /a, explored with backtracking.)
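The state-matching search can be sketched in C over a toy state space – a single counter with inc/dec operations stands in for the file system; this illustrates the idea, it is not the actual checker:

```c
/* Toy state space: a counter confined to 0..3, with operations
   "inc" and "dec".  Depth-first exploration with backtracking:
   each call tries an operation, recurses, and implicitly restores
   the state on return (x is passed by value). */
#define MAXSTATE 4
static int visited[MAXSTATE];   /* states already matched */
static int states_seen = 0;

static void explore(int x) {
    if (x < 0 || x >= MAXSTATE) return;  /* outside modeled range */
    if (visited[x]) return;              /* state already visited: backtrack */
    visited[x] = 1;
    states_seen++;
    explore(x + 1);   /* try "inc", then backtrack */
    explore(x - 1);   /* try "dec", then backtrack */
}
```

Calling `explore(0)` visits each reachable state exactly once, however many operation sequences lead to it – exactly the "goal is states, executions are a side-effect" point above.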

SLIDE 9

SPIN and Model-Driven Verification

SPIN compiles a PROMELA model into a C program: it’s a model checker generator

  • Embed C code in transitions by executing the compiled C code
  • Take advantage of all SPIN features – hashing, multicore exploration, etc.

Requires the ability to restore a running program to an earlier execution state

  • Difficult engineering problem, handled by CIL-based automatic code instrumentation [VMCAI’08]
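The restore requirement can be illustrated with a hand-rolled snapshot stack in C; the real VMCAI’08 instrumentation generates this bookkeeping automatically from the source, so treat this as a sketch of the idea only:

```c
/* Tracked program state, snapshotted before each transition and
   restored when the search backtracks. */
typedef struct { int x; int y; } tracked_t;

static tracked_t prog = {0, 0};     /* the live program state */
static tracked_t snapshots[64];     /* backtracking stack */
static int top = 0;

/* Save the tracked state before exploring a transition. */
static void push_state(void) { snapshots[top++] = prog; }

/* Undo the transition: pop and restore the tracked state. */
static void backtrack(void)  { prog = snapshots[--top]; }
```

Any mutation of `prog` between `push_state()` and `backtrack()` is undone, which is what lets the search try a different operation from the same earlier state.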

SLIDE 10

SPIN and Model-Driven Verification

When SPIN backtracks, it uses information on how to restore the state of the C program:

  • Tracked memory is restored on backtrack
  • Matched memory is also used to determine if a state has been visited before

(Flowchart: execute C code until control returns to SPIN; push tracked & matched state on the stack; if the state has been visited before, backtrack – pop the stack and restore the tracked & matched state; otherwise store the matched state in the state table and continue.)

SLIDE 11

SPIN and Model-Driven Verification

(Unsound) abstraction by matching on an abstraction of the tracked concrete state

  • E.g. track the pointers/contents of a linked list
  • Match on a sorted array copy only (if order doesn’t matter for the property in question)

(Same flowchart as the previous slide.)
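In C, the sorted-copy match might look like the sketch below; the tracked array stands in for the linked-list contents mentioned above, under the assumption that element order is irrelevant to the property being checked:

```c
#include <stdlib.h>

static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

/* Compute the matched (abstract) state from the tracked (concrete)
   one: a sorted copy, so any permutation of the same contents is
   matched as a single state. */
static void abstract_state(const int *tracked, int *matched, int n) {
    for (int i = 0; i < n; i++) matched[i] = tracked[i];
    qsort(matched, n, sizeof(int), cmp_int);
}
```

Two concrete states that are permutations of each other now produce identical matched states, so the checker explores only one of them – the unsoundness is that any property sensitive to order is no longer checked.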

SLIDE 12

A Common Goal

Program state spaces are typically too large to explore fully, even after (unsound) abstraction. Random testing and model checking are both methods for nondeterministically exploring a program’s state space:

  • A series of random walks
  • vs. systematic exploration with backtracking

SLIDE 13

Which is Better?

Conventional wisdom (exaggerated):

  • Random testing is probably less effective than model checking
  • BUT model checking is much more difficult to apply than random testing, scales poorly, crashes a lot, makes your ears bleed, and may cause temporary paralysis

Test engineer using a model checker on a C program?

SLIDE 14

How True is the Conventional Wisdom?

Realistically, the state spaces for real programs are huge

  • Model checking will almost certainly use unsound abstractions, and still be only a partial exploration
  • Systematically missing some states that could expose errors
  • Are we sure this is better than smart random testing for fault detection / coverage?

SLIDE 15

How True is the Conventional Wisdom?

On the other hand, explicit-state model checking is not that difficult to apply

  • PROMELA is a nice language for expressing nondeterministic choice & test structure
  • Provides test-case playback, minimization, and other things often built by hand for testing
  • Scales quite well if memory usage is (a) limited (no 5GB memory footprint) and (b) well-defined
  • Often true for embedded systems
SLIDE 16

Using SPIN for True Random Testing

Want to apply both methods

  • For research purposes (comparison)
  • Due diligence in testing! This stuff is going to Mars…

But why write two testers – one for random testing, one for model checking?

  • Basic harness looks the same, property checks look the same, etc.
  • Annoying redundant work; better to spend time improving the harness or running more tests

SLIDE 17

A Quick Primer: Using SPIN for Random Testing, in Five Slides OR Almost All the PROMELA You Ever Need to Know

SLIDE 18

Simple PROMELA Code

    int x;
    int y;

    active proctype main()
    {
      if
      :: x = 1
      :: x = 2
      fi;
      assert(x == y);
    }

Start simple: this model has 7 states. What are they? State = (PC, x, y). if is SPIN’s nondeterministic choice construct – it picks any one of the choices that is enabled. How do we guard a choice?

    if
    :: (x < 10) -> y = 1
    :: (x < 5)  -> y = 3
    :: (x > 1)  -> y = 4
    fi;

Not mutually exclusive!

SLIDE 19

Simple PROMELA Code

    int x;
    int y;

    active proctype main()
    {
      if
      :: x = 1
      :: x = 2
      fi;
      if
      :: y = 1
      :: y = 2
      fi;
      if
      :: x > y -> x = y
      :: y > x -> y = x
      :: else -> skip
      fi;
      assert(x == y);
    }

This model has 17 states. What are they? State = (PC, x, y). Er… don’t worry about state-counting too much – SPIN has various automatic reductions and atomicity choices that can make that difficult.

SLIDE 20

Simple PROMELA Code

    int x;

    active proctype main()
    {
      x = 0;
      do
      :: (x < 10) -> x++
      :: break
      od
      /* Here, x is anything between 0 and 9 inclusive */
    }

Only a couple more PROMELA constructs to learn for building test harnesses: the do loop. Like if, except control loops back to the top – the break choice can exit the loop. This nondeterministically assigns x a value in the range 0…9.

SLIDE 21

Simple PROMELA Code

    inline pick(var, MAX) {
      var = 0;
      do
      :: (var < MAX) -> var++
      :: break
      od
    }

inline gives us a macro facility. As you can imagine, this is a useful macro for building a test harness!

SLIDE 22

Less Simple PROMELA Code

Finally, we want to be able to call the C program we are testing:

    :: choice == UNLINK ->                /* unlink */
       pick(pathindex, NUM_PATHS);        /* Choose a path */
       c_code {
         now.res = nvfs_unlink(path[now.pathindex]);
       };
       nvfs_errno = c_expr { errno };
       check_reset();  /* Check for system reset and reinit if needed */
       if
       :: (res < 0) && (nvfs_errno == ENOSPC) ->  /* If out-of-space error */
          check_space();
       :: ((!did_reset) || (res != -1)) && !((res < 0) && (nvfs_errno == ENOSPC)) ->
          c_code {
            now.ramfs_res = ramfs_unlink(path[now.pathindex]);
          };
          ramfs_errno = c_expr { errno };
       :: else -> skip
       fi;
       . . .
       assert(res == ramfs_res);
       assert(nvfs_errno == ramfs_errno);
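The check the harness performs is differential: the same operation runs against the flash file system and a RAM reference, and both the results and the error codes must agree. A C sketch of that pattern, with stand-in stubs where the real harness calls nvfs_unlink and ramfs_unlink:

```c
/* Outcome of one operation: return value plus errno. */
typedef struct { int res; int err; } outcome_t;

/* Stand-in stubs; in the real harness these would be nvfs_unlink
   (system under test) and ramfs_unlink (reference implementation). */
static outcome_t sut_unlink(const char *p) { (void)p; return (outcome_t){ -1, 2 }; }
static outcome_t ref_unlink(const char *p) { (void)p; return (outcome_t){ -1, 2 }; }

/* Differential check: both implementations must agree on the
   result and the error code.  Returns 1 on agreement. */
static int differential_unlink(const char *path) {
    outcome_t a = sut_unlink(path);
    outcome_t b = ref_unlink(path);
    return a.res == b.res && a.err == b.err;
}
```

The reference implementation serves as the oracle, so the harness never has to predict what each operation should return – it only has to notice disagreement.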

SLIDE 23

Testing via Model Checking

Basic idea:

  • We’ll write a test harness in PROMELA
  • Use SPIN to backtrack and explore inputs
  • Use abstraction to limit the number of states we consider
  • We can even “trick” SPIN into doing pure random testing!

SLIDE 24

The pick Macro, Revisited

    inline pick(var, MAX) {
      var = 0;
      do
      :: (var < MAX) -> var++
      :: break
      od
    }

What if we change pick?

SLIDE 25

The pick Macro, Revisited

    inline pick(var, MAX) {
      if
      :: !initialized ->
         nondet_pick(seed, SEED_RANGE);
         c_code {
           printf("Test with seed %d\n", now.seed);
           srandom(now.seed);
         };
         initialized = 1
      :: else -> skip
      fi;
      var = c_expr { random() } % MAX;
    }

To this?
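Outside of PROMELA, the modified macro corresponds to a seeded random draw; a C rendering of the same idea (the names are illustrative):

```c
#include <stdio.h>
#include <stdlib.h>

static int initialized = 0;

/* On first use, announce and install the seed so a failing run can
   be replayed; every subsequent pick is a plain draw in [0, max). */
static int pick(unsigned seed, int max) {
    if (!initialized) {
        printf("Test with seed %u\n", seed);
        srandom(seed);
        initialized = 1;
    }
    return (int)(random() % max);
}
```

In the SPIN version the seed itself is chosen nondeterministically (nondet_pick), so each branch of SPIN’s search becomes a differently-seeded pure random test – that is the “trick” that gets random testing out of a model checker.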

SLIDE 26

Some Results

From a flash file system for the Mars Science Laboratory mission – see the paper for details. Basic idea: how does coverage (source code / configurations of the flash file system) change as we increase testing time?

SLIDE 27

Coverage of nvds_box.c

(Chart: % coverage of nvds_box.c vs. minutes of testing; two series, Model Checking and Random Testing.)

SLIDE 28

Coverage of nvfs_pub.c

(Chart: % coverage of nvfs_pub.c vs. minutes of testing; two series, Model Checking and Random Testing.)

SLIDE 29

Coverage of flash abstraction

(Chart: abstract states of the flash abstraction covered vs. minutes of testing; two series, Model Checking and Random Testing.)

SLIDE 30

Coverage of page abstraction

(Chart: abstract states of the page abstraction covered vs. minutes of testing; two series, Model Checking and Random Testing.)

SLIDE 31

Conclusions (and an Invitation)

Is model checking better?

  • Maybe, maybe not
  • Preliminary results for one program
  • Visser et al. and others report varying results for this question
  • These results don’t use as much feedback as our latest test harness – which may change the results (feedback improves both model checking and random testing results)

SLIDE 32

Conclusions (and an Invitation)

If you’re analyzing or testing C programs

  • Where function-call level atomicity is OK
  • With well-defined memory usage
  • It might be well worth your while to try explicit-state model checking
  • Easy to work with abstractions and guide testing/analysis towards certain goals
  • Can also provide random testing “for free”

JPF may also work well for this purpose, though since it uses its own JVM it may be trickier/slower. Download SPIN at http://www.spinroot.com