QuickCheck, John Hughes, Chalmers University/Quviq AB (PowerPoint presentation)

slide-1
SLIDE 1

Specification Based Testing with QuickCheck

John Hughes Chalmers University/Quviq AB

slide-2
SLIDE 2
slide-3
SLIDE 3

What is QuickCheck?

  • A library for writing and testing properties of program code

  • Some code:
  • A property:
slide-4
SLIDE 4

Properties as Code

  • A quantifier!
  • A set!
  • A predicate!
  • A boolean-valued expression!
  • A macro!
  • An ordinary function definition!
  • A test data generator!

slide-5
SLIDE 5

DEMO

slide-6
SLIDE 6

QuickCheck in a Nutshell

Properties

[Diagram: properties generate many test cases; a failing test case is reduced to a minimal test case]

slide-7
SLIDE 7

QuickCheck Properties: things with a counterexample

  • <bool-exp>
  • ?FORALL(<var>,<generator>,<property>)
  • ?IMPLIES(<bool-exp>,<property>)
  • conjunction, disjunction
  • ?EXISTS(<var>,<generator>,<property>)

slide-8
SLIDE 8

QuickCheck Generators

  • int(), bool(), real()…
  • choose(<int>,<int>)
  • {<generator>,<generator>…}
  • oneof(<list-of-generators>)
  • ?LET(<var>,<generator>,<generator>)

slide-9
SLIDE 9

Example: Sorted Lists

sorted_list_int() ->
    ?LET(L, list(int()),
         lists:sort(L)).
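The generator above builds a random list and then sorts it. As a rough illustration of the same idea without QuickCheck, the sketch below hand-rolls a generator with the standard `rand` module and checks that every generated list is sorted. The module name `sorted_list_sketch`, the helper `is_sorted/1`, and the size and value ranges are assumptions made for this example; they are not part of QuickCheck, and no shrinking is performed.

```erlang
-module(sorted_list_sketch).
-export([sorted_list_int/0, is_sorted/1, check/1]).

%% Hand-rolled stand-in for ?LET(L, list(int()), lists:sort(L)):
%% build a random list of small integers, then sort it.
%% The ranges (length 0..10, values -100..100) are arbitrary assumptions.
sorted_list_int() ->
    Len = rand:uniform(11) - 1,
    L = [rand:uniform(201) - 101 || _ <- lists:seq(1, Len)],
    lists:sort(L).

%% True when every element is =< its successor.
is_sorted([X, Y | Rest]) -> X =< Y andalso is_sorted([Y | Rest]);
is_sorted(_) -> true.

%% Generate N lists and check that each one is sorted.
check(N) ->
    lists:all(fun(_) -> is_sorted(sorted_list_int()) end,
              lists:seq(1, N)).
```

Calling `sorted_list_sketch:check(100)` runs one hundred generated cases; with QuickCheck itself the same check would be written as a `?FORALL` over `sorted_list_int()`.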

slide-10
SLIDE 10

Benefits

  • Less time spent writing test code

– One property replaces many tests

  • Better testing

– Lots of combinations you’d never test by hand

  • Less time spent on diagnosis

– Failures minimized automagically

slide-11
SLIDE 11

An Experiment

Unit tests Properties

slide-12
SLIDE 12

How good were the tests at finding bugs—in other students’ code?

[Chart comparing HUnit and QuickCheck test suites; labels: "Unit tests", "Better"]

slide-13
SLIDE 13

Tests for Base 64 encoding

base64_encode(Config) when is_list(Config) ->
    %% Two pads
    <<"QWxhZGRpbjpvcGVuIHNlc2FtZQ==">> =
        base64:encode("Aladdin:open sesame"),
    %% One pad
    <<"SGVsbG8gV29ybGQ=">> = base64:encode(<<"Hello World">>),
    %% No pad
    "QWxhZGRpbjpvcGVuIHNlc2Ft" =
        base64:encode_to_string("Aladdin:open sesam"),
    "MDEyMzQ1Njc4OSFAIzBeJiooKTs6PD4sLiBbXXt9" =
        base64:encode_to_string(
          <<"0123456789!@#0^&*();:<>,. []{}">>),
    ok.

Test cases

Expected results

slide-14
SLIDE 14

Writing a Property

prop_base64() ->
    ?FORALL(Data, list(choose(0,255)),
            base64:encode(Data) == ???).

slide-15
SLIDE 15

Round-trip Properties

prop_encode_decode() ->
    ?FORALL(L, list(choose(0,255)),
            base64:decode(base64:encode(L)) == list_to_binary(L)).

-define(DECODE_MAP,

{bad,bad,bad,bad,bad,bad,bad,bad,ws,ws,bad,bad,ws,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad, ws,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,62,bad,bad,bad,63, 52,53,54,55,56,57,58,59,60,61,bad,bad,bad,eq,bad,bad, bad,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14, 15,16,17,18,19,20,21,22,23,24,25,bad,bad,bad,bad,bad, bad,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40, 41,42,43,44,45,46,47,48,49,50,51,bad,bad,bad,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,

The same table with one entry changed (62 moved one position earlier):

-define(DECODE_MAP,

{bad,bad,bad,bad,bad,bad,bad,bad,ws,ws,bad,bad,ws,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad, ws,bad,bad,bad,bad,bad,bad,bad,bad,bad,62,bad,bad,bad,bad,63, 52,53,54,55,56,57,58,59,60,61,bad,bad,bad,eq,bad,bad, bad,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14, 15,16,17,18,19,20,21,22,23,24,25,bad,bad,bad,bad,bad, bad,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40, 41,42,43,44,45,46,47,48,49,50,51,bad,bad,bad,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad, bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,bad,

NOT caught by the test suite

slide-16
SLIDE 16

Round-trip Properties

117> eqc:quickcheck(base64_eqc:prop_encode_decode()).
...................................Failed! Reason:
{'EXIT',{badarg,43}}
After 36 tests.
[204,15,130]
Shrinking...(3 times)
Reason:
{'EXIT',{badarg,43}}
[0,0,62]

prop_encode_decode() ->
    ?FORALL(L, list(choose(0,255)),
            base64:decode(base64:encode(L)) == list_to_binary(L)).

The table entry we changed

slide-17
SLIDE 17

Round-trip Properties

What does this test?

  • NOT a complete test: will not find a consistent misunderstanding of base64

  • WILL find mistakes in encoder or decoder

Simple properties find a lot of bugs!

prop_encode_decode() ->
    ?FORALL(L, list(choose(0,255)),
            base64:decode(base64:encode(L)) == list_to_binary(L)).
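For readers without a QuickCheck licence, the round-trip idea can be tried with the standard library alone. The sketch below is a hand-rolled approximation, not QuickCheck: it draws random byte lists with `rand`, checks the same encode/decode equation, and returns the first counterexample unshrunk. The module name `roundtrip_sketch` and the maximum list length of 20 are assumptions for illustration.

```erlang
-module(roundtrip_sketch).
-export([random_bytes/0, roundtrip_ok/1, check/1]).

%% A random list of bytes, length 0..20 (an arbitrary choice).
random_bytes() ->
    Len = rand:uniform(21) - 1,
    [rand:uniform(256) - 1 || _ <- lists:seq(1, Len)].

%% The round-trip equation from the slide.
roundtrip_ok(L) ->
    base64:decode(base64:encode(L)) =:= list_to_binary(L).

%% Run N random cases. Returns ok, or {failed, L} for the first
%% counterexample found. Unlike QuickCheck, nothing is shrunk.
check(0) -> ok;
check(N) when N > 0 ->
    L = random_bytes(),
    case roundtrip_ok(L) of
        true  -> check(N - 1);
        false -> {failed, L}
    end.
```

Against an unmodified `base64` module, `roundtrip_sketch:check(200)` is expected to return `ok`; a decoder table with a misplaced entry would be expected to surface as a `{failed, L}` result or a crash.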

slide-18
SLIDE 18

Back to the tests…

base64_encode(Config) when is_list(Config) ->
    %% Two pads
    <<"QWxhZGRpbjpvcGVuIHNlc2FtZQ==">> =
        base64:encode("Aladdin:open sesame"),
    %% One pad
    <<"SGVsbG8gV29ybGQ=">> = base64:encode(<<"Hello World">>),
    %% No pad
    "QWxhZGRpbjpvcGVuIHNlc2Ft" =
        base64:encode_to_string("Aladdin:open sesam"),
    "MDEyMzQ1Njc4OSFAIzBeJiooKTs6PD4sLiBbXXt9" =
        base64:encode_to_string(
          <<"0123456789!@#0^&*();:<>,. []{}">>),
    ok.

Where did these come from?

slide-19
SLIDE 19

Possibilities

  • Someone converted the data by hand
  • Another base64 encoder
  • The same base64 encoder!

– Only tests that changes don’t affect the result, not that the result is right

Use the other encoder as an oracle

Use an old version (or a simpler version) as an oracle
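A simpler version used as an oracle might look like the sketch below: a deliberately naive base64 encoder written with bit syntax, compared against `base64:encode/1` on random inputs. The module name `oracle_sketch`, the function names, and the input sizes are assumptions for illustration, and random inputs are drawn with `rand` rather than QuickCheck generators.

```erlang
-module(oracle_sketch).
-export([simple_encode/1, check/1]).

-define(ALPHABET,
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/").

%% Naive oracle: consume three bytes at a time as four 6-bit groups.
simple_encode(Bin) when is_binary(Bin) -> chunks(Bin, <<>>).

chunks(<<A:6, B:6, C:6, D:6, Rest/binary>>, Acc) ->
    chunks(Rest, <<Acc/binary, (b64_char(A)), (b64_char(B)),
                   (b64_char(C)), (b64_char(D))>>);
%% Two bytes left: the last group has 4 bits, padded with zeros; one pad.
chunks(<<A:6, B:6, C:4>>, Acc) ->
    <<Acc/binary, (b64_char(A)), (b64_char(B)), (b64_char(C bsl 2)), $=>>;
%% One byte left: the last group has 2 bits, padded with zeros; two pads.
chunks(<<A:6, B:2>>, Acc) ->
    <<Acc/binary, (b64_char(A)), (b64_char(B bsl 4)), $=, $=>>;
chunks(<<>>, Acc) ->
    Acc.

b64_char(N) -> lists:nth(N + 1, ?ALPHABET).

%% Compare the library encoder with the oracle on N random binaries.
check(N) ->
    lists:all(
      fun(_) ->
              Len = rand:uniform(21) - 1,
              Bytes = [rand:uniform(256) - 1 || _ <- lists:seq(1, Len)],
              Bin = list_to_binary(Bytes),
              base64:encode(Bin) =:= simple_encode(Bin)
      end,
      lists:seq(1, N)).
```

The oracle trades speed for obviousness, which is the point: it is easier to believe than the table-driven implementation it checks.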

slide-20
SLIDE 20

Commuting Diagram Properties

slide-21
SLIDE 21

Property Types in Class Examples

  • Rex Page: 71 properties in University of Oklahoma courses in Software Engineering, Applied Logic (QuickCheck+ACL2)

[Chart of property types: Round trip, Commuting diagram, Other]

slide-22
SLIDE 22

Time for some C code…

slide-23
SLIDE 23

Testing Stateful Code

[Diagram: a sequence of API calls, each paired with a model state]

postconditions

A list of numbers!

slide-24
SLIDE 24

A QuickCheck Property

prop_q() ->
    ?FORALL(Cmds, commands(?MODULE),
            begin
                {H,S,Res} = run_commands(?MODULE,Cmds),
                Res == ok
            end).

slide-25
SLIDE 25

Let’s run some tests…

slide-26
SLIDE 26

Exercises vs. Practice

  • Exercises: small scale, property-driven development, trivial inputs
  • Practice: large scale, testing legacy code, complex inputs

slide-27
SLIDE 27

Example: Ericsson Media Proxy

Megaco request Megaco response Megaco request Megaco response

  • Many, many parameters, can be 1-2 pages per message!
  • Lots of work to write generators
  • State machine models fit the problem well

slide-28
SLIDE 28

Ericsson Media Proxy Bug

  • Test adding and removing callers from a call

Add Add Sub Add Sub Add Sub

Call Full

slide-29
SLIDE 29
  • Relational databases don’t scale to "Big Data"
  • "NoSQL" databases are a popular alternative

A highly scalable, reliable, available and low-latency distributed key-value store

slide-30
SLIDE 30

Put and Get

put get

1

put get

1

slide-31
SLIDE 31

Conflicts

put

1

put get

{0,1} 2

put

2

get

QuickCheck model: record each client’s current view of the data; put replaces that view

slide-32
SLIDE 32

Example

put

1

put

{0,1}

get

2

put

3

put

???

get

{0,1,2,3}

get

A vector clock optimisation…

QuickCheck model: client’s view is fresh or stale: updating a stale view just adds to the conflicts…

slide-33
SLIDE 33

Example

put get

??

get get

{0,0}

slide-34
SLIDE 34

Duplicate value explained

put

12:43:27 12:43:27 12:43:27

get

12:43:28

get

{0,0}

slide-35
SLIDE 35

Eventual Consistency

  • "For any sequence of operations, with any node or network failures, Riak eventually reaches a consistent state"

– When is "eventually"?

  • For any sequence of operations sent to any subsets of server nodes (because of failures), completing all Riak’s repair operations results in a consistent state.

slide-36
SLIDE 36

AutoSAR

  • Joint project with Quviq, SP, Volvo Cars, Mentor Graphics…

slide-37
SLIDE 37
slide-38
SLIDE 38

AutoSAR Basic Software

slide-39
SLIDE 39

The Story So Far…

  • QuickCheck state-machine models for 3 AutoSAR clusters (Com/PDUR, CAN, FlexRay)

  • Used to test software from 3 suppliers
  • Bugs revealed in all!

– Plus reinterpretations of the standard

slide-40
SLIDE 40
slide-41
SLIDE 41

uint32

slide-42
SLIDE 42

COM Component

  • 500 pages of standard
  • 250 pages of C
  • 25 pages of QuickCheck
slide-43
SLIDE 43

"We know there is a lurking bug somewhere in the dets code. We have got 'bad object' and 'premature eof' every other month the last year. We have not been able to track the bug down since the dets files is repaired automatically next time it is opened."

Tobbe Törnqvist, Klarna, 2007

slide-44
SLIDE 44

What is it?

  • Application: invoicing services for web shops
  • Mnesia: distributed database (transactions, distribution, replication)
  • Dets: tuple storage
  • File system

>500 people in 5 years

Race conditions?

slide-45
SLIDE 45

Imagine Testing This…

dispenser:take_ticket()
dispenser:reset()

slide-46
SLIDE 46

A Unit Test in Erlang

test_dispenser() ->
    ok = reset(),
    1 = take_ticket(),
    2 = take_ticket(),
    3 = take_ticket(),
    ok = reset(),
    1 = take_ticket().

Expected results

BUT…

slide-47
SLIDE 47

A Parallel Unit Test

  • Three possible correct outcomes!

reset take_ticket take_ticket take_ticket 1 2 3 1 3 2 1 2 1

ok
slide-48
SLIDE 48

Another Parallel Test

  • 42 possible correct outcomes!

reset take_ticket take_ticket take_ticket take_ticket reset

A killer app for properties!

slide-49
SLIDE 49

Modelling the dispenser

reset

take take take 1 2

ok 1 2 3
slide-50
SLIDE 50

The Model

  • State transitions
  • Postconditions

next_state(S,_V,{call,_,reset,_}) ->
    0;
next_state(S,_V,{call,_,take_ticket,_}) ->
    S+1.

postcondition(S,{call,_,take_ticket,_},Res) ->
    Res == S+1;
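To show how state transitions and postconditions combine into a sequential test, here is a minimal sketch that runs a command list against a toy dispenser and checks each result against the model. It is a hand-rolled approximation of what `run_commands` does, not QuickCheck's implementation. The module name `dispenser_sketch`, the process-dictionary dispenser, and the simplified two-argument `next_state/2` are all assumptions for illustration.

```erlang
-module(dispenser_sketch).
-export([reset/0, take_ticket/0,
         next_state/2, postcondition/3, run/1, check/1]).

%% Hypothetical implementation under test: a counter kept in the
%% process dictionary (sequential use only).
reset() -> put(ticket, 0), ok.
take_ticket() -> N = get(ticket) + 1, put(ticket, N), N.

%% Model: the state is the number of the last ticket issued.
next_state(_S, reset) -> 0;
next_state(S, take_ticket) -> S + 1.

postcondition(_S, reset, Res) -> Res =:= ok;
postcondition(S, take_ticket, Res) -> Res =:= S + 1.

%% Run the commands in order, checking every postcondition.
run(Cmds) -> reset(), run(Cmds, 0).

run([], _S) -> true;
run([Cmd | Rest], S) ->
    Res = ?MODULE:Cmd(),
    postcondition(S, Cmd, Res) andalso run(Rest, next_state(S, Cmd)).

%% Random command sequences of length 0..10 (no shrinking).
random_cmds() ->
    Len = rand:uniform(11) - 1,
    [lists:nth(rand:uniform(2), [reset, take_ticket])
     || _ <- lists:seq(1, Len)].

check(N) ->
    lists:all(fun(_) -> run(random_cmds()) end, lists:seq(1, N)).
```

The model never looks inside the dispenser; it only predicts results from the calls made so far, which is why the same model can later judge parallel runs.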

slide-51
SLIDE 51

Parallel Test Cases

reset ok

take 1 take 3 take 2 1 2

ok 1 2 3
slide-52
SLIDE 52

prop_parallel() ->
    ?FORALL(Cmds, parallel_commands(?MODULE),
            begin
                start(),
                {H,Par,Res} = run_parallel_commands(?MODULE,Cmds),
                Res == ok
            end).

Generate parallel test cases

Run tests, check for a matching serialization

slide-53
SLIDE 53

DEMO

slide-54
SLIDE 54

Prefix:

Parallel:

  1. take_ticket() --> 1
  2. take_ticket() --> 1

Result: no_possible_interleaving

take_ticket() ->
    N = read(),
    write(N+1),
    N+1.
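The verdict `no_possible_interleaving` means that no sequential ordering of the observed calls agrees with the model. The sketch below illustrates that check for two parallel processes: it enumerates every interleaving of the observed `{Call, Result}` pairs and asks whether any of them satisfies the dispenser model. This is a simplified illustration under assumed names (`interleave_sketch`, `explains/3`), not QuickCheck's actual algorithm.

```erlang
-module(interleave_sketch).
-export([interleavings/2, consistent/2, explains/3]).

%% All ways to merge two sequences while keeping each one's own order.
interleavings([], Ys) -> [Ys];
interleavings(Xs, []) -> [Xs];
interleavings([X | Xs], [Y | Ys]) ->
    [[X | Rest] || Rest <- interleavings(Xs, [Y | Ys])] ++
    [[Y | Rest] || Rest <- interleavings([X | Xs], Ys)].

%% Does a sequential history of {Call, Result} pairs agree with the
%% dispenser model, starting from model state S?
consistent(_S, []) -> true;
consistent(_S, [{reset, ok} | T]) -> consistent(0, T);
consistent(S, [{take_ticket, Res} | T]) ->
    Res =:= S + 1 andalso consistent(S + 1, T);
consistent(_S, [_ | _]) -> false.

%% True if some interleaving of the two observed histories is
%% consistent with the model; false means no possible interleaving.
explains(S0, P1, P2) ->
    lists:any(fun(History) -> consistent(S0, History) end,
              interleavings(P1, P2)).
```

For the failing case on this slide, `interleave_sketch:explains(0, [{take_ticket,1}], [{take_ticket,1}])` is `false`: whichever call is placed first, the second would have had to return 2.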

slide-55
SLIDE 55

dets

  • Tuple store:

{Key, Value1, Value2…}

  • Operations:

– insert(Table,ListOfTuples)
– delete(Table,Key)
– insert_new(Table,ListOfTuples)
– …

  • Model:

– List of tuples (almost)

slide-56
SLIDE 56

QuickCheck Specification


<100 LOC

> 6,000 LOC

slide-57
SLIDE 57

DEMO

  • Sequential tests to validate the model
  • Parallel tests to find race conditions
slide-58
SLIDE 58

Bug #1

Prefix:

  open_file(dets_table,[{type,bag}]) --> dets_table

Parallel:

  1. insert(dets_table,[]) --> ok
  2. insert_new(dets_table,[]) --> ok

Result: no_possible_interleaving

insert_new(Name, Objects) -> Bool
Types:
    Name = name()
    Objects = object() | [object()]
    Bool = bool()

slide-59
SLIDE 59

Bug #2

Prefix:

  open_file(dets_table,[{type,set}]) --> dets_table

Parallel:

  1. insert(dets_table,{0,0}) --> ok
  2. insert_new(dets_table,{0,0}) --> …time out…

=ERROR REPORT==== 4-Oct-2010::17:08:21 ===
** dets: Bug was found when accessing table dets_table

slide-60
SLIDE 60

Bug #3

Prefix:

  open_file(dets_table,[{type,set}]) --> dets_table

Parallel:

  1. open_file(dets_table,[{type,set}]) --> dets_table
  2. insert(dets_table,{0,0}) --> ok
     get_contents(dets_table) --> []

Result: no_possible_interleaving

!

slide-61
SLIDE 61

What’s going on?

Dets server

Reordering and concurrency!

slide-62
SLIDE 62

Is the file corrupt?

slide-63
SLIDE 63

Bug #4

Prefix:

  open_file(dets_table,[{type,bag}]) --> dets_table
  close(dets_table) --> ok
  open_file(dets_table,[{type,bag}]) --> dets_table

Parallel:

  1. lookup(dets_table,0) --> []
  2. insert(dets_table,{0,0}) --> ok
  3. insert(dets_table,{0,0}) --> ok

Result: ok

premature eof

slide-64
SLIDE 64

Bug #5

Prefix:

  open_file(dets_table,[{type,set}]) --> dets_table
  insert(dets_table,[{1,0}]) --> ok

Parallel:

  1. lookup(dets_table,0) --> []
     delete(dets_table,1) --> ok
  2. open_file(dets_table,[{type,set}]) --> dets_table

Result: ok false

bad object

slide-65
SLIDE 65

"We know there is a lurking bug somewhere in the dets code. We have got 'bad object' and 'premature eof' every other month the last year."

Tobbe Törnqvist, Klarna, 2007

Each bug fixed the day after reporting the failing case

slide-66
SLIDE 66

How come?

  • The bugs weren’t found earlier?

– despite > 6 weeks of work

  • Hypotheses

– …files of over 1GB?
– …rehashing could be the problem?
– Diagnosing races in production is hopeless

  • The bugs weren’t found in testing?

– Unit tests for races are hard to write…so people don’t!
– Races = feature interaction, so impractically many tests

slide-67
SLIDE 67

Race conditions should be found by unit testing with generated tests

slide-68
SLIDE 68

Reflections

slide-69
SLIDE 69

The Initial Phases

  • Lots of work to develop specification

– Understanding and generating test inputs

  • Many errors to fix in the specification, due to…

– New code is buggy
– Misunderstandings of the informal spec
– Undocumented features of the system
– Undocumented limitations of the system

  • "Happy case" programming
slide-70
SLIDE 70

Making Progress

  • QuickCheck tends to find the same problem in every run

– There is a "most likely bug"
– Other bugs usually shrink to the most likely one

  • To make progress, the most likely bug must be excluded

– Bug preconditions document the limitations of the system

slide-71
SLIDE 71

The Payoff

  • Once the spec is corrected, and limitations accounted for, real bugs start to appear
  • Each extension to the spec yields a non-linear improvement in the variety of tests
  • The same spec can find many, many bugs
slide-72
SLIDE 72

QuickCheck…

  • …is very widely applicable
  • …almost always finds bugs in real systems!
  • …is particularly good at spotting interactions that conventional test cases miss
  • …makes diagnosis simple by shrinking
  • …makes testing more intellectually challenging and fun!!