Why Is Random Testing Effective for Partition Tolerance Bugs?



slide-1
SLIDE 1

Why Is Random Testing Effective for Partition Tolerance Bugs?

Rupak Majumdar, Filip Niksic
Max Planck Institute for Software Systems (MPI-SWS)

slide-7
SLIDE 7

Despite Many Formal Approaches…

…practitioners test their code
…by providing random inputs.

And despite our best judgement,
…testing is surprisingly effective in finding bugs.

We explore this unexpected effectiveness in testing distributed systems under partition faults.

slide-8
SLIDE 8

Jepsen: Call Me Maybe

A framework for black-box testing of distributed systems by randomly inserting network partition faults

Analyses on http://jepsen.io/: etcd, Postgres, Redis, Riak, MongoDB, Cassandra, Kafka, RabbitMQ, Consul, Elasticsearch, Aerospike, Zookeeper, Chronos…

slide-9
SLIDE 9
  • 1. General Random Testing Framework
  • 2. Randomly Testing Distributed Systems
  • 3. Wider Context: Combinatorial Testing
slide-13
SLIDE 13

Tests and Goal Coverage

Tests T, Goals G

A test covers some goals

Covering family = Set of tests that cover all goals

“Small” covering families = Efficient testing
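The definitions on this slide can be sketched in a few lines of Python. This is only an illustration: the helper names `uncovered_goals` and `is_covering_family`, and the toy notion of coverage (a test is a set of integers and covers the goals it contains), are assumptions made for the example and do not come from the talk.

```python
def uncovered_goals(tests, goals, covers):
    """Return the goals that no test in `tests` covers."""
    tests = list(tests)
    return [g for g in goals if not any(covers(t, g) for t in tests)]


def is_covering_family(tests, goals, covers):
    """A covering family is a set of tests that leaves no goal uncovered."""
    return not uncovered_goals(tests, goals, covers)


# Toy instance (hypothetical): a test is a set of integers,
# and it covers a goal g exactly when it contains g.
goals = [1, 2, 3, 4]
contains = lambda test, goal: goal in test

print(is_covering_family([{1, 2}, {3, 4}], goals, contains))  # True
print(uncovered_goals([{1, 2}, {3}], goals, contains))        # [4]
```

Passing the coverage relation as a function keeps the sketch independent of what tests and goals are, which mirrors how the framework is instantiated later for partitions, schedules, and inputs.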

slide-14
SLIDE 14

Random Testing

Pick a random test t from T
Fix a goal g from G
Suppose P[t covers g] ≥ p
Characterize covering families with respect to p and |G|

slide-19
SLIDE 19

Probabilistic Method

Let G be the set of goals and P[random t covers g] ≥ p

  • Theorem. There exists a covering family of size $p^{-1} \log |G|$.

Proof.
P[random t does not cover g] ≤ 1 - p
P[K independent tests do not cover g] ≤ $(1 - p)^K$
P[K independent tests are not a covering family] ≤ $|G| (1 - p)^K$ (union bound over the goals)
For $K = p^{-1} \log |G|$, this probability is strictly less than 1, since $(1 - p)^K < e^{-pK} = 1/|G|$.
Therefore, there must exist K tests that are a covering family!

slide-20
SLIDE 20

Probabilistic Method

Let G be the set of goals and P[random t covers g] ≥ p

  • Theorem. There exists a covering family of size $p^{-1} \log |G|$.
  • Theorem. For ϵ > 0, a random family of $p^{-1} \log |G| + p^{-1} \log \epsilon^{-1}$ tests is a covering family with probability at least 1 - ϵ.
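The second theorem follows from the same union bound as the first. The short derivation below is a sketch of that step and assumes natural logarithms, which is what the inequality $1 - p \le e^{-p}$ gives; the slides do not state the base.

```latex
\begin{align*}
P[\text{$K$ independent tests are not a covering family}]
  &\le |G|\,(1 - p)^K \\
  &\le |G|\,e^{-pK} \\
  &\le \epsilon
  \quad\text{whenever}\quad
  K \ge p^{-1}\log|G| + p^{-1}\log\epsilon^{-1}.
\end{align*}
```

The last step is just a rearrangement of $|G|\,e^{-pK} \le \epsilon$ after taking logarithms on both sides.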

slide-21
SLIDE 21

Random Testing Framework

Tests T, Goals G

  • 1. What are tests?
  • 2. What are testing goals?
  • 3. What is the notion of coverage?
  • 4. Can we bound P[random t covers g]?

slide-22
SLIDE 22
  • 1. General Random Testing Framework
  • 2. Randomly Testing Distributed Systems
  • 3. Wider Context: Combinatorial Testing
slide-23
SLIDE 23

Ninjas in Training

In a dojo in Kaiserslautern, n ninjas are in training. Training is complete if for every pair of ninjas, there is a round where they are in opposing teams. How many rounds make the training complete?

[Figure: ninjas 1, 2, 3, …, n assigned to two opposing teams in Round 1, Round 2, …]

slide-30
SLIDE 30

Ninjas in Training

In a dojo in Kaiserslautern, n ninjas are in training. Training is complete if for every pair of ninjas, there is a round where they are in opposing teams. How many rounds make the training complete?

  • Naïve: $O(n^2)$
  • Can you do it in $\log n$ rounds?
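One standard deterministic way to reach a logarithmic number of rounds is to number the ninjas from 0 and, in round i, split them by bit i of their number: two different numbers differ in at least one bit, so every pair ends up on opposing teams in some round. The sketch below illustrates that idea. It is not claimed to be the construction used in the talk, and the function names are made up for the example.

```python
from itertools import combinations


def binary_rounds(n):
    """Round i puts ninja x on team (bit i of x). Uses ceil(log2 n) rounds for n >= 2."""
    num_rounds = max(1, (n - 1).bit_length())
    return [[(x >> i) & 1 for x in range(n)] for i in range(num_rounds)]


def training_complete(rounds, n):
    """True if every pair of ninjas is on opposing teams in at least one round."""
    return all(
        any(teams[a] != teams[b] for teams in rounds)
        for a, b in combinations(range(n), 2)
    )


rounds = binary_rounds(8)
print(len(rounds))                       # 3
print(training_complete(rounds, 8))      # True
print(training_complete(rounds[:2], 8))  # False: ninjas 0 and 4 agree on bits 0 and 1
```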
slide-31
SLIDE 31

Ninjas in Training

More generally, n ninjas are training in k teams. Training is complete if for every choice of k ninjas, there is a round where each of them is in a different team. How many rounds make the training complete?

[Figure: ninjas 1, 2, 3, …, n assigned to k teams in Round 1, Round 2, …]

slide-36
SLIDE 36

Ninjas in Training

More generally, n ninjas are training in k teams. Training is complete if for every choice of k ninjas, there is a round where each of them is in a different team. How many rounds make the training complete?

  • Naïve: $O(n^k)$
  • Can you do it in $k^{k+1} (k!)^{-1} \log n$ rounds?
slide-37
SLIDE 37

From Training Ninjas to Distributed Systems with Partition Faults

  • ninjas → nodes in a network
  • teams → blocks in a partition
  • rounds → partitions
  • complete training → covering family

slide-38
SLIDE 38

Splitting Coverage

Given n nodes and k ≤ n:

  • Tests are partitions of nodes into k blocks: $P = \{B_1, \dots, B_k\}$
  • Testing goals are sets of k nodes: $S = \{x_1, \dots, x_k\}$
  • P covers S if P splits S: $x_1 \in B_1, \dots, x_k \in B_k$

Covering families are called k-splitting families here
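Splitting coverage translates directly into a small check. In the sketch below, a partition is represented as a list of Python sets and nodes are numbered from 0. Both choices, and the names `splits` and `is_k_splitting_family`, are assumptions made for illustration.

```python
from itertools import combinations


def splits(partition, nodes):
    """True if every block of `partition` contains exactly one node from `nodes`."""
    nodes = set(nodes)
    return len(partition) == len(nodes) and all(
        len(block & nodes) == 1 for block in partition
    )


def is_k_splitting_family(partitions, n, k):
    """True if every k-subset of nodes 0..n-1 is split by some partition."""
    return all(
        any(splits(p, s) for p in partitions)
        for s in combinations(range(n), k)
    )


family = [[{0, 1}, {2, 3}], [{0, 2}, {1, 3}]]
print(splits(family[0], {0, 2}))             # True
print(splits(family[0], {0, 1}))             # False
print(is_k_splitting_family(family, 4, 2))   # True
```

The check is brute force over all k-subsets, so it is only practical for small n and k, but it is enough to validate a candidate family on a cluster of a handful of nodes.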


slide-39
SLIDE 39

A Bug in Chronos

  • A distributed fault-tolerant job scheduler
  • Works in conjunction with Mesos and Zookeeper
  • Three special nodes: Chronos leader, Mesos leader, Zookeeper leader

[Figure: the Chronos leader, Mesos leader, and Zookeeper leader nodes]

slide-43
SLIDE 43

Splitting Coverage

Given n nodes and k ≤ n:

  • Number of partitions with k blocks: $\left\{ {n \atop k} \right\} \approx k^n / k!$
  • Number of sets of k nodes: $\binom{n}{k} \approx n^k / k!$
  • Splitting a set with a random partition: $p = k^{n-k} / \left\{ {n \atop k} \right\} \approx k! / k^k$

By the general theorem, there exists a k-splitting family of size $k^{k+1} (k!)^{-1} \log n$
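For small n and k, the quantities on this slide can be computed exactly. The sketch below does that, assuming the random partition is drawn uniformly from all partitions into k non-empty blocks, as the formula for p suggests. The function names are illustrative. Because $k! \left\{ {n \atop k} \right\}$ counts surjections from n nodes onto k blocks and is therefore at most $k^n$, the exact p is never below the approximation $k!/k^k$.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of partitions of an n-element set into k non-empty blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def split_probability(n, k):
    """Exact probability that a uniformly random k-block partition splits a fixed k-set."""
    return Fraction(k ** (n - k), stirling2(n, k))


def split_probability_lower_bound(k):
    """The approximation k!/k^k from the slide, which is also a lower bound on p."""
    return Fraction(factorial(k), k ** k)


print(stirling2(5, 2))                   # 15
print(split_probability(5, 2))           # 8/15
print(split_probability_lower_bound(2))  # 1/2
```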

slide-44
SLIDE 44

Effectiveness of Jepsen

  • Theorem. For ϵ > 0, a random family of partitions of size $k^{k+1} (k!)^{-1} \log n + k^k (k!)^{-1} \log \epsilon^{-1}$ is a k-splitting family with probability at least 1 - ϵ.

For Chronos, with n = 5, k = 2, ϵ = 0.2: a family of 10 randomly chosen partitions is splitting with probability at least 80%
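The Chronos figure can be reproduced by plugging the numbers into the bound. The sketch below assumes natural logarithms (the slides do not state the base) and rounds up to a whole number of partitions. The function name `splitting_family_size` is made up for the example.

```python
from math import ceil, factorial, log


def splitting_family_size(n, k, eps):
    """Evaluate k^(k+1)/k! * ln(n) + k^k/k! * ln(1/eps), rounded up."""
    inv_p = k ** k / factorial(k)  # 1/p, using the approximation p = k!/k^k
    return ceil(inv_p * k * log(n) + inv_p * log(1 / eps))


print(splitting_family_size(5, 2, 0.2))  # 10
```

With n = 5, k = 2, ϵ = 0.2 the expression evaluates to roughly 9.66, which rounds up to the 10 partitions quoted on the slide.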

slide-47
SLIDE 47

Other Coverage Notions

k,l-Separation

  • Tests: Bipartitions
  • Goals: Two disjoint sets of k and l nodes
  • Coverage notion: The two sets included in different blocks
  • Size of covering families: O(f(k,l) log n)

Minority isolation

  • Tests: Bipartitions
  • Goals: Nodes
  • Coverage notion: The node is in the smaller block
  • Covering families: O(log n)

  • k-Splitting, k,l-separation, and minority isolation explain most bugs found by Jepsen
  • With high probability, O(log n) random partitions simultaneously provide full coverage for all these notions
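The two coverage notions on this slide can be written as predicates on a bipartition, represented here as a pair of Python sets. This is a sketch under stated assumptions: the function names are hypothetical, and a bipartition with two blocks of equal size is treated as having no minority block, which the slide does not specify.

```python
def separates(bipartition, first, second):
    """k,l-separation: `first` lies entirely in one block and `second` in the other."""
    a, b = bipartition
    first, second = set(first), set(second)
    return (first <= a and second <= b) or (first <= b and second <= a)


def isolates_in_minority(bipartition, node):
    """Minority isolation: `node` is in the strictly smaller block (assumption: ties cover nothing)."""
    a, b = bipartition
    if len(a) == len(b):
        return False
    return node in min(a, b, key=len)


cut = ({0, 1}, {2, 3, 4})
print(separates(cut, {0}, {2, 3}))   # True
print(separates(cut, {0, 2}, {3}))   # False
print(isolates_in_minority(cut, 1))  # True
print(isolates_in_minority(cut, 4))  # False
```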

slide-48
SLIDE 48
  • 1. General Random Testing Framework
  • 2. Randomly Testing Distributed Systems
  • 3. Wider Context: Combinatorial Testing
slide-49
SLIDE 49

General Testing Framework

Tests T, Goals G

  • 1. What are tests?
  • 2. What are testing goals?
  • 3. What is the notion of coverage?
  • 4. How to construct covering families?

slide-50
SLIDE 50

Distributed Systems with Network Partitions

Tests T, Goals G

  • 1. Partitions with k blocks
  • 2. Sets of k nodes
  • 3. k-splitting coverage
  • 4. Random families of size O(log n) are k-splitting w.h.p.

slide-51
SLIDE 51

Concurrent Programs

Program = Partially ordered set of events

Tests T, Goals G

  • 1. Schedules (interleavings)
  • 2. Ordered sets of k events
  • 3. k-hitting coverage: Schedule “hits” events $e_1 < \dots < e_k$
  • 4. Hitting families of size $O(\log n)$, $O(\log n)^{k-1}$, $O(n^{k-1})$

Chistikov, Majumdar, Niksic. Hitting families of schedules for asynchronous programs. CAV 2016
Burckhardt et al. A randomized scheduler with probabilistic guarantees of finding bugs. ASPLOS 2010
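As a rough illustration of k-hitting coverage, the predicate below checks whether a schedule executes a given tuple of events in the listed order. It assumes a schedule is a list in which every event appears exactly once, and it does not check that the schedule respects the program's partial order. The name `hits` is chosen for the example.

```python
def hits(schedule, events):
    """True if `events` occur in `schedule` in the given relative order."""
    position = {event: i for i, event in enumerate(schedule)}
    indices = [position[e] for e in events]
    return all(i < j for i, j in zip(indices, indices[1:]))


schedule = ["a", "c", "b", "d"]
print(hits(schedule, ["a", "b", "d"]))  # True
print(hits(schedule, ["b", "c"]))       # False
```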

slide-52
SLIDE 52

Combinatorial Testing

Tests T, Goals G

  • 1. Inputs with many features
  • 2. Values for k features
  • 3. Input coincides with the chosen values on the k features
  • 4. Various constructions of covering arrays

Kuhn, Kacker, Lei. Combinatorial Testing. Encyclopedia of Software Engineering. 2010
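The coverage notion for combinatorial testing can be checked by brute force on small examples. The sketch below lists every choice of k features and values that no input matches. The function name, the representation of inputs as tuples, and the three-boolean-feature example are assumptions made for illustration, not taken from the cited reference.

```python
from itertools import combinations, product


def uncovered_interactions(inputs, domains, k):
    """List (features, values) goals that no input matches on all k chosen features."""
    missing = []
    for features in combinations(range(len(domains)), k):
        for values in product(*(domains[f] for f in features)):
            matched = any(
                all(row[f] == v for f, v in zip(features, values))
                for row in inputs
            )
            if not matched:
                missing.append((features, values))
    return missing


# Three boolean features: 4 inputs cover all pairwise (k = 2) goals, versus 8 exhaustive inputs.
domains = [(0, 1)] * 3
inputs = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(uncovered_interactions(inputs, domains, 2))           # []
print(len(uncovered_interactions(inputs[:3], domains, 2)))  # 3
```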

slide-54
SLIDE 54

General Testing Framework

Tests T, Goals G

  • 1. What are tests?
  • 2. What are testing goals?
  • 3. What is the notion of coverage?
  • 4. How to construct covering families?

Where else can we apply this approach?