SLIDE 1

SimGrid MC: Verification Support for a Multi-API Simulation Platform

Stephan Merz (1), Martin Quinson (2), Cristian Rosa (2)

(1) INRIA Research Center Nancy, France
(2) Université Henri Poincaré Nancy 1, Nancy, France

FORTE 08/06/2011

Merz, Quinson, Rosa (INRIA,UHP Nancy 1) SimGrid MC FORTE 08/06/2011 1 / 18

SLIDE 2

Motivation

Distributed systems pose a development challenge:

  • lack of a shared clock
  • lack of a global view of the state
  • non-determinism
  • poorly adapted programming languages

Goal: verify distributed message-passing systems. We want to study actual implementations.

SLIDE 5

Studying Distributed Systems

Comparison of methodologies to study distributed systems:

[Comparison table. Columns: Execution, Simulation, Proofs, Model checking. Rows: Performance assessment, Experimental bias, Experimental control, Correctness verification, Ease of use.]

Simulation and model checking complement each other:

  • Simulation to assess performance
  • Model checking to verify correctness
  • Both run automatically
  • No expert users needed

However, simulators and model checkers often use different models.
SLIDE 6

Model Checking Vs. Simulation

[Figure: state space of a distributed system]

  • Model checking explores all (relevant) behaviors of the model
  • Simulation explores one trace, subject to the platform's restrictions

SLIDE 9

Contributions

The contributions of this article are:

  • SimGrid MC: an extension of the SimGrid simulation framework for model checking distributed applications
  • SimGrid MC verifies actual implementations
  • We present the integration work:
      • Changes to the simulator main loop
      • Simulation and model checking share many abstractions
  • We explain how we implemented DPOR to support different communication APIs

SLIDE 10

SimGrid

The SimGrid framework is a collection of tools for the simulation of distributed computer systems.

It uses a discrete-event simulation model: simulated time advances only when triggered by actions.

Experimental workflow:

[Figure: experimental workflow]
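The discrete-event model mentioned above can be illustrated with a toy loop. This is a minimal sketch and not SimGrid code: the `Event` type and the `run_events` function are hypothetical names chosen for the illustration. It only shows that the simulated clock jumps from one event timestamp to the next instead of advancing continuously.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event record: a timestamp in simulated seconds and an id. */
typedef struct {
    double time;
    int    id;
    int    done;
} Event;

/* Process all events in timestamp order. The simulated clock never ticks
   on its own: it is set to the timestamp of the event being processed.
   Writes the processing order into order[] (if non-NULL) and returns the
   final simulated time. */
double run_events(Event *ev, size_t n, int *order) {
    assert(ev != NULL || n == 0);
    double clock = 0.0;
    for (size_t k = 0; k < n; k++) {
        size_t next = n;                      /* n means "none found yet" */
        for (size_t i = 0; i < n; i++) {
            if (!ev[i].done && (next == n || ev[i].time < ev[next].time))
                next = i;
        }
        clock = ev[next].time;                /* time advances with the event */
        ev[next].done = 1;
        if (order != NULL)
            order[k] = ev[next].id;
    }
    return clock;
}
```

A linear scan is used to keep the sketch short; a real simulator would typically keep pending events in a priority queue.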

SLIDE 12

The SimGrid Framework 2

[Figure: main loop of SimGrid. Lanes T1, T2 and M exchange actions a, b, c, d during simulation rounds n and n+1, along a simulated-time axis marked t(n-1), t(n), t(n+1). Layers: user code, SIMIX (virtualization), SURF (resource models).]

Main loop of SimGrid:

  • a, b, c, d are communication actions
  • they are intercepted to introduce the delays of the platform
  • they are the only way to affect the shared state

The simulator already provides some of the needed abstractions:

  • Controlled execution environment (processes, scheduling, etc.)
  • Transition detection (communication interception)

SLIDE 13

Outline

1  Introduction
     Motivation
     Model Checking
     The SimGrid Framework
     Shared Abstractions

2  SimGrid MC
     Architecture
     DPOR
     The Formal Model

3  Evaluation
     SMPI
     CHORD

4  Conclusion and Future Work

SLIDE 14

SimGrid MC’s Architecture 1

Main characteristics of SimGrid MC:

  • Explicit-state exploration (with a depth bound)
  • It actually executes the code
  • No storing or hashing of visited states (stateless model checking)
  • Replay-based rollbacks
  • Properties expressed as assertions (safety only, for now)
  • Dynamic partial-order reduction (DPOR) to cope with state-space explosion
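The combination of stateless exploration, a depth bound and replay-based rollbacks can be sketched on a toy system. The code below is an illustration under stated assumptions, not SimGrid MC's implementation: the `State` type, the two-process toy program and every function name are hypothetical. No visited state is stored; to backtrack, the explorer restarts from the initial state and replays the recorded scheduling choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NPROC     2   /* toy system: two processes            */
#define NSTEP     2   /* each process performs two steps      */
#define MAX_DEPTH 16  /* depth bound of the exploration       */

/* Toy state: one program counter per process plus a shared counter. */
typedef struct {
    int pc[NPROC];
    int shared;
} State;

static void reset(State *s) { memset(s, 0, sizeof *s); }

static bool enabled(const State *s, int p) { return s->pc[p] < NSTEP; }

/* Execute the next step of process p (each step bumps the counter). */
static void step(State *s, int p) { s->pc[p]++; s->shared++; }

/* Rollback by replay: restart from the initial state and re-execute the
   first `depth` scheduling choices recorded on the stack. */
static void replay(State *s, const int *stack, int depth) {
    assert(depth <= MAX_DEPTH);
    reset(s);
    for (int i = 0; i < depth; i++)
        step(s, stack[i]);
}

/* Depth-first exploration that stores no visited states.
   Returns the number of executions explored. */
static long explore(State *s, int *stack, int depth) {
    if (depth == MAX_DEPTH)
        return 1;                       /* depth bound reached */
    long runs = 0;
    bool any = false;
    for (int p = 0; p < NPROC; p++) {
        replay(s, stack, depth);        /* restore the state at this depth */
        if (!enabled(s, p))
            continue;
        any = true;
        step(s, p);
        stack[depth] = p;               /* record the scheduling choice */
        runs += explore(s, stack, depth + 1);
    }
    return any ? runs : 1;              /* nothing enabled: one full run */
}

long count_interleavings(void) {
    State s;
    int stack[MAX_DEPTH];
    reset(&s);
    return explore(&s, stack, 0);
}
```

Without any reduction, this toy explorer visits every interleaving of the two processes, which is the blow-up that DPOR is meant to limit.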

SLIDE 15

SimGrid MC’s Architecture 2

[Figure: architecture of SimGrid MC]

SLIDE 16

Exploration Loop

[Figure: exploration loop. Labels: lanes T1, T2 and M; transitions a, b, c, d; states S0, S1, S2; MC initialisation; MC stack; user code; MC (exploration algorithm); snapshot.]

SLIDE 20

Dynamic Partial-order Reductions 1

[Figure: three processes A, B and C with local states a0, a1, a2; b0, b1, b2; c0, c1, c2 and transitions sa, la, sb, lb and r.]

Depth-first search state exploration algorithm:

[Figure: explored execution (a0,b0,c0) -sb-> (a0,b1,c0) -r-> (a0,b1,c1) -sa-> (a1,b1,c1) -r-> (a1,b1,c2) -lb-> (a1,b2,c2) -la-> (a2,b2,c2), followed by a second ordering of the same transitions.]

It's another serialization of the same partial order!

SLIDE 22

Dynamic Partial-order Reductions 2

What are the transitions that we should interleave?

Or, equivalently: how do we generate a serialization of a different partial order?

By interleaving dependent transitions! Dependence D is the negation of independence I:

    D(t_i, t_j) = ¬I(t_i, t_j)

[Figure: diagram with transitions t_i and t_j]

SLIDE 24

Computing D Efficiently

How do we get the predicate D?

  • Using the semantics of the transitions
  • Proof of "independence theorems" for each pair of transitions
  • The predicate I is the disjunction of these cases:

        I(t_i, t_j) = (t_i = A ∧ t_j = B) ∨ ...

  • D can be over-approximated by a D' such that D(A, B) ⇒ D'(A, B)
  • If we do not know whether I(t_i, t_j) holds, we assume D'(t_i, t_j) (for soundness)

SLIDE 26

Dynamic Partial-order Reductions 4

No communication API has a formal semantics. Manual specification is required, based on:

  • informal API references
  • experiments
  • user experience

It is a tedious and time-consuming job (and it must be done for each API).

Our solution:

  • A core set of four basic networking primitives
  • User-level APIs built on top of this core
  • Formal specification of its semantics
  • Theorems of independence between certain primitives
  • State-space exploration at the level of the primitives

SLIDE 28

The Communication Model

The communication model is based on mailboxes:

  • processes post send/receive requests into mailboxes
  • requests are queued and matched in FIFO order

There are four primitives:

  • Send: asynchronous send
  • Recv: asynchronous receive
  • WaitAny: block until completion of a communication
  • TestAny: test for completion without blocking
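The mailbox model described above can be sketched with a small queue. This is an illustration under assumptions, not the SimGrid data structures: `Mailbox`, `Request`, `mbox_post` and `test_any` are hypothetical names, payloads are plain integers, and the blocking WaitAny primitive is left out because it needs a scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MBOX_CAP 16

typedef enum { REQ_SEND, REQ_RECV } ReqKind;

typedef struct {
    ReqKind kind;
    int     payload;  /* value carried by a send            */
    int    *dest;     /* where a receive stores the value   */
    bool    done;     /* set once the request is matched    */
} Request;

/* FIFO of pending requests. Requests of opposite kinds match as soon as
   they meet, so the queue only ever holds requests of a single kind. */
typedef struct {
    Request *pending[MBOX_CAP];
    size_t   head, count;
} Mailbox;

/* Post a send or receive request. If the oldest pending request has the
   opposite kind, the two are matched; otherwise the request is queued.
   Returns false when the queue is full. */
bool mbox_post(Mailbox *m, Request *r) {
    assert(m != NULL && r != NULL);
    r->done = false;
    if (m->count > 0 && m->pending[m->head]->kind != r->kind) {
        Request *peer = m->pending[m->head];
        m->head = (m->head + 1) % MBOX_CAP;
        m->count--;
        Request *snd = (r->kind == REQ_SEND) ? r : peer;
        Request *rcv = (r->kind == REQ_RECV) ? r : peer;
        *rcv->dest = snd->payload;       /* deliver the message */
        snd->done = true;
        rcv->done = true;
        return true;
    }
    if (m->count == MBOX_CAP)
        return false;
    m->pending[(m->head + m->count) % MBOX_CAP] = r;
    m->count++;
    return true;
}

/* TestAny-like check: index of the first completed request, or -1.
   It never blocks. */
int test_any(Request **reqs, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (reqs[i]->done)
            return (int)i;
    return -1;
}
```

With two sends queued before any receive, the first receive gets the first send's value, which is the FIFO matching rule stated on the slide.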

SLIDE 29

MPI Experiments

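Example:

    /* Assumed here: int payloads, tag 0 and MPI_COMM_WORLD. */
    /* Every third rank receives two values from any sender. */
    if (rank % 3 == 0) {
        MPI_Recv(&val1, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(&val2, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        /* Properties checked on every explored execution. */
        MC_assert(val1 > rank);
        MC_assert(val2 > rank);
    } else {
        /* The other ranks send their rank to the receiver of their group of three. */
        MPI_Send(&rank, 1, MPI_INT, (rank / 3) * 3, 0, MPI_COMM_WORLD);
    }

Results (#P: number of processes; "-": no value reported):

          DFS                                  DPOR
  #P      States       Time      Peak Mem      States    Time        Peak Mem
  3       520          0.247 s   23472 kB      72        0.074 s     23472 kB
  6       >10560579    >1 h      -             1563      0.595 s     26128 kB
  9       -            -         -             32874     14.118 s    29824 kB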

SLIDE 32

Chord Experiments

What is Chord?

  • Chord is a P2P lookup protocol
  • Designed for scalability
  • Nodes might join/leave

Implementation of Chord in SimGrid:

  • 563 lines of C (MSG interface)
  • 2 million nodes in simulation
  • Spotted a bug in big instances

SimGrid MC with two nodes:

  • DFS: 15600 states, 24 s
  • DPOR: 478 states, 1 s
  • Simple counter-example!

SLIDE 34

Conclusions and Future Work

Conclusions:

  • Introduced SimGrid MC, a model checker for distributed C programs
  • It can both simulate and verify programs without modification
  • The integration has been conceptually simple
  • DPOR exploration with support for multiple APIs
  • Capable of finding bugs in small MPI examples and in a more complex Chord implementation

Future work:

  • Implement and evaluate a stateful exploration
  • Experiment with a hybrid rollback mechanism (checkpoint + replay)
  • Add support for the verification of liveness properties
  • Do simulation and model checking at the same time
