Argosy: Verifying layered storage systems with recovery refinement



slide-1
SLIDE 1

Argosy: Verifying layered storage systems with recovery refinement

Tej Chajed, Joseph Tassarotti, Frans Kaashoek, Nickolai Zeldovich MIT

slide-2
SLIDE 2

2

disk1 disk2 logical disk Bob writes a replication system

slide-3
SLIDE 3

2

disk1 disk2 write1 logical disk write2 Bob writes a replication system

slide-4
SLIDE 4

3

disk1 disk2 write1 logical disk Bob writes a replication system

slide-5
SLIDE 5

3

disk1 disk2 write1 logical disk

?

Bob writes a replication system

slide-6
SLIDE 6

3

disk1 disk2 write1

rep_recover

logical disk

?

and implements its recovery procedure Bob writes a replication system

slide-7
SLIDE 7

3

disk1 disk2 write1

rep_recover

logical disk

?

recovery restores invariants and implements its recovery procedure Bob writes a replication system

slide-8
SLIDE 8

4

replication

read write rep_recover

read and write are atomic if you run rep_recover after every crash

Disk interface Two-disk interface

Bob is careful and writes a machine-checked proof of correctness
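
A minimal sketch of the replication layer these slides describe, assuming an in-memory model of the two disks. The `TwoDisks` class and the `crash_between` flag are illustrative assumptions (they are not part of the verified Coq development), and disk failures are not modelled.

```python
class TwoDisks:
    """Hypothetical in-memory stand-in for the two-disk interface."""

    def __init__(self, size):
        self.disk1 = [0] * size
        self.disk2 = [0] * size


def rep_write(disks, addr, value, crash_between=False):
    """write_impl: issue write1, then write2.

    crash_between=True simulates a crash after write1 and before write2.
    """
    disks.disk1[addr] = value      # write1
    if crash_between:
        return                     # crash: disk2 still holds the old value
    disks.disk2[addr] = value      # write2


def rep_read(disks, addr):
    """read_impl: with no disk failures modelled, disk1 is authoritative."""
    return disks.disk1[addr]


def rep_recover(disks):
    """Restore the invariant disk1 == disk2 by copying disk1 onto disk2."""
    for addr, value in enumerate(disks.disk1):
        disks.disk2[addr] = value


disks = TwoDisks(size=4)
rep_write(disks, 2, 7, crash_between=True)   # crash in the middle of a write
rep_recover(disks)                           # run recovery after the crash
```

After recovery the two disks agree again, so the interrupted write looks atomic to a reader of the logical disk.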

slide-9
SLIDE 9

Transactions

5

write-ahead logging

log_recover

Disk interface

slide-10
SLIDE 10

Transactions

5

write-ahead logging

log_recover

ops are atomic if you run log_recover after every crash

Disk interface
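
A rough sketch of how write-ahead logging can make a group of writes atomic, assuming a toy disk layout with a log area and a commit flag. The names `LoggedDisk`, `commit`, and `crash_at` are hypothetical and chosen only for illustration; the verified implementation is not shown on the slides.

```python
class LoggedDisk:
    """Hypothetical disk layout: data blocks, a log area, and a commit flag."""

    def __init__(self, size):
        self.data = [0] * size
        self.log = []            # pending (addr, value) pairs
        self.committed = False   # set once the whole log is durable


def apply_log(disk):
    """Install every logged write; safe to repeat because writes are absolute."""
    for addr, value in disk.log:
        disk.data[addr] = value


def commit(disk, writes, crash_at=None):
    """Apply `writes` as one transaction; crash_at names a simulated crash point."""
    disk.log = list(writes)
    if crash_at == "before_commit":
        return
    disk.committed = True        # commit point
    if crash_at == "after_commit":
        return
    apply_log(disk)
    disk.log, disk.committed = [], False


def log_recover(disk):
    """Finish a committed transaction, or discard an uncommitted one."""
    if disk.committed:
        apply_log(disk)
    disk.log, disk.committed = [], False


aborted = LoggedDisk(3)
commit(aborted, [(0, 5), (1, 6)], crash_at="before_commit")
log_recover(aborted)

finished = LoggedDisk(3)
commit(finished, [(0, 5), (1, 6)], crash_at="after_commit")
log_recover(finished)
```

In both crash scenarios the data blocks end up holding either none or all of the transaction's writes, never a mixture.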

slide-11
SLIDE 11

6

write-ahead log replication logging + replication

Transactions Disk interface Two-disk interface

?

slide-12
SLIDE 12

6

write-ahead log replication logging + replication

Transactions Disk interface Two-disk interface

rep_recover ; log_recover

?

slide-13
SLIDE 13

7

rep_recover ; log_recover

?

log_recover

under crashes

rep_recover

under crashes

Challenge: crashes during composed recovery

how do we prove correctness under crashes using the existing proofs?

slide-14
SLIDE 14

8

Prior work cannot handle multiple recovery procedures

CHL [SOSP ’15]: not modular
Yggdrasil [OSDI ’16]: single recovery
Flashix [SCP ’16]: restricted recovery procedures

write-ahead log replication

slide-15
SLIDE 15

9

Argosy supports modular recovery proofs

developer proves developer proves

write-ahead log replication

Transactions Disk interface Two-disk interface

slide-16
SLIDE 16

9

Argosy supports modular recovery proofs

logging + replication

Argosy proves

write-ahead log replication

Transactions Disk interface Two-disk interface

slide-17
SLIDE 17

10

Contributions

Recovery refinement for modular proofs

slide-18
SLIDE 18

10

Contributions

Recovery refinement for modular proofs
CHL for proving recovery refinement (see paper)
Verified example: logging + replication (see paper)

slide-19
SLIDE 19

10

Contributions

Recovery refinement for modular proofs
CHL for proving recovery refinement (see paper)
Verified example: logging + replication (see paper)
Machine-checked proofs in Coq (see code)

slide-20
SLIDE 20

11

Preview: recovery refinement

replication

Disk interface Two-disk interface

1. Normal execution correctness, using refinement
2. Crash and recovery correctness, using recovery refinement

slide-21
SLIDE 21

Refinement

12

Background

slide-22
SLIDE 22

Background

13

Disk interface Two-disk interface

replication

slide-23
SLIDE 23

Background

13

Disk interface Two-disk interface

replication write write1 write2 write_impl

slide-24
SLIDE 24

Background

13

Disk interface Two-disk interface

replication write write1 write2 write_impl

slide-25
SLIDE 25

Background

13

Disk interface

read

Two-disk interface

replication write write1 write2 write_impl read_impl read1 read2

slide-26
SLIDE 26

Background

13

Disk interface

read

Two-disk interface

replication write write1 write2 write_impl read_impl read1 read2 code code_impl write write_impl read read_impl

correctness is based on how we use replication : run code using Disk interface on top of two disks

slide-27
SLIDE 27

Background

14

Correctness: trace inclusion

replication

Disk interface Two-disk interface

code code_impl

spec’s behaviors ⊇ running code’s behaviors

slide-28
SLIDE 28

Background

15

Proving correctness with an abstraction relation

disk1 disk2 logical disk

R

1. developer provides abstraction relation R

spec state

slide-29
SLIDE 29

Background

15

Proving correctness with an abstraction relation

disk1 disk2 logical disk

R

write1 write2

1. developer provides abstraction relation R

spec state

slide-30
SLIDE 30

Background

15

Proving correctness with an abstraction relation

disk1 disk2 logical disk

R

write1 write2 write

1. developer provides abstraction relation R
2. prove spec execution exists

spec state

slide-31
SLIDE 31

Background

15

Proving correctness with an abstraction relation

disk1 disk2 logical disk

R

write1 write2 write

R

1. developer provides abstraction relation R
2. prove spec execution exists
3. and abstraction relation is preserved

spec state
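
The three steps above can be illustrated with a small exhaustive check, assuming tiny disks represented as tuples. The relation `R` below ("both physical disks equal the logical disk") and the helper names are assumptions made for this sketch; a real proof covers all states rather than enumerating a few.

```python
from itertools import product


def R(impl_state, spec_state):
    """Abstraction relation: both physical disks equal the logical disk."""
    disk1, disk2 = impl_state
    return disk1 == spec_state and disk2 == spec_state


def put(disk, addr, value):
    """Return a copy of an immutable disk (a tuple) with one block replaced."""
    return disk[:addr] + (value,) + disk[addr + 1:]


def write_spec(spec_state, addr, value):
    """Spec-level write on the logical disk."""
    return put(spec_state, addr, value)


def write_impl(impl_state, addr, value):
    """Implementation-level write: write1 followed by write2."""
    disk1, disk2 = impl_state
    return put(disk1, addr, value), put(disk2, addr, value)


def simulation_holds(size=2, values=(0, 1)):
    """Check steps 2 and 3 for every R-related start state and every write."""
    for spec in product(values, repeat=size):
        impl = (spec, spec)                      # the only impl state related by R
        for addr, value in product(range(size), values):
            if not R(write_impl(impl, addr, value), write_spec(spec, addr, value)):
                return False
    return True
```

For every related pair of states, the spec-level `write` is the matching spec execution, and `R` holds again afterwards.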

slide-32
SLIDE 32

Recovery refinement

16

slide-33
SLIDE 33

17

Disk interface

read

Two-disk interface replication

write write1 write2 write_impl read_impl read1 read2

slide-34
SLIDE 34

17

Disk interface

read

Two-disk interface replication

write write1 write2 write_impl rep_recover read_impl read1 read2

slide-35
SLIDE 35

17

Disk interface

read

Two-disk interface replication

write write1 write2 write_impl rep_recover read_impl read1 read2

slide-36
SLIDE 36

18

replication

Disk interface Two-disk interface

Extending trace inclusion with recovery

code code_impl

⊇ ⊇

specification for crash behavior crash & recovery behavior

slide-37
SLIDE 37

18

replication

Disk interface Two-disk interface

Extending trace inclusion with recovery

code code_impl

⊇ ⊇

specification for crash behavior crash & recovery behavior ? crash semantics

recover

? recovery semantics

slide-38
SLIDE 38

19

replication

Disk interface Two-disk interface

code code_impl

code := | op1 | op1 op2 |

one of these

crash & recovery behavior

recover

? recovery semantics

slide-39
SLIDE 39

19

replication

Disk interface Two-disk interface

code code_impl

code := | op1 | op1 op2 |

crash & recovery behavior

recover

? recovery semantics

slide-40
SLIDE 40

20

replication

Disk interface Two-disk interface

code code_impl

⊇ ⊇

code recover code_impl

slide-41
SLIDE 41

21

replication

Disk interface Two-disk interface

code code_impl

⊇ ⊇

code recover code_impl

zero-or-more iterations

recover ⋆

slide-42
SLIDE 42

21

replication

Disk interface Two-disk interface

code code_impl

⊇ ⊇

code recover code_impl recover ⋆

slide-43
SLIDE 43

22

replication

Disk interface Two-disk interface

code code_impl

⊇ ⊇

code recover code_impl recover ⋆

Trace inclusion, with recovery
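
A small executable illustration of this inclusion for the replication example, assuming two-block disks modelled as tuples. It enumerates every state a crash can leave during `write_impl`, then zero or more crashed runs of recovery, then one complete recovery, and compares the results with an atomic spec. All function names here are hypothetical.

```python
def put(disk, addr, value):
    """Return a copy of a tuple-disk with one block replaced."""
    return disk[:addr] + (value,) + disk[addr + 1:]


def crashed_write(state, addr, value):
    """Two-disk states a crash can leave behind during write_impl."""
    disk1, disk2 = state
    new1 = put(disk1, addr, value)
    return {state, (new1, disk2), (new1, put(disk2, addr, value))}


def crashed_recover(state):
    """States a crash can leave during rep_recover (copying disk1 to disk2)."""
    disk1, disk2 = state
    states = {state}
    for addr in range(len(disk1)):
        disk2 = put(disk2, addr, disk1[addr])
        states.add((disk1, disk2))
    return states


def recover(state):
    """A complete, uninterrupted run of rep_recover."""
    disk1, _ = state
    return disk1, disk1


def crash_and_recovery_outcomes(start, addr, value):
    """Logical disks reachable via: crashed write; (crashed recover)*; recover."""
    reachable = crashed_write(start, addr, value)
    frontier = set(reachable)
    while frontier:                              # zero-or-more crashed recoveries
        nxt = set()
        for s in frontier:
            nxt |= crashed_recover(s)
        frontier = nxt - reachable
        reachable |= frontier
    return {recover(s)[0] for s in reachable}


old = (0, 0)
impl_outcomes = crash_and_recovery_outcomes((old, old), addr=1, value=9)
spec_outcomes = {old, put(old, 1, 9)}            # atomic: write happened or not
```

Every outcome of the crashing implementation is also an outcome the spec allows, which is the inclusion the slide states.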

slide-44
SLIDE 44

23

op1_impl recover op2_impl recover

Proving trace inclusion, with recovery

slide-45
SLIDE 45

23

op1_impl recover op2_impl recover

crash must occur during some operation

Proving trace inclusion, with recovery

slide-46
SLIDE 46

23

op1_impl recover op2_impl recover

Proving trace inclusion, with recovery

slide-47
SLIDE 47

23

op1_impl recover op2_impl recover

op1

R R

Proving trace inclusion, with recovery

slide-48
SLIDE 48

23

recover

op2_impl

recover

⋆ R

Proving trace inclusion, with recovery

slide-49
SLIDE 49

23

recover

op2_impl

recover

⋆ R

op2

| R

Proving trace inclusion, with recovery

slide-50
SLIDE 50

24

op

R op_impl R

non-crash execution crash and recovery execution

R recover op_impl recover

op | R

Recovery refinement

slide-51
SLIDE 51

24

op

R op_impl R

non-crash execution crash and recovery execution

R recover op_impl recover

op | R

Recovery refinement

implies

Trace inclusion

specification behavior

running code behavior

slide-52
SLIDE 52

Composition theorem

25

slide-53
SLIDE 53

26

op1 op2 | op r

Kleene algebra for transition relations

expression

slide-54
SLIDE 54

26

Kleene algebra for transition relations

expression: op1 op2 | op r⋆

matching transitions: op1 op2, or op r r r …
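
One way to make this concrete is to model a transition relation as a finite set of (before, after) state pairs, with sequencing, choice, and star defined on such sets. The states and the relations `op` and `r` below are invented for illustration.

```python
def seq(p, q):
    """Sequential composition p q: a step of p followed by a step of q."""
    return {(a, c) for (a, b) in p for (b2, c) in q if b == b2}


def alt(p, q):
    """Choice p | q: a step of either relation."""
    return p | q


def star(p, states):
    """Iteration p*: zero or more steps of p (reflexive-transitive closure)."""
    result = {(s, s) for s in states}
    while True:
        bigger = result | seq(result, p)
        if bigger == result:
            return result
        result = bigger


states = range(4)
op = {(0, 1)}                 # hypothetical operation: move from state 0 to 1
r = {(1, 2), (2, 3)}          # hypothetical step r: advance by one
expression = seq(op, star(r, states))   # op r*
```

Here `expression` contains exactly the transitions that run `op` and then any number of `r` steps.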

slide-55
SLIDE 55

27

Theorem: recovery refinements compose

write-ahead log

log_recover …

replication

rep_recover …

If

Transactions Two-disk interface Disk interface

slide-56
SLIDE 56

27

Theorem: recovery refinements compose

write-ahead log

log_recover …

replication

rep_recover …

then If

logging + replication

… rep_recover; log_recover

Transactions Two-disk interface Transactions Two-disk interface Disk interface
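
A toy sketch of what stacking two layers might look like operationally, assuming a layer is just a per-operation compiler plus a recovery procedure. The `Layer` class, `compose`, and the string op names are all invented for illustration and do not reflect Argosy's Coq definitions. The point is only that the composed recovery runs the lower layer's recovery first, then the upper layer's recovery on top of the lower layer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Layer:
    """Hypothetical layer: how to compile each spec op, plus a recovery procedure."""
    compile_op: Callable[[str], List[str]]
    recover: List[str]          # recovery, written in lower-interface ops


def compile_prog(layer, prog):
    """Compile a program (list of ops) by compiling each op in turn."""
    return [low for op in prog for low in layer.compile_op(op)]


def compose(upper, lower):
    """Stack `upper` on `lower`: run lower recovery first, then upper recovery."""
    return Layer(
        compile_op=lambda op: compile_prog(lower, upper.compile_op(op)),
        recover=lower.recover + compile_prog(lower, upper.recover),
    )


# Toy op names, invented for illustration only.
log_layer = Layer(
    lambda op: ["write log", "write data"] if op == "commit" else [op],
    recover=["read log", "write data"],
)
rep_layer = Layer(
    lambda op: [op.replace("write", "write1"), op.replace("write", "write2")]
    if op.startswith("write") else [op.replace("read", "read1")],
    recover=["rep_recover"],
)

both = compose(log_layer, rep_layer)
```

In this sketch `both.recover` starts with `rep_recover` and continues with the log layer's recovery compiled down to two-disk operations, mirroring `rep_recover; log_recover`.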

slide-57
SLIDE 57

28

Goal: prove composed recovery correct

rep_recover ; log_recover

?

log_recover

under crashes

rep_recover

under crashes

slide-58
SLIDE 58

29

under crashes

log rep

under crashes

rep log

;

?

Goal: prove composed recovery correct

rep_recover log_recover

slide-59
SLIDE 59

30

log log

rep rep

slide-60
SLIDE 60

30

log log

rep rep

( rep↯ | rep log↯ ) rep log

(↯ marks a run interrupted by a crash)

slide-61
SLIDE 61

30

log log

rep rep

( rep↯ | rep log↯ )⋆ rep log

how to re-use recovery proofs here?

slide-62
SLIDE 62

31

Using Kleene algebra for reasoning

( rep↯ | rep log↯ )⋆ rep log

slide-63
SLIDE 63

31

Using Kleene algebra for reasoning

( rep↯ | rep log↯ )⋆ rep log

after de-nesting (p ∣ q)⋆ = p⋆(qp⋆)⋆

slide-64
SLIDE 64

31

Using Kleene algebra for reasoning

( rep↯ | rep log↯ )⋆ rep log

= rep↯⋆ ( rep log↯ rep↯⋆ )⋆ rep log

after de-nesting (p ∣ q)⋆ = p⋆(qp⋆)⋆

slide-65
SLIDE 65

31

Using Kleene algebra for reasoning

( rep↯ | rep log↯ )⋆ rep log

= rep↯⋆ ( rep log↯ rep↯⋆ )⋆ rep log

after de-nesting (p ∣ q)⋆ = p⋆(qp⋆)⋆

= rep↯⋆ rep ( log↯ rep↯⋆ rep )⋆ log

after sliding (pq)⋆p = p(qp)⋆
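
Both rewrite rules are standard Kleene-algebra identities. As a sanity check (not a proof), the sketch below tests them on randomly generated finite relations, modelling a relation as a set of state pairs. The helper names and the random-testing setup are assumptions made for this example.

```python
import random


def seq(p, q):
    """Sequential composition of two relations."""
    return {(a, c) for (a, b) in p for (b2, c) in q if b == b2}


def star(p, states):
    """Reflexive-transitive closure of p over the given states."""
    result = {(s, s) for s in states}
    while True:
        bigger = result | seq(result, p)
        if bigger == result:
            return result
        result = bigger


def random_relation(states, rng):
    """A random relation: each pair is included with probability 0.3."""
    return {(a, b) for a in states for b in states if rng.random() < 0.3}


def identities_hold(trials=200, n=4, seed=0):
    """Test de-nesting and sliding on `trials` random pairs of relations."""
    rng = random.Random(seed)
    states = range(n)
    for _ in range(trials):
        p, q = random_relation(states, rng), random_relation(states, rng)
        ps = star(p, states)
        # de-nesting: (p | q)* = p* (q p*)*
        denest = star(p | q, states) == seq(ps, star(seq(q, ps), states))
        # sliding: (p q)* p = p (q p)*
        slide = seq(star(seq(p, q), states), p) == seq(p, star(seq(q, p), states))
        if not (denest and slide):
            return False
    return True
```

Calling `identities_hold()` exercises both equations on 200 random pairs of relations over four states.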

slide-66
SLIDE 66

32

rep↯⋆ rep ( log↯ rep↯⋆ rep )⋆ log

After rewrite both proofs apply

slide-67
SLIDE 67

replication proof

32

rep↯⋆ rep ( log↯ rep↯⋆ rep )⋆ log

rep invariants restored

After rewrite both proofs apply

slide-68
SLIDE 68

replication proof

32

rep↯⋆ rep ( log↯ rep↯⋆ rep )⋆ log

rep invariants restored

log

behaves like

After rewrite both proofs apply

slide-69
SLIDE 69

write-ahead log proof replication proof

32

rep↯⋆ rep ( log↯ rep↯⋆ rep )⋆ log

rep invariants restored

log

behaves like

log↯⋆ log

log invariants restored

After rewrite both proofs apply

slide-70
SLIDE 70

33

Argosy is implemented and verified in Coq

github.com/mit-pdos/argosy

3,200 lines for framework
4,000 lines for verified example (logging + replication)
Example extracts to Haskell and runs

slide-71
SLIDE 71

34

Argosy: modular proofs of layered storage systems

slide-72
SLIDE 72

34

Argosy: modular proofs of layered storage systems

( rep↯ | rep log↯ )⋆

Kleene algebra

slide-73
SLIDE 73

34

Argosy: modular proofs of layered storage systems

r impl r

op

|

recovery refinement

( rep↯ | rep log↯ )⋆

Kleene algebra

slide-74
SLIDE 74

34

Argosy: modular proofs of layered storage systems

r impl r

op

|

recovery refinement modular proofs

( rep↯ | rep log↯ )⋆

Kleene algebra

slide-75
SLIDE 75

34

Argosy: modular proofs of layered storage systems

r impl r

op

|

recovery refinement modular proofs

( rep↯ | rep log↯ )⋆

Kleene algebra

come find us after! Tej and Joe