slide-1
SLIDE 1

Towards Automatically Checking Thousands of Failures with Micro-Specifications

Haryadi S. Gunawi, Thanh Do†, Pallavi Joshi, Joseph M. Hellerstein, Andrea C. Arpaci-Dusseau†, Remzi H. Arpaci-Dusseau†, Koushik Sen

University of California, Berkeley
† University of Wisconsin, Madison

slide-2
SLIDE 2

Cloud Era

  • Solve bigger human problems
  • Use clusters of thousands of machines

slide-6
SLIDE 6

Failures in The Cloud

“The future is a world of failures everywhere” - Garth Gibson
“Recovery must be a first-class operation” - Raghu Ramakrishnan
“Reliability has to come from the software” - Jeffrey Dean

slide-9
SLIDE 9

Why Is Failure Recovery Hard?

  • Testing is not advanced enough against complex failures
    – Diverse, frequent, and multiple failures
    – Facebook photo loss
  • Recovery is underspecified
    – Need to specify failure recovery behaviors
    – Customized well-grounded protocols
      • Example: Paxos Made Live – An Engineering Perspective [PODC ’07]

slide-13
SLIDE 13

Our Solutions

  • FTS (“FATE”) – Failure Testing Service
    – New abstraction for failure exploration
    – Systematically exercise 40,000 unique combinations of failures
  • DTS (“DESTINI”) – Declarative Testing Specification
    – Enable concise recovery specifications
    – We have written 74 checks (3 lines / check)
  • Note: Names have changed since the paper

slide-14
SLIDE 14

Summary of Findings

  • Applied FATE and DESTINI to three cloud systems: HDFS, ZooKeeper, Cassandra
  • Found 16 new bugs
  • Reproduced 74 bugs
  • Problems found
    – Inconsistency
    – Data loss
    – Rack awareness broken
    – Unavailability

slide-15
SLIDE 15

Outline

  • Introduction
  • FATE
  • DESTINI
  • Evaluation
  • Summary

slide-19
SLIDE 19

[Figure: pipeline of M, C and nodes 1, 2, 3 with no failures. Labels: Alloc. Req., Setup Stage, Data Transfer Stage]

slide-24
SLIDE 24

[Figure: four pipelines of M, C and nodes 1, 2, 3 (plus node 4 after setup-stage recovery), with crash points X1, X2, X3]

  • No failures
  • Setup stage recovery (X1): recreate fresh pipeline
  • Data transfer stage recovery (X2): continue on surviving nodes
  • Bug in data transfer stage recovery (X3)

Failures at DIFFERENT STAGES lead to DIFFERENT FAILURE BEHAVIORS
Goal: Exercise different failure recovery paths

slide-25
SLIDE 25

FATE

  • A failure injection framework
    – Target I/O points
    – Systematically explore failures
    – Multiple failures
  • New abstraction of failure scenario
    – Remember injected failures
    – Increase failure coverage

slide-32
SLIDE 32

Failure ID

[Figure: a failure on the I/O between Node 2 and Node 3]

Fields                            Values
Static           Func. Call       OutputStream.read()
Static           Source File      BlockReceiver.java
Dynamic          Stack Trace      …
Domain specific  Source           Node 2
Domain specific  Destination      Node 3
Domain specific  Net. Message     Data Packet
Failure Type                      Crash After
Hash                              12348729
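
A failure ID of this shape can be modeled as an immutable record whose hash is derived from its fields. The following is a minimal Python sketch, not FATE's implementation: the class name `FailureId`, its field names, the `"..."` stack-trace placeholder, and the choice of SHA-1 are all assumptions made for illustration. Only the field values come from the table.

```python
import hashlib
from dataclasses import astuple, dataclass, replace


@dataclass(frozen=True)
class FailureId:
    """Hypothetical record mirroring the fields of the Failure ID table."""
    func_call: str      # static
    source_file: str    # static
    stack_trace: str    # dynamic
    source: str         # domain specific
    destination: str    # domain specific
    net_message: str    # domain specific
    failure_type: str   # the failure injected at this I/O point

    def digest(self) -> str:
        # Stable hash over all fields (SHA-1 is an illustrative choice).
        joined = "|".join(astuple(self))
        return hashlib.sha1(joined.encode("utf-8")).hexdigest()


fid = FailureId(
    func_call="OutputStream.read()",
    source_file="BlockReceiver.java",
    stack_trace="...",  # placeholder: the slide elides the stack trace
    source="Node 2",
    destination="Node 3",
    net_message="Data Packet",
    failure_type="Crash After",
)
same_point = replace(fid)                         # identical fields
other_point = replace(fid, destination="Node 1")  # different I/O target

# "Remember injected failures": keep the hashes that were already exercised.
seen = {fid.digest()}
```

With such a record, checking whether a failure was already injected reduces to a set lookup on its hash, for example `other_point.digest() in seen`.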

slide-33
SLIDE 33

How Do Developers Build a Failure ID?

  • FATE intercepts all I/Os
  • Use AspectJ to collect information at every I/O point
    – I/O buffers (e.g., file buffer, network buffer)
    – Target I/O (e.g., file name, IP address)
  • Reverse engineer for domain-specific information

slide-43
SLIDE 43

Exploring Failure Space

[Figure: pipeline of M, C and nodes 1, 2, 3 with failure points A, B, C]

  • Single failures: Exp #1: A, Exp #2: B, Exp #3: C
  • Pairs of failures: AB, AC, BC
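
The experiment sequence on this slide (A, B, C, then AB, AC, BC) can be reproduced with a brute-force enumeration. This is only a sketch under simplifying assumptions: the function name `failure_experiments` and the `max_failures` parameter are hypothetical, and the real exploration strategy in FATE may differ (for example, failure points may only become visible while an experiment runs).

```python
from itertools import combinations


def failure_experiments(failure_ids, max_failures=2):
    """List every combination of up to max_failures failure IDs,
    single failures first, then pairs, and so on."""
    experiments = []
    for k in range(1, max_failures + 1):
        experiments.extend(combinations(failure_ids, k))
    return experiments


experiments = failure_experiments(["A", "B", "C"])
for number, experiment in enumerate(experiments, start=1):
    print(f"Exp #{number}: {''.join(experiment)}")
```

Three failure points yield three single-failure experiments and three two-failure experiments, six runs in total.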

slide-44
SLIDE 44

Outline

Introduction FATE

  • DESTINI
  • Evaluation
  • Summary

15 16

slide-45
SLIDE 45

DESTINI

  • Enable concise recovery specifications
  • Check if expected behaviors match actual behaviors
  • Important elements:
    – Expectations
    – Facts
    – Failure Events
    – Check Timing
  • Interpose network and disk protocols

slide-53
SLIDE 53

Writing specifications

“Violation if expectation is different from actual facts”

violationTable() :- expectationTable(), NOT-IN actualTable()

Datalog syntax:
  :-   derivation
  ,    AND

slide-58
SLIDE 58

[Figure: two pipelines of M, C and nodes 1, 2, 3, each with a crash X; one is labeled “Correct recovery”, the other “Incorrect recovery”]

Correct recovery:

expectedNodes(Block, Node)
  B   Node 1
  B   Node 2

actualNodes(Block, Node)
  B   Node 1
  B   Node 2

incorrectNodes(Block, Node)
  (empty)

incorrectNodes(B, N) :- expectedNodes(B, N), NOT-IN actualNodes(B, N);

slide-65
SLIDE 65

[Figure: two pipelines of M, C and nodes 1, 2, 3, each with a crash X; one is labeled “Correct recovery”, the other “Incorrect recovery”]

Incorrect recovery:

BUILD EXPECTATIONS
expectedNodes(Block, Node)
  B   Node 1
  B   Node 2

CAPTURE FACTS
actualNodes(Block, Node)
  B   Node 1

incorrectNodes(Block, Node)
  B   Node 2

incorrectNodes(B, N) :- expectedNodes(B, N), NOT-IN actualNodes(B, N);
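
The `NOT-IN` rule is a set difference between the expectation table and the fact table. The sketch below evaluates it in Python on the rows shown on these slides. The helper name `not_in` and the tuple encoding of rows are assumptions for illustration and are not part of DESTINI.

```python
def not_in(expected, actual):
    """Rows that are expected but missing from the actual facts."""
    return {row for row in expected if row not in actual}


expected_nodes = {("B", "Node 1"), ("B", "Node 2")}

# Correct recovery: both expected replicas exist.
actual_after_correct_recovery = {("B", "Node 1"), ("B", "Node 2")}
# Incorrect recovery: the replica on Node 2 is missing.
actual_after_incorrect_recovery = {("B", "Node 1")}

no_violation = not_in(expected_nodes, actual_after_correct_recovery)
violation = not_in(expected_nodes, actual_after_incorrect_recovery)
```

An empty result means the check passes. Any remaining row, here `("B", "Node 2")`, identifies the block and node where recovery went wrong.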

slide-69
SLIDE 69

Building Expectations

expectedNodes(B, N) :- getBlockPipe(B, N);

[Figure: pipeline of M, C and nodes 1, 2, 3 with a crash X]

Client to Master: Give me list of nodes for B
Master to Client: [Node 1, Node 2, Node 3]

expectedNodes(Block, Node)
  B   Node 1
  B   Node 2
  B   Node 3

slide-75
SLIDE 75

Updating Expectation

DEL expectedNodes(B, N) :- fateCrashNode(N), writeStage(B, Stage), Stage == “Data Transfer”, expectedNodes(B, N);

expectedNodes(Block, Node)
  B   Node 1
  B   Node 2
  B   Node 3

[Figure: pipeline of M, C and nodes 1, 2, 3 with a crash X]

  • “Client receives all acks from setup stage” → writeStage enters the Data Transfer stage

setupAcks (B, Pos, Ack) :- cdpSetupAck (B, Pos, Ack);
goodAcksCnt (B, COUNT<Ack>) :- setupAcks (B, Pos, Ack), Ack == ’OK’;
nodesCnt (B, COUNT<Node>) :- pipeNodes (B, _, N, _);
writeStage (B, Stg) :- nodesCnt (NCnt), goodAcksCnt (ACnt), NCnt == ACnt, Stg := “Data Transfer”;
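
The two steps on this slide, deriving the write stage from setup acks and deleting an expectation when a node crashes during data transfer, can be sketched as follows. This is an illustrative Python rendering, not DESTINI code: the function names, the dictionary of acks keyed by pipeline position, and the fallback stage name `"Setup"` are assumptions. What to expect after a crash in the setup stage would need its own rule and is left unchanged here.

```python
def write_stage(setup_acks, pipe_nodes):
    """Return "Data Transfer" once the number of 'OK' setup acks equals
    the number of pipeline nodes; otherwise stay in "Setup"."""
    good_acks = sum(1 for ack in setup_acks.values() if ack == "OK")
    return "Data Transfer" if good_acks == len(pipe_nodes) else "Setup"


def on_crash(expected_nodes, block, crashed_node, stage):
    """Drop the crashed node from the expectation only when the crash
    happens in the data transfer stage."""
    if stage == "Data Transfer":
        return expected_nodes - {(block, crashed_node)}
    return set(expected_nodes)


pipeline = ["Node 1", "Node 2", "Node 3"]
expected = {("B", node) for node in pipeline}

stage = write_stage({0: "OK", 1: "OK", 2: "OK"}, pipeline)
early_stage = write_stage({0: "OK", 1: "OK"}, pipeline)

after_crash = on_crash(expected, "B", "Node 3", stage)
after_early_crash = on_crash(expected, "B", "Node 3", early_stage)
```

The same crash therefore changes the expectation or not depending on the stage in which it is injected.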

slide-79
SLIDE 79

Updating Expectation

DEL expectedNodes(B, N) :- fateCrashNode(N), writeStage(B, Stage), Stage == “Data Transfer”, expectedNodes(B, N);

  • “Client receives all acks from setup stage” → writeStage enters the Data Transfer stage
  • Precise failure events
  • Different stages → different recovery behaviors → different specifications
  • FATE and DESTINI must work hand in hand

slide-86
SLIDE 86

Capture Facts

actualNodes(B, N) :- blocksLocation(B, N, Gs), latestGenStamp(B, Gs)

[Figure: correct and incorrect recovery pipelines; block replicas are labeled B_gs2, B_gs1, B_gs1]

blocksLocation(B, N, Gs)
  B   Node 1   2
  B   Node 2   1
  B   Node 3   1

latestGenStamp(B, Gs)
  B   2

actualNodes(Block, Node)
  B   Node 1
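
The `actualNodes` rule is a join of the two fact tables on block and generation stamp. A minimal Python sketch of that join, using the rows from this slide, is shown below. The function name `actual_nodes` and the choice of a dictionary for `latestGenStamp` are assumptions for illustration.

```python
def actual_nodes(blocks_location, latest_gen_stamp):
    """Keep only the replicas whose generation stamp is the latest one
    known for their block."""
    return {
        (block, node)
        for (block, node, gen_stamp) in blocks_location
        if latest_gen_stamp.get(block) == gen_stamp
    }


blocks_location = {
    ("B", "Node 1", 2),
    ("B", "Node 2", 1),
    ("B", "Node 3", 1),
}
latest_gen_stamp = {"B": 2}

facts = actual_nodes(blocks_location, latest_gen_stamp)
```

Only Node 1 holds block B with generation stamp 2, so it is the only row in the resulting table.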

slide-90
SLIDE 90

Violation and Check-Timing

actualNodes(Block, Node)
  B   Node 1

expectedNodes(Block, Node)
  B   Node 1
  B   Node 2

incorrectNodes(Block, Node)
  B   Node 2

incorrectNodes(B, N) :- expectedNodes(B, N), NOT-IN actualNodes(B, N), cnpComplete(B);

  • While recovery is still ongoing, specifications are temporarily violated
  • Need precise events to decide when the check should be done
    – In this example, upon block completion
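
Gating the check on a completion event can be sketched by adding the set of completed blocks as a third input, so that nothing is reported while recovery is still in progress. This is an illustrative Python sketch: the function name `incorrect_nodes` and the `completed_blocks` set, which stands in for `cnpComplete(B)` events, are assumptions.

```python
def incorrect_nodes(expected, actual, completed_blocks):
    """Report missing replicas only for blocks whose completion event
    has been observed."""
    return {
        (block, node)
        for (block, node) in expected
        if block in completed_blocks and (block, node) not in actual
    }


expected = {("B", "Node 1"), ("B", "Node 2")}
actual = {("B", "Node 1")}

during_recovery = incorrect_nodes(expected, actual, completed_blocks=set())
after_completion = incorrect_nodes(expected, actual, completed_blocks={"B"})
```

Before block B completes, the missing replica on Node 2 is not yet a violation. Once the completion event arrives, the same tables produce the violation row.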

slide-93
SLIDE 93

Rules

r1  incorrectNodes (B, N) :- cnpComplete (B), expectedNodes (B, N), NOT-IN actualNodes (B, N);
r2  pipeNodes (B, Pos, N) :- getBlkPipe (UFile, B, Gs, Pos, N);
r3  expectedNodes (B, N) :- getBlkPipe (UFile, B, Gs, Pos, N);
r4  DEL expectedNodes (B, N) :- fateCrashNode (N), pipeStage (B, Stg), Stg == 2, expectedNodes (B, N);
r5  setupAcks (B, Pos, Ack) :- cdpSetupAck (B, Pos, Ack);
r6  goodAcksCnt (B, COUNT<Ack>) :- setupAcks (B, Pos, Ack), Ack == ’OK’;
r7  nodesCnt (B, COUNT<Node>) :- pipeNodes (B, _, N, _);
r8  pipeStage (B, Stg) :- nodesCnt (NCnt), goodAcksCnt (ACnt), NCnt == ACnt, Stg := 2;
r9  blkGenStamp (B, Gs) :- dnpNextGenStamp (B, Gs);
r10 blkGenStamp (B, Gs) :- cnpGetBlkPipe (UFile, B, Gs, _, _);
r11 diskFiles (N, File) :- fsCreate (N, File);
r12 diskFiles (N, Dst) :- fsRename (N, Src, Dst), diskFiles (N, Src, Type);
r13 DEL diskFiles (N, Src) :- fsRename (N, Src, Dst), diskFiles (N, Src, Type);
r14 fileTypes (N, File, Type) :- diskFiles(N, File), Type := Util.getType(File);
r15 blkMetas (N, B, Gs) :- fileTypes (N, File, Type), Type == metafile, Gs := Util.getGs(File);
r16 actualNodes (B, N) :- blkMetas (N, B, Gs), blkGenStamp (B, Gs);

  • Capture facts and build expectations from I/O events
  • No need to interpose internal functions
  • Specification reuse
  • For the first check, the #rules : #checks ratio is 16:1
  • Overall, the #rules : #checks ratio is 3:1
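
Rules r11 to r13 maintain a table of on-disk files purely from intercepted file-system events. The Python sketch below replays such events to show the idea. The tuple encoding of events, the function name `apply_fs_events`, and the file names are hypothetical. Only the event names `fsCreate` and `fsRename` are taken from the rules.

```python
def apply_fs_events(events):
    """Replay file-system events and return the set of (node, file) rows."""
    disk_files = set()
    for event in events:
        kind = event[0]
        if kind == "fsCreate":            # r11: a created file is recorded
            _, node, file = event
            disk_files.add((node, file))
        elif kind == "fsRename":          # r12 and r13: move a known file
            _, node, src, dst = event
            if (node, src) in disk_files:
                disk_files.discard((node, src))
                disk_files.add((node, dst))
    return disk_files


events = [
    ("fsCreate", "Node 1", "blk_1.meta.tmp"),                # hypothetical name
    ("fsRename", "Node 1", "blk_1.meta.tmp", "blk_1.meta"),  # hypothetical name
    ("fsRename", "Node 2", "unknown.tmp", "unknown"),        # ignored: not tracked
]
disk_files = apply_fs_events(events)
```

Because the table is derived only from I/O events, the system's internal functions do not need to be instrumented.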

slide-94
SLIDE 94

Outline

Introduction FATE DESTINI

  • Evaluation
  • Summary

25 26

slide-95
SLIDE 95

Evaluation

  • FATE: 3900 lines, DESTINI: 1200 lines
  • Applied FATE and DESTINI to three cloud systems
    – HDFS, ZooKeeper, Cassandra
  • 40,000 unique combinations of failures
  • Found 16 new bugs, reproduced 74 bugs
  • 74 recovery specifications
    – 3 lines / check

slide-96
SLIDE 96

Bugs found

  • Reduced availability and performance
  • Data loss due to multiple failures
  • Data loss in log recovery protocol
  • Data loss in append protocol
  • Rack awareness property is broken


slide-97
SLIDE 97

Conclusion

  • FATE explores multiple failures systematically
  • DESTINI enables concise recovery specifications
  • FATE and DESTINI: a unified framework
    – Testing recovery specifications requires a failure service
    – Failure service needs recovery specifications to catch recovery bugs

slide-98
SLIDE 98

Thank you!

QUESTIONS?

The Advanced Systems Laboratory: http://www.cs.wisc.edu/adsl
Berkeley Orders of Magnitude: http://boom.cs.berkeley.edu

Download our full TR paper from these websites