SLIDE 1 WHAT WE TALK ABOUT WHEN WE TALK ABOUT DISTRIBUTED SYSTEMS
ALVARO VIDELA
SLIDE 2 DISTRIBUTED SYSTEMS FOR THE IKEA FAMILY
ALVARO VIDELA
SLIDE 3
HTTP://BIT.LY/DIST-SYS101
SLIDE 4
@HINTJENS
SLIDE 5
DISTRIBUTED SYSTEMS
SLIDE 6 “A DISTRIBUTED SYSTEM IS ONE IN WHICH THE FAILURE OF A COMPUTER YOU DID NOT EVEN KNOW EXISTED CAN RENDER YOUR OWN COMPUTER UNUSABLE”
Leslie Lamport
SLIDE 7
SLIDE 8
SLIDE 9 Google: define jargon
SLIDE 10
DISTRIBUTED SYSTEMS
SLIDE 11 DISTRIBUTED SYSTEMS
- Many entities trying to solve a problem
(nodes, processes)
SLIDE 12 DISTRIBUTED SYSTEMS
- Many entities trying to solve a problem
(nodes, processes)
SLIDE 13 DISTRIBUTED SYSTEMS
- Many entities trying to solve a problem
(nodes, processes)
- Partial Knowledge
- Uncertainty
SLIDE 14
DEEP RABBIT HOLE
SLIDE 15
WHAT TO READ?
SLIDE 16
WHICH PAPERS?
SLIDE 17
SLIDE 18
SLIDE 19
SLIDE 20
SLIDE 21
SLIDE 22
WHICH BOOKS?
SLIDE 23
SLIDE 24
WHY?
SLIDE 25 http://tobielangel.com
SLIDE 26
THE PROBLEM
SLIDE 27
DIFFERENT MODELS
SLIDE 28 DIFFERENT MODELS
SLIDE 29 DIFFERENT MODELS
- Timing Model
- Inter Process Communication Used (IPC method)
SLIDE 30 DIFFERENT MODELS
- Timing Model
- Inter Process Communication Used (IPC method)
SLIDE 31
TIMING MODEL
SLIDE 33 TIMING MODEL
- Synchronous Model
- Asynchronous Model
SLIDE 34 TIMING MODEL
- Synchronous Model
- Asynchronous Model
- Semi-synchronous Model
SLIDE 35
INTERPROCESS COMMUNICATION
SLIDE 36 INTERPROCESS COMMUNICATION
SLIDE 37 INTERPROCESS COMMUNICATION
- Message Passing
- Shared Memory
SLIDE 38
FAILURE MODES
SLIDE 40 FAILURE MODES
- Crash-stop
- Crash-recovery
SLIDE 41 FAILURE MODES
- Crash-stop
- Crash-recovery
- Omission Faults
SLIDE 42 FAILURE MODES
- Crash-stop
- Crash-recovery
- Omission Faults
- Arbitrary Failures Mode (Byzantine)
SLIDE 43
LIVENESS AND SAFETY
SLIDE 44
LIVENESS AND SAFETY PROPERTIES OF ALGORITHMS
SLIDE 45 SAFETY
Some “bad” thing does not happen during execution
SLIDE 46 SAFETY
“Communication links should not invent messages out of thin air”
SLIDE 47 LIVENESS
A “good” thing happens during execution
SLIDE 48 LIVENESS
“A destination process eventually delivers the message”
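A minimal sketch of how the two example properties could be checked on a recorded trace. The trace format (a list of `("send", msg)` and `("deliver", msg)` events) and both function names are assumptions made for illustration; they do not come from the talk.

```python
# Hypothetical trace format: ("send", msg) or ("deliver", msg), in order.

def no_creation(trace):
    """Safety: every delivered message was sent earlier in the trace."""
    sent = set()
    for kind, msg in trace:
        if kind == "send":
            sent.add(msg)
        elif kind == "deliver" and msg not in sent:
            return False  # the "bad" thing: a message invented out of thin air
    return True


def all_delivered(trace):
    """Liveness, judged on a finished trace: every sent message was delivered."""
    sent = {msg for kind, msg in trace if kind == "send"}
    delivered = {msg for kind, msg in trace if kind == "deliver"}
    return sent <= delivered


good = [("send", "m1"), ("deliver", "m1")]
invented = [("deliver", "m2")]   # violates safety
pending = [("send", "m1")]       # safe, but m1 is not delivered yet
```

A safety violation shows up in a finite prefix of the execution (`invented` is already bad). A liveness property cannot be refuted that way: `pending` fails `all_delivered` only if we assume the trace is complete, since the delivery could still happen later.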
SLIDE 49 LET’S TAKE A LOOK AT FLP [1]
[1] Fischer, Lynch, Paterson
SLIDE 50
SLIDE 51 IMPOSSIBILITY OF DISTRIBUTED CONSENSUS WITH ONE FAULTY PROCESS
SLIDE 56
WHAT’S CONSENSUS ANYWAY?
SLIDE 57 “THE CONSENSUS PROBLEM IS A PARADIGM OF AGREEMENT PROBLEMS”
https://dl.acm.org/citation.cfm?id=1052796.1052806
SLIDE 58
PROPERTIES OF CONSENSUS
SLIDE 59 PROPERTIES OF CONSENSUS
- C-Termination: Every correct process eventually decides on some value
SLIDE 60 PROPERTIES OF CONSENSUS
- C-Termination: Every correct process eventually decides on some value
- C-Validity: If a process decides v, then v was proposed by some process
SLIDE 61 PROPERTIES OF CONSENSUS
- C-Termination: Every correct process eventually decides on some value
- C-Validity: If a process decides v, then v was proposed by some process
- C-Agreement: No two correct processes decide differently
SLIDE 62 PROPERTIES OF UNIFORM CONSENSUS
- C-Termination: Every correct process eventually decides on some value
- C-Validity: If a process decides v, then v was proposed by some process
- C-Agreement: No two correct processes decide differently
- C-Uniform Agreement: No two processes (correct or not) decide differently.
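The four properties above can be read as predicates over the outcome of one run. The sketch below assumes a hypothetical representation (dictionaries of proposals and decisions plus the set of correct processes); it checks an outcome, it does not implement a consensus algorithm.

```python
def check_consensus(proposals, decisions, correct):
    """proposals: {process: proposed value}
    decisions: {process: decided value}, only for processes that decided
    correct:   set of processes that never crash (illustrative input)"""
    proposed = set(proposals.values())
    return {
        # every correct process decided something
        "termination": all(p in decisions for p in correct),
        # every decided value was proposed by some process
        "validity": all(v in proposed for v in decisions.values()),
        # correct processes that decided all chose the same value
        "agreement": len({decisions[p] for p in correct if p in decisions}) <= 1,
        # the same, but also counting processes that decided and then crashed
        "uniform_agreement": len(set(decisions.values())) <= 1,
    }


proposals = {"p1": "a", "p2": "b", "p3": "b"}
# p3 decided "a" and then crashed; p1 and p2 are correct and decided "b".
decisions = {"p1": "b", "p2": "b", "p3": "a"}
result = check_consensus(proposals, decisions, correct={"p1", "p2"})
```

In this run plain agreement holds while uniform agreement does not, which is exactly the gap between consensus and uniform consensus.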
SLIDE 63
WE NEED CONSENSUS WHEN:
A SET OF PROCESSES HAVE TO AGREE TO TAKE A COMMON ACTION
SLIDE 64
WE NEED CONSENSUS WHEN:
A SET OF PROCESSES HAVE TO AGREE TO TAKE A COMMON ACTION
Atomic Broadcast
SLIDE 65
WE NEED CONSENSUS WHEN:
A SET OF PROCESSES HAVE TO AGREE TO TAKE A COMMON ACTION
Atomic Broadcast
Group Membership
SLIDE 66
ATOMIC BROADCAST
“CORRECT PROCESSES DELIVER THE SAME SET OF MESSAGES IN THE SAME ORDER”
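One way to see why atomic broadcast needs consensus: run one consensus instance per delivery slot, and have every process append the decided message to its log. The sketch below is a toy under strong assumptions: no crashes, no network, and `consensus` is a stand-in function (it simply picks the smallest proposal) rather than a real protocol. All names are hypothetical.

```python
def consensus(proposals):
    """Stand-in for a real consensus instance: everyone gets the same
    decided value, chosen among the proposals (here: the smallest)."""
    return min(proposals)


def atomic_broadcast(pending):
    """pending: {process: set of messages seen but not yet delivered}.
    Returns {process: delivery log}, using one consensus instance per slot."""
    pending = {p: set(msgs) for p, msgs in pending.items()}
    logs = {p: [] for p in pending}
    while any(pending.values()):
        # each process that still has pending messages proposes one of them
        proposals = [min(msgs) for msgs in pending.values() if msgs]
        decided = consensus(proposals)
        for p in logs:
            logs[p].append(decided)       # same message, same slot, everywhere
            pending[p].discard(decided)
    return logs


logs = atomic_broadcast({"p1": {"m2", "m1"}, "p2": {"m3"}, "p3": {"m1", "m3"}})
```

Although the three processes start out knowing different messages, they end with identical logs because each slot is settled by a single agreed decision.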
SLIDE 67
FLP TELLS US THAT IF CONSENSUS CANNOT BE ACHIEVED, THEN ATOMIC BROADCAST OR GROUP MEMBERSHIP CANNOT BE ACHIEVED EITHER
SLIDE 68
SO, WE PACK OUR BAGS AND GO? NOTHING TO SEE HERE?
SLIDE 69 STUMBLING OVER CONSENSUS RESEARCH: MISUNDERSTANDINGS AND ISSUES
Marcos K. Aguilera
SLIDE 70
FAILURE DETECTORS
SLIDE 71
SLIDE 72
FAILURE DETECTORS
SLIDE 73 FAILURE DETECTORS
SLIDE 74 FAILURE DETECTORS
- External process
- Provides information about suspected processes
SLIDE 75 FAILURE DETECTORS
- External process
- Provides information about suspected processes
- Completeness property (crashed processes are detected)
SLIDE 76 FAILURE DETECTORS
- External process
- Provides information about suspected processes
- Completeness property (crashed processes are detected)
- Accuracy (correct processes are never suspected)
SLIDE 77
“RUB SOME PERFECT FAILURE DETECTOR ON IT”
SLIDE 78 http://www.amazon.com/Introduction-Reliable-Secure-Distributed-Programming/dp/3642152597
PERFECT FAILURE DETECTOR
SLIDE 79
EVENTUALLY ACCURATE FAILURE DETECTOR
SLIDE 80 EVENTUALLY ACCURATE FAILURE DETECTOR
- Strong Completeness: Eventually, every process that crashes is permanently suspected by every correct process.
SLIDE 81 EVENTUALLY ACCURATE FAILURE DETECTOR
- Strong Completeness: Eventually, every process that crashes is permanently suspected by every correct process.
- Eventual Weak Accuracy: There is a time after which some correct process is never suspected by the correct processes.
SLIDE 82 EVENTUALLY ACCURATE FAILURE DETECTOR
- Strong Completeness: Eventually, every process that crashes is permanently suspected by every correct process.
- Eventual Weak Accuracy: There is a time after which some correct process is never suspected by the correct processes.
http://dl.acm.org/citation.cfm?id=1052806
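A common way to approximate an eventually accurate detector is heartbeats with a timeout that grows after every wrong suspicion. The sketch below uses simulated time passed in by the caller; the class name, the doubling rule and the initial timeout are illustrative assumptions, not the algorithm from the cited paper or book.

```python
class HeartbeatDetector:
    """Suspects a process when no heartbeat arrived within its timeout.
    A wrong suspicion doubles that process's timeout, so once message
    delays become bounded a correct process stops being suspected."""

    def __init__(self, processes, timeout=1.0):
        self.timeout = {p: timeout for p in processes}
        self.last_seen = {p: 0.0 for p in processes}
        self.suspected = set()

    def heartbeat(self, p, now):
        self.last_seen[p] = now
        if p in self.suspected:
            # We were wrong: p is alive. Trust it again and be more patient.
            self.suspected.discard(p)
            self.timeout[p] *= 2

    def check(self, now):
        for p, seen in self.last_seen.items():
            if now - seen > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)


fd = HeartbeatDetector(["p", "q"])
fd.heartbeat("p", 0.5)
fd.heartbeat("q", 0.5)
first = fd.check(2.0)      # both silent for 1.5 > 1.0
fd.heartbeat("p", 2.1)     # p was only slow; q has really crashed
second = fd.check(4.0)     # p silent for 1.9, within its new timeout of 2.0
```

Here `p` is suspected once by mistake and then trusted again with a longer timeout, while `q` never sends another heartbeat and stays suspected, which is the completeness half of the contract.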
SLIDE 83
SLIDE 84
QUORUMS
SLIDE 85
TL;DR:
INTERSECTING SETS
SLIDE 86 “A QUORUM IN A SYSTEM WITH N CRASH-FAULT PROCESS ABSTRACTIONS […] IS ANY MAJORITY OF PROCESSES, I.E., ANY SET OF MORE THAN N/2 PROCESSES”
QUORUMS
SLIDE 87 “IF F < N/2 PROCESSES FAIL BY CRASHING, THERE IS ALWAYS AT LEAST ONE QUORUM OF NONCRASHED PROCESSES IN SUCH SYSTEMS”
QUORUMS
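Both quoted claims can be checked by brute force for a small system. The sketch below enumerates every majority quorum for five processes (the names A to E are borrowed from the diagrams in this deck; the function names are hypothetical) and verifies that any two quorums intersect.

```python
from itertools import combinations


def quorums(processes):
    """All majority quorums: every subset with more than N/2 members."""
    n = len(processes)
    return [set(c)
            for size in range(n // 2 + 1, n + 1)
            for c in combinations(processes, size)]


def has_live_quorum(processes, crashed):
    """True if the non-crashed processes still form a majority."""
    alive = set(processes) - set(crashed)
    return len(alive) > len(processes) / 2


procs = ["A", "B", "C", "D", "E"]
qs = quorums(procs)
# Any two majorities share at least one process.
all_intersect = all(q1 & q2 for q1, q2 in combinations(qs, 2))
two_crashes = has_live_quorum(procs, ["B", "D"])         # F = 2 < 5/2
three_crashes = has_live_quorum(procs, ["B", "C", "D"])  # F = 3 > 5/2
```

The intersection is what makes quorums useful: whatever one quorum has seen, at least one member of any later quorum has seen it too.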
SLIDE 88
QUORUMS
A - B - C - D - E
SLIDE 89
QUORUMS
A - B - C - D - E
SLIDE 90
QUORUMS
A - B’ - C - D’ - E
SLIDE 91
QUORUMS
A - B’ - C - D’ - E
SLIDE 92
QUORUMS
A - B’ - C’ - D’ - E’
SLIDE 93
QUORUMS
A - B’ - C’ - D’ - E’
SLIDE 94
CONSISTENCY
SLIDE 95
SLIDE 96
CONCURRENT FIFO QUEUE
SLIDE 97
CONSISTENCY CONDITIONS
SLIDE 98 CONSISTENCY CONDITIONS
- Atomic Consistency (Linearizability)
SLIDE 99 CONSISTENCY CONDITIONS
- Atomic Consistency (Linearizability)
- Sequential Consistency
SLIDE 100 CONSISTENCY CONDITIONS
- Atomic Consistency (Linearizability)
- Sequential Consistency
- Causal Consistency
SLIDE 101 CONSISTENCY CONDITIONS
- Atomic Consistency (Linearizability)
- Sequential Consistency
- Causal Consistency
https://aphyr.com/posts/313-strong-consistency-models
SLIDE 102 LINEARIZABILITY
http://www.amazon.com/Distributed-Algorithms-Message-Passing-Systems-Michel/dp/3642381227/
SLIDE 103 LINEARIZABILITY
http://www.amazon.com/Distributed-Algorithms-Message-Passing-Systems-Michel/dp/3642381227/
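A brute-force sketch of what a linearizability check means. It assumes a single read/write register rather than the concurrent FIFO queue mentioned earlier, because the register's sequential behaviour fits in a few lines. The history format `(start, end, kind, value)` is an assumption for illustration, and the search is exponential, so it only suits tiny histories.

```python
from itertools import permutations


def legal_register(order, initial):
    """Sequential spec of a register: a read returns the last written value."""
    current = initial
    for _, _, kind, value in order:
        if kind == "write":
            current = value
        elif value != current:
            return False
    return True


def linearizable(history, initial=None):
    """history: list of (start, end, kind, value), kind is "write" or "read".
    Try every total order; accept one that respects real time and is legal."""
    for order in permutations(history):
        # An operation that finished before another started must stay first.
        respects_time = all(not (later[1] < earlier[0])
                            for i, earlier in enumerate(order)
                            for later in order[i + 1:])
        if respects_time and legal_register(order, initial):
            return True
    return False


# The read overlaps the write, so it may take effect after it.
concurrent_ok = [(0, 3, "write", 1), (1, 2, "read", 1)]
# The read starts after the write finished but still sees the initial value.
stale_read = [(0, 1, "write", 1), (2, 3, "read", None)]
```

`linearizable(concurrent_ok)` holds and `linearizable(stale_read)` does not. If the two operations in `stale_read` came from different processes, sequential consistency would still accept that history, because it only preserves each process's own program order and not real time.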
SLIDE 104
SOME BOOKS
SLIDE 105 http://www.amazon.com/Distributed-Algorithms-Message-Passing-Systems-Michel/dp/3642381227/
SLIDE 106 http://www.amazon.com/Fault-tolerant-Agreement-Synchronous-Message-passing-Distributed/dp/1608455254/
SLIDE 107 http://www.amazon.com/Communication-Abstractions-Fault-tolerant-Asynchronous-Distributed/dp/160845293X/
SLIDE 108 http://www.amazon.com/Distributed-Algorithms-Kaufmann-Management-Systems/dp/1558603484/
SLIDE 109 http://www.amazon.com/Introduction-Reliable-Secure-Distributed-Programming/dp/3642152597
SLIDE 110 http://www.amazon.com/Guide-Reliable-Distributed-Systems-High-Assurance/dp/1447124154/
SLIDE 111 http://www.amazon.com/Replication-Practice-Lecture-Computer-Theoretical/dp/3642112935/
SLIDE 112
FINDING NON-PAYWALLED PAPERS
SLIDE 113
CONCLUSION
SLIDE 115 CONCLUSION
- Deep Rabbit Hole
- Computing Science where Science is Still a Thing™
SLIDE 116 CONCLUSION
- Deep Rabbit Hole
- Computing Science where Science is Still a Thing™
- History of the Field Matters
SLIDE 117 CONCLUSION
- Deep Rabbit Hole
- Computing Science where Science is Still a Thing™
- History of the Field Matters
- Read, read, read
SLIDE 118 THANKS!
@old_sound