When Aeron Met Raft - Martin Thompson - @mjpt777

slide-1
SLIDE 1

Cluster Consensus When Aeron Met Raft

Martin Thompson - @mjpt777

slide-2
SLIDE 2

What does “Consensus” mean?

slide-3
SLIDE 3

con•sen•sus

noun \ kən-ˈsen(t)-səs \ : general agreement : unanimity

Source: http://www.merriam-webster.com/

slide-4
SLIDE 4

con•sen•sus

noun \ kən-ˈsen(t)-səs \ : general agreement : unanimity : the judgment arrived at by most of those concerned

Source: http://www.merriam-webster.com/

slide-5
SLIDE 5

https://raft.github.io/raft.pdf

slide-6
SLIDE 6

https://www.cl.cam.ac.uk/~ms705/pub/papers/2015-osr-raft.pdf

slide-7
SLIDE 7

Raft in a Nutshell

slide-8
SLIDE 8

Roles

Candidate Follower Leader
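
The three roles form a small state machine. A minimal sketch of the transitions, following the Raft paper; the `RaftRole` and `Event` names are illustrative assumptions and are not taken from Aeron Cluster:

```java
/** Illustrative Raft role transitions; names are assumptions, not Aeron's API. */
enum RaftRole {
    FOLLOWER, CANDIDATE, LEADER;

    /** Events that can change a node's role (hypothetical names). */
    enum Event { ELECTION_TIMEOUT, WON_MAJORITY, DISCOVERED_LEADER_OR_HIGHER_TERM }

    RaftRole on(final Event event) {
        switch (event) {
            case ELECTION_TIMEOUT:
                // A follower or candidate starts a new election; a leader is unaffected.
                return this == LEADER ? LEADER : CANDIDATE;
            case WON_MAJORITY:
                // Only a candidate can be promoted by votes.
                return this == CANDIDATE ? LEADER : this;
            case DISCOVERED_LEADER_OR_HIGHER_TERM:
                // Any role steps down on seeing a current leader or a newer term.
                return FOLLOWER;
            default:
                throw new IllegalStateException("unknown event " + event);
        }
    }
}
```

For example, `RaftRole.FOLLOWER.on(RaftRole.Event.ELECTION_TIMEOUT)` yields `CANDIDATE`.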

slide-9
SLIDE 9

RPCs

  1. RequestVote RPC: invoked by candidates to gather votes
  2. AppendEntries RPC: invoked by leader to replicate and heartbeat
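
A sketch of the two messages and of the vote-granting rule, with field names following Figure 2 of the Raft paper. The class names and the `Voter` helper are illustrative assumptions; this is not how Aeron encodes its cluster protocol.

```java
/** RequestVote arguments as listed in the Raft paper (illustrative class). */
final class RequestVote {
    final long term;         // candidate's term
    final int candidateId;   // candidate requesting the vote
    final long lastLogIndex; // index of the candidate's last log entry
    final long lastLogTerm;  // term of the candidate's last log entry

    RequestVote(long term, int candidateId, long lastLogIndex, long lastLogTerm) {
        this.term = term;
        this.candidateId = candidateId;
        this.lastLogIndex = lastLogIndex;
        this.lastLogTerm = lastLogTerm;
    }
}

/** AppendEntries arguments; an empty entries array acts as a heartbeat. */
final class AppendEntries {
    final long term, prevLogIndex, prevLogTerm, leaderCommit;
    final int leaderId;
    final byte[][] entries;

    AppendEntries(long term, int leaderId, long prevLogIndex, long prevLogTerm,
                  byte[][] entries, long leaderCommit) {
        this.term = term;
        this.leaderId = leaderId;
        this.prevLogIndex = prevLogIndex;
        this.prevLogTerm = prevLogTerm;
        this.entries = entries;
        this.leaderCommit = leaderCommit;
    }

    boolean isHeartbeat() { return entries.length == 0; }
}

/** Receiver side of RequestVote (hypothetical helper). */
final class Voter {
    long currentTerm;
    Integer votedFor; // null until a vote is cast in the current term
    long lastLogIndex;
    long lastLogTerm;

    boolean grantVote(final RequestVote rpc) {
        if (rpc.term < currentTerm) {
            return false; // stale candidate
        }
        if (rpc.term > currentTerm) {
            currentTerm = rpc.term; // newer term: forget any earlier vote
            votedFor = null;
        }
        // The candidate's log must be at least as up to date as ours.
        final boolean upToDate = rpc.lastLogTerm > lastLogTerm
            || (rpc.lastLogTerm == lastLogTerm && rpc.lastLogIndex >= lastLogIndex);
        if ((votedFor == null || votedFor == rpc.candidateId) && upToDate) {
            votedFor = rpc.candidateId;
            return true;
        }
        return false;
    }
}
```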

slide-10
SLIDE 10

Safety Guarantees

  • Election Safety
  • Leader Append-Only
  • Log Matching
  • Leader Completeness
  • State Machine Safety
slide-11
SLIDE 11

Monotonic Functions

slide-12
SLIDE 12

Version all the things!

slide-13
SLIDE 13

Clustering Aeron

slide-14
SLIDE 14

Is it Guaranteed Delivery™ ???

slide-15
SLIDE 15

What is the “Architect” really looking for?

slide-16
SLIDE 16

Replicated State Machines => Redundant Deterministic Services
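
The idea can be illustrated with a toy service whose state depends only on the ordered log of commands it has applied. This is a minimal sketch; `OrderedService` and its digest function are invented for illustration.

```java
import java.util.List;

/** A deterministic service: same commands in the same order give the same state. */
final class OrderedService {
    private long digest;

    /** Order-sensitive update, so replicas diverge if the log order differs. */
    void apply(final long command) {
        digest = digest * 31 + command;
    }

    long digest() {
        return digest;
    }

    /** Builds a replica by applying every command in the log from the start. */
    static OrderedService replay(final List<Long> log) {
        final OrderedService service = new OrderedService();
        for (final long command : log) {
            service.apply(command);
        }
        return service;
    }
}
```

Two replicas built with `replay` from the same list report the same `digest()`, which is what lets redundant copies of the service stand in for one another.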

slide-17
SLIDE 17

[Diagram: five Clients and one Service]

slide-18
SLIDE 18

[Diagram: five Clients and one Service]

slide-19
SLIDE 19

[Diagram: five Clients; three Consensus Module and Service pairs]

slide-20
SLIDE 20

NIO Pain

slide-21
SLIDE 21

FileChannel channel = null;
try
{
    channel = FileChannel.open(directory.toPath());
}
catch (final IOException ignore)
{
}

if (null != channel)
{
    channel.force(true);
}

slide-22
SLIDE 22

Directory Sync

Files.force(directory.toPath(), true);
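
Assuming, as the "NIO Pain" heading suggests, that `Files.force` is a wished-for API rather than one the JDK provides, the workaround from the previous slide can be wrapped in a helper. A minimal sketch; `DirectorySync.trySync` is a hypothetical name, and whether a directory can be opened and forced at all depends on the platform.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class DirectorySync {
    /**
     * Best-effort sync of a directory's metadata to storage.
     * Returns false if the directory cannot be opened or forced,
     * which some platforms refuse to do for directories.
     */
    static boolean trySync(final Path directory) {
        try (FileChannel channel = FileChannel.open(directory, StandardOpenOption.READ)) {
            channel.force(true);
            return true;
        } catch (final IOException ex) {
            return false;
        }
    }
}
```

Returning a boolean instead of swallowing the exception lets the caller decide whether an unsynced directory is acceptable.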

slide-23
SLIDE 23

Performance

slide-24
SLIDE 24

Let’s consider the application of an RPC design approach

slide-25
SLIDE 25

[Diagram: five Clients; three Consensus Module and Service pairs]

slide-26
SLIDE 26

Should we consider concurrency and parallelism with Replicated State Machines?

slide-27
SLIDE 27

“Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once.” – Rob Pike

slide-28
SLIDE 28

1. Parallel is the opposite of Serial
2. Concurrent is the opposite of Sequential
3. Vector is the opposite of Scalar

– John Gustafson

slide-29
SLIDE 29

Fetch Time

Instruction Pipelining

slide-30
SLIDE 30

Fetch Decode Time

Instruction Pipelining

slide-31
SLIDE 31

Fetch Decode Execute Time

Instruction Pipelining

slide-32
SLIDE 32

Fetch Decode Execute Retire Time

Instruction Pipelining

slide-33
SLIDE 33

Fetch Decode Execute Retire Time Fetch Decode Execute Retire

Instruction Pipelining

slide-34
SLIDE 34

Fetch Decode Execute Retire Time Fetch Decode Execute Retire Fetch Decode Execute Retire

Instruction Pipelining

slide-35
SLIDE 35

Fetch Decode Execute Retire Time Fetch Decode Execute Retire Fetch Decode Execute Retire Fetch Decode Execute Retire

Instruction Pipelining

slide-36
SLIDE 36

Order Time

Consensus Pipeline

slide-37
SLIDE 37

Order Log Time

Consensus Pipeline

slide-38
SLIDE 38

Order Log Transmit Time

Consensus Pipeline

slide-39
SLIDE 39

Order Log Transmit Commit Time

Consensus Pipeline

slide-40
SLIDE 40

Order Log Transmit Commit Execute Time

Consensus Pipeline

slide-41
SLIDE 41

Order Log Transmit Commit Execute Time Order Log Transmit Commit Execute

Consensus Pipeline

slide-42
SLIDE 42

Order Log Transmit Commit Execute Time Order Log Transmit Commit Execute Order Log Transmit Commit Execute

Consensus Pipeline

slide-43
SLIDE 43

[Diagram: five Clients; three Consensus Module and Service pairs]

slide-44
SLIDE 44

[Diagram: five Clients; three Consensus Module and Service pairs]

slide-45
SLIDE 45

NIO Pain

slide-46
SLIDE 46

ByteBuffer byte[] copies

ByteBuffer byteBuffer = ByteBuffer.allocate(64 * 1024);
byteBuffer.putInt(index, value);

slide-47
SLIDE 47

ByteBuffer byte[] copies

ByteBuffer byteBuffer = ByteBuffer.allocate(64 * 1024);
byteBuffer.putBytes(index, bytes);

slide-48
SLIDE 48

ByteBuffer byte[] copies

ByteBuffer byteBuffer = ByteBuffer.allocate(64 * 1024);
byteBuffer.putBytes(index, bytes);
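
`ByteBuffer` has no `putBytes(index, bytes)` method, which appears to be the point of the slide. A common workaround is to write through a duplicate so the original buffer's position is left alone. A minimal sketch; `AbsolutePut.putBytes` is a hypothetical helper, not a JDK or Agrona API.

```java
import java.nio.ByteBuffer;

final class AbsolutePut {
    /** Copies src into buffer starting at index without moving buffer's own position. */
    static void putBytes(final ByteBuffer buffer, final int index, final byte[] src) {
        // The duplicate shares the content but has an independent position and limit.
        final ByteBuffer view = buffer.duplicate();
        view.position(index);
        view.put(src); // relative bulk put, applied to the view only
    }
}
```

The duplicate costs an extra object allocation per call. Java 13 added an absolute bulk `put(int index, byte[] src)` to `ByteBuffer`, which removes the need for this on newer runtimes.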

slide-49
SLIDE 49

How can Aeron help?

slide-50
SLIDE 50

Message Index => Byte Index
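
One reading of this slide is that a log entry can be identified by its byte position in the stream rather than by a message count. A minimal sketch of that bookkeeping; the 32-byte header and 32-byte frame alignment are assumptions chosen to resemble Aeron's data frames, and `BytePositionLog` is an invented name.

```java
/** Tracks a log position in bytes instead of a message index (illustrative). */
final class BytePositionLog {
    static final int HEADER_LENGTH = 32;   // assumed frame header size
    static final int FRAME_ALIGNMENT = 32; // assumed frame alignment, a power of two

    private long position;

    /** Header plus payload, rounded up to the next alignment boundary. */
    static int alignedFrameLength(final int payloadLength) {
        final int unaligned = HEADER_LENGTH + payloadLength;
        return (unaligned + FRAME_ALIGNMENT - 1) & ~(FRAME_ALIGNMENT - 1);
    }

    /** Appends a message and returns the byte position just after it. */
    long append(final int payloadLength) {
        position += alignedFrameLength(payloadLength);
        return position;
    }

    long position() {
        return position;
    }
}
```

Because the position only ever grows, it can serve as a monotonic identifier for "everything up to here", for example when acknowledging or committing a range of the log.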

slide-51
SLIDE 51

Multicast, MDC, and Spy based Messaging

slide-52
SLIDE 52

Counters and Bounded Consumption
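
A sketch of how a shared counter might bound consumption: the service may read the log only up to the position the consensus module has published as committed. `BoundedConsumer` and the use of `AtomicLong` are assumptions made for illustration; Aeron's own counters are not shown here.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Consumes a log no further than a commit position published by another component. */
final class BoundedConsumer {
    private final AtomicLong commitPosition; // advanced by the consensus module
    private long consumedPosition;

    BoundedConsumer(final AtomicLong commitPosition) {
        this.commitPosition = commitPosition;
    }

    /**
     * Consumes what is available, bounded by the commit position.
     * Returns the number of bytes consumed in this call.
     */
    long poll(final long appendedPosition) {
        final long limit = Math.min(appendedPosition, commitPosition.get());
        final long consumable = Math.max(0, limit - consumedPosition);
        consumedPosition += consumable;
        return consumable;
    }

    long consumedPosition() {
        return consumedPosition;
    }
}
```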

slide-53
SLIDE 53

Binary Protocols & Zero intermediate copies

slide-54
SLIDE 54

Batching – Amortising Costs

[Chart: average overhead per item or operation in batch; vertical axis 0% to 100%, horizontal axis 5 to 20]

slide-55
SLIDE 55

Batching – Amortising Costs

[Chart: vertical axis 0% to 100%, horizontal axis 5 to 20]

  • System calls
  • Network round trips
  • Disk writes
  • Expensive calculations
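
A worked version of the amortisation argument, assuming the simplest model in which one fixed cost (a system call, a round trip, a disk write) is paid once per batch and shared equally by its items. `BatchingCost` is an illustrative name and the model is an assumption, not data from the talk.

```java
final class BatchingCost {
    /** Share of a fixed per-batch cost carried by each item, as a percentage. */
    static double overheadPerItemPercent(final int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        return 100.0 / batchSize;
    }
}
```

Under this model a batch of 5 leaves each item carrying 20% of the fixed cost, and a batch of 20 leaves 5%.
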
slide-56
SLIDE 56

Interesting Features

slide-57
SLIDE 57

Agents and Threads

slide-58
SLIDE 58

Timers

slide-59
SLIDE 59

Back Pressure and Stashed Work
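
One way to read this slide: when a send is back pressured, the work is stashed and retried on a later duty cycle instead of blocking the thread. A minimal sketch using a bounded JDK queue as a stand-in for the transport; `StashingSender` is an invented name and does not reflect Aeron's implementation.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

/** Non-blocking sender that stashes work while the downstream is full. */
final class StashingSender {
    private final Queue<String> downstream; // bounded stand-in for a transport
    private final ArrayDeque<String> stash = new ArrayDeque<>();

    StashingSender(final int capacity) {
        downstream = new ArrayBlockingQueue<>(capacity);
    }

    /** Sends if possible; otherwise stashes, keeping order behind earlier stashed work. */
    void send(final String message) {
        if (!stash.isEmpty() || !downstream.offer(message)) {
            stash.addLast(message);
        }
    }

    /** Called once per duty cycle; returns how many stashed messages were sent. */
    int retryStashed() {
        int sent = 0;
        while (!stash.isEmpty() && downstream.offer(stash.peekFirst())) {
            stash.pollFirst();
            sent++;
        }
        return sent;
    }

    /** Simulates the downstream consumer taking one message, or null if empty. */
    String receive() {
        return downstream.poll();
    }

    int stashed() {
        return stash.size();
    }
}
```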

slide-60
SLIDE 60

Replay and Snapshots
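
A sketch of how the two combine during recovery: load the latest snapshot, then replay only the log entries that follow it. `RecoverableCounter` and its snapshot layout are assumptions for illustration.

```java
import java.util.List;

/** Toy service that recovers from a snapshot plus the tail of the log. */
final class RecoverableCounter {
    private long total;
    private int appliedCount; // number of log entries applied so far

    void apply(final long delta) {
        total += delta;
        appliedCount++;
    }

    /** Snapshot layout: {total, appliedCount}. */
    long[] snapshot() {
        return new long[] {total, appliedCount};
    }

    /** Restores the snapshot, then replays the entries recorded after it. */
    static RecoverableCounter recover(final long[] snapshot, final List<Long> log) {
        final RecoverableCounter service = new RecoverableCounter();
        service.total = snapshot[0];
        service.appliedCount = (int) snapshot[1];
        for (final long delta : log.subList(service.appliedCount, log.size())) {
            service.apply(delta);
        }
        return service;
    }

    long total() {
        return total;
    }

    int appliedCount() {
        return appliedCount;
    }
}
```

Recovering from an empty snapshot `{0, 0}` is a full replay and gives the same state, only more slowly.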

slide-61
SLIDE 61

Multiple Services on the same stream

slide-62
SLIDE 62

[Diagram: five Clients; three Consensus Module and Service pairs]

slide-63
SLIDE 63

[Diagram: five Clients; three Consensus Module and Service pairs; six further Services]

slide-64
SLIDE 64

In Closing

slide-65
SLIDE 65

NIO Pain

slide-66
SLIDE 66

DirectByteBuffer MappedByteBuffer MappedByteBuffer DirectByteBuffer

slide-67
SLIDE 67
slide-68
SLIDE 68

https://github.com/real-logic/aeron

Twitter: @mjpt777

“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” – Leslie Lamport

Questions?