SLIDE 1

Implementing Raft protocol by coroutines and Ktor Framework

Andrii Rodionov @AndriiRodionov

SLIDE 2

About me

  • Devoxx Ukraine organizer
  • KNight Kyiv co-organizer
  • JUG UA Leader
  • Kyiv Kotlin User Group co-leader

SLIDE 3

Devoxx Ukraine

30% discount, code: DevoxxUAKotlin

devoxx.org.ua

SLIDE 4

Agenda

  • Consensus algorithms
  • Replicated state machine
  • Raft basics
  • Raft algorithm building blocks
  • Implementation details
  • Demo
SLIDE 5

Consensus algorithms overview

  • allow a collection of machines to work as a coherent group
  • replicated state machines
    ○ servers compute identical copies of the same state
  • can survive the failure of some of their members
  • play a key role in building reliable distributed systems (ZooKeeper, HDFS, …)

SLIDE 6

Replicated State Machine

  • Replicated log ⇒ replicated state machine
    ○ All servers execute the same commands in the same order
  • Consensus module ensures proper log replication
  • System makes progress as long as any majority of servers are up
  • Failure model: fail-stop (not Byzantine), delayed/lost messages
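
To make "same commands in same order" concrete, here is a minimal sketch (not from the talk; the Command type and key-value state are illustrative assumptions) of the deterministic state machine each server feeds with committed log entries:

// Committed entries arrive in index order; identical logs plus a
// deterministic apply() give every server an identical state.
data class Command(val key: String, val value: String)

class StateMachine {
    private val state = mutableMapOf<String, String>()
    private var lastApplied = 0

    fun apply(index: Int, command: Command) {
        require(index == lastApplied + 1) { "entries must be applied in log order" }
        state[command.key] = command.value
        lastApplied = index
    }
}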
SLIDE 11

Why Raft?

  • Paxos has dominated the discussion for 25 years
    ○ Hard to understand
    ○ Not complete enough for real implementations
  • New consensus algorithm: Raft
    ○ Raft is a consensus algorithm for managing a replicated log
    ○ Diego Ongaro and John Ousterhout, Stanford University
  • Raft adoption
    ○ Docker Swarm, Consul, Kudu, RavenDB, etc.

SLIDE 12

Raft consensus algorithm

  • Leader election
    ○ Select one of the servers to act as cluster leader
    ○ Detect crashes, choose new leader
  • Log replication
    ○ Leader takes commands from clients, appends them to its log
    ○ Leader replicates its log to other servers (overwriting inconsistencies)

SLIDE 13

Server States and RPCs
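
The original slide is a figure: the Raft state diagram. As a stand-in, the states and the paper's transitions can be sketched in Kotlin (the same three states drive the channel code on SLIDE 20):

enum class State { FOLLOWER, CANDIDATE, LEADER }

// Transitions, per the Raft paper:
//   FOLLOWER  -> CANDIDATE  on election timeout (no heartbeat from a leader)
//   CANDIDATE -> LEADER     on winning votes from a majority
//   CANDIDATE -> FOLLOWER   on discovering a current leader or newer term
//   LEADER    -> FOLLOWER   on discovering a server with a higher term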

SLIDE 14

Log replication

  • Client sends command to leader
  • Leader appends command to its log
  • Leader sends AppendEntries RPCs to all followers
  • Once new entry committed:
    ○ Leader executes command in its state machine, returns result to client
    ○ Leader notifies followers of committed entries in subsequent AppendEntries RPCs
    ○ Followers execute committed commands in their state machines
  • Crashed/slow followers?
    ○ Leader retries AppendEntries RPCs until they succeed
  • Optimal performance in common case:
    ○ One successful RPC to any majority of servers
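
The "once committed" step hinges on majority replication. A hedged sketch of the leader-side rule (names follow the Raft paper, not necessarily the talk's code):

// matchIndexes holds, per server (leader included), the highest log index
// known to be replicated there. The majoritySize-th largest of these is the
// highest index stored on at least a majority, hence safe to commit.
// (Raft additionally requires that entry to be from the leader's current term.)
fun newCommitIndex(matchIndexes: List<Int>, majoritySize: Int): Int =
    matchIndexes.sortedDescending()[majoritySize - 1]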

SLIDE 15

Raft API to implement

service Raft {
  rpc Vote (RequestVoteRPC) returns (ResponseVoteRPC);
  rpc Append (RequestAppendEntriesRPC) returns (ResponseAppendEntriesRPC);
}
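
The slide shows only the service surface. Per the Raft paper, the vote pair would carry roughly the fields below (an assumption about this project's generated types; the field names come from the paper), sketched as Kotlin data classes; RequestAppendEntriesRPC analogously carries term, leaderId, prevLogIndex, prevLogTerm, entries, and leaderCommit:

data class RequestVote(
    val term: Long,           // candidate's term
    val candidateId: String,  // candidate requesting the vote
    val lastLogIndex: Long,   // index of the candidate's last log entry
    val lastLogTerm: Long     // term of the candidate's last log entry
)

data class ResponseVote(
    val term: Long,           // currentTerm, so the candidate can update itself
    val voteGranted: Boolean  // true means the candidate received this vote
)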

SLIDE 16

Raft architecture

[As a FOLLOWER]

  • waiting for heartbeats

[As a CANDIDATE]

  • ask for votes

[As a LEADER]

  • generate heartbeats
  • replicate log

[As a Raft-node]

  • process vote requests
  • process append requests
SLIDE 17

Raft algorithm building blocks

  • RPC
  • State transition
  • Resettable countdown timer
  • Retry operation
  • Leader election
  • Heartbeat
SLIDE 18

gRPC Kotlin - Coroutine based gRPC for Kotlin

class RaftServer( … ) : RaftGrpcKt.RaftImplBase(), CoroutineScope {

    // Each RPC handler launches a coroutine via async and returns a Deferred response
    fun vote(request: RequestVoteRPC): Deferred<ResponseVoteRPC> = async {
        …
    }

    fun append(request: RequestAppendEntriesRPC): Deferred<ResponseAppendEntriesRPC> = async {
        …
    }
}

https://github.com/rouzwawi/grpc-kotlin

SLIDE 19

Server States and RPCs

SLIDE 20

Channel<State> for transition between states

val channel = Channel<State>()

init {
    val waitingForHeartbeat = waitingForHeartbeatFromLeaderTimer()
    launch {
        channel.consumeEach {
            when (it) {
                FOLLOWER -> waitingForHeartbeat.reset()
                CANDIDATE -> leaderElection()
                LEADER -> appendRequestAndLeaderHeartbeat()
            }
        }
    }
}
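
Each building block presumably closes the loop by sending the next state into this channel: the heartbeat watchdog sends CANDIDATE when the leader goes quiet, and leaderElection() sends LEADER once it wins a majority (see the timer sketch after SLIDE 21).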

SLIDE 21

Resettable Countdown Timer

import java.util.Timer
import kotlin.concurrent.schedule

class ResettableCountdownTimer(private val action: suspend () -> Unit) {

    private var timer = startTimer()

    fun reset() {
        timer.cancel()
        timer = startTimer()
    }

    private fun startTimer(): Timer {
        val newTimer = Timer()
        // randomDelay(): randomized timeout (Raft randomizes election
        // timeouts so competing candidates rarely collide)
        newTimer.schedule(randomDelay()) {
            runBlocking { action() }
        }
        return newTimer
    }
}
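
SLIDE 20 builds the follower watchdog with waitingForHeartbeatFromLeaderTimer(); its body is not shown, but a plausible wiring (an assumption, not the talk's code; it would live in the same class as the channel) is:

fun waitingForHeartbeatFromLeaderTimer() =
    ResettableCountdownTimer {
        channel.send(CANDIDATE)  // no heartbeat heard within the timeout: stand for election
    }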

SLIDE 22

retry

suspend fun <T> retry(delay: Long = 5000, block: suspend () -> T): T {
    while (true) {
        try {
            return block()
        } catch (e: Exception) {
            // swallow the failure and try again after the delay
        }
        delay(delay)
    }
}

SLIDE 23

Leader election

(figure: five-server cluster, servers 1–5)
SLIDE 24

Leader election

val countDownLatch = CountDownLatch(majority)

SLIDE 25

Leader election

val countDownLatch = CountDownLatch(majority)
val job = Job()
servers.forEach { srv ->
    launch(parent = job) {
        val responseVote = retry { srv.vote( … ) }
        countDownLatch.countDown()
        ...
    }
}
countDownLatch.await(electionTimeout, TimeUnit.SECONDS)

SLIDE 26

Leader election (with old coroutines)

val countDownLatch = CountDownLatch(majority)
val job = Job()
servers.forEach { srv ->
    launch(parent = job) {
        val responseVote = retry { srv.vote( … ) }
        countDownLatch.countDown()
        ...
    }
}
countDownLatch.await(electionTimeout, TimeUnit.SECONDS)
job.cancelAndJoin()

SLIDE 27

Leader election (with structured concurrency)

val countDownLatch = CountDownLatch(majority)
coroutineScope {
    servers.forEach {
        launch {
            val responseVote = retry { it.vote( … ) }
            countDownLatch.countDown()
            ...
        }
    }
    countDownLatch.await(electionTimeout, TimeUnit.SECONDS)
    coroutineContext.cancelChildren()
}
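
With structured concurrency the explicit Job bookkeeping disappears: every vote request is a child of the coroutineScope, so once the latch opens (majority reached) or the election timeout expires, cancelChildren() aborts the still-pending requests in one call.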

SLIDE 28

Heartbeat via fixedRateTimer

fixedRateTimer(period = 2000) {
    runBlocking {
        servers.forEach {
            launch {
                try {
                    val response = it.append( … )
                    ...
                } catch (e: Exception) {
                    // unreachable follower: skip it, the next heartbeat retries
                }
            }
        }
    }
}
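
fixedRateTimer fires on a plain java.util.Timer thread, so runBlocking bridges into coroutine land; launching one coroutine per follower means a slow or crashed peer does not delay heartbeats to the others, and failures can simply be dropped because the next tick, two seconds later, retries anyway.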

SLIDE 29

Ktor

private fun ktorServer() {
    val server = embeddedServer(Netty, port = 7000) {
        routing {
            get("/") {
                call.respondText("Server $id log ${entries()}", Text.Plain)
            }
            get("/cmd/{command}") {
                appendCommand(call.parameters["command"])
                call.respondText("Server $id log ${entries()}", Text.Plain)
            }
        }
    }
    server.start(wait = false)
}
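
Presumably this is how the demo is driven: GET /cmd/{command} appends a command through the node, and GET / prints the node's id and replicated log, so one can issue a command to the leader and then poll each follower's root endpoint to watch the logs converge.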

SLIDE 30

Demo

SLIDE 31

Thank you!

Questions? @AndriiRodionov

SLIDE 32

Literature

https://raft.github.io