

SLIDE 1

A SOCIAL NETWORK ON STREAMS

DRIVETRIBE ENGINEERING

SLIDE 2

DRIVETRIBE

▸ The world’s biggest motoring community.
▸ A social platform for petrolheads.
▸ By Clarkson, Hammond and May.

SLIDE 3

DRIVETRIBE

▸ A content destination at the core.
▸ Users consume feeds of content: images, videos, long-form articles.
▸ Content is organised in homogeneous categories called “tribes”.
▸ Different users have different interests, and the tribe model lets them mix and match at will.

SLIDE 4

DRIVETRIBE ARTICLE

▸ Single article by James May.
▸ Contains a plethora of content and engagement information.
▸ What do we need to compute an aggregate like this?

SLIDE 5

DRIVETRIBE ARTICLE

▸ getUser(id: Id[User])

SLIDE 6

DRIVETRIBE ARTICLE

▸ getUser(id: Id[User])
▸ getTribe(id: Id[Tribe])

SLIDE 7

DRIVETRIBE ARTICLE

▸ getUser(id: Id[User])
▸ getTribe(id: Id[Tribe])
▸ getArticle(id: Id[Article])

SLIDE 8

DRIVETRIBE ARTICLE

▸ getUser(id: Id[User])
▸ getTribe(id: Id[Tribe])
▸ getArticle(id: Id[Article])
▸ countViews(id: Id[Article])

SLIDE 9

DRIVETRIBE ARTICLE

▸ getUser(id: Id[User])
▸ getTribe(id: Id[Tribe])
▸ getArticle(id: Id[Article])
▸ countViews(id: Id[Article])
▸ countComments(id: Id[Article])

SLIDE 10

DRIVETRIBE ARTICLE

▸ getUser(id: Id[User])
▸ getTribe(id: Id[Tribe])
▸ getArticle(id: Id[Article])
▸ countViews(id: Id[Article])
▸ countComments(id: Id[Article])
▸ countBumps(id: Id[Article])

SLIDE 11

DRIVETRIBE FEED OF ARTICLES

▸ rankArticles(forUserId).flatMap { a => … }
▸ getUser(id: Id[User])
▸ getTribe(id: Id[Tribe])
▸ getArticle(id: Id[Article])
▸ countViews(id: Id[Article])
▸ …

SLIDE 12

QUINTESSENTIAL PREREQUISITES

▸ Scalable. Jeremy Clarkson has 7.2M Twitter followers. We cannot really hack it and worry about scale later.
▸ Performant. Low latency is key, and mobile networks add quite a bit of it.
▸ Flexible. Almost nobody gets it right the first time around. The ability to iterate is paramount.
▸ Maintainable. Spaghetti code compounds like interest on debt.

SLIDE 13

THREE TIER APPROACH

▸ Clients interact with a fleet of stateless servers (aka “API” servers or the “Backend”) via HTTP (which is stateless).
▸ Global shared mutable state (aka the Database).
▸ Starting simple: store data in a DB.
▸ Starting simple: compute the aggregated views on the fly.

SLIDE 14

DRIVETRIBE ARTICLE

▸ getUser(id: Id[User])
▸ getTribe(id: Id[Tribe])
▸ getArticle(id: Id[Article])
▸ countComments(id: Id[Article])
▸ countBumps(id: Id[Article])
▸ countViews(id: Id[Article])

SLIDE 15

READ TIME AGGREGATION

▸ (6 queries per item) × (Y items per page).
▸ Cost of ranking and personalisation.
▸ Quite some work at read time.
▸ Slow. Not really Performant.
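The fan-out above can be sketched in a few lines of Scala (in-memory stubs with hypothetical names standing in for the real lookups; only the query counting matters here):

```scala
// Hypothetical stand-ins for the six per-article lookups from the slides.
final case class ArticleView(author: String, tribe: String, body: String,
                             views: Long, comments: Long, bumps: Long)

var queriesIssued = 0 // instrument how many lookups one page render costs

def query[A](result: A): A = { queriesIssued += 1; result }

def renderArticle(id: String): ArticleView = ArticleView(
  author   = query("james-may"),          // getUser
  tribe    = query("tribe-1"),            // getTribe
  body     = query(s"article-$id-body"),  // getArticle
  views    = query(1000L),                // countViews
  comments = query(42L),                  // countComments
  bumps    = query(7L)                    // countBumps
)

// A feed page of 20 items issues 6 × 20 = 120 queries at read time.
val page = (1 to 20).map(i => renderArticle(i.toString))
```

Each render multiplies the per-item query cost by the page size, which is what makes this approach slow.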

SLIDE 16

WRITE TIME AGGREGATION

▸ Compute the aggregation at write time.
▸ Then a single query can fetch all the views at once. That scales.
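A minimal sketch of the write-time alternative, with a mutable map standing in for the database (illustrative only, not the production schema): every write updates the denormalised stats row, so a read is a single lookup.

```scala
import scala.collection.mutable

// Denormalised article stats, maintained at write time.
final case class ArticleStats(views: Long, comments: Long, bumps: Long)

val statsTable =
  mutable.Map.empty[String, ArticleStats].withDefaultValue(ArticleStats(0, 0, 0))

def recordView(id: String): Unit =
  { val s = statsTable(id); statsTable(id) = s.copy(views = s.views + 1) }
def recordComment(id: String): Unit =
  { val s = statsTable(id); statsTable(id) = s.copy(comments = s.comments + 1) }
def recordBump(id: String): Unit =
  { val s = statsTable(id); statsTable(id) = s.copy(bumps = s.bumps + 1) }

// Read time: one lookup fetches the whole aggregate.
def getStats(id: String): ArticleStats = statsTable(id)

recordView("a1"); recordView("a1"); recordBump("a1")
```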


SLIDE 20

WRITE TIME AGGREGATION - EVOLUTION

▸ sendNotification

SLIDE 21

WRITE TIME AGGREGATION - EVOLUTION

▸ sendNotification
▸ updateUserStats

SLIDE 22

WRITE TIME AGGREGATION - EVOLUTION

▸ sendNotification
▸ updateUserStats
▸ What if we have a cache?
▸ Or a different database for search?

SLIDE 23

WRITE TIME AGGREGATION

▸ A simple user action triggers a potentially endless sequence of side effects.
▸ Most of them need network IO.
▸ Many of them can fail.

SLIDE 24

ATOMICITY

▸ What happens if one of them fails? What happens if the server fails in the middle?
▸ We may have transaction support in the DB, but what about external systems?
▸ Inconsistent.

SLIDE 25

CONCURRENCY

▸ Concurrent mutations on a global shared state entail race conditions.
▸ State mutations are destructive and cannot be (easily) undone.
▸ A bug can corrupt the data permanently.

SLIDE 26

ITALIAN PASTA

▸ Model evolution becomes difficult. Reads and writes are tightly coupled.
▸ Migrations are scary.
▸ This is neither Extensible nor Maintainable.

SLIDE 27

DIFFERENT APPROACH

▸ Let’s take a step back and try to decouple things.
▸ Clients send events to the API: “John liked Jeremy’s post”, “Maria updated her profile”.
▸ Events are immutable. They capture a user action at some point in time.
▸ Every application state instance can be modelled as a projection of those events.

SLIDE 28

▸ Persisting those yields an append-only log of events.
▸ An event reducer can then produce application state instances.
▸ Even retroactively. The log is immutable.
▸ This is event sourcing.
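The idea can be sketched in plain Scala (illustrative event types, not DriveTribe’s actual model): events are only ever appended, and reducers fold the log into state instances; a reducer added later simply replays the same log.

```scala
sealed trait Event
final case class Bumped(articleId: String, userId: String)  extends Event
final case class Commented(articleId: String, text: String) extends Event

// The append-only log: events are only ever added, never changed.
val log: List[Event] = List(
  Bumped("a1", "u1"),
  Commented("a1", "nice"),
  Bumped("a1", "u2")
)

// An event reducer: project the log into one particular state instance.
def bumpCount(log: List[Event]): Map[String, Int] =
  log.foldLeft(Map.empty[String, Int]) {
    case (acc, Bumped(id, _)) => acc.updated(id, acc.getOrElse(id, 0) + 1)
    case (acc, _)             => acc
  }

// A reducer written later can replay the same log retroactively.
def commentCount(log: List[Event]): Map[String, Int] =
  log.foldLeft(Map.empty[String, Int]) {
    case (acc, Commented(id, _)) => acc.updated(id, acc.getOrElse(id, 0) + 1)
    case (acc, _)                => acc
  }
```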

SLIDE 29

▸ The write-time model (command model) and the read-time model (query model) can be separated.
▸ Decoupling the two models opens the door to more efficient, custom implementations.
▸ This is known as Command Query Responsibility Segregation, aka CQRS.
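A toy CQRS split under the same assumptions (hypothetical names, in-memory stores): the command side only appends events, while the query side serves a separately maintained read model.

```scala
import scala.collection.mutable

final case class LikeEvent(postId: String, userId: String)

// Command model: the append-only event log.
val eventLog = mutable.ArrayBuffer.empty[LikeEvent]
// Query model: a denormalised view, maintained independently.
val likeView = mutable.Map.empty[String, Int].withDefaultValue(0)

def handleLikeCommand(postId: String, userId: String): Unit = {
  val e = LikeEvent(postId, userId)
  eventLog += e            // write side: record the fact
  likeView(e.postId) += 1  // projection: keep the read model up to date
}

// Read side: one cheap lookup, no knowledge of how events are stored.
def likes(postId: String): Int = likeView(postId)

handleLikeCommand("p1", "u1")
handleLikeCommand("p1", "u2")
```

Because the two models are decoupled, the read model could be swapped for a cache or a search index without touching the write path.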

SLIDE 30

EVENT SOURCING APPROACH

[diagram: “Like!!”]

SLIDE 31

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event]

SLIDE 32

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event → ArticleStatsReducer]

SLIDE 33

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event → ArticleStatsReducer, NotificationReducer]

SLIDE 34

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event → ArticleStatsReducer, NotificationReducer, UserStatsReducer]

SLIDE 35

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event → ArticleStatsReducer, NotificationReducer, UserStatsReducer, and so on…]

SLIDE 36

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event → ArticleStatsReducer, NotificationReducer, UserStatsReducer, … (sky is the limit) → Get Articles]

SLIDE 37

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event → ArticleStatsReducer, NotificationReducer, UserStatsReducer, … (sky is the limit) → Get Articles]

Extensibility? Maintainability?

SLIDE 38

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event → ArticleStatsReducer, NotificationReducer, UserStatsReducer, … (sky is the limit) → Get Articles]

Performance?

SLIDE 39

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event → ArticleStatsReducer, NotificationReducer, UserStatsReducer, … (sky is the limit) → Get Articles]

Atomicity?

SLIDE 40

EVENT SOURCING APPROACH

[diagram: “Like!!” → Store Like Event → ArticleStatsReducer, NotificationReducer, UserStatsReducer, … (sky is the limit) → Get Articles]

Concurrency?

SLIDE 41

IMPLEMENTATION?

SLIDE 42

WE NEED A LOG

SLIDE 43

APACHE KAFKA

▸ Distributed, fault-tolerant, durable and fast append-only log.
▸ Can scale to thousands of nodes, producers and consumers.
▸ Each business event type can be stored in its own topic.
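The topic-per-event-type layout can be modelled with a tiny in-memory stand-in (this is not the Kafka client API, just the append-only-log idea the slide describes):

```scala
import scala.collection.mutable

// In-memory stand-in for a Kafka topic: a named, append-only sequence.
final class Topic[E](val name: String) {
  private val buf = mutable.ArrayBuffer.empty[E]
  def append(e: E): Unit = buf += e                  // producers only ever append
  def readFrom(offset: Int): Seq[E] = buf.drop(offset).toSeq // consumers track offsets
}

// One topic per business event type.
final case class PostBumped(postId: String, userId: String)
final case class ProfileUpdated(userId: String)

val bumps    = new Topic[PostBumped]("post-bumped")
val profiles = new Topic[ProfileUpdated]("profile-updated")

bumps.append(PostBumped("p1", "u1"))
bumps.append(PostBumped("p1", "u2"))
profiles.append(ProfileUpdated("u1"))
```

Consumers hold their own offsets, so several independent reducers can read the same topic at their own pace, which is what the real Kafka consumer-group model provides.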

SLIDE 44

WE NEED A STREAM PROCESSOR

SLIDE 45

APACHE FLINK

▸ Scalable, performant, mature.
▸ Elegant high-level APIs in Scala.
▸ Powerful low-level APIs for advanced tuning.
▸ Multiple battle-tested integrations.
▸ Very nice and active community.

SLIDE 46

WE NEED A DATASTORE

SLIDE 47

ELASTICSEARCH

▸ Horizontally scalable document store.
▸ Rich and expressive query language.
▸ Dispensable. Can be replaced.

SLIDE 48

WE NEED AN API

SLIDE 49

AKKA HTTP

▸ Asynchronous web application framework.
▸ Written in Scala.
▸ Very expressive routing DSL.
▸ Any modern web application framework would do.

SLIDE 50

EVENT SOURCING IN PRACTICE

[diagram: store raw events → consume raw events → produce aggregated views → retrieve aggregated views]

SLIDE 51

EVENT SOURCING IN PRACTICE

[diagram: store raw events → consume raw events → produce aggregated views → retrieve aggregated views]

Stateful

SLIDE 52

EVENT SOURCING IN PRACTICE

[diagram: store raw events → consume raw events → produce aggregated views → retrieve aggregated views]

SLIDE 53

BLUE/GREEN APPROACH

[diagram: MIRROR]

SLIDE 54

A REAL WORLD EXAMPLE

SLIDE 55

COUNTING BUMPS

▸ Thousands of people like the fact that Jeremy Clarkson is a really tall guy.
▸ Users can “bump” a post if they like it.

SLIDE 56

EVENT SOURCING IN PRACTICE

[diagram: store raw events → consume raw events → produce aggregated views → retrieve aggregated views]

SLIDE 57

COUNTING BUMPS

SLIDE 58

COUNTING BUMPS - FIRST ATTEMPT

SLIDE 59

COUNTING BUMPS - FIRST ATTEMPT

SLIDE 60

COUNTING BUMPS - FIRST ATTEMPT

▸ Use Flink with at-least-once semantics.

SLIDE 61

COUNTING BUMPS - FIRST ATTEMPT

▸ Use Flink with at-least-once semantics.
▸ Our system is eventually consistent.
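A sketch of why at-least-once delivery matters for counting (hypothetical Bump type): a plain increment counts a redelivered event twice, while a set-based, idempotent counter does not.

```scala
final case class Bump(postId: String, userId: String)

// Naive counter: one increment per delivered event.
def naiveCount(delivered: Seq[Bump]): Int = delivered.size

// Idempotent counter: remember who bumped, so redelivery is harmless.
def idempotentCount(delivered: Seq[Bump]): Int =
  delivered.map(_.userId).toSet.size

val once = Seq(Bump("p1", "u1"), Bump("p1", "u2"))
// At-least-once delivery: u2's bump arrives twice after a failure/replay.
val redelivered = once :+ Bump("p1", "u2")
```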

slide-62
SLIDE 62

DRIVETRIBE

WHAT DO WE KNOW ABOUT OUT COUNTER?

62

▸ we know we’re doing some kind of combine operation over States

SLIDE 63

WHAT DO WE KNOW ABOUT OUR COUNTER?

▸ We know we’re doing some kind of combine operation over States.
▸ We want our counter to be idempotent: a |+| a === a

SLIDE 64

WHAT DO WE KNOW ABOUT OUR COUNTER?

▸ We know we’re doing some kind of combine operation over States.
▸ We want our counter to be idempotent: a |+| a === a
▸ We also want our counter to be associative: a |+| (b |+| c) === (a |+| b) |+| c

SLIDE 65

BAND ALGEBRA

▸ Closed
▸ Idempotent
▸ Associative
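A minimal Scala encoding of the three properties (a sketch; as the following slides note, one would normally pull such an algebra, and its law tests, from a library):

```scala
// A band: a closed, associative, idempotent combine operation.
trait Band[A] {
  def combine(x: A, y: A): A // written |+| on the slides
}

// Sets under union form a band: union is associative, and a ∪ a == a.
implicit def setBand[A]: Band[Set[A]] = new Band[Set[A]] {
  def combine(x: Set[A], y: Set[A]): Set[A] = x union y
}

// Spot-check the laws for given values (a law-testing library would
// generate many random cases instead).
def checkLaws[A](a: A, b: A, c: A)(implicit B: Band[A]): Boolean = {
  import B.combine
  combine(a, a) == a &&                                  // idempotence
  combine(a, combine(b, c)) == combine(combine(a, b), c) // associativity
}
```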

SLIDE 66

COUNTING BUMPS - SECOND ATTEMPT

SLIDE 67

COUNTING BUMPS - SECOND ATTEMPT

▸ If all components of a case class have a Band, then so does the case class.
▸ We would normally bring in a Set implementation from a library.
▸ Normally that library would have a law-testing module.
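The component-wise rule can be sketched like this (hypothetical ArticleStats with a hand-written |+|; a library would typically derive it): because each field is a Set combined by union, delivering the same partial state twice changes nothing, and the bump count is just the size of the set.

```scala
// If every field has a band, combining field-wise gives the case class a band.
final case class ArticleStats(bumpedBy: Set[String], commentedBy: Set[String]) {
  def |+|(that: ArticleStats): ArticleStats =
    ArticleStats(bumpedBy union that.bumpedBy, commentedBy union that.commentedBy)
  def bumpCount: Int = bumpedBy.size
}

val s1 = ArticleStats(Set("u1"), Set.empty)
val s2 = ArticleStats(Set("u2"), Set("u3"))

// s2 delivered twice (at-least-once): idempotence absorbs the duplicate.
val merged = s1 |+| s2 |+| s2
```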

SLIDE 68

COUNTING BUMPS - PUTTING IT TOGETHER

SLIDE 69

OTHER ALGEBRAS WE USE

▸ Adding events: Semigroup/Monoid
▸ Duplicate events: Band
▸ Out-of-order and duplicate events: Semilattice
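For the last case, sets under union also form a semilattice: union is commutative as well as idempotent and associative, so events may arrive in any order and any number of times. A quick check of the three laws:

```scala
// A semilattice combine: associative, commutative, idempotent.
def combine[A](x: Set[A], y: Set[A]): Set[A] = x union y

val a = Set(1, 2); val b = Set(2, 3); val c = Set(4)

val associative = combine(a, combine(b, c)) == combine(combine(a, b), c)
val commutative = combine(a, b) == combine(b, a) // arrival order is irrelevant
val idempotent  = combine(a, a) == a             // duplicates are absorbed
```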

SLIDE 70

FIN