Disruptive Development using Reactive Event-Sourced Systems
Jan Ypma (jyp@tradeshift.com)


SLIDE 1

Disruptive Development

using Reactive Event-Sourced Systems
Jan Ypma, jyp@tradeshift.com

SLIDE 2

SLIDE 3

Agenda

- A bit of Tradeshift history
- Typical sources of failures
- Event sourcing
- Actors, Akka, and Clustering
- Example use case: collaboration

SLIDE 4

Tradeshift in 2011

(Diagram: Frontend, Backend ×10, PostgreSQL)

Two major software components:
- the frontend
- the backend
Fewer than 10 developers

SLIDE 5

Tradeshift in 2016

(Diagram: Frontend, Backend, SFTP, Public API, App Backend (old), App Backend (new), Workflow, Payments (old), Payments (new), Conversions, others... ×150)

- 30 deployed components (and growing)
- 150 developers
- 250,000 LoC in the backend

SLIDE 6

Scaling of Tradeshift's systems

- Moore's law applies to AWS
- Single point of not-quite-failing-often-enough
- 2016 directive: all new components must be clustered
  - Yeah, what about the 30-ish existing ones?
- A new architecture is needed

SLIDE 7

Scaling of Tradeshift's development process

2011: "We're a Java shop"

SLIDE 8

Scaling of Tradeshift's development process

2011: "We're a Java shop"
2016: Not really, at least not anymore
- Groovy and Grails
- Python, Go, Ruby for infrastructure
- Crazy JavaScript people
- Scala
- But still, mostly Java
Empower teams to pick their own language and frameworks

SLIDE 9

Typical sources of failures

(Diagram: Frontend, Backend, Public API, App Backend (old))

We're down!
- We overloaded the database
- which caused the backend to respond slowly
- which caused the frontend to respond slowly
- which caused our users' web browsers to respond slowly
- which caused our users to reload their page
- GOTO 10

SLIDE 10

Typical sources of failures

Enter the buzzwords:
- Let it crash [2003, Armstrong]
- Micro-services [2005, Rodgers]
- Self-contained systems [2015, scs-architecture.org]

SLIDE 11

Self-contained systems

(Diagram: Service 1, Service 2)

- No outgoing calls while handling an incoming request (except to our own databases)
- All inter-service communication must be asynchronous
  - This implies data replication
- No single points of failure
  - System must be clustered
- Design must trivially scale to 10x the expected load

SLIDE 12

Event sourcing

User #1:
- Created
- Name changed to Alice
- Pet added: Gary the Goldfish
- Pet removed: Gary the Goldfish

User #2:
- Created
- Name changed to Bob
- Pet added: Charlie the Cat

- The system considers an append-only event journal the only source of truth
- An aggregate is one unit of information to which (and only to which) an event atomically applies
- Events have a guaranteed order, but only within an aggregate
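The replay idea above can be sketched in plain Java. The `UserAggregate` class, its event types, and the `replay` helper are illustrative names (not Akka APIs): an aggregate's current state is nothing more than a left fold over its journal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical events for the User aggregate on this slide
interface UserEvent {}
record Created() implements UserEvent {}
record NameChanged(String name) implements UserEvent {}
record PetAdded(String pet) implements UserEvent {}
record PetRemoved(String pet) implements UserEvent {}

public class UserAggregate {
    private String name = null;
    private final List<String> pets = new ArrayList<>();

    // Applying one event is the only way state ever changes
    public void apply(UserEvent evt) {
        if (evt instanceof NameChanged e) {
            name = e.name();
        } else if (evt instanceof PetAdded e) {
            pets.add(e.pet());
        } else if (evt instanceof PetRemoved e) {
            pets.remove(e.pet());
        }
    }

    // Current state = replay (left fold) of the journal, in order
    public static UserAggregate replay(List<UserEvent> journal) {
        UserAggregate user = new UserAggregate();
        journal.forEach(user::apply);
        return user;
    }

    public String getName() { return name; }
    public List<String> getPets() { return pets; }

    public static void main(String[] args) {
        UserAggregate alice = replay(List.of(
            new Created(),
            new NameChanged("Alice"),
            new PetAdded("Gary the Goldfish"),
            new PetRemoved("Gary the Goldfish")));
        System.out.println(alice.getName() + " has " + alice.getPets().size() + " pets");
    }
}
```

Because the journal is ordered per aggregate, replaying User #1's events above always yields the same state, regardless of what User #2 is doing concurrently.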

SLIDE 13

Event sourcing

- Nice scalability properties
  - Each aggregate can process changes independently
  - All information that spans more than one aggregate is materialized using event listeners
- Traditionally only applied inside a system
  - Synchronous APIs only ("get customer history")
- Why not expose the event stream itself?
  - Eventual consistency
  - Latency implications
  - Security implications
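Materializing information that spans more than one aggregate can be sketched as an event listener folding every aggregate's events into a query-friendly view. The `NameIndex` view and the `UserRenamed` event are illustrative names, not part of any real API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical event type; in the real system this would come off the journal
record UserRenamed(String name) {}

// A read-side "materialized view" spanning many aggregates:
// an event listener folds each aggregate's events into one map
public class NameIndex {
    private final Map<String, String> namesById = new HashMap<>();

    // Called for every event, in journal order per aggregate
    public void onEvent(String aggregateId, Object event) {
        if (event instanceof UserRenamed e) {
            namesById.put(aggregateId, e.name());
        }
        // events this view doesn't care about are simply ignored
    }

    public String nameOf(String aggregateId) {
        return namesById.get(aggregateId);
    }

    public static void main(String[] args) {
        NameIndex idx = new NameIndex();
        idx.onEvent("user-1", new UserRenamed("Alice"));
        idx.onEvent("user-2", new UserRenamed("Bob"));
        System.out.println(idx.nameOf("user-1") + " & " + idx.nameOf("user-2"));
    }
}
```

Such a view is eventually consistent by construction: it lags the journal by however far behind the listener is.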

SLIDE 14

Implementation

SLIDE 15

The actor model

An actor is an entity that responds only to messages, by:
- sending messages to other actors
- creating other (child) actors
- adjusting its behaviour

Akka is a toolkit for writing actors in Java:
- An actor is a normal Java class that extends UntypedActor or AbstractActor
- A message is an immutable, serializable Java class
- A parent actor is the supervisor of its child actors. On child actor failure, the parent decides what to do:
  - Restart the child
  - Stop the child
  - Escalate

SLIDE 16

Actor ping pong

public class PongActor extends UntypedActor {
    public void onReceive(Object message) {
        if (message instanceof String) {
            System.out.println("In PongActor - received message: " + message);
            getSender().tell("pong", getSelf());
        }
    }
}

SLIDE 17

Actor ping pong

public class Initialize {}

public class PingActor extends AbstractActor {
    private int counter = 0;
    private ActorRef pongActor = getContext().actorOf(
        Props.create(PongActor.class), "pongActor");

    {
        receive(ReceiveBuilder
            .match(Initialize.class, msg -> {
                System.out.println("In PingActor - starting ping-pong");
                pongActor.tell("ping", getSelf());
            })
            .match(String.class, msg -> {
                System.out.println("In PingActor - received message: " + msg);
                counter += 1;
                if (counter == 3) {
                    getContext().system().shutdown();
                } else {
                    getSender().tell("ping", getSelf());
                }
            })
            .build());
    }
}

SLIDE 18

Actor ping pong

public static void main(String[] args) {
    ActorSystem system = ActorSystem.create();
    ActorRef pingActor = system.actorOf(Props.create(PingActor.class));
    pingActor.tell(new Initialize(), ActorRef.noSender());
}

Output:

In PingActor - starting ping-pong
In PongActor - received message: ping
In PingActor - received message: pong
In PongActor - received message: ping
In PingActor - received message: pong
In PongActor - received message: ping
In PingActor - received message: pong

SLIDE 19

Akka persistence

- Framework to do event sourcing using actors
- Persistence plugins for LevelDB, Cassandra, Kafka, ...
- Each PersistentActor has a String identifier, under which its events are stored

public class ChatActor extends AbstractPersistentActor {
    private final List<String> messages = new ArrayList<>();

    @Override
    public String persistenceId() { return "chat-1"; }

    private void postMessage(String msg) {
        persist(msg, evt -> {
            messages.add(msg);
            sender().tell(Done.getInstance(), self());
        });
    }

    private void getMessageList() {
        sender().tell(new ArrayList<>(messages), self());
    }
    // ...
}

SLIDE 20

Akka persistence

public class ChatActor extends AbstractPersistentActor {
    private final List<String> messages = new ArrayList<>();

    private void postMessage(String msg) { /* ... */ }
    private void getMessageList() { /* ... */ }

    @Override
    public String persistenceId() { return "chat-1"; }

    @Override
    public PartialFunction<Object, BoxedUnit> receiveRecover() {
        return ReceiveBuilder
            .match(String.class, messages::add)
            .build();
    }

    @Override
    public PartialFunction<Object, BoxedUnit> receiveCommand() {
        return ReceiveBuilder
            .matchEquals("/list", msg -> getMessageList())
            .match(String.class, this::postMessage)
            .build();
    }
}

SLIDE 21

Akka remoting and clustering

- Transparently lets actors communicate between systems
- An ActorRef can point to a remote actor
- Messages must be serializable (using configurable mechanisms)

akka {
  actor {
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      hostname = "127.0.0.1"
      port = 2552
    }
  }
  cluster {
    seed-nodes = [
      "akka.tcp://ClusterSystem@127.0.0.1:2551",
      "akka.tcp://ClusterSystem@127.0.0.1:2552"]
  }
}

SLIDE 22

Akka cluster sharding

- Dynamically distributes a group of actors across an Akka cluster
- A MessageExtractor informs cluster sharding where a particular message should go

class ChatMessage {
    UUID conversation;
    String msg;
}

class MyMessageExtractor implements MessageExtractor {
    private final int numberOfShards = 256;

    @Override
    public String entityId(Object command) {
        return ((ChatMessage) command).conversation.toString();
    }

    @Override
    public String shardId(Object command) {
        // hashCode() can be negative, so take the absolute value first
        return String.valueOf(Math.abs(entityId(command).hashCode()) % numberOfShards);
    }

    @Override
    public Object entityMessage(Object command) {
        return ((ChatMessage) command).msg;
    }
}

SLIDE 23

Akka cluster sharding

- A ShardRegion proxy sits between the client and the real (remote) persistent actor
- Persistent actor names will be their persistence id

public class ChatActor extends AbstractPersistentActor {
    // ...
    @Override
    public String persistenceId() {
        return getSelf().path().name();
    }
}

ActorRef proxy = ClusterSharding.get(system).start(
    "conversations",
    Props.create(ChatActor.class),
    ClusterShardingSettings.create(system),
    new MyMessageExtractor());

proxy.tell(new ChatMessage(
    UUID.fromString("67c67d28-4719-4bf9-bfe6-3944ed961a60"),
    "hello!"), ActorRef.noSender());

SLIDE 24

Putting it all together

- 2015: Let's try it out first
- 2016: Collaboration in production
  - Real-time text message exchange between employees
  - Text interface to an automated travel agent
  - In development: documents, FTP
- Stuff that works well for us: https://github.com/Tradeshift/ts-reaktive/
  - ts-reaktive-actors: persistent actor base classes with reasonable defaults, and an HTTP API for the event journal
  - ts-reaktive-marshal: non-blocking streaming marshalling framework for Akka Streams
  - ts-reaktive-replication: master-slave replication across data centers for persistent actors

SLIDE 25

A bit of refactoring

Introduce a single base class for all commands

public abstract class ChatCommand {}

public class GetMessageList extends ChatCommand {}

public class PostMessage extends ChatCommand {
    private final String message;

    public PostMessage(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}

SLIDE 26

A bit of refactoring

Introduce a single base class for all events

public abstract class ChatEvent {}

public class MessagePosted extends ChatEvent {
    private final String message;

    public MessagePosted(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}

SLIDE 27

Independent state class

public class ChatState extends AbstractState<ChatEvent, ChatState> {
    public static final ChatState EMPTY = new ChatState(Vector.empty());

    private final Seq<String> messages;

    private ChatState(Seq<String> messages) {
        this.messages = messages;
    }

    @Override
    public ChatState apply(ChatEvent event) {
        if (event instanceof MessagePosted) {
            return new ChatState(messages.append(
                MessagePosted.class.cast(event).getMessage()));
        } else {
            return this;
        }
    }
}

Plain old Java class, hence easily unit testable (compared to an actor)

SLIDE 28

Stateful persistent actor

public class ChatActor extends AbstractStatefulPersistentActor<
        ChatCommand, ChatEvent, ChatState> {

    public static abstract class Handler extends AbstractCommandHandler<
            ChatCommand, ChatEvent, ChatState> { }

    @Override
    protected ChatState initialState() {
        return ChatState.EMPTY;
    }

    @Override
    protected PartialFunction<ChatCommand, Handler> applyCommand() {
        return new PFBuilder<ChatCommand, Handler>()
            .match(ChatCommand.GetMessageList.class, cmd ->
                new GetMessageListHandler(getState(), cmd))
            .match(ChatCommand.PostMessage.class, cmd ->
                new PostMessageHandler(getState(), cmd))
            .build();
    }
}

Business logic pushed out to ChatState and *Handler

SLIDE 29

Command handler example

public class PostMessageHandler extends ChatActor.Handler {
    @Override
    public Seq<ChatEvent> getEventsToEmit() {
        return Vector.of(new ChatEvent.MessagePosted(
            ChatCommand.PostMessage.class.cast(cmd).getMessage()));
    }

    @Override
    public Object getReply(Seq<ChatEvent> emittedEvents, long lastSequenceNr) {
        return Done.getInstance();
    }
}

- Plain old Java class, hence easily unit testable (compared to an actor)
- Behaviour of different commands is separated into different classes

SLIDE 30

Event journal over HTTP

- We wanted easy consumption of the event log by other systems
- HTTP chunked encoding (1 chunk per event), without completion

GET /events?since=1473689920000

200 OK
Content-Type: application/protobuf
Transfer-Encoding: chunked

14
(0x14 = 20 bytes with the first protobuf event)
11
(0x11 = 17 bytes with the second protobuf event)
(TCP stream stalls here)

Additional events can arrive in real time
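What a consumer of such a never-completing response does can be sketched in plain Java. This is an illustrative parser, not the actual Tradeshift client; note that chunk sizes on the wire are hexadecimal per the HTTP specification, and a real consumer would keep the connection open and block waiting for further chunks.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: read an HTTP chunked body, one event per chunk
public class ChunkedReader {
    public static List<byte[]> readChunks(InputStream in) throws IOException {
        List<byte[]> chunks = new ArrayList<>();
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int size = Integer.parseInt(line.trim(), 16); // chunk size is hex
            if (size == 0) break;                         // terminating chunk
            chunks.add(in.readNBytes(size));              // one protobuf event
            readLine(in);                                 // CRLF after chunk data
        }
        return chunks;
    }

    // Reads up to '\n', dropping '\r'; returns null at end of stream
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c != '\r') sb.append((char) c);
        }
        return (c == -1 && sb.length() == 0) ? null : sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "14" hex = 20 bytes of event data, then the terminating "0" chunk
        byte[] body = "14\r\n01234567890123456789\r\n0\r\n\r\n".getBytes();
        for (byte[] chunk : readChunks(new ByteArrayInputStream(body))) {
            System.out.println(chunk.length + " bytes");
        }
    }
}
```

In the real system the stream never sends the terminating `0` chunk, which is exactly how "additional events can arrive in real time" over a plain HTTP connection.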

SLIDE 31

Use case: collaboration

(Diagram: Presentation backend (nodejs), Workflow (java), Content backend (java + akka, Cassandra journal, PostgreSQL materialization))

- Content backend has an API to add messages to conversations
  - Messages go into the event journal
  - Journal is queryable over HTTP
- Presentation backend listens to the event stream
  - Materializes it into views that are UI-dependent
  - Can combine other sources as well
- Browser talks to both the Presentation and Content backends
  - A web socket stream informs the browser of incoming messages

SLIDE 32

Wrap-up

- Scalable systems: check
- Scalable development: check
- We're not quite there yet
  - Akka, Cassandra and the reaktive combo are in active development
  - Attitude of "I've done Spring for 10+ years successfully, why would I learn this?"
  - The proof is in more pudding
- Want to get involved?
  - Get: http://akka.io/
  - Read: http://doc.akka.io/docs/akka/current/java.html
  - Chat: https://gitter.im/akka/akka
  - Hack: https://github.com/Tradeshift/ts-reaktive/ and https://github.com/jypma/ts-reaktive-examples/

SLIDE 33

SLIDE 34

Extra slides

SLIDE 35

Introducing Akka Streams

- A Graph is a blueprint for a closed, finite network of stages, which communicate by streaming elements
- A GraphStage<S extends Shape> is one processing stage within a graph, taking elements in through zero or more Inlets, and emitting them through Outlets
  - It's completely up to the stage when and how to respond to arriving elements
  - All built-in graph stages embrace backpressure and bounded processing
- Most-used graph stages:
  - Source<T, M> has one outlet of type T
  - Sink<T, M> has one inlet of type T
  - Flow<A, B, M> has one inlet of type A and one outlet of type B
  - RunnableGraph<M> has no inlets or outlets
- Reactive streams:
  - Akka Streams is a Reactive Streams implementation (just like RxJava and others)
  - You typically don't interact in terms of Publisher and Subscriber directly

SLIDE 36

Hello, streams

final ActorSystem system = ActorSystem.create("QuickStart");
final Materializer materializer = ActorMaterializer.create(system);

final Source<Integer, NotUsed> numbers = Source.range(1, 100);
final Sink<Integer, CompletionStage<Done>> print =
    Sink.foreach(i -> System.out.println(i));

final CompletionStage<Done> done = numbers.runWith(print, materializer);

// Output:
// 1
// 2
// ...

SLIDE 37

Stream materialization

- A Graph is only a blueprint: nothing runs until it's given to a Materializer, typically an ActorMaterializer
- All graph stages are generic in their materialized type M
- A Graph can be materialized (run, runWith) more than once

class Source<T, M> {
    // A graph which materializes into the M2 of the sink (ignoring this source's M)
    public <M2> RunnableGraph<M2> to(Sink<T, M2> sink);

    // Materializes, and returns the M2 of the sink (ignoring this source's M)
    public <M2> M2 runWith(Sink<T, M2> sink, Materializer m) { ... }

    // A graph which materializes into the result of applying [combine] to
    // this source's M and the sink's M2
    public <M2, MR> RunnableGraph<MR> toMat(Sink<T, M2> sink, Function2<M, M2, MR> combine);
}

class RunnableGraph<M> {
    public M run(Materializer m);
}

SLIDE 38

Reusable pieces

Source, Sink and Flow are all normal, immutable objects, so they are ideal candidates for construction in reusable factory methods:

public Sink<String, CompletionStage<IOResult>> lineSink(String filename) {
    Sink<ByteString, CompletionStage<IOResult>> file =
        FileIO.toPath(Paths.get(filename));

    // Let's start with some strings
    return Flow.of(String.class)                      // Flow<String, String, NotUsed>
        // Convert them into bytes (UTF-8), adding a newline
        // We now have a Flow<String, ByteString, NotUsed>
        .map(s -> ByteString.fromString(s + "\n"))
        // Send them into a file, keeping the IOResult of the
        // FileIO sink as the materialized value of our own sink
        .toMat(file, Keep.right());
}

numbers.runWith(lineSink("numbers.txt"), materializer);

SLIDE 39

Time-based processing

final Source<Integer, NotUsed> numbers = Source.range(1, 100000000);
final Sink<Integer, CompletionStage<Done>> print =
    Sink.foreach(i -> System.out.println(i));

final CompletionStage<Done> done = numbers
    .throttle(1, Duration.create(1, TimeUnit.SECONDS), 1, ThrottleMode.shaping())
    .runWith(print, materializer);

- This does what you expect: print one message per second
- No OutOfMemoryError: Akka buffers only as needed, thanks to backpressure
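The backpressure contract can be illustrated without Akka. This plain-Java sketch (illustrative, not Akka internals) shows the demand-driven idea behind Reactive Streams: the producer emits only as many elements as the consumer has asked for, so buffers stay bounded no matter how large the source is.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative demand-driven pipeline: the source never outruns the sink
public class Backpressure {
    public static Queue<Integer> run(int total, int demandPerBatch, int batches) {
        Queue<Integer> buffer = new ArrayDeque<>();
        int next = 1;
        for (int b = 0; b < batches && next <= total; b++) {
            // The consumer signals demand for at most demandPerBatch elements...
            int demand = demandPerBatch;
            // ...and the producer emits no more than that, however large `total` is
            while (demand > 0 && next <= total) {
                buffer.add(next++);
                demand--;
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        // A 100-million element source, but only 3 batches of demand 2 were signalled
        System.out.println(run(100_000_000, 2, 3));
    }
}
```

Only six elements ever exist in the buffer here, which is the same reason the 100-million element throttled stream above cannot run out of memory.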

SLIDE 40

Example Sources

Materialize as ActorRef

Source<T, ActorRef> s = Source.actorRef(10, OverflowStrategy.fail());

Materialize as reactive Subscriber<T>

Source<T, Subscriber<T>> s = Source.asSubscriber();

Read from a reactive Publisher<T> p

Source<T, NotUsed> s = Source.fromPublisher(p);

Emit the same element regularly

Source<T, Cancellable> s = Source.tick(duration, duration, element);

SLIDE 41

Example Sinks

Send to ActorRef

Sink<T, NotUsed> s = Sink.actorRef(target, "done");

Materialize as reactive Publisher<T>

Sink<T, Publisher<T>> s = Sink.asPublisher(AsPublisher.WITH_FANOUT);

Materialize into a java.util.List of all elements

Sink<T, CompletionStage<List<T>>> s = Sink.seq();

SLIDE 42

Example source and flow operators

Send Source<String, M> src to an additional Sink<String> sink

Source<String, M> s = src.alsoTo(sink);

Process batches of concatenated strings, but only if coming in too fast

Source<String, M> s = src.batchWeighted(1000, str -> (long) str.length(), str -> str, (s1, s2) -> s1 + s2);

Process 1 seconds' worth of elements at a time, but at most 100

Source<List<String>, M> s = src.groupedWithin(100, Duration.create(1, SECONDS));

Invoke a CompletionStage for each element, and resume with the results in order

CompletionStage<Integer> process(String s) { ... }

Source<Integer, M> s = src.mapAsync(4, this::process);