Disruptive Development
using Reactive Event-Sourced Systems
Jan Ypma <jyp@tradeshift.com>
Agenda:
- A bit of Tradeshift history
- Typical sources of failures
- Event sourcing
- Actors, Akka, and clustering
- Example use case: collaboration
[Diagram, 2011: Frontend → Backend (×10) → PostgreSQL]
Two major software components: the frontend and the backend. Fewer than 10 developers.
[Diagram, 2016: Frontend, Backend, SFTP, Public API, App Backend (old), App Backend (new), Workflow, Payments (old), Payments (new), Conversions, others... (×150)]
- 30 deployed components (and growing)
- 150 developers
- 250,000 LoC in the backend
- Moore's law applies to AWS
- Single point of not quite failing often enough
- 2016 directive: all new components must be clustered
- Yeah, what about the 30-ish existing ones?
- A new architecture is needed
2011: "We're a Java shop"
2011: "We're a Java shop" 2016: Not really, at least not anymore Groovy and grails Python, go, ruby for infrastructure Crazy javascript people Scala But still, mostly Java Empower teams to pick their own language and frameworks
[Diagram: Frontend → Backend → Public API → App Backend (old)]
We're down!
We overloaded the database,
  which caused the backend to respond slowly,
  which caused the frontend to respond slowly,
  which caused our users' web browsers to respond slowly,
  which caused our users to reload their page.
GOTO 10
Enter the buzzwords:
- Let it crash [2003, Armstrong]
- Micro-services [2005, Rodgers]
- Self-contained systems [2015, scs-architecture.org]
[Diagram: Service 1, Service 2]
- No outgoing calls while handling an incoming request (except to our own databases)
- All inter-service communication must be asynchronous
  - This implies data replication
- No single points of failure
  - The system must be clustered
- The design must trivially scale to 10x the expected load
User #1:
- Created
- Name changed to Alice
- Pet added: Gary the Goldfish
- Pet removed: Gary the Goldfish

User #2:
- Created
- Name changed to Bob
- Pet added: Charlie the Cat
- The system considers an append-only event journal the only source of truth (see the replay sketch below)
- An aggregate is one unit of information to which (and only to which) an event atomically applies
- Events have a guaranteed order, but only within an aggregate
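A minimal sketch (not from the slides, with hypothetical event and state types) of what this means in code: an aggregate's current state is simply a left-fold over its ordered journal.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    // Sketch: replaying the "User #1" journal above into a state object.
    public class Replay {
        interface Event {}
        static class NameChanged implements Event { final String name; NameChanged(String n) { name = n; } }
        static class PetAdded    implements Event { final String pet;  PetAdded(String p)    { pet = p; } }
        static class PetRemoved  implements Event { final String pet;  PetRemoved(String p)  { pet = p; } }

        static class UserState {
            final String name;
            final List<String> pets;
            UserState(String name, List<String> pets) { this.name = name; this.pets = pets; }

            // Applying an event never mutates: it returns the next state.
            UserState apply(Event e) {
                if (e instanceof NameChanged) {
                    return new UserState(((NameChanged) e).name, pets);
                } else if (e instanceof PetAdded) {
                    List<String> p = new ArrayList<>(pets); p.add(((PetAdded) e).pet);
                    return new UserState(name, p);
                } else if (e instanceof PetRemoved) {
                    List<String> p = new ArrayList<>(pets); p.remove(((PetRemoved) e).pet);
                    return new UserState(name, p);
                }
                return this;
            }
        }

        public static void main(String[] args) {
            List<Event> journal = Arrays.asList(   // "User #1" from the slide
                new NameChanged("Alice"),
                new PetAdded("Gary the Goldfish"),
                new PetRemoved("Gary the Goldfish"));
            UserState state = new UserState(null, new ArrayList<>());
            for (Event e : journal) state = state.apply(e);   // replay in journal order
            System.out.println(state.name + " " + state.pets); // Alice []
        }
    }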
Nice scalability properties:
- Each aggregate can process changes independently
- All information that spans more than one aggregate is materialized using event listeners
Traditionally only applied inside a system:
- Synchronous APIs only ("get customer history")
- Why not expose the event stream itself?
  - Eventual consistency
  - Latency implications
  - Security implications
An actor is an entity that responds only to messages, by:
- sending messages to other actors
- creating other (child) actors
- adjusting its behaviour
Akka is a toolkit for writing actors in Java:
- An actor is a normal Java class that extends UntypedActor or AbstractActor
- A message is an immutable, serializable Java class
The parent actor is the supervisor of its child actors. On child actor failure, the parent decides what to do (a sketch follows this list):
- Restart the child
- Stop the child
- Escalate
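A minimal sketch (not from the slides) of such a supervision decision, using the same 2.4-era Akka Java API as the examples below; Worker is a hypothetical child actor:

    import akka.actor.*;
    import akka.japi.pf.DeciderBuilder;
    import scala.concurrent.duration.Duration;

    // Sketch: a parent that restarts children on recoverable failures,
    // stops them on bad input, and escalates everything else.
    public class Supervisor extends UntypedActor {
        // Stand-in child that always fails, to illustrate the decision.
        public static class Worker extends UntypedActor {
            public void onReceive(Object message) {
                throw new IllegalStateException("simulated failure");
            }
        }

        private final ActorRef child =
            getContext().actorOf(Props.create(Worker.class), "worker");

        @Override
        public SupervisorStrategy supervisorStrategy() {
            return new OneForOneStrategy(
                10, Duration.create("1 minute"),   // at most 10 restarts per minute
                DeciderBuilder
                    .match(IllegalStateException.class, e -> SupervisorStrategy.restart())
                    .match(IllegalArgumentException.class, e -> SupervisorStrategy.stop())
                    .matchAny(e -> SupervisorStrategy.escalate())
                    .build());
        }

        @Override
        public void onReceive(Object message) {
            child.forward(message, getContext()); // child failures come back to this parent
        }
    }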
public class PongActor extends UntypedActor {
    public void onReceive(Object message) {
        if (message instanceof String) {
            System.out.println("In PongActor - received message: " + message);
            getSender().tell("pong", getSelf());
        }
    }
}
public class Initialize {}

public class PingActor extends AbstractActor {
    private int counter = 0;
    private ActorRef pongActor = getContext().actorOf(
        Props.create(PongActor.class), "pongActor");

    {
        receive(ReceiveBuilder
            .match(Initialize.class, msg -> {
                System.out.println("In PingActor - starting ping-pong");
                pongActor.tell("ping", getSelf());
            })
            .match(String.class, msg -> {
                System.out.println("In PingActor - received message: " + msg);
                counter += 1;
                if (counter == 3) {
                    getContext().system().shutdown();
                } else {
                    getSender().tell("ping", getSelf());
                }
            })
            .build());
    }
}
public static void main(String[] args) {
    ActorSystem system = ActorSystem.create();
    ActorRef pingActor = system.actorOf(Props.create(PingActor.class));
    pingActor.tell(new Initialize(), ActorRef.noSender());
}
Output:
In PingActor - starting ping-pong
In PongActor - received message: ping
In PingActor - received message: pong
In PongActor - received message: ping
In PingActor - received message: pong
In PongActor - received message: ping
In PingActor - received message: pong
- Framework to do event sourcing using actors
- Persistence plugins for LevelDB, Cassandra, Kafka, ...
- Each PersistentActor has a String identifier, under which its events are stored
public class ChatActor extends AbstractPersistentActor {
    private final List<String> messages = new ArrayList<>();

    @Override
    public String persistenceId() { return "chat-1"; }

    private void postMessage(String msg) {
        persist(msg, evt -> {
            messages.add(msg);
            sender().tell(Done.getInstance(), self());
        });
    }

    private void getMessageList() {
        sender().tell(new ArrayList<>(messages), self());
    }

    // ...
}
public class ChatActor extends AbstractPersistentActor {
    private final List<String> messages = new ArrayList<>();

    private void postMessage(String msg) { /* ... */ }
    private void getMessageList() { /* ... */ }

    @Override
    public String persistenceId() { return "chat-1"; }

    @Override
    public PartialFunction<Object, BoxedUnit> receiveRecover() {
        return ReceiveBuilder
            .match(String.class, messages::add)
            .build();
    }

    @Override
    public PartialFunction<Object, BoxedUnit> receiveCommand() {
        return ReceiveBuilder
            .matchEquals("/list", msg -> getMessageList())
            .match(String.class, this::postMessage)
            .build();
    }
}
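Driving this actor could look like the following sketch (not from the slides; it assumes an akka-persistence journal plugin is configured, and uses the standard ask pattern):

    import java.util.concurrent.TimeUnit;
    import akka.actor.ActorRef;
    import akka.actor.ActorSystem;
    import akka.actor.Props;
    import akka.pattern.Patterns;
    import akka.util.Timeout;
    import scala.concurrent.Await;

    // Sketch: post a message, then query the list back.
    public class ChatMain {
        public static void main(String[] args) throws Exception {
            ActorSystem system = ActorSystem.create();
            ActorRef chat = system.actorOf(Props.create(ChatActor.class), "chat");
            Timeout timeout = new Timeout(5, TimeUnit.SECONDS);

            // The actor replies Done once the event has been persisted.
            Await.result(Patterns.ask(chat, "hello, world", timeout), timeout.duration());

            // "/list" replies with the messages; after a crash and restart,
            // receiveRecover() would rebuild the same list from the journal.
            Object messages = Await.result(Patterns.ask(chat, "/list", timeout), timeout.duration());
            System.out.println(messages); // [hello, world]
        }
    }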
Akka remoting:
- Transparently lets actors communicate between systems
- An ActorRef can point to a remote actor
- Messages must be serializable (using configurable mechanisms)
akka {
  actor {
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      hostname = "127.0.0.1"
      port = 2552
    }
  }
  cluster {
    seed-nodes = [
      "akka.tcp://ClusterSystem@127.0.0.1:2551",
      "akka.tcp://ClusterSystem@127.0.0.1:2552"]
  }
}
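With that configuration in place, a remote reference is just a path lookup. A small sketch (host, port, and the actor name pongActor are placeholders):

    // Sketch: look up an actor on another node by its path and send to it.
    ActorSelection remote = getContext().actorSelection(
        "akka.tcp://ClusterSystem@127.0.0.1:2552/user/pongActor");
    remote.tell("ping", getSelf());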
- Dynamically distributes a group of actors across an Akka cluster
- A MessageExtractor informs cluster sharding where a particular message should go
class ChatMessage {
    final UUID conversation;
    final String msg;

    ChatMessage(UUID conversation, String msg) {
        this.conversation = conversation;
        this.msg = msg;
    }
}

class MyMessageExtractor implements MessageExtractor {
    private final int numberOfShards = 256;

    @Override
    public String entityId(Object command) {
        return ((ChatMessage) command).conversation.toString();
    }

    @Override
    public String shardId(Object command) {
        // Math.abs guards against negative hash codes producing invalid shard ids
        return String.valueOf(Math.abs(entityId(command).hashCode() % numberOfShards));
    }

    @Override
    public Object entityMessage(Object command) {
        return ((ChatMessage) command).msg;
    }
}
A ShardRegion proxy sits between the client and the real (remote) persistent actor. Persistent actor names will be their persistence ids.
public class ChatActor extends AbstractPersistentActor {
    // ...
    @Override
    public String persistenceId() {
        return getSelf().path().name();
    }
}

ActorRef proxy = ClusterSharding.get(system).start(
    "conversations",
    Props.create(ChatActor.class),
    ClusterShardingSettings.create(system),
    new MyMessageExtractor());

proxy.tell(new ChatMessage(
    UUID.fromString("67c67d28-4719-4bf9-bfe6-3944ed961a60"),
    "hello!"), ActorRef.noSender());
2015: Let's try it out first
2016: Collaboration in production
- Real-time text message exchange between employees
- Text interface to automated travel agent
- In development: documents, FTP
Stuff that works well for us: https://github.com/Tradeshift/ts-reaktive/
- ts-reaktive-actors: persistent actor base classes with reasonable defaults, and an HTTP API for the event journal
- ts-reaktive-marshal: non-blocking streaming marshalling framework for Akka streams
- ts-reaktive-replication: master-slave replication across data centers for persistent actors
Introduce a single base class for all commands
public abstract class ChatCommand {}

public class GetMessageList extends ChatCommand {}

public class PostMessage extends ChatCommand {
    private final String message;

    public PostMessage(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}
Introduce a single base class for all events
public abstract class ChatEvent {}

public class MessagePosted extends ChatEvent {
    private final String message;

    public MessagePosted(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}
public class ChatState extends AbstractState<ChatEvent, ChatState> {
    public static final ChatState EMPTY = new ChatState(Vector.empty());

    private final Seq<String> messages;

    private ChatState(Seq<String> messages) {
        this.messages = messages;
    }

    public Seq<String> getMessages() {
        return messages;
    }

    @Override
    public ChatState apply(ChatEvent event) {
        if (event instanceof MessagePosted) {
            return new ChatState(messages.append(
                MessagePosted.class.cast(event).getMessage()));
        } else {
            return this;
        }
    }
}
Plain old Java class, hence easily unit testable (compared to an actor); see the test sketch below.
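For example, a unit test can fold events into state directly, with no actor system involved. A sketch using JUnit and Javaslang's Vector, matching the state class above:

    import static org.junit.Assert.assertEquals;

    import javaslang.collection.Vector;
    import org.junit.Test;

    // Sketch: testing the event fold without any actor machinery.
    public class ChatStateTest {
        @Test
        public void appliesMessagePostedEventsInOrder() {
            ChatState state = ChatState.EMPTY
                .apply(new MessagePosted("hello"))
                .apply(new MessagePosted("world"));
            assertEquals(Vector.of("hello", "world"), state.getMessages());
        }
    }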
public class ChatActor extends AbstractStatefulPersistentActor<
        ChatCommand, ChatEvent, ChatState> {

    public static abstract class Handler extends AbstractCommandHandler<
            ChatCommand, ChatEvent, ChatState> { }

    @Override
    protected ChatState initialState() {
        return ChatState.EMPTY;
    }

    @Override
    protected PartialFunction<ChatCommand, Handler> applyCommand() {
        return new PFBuilder<ChatCommand, Handler>()
            .match(GetMessageList.class,
                cmd -> new GetMessageListHandler(getState(), cmd))
            .match(PostMessage.class,
                cmd -> new PostMessageHandler(getState(), cmd))
            .build();
    }
}
Business logic pushed out to ChatState and *Handler
public class PostMessageHandler extends ChatActor.Handler {
    @Override
    public Seq<ChatEvent> getEventsToEmit() {
        return Vector.of(new MessagePosted(
            PostMessage.class.cast(cmd).getMessage()));
    }

    @Override
    public Object getReply(Seq<ChatEvent> emittedEvents, long lastSequenceNr) {
        return Done.getInstance();
    }
}
Plain old Java classes, hence easily unit testable (compared to an actor). The behaviour of different commands is separated into different classes; a read-side handler is sketched below.
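The read-side handler follows the same shape. A sketch: it assumes the Handler base class keeps the constructor arguments from applyCommand() available as state and cmd fields (cmd is used the same way in PostMessageHandler above), and that the super constructor takes them in that order:

    // Sketch: a read-only command emits no events and replies from current state.
    public class GetMessageListHandler extends ChatActor.Handler {
        public GetMessageListHandler(ChatState state, ChatCommand cmd) {
            super(state, cmd);   // assumed super constructor, mirroring applyCommand()
        }

        @Override
        public Seq<ChatEvent> getEventsToEmit() {
            return Vector.empty();   // a query persists nothing
        }

        @Override
        public Object getReply(Seq<ChatEvent> emittedEvents, long lastSequenceNr) {
            return state.getMessages();   // reply built from the in-memory state
        }
    }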
We wanted easy consumption of the event log by other systems: HTTP chunked encoding (one chunk per event), without ever completing the response.
GET /events?since=1473689920000

200 OK
Content-Type: application/protobuf
Transfer-Encoding: chunked

14
(14 bytes with the first protobuf event)
11
(11 bytes with the second protobuf event)
(TCP stream stalls here)
Additional events can arrive in real time
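With akka-http, such a never-completing chunked response can be sketched by feeding a Source of serialized events into the entity. The eventSource parameter is a stand-in, and the octet-stream content type substitutes for the slide's application/protobuf:

    import akka.http.javadsl.model.ContentTypes;
    import akka.http.javadsl.model.HttpEntities;
    import akka.http.javadsl.model.HttpResponse;
    import akka.stream.javadsl.Source;
    import akka.util.ByteString;

    // Sketch: one ByteString per event becomes one HTTP chunk; since the
    // source never completes, the response stays open and streams new
    // events as they are appended to the journal.
    public class EventStreamRoute {
        static HttpResponse streamEvents(Source<ByteString, ?> eventSource) {
            return HttpResponse.create().withEntity(
                HttpEntities.createChunked(ContentTypes.APPLICATION_OCTET_STREAM, eventSource));
        }
    }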
[Architecture diagram: Presentation (Node.js), Workflow (Java), Content (Java + Akka; Cassandra journal, PostgreSQL materialization)]
- The Content backend has an API to add messages to conversations
  - Messages go into the event journal
  - The journal is queryable over HTTP
- The Presentation backend listens to the event stream (a client-side sketch follows)
  - Materializes it into views that are UI-dependent
  - Can combine other sources as well
- The browser talks to both the Presentation and Content backends
  - A web socket stream informs the browser of incoming messages
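Tailing that stream from the Presentation side can be sketched with the akka-http client. This assumes the akka-http 10.0-era API (singleRequest taking a materializer) and a hypothetical URL; decoding the bytes back into protobuf events is elided:

    import akka.actor.ActorSystem;
    import akka.http.javadsl.Http;
    import akka.http.javadsl.model.HttpRequest;
    import akka.stream.ActorMaterializer;
    import akka.stream.Materializer;
    import akka.stream.javadsl.Sink;

    // Sketch: follow the Content backend's event stream and feed each
    // chunk of bytes into view materialization.
    public class EventStreamClient {
        public static void main(String[] args) {
            ActorSystem system = ActorSystem.create();
            Materializer mat = ActorMaterializer.create(system);

            Http.get(system)
                .singleRequest(HttpRequest.create("http://content/events?since=0"), mat)
                .thenAccept(response ->
                    response.entity().getDataBytes()
                        // parse bytes into events and update views here
                        .runWith(Sink.foreach(bytes ->
                            System.out.println("event chunk: " + bytes.size() + " bytes")), mat));
        }
    }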
Scalable systems: check
Scalable development: check
We're not quite there yet:
- Akka, Cassandra and the reaktive combo are in active development
- Attitude of "I've done Spring for 10+ years successfully, why would I learn this?"
- The proof is in more pudding
Want to get involved?
- Get: http://akka.io/
- Read: http://doc.akka.io/docs/akka/current/java.html
- Chat: https://gitter.im/akka/akka
- Hack: https://github.com/Tradeshift/ts-reaktive/ and https://github.com/jypma/ts-reaktive-examples/
- A Graph is a blueprint for a closed, finite network of stages, which communicate by streaming elements
- A GraphStage<S extends Shape> is one processing stage within a graph, taking elements in through zero or more Inlets, and emitting them through Outlets
- It's completely up to the stage when and how to respond to arriving elements
- All built-in graph stages embrace backpressure and bounded processing

Most-used graph stages:
- Source<T, M> has one outlet of type T
- Sink<T, M> has one inlet of type T
- Flow<A, B, M> has one inlet of type A and one outlet of type B
- RunnableGraph<M> has no inlets or outlets

Reactive streams:
- Akka Streams is a reactive streams implementation (just like RxJava and others)
- You typically don't interact in terms of Publisher and Subscriber directly
final ActorSystem system = ActorSystem.create("QuickStart");
final Materializer materializer = ActorMaterializer.create(system);

final Source<Integer, NotUsed> numbers = Source.range(1, 100);
final Sink<Integer, CompletionStage<Done>> print =
    Sink.foreach(i -> System.out.println(i));

final CompletionStage<Done> done = numbers.runWith(print, materializer);

// Output:
// 1
// 2
// ...
- A Graph is only a blueprint: nothing runs until it's given to a Materializer, typically an ActorMaterializer
- All graph stages are generic in their materialized type M
- A graph can be materialized (run, runWith) more than once; see the sketch below
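For example, the same blueprint can be run twice, yielding two completely independent streams (a sketch reusing the materializer from the quickstart above):

    final RunnableGraph<CompletionStage<Done>> blueprint =
        Source.range(1, 3).toMat(Sink.foreach(System.out::println), Keep.right());

    blueprint.run(materializer);   // prints 1 2 3
    blueprint.run(materializer);   // a fresh materialization: prints 1 2 3 again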
class Source<T, M> {
    // A graph which materializes into the M2 of the sink (ignoring this source's M)
    public <M2> RunnableGraph<M2> to(Sink<T, M2> sink);

    // Materializes, and returns the M2 of the sink (ignoring this source's M)
    public <M2> M2 runWith(Sink<T, M2> sink, Materializer m) { ... }

    // A graph which materializes into the result of applying [combine] to
    // this source's M and the sink's M2
    public <M2, MR> RunnableGraph<MR> toMat(Sink<T, M2> sink, Function2<M, M2, MR> combine);
}

class RunnableGraph<M> {
    public M run(Materializer m);
}
Source, Sink and Flow are all normal, immutable objects, so they're ideal to be constructed in reusable factory methods:
public Sink<String, CompletionStage<IOResult>> lineSink(String filename) {
    Sink<ByteString, CompletionStage<IOResult>> file =
        FileIO.toPath(Paths.get(filename));

    // Let's start with some strings
    return Flow.of(String.class)                    // Flow<String, String, NotUsed>
        // Convert them into bytes (UTF-8), adding a newline.
        // We now have a Flow<String, ByteString, NotUsed>
        .map(s -> ByteString.fromString(s + "\n"))
        // Send them into a file, and keep the IOResult of the
        // FileIO sink as the materialized value of our own sink
        .toMat(file, Keep.right());
}

numbers.runWith(lineSink("numbers.txt"), materializer);
final Source<Integer, NotUsed> numbers = Source.range(1, 100000000);
final Sink<Integer, CompletionStage<Done>> print =
    Sink.foreach(i -> System.out.println(i));

final CompletionStage<Done> done = numbers
    .throttle(1, Duration.create(1, TimeUnit.SECONDS), 1, ThrottleMode.shaping())
    .runWith(print, materializer);
This does what you expect: it prints one message per second. No OutOfMemoryError; akka buffers only as needed: backpressure.
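When an explicit buffer is wanted, it's opted into, together with an overflow strategy (a sketch continuing the example above):

    // Sketch: hold at most 1000 elements; when full, backpressure the
    // producer rather than growing memory without bound.
    final CompletionStage<Done> buffered = numbers
        .buffer(1000, OverflowStrategy.backpressure())
        .runWith(print, materializer);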
Materialize as ActorRef
Source<T, ActorRef> s = Source.actorRef(10, OverflowStrategy.fail());
Materialize as reactive Subscriber<T>
Source<T, Subscriber<T>> s = Source.asSubscriber();
Read from a reactive Publisher<T> p
Source<T, NotUsed> s = Source.fromPublisher(p);
Emit the same element regularly
Source<T, Cancellable> s = Source.tick(duration, duration, element);
Send to ActorRef
Sink<T, NotUsed> s = Sink.actorRef(target, "done");
Materialize as reactive Publisher<T>
Sink<T, Publisher<T>> s = Sink.asPublisher(AsPublisher.WITH_FANOUT);
Materialize into a CompletionStage of a java.util.List with all elements

Sink<T, CompletionStage<List<T>>> s = Sink.seq();
Send a Source<String, M> src to an additional Sink<String, ?> sink
Source<String, M> s = src.alsoTo(sink);
Process batches of concatenated strings, but only when they come in too fast
Source<String, M> s = src.batchWeighted(1000, s -> (long) s.length(), s -> s, (s1, s2) -> s1 + s2);
Process one second's worth of elements at a time, but at most 100
Source<List<String>, M> s = src.groupedWithin(100, Duration.create(1, SECONDS));
Invoke an asynchronous call (returning a CompletionStage) for each element, and continue with the results in order (parallelism 4 here)

CompletionStage<Integer> process(String s) { ... }

Source<Integer, M> s = src.mapAsync(4, this::process);
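These one-liners compose into pipelines. A closing sketch reusing process from above; auditSink is a hypothetical Sink<Integer, ?>:

    // Sketch: up to 4 concurrent process() calls with results kept in order,
    // each result mirrored to an audit sink, throttled to 10 per second.
    Source<Integer, M> s = src
        .mapAsync(4, this::process)
        .alsoTo(auditSink)
        .throttle(10, Duration.create(1, TimeUnit.SECONDS), 10, ThrottleMode.shaping());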