Monolithic Batch Goes Microservice Streaming
A story about one transformation
Charles Tye & Anton Polyakov
Who are We?
- Anton Polyakov, Head of Application Development, 2 years at Nordea
- Charles Tye, Head of Core Services & Risk IT, 17 years at Nordea
What We Do
Develop solutions for:
- Market Risk
- Credit Risk
- Liquidity Risk
- Stress Testing
- Messaging
Together with around 70 other people from all over the world
Market Risk
The high level view
- Quantify potential losses and exposures
- Do many small risks add up to a big risk?
- Can risks combine in unusual and unexpected ways?
Market Risk
Line of Defence
- Protect Nordea and our customers
- Daily internal reporting and external reporting to regulators
- Independent function
- Analysis and insight into the sources of risk
- Control of risk
- Management of capital
Examples of Risk Analysis
Value at Risk
- Look at last 2 years of market history
- Average of the worst 1% of outcomes
- Simulate if the same thing happened again today
- Highly non-linear, yet there is a requirement to drill in and find the drivers
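
A minimal sketch of the measure described on this slide, assuming the simulation yields one profit-and-loss number per historical scenario. The function name `tail_average`, the sample data and the sign convention (losses are negative) are illustrative assumptions, not taken from the talk. The mean of the worst tail is often called expected shortfall.

```python
import random


def tail_average(pnl, tail_fraction=0.01):
    """Mean of the worst `tail_fraction` of simulated P&L outcomes.

    Losses are negative numbers, so "worst" means "smallest".
    """
    if not pnl:
        raise ValueError("need at least one scenario")
    k = max(1, int(len(pnl) * tail_fraction))  # always keep at least one scenario
    worst = sorted(pnl)[:k]
    return sum(worst) / k


# Invented data: roughly two years of business days, one simulated P&L per day.
rng = random.Random(42)
scenarios = [rng.gauss(0.0, 1_000_000.0) for _ in range(500)]
print(tail_average(scenarios))
```

Because the result depends on which scenarios land in the tail, the measure is non-linear in the positions, which is why drilling down to the drivers is not a simple subtraction.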
Examples of Risk Analysis
Stress Scenarios
- “Black Swan” worst case scenarios
- Unexpected outcomes from future events
- Example: Brexit
- Simulate if it happened
An Interesting Technology Problem
Risk Analysis:
Consistent
- Everything has to be included = know when you are complete
Non-linear
- Risk does not sum over hierarchies
- Drill-down is non trivial
- Traditional OLAP aggregate & increment doesn’t work
Volume
- 10,000,000,000,000
Speed
- Reactive near real-time calculations
- Streaming data
- Fast corrections and “what-if”
- Interactive sub-second queries on huge data sets
Challenge No 1.
- Find the seams
- Break it up
- Reusable components
- Replace a piece at a time
Spaghetti
Challenge No 2.
- Develop a new service
- Integrate into the legacy system
- Reconcile the output
- Find and fix legacy bugs
- Fight complification
Challenge No 3.
Batch is synchronous state transfer. The only way to achieve consistency?
Consistency is seriously hard to combine with streaming
- Event sourced and streaming approach
- More robust, scalable and faster, especially for recovery
- Comes with a cost
Challenge No 4.
Legacy SQL was slow
Replace with in-memory aggregation
- Aggregate billions of scenarios in-memory and pre-compute total vectors over hierarchies (linear)
- Non-linear measures computed lazily
- Reactive and continuous queries
- Partitions and scales out horizontally across commodity hardware
- Tougher challenges on terabyte-scale hardware due to NUMA limitations
- Some cubes are already > 200 GB and larger ones are planned
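
The split between linear aggregation and lazy non-linear measures can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Node` class, the desk names, the four-scenario vectors and the 25% tail are all invented so the arithmetic is easy to follow.

```python
def tail_average(pnl, tail_fraction):
    """Mean of the worst `tail_fraction` of outcomes (losses are negative)."""
    k = max(1, int(len(pnl) * tail_fraction))
    return sum(sorted(pnl)[:k]) / k


class Node:
    """Hierarchy node holding the total P&L vector of everything below it."""

    def __init__(self, name, vector=None, children=()):
        self.name = name
        self.children = list(children)
        if vector is not None:
            self.total = list(vector)
        elif self.children:
            # Linear step: scenario-by-scenario sum of the child vectors.
            self.total = [sum(col) for col in zip(*(c.total for c in self.children))]
        else:
            raise ValueError("a node needs a vector or children")
        self._measures = {}  # cache of lazily computed non-linear measures

    def measure(self, tail_fraction=0.25):
        """Non-linear step: computed on first request, then cached."""
        if tail_fraction not in self._measures:
            self._measures[tail_fraction] = tail_average(self.total, tail_fraction)
        return self._measures[tail_fraction]


desk_a = Node("desk A", vector=[-10, 5, 5, 0])
desk_b = Node("desk B", vector=[5, -10, 0, 5])
bank = Node("bank", children=[desk_a, desk_b])

print(bank.total)                          # [-5, -5, 5, 5]
print(desk_a.measure() + desk_b.measure())  # -20.0
print(bank.measure())                       # -5.0
```

The vectors add up across the hierarchy, but the measure does not: the two desks lose in different scenarios, so the bank-level figure (-5.0) is far from the sum of the desk figures (-20.0). That is why only the vectors are pre-aggregated.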
Solution: Microservices!
Well almost…
- Single responsibility – replace pieces of legacy from the inside out
- Self contained with business functional boundaries
- Independent and rapid development – team owns the whole stack
- Organisationally scalable – horizontally scale your teams
- Flexible and maintainable – evolve the architecture
- Smart endpoints and dumb pipes
- Innovation and short lifecycles
The problem
- Business:
- Multi-model Market Risk calculator for Nordea portfolio
- VaR on different organization levels with 5-6 different models in parallel
- IT:
- 7000 CPU hours of grid calculation
- More than 4000 SQL jobs
- Graph with more than 10000 edges
- Nightly batch flow
What did it look like?
- Well, you know. 10 years of development
- In SQL
- No refactoring (who needs it?)
Precisely, how did it look?
Logical architecture
Monolith staged app
Now a little complication
- Sloo-o-o-ow
- Fat. So it breaks
- Can it be parallel?
So what to do?
We probably all know the answer (since we are at this section ☺)
- Find logically isolated blocks
- Keep an eye on non-functional aspects
- Think of how they communicate
- Think about what happens if something dies
Not quite “classical” microservices… or?
produce → enrich → aggregate
- Request/response is not feasible
- Synchronous interaction takes too long
- Some results are expensive to reproduce
So we need…
A middleware which
- “Glues” services together
- Caches important results
- Serves as a coordinator and work distributor
- Scale out
- Fast pub/sub
- Queues and sets: pull and dedup
- Distributed locks
Locks? Who needs locks?
Pub/sub messaging as notifier
[Diagram: Producer, Enricher, Aggregator, consumer; three stores; Redis pub/sub]
But…
There are two main problems in distributed messaging:
2) Guarantee that each message is only delivered once
1) Guarantee message order
2) Guarantee that each message is only delivered once
Queues with atomic operations
- BRPOPLPUSH
[Diagram: Producer, Redis pub/sub, Incoming queue, Processing queue, Enricher, store]
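
The incoming/processing queue pair on this slide matches the reliable-queue pattern usually built on `BRPOPLPUSH`: a message is moved atomically to a processing list and removed only after it has been handled. The sketch below is an assumption about how that could look, not the authors' code. `FakeQueues` is a hypothetical in-memory stand-in so the example runs without a server; its method names mirror the redis-py calls `lpush`, `brpoplpush(src, dst, timeout)` and `lrem(name, count, value)`. Newer Redis versions offer `BLMOVE` as the replacement for `BRPOPLPUSH`.

```python
from collections import deque


class FakeQueues:
    """Hypothetical in-memory stand-in for the Redis list commands used below."""

    def __init__(self):
        self.lists = {}

    def lpush(self, name, value):
        self.lists.setdefault(name, deque()).appendleft(value)

    def brpoplpush(self, src, dst, timeout=0):
        # Real Redis blocks for up to `timeout` seconds; the stand-in
        # simply returns None when the source list is empty.
        source = self.lists.get(src)
        if not source:
            return None
        value = source.pop()  # take from the tail ...
        self.lists.setdefault(dst, deque()).appendleft(value)  # ... push to the head
        return value

    def lrem(self, name, count, value):
        # Simplified: removes at most one occurrence, which is all this sketch needs.
        items = self.lists.get(name, deque())
        try:
            items.remove(value)
            return 1
        except ValueError:
            return 0


def process_one(client, handle, incoming="incoming", processing="processing"):
    """Move one message to the processing list, handle it, then acknowledge it."""
    message = client.brpoplpush(incoming, processing, timeout=1)
    if message is None:
        return False
    handle(message)  # if this raises, the message stays in `processing`
    client.lrem(processing, 1, message)  # acknowledge only after success
    return True


queues = FakeQueues()
queues.lpush("incoming", "m1")
queues.lpush("incoming", "m2")


def flaky(message):
    if message == "m2":
        raise RuntimeError("enricher crashed")


process_one(queues, flaky)  # m1 handled and acknowledged
try:
    process_one(queues, flaky)  # m2 fails mid-processing
except RuntimeError:
    pass
print(list(queues.lists["processing"]))  # ['m2'] is still there for recovery
```

A recovery job can later push anything left in the processing list back to the incoming queue. That gives at-least-once delivery, so downstream writes need to tolerate duplicates.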
Sets and Hmaps – all good for dedup
In an eventually consistent world dedup is your best friend
- Multiple inserts due to recovery
- Consistent state due to dedup
[Diagram: Enricher, HSET, store]
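
A small illustration of why keyed writes make replays harmless. A plain dictionary stands in for a Redis hash here, where the write would be an `HSET` on the result's key. The identifiers (`record_result`, `trade-1`) and the numbers are invented for the example.

```python
def record_result(store, result_id, payload):
    """Idempotent write: a replayed insert overwrites the same key instead of adding a row."""
    store[result_id] = payload


# "trade-1" arrives twice, as it might after an enricher restarts and replays its queue.
events = [("trade-1", 10.0), ("trade-2", -4.0), ("trade-1", 10.0)]

store = {}
for result_id, value in events:
    record_result(store, result_id, value)

total = sum(store.values())                    # 6.0: each result counted once
naive_total = sum(value for _, value in events)  # 16.0: the replay is double counted
print(total, naive_total)
```

The aggregate stays correct without exactly-once delivery, as long as every result carries a stable identifier.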
So how to scale out?
Scale out logically and concurrently
[Diagram: Enricher <type A>, Enricher <type B>, Enricher <type X>; Redis pub/sub; Aggregator <day 1>, Aggregator <day 2>, Aggregator <day 3>]
- Filter my events
- Steal work
- RedLock + TTL
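
The idea behind a lock with a TTL can be sketched in memory: a worker claims a unit of work for a limited time, and if it dies without releasing, the claim expires and another worker can take over. This is a simplified single-process stand-in, not Redis and not the Redlock algorithm itself (which acquires the lock on a majority of independent Redis nodes). The class `TtlLocks`, the worker names and the explicit `now` argument are assumptions made to keep the example deterministic.

```python
class TtlLocks:
    """Hypothetical in-memory table of expiring locks."""

    def __init__(self):
        self._held = {}  # resource -> (owner, expires_at)

    def acquire(self, resource, owner, ttl, now):
        holder = self._held.get(resource)
        if holder is not None and holder[0] != owner and holder[1] > now:
            return False  # another owner holds a lock that has not expired yet
        self._held[resource] = (owner, now + ttl)  # take it, or extend our own
        return True

    def release(self, resource, owner):
        holder = self._held.get(resource)
        if holder is not None and holder[0] == owner:
            del self._held[resource]
            return True
        return False  # never release a lock that someone else now owns


locks = TtlLocks()
print(locks.acquire("day-1", "aggregator-A", ttl=30, now=0))   # True
print(locks.acquire("day-1", "aggregator-B", ttl=30, now=10))  # False: A still holds it
print(locks.acquire("day-1", "aggregator-B", ttl=30, now=31))  # True: A's lock expired
print(locks.release("day-1", "aggregator-A"))                  # False: B owns it now
```

The TTL bounds how long a crashed worker can block others. The owner check on release prevents a slow worker from deleting a lock that has already passed to someone else.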
Demo
[Diagram: Producer, Enricher, Aggregator, consumer; three stores; Redis pub/sub; Incoming queue; Processing queue; RedLock + TTL]
The Result and What We Learned
Success!
- Aggregate and produce risk: 5 hours → 30 mins
- Corrections: 40 mins → 1 second
- Earlier deliveries – more time to manage the risks
- Faster recovery from problems
- Happy risk managers
Important (and painful) to integrate new services into the existing system
Consistency is hard to combine with streaming (subject of another talk maybe)
When distributing, remember the first law of distributed objects architecture (do you remember it?)
The Result and What We Learned
First Law of Distributed Object Design: "don't distribute your objects"
And of course…
https://dk.linkedin.com/in/charles-tye-a8aa88b
https://github.com/parallelstream/