Cloud Native Data Pipelines with Apache Kafka
Gwen Shapira, Software Engineer @gwenshap
What is a Cloud Native Application?
Common ideas: Resilience, Elasticity, Agility, DevOps
You will build Cloud Native Applications
Returns?
Fulfill Order
Validate Order
Orders
Inventory
Order Service Validation
ValidateOrder(id, user, product, price, amount..) True
Order Service Validation
ValidateOrder(id, user, product, price, amount..) True
Fraud Service Alert Service
WARNING: Antipattern
Order Service Validation
ValidateOrder(id, user, product, price, amount..) True
Fraud Service, Order history Service, Customer history Service, Credit alerts Service
WARNING: Antipattern
Order Service Validation
ValidateOrder(id, user, product, price, amount..) True
Order Service Validation
Proxy
new Validation
The challenges
Events are not:
Events are:
Buying an iPad (with REST)
Service to tell it to ship item.
to ship to (from Customer Service)
Calls: Submit Order, shipOrder(), getCustomer()
Services: Orders Service, Shipping Service, Customer Service, Webserver
Using events for Notification
about the Shipping service (or any
forget.
Calls and events: Submit Order, Order Created, getCustomer()
Legend: REST, Notification
Services: Orders Service, Shipping Service, Customer Service, Webserver
Event Bus == Kafka
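A minimal sketch of the notification pattern above, assuming an in-memory EventBus class as a stand-in for Kafka. It is not a Kafka client API, and it delivers synchronously, which a real broker would not. The names EventBus, CUSTOMERS, get_customer and the sample address are hypothetical.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for a Kafka topic (assumption, not a real client)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Fire and forget: the publisher does not know who is listening.
        for handler in self._subscribers[topic]:
            handler(event)


# Hypothetical data owned by the Customer Service.
CUSTOMERS = {"c1": {"name": "Ada", "address": "1 Main St"}}


def get_customer(customer_id):
    """Stands in for the REST call getCustomer()."""
    return CUSTOMERS[customer_id]


shipments = []


def shipping_service(event):
    # Reacts to the notification, then still looks the customer up remotely.
    customer = get_customer(event["customer"])
    shipments.append((event["order"], customer["address"]))


def submit_order(bus, order_id, customer_id):
    # The Orders Service only announces the fact; it never calls Shipping.
    bus.publish("orders", {"type": "OrderCreated",
                           "order": order_id,
                           "customer": customer_id})


bus = EventBus()
bus.subscribe("orders", shipping_service)
submit_order(bus, 1, "c1")
print(shipments)
```

The Orders Service has no reference to the Shipping Service here; adding another subscriber would not require changing submit_order.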
Using events to share facts
events, into the shipping service, where it is queried locally.
Calls and events: Customer Updated, Submit Order, Order Created
Data is replicated
Services: Orders Service, Shipping Service, Customer Service, Webserver
Event Bus == Kafka
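A minimal sketch of sharing facts through events, assuming plain Python objects in place of Kafka consumers. The ShippingService class, the event field names and the sample addresses are hypothetical. The point is that the customer lookup becomes a local dictionary read instead of a remote call.

```python
class ShippingService:
    def __init__(self):
        self.customers = {}   # local replica built from Customer Updated events
        self.shipments = []

    def on_customer_updated(self, event):
        # Latest fact wins: later events overwrite earlier ones for the same id.
        self.customers[event["id"]] = event["address"]

    def on_order_created(self, event):
        # Local query, no call to the Customer Service.
        address = self.customers[event["customer"]]
        self.shipments.append((event["order"], address))


# Hypothetical event stream, in the order it would be read from the bus.
events = [
    ("CustomerUpdated", {"id": "c1", "address": "1 Main St"}),
    ("CustomerUpdated", {"id": "c1", "address": "2 Oak Ave"}),
    ("OrderCreated", {"order": 1, "customer": "c1"}),
]

shipping = ShippingService()
handlers = {
    "CustomerUpdated": shipping.on_customer_updated,
    "OrderCreated": shipping.on_order_created,
}
for kind, payload in events:
    handlers[kind](payload)

print(shipping.shipments)
```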
Mainframe APACHE KAFKA Kafka Connect
Database APACHE KAFKA Kafka Connect
Debezium
update table accounts set total=total+50 where id=600
{key=600, vip=f total=300 …}, new_record={… vip=f, total=350, … } }
event 1: {order:1, product: iphone, status: created }
event 2: {order:1, product: iphone, status: valid }
event 3: {order:2, product: ipad, status: created }
event 4: {order:1, product: iphone, status: shipped }
event 1: {order:1, product: iphone, status: created }
event 2: {order:1, product: iphone, status: valid }
event 3: {order:2, product: ipad, status: created }
event 4: {order:1, product: iphone, status: shipped }
Order 1 -> iphone, shipped
Order 2 -> ipad, created
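The step from the four events to the per-order summary above can be sketched as a fold over the stream. This is an illustration in plain Python, assuming the events arrive in order; the function name materialize is hypothetical.

```python
# The four events from the slide, in stream order.
events = [
    {"order": 1, "product": "iphone", "status": "created"},
    {"order": 1, "product": "iphone", "status": "valid"},
    {"order": 2, "product": "ipad", "status": "created"},
    {"order": 1, "product": "iphone", "status": "shipped"},
]


def materialize(stream):
    """Keep only the latest (product, status) seen for each order id."""
    view = {}
    for event in stream:
        view[event["order"]] = (event["product"], event["status"])
    return view


orders = materialize(events)
for order_id, (product, status) in sorted(orders.items()):
    print(f"Order {order_id} -> {product}, {status}")
```

Replaying the same stream always rebuilds the same view, which is why the view can be treated as disposable.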
Duplicate data?
Low risk due to shared event stream
Just the data you need
Sharded with the application
{order:1, product: iphone, status: created }
{order:1, product: iphone, status: valid }
{order:2, product: ipad, status: created }
{order:1, product: iphone, status: shipped }
Odd orders: Order 1 -> iphone, shipped
Even orders: Order 2 -> ipad, created
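A minimal sketch of sharding the view with the application, assuming two instances and the slide's odd/even split. The function partition_for is hypothetical and uses a simple modulo; Kafka's default partitioner hashes the record key instead, so treat the odd/even rule as an illustration only.

```python
events = [
    {"order": 1, "product": "iphone", "status": "created"},
    {"order": 1, "product": "iphone", "status": "valid"},
    {"order": 2, "product": "ipad", "status": "created"},
    {"order": 1, "product": "iphone", "status": "shipped"},
]


def partition_for(order_id, num_partitions=2):
    """Hypothetical partitioner: odd ids map to 1, even ids map to 0."""
    return order_id % num_partitions


# One local view per application instance, keyed by partition number.
views = {0: {}, 1: {}}
for event in events:
    shard = partition_for(event["order"])
    views[shard][event["order"]] = (event["product"], event["status"])

print("Odd orders:", views[1])
print("Even orders:", views[0])
```

All events for one order land on the same instance, so each instance holds the complete history of the orders it owns and nothing else.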
Better than shared DB
the way I need it
Requirements
services
Orders shipments customers Reporter 1 Reporter 2
My Browser
Instance 2 Trade Stats App Instance 1 Trade Stats App
Changelog Topic
restore
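A minimal sketch of restoring local state from a changelog, assuming a Python list in place of the changelog topic. The StateStore class and the ticker symbols are hypothetical, and this is not the Kafka Streams API; it only illustrates the idea that every local write is also appended to a log that another instance can replay.

```python
class StateStore:
    def __init__(self, changelog):
        self.changelog = changelog   # shared list stands in for the topic
        self.state = {}

    def put(self, key, value):
        # Update local state and record the change for later recovery.
        self.state[key] = value
        self.changelog.append((key, value))

    @classmethod
    def restore(cls, changelog):
        # Rebuild state by replaying the log; replay does not append again.
        store = cls(changelog)
        for key, value in changelog:
            store.state[key] = value
        return store


changelog = []
instance1 = StateStore(changelog)
instance1.put("ACME", 1)   # hypothetical trade counts per symbol
instance1.put("ACME", 2)
instance1.put("INIT", 1)

# Instance 1 is lost; instance 2 takes over by replaying the changelog.
instance2 = StateStore.restore(changelog)
print(instance2.state)
```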
Who controls the data format?
Roles: Producer, Integrator, Consumer pre-processor, Consumer
Event stages: Raw events; Clean, Standard events; Enriched, Aggregated events
Remember This
architectures
events
Orchestrator: Step 1, Step 2, Step 3, Step 4
Step 1
If success: Step 2
Else: Step 3
Finally: Step 4
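The orchestrator logic above can be sketched as one function that owns the whole workflow. This is an illustration only: the function name orchestrate and the convention that step 1 returns True on success are assumptions, and the steps are passed in as plain callables.

```python
def orchestrate(step1, step2, step3, step4):
    """Run the workflow centrally and return the list of steps executed."""
    trace = ["step1"]
    succeeded = step1()          # assumed to return True on success
    if succeeded:
        step2()
        trace.append("step2")
    else:
        step3()
        trace.append("step3")
    step4()                      # always runs, whichever branch was taken
    trace.append("step4")
    return trace


def noop():
    return None


print(orchestrate(lambda: True, noop, noop, noop))
print(orchestrate(lambda: False, noop, noop, noop))
```

Every step is known to the orchestrator, so adding or reordering a step means changing this one function.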
Step 1 Step 2 Step 3 Step 4
Orders
Success Fail Shipped emailed