Cloud Native Data Pipelines with Apache Kafka, Gwen Shapira



SLIDE 1

Cloud Native Data Pipelines with Apache Kafka

Gwen Shapira, Software Engineer @gwenshap

SLIDE 2

What is a Cloud Native Application?

SLIDE 3

Common ideas

Resilience Elasticity Agility DevOps

SLIDE 4

You will build Cloud Native Applications from
 Non Cloud Native components

SLIDE 5

What do 
 Cloud Native architectures 
 look like?

SLIDE 6

You Have Microservices

SLIDE 7

They need to communicate

SLIDE 8

I know! I’ll use REST APIs

Diagram labels: Returns?, Fulfill Order, Validate Order, Orders, Inventory

SLIDE 9

But, we forgot something…

SLIDE 10

The Problem is DATA

SLIDE 11

Cloud Native Architectures are Different.
 We need data architectures for cloud. And Data is about context and sharing

SLIDE 12

Let's say I have this:

Order Service → Validation: ValidateOrder(id, user, product, price, amount..) → True

SLIDE 13

We need Fraud Detection

SLIDE 14

Option (WARNING: Antipattern):

Order Service → Validation: ValidateOrder(id, user, product, price, amount..) → True

Fraud Service, Alert Service

SLIDE 15

Option (WARNING: Antipattern):

Order Service → Validation: ValidateOrder(id, user, product, price, amount..) → True

Fraud Service, Order history Service, customer history Service, credit alerts Service

SLIDE 16

What I want is a really smart validator

Order Service → Validation: ValidateOrder(id, user, product, price, amount..) → True

SLIDE 17

Order Service Validation

Maybe even more than one

Proxy

new Validation

SLIDE 18

The challenges

  • Services are really Stateful
  • Data has history
  • Data is shared

SLIDE 19

Let's Look at Patterns

SLIDE 20

Publish Events

SLIDE 21

Events are not:

  • Commands
  • Queries
  • Requests

Events are:

  • Things that happened
  • Notification
  • Data

SLIDE 22

Buying an iPad 
 (with REST)

  • Orders Service calls Shipping Service to tell it to ship item.
  • Shipping service looks up address to ship to (from Customer Service)

Webserver → (Submit Order) → Orders Service → shipOrder() → Shipping Service → getCustomer() → Customer Service

SLIDE 23

Using events for Notification

  • Orders Service no longer knows about the Shipping service (or any other service). Events are fire and forget.

Webserver → (Submit Order) → Orders Service → (Order Created, Notification) → Event Bus == Kafka → Shipping Service → getCustomer() (REST) → Customer Service
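
A minimal sketch of the fire-and-forget idea, assuming an in-memory `EventBus` class as a stand-in for Kafka (the class, the topic name `orders`, and the handler name are hypothetical; a real Kafka consumer pulls records asynchronously rather than being called inline):

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for an event bus (illustration only, not Kafka)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Fire and forget: the publisher gets nothing back from subscribers.
        for handler in self._subscribers[topic]:
            handler(event)


bus = EventBus()
shipped = []


def shipping_on_order_created(event):
    # Shipping Service reacts to the event on its own.
    shipped.append(event["order"])


bus.subscribe("orders", shipping_on_order_created)

# Orders Service publishes the fact; it holds no reference to Shipping.
bus.publish("orders", {"type": "OrderCreated", "order": 1, "product": "ipad"})
```

Adding another consumer of `orders` requires no change to the publisher, which is the decoupling the slide describes.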

SLIDE 24

Using events to 
 share facts

  • Call to Customer service is gone.
  • Instead data is replicated, as events, into the shipping service, where it is queried locally.

Webserver → (Submit Order) → Orders Service → (Order Created) → Event Bus == Kafka → Shipping Service
Customer Service → (Customer Updated) → Event Bus == Kafka → Shipping Service (data is replicated)
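
A rough sketch of the replicated-facts pattern, assuming plain dictionaries as events (the field names `customer_id` and `address` and both handler names are hypothetical): the Shipping Service keeps its own copy of customer data and answers lookups locally instead of calling getCustomer().

```python
# Shipping Service's local copy of customer data, keyed by customer id.
customers_local = {}


def on_customer_updated(event):
    # Each Customer Updated event overwrites the previous fact for that customer.
    customers_local[event["customer_id"]] = event["address"]


def on_order_created(event):
    # Local lookup replaces the remote getCustomer() call.
    address = customers_local.get(event["customer_id"])
    return {"order": event["order"], "ship_to": address}


on_customer_updated({"customer_id": 7, "address": "1 Main St"})
on_customer_updated({"customer_id": 7, "address": "2 Oak Ave"})
shipment = on_order_created({"order": 1, "customer_id": 7})
```

The trade-off is that the local copy is only as fresh as the last event consumed.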

SLIDE 25

Need someone else’s events? 


Change Data Capture

Mainframe → Kafka Connect → APACHE KAFKA

SLIDE 26

Need someone else’s events? 


Change Data Capture

Database → Kafka Connect (Debezium) → APACHE KAFKA

update accounts set total=total+50 where id=600

{key=600,
 old_record={…
  vip=f,
  total=300,
  …},
 new_record={…
  vip=f,
  total=350,
  …}
}
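
A small sketch of how a consumer might use such a change event, keeping the slide's simplified field names `key`, `old_record`, and `new_record` (Debezium's own envelope calls these `before` and `after`; the helper names and the elided fields left out here are assumptions for illustration):

```python
# Change event shaped like the one on the slide (elided fields omitted).
change_event = {
    "key": 600,
    "old_record": {"vip": False, "total": 300},
    "new_record": {"vip": False, "total": 350},
}


def changed_fields(event):
    """Return {field: (old, new)} for every field whose value changed."""
    old, new = event["old_record"], event["new_record"]
    return {
        name: (old.get(name), new[name])
        for name in new
        if old.get(name) != new[name]
    }


def apply_change(replica, event):
    """Keep a local replica of the table in step with the source database."""
    replica[event["key"]] = event["new_record"]


accounts_replica = {}
apply_change(accounts_replica, change_event)
diff = changed_fields(change_event)
```

Carrying both the old and the new record lets a consumer react to the change itself (here, `total` going from 300 to 350), not just to the latest value.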

SLIDE 27

Local state
 for Microservices

SLIDE 28

We have a stream of events:

event 1: {order:1, product: iphone, status: created}
event 2: {order:1, product: iphone, status: valid}
event 3: {order:2, product: ipad, status: created}
event 4: {order:1, product: iphone, status: shipped}

SLIDE 29

Store current state:

event 1: {order:1, product: iphone, status: created}
event 2: {order:1, product: iphone, status: valid}
event 3: {order:2, product: ipad, status: created}
event 4: {order:1, product: iphone, status: shipped}

Order 1 -> iphone, shipped
Order 2 -> ipad, created
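
Deriving the current state from the stream can be sketched as a simple fold over the four events, assuming they arrive in order (the function name `current_state` is hypothetical):

```python
events = [
    {"order": 1, "product": "iphone", "status": "created"},
    {"order": 1, "product": "iphone", "status": "valid"},
    {"order": 2, "product": "ipad", "status": "created"},
    {"order": 1, "product": "iphone", "status": "shipped"},
]


def current_state(events):
    """Fold the stream into a table: the latest event per order wins."""
    state = {}
    for event in events:
        state[event["order"]] = (event["product"], event["status"])
    return state


state = current_state(events)
```

After the fold, `state` maps order 1 to `("iphone", "shipped")` and order 2 to `("ipad", "created")`, matching the table on the slide.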

SLIDE 30

Duplicate data?

  • Low risk due to shared event stream
  • Just the data you need
  • Sharded with the application

SLIDE 31

SLIDE 32

Sharded View

{order:1, product: iphone, status: created}
{order:1, product: iphone, status: valid}
{order:2, product: ipad, status: created}
{order:1, product: iphone, status: shipped}

Odd orders: Order 1 -> iphone, shipped
Even orders: Order 2 -> ipad, created
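
A sketch of the odd/even split, assuming a modulo on the order id picks the shard (Kafka's default partitioner hashes the record key instead; the function names are hypothetical):

```python
def shard_for(order_id, num_shards=2):
    # Simplification: modulo on the id. Shard 0 holds even orders, shard 1 odd.
    return order_id % num_shards


def build_sharded_views(events, num_shards=2):
    """Each application instance keeps only the orders routed to its shard."""
    views = [dict() for _ in range(num_shards)]
    for event in events:
        view = views[shard_for(event["order"], num_shards)]
        view[event["order"]] = (event["product"], event["status"])
    return views


events = [
    {"order": 1, "product": "iphone", "status": "created"},
    {"order": 1, "product": "iphone", "status": "valid"},
    {"order": 2, "product": "ipad", "status": "created"},
    {"order": 1, "product": "iphone", "status": "shipped"},
]

even_view, odd_view = build_sharded_views(events)
```

Because all events for one order land on the same shard, each instance can build its part of the view without coordinating with the others.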

SLIDE 33

Better than shared DB

  • The data I need, the way I need it
  • Reduced dependencies
  • Low latency
  • Events are also triggers

SLIDE 34

select order_id, customer_id, product where total_value>10000
 …
 And also, if you get one like that in the future, execute callback()
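
One way to picture "query plus trigger" is a stored filter that runs a callback on every future match. This is a sketch under assumptions: the `ContinuousQuery` class is hypothetical, and the sample orders and their field values are made up for illustration.

```python
class ContinuousQuery:
    """A stored filter plus a callback that fires on every future match."""

    def __init__(self, predicate, callback):
        self.predicate = predicate
        self.callback = callback
        self.results = []

    def on_event(self, order):
        if self.predicate(order):
            # Project the same columns as the select on the slide.
            row = (order["order_id"], order["customer_id"], order["product"])
            self.results.append(row)
            self.callback(row)


alerts = []
big_orders = ContinuousQuery(
    predicate=lambda order: order["total_value"] > 10000,
    callback=alerts.append,
)

for order in [
    {"order_id": 1, "customer_id": 7, "product": "iphone", "total_value": 900},
    {"order_id": 2, "customer_id": 8, "product": "ipad", "total_value": 12000},
]:
    big_orders.on_event(order)
```

Unlike a one-off database query, the filter stays registered, so later events that match still reach the callback.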

SLIDE 35

Reporting Live 
 from Streams of Events

SLIDE 36

Requirements

  • Aggregated reports
  • Combining data from many services
  • Updated in real time
  • Scalable and resilient

SLIDE 37

Orders shipments customers Reporter 1 Reporter 2

My Browser

SLIDE 38

State Recovery

Instance 1 (Trade Stats App), Instance 2 (Trade Stats App)

Changelog Topic → restore
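
State recovery can be sketched by replaying a changelog, assuming a shared Python list stands in for the changelog topic (the `StatsApp` class and the trade symbols are hypothetical):

```python
class StatsApp:
    """Counts trades per symbol and logs every state change to a changelog."""

    def __init__(self, changelog):
        self.changelog = changelog  # shared list standing in for the topic
        self.counts = {}

    def process(self, symbol):
        self.counts[symbol] = self.counts.get(symbol, 0) + 1
        # Record the new value so another instance can rebuild the state.
        self.changelog.append((symbol, self.counts[symbol]))

    def restore(self):
        # Replay the changelog in order; the last value per key wins.
        self.counts = {}
        for key, value in self.changelog:
            self.counts[key] = value


changelog = []
instance1 = StatsApp(changelog)
for symbol in ["AAPL", "MSFT", "AAPL"]:
    instance1.process(symbol)

# Instance 1 fails; instance 2 takes over and restores from the changelog.
instance2 = StatsApp(changelog)
instance2.restore()
```

The local state is disposable because the changelog is the durable copy.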

SLIDE 39

3-layer data model

SLIDE 40

Who controls the data format?

  • Publishers?
  • Consumers?
  • How do we share events?

Producer → Raw events → Integrator → Clean, standard events → Consumer pre-processor → Enriched, aggregated events → Consumer
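
The three layers can be sketched as two small transformations, assuming made-up raw field names and a revenue-per-product aggregate (the functions `integrate` and `preprocess`, the field names, and the sample values are all hypothetical):

```python
# Layer 1: raw events, in whatever shape the producer emits them.
raw_events = [
    {"ORDER_ID": "1", "prod": " iPhone ", "price": "999.00"},
    {"ORDER_ID": "2", "prod": "ipad", "price": "499.50"},
    {"ORDER_ID": "3", "prod": "IPAD", "price": "499.50"},
]


def integrate(raw):
    """Integrator: raw event -> clean, standard event."""
    return {
        "order_id": int(raw["ORDER_ID"]),
        "product": raw["prod"].strip().lower(),
        "price": float(raw["price"]),
    }


def preprocess(clean_events):
    """Consumer pre-processor: clean events -> aggregated view for one consumer."""
    totals = {}
    for event in clean_events:
        totals[event["product"]] = totals.get(event["product"], 0.0) + event["price"]
    return totals


clean = [integrate(raw) for raw in raw_events]  # layer 2
revenue_by_product = preprocess(clean)          # layer 3
```

In this split the producer owns the raw format, the integrator owns the standard format, and each consumer owns its own enriched view.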

SLIDE 41

In Event Streaming World Event Schemas ARE the API

SLIDE 42

SLIDE 43

Take Away Points!

SLIDE 44

Remember This

  • As you design cloud-native architectures, don’t forget the data
  • Publish events
  • Build views and reports from events
  • Be nice to each other

SLIDE 45

Orchestration vs Choreography

SLIDE 46

Orchestration: One Service to Rule them all

Orchestrator: Step 1, Step 2, Step 3, Step 4

Step 1
If success: Step 2
Else: Step 3
Finally: Step 4
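
The orchestrator's control flow can be sketched directly, assuming each step is a plain callable and Step 1 returns a truthy value on success (the `orchestrate` function and its log are hypothetical):

```python
def orchestrate(step1, step2, step3, step4):
    """One central function decides which step runs next."""
    log = []
    try:
        succeeded = step1()
        log.append("step1")
        if succeeded:
            step2()
            log.append("step2")
        else:
            step3()
            log.append("step3")
    finally:
        # Step 4 always runs, whatever happened before.
        step4()
        log.append("step4")
    return log


def noop():
    return None


happy_path = orchestrate(lambda: True, noop, noop, noop)
failure_path = orchestrate(lambda: False, noop, noop, noop)
```

All of the workflow logic lives in one place, which makes it easy to read but also makes the orchestrator a dependency of every step.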

SLIDE 47

Choreography: We react to each other

Step 1 Step 2 Step 3 Step 4

Orders

Success Fail Shipped emailed
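
By contrast, a choreography has no central controller: each service registers for the event types it cares about and emits the next event. This sketch assumes a wiring of order_created → success/fail → shipped → emailed inferred from the labels on the slide; the registry, handler names, and the `valid` field are hypothetical.

```python
from collections import defaultdict, deque

reactions = defaultdict(list)  # event type -> handlers that react to it


def reacts_to(event_type):
    def register(handler):
        reactions[event_type].append(handler)
        return handler
    return register


@reacts_to("order_created")
def validate(order):
    return "success" if order["valid"] else "fail"  # next event type


@reacts_to("success")
def ship(order):
    return "shipped"


@reacts_to("fail")
@reacts_to("shipped")
def email(order):
    return "emailed"


def run(first_event, order):
    """Deliver events until no service reacts any more; return the trace."""
    trace, queue = [], deque([first_event])
    while queue:
        event_type = queue.popleft()
        trace.append(event_type)
        for handler in reactions[event_type]:
            queue.append(handler(order))
    return trace


ok_trace = run("order_created", {"valid": True})
bad_trace = run("order_created", {"valid": False})
```

No function here knows the whole workflow; the sequence emerges from who reacts to what.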