Streaming Microservices: Contracts & Compatibility - Gwen Shapira



SLIDE 1

Streaming Microservices: Contracts & Compatibility

Gwen Shapira

Confluent Inc.

SLIDE 2

APIs are contracts between services

Profile service Quote service

{user_id: 53, address: “2 Elm st”} {user_id: 53, quote: 580}

SLIDE 3

But not all services talk to each other directly

Profile service Quote service

{user_id: 53, address: “2 Elm st.”} {user_id: 53, quote: 580}

SLIDE 4

And naturally…

Profile service Quote service Stream processing Profile database

{user_id: 53, address: “2 Elm st.”}

SLIDE 5

… and then you have a streaming platform

Producer Consumer

Streaming Applications Connectors Connectors

Apache Kafka

SLIDE 6

Schemas are APIs.

SLIDE 7

It isn’t just about the services

Software engineering Teams & Culture Data & Metadata

SLIDE 8

Lack of schemas can tightly couple teams and services

2001 2001 Citrus Heights-Sunrise Blvd Citrus_Hghts 60670001 3400293 34 SAC Sacramento SV Sacramento Valley SAC Sacramento County APCD SMA8 Sacramento Metropolitan Area CA 6920 Sacramento 28 6920 13588 7400 Sunrise Blvd 95610 38 41 56 38.6988889 121 16 15.98999977 -121.271111 10 4284781 650345 52

SLIDE 9

Schemas are about how teams work together

Booking service

{user_id: 53, timestamp: 1497842472}

create table ( user_id number, timestamp number)

new Date(timestamp)

Attribution service Booking DB

SLIDE 10

Booking service

{user_id: 53, timestamp: “June 28, 2017 4:00pm”}

Attribution service Booking DB

SLIDE 11

Moving fast and breaking things

Booking service

{user_id: 53, timestamp: “June 28, 2017 4:00pm”}

create table ( user_id number, timestamp number)

new Date(timestamp)

Attribution service Booking DB
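
To make the break concrete, here is a minimal sketch of what a downstream consumer experiences when the timestamp silently changes from a number to a string. The class and method names (`TimestampContract`, `toInstant`) are hypothetical, and the event is modeled as a plain `Map` rather than a real Kafka record. The consumer was written against the numeric contract, so the first event works and the second fails only at read time:

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical consumer-side helper. It assumes the producer's original
// contract, where "timestamp" is epoch seconds encoded as a number.
final class TimestampContract {

    static Instant toInstant(Map<String, Object> event) {
        Object ts = event.get("timestamp");
        if (!(ts instanceof Number)) {
            // Without a schema, this is the first place the break is noticed.
            throw new IllegalArgumentException(
                "timestamp must be epoch seconds, got: " + ts);
        }
        return Instant.ofEpochSecond(((Number) ts).longValue());
    }

    public static void main(String[] args) {
        // Event shape from the earlier slide: numeric timestamp.
        Map<String, Object> before = new HashMap<>();
        before.put("user_id", 53);
        before.put("timestamp", 1497842472L);
        System.out.println(toInstant(before));

        // Event shape after the producer "moved fast": string timestamp.
        Map<String, Object> after = new HashMap<>();
        after.put("user_id", 53);
        after.put("timestamp", "June 28, 2017 4:00pm");
        try {
            toInstant(after);
        } catch (IllegalArgumentException e) {
            System.out.println("contract broken: " + e.getMessage());
        }
    }
}
```

The failure surfaces in the consumer, long after the producer shipped the change, which is the coupling problem the slide is pointing at.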

SLIDE 12

Back in my day… It was never a problem.

SLIDE 13

And then it was.

SLIDE 14

Moving data around since 1997

Missing my schema since 2012

Apache Kafka PMC

Tweeting a lot @gwenshap

SLIDE 15

Existing solutions

SLIDE 16

Existing solutions

“It is a communication problem”

“We need to improve our process”

“We need to document everything and get stakeholder approval”

SLIDE 17

Schemas are APIs.

  • We need specifications
  • We need to make changes to them
  • We need to detect breaking changes
  • We need versions
  • We need tools

SLIDE 18

Imagine a world where engineers can find the data they need and use it safely. It’s easy if you try.

SLIDE 19

There are benefits to doing this well

Booking service Bookings Profile updates

Room Gift service loyalty service

Room gift requests

SLIDE 20

Sometimes, magic happens

Booking service

Profile updates

Room Gift service loyalty service

Room gift requests Bookings New! Beach promo

SLIDE 21

… but most days I’m happy if the data pipelines are humming and nothing breaks.

SLIDE 22

SLIDE 23

Forward compatibility:

SLIDE 24

Forward & Backward compatibility:

SLIDE 25

Compatibility Rules

Forward Compatibility

  • Avro: can add fields; can delete optional fields (nullable / default)
  • JSON: can add fields

Backward Compatibility

  • Avro: can delete fields; can add optional fields
  • JSON: can delete fields

Full Compatibility

  • Avro: can only modify optional fields
  • JSON: nothing is safe
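
The Avro rules above can be sketched as a single question asked in two directions: can a reader decode data written with another schema? The sketch below is a simplification, not Avro's actual resolution logic. It models a schema as a map from field name to "has a default", ignores type changes entirely, and all names (`CompatibilityRules`, `canRead`) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model: schema = field name -> true if the field is optional
// (has a default). Type changes are deliberately ignored in this sketch.
final class CompatibilityRules {

    // A reader can decode the writer's data if every reader field that the
    // writer does not supply has a default to fall back on.
    static boolean canRead(Map<String, Boolean> reader, Map<String, Boolean> writer) {
        for (Map.Entry<String, Boolean> field : reader.entrySet()) {
            boolean missingFromWriter = !writer.containsKey(field.getKey());
            if (missingFromWriter && !field.getValue()) {
                return false;
            }
        }
        return true;
    }

    // Backward: consumers on the NEW schema can read data written with the OLD one.
    static boolean isBackwardCompatible(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return canRead(newSchema, oldSchema);
    }

    // Forward: consumers still on the OLD schema can read data written with the NEW one.
    static boolean isForwardCompatible(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return canRead(oldSchema, newSchema);
    }

    static boolean isFullyCompatible(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        return isBackwardCompatible(newSchema, oldSchema)
            && isForwardCompatible(newSchema, oldSchema);
    }

    public static void main(String[] args) {
        Map<String, Boolean> v1 = new LinkedHashMap<>();
        v1.put("user_id", false);
        v1.put("address", false);

        // v2 adds a required field: old readers ignore it, new readers miss it in old data.
        Map<String, Boolean> v2 = new LinkedHashMap<>(v1);
        v2.put("phone", false);

        System.out.println("forward:  " + isForwardCompatible(v2, v1));
        System.out.println("backward: " + isBackwardCompatible(v2, v1));
    }
}
```

Adding a required field passes the forward check and fails the backward one, which matches the table: adding is forward-safe, but only adding optional fields is backward-safe.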

SLIDE 26

It is confusing. So it is tempting to simplify

“Never change anything”

“Adding fields is ok. Deleting is not”

“Everything is always optional except for the primary key”

SLIDE 27

Enter Schema Registry

SLIDE 28

Schema Registries Everywhere

SLIDE 29

What do Schema Registries do?

1. Store schemas – put/get
2. Link one or more schemas to each event
3. Java client that fetches & caches schemas
4. Enforcement of compatibility rules
5. Graphical browser

SLIDE 30

Make those contracts binding

SerializationException

SLIDE 31

Responsibility is slightly distributed

Producer Serializer Schema Registry

SLIDE 32

Producers contain Serializers

1. Define the serializers:

// Keys are plain strings; values are Avro records whose schemas live in Schema Registry.
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("schema.registry.url", schemaUrl);
…
Producer<String, LogLine> producer = new KafkaProducer<String, LogLine>(props);

2. Create a record:

// The record key is the client IP; the value is the Avro-generated LogLine event.
ProducerRecord<String, LogLine> record = new ProducerRecord<String, LogLine>(topic, event.getIp().toString(), event);

3. Send the record:

producer.send(record);

SLIDE 33

Serializers cache schemas, register new schema … and serialize

serialize(topic, isKey, object):
    subject = getSubjectName(topic, isKey)
    schema = getSchema(object)
    schemaIdMap = schemaCache.get(subject)
    if (schemaIdMap.containsKey(schema)):
        id = schemaIdMap.get(schema)
    else:
        id = registerAndGetId(subject, schema)
        schemaIdMap.put(schema, id)
    output = MAGIC_BYTE + id + avroWriter(schema, object)
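
The last line of the pseudocode, `MAGIC_BYTE + id + avroWriter(schema, object)`, describes how each message is framed. Below is a minimal sketch of that framing, assuming the layout documented for Confluent's serializers: one zero magic byte, a 4-byte big-endian schema id, then the Avro-encoded payload. The class name `WireFormat` and its methods are hypothetical, and the payload is treated as opaque bytes standing in for the Avro writer's output.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Sketch of the message framing only; Avro encoding itself is out of scope here.
final class WireFormat {
    static final byte MAGIC_BYTE = 0x0;
    private static final int HEADER_SIZE = 1 + 4; // magic byte + int schema id

    // Prefix an already-encoded payload with the magic byte and schema id.
    static byte[] frame(int schemaId, byte[] avroPayload) {
        ByteBuffer buffer = ByteBuffer.allocate(HEADER_SIZE + avroPayload.length);
        buffer.put(MAGIC_BYTE);
        buffer.putInt(schemaId); // ByteBuffer is big-endian by default
        buffer.put(avroPayload);
        return buffer.array();
    }

    // Read the schema id back so a consumer knows which schema to fetch.
    static int schemaIdOf(byte[] message) {
        ByteBuffer buffer = ByteBuffer.wrap(message);
        if (buffer.get() != MAGIC_BYTE) {
            throw new IllegalArgumentException("Unknown magic byte");
        }
        return buffer.getInt();
    }

    // Everything after the header is the encoded record.
    static byte[] payloadOf(byte[] message) {
        return Arrays.copyOfRange(message, HEADER_SIZE, message.length);
    }

    public static void main(String[] args) {
        byte[] framed = frame(42, new byte[] {1, 2, 3});
        System.out.println("schema id: " + schemaIdOf(framed));
        System.out.println("payload:   " + Arrays.toString(payloadOf(framed)));
    }
}
```

Because only the id travels with each message, the schema itself is fetched once and cached, which is why the serializer keeps a per-subject schema-to-id map.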

SLIDE 34

Schema Registry caches schemas and validates compatibility

register(schema, subject):
    if (schemaIsNewToSubject):
        prevSchema = getPrevSchema(subject)
        level = getCompatibilityLevel(subject)
        if (level == FULL):
            validator = new SchemaValidatorBuilder().mutualReadStrategy().validateLatest()
        if (validator.isCompatible(schema, prevSchema))
            register
        else
            throw …
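
The registration flow above can be sketched as a toy in-memory registry. This is an illustration of the control flow only, not Schema Registry's implementation: schemas are plain strings, the compatibility check is injected by the caller as a `BiPredicate`, and every name here (`InMemoryRegistry`, `register`) is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Toy registry: keeps schema versions per subject and rejects incompatible ones.
final class InMemoryRegistry {
    private final Map<String, List<String>> versionsBySubject = new HashMap<>();
    private final Map<String, Integer> idsBySchema = new HashMap<>();
    // Called as isCompatible.test(newSchema, prevSchema); supplied by the caller.
    private final BiPredicate<String, String> isCompatible;

    InMemoryRegistry(BiPredicate<String, String> isCompatible) {
        this.isCompatible = isCompatible;
    }

    // Returns the schema's id, registering it first if it is new to the subject.
    int register(String subject, String schema) {
        List<String> versions =
            versionsBySubject.computeIfAbsent(subject, s -> new ArrayList<>());
        if (!versions.contains(schema)) {
            if (!versions.isEmpty()) {
                // Validate against the latest registered version only.
                String prevSchema = versions.get(versions.size() - 1);
                if (!isCompatible.test(schema, prevSchema)) {
                    throw new IllegalStateException(
                        "Schema is incompatible with the latest version of " + subject);
                }
            }
            versions.add(schema);
        }
        Integer id = idsBySchema.get(schema);
        if (id == null) {
            id = idsBySchema.size() + 1;
            idsBySchema.put(schema, id);
        }
        return id;
    }

    public static void main(String[] args) {
        // Stand-in rule for the demo: a new schema must extend the previous text.
        InMemoryRegistry registry = new InMemoryRegistry((next, prev) -> next.startsWith(prev));
        System.out.println(registry.register("bookings-value", "user_id"));
        System.out.println(registry.register("bookings-value", "user_id,timestamp"));
    }
}
```

Re-registering a known schema is a cheap lookup, so producers can call `register` on every cache miss without creating duplicate versions.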

SLIDE 35

SLIDE 36

Maven Plugin – because we prefer to catch problems in CI/CD

http://docs.confluent.io/current/schema-registry/docs/maven-plugin.html

  • schema-registry:download
  • schema-registry:test-compatibility
  • schema-registry:register

SLIDE 37

So the flow is…

Dev → Test (nightly build / merge) → Prod

Dev or Mock Registry → Test Registry → Prod Registry

SLIDE 38

What if… I NEED to break compatibility?

Customer_v1 Customer_v2 Translator
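
One reading of this slide is that the incompatible schema gets its own stream (`Customer_v2`), and a small translator job converts `Customer_v1` events so that existing producers and new consumers can coexist during the migration. The sketch below shows only the conversion step. The fields on both versions are invented for illustration (the slide does not show them), as are the class and method names.

```java
// Hypothetical breaking change: v1 has a single "name" field,
// v2 splits it into firstName / lastName.
final class CustomerTranslator {

    static final class CustomerV1 {
        final int userId;
        final String name;

        CustomerV1(int userId, String name) {
            this.userId = userId;
            this.name = name;
        }
    }

    static final class CustomerV2 {
        final int userId;
        final String firstName;
        final String lastName;

        CustomerV2(int userId, String firstName, String lastName) {
            this.userId = userId;
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Convert one v1 event into its v2 shape. In a real pipeline this would run
    // inside a stream processing job reading Customer_v1 and writing Customer_v2.
    static CustomerV2 translate(CustomerV1 v1) {
        String trimmed = v1.name.trim();
        int split = trimmed.indexOf(' ');
        String first = split < 0 ? trimmed : trimmed.substring(0, split);
        String last = split < 0 ? "" : trimmed.substring(split + 1).trim();
        return new CustomerV2(v1.userId, first, last);
    }

    public static void main(String[] args) {
        CustomerV2 v2 = translate(new CustomerV1(53, "Ada Lovelace"));
        System.out.println(v2.userId + " " + v2.firstName + " / " + v2.lastName);
    }
}
```

Keeping the translation in one place means neither stream has to relax its compatibility level to accommodate the other.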

SLIDE 39

I have this stream processing job…

Nodes can and will modify the schema

SLIDE 40

Tracking services for fun and profit

SLIDE 41

Schema discovery for fun and profit

SLIDE 42

Can we enforce compliance better?

SLIDE 43

Speaking of headers…

SLIDE 44

And really, as an old school DBA I miss my constraints

SLIDE 45

Why should Avro users have all the fun?

SLIDE 46

Summary!

1. Schemas are APIs for event-driven services
2. Which means compatibility is critical
3. Use Schema Registry from Dev to Prod
4. Schema Registry is in Confluent Open Source

SLIDE 47

Thank You!