Keep your Data Close and your Caches Hotter using Apache Kafka, Connect and KSQL



SLIDE 1 @gamussa | @riferrei | #IMCSummit

Keep your Data Close and your Caches Hotter
using Apache Kafka, Connect and KSQL
SLIDE 2
SLIDE 4

Raffle, yeah 🚁

Follow @gamussa @riferrei
📹 🖽 👭
Tag @gamussa @riferrei with #IMCSummit

SLIDE 5

Data is only useful if it is Fresh and Contextual

SLIDE 6
SLIDE 7

What if the airbag deploys 30 seconds after the collision?

SLIDE 8
SLIDE 9

December 6th, 2010: Commuter rail train hits elderly driver

SLIDE 10

What if the information about the commuter rail train is outdated?

SLIDE 11

Caches can be a Solution for Data that is Fresh

SLIDE 16

APIs need to access data freely and easily

  • Data should never be treated as a scarce resource in applications
  • Latency should be kept to a minimum to ensure a better user experience
  • Data should not be static: keep the data fresh continuously
  • Find ways to handle large amounts of data without breaking the APIs

[Diagram: API, Cache, Read, Write]
SLIDE 21

Caches can be either built-in or distributed

  • If data can fit into the API memory, then you should use built-in caches
  • Otherwise, you may need to use distributed caches for large sizes
  • Some cache implementations provide the best of both cases
  • For distributed caches, make sure to always find a good way to keep lookups O(1)

[Diagram: Built-in Caches vs. Distributed Caches]
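
The O(1) point above can be illustrated with a rough sketch. It assumes a toy "distributed" cache whose nodes are plain in-process dictionaries, and it picks the owning node by hashing the key. The class names `BuiltInCache` and `PartitionedCache` are made up for this example and do not come from any cache product.

```python
import zlib


class BuiltInCache:
    """Cache living inside the API process: a dict with O(1) average lookups."""

    def __init__(self):
        self._entries = {}

    def get(self, key):
        return self._entries.get(key)

    def put(self, key, value):
        self._entries[key] = value


class PartitionedCache:
    """Toy stand-in for a distributed cache; each 'node' is a local dict here."""

    def __init__(self, node_count):
        self._nodes = [BuiltInCache() for _ in range(node_count)]

    def node_for(self, key):
        # A stable hash of the key selects the owning node in constant time,
        # so a lookup never has to ask every node.
        return zlib.crc32(key.encode("utf-8")) % len(self._nodes)

    def get(self, key):
        return self._nodes[self.node_for(key)].get(key)

    def put(self, key, value):
        self._nodes[self.node_for(key)].put(key, value)
```

Plain modulo hashing remaps most keys whenever the number of nodes changes, which is why real distributed caches usually rely on a more careful partitioning scheme such as consistent hashing.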
SLIDE 22

DEMO

SLIDE 23
SLIDE 24
SLIDE 25

Join the fun!

SLIDE 26
SLIDE 27

Caching Patterns

SLIDE 28

Caching Pattern: Refresh Ahead

  • Proactively updates the cache
  • Keeps the entries always in sync
  • Ideal for latency-sensitive cases
  • Ideal when reading the data is costly
  • It may need initial data loading

[Diagram: Kafka Connect, Cache, API]
SLIDE 29

Caching Pattern: Refresh Ahead / Adapt

  • Proactively updates the cache
  • Keeps the entries always in sync
  • Ideal for latency-sensitive cases
  • Ideal when reading the data is costly
  • It may need initial data loading

[Diagram: Kafka Connect, Application, Cache, API. Annotations: transform and adapt records before delivery; Schema Registry for canonical models]
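
A minimal sketch of the refresh-ahead idea, assuming the cache is a plain dictionary and that change records arrive as `(key, value)` pairs. In the deck this job is done by Kafka Connect, not by hand-written code. The `adapt` function and the field names `id`, `st`, `trainId` and `status` are hypothetical, chosen only to show a "transform and adapt" step before delivery.

```python
def adapt(raw):
    """Hypothetical transform: map a source row to the shape the API expects."""
    return {"trainId": raw["id"], "status": raw["st"].lower()}


def initial_load(cache, rows):
    """Warm the cache once, before the API starts serving reads."""
    for row in rows:
        entry = adapt(row)
        cache[entry["trainId"]] = entry


def on_change(cache, key, raw):
    """Apply one change record as soon as it arrives, ahead of any read."""
    if raw is None:
        # A missing value is treated as a delete in this sketch.
        cache.pop(key, None)
    else:
        cache[key] = adapt(raw)


def api_read(cache, key):
    """The API only ever reads the cache; it never calls the source system."""
    return cache.get(key)
```

Because updates are pushed into the cache as they happen, a read never has to wait for the costly source system.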
SLIDE 30

Caching Pattern: Write Behind

  • Removes I/O pressure from the app
  • Allows true horizontal scalability
  • Ensures ordering and persistence
  • Minimizes DB code complexity
  • Totally handles DB unavailability

[Diagram: Kafka Connect, Application, Cache, API]
SLIDE 31

Caching Pattern: Write Behind / Adapt

  • Removes I/O pressure from the app
  • Allows true horizontal scalability
  • Ensures ordering and persistence
  • Minimizes DB code complexity
  • Totally handles DB unavailability

[Diagram: Kafka Connect, Application, Cache, API. Annotations: transform and adapt records before delivery; Schema Registry for canonical models]
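
The write-behind bullets can be sketched roughly as follows. This is an illustration under stated assumptions, not the deck's implementation: an in-memory `deque` stands in for the Kafka topic, and `FlakyDatabase` is a made-up class that simulates an outage. A real topic is durable, while this queue is not.

```python
from collections import deque


class FlakyDatabase:
    """Hypothetical database stub that can be switched off to simulate an outage."""

    def __init__(self):
        self.rows = {}
        self.available = True

    def write(self, key, value):
        if not self.available:
            raise ConnectionError("database is down")
        self.rows[key] = value


class WriteBehindCache:
    def __init__(self, database):
        self.entries = {}
        self.pending = deque()  # stands in for an ordered, persistent log
        self.database = database

    def put(self, key, value):
        # The application only touches the cache and the log: no DB I/O here.
        self.entries[key] = value
        self.pending.append((key, value))

    def get(self, key):
        return self.entries.get(key)

    def drain(self):
        """Write pending changes to the DB in order; stop at the first failure."""
        written = 0
        while self.pending:
            key, value = self.pending[0]
            try:
                self.database.write(key, value)
            except ConnectionError:
                break  # keep the entry queued and retry on the next drain
            self.pending.popleft()
            written += 1
        return written
```

While the database is down, writes keep succeeding against the cache and pile up in order. Once it is back, `drain()` replays them in the same order.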
SLIDE 32

Caching Pattern: Event Federation

  • Replicates data across regions
  • Keeps multiple regions in sync
  • Great to improve RPO and RTO
  • Handles lazy/slow networks well
  • Works well if it's used along with the Read-Through and Write-Through patterns

[Diagram: Confluent Replicator <<MirrorMaker>>]
SLIDE 33

Kafka Connect Implementation Strategies

SLIDE 34

Kafka Connect support for In-Memory Caches

  • Connector for Redis is open source and available in Confluent Hub
  • Connector for Memcached is open source and available in Confluent Hub
  • Connectors exist for both GridGain and Apache Ignite implementations
  • Connector for Infinispan is open source and maintained by Red Hat
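
Whichever connector is chosen, it is registered with a Kafka Connect worker through a JSON request. The sketch below only builds that request body. The keys `connector.class`, `tasks.max` and `topics` are standard Connect settings for sink connectors, while the connector name, the class `com.example.cache.CacheSinkConnector` and the topic `train-positions` are placeholders made up for illustration. Each real connector documents its own class name and extra settings.

```python
import json


def sink_connector_request(name, connector_class, topics, extra=None):
    """Build the JSON body used to create a sink connector."""
    config = {
        "connector.class": connector_class,
        "tasks.max": "1",
        "topics": ",".join(topics),
    }
    # Connector-specific settings (hosts, credentials, ...) would go here.
    config.update(extra or {})
    return {"name": name, "config": config}


payload = sink_connector_request(
    name="cache-sink",  # placeholder name
    connector_class="com.example.cache.CacheSinkConnector",  # placeholder class
    topics=["train-positions"],  # placeholder topic
)
body = json.dumps(payload, indent=2)
```

The resulting `body` is what would be sent with an HTTP POST to the `/connectors` endpoint of a Connect worker, which listens on port 8083 by default.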
SLIDE 35

Frameworks for other In-Memory Caches

  • Oracle provides HotCache from GoldenGate for Oracle Coherence
  • Hazelcast has the Jet framework, which provides support for Kafka
  • Pivotal GemFire (Apache Geode) has good support from Spring
  • Good news: you can always write your own sink using the Connect API

[Diagram labels: Oracle GoldenGate Hazelcast Jet Spring Data Spring Kafka Connect Framework Any Cache]
SLIDE 36

Interested in DB CDC? Then meet Debezium!

  • Amazing CDC technology to pull data out from databases to Kafka
  • Works at the log level, which means a true CDC implementation for your projects instead of record polling
  • Open source, maintained by Red Hat. Has broad support for many popular databases
  • It is built on top of Kafka Connect
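
To make the CDC idea concrete, here is a rough sketch of how a cache could be kept in sync from Debezium-style change events. It assumes the event value has already been deserialized into a dictionary with the envelope fields `op`, `before` and `after`, and that a delete may be followed by a record with no value (a tombstone). The function name and the sample rows are invented for this example.

```python
def apply_change_event(cache, key, event):
    """Apply one change event to a dict-based cache."""
    if event is None:
        # Tombstone record: the row is gone, so drop the entry if present.
        cache.pop(key, None)
        return
    op = event["op"]
    if op in ("c", "u", "r"):
        # create, update, or snapshot read: 'after' holds the current row
        cache[key] = event["after"]
    elif op == "d":
        # delete: 'after' is empty and 'before' holds the removed row
        cache.pop(key, None)
```

Because the events come from the database log, the cache sees every insert, update and delete without any polling query against the tables.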
SLIDE 37

Support for Running Kafka Connect Servers

  • Run by yourself on bare metal: https://kafka.apache.org/downloads https://www.confluent.io/download
  • IaaS on AWS or Google Cloud: https://github.com/confluentinc/ccloud-tools
  • Running using Docker containers: https://hub.docker.com/r/confluentinc/cp-kafka-connect/
  • Running using Kubernetes: https://github.com/confluentinc/cp-helm-chart https://www.confluent.io/confluent-operator/
SLIDE 38

Stay in touch

  • cnfl.io/meetups
  • cnfl.io/slack
  • cnfl.io/blog
SLIDE 39

Thanks!

@riferrei ricardo@confluent.io
@gamussa viktor@confluent.io

https://slackpass.io/confluentcommunity #connect #ksql

SLIDE 40