ETL is dead; long live streams

Neha Narkhede, Co-founder & CTO, Confluent

Data and data systems have really changed in the past decade.

Old world: two popular locations for data (diagram labels: DB, DB, DWH, DB, DB). Operational databases. Relational


  1. #1: Powerful and lightweight Java library; need just Kafka and your app

  2. #2: Convenient DSL with all sorts of operators: join(), map(), filter(), windowed aggregates, etc.

  3. Word count program using Kafka’s Streams API

  4. #3: True event-at-a-time stream processing; no microbatching

  5. #4: Dataflow-style windowing based on event-time; handles late-arriving data

  6. #5: Out-of-the-box support for local state; supports fast stateful processing

  7. External state

  8. Local state

  9. Fault-tolerant local state

  10. #6: Kafka’s Streams API allows reprocessing; useful to upgrade apps or do A/B testing

  11. Reprocessing

  12. Real-time dashboard for security monitoring

  13. Kafka’s Streams API: simple is beautiful (diagram labels: Vision 1, Vision 2)

  14. Logs unify batch and stream processing

  15. New shiny future of ETL: Kafka (diagram labels: Connect API, source, sink, Extract, Load, Streams API, app, Transforms)

  16. A giant mess! (diagram labels: App, cache, MQ, search, monitoring, security, DWH, Hadoop)
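
The slides on event-at-a-time processing, event-time windowing, and local state can be illustrated with a small word count. The sketch below is plain Java and deliberately does not use the Kafka Streams API; the class `WindowedWordCount` and its methods are hypothetical names chosen for illustration. It assumes non-negative event timestamps in milliseconds and fixed-size (tumbling) windows, and it never expires old windows, which a real system would need to do.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: plain Java, not the Kafka Streams API.
// Counts words per tumbling window, keyed by event time, one event at a time.
public class WindowedWordCount {
    private final long windowSizeMs;

    // Local state kept in memory: window start -> (word -> count).
    private final Map<Long, Map<String, Long>> state = new TreeMap<>();

    public WindowedWordCount(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    // Handle a single event as soon as it arrives (no batching).
    // The window is chosen from the event's own timestamp, so a
    // late-arriving event still updates the window it belongs to.
    public void process(long eventTimeMs, String line) {
        long windowStart = eventTimeMs - (eventTimeMs % windowSizeMs);
        Map<String, Long> counts =
            state.computeIfAbsent(windowStart, k -> new TreeMap<>());
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1L, Long::sum);
            }
        }
    }

    // Look up the current count for a word in the window starting at windowStart.
    public long count(long windowStart, String word) {
        return state
            .getOrDefault(windowStart, Collections.<String, Long>emptyMap())
            .getOrDefault(word, 0L);
    }
}
```

Calling `process(900, ...)` after `process(1500, ...)` still updates the window starting at 0, which is the behaviour the late-arriving-data slide describes. Making this state fault-tolerant, as a later slide mentions, is outside the scope of the sketch.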
