Streaming SQL to Unify Batch and Stream Processing: Theory and Practice with Apache Flink at Uber


SLIDE 1

Streaming SQL to Unify Batch and Stream Processing: Theory and Practice with Apache Flink at Uber

Strata Data Conference, San Jose, March 7, 2018

Fabian Hueske Shuyi Chen

SLIDE 2

What is Apache Flink?

Batch Processing

process static and historic data

Data Stream Processing

realtime results from data streams

Event-driven Applications

data-driven actions and services

Stateful Computations Over Data Streams

SLIDE 3

What is Apache Flink?

Stateful computations over data streams, real-time and historic: fast, scalable, fault-tolerant, in-memory, event time, large state, exactly-once.

[Diagram: historic data and streams flow into a Flink application; results feed queries, applications, devices, a database, a stream, or file/object storage.]

SLIDE 4

Hardened at scale

  • Streaming Platform Service: billions of messages per day, a lot of Stream SQL
  • Streaming Platform as a Service: 3700+ containers running Flink, 1400+ nodes, 22k+ cores, 100s of jobs, fraud detection
  • Streaming Analytics Platform: 100s of jobs, 1000s of nodes, TBs of state; metrics, analytics, real-time ML, Streaming SQL as a platform

SLIDE 5

Powerful Abstractions

SQL / Table API (dynamic tables) — high-level analytics API
DataStream API (streams, windows) — stream & batch data processing
Process Function (events, state, time) — stateful event-driven applications

val stats = stream
  .keyBy("sensor")
  .timeWindow(Time.seconds(5))
  .reduce((a, b) => a.add(b))

def processElement(event: MyEvent, ctx: Context, out: Collector[Result]) = {
  // work with event and state
  (event, state.value) match { … }

  out.collect(…)   // emit events
  state.update(…)  // modify state

  // schedule a timer callback
  ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
}

Layered abstractions to navigate simple to complex use cases

SLIDE 6

Apache Flink’s Relational APIs

Unified APIs for batch & streaming data

A query specifies exactly the same result regardless of whether its input is static batch data or streaming data.


LINQ-style Table API:

tableEnvironment
  .scan("clicks")
  .groupBy('user)
  .select('user, 'url.count as 'cnt)

ANSI SQL:

SELECT user, COUNT(url) AS cnt
FROM clicks
GROUP BY user

SLIDE 7

Query Translation

tableEnvironment
  .scan("clicks")
  .groupBy('user)
  .select('user, 'url.count as 'cnt)

SELECT user, COUNT(url) AS cnt
FROM clicks
GROUP BY user

Batch: input data is bounded
Streaming: input data is unbounded

SLIDE 8

What if “clicks” is a file?

Clicks

SELECT user, COUNT(url) AS cnt FROM clicks GROUP BY user

user | cTime    | url
Mary | 12:00:00 | https://…
Bob  | 12:00:00 | https://…
Mary | 12:00:02 | https://…
Liz  | 12:00:03 | https://…

Result:

user | cnt
Mary | 2
Bob  | 1
Liz  | 1

Input data is read at once. The result is produced at once.

SLIDE 9

What if “clicks” is a stream?

SELECT user, COUNT(url) AS cnt FROM clicks GROUP BY user

Clicks (arriving continuously):

user | cTime    | url
Mary | 12:00:00 | https://…
Bob  | 12:00:00 | https://…
Mary | 12:00:02 | https://…
Liz  | 12:00:03 | https://…

Result (updated continuously):

user | cnt
Bob  | 1
Liz  | 1
Mary | 1 → 2

Input data is continuously read. The result is continuously produced.

The result is identical!
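The identical-result claim can be simulated outside Flink. A minimal Python sketch (hypothetical data mirroring the clicks table above; not a Flink API) runs the same GROUP BY count once over all input (batch) and once row by row (streaming), and the final result matches:

```python
from collections import Counter

# Hypothetical click log, mirroring the "clicks" table on the slide.
clicks = [
    ("Mary", "12:00:00", "https://a"),
    ("Bob",  "12:00:00", "https://b"),
    ("Mary", "12:00:02", "https://c"),
    ("Liz",  "12:00:03", "https://d"),
]

def batch_count(rows):
    """Read all input at once, produce the result at once (batch)."""
    return Counter(user for user, _, _ in rows)

def stream_count(rows):
    """Consume rows one by one, emitting an updated count after each (streaming)."""
    counts = Counter()
    updates = []
    for user, _, _ in rows:
        counts[user] += 1
        updates.append((user, counts[user]))  # continuous result updates
    return counts, updates

final_stream, updates = stream_count(clicks)
assert batch_count(clicks) == final_stream  # the result is identical
print(updates)  # [('Mary', 1), ('Bob', 1), ('Mary', 2), ('Liz', 1)]
```

The streaming run emits an update per row (Mary's count goes from 1 to 2), but the materialized end state equals the batch result.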

SLIDE 10

Why is stream-batch unification important?

§ Usability

  • ANSI SQL syntax: No custom “StreamSQL” syntax.
  • ANSI SQL semantics: No stream-specific results.

§ Portability

  • Run the same query on bounded and unbounded data
  • Run the same query on recorded and real-time data

§ Do we need to soften SQL semantics for streaming?

[Diagram: a bounded query evaluates a fixed span of the past; an unbounded query runs from a point in the stream (or its start) into the future.]

SLIDE 11

DBMSs Run Queries on Streams

§ Materialized views (MVs) are similar to regular views, but their results are persisted to disk or memory

  • Used to speed up analytical queries
  • MVs need to be updated when the base tables change

§ MV maintenance is very similar to SQL on streams

  • Base table updates form a stream of DML statements
  • The MV definition query is evaluated on that stream
  • The MV is the query result and is continuously updated


SLIDE 12

Continuous Queries in Flink

§ Core concept is a “Dynamic Table”

  • Dynamic tables change over time

§ Queries on dynamic tables

  • produce new dynamic tables (which are updated based on input)
  • do not terminate

§ Stream ↔ Dynamic table conversions


SLIDE 13

Stream ↔ Dynamic Table Conversions

§ Append Conversions

  • Records are only inserted/appended

§ Upsert Conversions

  • Records are inserted/updated/deleted and have a (composite) unique key

§ Changelog Conversions

  • Records are inserted/updated/deleted

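The upsert and changelog conversions can be illustrated with a small sketch (plain Python, not the Flink API; the data and helper names are made up). An upsert stream is keyed, so a new record for an existing key overwrites the old row; a changelog stream carries explicit add (+) and retract (-) records:

```python
# Illustrative sketch (not Flink API): materializing a dynamic table from
# an upsert stream (keyed) vs. a changelog stream (explicit +/- records).

def apply_upserts(events):
    """Upsert conversion: records carry a unique key; a None value means delete."""
    table = {}
    for key, value in events:
        if value is None:
            table.pop(key, None)   # delete by key
        else:
            table[key] = value     # insert or update by key
    return table

def apply_changelog(events):
    """Changelog conversion: '+' adds a row, '-' retracts a previously added row."""
    table = []
    for op, row in events:
        if op == "+":
            table.append(row)
        else:
            table.remove(row)      # retraction of an earlier row
    return table

upserts = [("Mary", 1), ("Bob", 1), ("Mary", 2), ("Liz", None)]
changelog = [("+", ("Mary", 1)), ("+", ("Bob", 1)),
             ("-", ("Mary", 1)), ("+", ("Mary", 2))]

assert apply_upserts(upserts) == {"Mary": 2, "Bob": 1}
assert apply_changelog(changelog) == [("Bob", 1), ("Mary", 2)]
```

Both streams materialize to the same table; the upsert encoding is more compact because an update is one record instead of a retract/add pair.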

SLIDE 14

SQL Feature Set in Flink 1.5.0

§ SELECT FROM WHERE

§ GROUP BY / HAVING

  • Non-windowed, TUMBLE, HOP, SESSION windows

§ JOIN

  • Windowed INNER, LEFT / RIGHT / FULL OUTER JOIN
  • Non-windowed INNER JOIN

§ Scalar, aggregation, and table-valued UDFs

§ SQL CLI Client (beta)

§ [streaming only] OVER / WINDOW

  • UNBOUNDED / BOUNDED PRECEDING

§ [batch only] UNION / INTERSECT / EXCEPT / IN / ORDER BY


SLIDE 15

What can I build with this?

§ Data Pipelines

  • Transform, aggregate, and move events in real-time

§ Low-latency ETL

  • Convert and write streams to file systems, DBMSs, K-V stores, indexes, …
  • Convert newly appearing files into streams

§ Stream & Batch Analytics

  • Run analytical queries over bounded and unbounded data
  • Query and compare historic and real-time data

§ Data Preparation for Live Dashboards

  • Compute and update data to visualize in real-time


SLIDE 16

The New York Taxi Rides Data Set

§ The New York City Taxi & Limousine Commission provides a public data set about taxi rides in New York City

§ We can derive a streaming table from the data

§ Table: TaxiRides

rideId: BIGINT      // ID of the taxi ride
isStart: BOOLEAN    // flag for pick-up (true) or drop-off (false) event
lon: DOUBLE         // longitude of the pick-up or drop-off location
lat: DOUBLE         // latitude of the pick-up or drop-off location
rowtime: TIMESTAMP  // time of the pick-up or drop-off event


SLIDE 17

Identify popular pick-up / drop-off locations

SELECT
  cell, isStart,
  HOP_END(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE) AS hopEnd,
  COUNT(*) AS cnt
FROM
  (SELECT rowtime, isStart, toCellId(lon, lat) AS cell
   FROM TaxiRides
   WHERE isInNYC(lon, lat))
GROUP BY
  cell, isStart,
  HOP(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE)


§ Compute, every 5 minutes and for each location, the number of departing and arriving taxis of the last 15 minutes.
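The HOP window assignment behind this query can be sketched in plain Python (illustrative only; `hop_windows` is a made-up helper, not a Flink API). With a 5-minute slide and 15-minute size, each event belongs to three overlapping windows:

```python
# Sketch of HOP(rowtime, 5 min, 15 min) semantics: each event is assigned to
# every 15-minute window that slides by 5 minutes and contains its timestamp.

SLIDE, SIZE = 5, 15  # minutes

def hop_windows(ts_min):
    """Return the (start, end) of each hopping window containing ts_min."""
    first_start = (ts_min // SLIDE) * SLIDE - SIZE + SLIDE
    return [(s, s + SIZE) for s in range(first_start, ts_min + 1, SLIDE)
            if s <= ts_min < s + SIZE]

# An event at minute 12 falls into three overlapping 15-minute windows.
print(hop_windows(12))  # [(0, 15), (5, 20), (10, 25)]
```

Because every event contributes to SIZE/SLIDE = 3 windows, each count in the result covers 15 minutes of data but a fresh result row is emitted every 5 minutes.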
SLIDE 18

Average ride duration per pick-up location

SELECT
  pickUpCell,
  AVG(TIMESTAMPDIFF(MINUTE, s.rowtime, e.rowtime)) AS avgDuration
FROM
  (SELECT rideId, rowtime, toCellId(lon, lat) AS pickUpCell
   FROM TaxiRides
   WHERE isStart) s
JOIN
  (SELECT rideId, rowtime
   FROM TaxiRides
   WHERE NOT isStart) e
  ON s.rideId = e.rideId
  AND e.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR
GROUP BY pickUpCell


§ Join start ride and end ride events on rideId and compute average ride duration per pick-up location.
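The join logic can be mimicked in a small Python sketch (hypothetical data and cell names; not the Flink runtime). The time bound keeps state finite in streaming mode: only end events within one hour of a start event are matched:

```python
# Sketch of the time-windowed join above: match each end event to its start
# event by rideId, keep only pairs where the end occurs within one hour of
# the start, then average ride durations per pick-up cell.

from collections import defaultdict

# (rideId, pickUpCell, minute-of-day) for starts; (rideId, minute-of-day) for ends.
starts = [(1, "cellA", 0), (2, "cellA", 10), (3, "cellB", 20)]
ends = [(1, 30), (2, 40), (3, 95)]  # ride 3 ends 75 min after start: outside the bound

durations = defaultdict(list)
start_by_id = {rid: (cell, t) for rid, cell, t in starts}
for rid, end_t in ends:
    cell, start_t = start_by_id[rid]
    if start_t <= end_t <= start_t + 60:  # e.rowtime BETWEEN s.rowtime AND s.rowtime + 1 HOUR
        durations[cell].append(end_t - start_t)

avg = {cell: sum(d) / len(d) for cell, d in durations.items()}
print(avg)  # {'cellA': 30.0}
```

Ride 3 is dropped because its end event falls outside the one-hour interval, just as the streaming join would discard (and eventually stop waiting for) it.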

SLIDE 19

Building a Dashboard


[Diagram: Kafka source → Flink SQL query → Elasticsearch sink → dashboard]

SELECT
  cell, isStart,
  HOP_END(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE) AS hopEnd,
  COUNT(*) AS cnt
FROM
  (SELECT rowtime, isStart, toCellId(lon, lat) AS cell
   FROM TaxiRides
   WHERE isInNYC(lon, lat))
GROUP BY
  cell, isStart,
  HOP(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE)

SLIDE 20

Flink SQL in Production @ UBER

SLIDE 21

Uber's business is Real-Time


SLIDE 22

Challenges

Infrastructure

  • 100s of billions of messages / day
  • At-least-once processing
  • Exactly-once state processing
  • 99.99% SLA on availability
  • 99.99% SLA on latency

Productivity

  • Target audience: operations people, data scientists, engineers
  • Integrations: logging, backend services, storage systems, data management, monitoring

Operation

  • ~1000 streaming jobs
  • Multiple DCs

SLIDE 23

Stream processing @ Uber

§ Apache Samza (Since Jul. 2015)

  • Scalable
  • At-least-once message processing
  • Managed state
  • Fault tolerance

§ Apache Flink (Since May, 2017)

  • All of the above
  • Exactly-once stateful computation
  • Accurate
  • Unified stream & batch processing with SQL


SLIDE 24

Lifecycle of building a streaming job

SLIDE 25

Writing the job

Business logic · Input · Output · Testing · Debugging

  • Java/Scala
  • Streaming/batch
  • Duplicate code
SLIDE 26

Running the job

Resource estimation · Deployment · Monitoring & alerts · Logging · Maintenance

  • Manual process
  • Hard to scale beyond ~10 jobs
SLIDE 27

Job from idea to production takes days

SLIDE 28

How can we improve efficiency as a platform?

SLIDE 29

Flink SQL to the rescue

SELECT AVG(…) FROM eats_order WHERE …

SLIDE 30

Connectors

[Diagram: connectors such as HTTP and Pinot around the query SELECT AVG(…) FROM eats_order WHERE …]

SLIDE 31

UI & Backend services

§ To make it self-service

  • SQL composition & validation
  • Connectors management


SLIDE 32

UI & Backend services

§ To make it self-service

  • Job compilation and generation
  • Resource estimation


[Diagram: Analyze input (Kafka input rate, Hive metastore data) → Analyze query (SELECT * FROM ...) → Test deployment (YARN containers, CPU, heap memory)]

SLIDE 33

UI & Backend services

§ To make it self-service

  • Job deployment

Sandbox

  • Functional correctness
  • Play around with SQL

Staging

  • System-generated estimates
  • Production-like load

Production

  • Managed

Jobs are promoted from sandbox to staging to production.

SLIDE 34

UI & Backend services

§ To make it self-service

  • Job management
SLIDE 35

UI & Backend services

§ To support Uber scale

  • Monitoring and alert automation
  • Auto-scaling
  • Job recovery
  • DC failover


Watchdog

SLIDE 36

AthenaX

§ Uber's Self-service stream and batch processing platform

  • SQL
  • Connectors
  • UI & backend services


SLIDE 37

Business use cases

SLIDE 38

Real-time Machine Learning - UberEats ETD

SELECT restaurant_id, AVG(etd) AS avg_etd
FROM restaurant_stream
GROUP BY TUMBLE(proctime, INTERVAL '5' MINUTE), restaurant_id
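The TUMBLE semantics behind this query can be sketched in plain Python (illustrative data and field values only; not the Flink runtime). Events fall into fixed, non-overlapping 5-minute buckets, and AVG(etd) is computed per (window, restaurant_id):

```python
# Sketch of TUMBLE(proctime, 5 min): events are bucketed into fixed,
# non-overlapping 5-minute windows, and AVG(etd) is computed per
# (window, restaurant_id). The data below is made up.

from collections import defaultdict

events = [  # (restaurant_id, minute, etd)
    ("r1", 0, 10), ("r1", 3, 14), ("r1", 6, 20), ("r2", 4, 8),
]

sums = defaultdict(lambda: [0, 0])
for rid, minute, etd in events:
    window = minute // 5 * 5          # start of the 5-minute tumbling window
    acc = sums[(window, rid)]
    acc[0] += etd                     # running sum per (window, restaurant)
    acc[1] += 1                       # running count per (window, restaurant)

avg_etd = {k: s / n for k, (s, n) in sums.items()}
print(avg_etd)  # {(0, 'r1'): 12.0, (5, 'r1'): 20.0, (0, 'r2'): 8.0}
```

Unlike HOP, each event contributes to exactly one window, so one result row per restaurant is emitted every 5 minutes.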

SLIDE 39

Powering Restaurant Manager

Better Data -> Better Food -> Better Business = A Winning Recipe

Eats restaurant manager blog

SLIDE 40

AthenaX wins

§ SQL abstraction with Flink

  • Enables non-engineers to use stream processing

§ E2E self-service

§ Job from idea to production takes minutes/hours

§ Centralized place to track streaming dataflows

§ Minimal human intervention, and scales operationally to ~1000 jobs in production


SLIDE 41

AthenaX Open Source

§ Uber engineering blog

§ Open source repository


SLIDE 42

15% discount code: StrataFlink

SLIDE 43

Flink Forward SF 2018 Presenters


SLIDE 44

Thank you!

@fhueske @ApacheFlink

Available on O’Reilly Early Release!