The Flink Big Data Analytics Platform Marton Balassi, Gyula Fora



SLIDE 1

The Flink Big Data Analytics Platform

Marton Balassi, Gyula Fora {mbalassi, gyfora}@apache.org

SLIDE 2

What is Apache Flink?

Open Source

  • Started in 2009 by the Berlin-based database research groups
  • In the Apache Incubator since mid 2014
  • 61 contributors as of the 0.7.0 release

Fast + Reliable

  • Fast, general purpose distributed data processing system
  • Combining batch and stream processing
  • Up to 100x faster than Hadoop

Ready to use

  • Programming APIs for Java and Scala
  • Tested on large clusters

11/18/2014 2 Flink - M. Balassi & Gy. Fora

SLIDE 3

What is Apache Flink?

[Diagram: Analytical Program; Flink Client & Optimizer; Flink Cluster with one Master and two Workers]

SLIDE 4

This Talk

  • Introduction to Flink
  • API overview
  • Distinguishing Flink
  • Flink from a user perspective
  • Performance
  • Flink roadmap and closing

SLIDE 5

Open source Big Data Landscape

  • Applications: Hive, Mahout, Cascading, Pig, Crunch, …
  • Data processing engines: MapReduce, Flink, Spark, Storm, Tez
  • App and resource management: Yarn, Mesos
  • Storage, streams: HDFS, Kafka, HBase

SLIDE 6

Flink stack

[Diagram: the layers of the Flink stack]

  • APIs: Java API (batch), Scala API (batch), Java API (streaming), Python API, Graph API („Spargel“), Apache MRQL
  • Common API
  • Batch: Flink Optimizer; Streaming: Streams Builder
  • Hybrid Batch/Streaming Runtime
  • Execution: Local Execution, Java Collections, Apache Tez, Cluster Manager (YARN, EC2, Native)
  • Storage and streams: HDFS, Files, S3, JDBC, Redis, RabbitMQ, Kafka, Azure

SLIDE 7

Flink APIs

SLIDE 8

Programming model

[Diagram: a program (DataSet/Stream A, Operator X, DataSet/Stream B, Operator Y, DataSet/Stream C) next to its parallel execution, where the partitions A (1) and A (2) pass through parallel instances of X and Y to produce B (1), B (2) and C (1), C (2)]

Data abstractions: Data Set, Data Stream

SLIDE 9

Flexible pipelines

[Diagram: an example dataflow connecting Source, Map, Reduce, Join, Iterate and Sink operators]

Map, FlatMap, MapPartition, Filter, Project, Reduce, ReduceGroup, Aggregate, Distinct, Join, CoGroup, Cross, Iterate, Iterate Delta, Iterate-Vertex-Centric, Windowing

SLIDE 10

WordCount, Java API

DataSet<String> text = env.readTextFile(input);

DataSet<Tuple2<String, Integer>> result = text
    // split every line into (word, 1) pairs
    .flatMap((str, out) -> {
        for (String token : str.split("\\W")) {
            out.collect(new Tuple2<>(token, 1));
        }
    })
    // group by the word (field 0) and sum the counts (field 1)
    .groupBy(0)
    .sum(1);

SLIDE 11

WordCount, Scala API

val text = env.readTextFile(input)
val words = text flatMap { line => line.split("\\W+") }
val counts = words groupBy { word => word } count()

SLIDE 12

WordCount, Streaming API

DataStream<String> text = env.readTextFile(input);

DataStream<Tuple2<String, Integer>> result = text
    // split every line into (word, 1) pairs
    .flatMap((str, out) -> {
        for (String token : str.split("\\W")) {
            out.collect(new Tuple2<>(token, 1));
        }
    })
    // group by the word (field 0) and sum the counts (field 1)
    .groupBy(0)
    .sum(1);

SLIDE 13

Is there anything beyond WordCount?

SLIDE 14

Beyond Key/Value Pairs

class Impression {
    public String url;
    public long count;
}

class Page {
    public String url;
    public String topic;
}

DataSet<Page> pages = ...;
DataSet<Impression> impressions = ...;

DataSet<Impression> aggregated = impressions
    .groupBy("url")
    .sum("count");

pages.join(impressions).where("url").equalTo("url")
    .print(); // outputs pairs of matching pages and impressions
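The grouping and join on the "url" field can be pictured with ordinary Java collections. The following is a minimal sketch, not Flink code: the class JoinSketch and its methods sumByUrl and join are hypothetical names introduced for illustration, and a simple in-memory hash join stands in for whatever strategy Flink would actually choose.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is not Flink code.
class JoinSketch {
    static class Page {
        final String url;
        final String topic;
        Page(String url, String topic) { this.url = url; this.topic = topic; }
    }

    static class Impression {
        final String url;
        final long count;
        Impression(String url, long count) { this.url = url; this.count = count; }
    }

    // Sums the counts of all impressions sharing a url,
    // mirroring groupBy("url").sum("count").
    static Map<String, Long> sumByUrl(List<Impression> impressions) {
        Map<String, Long> totals = new HashMap<>();
        for (Impression i : impressions) {
            totals.merge(i.url, i.count, Long::sum);
        }
        return totals;
    }

    // Pairs every page with every impression that has the same url (equi-join on "url").
    static List<String> join(List<Page> pages, List<Impression> impressions) {
        // Build side: index the impressions by url.
        Map<String, List<Impression>> byUrl = new HashMap<>();
        for (Impression i : impressions) {
            byUrl.computeIfAbsent(i.url, k -> new ArrayList<>()).add(i);
        }
        // Probe side: look up each page's url in the index.
        List<String> pairs = new ArrayList<>();
        for (Page p : pages) {
            List<Impression> matches = byUrl.get(p.url);
            if (matches == null) continue;
            for (Impression i : matches) {
                pairs.add(p.topic + ":" + i.url + ":" + i.count);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Page> pages = Arrays.asList(new Page("a.com", "news"), new Page("b.com", "sports"));
        List<Impression> impressions = Arrays.asList(
            new Impression("a.com", 2), new Impression("a.com", 3), new Impression("c.com", 7));
        System.out.println(sumByUrl(impressions));
        System.out.println(join(pages, impressions));
    }
}
```

Pages without a matching impression (and impressions without a matching page) produce no output pair, which is the inner-join behaviour the slide's comment describes.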

SLIDE 15

Preview: Logical Types

DataSet<Row> dates = env.readCsv(...).as("order_id", "date");
DataSet<Row> sessions = env.readCsv(...).as("id", "session");

DataSet<Row> joined = dates
    .join(sessions).where("order_id").equalTo("id");

joined.groupBy("date").reduceGroup(new SessionFilter());

class SessionFilter implements GroupReduceFunction<SessionType> {
    public void reduce(Iterable<SessionType> value, Collector out) { ... }
}

public class SessionType {
    public String order_id;
    public Date date;
    public String session;
}

SLIDE 16

Distinguishing Flink

SLIDE 17

Hybrid batch/streaming runtime

  • Batch and stream processing in the same system
  • No micro-batches, unified runtime
  • Competitive performance
  • Code reusable from batch processing to streaming, making development and testing a piece of cake

SLIDE 18

Flink Streaming

  • Most Data Set operators are also available for Data Streams
  • Temporal and streaming-specific operators
      – Window/mini-batch operators
      – Window join, cross etc.
  • Support for iterative stream processing
  • Connectors for different data sources
      – Kafka, Flume, RabbitMQ, Twitter etc.

SLIDE 19

Flink Streaming

// Build a new model on every second of new data
DataStream<Double[]> model = env
    .addSource(new TrainingDataSource())
    .window(1000)
    .reduceGroup(new ModelBuilder());

// Predict new data using the most up-to-date model
DataStream<Integer> prediction = env
    .addSource(new NewDataSource())
    .connect(model)
    .map(new Predictor());
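The window step can be pictured as cutting the stream into fixed one-second slices and reducing each slice on its own. The following is a minimal sketch under stated assumptions, not the Flink Streaming API: it assumes a tumbling, time-based window over non-negative millisecond timestamps, and a mean over the window stands in for the slide's ModelBuilder. WindowSketch and meanPerWindow are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of a tumbling time window; not the Flink Streaming API.
class WindowSketch {
    // Assigns each (timestampMillis, value) event to the window
    // [k * sizeMillis, (k + 1) * sizeMillis) and reduces every window to the mean
    // of its values. Assumes non-negative timestamps.
    static Map<Long, Double> meanPerWindow(long[] timestamps, double[] values, long sizeMillis) {
        Map<Long, List<Double>> windows = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            long windowStart = (timestamps[i] / sizeMillis) * sizeMillis;
            windows.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(values[i]);
        }
        // The "reduceGroup" step: one result per window.
        Map<Long, Double> result = new TreeMap<>();
        for (Map.Entry<Long, List<Double>> w : windows.entrySet()) {
            double sum = 0;
            for (double v : w.getValue()) sum += v;
            result.put(w.getKey(), sum / w.getValue().size());
        }
        return result;
    }

    public static void main(String[] args) {
        long[] ts = {0, 400, 999, 1000, 1500, 2100};
        double[] vs = {1.0, 2.0, 3.0, 10.0, 20.0, 5.0};
        // Keys are window start times in milliseconds.
        System.out.println(meanPerWindow(ts, vs, 1000));
    }
}
```

A real streaming system emits each window's result as soon as the window closes instead of collecting all input first; the batch-style loop here only shows how events are assigned to windows.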

SLIDE 20

Lambda architecture

Source: https://www.mapr.com/developercentral/lambda-architecture

SLIDE 21

Lambda architecture in Flink

SLIDE 22

Dependability

[Diagram: the JVM Heap is divided into the Unmanaged Heap (User Code), the Flink Managed Heap (Hashing/Sorting/Caching) and Network Buffers (Shuffles/Broadcasts); the next version unifies network buffers and managed heap]

  • Flink manages its own memory
  • Caching and data processing happen in a dedicated memory fraction
  • System never breaks the JVM heap, gracefully spills

SLIDE 23

Operating on Serialized Data

  • [Hadoop MapReduce] serializes data every time
      → Highly robust, never gives up on you
  • [Spark] works on objects, RDDs may be stored serialized
      → Serialization considered slow, only when needed
  • [Flink] makes serialization really cheap: partial deserialization, operates on serialized form
      → Efficient and robust!

SLIDE 24

Operating on Serialized Data

Microbenchmark

  • Sorting 1GB worth of (long, double) tuples
  • 67,108,864 elements
  • Simple quicksort

SLIDE 25

Memory Management

public class WC {
    public String word;
    public int count;
}

[Diagram: WC objects are mapped onto a Pool of Memory Pages, some of which are empty]

  • Works on pages of bytes, maps objects transparently
  • Full control over memory, out-of-core enabled
  • Algorithms work on binary representation
  • Address individual fields (no need to deserialize the whole object)
  • Move memory between operations
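
Working on the binary representation can be sketched in plain Java with a ByteBuffer standing in for a memory page. This is only an illustration under stated assumptions, not Flink's memory manager or sorter: records are fixed-size (long, double) pairs as in the microbenchmark, and BinarySortSketch with its methods is a hypothetical name. The sort compares records by reading just the key field at its byte offset and moves records by swapping raw bytes, so no record object is ever created.

```java
import java.nio.ByteBuffer;

// Plain-Java illustration; Flink's actual memory manager and sorters are more involved.
class BinarySortSketch {
    static final int RECORD_SIZE = Long.BYTES + Double.BYTES; // 16 bytes per (long, double) record

    // Writes the records into one contiguous page of bytes.
    static ByteBuffer serialize(long[] keys, double[] values) {
        ByteBuffer page = ByteBuffer.allocate(keys.length * RECORD_SIZE);
        for (int i = 0; i < keys.length; i++) {
            page.putLong(i * RECORD_SIZE, keys[i]);
            page.putDouble(i * RECORD_SIZE + Long.BYTES, values[i]);
        }
        return page;
    }

    // Reads only the key field of record i; the value bytes are not decoded.
    static long keyAt(ByteBuffer page, int i) {
        return page.getLong(i * RECORD_SIZE);
    }

    // Swaps two records by exchanging their raw bytes.
    static void swap(ByteBuffer page, int i, int j) {
        for (int b = 0; b < RECORD_SIZE; b++) {
            byte tmp = page.get(i * RECORD_SIZE + b);
            page.put(i * RECORD_SIZE + b, page.get(j * RECORD_SIZE + b));
            page.put(j * RECORD_SIZE + b, tmp);
        }
    }

    // Simple in-place quicksort over the serialized records, ordered by key.
    static void sort(ByteBuffer page, int lo, int hi) {
        if (lo >= hi) return;
        long pivot = keyAt(page, lo + (hi - lo) / 2);
        int i = lo, j = hi;
        while (i <= j) {
            while (keyAt(page, i) < pivot) i++;
            while (keyAt(page, j) > pivot) j--;
            if (i <= j) { swap(page, i, j); i++; j--; }
        }
        sort(page, lo, j);
        sort(page, i, hi);
    }

    public static void main(String[] args) {
        ByteBuffer page = serialize(new long[]{42, 7, 19, 3}, new double[]{4.2, 0.7, 1.9, 0.3});
        sort(page, 0, 3);
        for (int i = 0; i < 4; i++) {
            System.out.println(keyAt(page, i) + " -> " + page.getDouble(i * RECORD_SIZE + Long.BYTES));
        }
    }
}
```

Because the data lives in a byte page of known size rather than as individual heap objects, the same layout could be written to disk page by page, which is the idea behind the out-of-core bullet.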

SLIDE 26

Flink from a user perspective

SLIDE 27

Flink programs run everywhere

Cluster (Batch) Cluster (Streaming) Local Debugging

Flink Runtime or Apache Tez; As Java Collection Programs

Embedded (e.g., Web Container)

SLIDE 28

Migrate Easily

Flink out-of-the-box supports

  • Hadoop data types (writables)
  • Hadoop Input/Output Formats
  • Hadoop functions and object model

[Diagram: Hadoop Input, Map, Reduce and Output stages connected through DataSets, alongside Flink operators such as Reduce, Join and Map]

SLIDE 29

Little tuning or configuration required

  • Requires no memory thresholds to configure
      – Flink manages its own memory
  • Requires no complicated network configs
      – Pipelining engine requires much less memory for data exchange
  • Requires no serializers to be configured
      – Flink handles its own type extraction and data representation
  • Programs can be adjusted to data automatically
      – Flink’s optimizer can choose execution strategies automatically

SLIDE 30

Understanding Programs

Visualizes the operations and the data movement of programs

Analyze after execution

Screenshot from Flink’s plan visualizer

SLIDE 31

Understanding Programs

Analyze after execution (times, stragglers, …)

SLIDE 32

Iterations in other systems

[Diagram, shown for two other systems: a Client drives a sequence of Steps]

Loop outside the system

SLIDE 33

Iterations in Flink

Streaming dataflow with feedback

[Diagram: map, join, reduce and join operators connected in a loop]

System is iteration-aware, performs automatic optimization

SLIDE 34

Automatic Optimization for Iterative Programs

  • Caching loop-invariant data
  • Pushing work „out of the loop“
  • Maintain state as index

SLIDE 35

Performance

SLIDE 36

Distributed Grep

[Diagram: three filters (Term 1, Term 2, Term 3) read from HDFS and each write their Matches]

  • 1 TB of data (log files)
  • 24 machines with
      – 32 GB Memory
      – Regular HDDs
  • HDFS 2.4.0
  • Flink 0.7-incubating-SNAPSHOT
  • Spark 1.2.0-SNAPSHOT

Flink up to 2.5x faster

SLIDE 37

Distributed Grep: Flink Execution

SLIDE 38

Distributed Grep: Spark Execution

Spark executes the job in 3 stages:

  • Stage 1: HDFS → Filter Term 1 → Matches
  • Stage 2: HDFS → Filter Term 2 → Matches
  • Stage 3: HDFS → Filter Term 3 → Matches
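
The difference between reading the input once per filter and sharing one scan across all filters can be sketched in plain Java. This is an illustration only, not Flink or Spark code, and it assumes that the single-source diagram of the distributed grep job corresponds to a plan that reads the input once. GrepSketch, scanPerTerm and sharedScan are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of two execution plans; not Flink or Spark code.
class GrepSketch {
    // One scan per search term: the input is read terms.length times.
    static List<List<String>> scanPerTerm(List<String> lines, String[] terms) {
        List<List<String>> matches = new ArrayList<List<String>>();
        for (String term : terms) {
            List<String> hits = new ArrayList<String>();
            for (String line : lines) {            // full pass over the input
                if (line.contains(term)) hits.add(line);
            }
            matches.add(hits);
        }
        return matches;
    }

    // One shared scan: every line is read once and offered to all filters.
    static List<List<String>> sharedScan(List<String> lines, String[] terms) {
        List<List<String>> matches = new ArrayList<List<String>>();
        for (int t = 0; t < terms.length; t++) matches.add(new ArrayList<String>());
        for (String line : lines) {                // single pass over the input
            for (int t = 0; t < terms.length; t++) {
                if (line.contains(terms[t])) matches.get(t).add(line);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("error disk", "warn net", "error net", "info");
        String[] terms = {"error", "net", "fatal"};
        System.out.println(scanPerTerm(lines, terms));
        System.out.println(sharedScan(lines, terms));
    }
}
```

Both methods return the same matches; they differ only in how often the input is traversed, which matters when the input is 1 TB on regular HDDs.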

SLIDE 39

Spark in-memory pinning

[Diagram: HDFS feeds an In-Memory RDD, which feeds Filter Term 1, Filter Term 2 and Filter Term 3, each writing its Matches]

Cache in-memory to avoid reading the data for each filter

JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> file = sc.textFile(inFile).persist(StorageLevel.MEMORY_AND_DISK());
for (int p = 0; p < patterns.length; p++) {
    final String pattern = patterns[p];
    JavaRDD<String> res = file.filter(new Function<String, Boolean>() { … });
}

SLIDE 40

Spark in-memory pinning

[Chart: runtimes of 37 minutes and 9 minutes, annotated "RDD is 100% in-memory" and "Spark starts spilling RDD to disk"]

SLIDE 41

PageRank

  • Dataset:
      – Twitter Follower Graph
      – 41,652,230 vertices (users)
      – 1,468,365,182 edges (followings)
      – 12 GB input data

SLIDE 42

PageRank results

SLIDE 43

Why is there a difference?

  • Let's have a look at the iteration times:

Flink average: 48 sec., Spark average: 99 sec.

SLIDE 44

PageRank on Flink with Delta Iterations

  • The algorithm runs 60 iterations until convergence (runtime includes convergence check)

[Chart: runtimes of 100 minutes and 8.5 minutes]

SLIDE 45

Again, explaining the difference

On average, a (delta) iteration runs for 6.7 seconds

Flink (Bulk): 48 sec. Spark (Bulk): 99 sec.
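
Why a delta iteration can be so much cheaper than a bulk iteration is easier to see on a small example. The sketch below is plain Java, not Flink code, and it deliberately swaps PageRank for a shorter algorithm: minimum-label propagation for connected components. The bulk variant recomputes every vertex in every round, while the delta variant keeps a workset of vertices whose label just changed and touches only those. DeltaIterationSketch, bulk and delta are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Plain-Java illustration of bulk vs. delta (workset) iteration; not Flink code.
class DeltaIterationSketch {
    // Bulk iteration: every round recomputes the label of every vertex.
    // Returns the number of vertex evaluations until nothing changes.
    static long bulk(int[][] neighbors, int[] labels) {
        long work = 0;
        boolean changed = true;
        while (changed) {
            changed = false;
            int[] next = labels.clone();
            for (int v = 0; v < neighbors.length; v++) {
                work++;
                for (int n : neighbors[v]) {
                    if (labels[n] < next[v]) { next[v] = labels[n]; changed = true; }
                }
            }
            System.arraycopy(next, 0, labels, 0, labels.length);
        }
        return work;
    }

    // Delta iteration: only vertices whose label just changed (the workset)
    // push their label to their neighbors; converged parts are skipped.
    static long delta(int[][] neighbors, int[] labels) {
        long work = 0;
        Deque<Integer> workset = new ArrayDeque<>();
        for (int v = 0; v < neighbors.length; v++) workset.add(v);
        while (!workset.isEmpty()) {
            int v = workset.poll();
            work++;
            for (int n : neighbors[v]) {
                if (labels[v] < labels[n]) { labels[n] = labels[v]; workset.add(n); }
            }
        }
        return work;
    }

    public static void main(String[] args) {
        int[][] chain = {{1}, {0, 2}, {1, 3}, {2, 4}, {3}}; // undirected path 0-1-2-3-4
        int[] a = {0, 1, 2, 3, 4};
        int[] b = {0, 1, 2, 3, 4};
        System.out.println("bulk work:  " + bulk(chain, a) + " " + Arrays.toString(a));
        System.out.println("delta work: " + delta(chain, b) + " " + Arrays.toString(b));
    }
}
```

Both variants end with the same labels. On this small input the workset version evaluates fewer vertices, but how much is saved in general depends on the graph and on how quickly most of it converges.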

SLIDE 46

Streaming performance

SLIDE 47

Flink Roadmap

  • Flink has a major release every 3 months, with >=1 bug-fixing releases in between

  • Finer-grained fault tolerance
  • Logical (SQL-like) field addressing
  • Python API
  • Flink Streaming, Lambda architecture support
  • Flink on Tez
  • ML on Flink (e.g., Mahout DSL)
  • Graph DSL on Flink
  • … and more

SLIDE 48

SLIDE 49

Marton Balassi, Gyula Fora {mbalassi, gyfora}@apache.org

http://flink.incubator.apache.org github.com/apache/incubator-flink @ApacheFlink