Fully Fault Tolerant Real Time Data Pipeline with Docker and Mesos




slide-1
SLIDE 1

Fully Fault Tolerant Real Time Data Pipeline with Docker and Mesos

Rahul Kumar

Technical Lead

LinuxCon / ContainerCon - Berlin, Germany

slide-2
SLIDE 2

Agenda

  • Data Pipeline
  • Mesos + Docker
  • Reactive Data Pipeline
slide-3
SLIDE 3

Goal

Analyzing data always has great benefits and is one of the greatest challenges for an organization.

slide-4
SLIDE 4

Today’s businesses generate massive amounts of digital data,

slide-5
SLIDE 5

which is cumbersome to store, transport and analyze

slide-6
SLIDE 6

Building distributed systems and off-loading workloads to commodity clusters is one of the better approaches to solving the data problem.

slide-7
SLIDE 7
slide-8
SLIDE 8

Characteristics of a Distributed System

Resource Sharing

Openness

Concurrency

Scalability

Fault Tolerance

Transparency

slide-9
SLIDE 9

Collect ➔ Store ➔ Process ➔ Analyze

slide-10
SLIDE 10

Data Center

slide-11
SLIDE 11

Manually Scale Frameworks & Install services

slide-12
SLIDE 12

  • Complex
  • Very Limited
  • Inefficient
  • Low Utilization

slide-13
SLIDE 13

Static Partitioning

Blocker for a fault-tolerant data pipeline

slide-14
SLIDE 14

Failures make it even more complex to manage

slide-15
SLIDE 15

Apache Mesos

“Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.”

slide-16
SLIDE 16
slide-17
SLIDE 17

Mesos Features

  • Scalability: scales up to 10,000s of nodes
  • Fault-tolerant: replicated master and slaves using ZooKeeper
  • Docker support: support for Docker containers
  • Native containers: native isolation between tasks with Linux Containers
  • Scheduling: multi-resource scheduling (memory, CPU, disk, and ports)
  • API support: Java, Python and C++ APIs for developing new parallel applications
  • Monitoring: Web UI for viewing cluster state
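
As a rough illustration of multi-resource scheduling, the sketch below checks whether a resource offer covers a task on every dimension (CPU, memory, disk, ports). It is a toy model in plain Scala: the `Resources`, `Offer` and `OfferMatching` names are hypothetical and are not part of the Mesos API.

```scala
// Toy model only: these types are hypothetical and are not the Mesos API.
final case class Resources(cpus: Double, memMb: Int, diskMb: Int, ports: Int) {
  // True when every dimension of `need` fits inside this resource bundle.
  def covers(need: Resources): Boolean =
    cpus >= need.cpus && memMb >= need.memMb &&
      diskMb >= need.diskMb && ports >= need.ports
}

final case class Offer(agent: String, available: Resources)

object OfferMatching {
  // Return the first offer that can host the task, if any.
  def firstFit(offers: Seq[Offer], task: Resources): Option[Offer] =
    offers.find(_.available.covers(task))

  def main(args: Array[String]): Unit = {
    val offers = Seq(
      Offer("agent-1", Resources(cpus = 2.0, memMb = 1024, diskMb = 10000, ports = 2)),
      Offer("agent-2", Resources(cpus = 8.0, memMb = 16384, diskMb = 50000, ports = 10))
    )
    val task = Resources(cpus = 4.0, memMb = 8192, diskMb = 2000, ports = 1)
    println(firstFit(offers, task).map(_.agent)) // Some(agent-2)
  }
}
```

A task is placed only when all four dimensions fit, which is why agent-1 is skipped even though it has enough disk and ports.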
slide-18
SLIDE 18
slide-19
SLIDE 19

Resource Isolation

slide-20
SLIDE 20
slide-21
SLIDE 21
slide-22
SLIDE 22

Docker Containerizer

Mesos adds support for launching tasks that contain Docker images. Users can launch a Docker image either as a Task or as an Executor. To run the mesos-agent with the Docker Containerizer enabled, “docker” must be set as one of the containerizers options:

mesos-agent --containerizers=docker,mesos

slide-23
SLIDE 23
slide-24
SLIDE 24

Mesos Frameworks

  • Aurora: Developed at Twitter and later migrated to an Apache project. Aurora is a framework that keeps services running across a shared pool of machines and is responsible for keeping them running indefinitely.
  • Marathon: A container orchestration framework for Mesos. Marathon helps run other frameworks on Mesos, and also runs other application containers such as Jetty, JBoss Server, and Play Server.
  • Chronos: A fault-tolerant job scheduler for Mesos, developed at Airbnb as a replacement for cron.

slide-25
SLIDE 25

Resilient Distributed Datasets (RDDs)

A big collection of data which is:

  • Immutable
  • Distributed
  • Lazily evaluated
  • Type Inferred
  • Cacheable
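
A minimal sketch of these properties, assuming Spark is on the classpath and run in local mode. The object name `RddSketch` and the sum-of-squares computation are illustrative choices, not taken from the talk.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddSketch {
  // Sum the squares of 1..n on a local, in-process Spark instance.
  def sumOfSquares(n: Int): Int = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("RddSketch")
    val sc = new SparkContext(conf)
    try {
      val numbers = sc.parallelize(1 to n)   // distributed: split across partitions
      val squares = numbers.map(x => x * x)  // immutable and lazy: a new RDD, nothing runs yet
      squares.cache()                        // cacheable: keep in memory once computed
      squares.reduce(_ + _)                  // action: triggers the actual computation
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit =
    println(sumOfSquares(10)) // 385
}
```

No element type is written anywhere: `squares` is inferred as an `RDD[Int]`, and `numbers` is left unchanged by the `map`.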

Spark Stack

slide-26
SLIDE 26

Many big-data applications need to process large data streams in near-real time

  • Monitoring Systems
  • Alert Systems
  • Computing Systems

Why Spark Streaming?

slide-27
SLIDE 27

Taken from Apache Spark.

What is Spark Streaming?

slide-28
SLIDE 28

Framework for large-scale stream processing
➔ Created at UC Berkeley
➔ Scales to 100s of nodes
➔ Can achieve second-scale latencies
➔ Provides a simple batch-like API for implementing complex algorithms
➔ Can absorb live data streams from Kafka, Flume, ZeroMQ, Kinesis, etc.

What is Spark Streaming?

slide-29
SLIDE 29

Run a streaming computation as a series of very small, deterministic batch jobs

  • Chop up the live stream into batches of X seconds
  • Spark treats each batch of data as RDDs and processes them using RDD operations
  • Finally, the processed results of the RDD operations are returned in batches
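
The batching idea can be sketched without Spark at all. The following plain-Scala toy model groups timestamped events into fixed intervals and runs the same deterministic job on each batch. `Event`, `MicroBatching` and the event-count job are hypothetical names chosen for illustration; Spark Streaming's real DStream API is not used here.

```scala
// Toy model of micro-batching in plain Scala; no Spark API is used here.
final case class Event(timestampMs: Long, value: String)

object MicroBatching {
  // Assign each event to the batch interval containing its timestamp,
  // then return the batches in time order.
  def toBatches(events: Seq[Event], batchMs: Long): Seq[Seq[Event]] =
    events
      .groupBy(e => e.timestampMs / batchMs)
      .toList
      .sortBy { case (batchIndex, _) => batchIndex }
      .map { case (_, batch) => batch }

  // A deterministic per-batch job: count the events in each batch.
  def countPerBatch(events: Seq[Event], batchMs: Long): Seq[Int] =
    toBatches(events, batchMs).map(_.size)

  def main(args: Array[String]): Unit = {
    val events = Seq(Event(100, "a"), Event(900, "b"), Event(1200, "c"), Event(2500, "d"))
    println(countPerBatch(events, batchMs = 1000)) // List(2, 1, 1)
  }
}
```

Because each batch job is deterministic, re-running it on the same batch after a failure gives the same result, which is what makes this model recoverable.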

Spark Streaming

slide-30
SLIDE 30

Point of Failure

Simple Streaming Pipeline

slide-31
SLIDE 31
slide-32
SLIDE 32
  • To use Mesos from Spark, you need a Spark binary package available in a place accessible (http/s3/hdfs) by Mesos, and a Spark driver program configured to connect to Mesos.
  • Configuring the driver program to connect to Mesos:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

// The master URL lists the ZooKeeper quorum used by the HA Mesos masters.
val sconf = new SparkConf()
  .setMaster("mesos://zk://10.121.93.241:2181,10.181.2.12:2181,10.107.48.112:2181/mesos")
  .setAppName("MyStreamingApp")
  // Spark binary package that Mesos fetches in order to run executors.
  .set("spark.executor.uri", "hdfs://Sigmoid/executors/spark-1.3.0-bin-hadoop2.4.tgz")
  .set("spark.mesos.coarse", "true")   // coarse-grained mode
  .set("spark.cores.max", "30")
  .set("spark.executor.memory", "10g")
val sc = new SparkContext(sconf)
val ssc = new StreamingContext(sc, Seconds(1))   // 1-second batch interval
...

Spark Streaming over a HA Mesos Cluster

slide-33
SLIDE 33

Real-time stream processing systems must be operational 24/7, which requires them to recover from all kinds of failures in the system.

  • Spark and its RDD abstraction are designed to seamlessly handle failures of any worker node in the cluster.
  • In Spark Streaming, driver failures can be recovered by checkpointing application state.
  • Write-ahead logs (WAL) and acknowledgements can ensure zero data loss.
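
A minimal sketch of driver recovery with checkpointing and the receiver write-ahead log, assuming Spark Streaming is on the classpath and the master is supplied by spark-submit. The checkpoint path, the socket source on localhost:9999 and the object name `RecoverableApp` are placeholders, not values from the talk. Zero data loss also needs a reliable, acknowledging receiver, which the socket source is not.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RecoverableApp {
  // Hypothetical path; it should live on a fault-tolerant file system such as HDFS.
  val checkpointDir = "hdfs:///tmp/checkpoints/my-streaming-app"

  // Builds a fresh context; only called when no checkpoint exists yet.
  def createContext(): StreamingContext = {
    val conf = new SparkConf()
      .setAppName("MyStreamingApp")
      // Persist received data to the write-ahead log before processing it.
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")
    val ssc = new StreamingContext(conf, Seconds(1))
    ssc.checkpoint(checkpointDir) // enables metadata checkpointing; the WAL is stored here too

    // Placeholder source, used only to give the context something to run.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // After a driver restart, rebuild the context from the checkpoint if one exists.
    val ssc = StreamingContext.getOrCreate(checkpointDir, () => createContext())
    ssc.start()
    ssc.awaitTermination()
  }
}
```

All stream setup sits inside `createContext` so that a restarted driver restores the same computation from the checkpoint instead of defining a new one.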

Spark Streaming Fault-tolerance

slide-34
SLIDE 34

Simple Fault-tolerant Streaming Infra

slide-35
SLIDE 35
slide-36
SLIDE 36
  • Figure out the bottleneck: CPU, memory, IO, or network
  • If parsing is involved, use the parser that gives the highest performance.

  • Proper Data modeling
  • Compression, Serialization
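
One possible way to act on the serialization and compression point in Spark is sketched below. The talk does not name specific settings, so Kryo serialization and `spark.rdd.compress` are assumptions made for illustration, and `PageView` is a hypothetical record type.

```scala
import org.apache.spark.SparkConf

// Hypothetical record type used only for this illustration.
final case class PageView(userId: Long, url: String, timestampMs: Long)

object TunedConf {
  def build(): SparkConf =
    new SparkConf()
      .setAppName("MyStreamingApp")
      // Kryo is usually more compact and faster than Java serialization.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Registering classes lets Kryo avoid writing full class names per record.
      .registerKryoClasses(Array[Class[_]](classOf[PageView]))
      // Compress serialized RDD partitions, trading CPU for memory.
      .set("spark.rdd.compress", "true")
}
```

Whether these settings help depends on which bottleneck was found first: compression costs CPU, so it is worth it mainly when memory or network is the limit.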

Creating a scalable pipeline

slide-37
SLIDE 37

Thank You

@rahul_kumar_aws

LinuxCon / ContainerCon - Berlin, Germany