Data At Rest … Data In Motion! A Lambda Architecture Overview



SLIDE 1

Data At Rest … Data In Motion!

A Lambda Architecture Overview

SLIDE 2

When Things Go Wrong

http://xkcd.com/327/

Fault Tolerance!!!!
SLIDE 3

Fault Tolerance

Developer Software Hardware

SLIDE 4

Data Collection

Three Types Of Data Streams

  • Structured (Databases ...)
  • Semi-Structured (JSON, XML, XAML ...)
  • Unstructured (Blogs, E-Mails, Log Files ...)

SLIDE 5

Lambda Architecture To The Rescue!!
SLIDE 6
SLIDE 7
SLIDE 8

Lambda Architecture — Requirements

  • Fault-tolerant against both hardware failures and human errors
  • Supports a variety of use cases, including low-latency querying as well as updates
  • Linear scale-out capabilities
  • Extensible, so that the system is manageable and can accommodate newer features easily

SLIDE 9

Lambda Architecture

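[Diagram] The NEW DATA STREAM feeds two paths:

  • Batch path: IMMUTABLE MASTER DATA → PRECOMPUTE VIEWS (BATCH RECOMPUTE) → View 1, View 2, ... View N
  • Real-time path: PROCESS STREAM → INCREMENT VIEWS (REAL-TIME INCREMENT) → View 1, View 2, ... View N

QUERY: MERGE of the batch and real-time views.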

SLIDE 10

Lambda Architecture

[Diagram] Same diagram as slide 9 (NEW DATA STREAM, IMMUTABLE MASTER DATA, PROCESS STREAM, PRECOMPUTE VIEWS, INCREMENT VIEWS, QUERY, MERGE), now annotated with the three layers: BATCH LAYER, SERVING LAYER, SPEED LAYER.

SLIDE 11

Lambda Architecture - Layers

Batch Layer

  • Manages the master data set, an immutable, append-only set of raw data.
  • Precomputes arbitrary query functions, called batch views.
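As a rough illustration of the two batch-layer duties, here is a minimal in-memory sketch. The names `Event`, `MasterDataset` and `batch_view_events_per_airport` are hypothetical, the sample rows are illustrative values rather than the deck's data, and a real deployment would keep the master data in a distributed store rather than a Python list.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a stored event cannot be modified afterwards
class Event:
    timestamp: str
    airport: str
    flight: str
    action: str


class MasterDataset:
    """Append-only store: events can be added, never updated or deleted."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def scan(self):
        # Return an immutable snapshot so callers cannot alter the raw data.
        return tuple(self._events)


def batch_view_events_per_airport(master):
    """Recompute the view from scratch over ALL raw data (no incremental state)."""
    return Counter(event.airport for event in master.scan())


master = MasterDataset()
# Illustrative sample rows (assumed values).
master.append(Event("2015-01-01T10:00:00", "DUB", "EL123", "take-off"))
master.append(Event("2015-01-01T10:05:00", "HEL", "SA45", "take-off"))
master.append(Event("2015-01-01T10:20:00", "DUB", "EL124", "landing"))

view = batch_view_events_per_airport(master)
print(dict(view))
```

Because the view is a pure function of the raw data, a buggy view can simply be thrown away and recomputed, which is where the tolerance to human error comes from.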

SLIDE 12

Lambda Architecture - Layers

Serving Layer

  • Indexes batch views so that they can be queried ad hoc with low latency.
  • Merges and reconciles batch and real-time views.
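The merge step can be sketched as a query function that reads both views. This is only a sketch under two assumptions: the views hold additive counts, and the real-time view covers only data the batch view has not absorbed yet. `query_count` is a hypothetical name; the `AMS` and `LHR` batch counts come from the airport-load table on slide 17, while the real-time values are made up.

```python
def query_count(key, batch_view, realtime_view):
    """Answer a query by merging the batch view with the real-time view.

    Plain dicts stand in for the indexed view stores; lookups by key are
    what keeps the query latency low.
    """
    return batch_view.get(key, 0) + realtime_view.get(key, 0)


batch_view = {"AMS": 44, "LHR": 69}    # precomputed by the batch layer
realtime_view = {"LHR": 2, "DUB": 1}   # hypothetical recent increments

print(query_count("LHR", batch_view, realtime_view))
```

Summation only works for additive views; other aggregates (distinct counts, for example) need a merge function suited to them.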

SLIDE 13

Lambda Architecture - Layers

Speed Layer

  • Accommodates all requests that are subject to low-latency requirements.
  • Uses fast, incremental algorithms and deals with recent data only.
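In contrast to a full batch recompute, an incremental algorithm touches only the new event. A minimal sketch, assuming a per-airport event count as the view; `RealtimeView` and `on_event` are hypothetical names, not part of any framework:

```python
from collections import Counter


class RealtimeView:
    """Keeps a running count per airport, updated one event at a time."""

    def __init__(self):
        self.counts = Counter()

    def on_event(self, airport):
        # Constant work per event: no rescan of historical data.
        self.counts[airport] += 1


view = RealtimeView()
for airport in ["LHR", "DUB", "LHR"]:  # illustrative stream of recent events
    view.on_event(airport)

print(dict(view.counts))
```

The price of this speed is mutable state: an error in `on_event` corrupts the view, which is why this layer is only trusted for recent data until the batch layer catches up.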

SLIDE 14

Lambda Architecture - Reconciliation

[Timeline] The Time axis is split into two regions: "Data absorbed into Batch Views" and "Not yet absorbed".

SLIDE 15

Lambda Architecture - Reconciliation

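[Timeline] The Time axis is split into two regions: "Data absorbed into Batch Views" and "Not yet absorbed". The not-yet-absorbed region is just a few hours of data, ending at "Now".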
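One way to picture this reconciliation: the real-time view holds only events newer than the point the batch views have absorbed, and drops older ones after each batch run. The sketch below assumes integer timestamps and a hypothetical `ExpiringRealtimeView` class; the deck does not say how expiry is actually implemented.

```python
from collections import Counter


class ExpiringRealtimeView:
    """Holds only events that the batch views have not absorbed yet."""

    def __init__(self):
        self._recent = []  # list of (timestamp, airport) pairs

    def on_event(self, timestamp, airport):
        self._recent.append((timestamp, airport))

    def expire(self, absorbed_until):
        # Drop everything the batch views now cover (timestamp <= absorbed_until).
        self._recent = [(t, a) for t, a in self._recent if t > absorbed_until]

    def counts(self):
        return Counter(airport for _, airport in self._recent)


realtime = ExpiringRealtimeView()
realtime.on_event(100, "LHR")  # timestamps are illustrative integers
realtime.on_event(200, "LHR")
realtime.on_event(300, "AMS")

# A batch run finishes, having absorbed all data up to timestamp 200.
realtime.expire(absorbed_until=200)
print(dict(realtime.counts()))
```

After expiry the real-time view again covers only the recent slice, so summing it with the batch view does not count any event twice.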

SLIDE 16

Lambda Architecture - Immutable Data + Views

Timestamp           Airport  Flight  Action
2015-01-01T10:00:0  DUB      EL123   take-off
2015-01-01T10:05:0  HEL      SA45    take-off
2015-01-01T10:07:0  AMS      BA99    take-off
2015-01-            LHR      LH17    landing

Immutable Master Dataset

SLIDE 17

Lambda Architecture - Immutable Data + Views

Timestamp           Airport  Flight  Action
2015-01-01T10:00:0  DUB      EL123   take-off
2015-01-01T10:05:0  HEL      SA45    take-off
2015-01-01T10:07:0  AMS      BA99    take-off
2015-01-01T10:09:0  LHR      LH17    landing
2015-01-01T10:10:0  CDG      AF03    landing
2015-01-01T10:11:0  FCO      AZ501   take-off

Immutable Master Dataset

Views derived from the master dataset, each by its own Map Reduce job:

Map Reduce → airborne: 2307

Map Reduce → airport load:

Airport  Planes
AMS      44
LHR      69

Map Reduce → airborne per airline:

Airline
SAS
BA
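To make the Map Reduce step concrete, here is a small sketch over the six rows of the table above (timestamps omitted). It assumes that the "airborne" view counts take-offs minus landings, which the slide does not state; `map_airborne` and `reduce_sum` are hypothetical names, and the result is only the net change for these six rows, not the 2307 shown for the full dataset.

```python
from functools import reduce

# (airport, flight, action) rows from the master dataset table.
events = [
    ("DUB", "EL123", "take-off"),
    ("HEL", "SA45", "take-off"),
    ("AMS", "BA99", "take-off"),
    ("LHR", "LH17", "landing"),
    ("CDG", "AF03", "landing"),
    ("FCO", "AZ501", "take-off"),
]


def map_airborne(event):
    """Map phase: +1 for a take-off, -1 for a landing (assumed definition)."""
    _airport, _flight, action = event
    return 1 if action == "take-off" else -1


def reduce_sum(left, right):
    """Reduce phase: associative sum, so partial results can be combined in any grouping."""
    return left + right


airborne_delta = reduce(reduce_sum, map(map_airborne, events), 0)
print(airborne_delta)
```

Because the reduce function is associative, the same job can be split across many machines and the partial sums combined afterwards, which is what gives the batch layer its linear scale-out.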

SLIDE 18

Lambda Architecture - Implementation

[Diagram] Same diagram as slide 9: NEW DATA STREAM, IMMUTABLE MASTER DATA, PROCESS STREAM, PRECOMPUTE VIEWS (BATCH RECOMPUTE), INCREMENT VIEWS (REAL-TIME INCREMENT), View 1, View 2, ... View N on each path, and QUERY (MERGE).

SLIDE 19

Lambda Architecture - Implementation

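[Diagram] The architecture diagram with concrete technologies in place of the generic boxes.

Technologies shown:

  • Hadoop HDFS
  • Apache Kafka
  • Apache Hive
  • Apache Spark
  • Spark SQL
  • R
  • Presto
  • HBase (three instances)
  • Storm Bolt (three instances)

Remaining labels: NEW DATA STREAM, BATCH RECOMPUTE, REAL-TIME INCREMENT, MERGE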