

SLIDE 1

Flying Faster with Heron

KARTHIK RAMASAMY

@KARTHIKZ #TwitterHeron

SLIDE 2

TALK OUTLINE

I. Overview
II. Motivation
III. Heron
IV. Operational Experiences
V. Heron Performance

SLIDE 3

OVERVIEW
SLIDE 4

TWITTER IS REAL TIME

  • Real time search of tweets
  • Emerging breakout trends on Twitter (in the form of #hashtags)
  • Real time sports conversations related to a topic (a recent goal or touchdown)
  • Real time product recommendations based on your behavior & profile

ANALYZING BILLIONS OF EVENTS IN REAL TIME IS A CHALLENGE!

SLIDE 5

TWITTER STORM

Streaming platform for analyzing real-time data as it arrives, so you can react to data as it happens.

  • GUARANTEED MESSAGE PROCESSING
  • HORIZONTAL SCALABILITY
  • ROBUST FAULT TOLERANCE
  • CONCISE CODE - FOCUS ON LOGIC

SLIDE 6

STORM TERMINOLOGY

TOPOLOGY: directed acyclic graph; vertices = computation, edges = streams of data tuples
SPOUTS: sources of data tuples for the topology (examples: Event Bus, Kafka, Kestrel, MySQL, Postgres)
BOLTS: process incoming tuples and emit outgoing tuples (examples: filtering, aggregation, join, arbitrary functions)
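To make the terminology concrete, here is a minimal sketch of a spout and a bolt in the Storm Java API. TweetSpout and ParseTweetBolt are illustrative names, not code from the talk, and the data source is a placeholder:

    import java.util.Map;
    import backtype.storm.spout.SpoutOutputCollector;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.topology.base.BaseRichSpout;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    // A spout is a source of tuples, e.g. a live stream of tweets.
    class TweetSpout extends BaseRichSpout {
      private SpoutOutputCollector collector;

      public void open(Map conf, TopologyContext ctx, SpoutOutputCollector collector) {
        this.collector = collector;                   // in practice: connect to Kafka/Event Bus here
      }

      public void nextTuple() {
        collector.emit(new Values(readNextTweet()));  // one output tuple per call
      }

      public void declareOutputFields(OutputFieldsDeclarer d) {
        d.declare(new Fields("tweet"));
      }

      private String readNextTweet() { return "hello heron"; }  // placeholder source
    }

    // A bolt consumes tuples and emits new ones, e.g. splitting tweets into words.
    class ParseTweetBolt extends BaseRichBolt {
      private OutputCollector collector;

      public void prepare(Map conf, TopologyContext ctx, OutputCollector collector) {
        this.collector = collector;
      }

      public void execute(Tuple input) {
        for (String word : input.getStringByField("tweet").split("\\s+")) {
          collector.emit(new Values(word));           // one tuple per parsed word
        }
        collector.ack(input);                         // acknowledge for guaranteed processing
      }

      public void declareOutputFields(OutputFieldsDeclarer d) {
        d.declare(new Fields("word"));
      }
    }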

SLIDE 7

STORM TOPOLOGY

[Diagram: an example topology DAG with two spouts (SPOUT 1, SPOUT 2) feeding five bolts (BOLT 1 through BOLT 5)]

SLIDE 8

WORD COUNT TOPOLOGY

LOGICAL PLAN: live stream of Tweets → TWEET SPOUT → PARSE TWEET BOLT → WORD COUNT BOLT

SLIDE 9

WORD COUNT TOPOLOGY

[Diagram: TWEET SPOUT TASKS → PARSE TWEET BOLT TASKS → WORD COUNT BOLT TASKS, each component running as multiple parallel tasks]

When a parse tweet bolt task emits a tuple, which word count bolt task should it send it to?

SLIDE 10

STREAM GROUPINGS

  • SHUFFLE GROUPING: random distribution of tuples
  • FIELDS GROUPING: group tuples by a field or multiple fields
  • ALL GROUPING: replicates tuples to all tasks
  • GLOBAL GROUPING: sends the entire stream to one task
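In the Storm Java API, a grouping is chosen when a bolt subscribes to a stream. A sketch of all four (the bolt classes beyond the word count example are hypothetical):

    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.tuple.Fields;

    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("tweet-spout", new TweetSpout(), 4);

    // Each setBolt(...) call picks one grouping for the subscribed stream:
    builder.setBolt("parse", new ParseTweetBolt(), 4)
           .shuffleGrouping("tweet-spout");                 // random distribution
    builder.setBolt("count", new WordCountBolt(), 4)
           .fieldsGrouping("parse", new Fields("word"));    // partition by field
    builder.setBolt("audit", new AuditBolt(), 2)
           .allGrouping("parse");                           // replicate to all tasks
    builder.setBolt("report", new ReportBolt(), 1)
           .globalGrouping("count");                        // entire stream to one task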

SLIDE 11

WORD COUNT TOPOLOGY

[Diagram: TWEET SPOUT TASKS → (SHUFFLE GROUPING) → PARSE TWEET BOLT TASKS → (FIELDS GROUPING) → WORD COUNT BOLT TASKS]
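Putting slides 8-11 together, a counting bolt plus the wiring for this plan might look as follows; this is a sketch assuming the TweetSpout and ParseTweetBolt classes from slide 6, not the topology Twitter actually ran:

    import java.util.HashMap;
    import java.util.Map;
    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    // Keeps a running count per word; fields grouping guarantees every
    // occurrence of a word reaches the same task, so local counts are correct.
    class WordCountBolt extends BaseRichBolt {
      private final Map<String, Long> counts = new HashMap<>();
      private OutputCollector collector;

      public void prepare(Map conf, TopologyContext ctx, OutputCollector collector) {
        this.collector = collector;
      }

      public void execute(Tuple input) {
        String word = input.getStringByField("word");
        long n = counts.merge(word, 1L, Long::sum);
        collector.emit(new Values(word, n));
        collector.ack(input);
      }

      public void declareOutputFields(OutputFieldsDeclarer d) {
        d.declare(new Fields("word", "count"));
      }
    }

    public class WordCountTopology {
      public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("tweet-spout", new TweetSpout(), 4);
        builder.setBolt("parse", new ParseTweetBolt(), 4)
               .shuffleGrouping("tweet-spout");               // slide 11: shuffle grouping
        builder.setBolt("count", new WordCountBolt(), 4)
               .fieldsGrouping("parse", new Fields("word"));  // slide 11: fields grouping
        StormSubmitter.submitTopology("word-count", new Config(), builder.createTopology());
      }
    }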

SLIDE 12

MOTIVATION

SLIDE 13

STORM ARCHITECTURE

[Diagram: TOPOLOGY SUBMISSION goes to Nimbus on the MASTER NODE; ASSIGNMENT MAPS are kept in a ZK CLUSTER; each SLAVE NODE runs a SUPERVISOR hosting workers W1-W4]

  • Multiple functionality: scheduling/monitoring
  • Single point of failure
  • Storage contention
  • No resource reservation and isolation

SLIDE 14

STORM WORKER

[Diagram: a JVM PROCESS hosting EXECUTOR1 (TASK1, TASK2, TASK3) and EXECUTOR2 (TASK4, TASK5)]

  • Complex hierarchy
  • Difficult to tune
  • Hard to debug
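Each of the three layers has its own knob in the Storm API, which is part of why tuning is hard; a sketch with arbitrary numbers:

    import backtype.storm.Config;
    import backtype.storm.topology.TopologyBuilder;

    Config conf = new Config();
    conf.setNumWorkers(2);                              // workers: JVM processes

    TopologyBuilder builder = new TopologyBuilder();
    builder.setBolt("parse", new ParseTweetBolt(), 4)   // 4 executors (threads)
           .setNumTasks(8)                              // 8 tasks packed onto those threads
           .shuffleGrouping("tweet-spout");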

SLIDE 15

DATA FLOW IN STORM WORKERS

[Diagram: inside a worker, a Global Receive Thread drains the kernel's TCP Receive Buffer into per-executor In Queues; User Logic Threads consume them and emit into Out Queues; Send Threads and a Global Send Thread move tuples through the Outgoing Message Buffer into the TCP Send Buffer]

  • Queue contention
  • Multiple languages

SLIDE 16

OVERLOADED ZOOKEEPER

[Diagram: Storm supervisors (S1-S3) and workers (W) all write state to the ZK cluster; even a scaled-up ZooKeeper became the bottleneck]

Handled up to 1,200 workers per cluster

SLIDE 17

OVERLOADED ZOOKEEPER

Analyzing ZooKeeper traffic:
  • 67%: KAFKA SPOUT - offset/partition is written every 2 secs
  • 33%: STORM RUNTIME - workers write heartbeats every 3 secs

SLIDE 18

OVERLOADED ZOOKEEPER

[Diagram: heartbeat daemons (H) aggregate worker heartbeats into a KV store, so workers (W) no longer write them to ZK directly]

5,000 workers per cluster

SLIDE 19

STORM - DEPLOYMENT

[Diagram: a Storm cluster with a shared pool of machines]

SLIDE 20

STORM - DEPLOYMENT

[Diagram: the shared pool plus an isolated pool for Joe's topology]

SLIDE 21

STORM - DEPLOYMENT

[Diagram: isolated pools for Joe's and Jane's topologies alongside the shared pool]

SLIDE 22

STORM - DEPLOYMENT

[Diagram: isolated pools for Joe's, Jane's and Dave's topologies alongside the shared pool]

SLIDE 23

STORM ISSUES

  • LACK OF BACK PRESSURE: drops tuples unpredictably
  • EFFICIENCY: a serialization program consumes 75 cores at 30% CPU; a topology consumes 600 cores at 20-30% CPU
  • NO BATCHING: tuple-oriented system; implicit batching only via 0MQ

SLIDE 24

EVOLUTION OR REVOLUTION?

Fix Storm or develop a new system?

  • FIX STORM? Fundamental issues require extensive rewriting: several queues for moving data; inflexible, and requires a longer development cycle
  • USE EXISTING OPEN SOURCE SOLUTIONS? Issues working at scale / lack of required performance; incompatible API and a long migration process

SLIDE 25

HERON

SLIDE 26

HERON DESIGN GOALS

  • FULLY API COMPATIBLE WITH STORM: directed acyclic graph; topologies, spouts and bolts
  • USE OF MAINSTREAM LANGUAGES: C++/Java/Python
  • TASK ISOLATION: ease of debuggability / resource isolation / profiling

SLIDE 27

HERON ARCHITECTURE

[Diagram: TOPOLOGY SUBMISSION goes to a Scheduler, which runs Topology 1, Topology 2, Topology 3, ... Topology N as independent jobs]

SLIDE 28

TOPOLOGY ARCHITECTURE

[Diagram: a Topology Master syncs the logical plan, physical plan and execution state with a ZK CLUSTER; each CONTAINER runs a Stream Manager, a Metrics Manager and Heron instances I1-I4, all wired according to the physical plan]

SLIDE 29

TOPOLOGY MASTER

  • ASSIGNS ROLES
  • MONITORING
  • METRICS

Solely responsible for the entire topology

SLIDE 30

TOPOLOGY MASTER

[Diagram: the Topology Master stores the logical plan, physical plan and execution state in the ZK CLUSTER]

  • PREVENTS MULTIPLE TMs FROM BECOMING MASTER
  • ALLOWS OTHER PROCESSES TO DISCOVER THE TM
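Both properties match what a ZooKeeper ephemeral node provides, which is the mechanism the Heron paper describes: creating the node fails if another TM already holds it, and the node doubles as a discovery record. A minimal sketch using the raw ZooKeeper client; the path and payload are illustrative:

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    ZooKeeper zk = new ZooKeeper("zkhost:2181", 30000, event -> {});

    try {
      // Ephemeral: the node vanishes automatically if this TM's session dies.
      zk.create("/heron/topologies/word-count/tmaster",
                "tm-host:9000".getBytes(),            // where others can reach the TM
                ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.EPHEMERAL);
      // We are the master for this topology.
    } catch (KeeperException.NodeExistsException e) {
      // Another TM beat us to it; only one master can exist.
    }

    // Discovery: any process reads the node to find the TM's address.
    byte[] addr = zk.getData("/heron/topologies/word-count/tmaster", false, null);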
SLIDE 31

STREAM MANAGER

  • ROUTES TUPLES
  • BACK PRESSURE
  • ACK MGMT

Routing engine

SLIDE 32

STREAM MANAGER

[Diagram: the running example topology: spout S1 and bolts B2, B3 and B4]

SLIDE 33

STREAM MANAGER

[Diagram: the S1/B2/B3/B4 example topology spread over four containers; within a container, instances connect only to their local Stream Manager, and the Stream Managers are fully connected to each other]

With n instances connected directly to one another the topology would need O(n²) connections; with one Stream Manager per container, only the k Stream Managers interconnect, so it needs O(k²) connections, where k ≪ n.

SLIDE 34

STREAM MANAGER

TCP back pressure

[Diagram: the same four containers; a slow instance backs up the TCP connections between the Stream Managers]

SLOWS UPSTREAM AND DOWNSTREAM INSTANCES

SLIDE 35

STREAM MANAGER

Spout back pressure

[Diagram: when a container falls behind, its Stream Manager signals the other Stream Managers, which clamp their local S1 spouts so no new data enters the topology]

SLIDE 36

STREAM MANAGER

Stage-by-stage back pressure

[Diagram: pressure propagates upstream one stage at a time, from the congested stage through the B2 instances back to the S1 spouts]

SLIDE 37

STREAM MANAGER

Back pressure advantages:
  • PREDICTABILITY: tuple failures are more deterministic
  • SELF ADJUSTS: the topology goes as fast as its slowest component
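The spout back pressure of slide 35 is driven by high/low watermarks on the Stream Manager's buffers. The sketch below is a conceptual illustration of that watermark logic, not Heron's actual implementation; the thresholds and broadcast hooks are made up:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.atomic.AtomicBoolean;

    // Conceptual high/low watermark back pressure on a buffer of pending tuples.
    class BackPressureBuffer<T> {
      private static final int HIGH_WATERMARK = 80_000;  // start back pressure
      private static final int LOW_WATERMARK  = 20_000;  // stop back pressure

      private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();
      private final AtomicBoolean spoutsClamped = new AtomicBoolean(false);

      void offer(T tuple) {
        pending.add(tuple);
        // Crossing the high watermark: tell every Stream Manager to stop
        // reading from its local spouts, so no new data enters the topology.
        if (pending.size() >= HIGH_WATERMARK && spoutsClamped.compareAndSet(false, true)) {
          broadcastStartBackPressure();
        }
      }

      T poll() {
        T tuple = pending.poll();
        // Draining below the low watermark: release the spouts again.
        if (pending.size() <= LOW_WATERMARK && spoutsClamped.compareAndSet(true, false)) {
          broadcastStopBackPressure();
        }
        return tuple;
      }

      // Stand-ins for the Stream Manager control messages.
      void broadcastStartBackPressure() { /* notify peer Stream Managers */ }
      void broadcastStopBackPressure()  { /* notify peer Stream Managers */ }
    }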
SLIDE 38

HERON INSTANCE

  • RUNS ONE TASK
  • EXPOSES API
  • COLLECTS METRICS

Does the real work!

SLIDE 39

HERON INSTANCE

[Diagram: a Gateway Thread talks to the Stream Manager and Metrics Manager; a Task Execution Thread runs the user code; the two communicate through data-in, data-out and metrics-out queues]
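A conceptual sketch of that two-thread layout (illustrative, not Heron's real code): the Gateway Thread owns all communication, the Task Execution Thread runs only user logic, and the three queues are the sole point of contact:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Conceptual two-thread Heron instance: all I/O on the gateway thread,
    // all user logic on the task execution thread, bounded queues in between.
    class HeronInstanceSketch {
      private final BlockingQueue<Object> dataIn     = new LinkedBlockingQueue<>(10_000);
      private final BlockingQueue<Object> dataOut    = new LinkedBlockingQueue<>(10_000);
      private final BlockingQueue<Object> metricsOut = new LinkedBlockingQueue<>(10_000);

      void start() {
        Thread gateway = new Thread(() -> {
          while (true) {
            dataIn.offer(receiveFromStreamManager());   // network -> data-in queue
            Object out = dataOut.poll();
            if (out != null) sendToStreamManager(out);  // data-out queue -> network
            Object m = metricsOut.poll();
            if (m != null) sendToMetricsManager(m);     // metrics-out queue -> network
          }
        }, "gateway");

        Thread task = new Thread(() -> {
          while (true) {
            try {
              Object tuple = dataIn.take();             // block until work arrives
              dataOut.offer(execute(tuple));            // run the spout/bolt logic
              metricsOut.offer("tuple-processed");
            } catch (InterruptedException e) {
              return;
            }
          }
        }, "task-execution");

        gateway.start();
        task.start();
      }

      // Stand-ins for the real network and user-code hooks.
      Object receiveFromStreamManager()     { return new Object(); }
      void   sendToStreamManager(Object t)  {}
      void   sendToMetricsManager(Object m) {}
      Object execute(Object tuple)          { return tuple; }
    }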

SLIDE 40

OPERATIONAL EXPERIENCES

SLIDE 41

HERON DEPLOYMENT

[Diagram: Topology 1 through Topology N run on the Aurora Scheduler, backed by Aurora Services and a ZK CLUSTER; observability is provided by Heron Tracker, Heron VIZ and Heron Web]

SLIDE 42

HERON SAMPLE TOPOLOGIES

SLIDE 43

SAMPLE TOPOLOGY DASHBOARD

SLIDE 44

HERON @TWITTER

  • Large amounts of data produced every day
  • Large cluster; several topologies deployed, ranging from 1 stage to 10 stages
  • Several billion messages every day
  • 3x reduction in cores and memory
  • Storm is decommissioned

SLIDE 45

HERON PERFORMANCE

SLIDE 46

HERON PERFORMANCE

Settings:

COMPONENTS          EXPT #1   EXPT #2   EXPT #3   EXPT #4
Spout               25        100       200       300
Bolt                25        100       200       300
# Heron containers  25        100       200       300
# Storm workers     25        100       200       300

SLIDE 47

HERON PERFORMANCE

Word count topology, acknowledgements enabled:

[Chart: throughput in million tuples/min (scale 350-1400) vs spout parallelism (25, 100, 200, 500), Storm vs Heron]
[Chart: latency in ms (scale 625-2500) vs spout parallelism (25, 100, 200, 500), Storm vs Heron]

Throughput: 10-14x higher in Heron
Latency: 5-15x lower in Heron

SLIDE 48

HERON PERFORMANCE

Word count topology, CPU usage:

[Chart: # cores used (scale 625-2500) vs spout parallelism (25, 100, 200, 500), Storm vs Heron]

2-3x fewer cores in Heron

SLIDE 49

HERON PERFORMANCE

Word count topology, throughput and CPU usage with no acknowledgements:

[Chart: throughput in million tuples/min (scale 1250-5000) vs spout parallelism (25, 100, 200, 500), Storm vs Heron]
[Chart: # cores used (scale 625-2500) vs spout parallelism (25, 100, 200, 500), Storm vs Heron]

SLIDE 50

HERON EXPERIMENT

RTAC topology:

CLIENT EVENT SPOUT → (SHUFFLE GROUPING) → DISTRIBUTOR BOLT → (FIELDS GROUPING) → USER COUNT BOLT → (FIELDS GROUPING) → AGGREGATOR BOLT
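Wired with the same API as the word count topology; every class name here and the grouping fields ("user", "key") are hypothetical stand-ins for the real RTAC components:

    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.tuple.Fields;

    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("client-events", new ClientEventSpout(), 4);
    builder.setBolt("distributor", new DistributorBolt(), 4)
           .shuffleGrouping("client-events");                  // spread events evenly
    builder.setBolt("user-count", new UserCountBolt(), 4)
           .fieldsGrouping("distributor", new Fields("user")); // all events for a user together
    builder.setBolt("aggregator", new AggregatorBolt(), 2)
           .fieldsGrouping("user-count", new Fields("key"));   // partial counts to one place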

SLIDE 51

HERON PERFORMANCE

CPU usage, RTAC topology:

[Chart: # cores used (scale 100-400) with acknowledgements enabled, Storm vs Heron]
[Chart: # cores used (scale 100-400) with no acknowledgements, Storm vs Heron]

SLIDE 52

HERON PERFORMANCE

Latency with acknowledgements enabled, RTAC topology:

[Chart: latency in ms (scale 17.5-70), Storm vs Heron]

SLIDE 53

CURIOUS TO LEARN MORE…

Twitter Heron: Stream Processing at Scale

Sanjeev Kulkarni, Nikunj Bhagat, Maosong Fu, Vikas Kedigehalli, Christopher Kellogg, Sailesh Mittal, Jignesh M. Patel*, Karthik Ramasamy, Siddarth Taneja

@sanjeevrk, @challenger_nik, @Louis_Fumaosong, @vikkyrk, @cckellogg, @saileshmittal, @pateljm, @karthikz, @staneja

Twitter, Inc., *University of Wisconsin – Madison

Storm @Twitter

Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel*, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, Dmitriy Ryaboy

@ankitoshniwal, @staneja, @amits, @karthikz, @pateljm, @sanjeevrk, @jason_j, @krishnagade, @Louis_Fumaosong, @jakedonham, @challenger_nik, @saileshmittal, @squarecog

Twitter, Inc., *University of Wisconsin – Madison

SLIDE 54

CONCLUSION

  • SIMPLIFIED ARCHITECTURE: easy to debug, profile and support
  • HIGH PERFORMANCE: 7-10x increase in throughput; 5-10x improvement in latency
  • EFFICIENCY: 3-5x decrease in resource usage

SLIDE 55

#ThankYou

FOR LISTENING

SLIDE 56

QUESTIONS and ANSWERS

Go ahead. Ask away.