SLIDE 1

Nathan Marz Twitter

Distributed and fault-tolerant realtime computation

Storm

SLIDE 2

Basic info

  • Open sourced September 19th
  • Implementation is 15,000 lines of code
  • Used by over 25 companies
  • >2700 watchers on GitHub (most watched JVM project)

  • Very active mailing list
  • >2300 messages
  • >670 members
SLIDE 3

Before Storm

Queues → Workers

SLIDE 4

Example

(simplified)

SLIDE 5

Example

Workers schemify tweets and append to Hadoop

SLIDE 6

Example

Workers update statistics on URLs by incrementing counters in Cassandra

SLIDE 7

Example

Use mod/hashing to make sure same URL always goes to same worker
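The routing rule on this slide can be sketched in Python. This is illustrative only, not code from the deck; the worker count and the choice of MD5 as a stable hash are assumptions for the example:

```python
import hashlib

# Illustrative sketch of pre-Storm worker routing: mod-hash the URL so
# the same URL always lands on the same worker. NUM_WORKERS is an
# assumed configuration value.
NUM_WORKERS = 4

def worker_for(url: str) -> int:
    """Pick a worker index deterministically from the URL."""
    # A stable hash (hashlib, not Python's builtin hash()) keeps the
    # routing consistent across processes and restarts.
    digest = int(hashlib.md5(url.encode()).hexdigest(), 16)
    return digest % NUM_WORKERS

# The same URL is always routed to the same worker:
assert worker_for("http://example.com") == worker_for("http://example.com")
```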

SLIDE 8

Scaling

Deploy → Reconfigure/redeploy

SLIDE 9

Problems

  • Scaling is painful
  • Poor fault-tolerance
  • Coding is tedious
SLIDE 10

What we want

  • Guaranteed data processing
  • Horizontal scalability
  • Fault-tolerance
  • No intermediate message brokers!
  • Higher level abstraction than message passing
  • “Just works”
SLIDE 11

Storm

  • Guaranteed data processing
  • Horizontal scalability
  • Fault-tolerance
  • No intermediate message brokers!
  • Higher level abstraction than message passing
  • “Just works”

SLIDE 12

Use cases

  • Stream processing
  • Continuous computation
  • Distributed RPC

SLIDE 13

Storm Cluster

SLIDE 14

Storm Cluster

Master node (similar to Hadoop JobTracker)

SLIDE 15

Storm Cluster

Used for cluster coordination

SLIDE 16

Storm Cluster

Run worker processes

SLIDE 17

Starting a topology

SLIDE 18

Killing a topology

SLIDE 19

Concepts

  • Streams
  • Spouts
  • Bolts
  • Topologies
SLIDE 20

Streams

Unbounded sequence of tuples

Tuple → Tuple → Tuple → Tuple → Tuple → Tuple → Tuple

SLIDE 21

Spouts

Source of streams

SLIDE 22

Spout examples

  • Read from Kestrel queue
  • Read from Twitter streaming API
SLIDE 23

Bolts

Processes input streams and produces new streams

SLIDE 24

Bolts

  • Functions
  • Filters
  • Aggregation
  • Joins
  • Talk to databases
SLIDE 25

Topology

Network of spouts and bolts

SLIDE 26

Tasks

Spouts and bolts execute as many tasks across the cluster

SLIDE 27

Task execution

Tasks are spread across the cluster

SLIDE 28

Task execution

Tasks are spread across the cluster

SLIDE 29

Stream grouping

When a tuple is emitted, which task does it go to?

SLIDE 30

Stream grouping

  • Shuffle grouping: pick a random task
  • Fields grouping: mod hashing on a subset of tuple fields
  • All grouping: send to all tasks
  • Global grouping: pick task with lowest id
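The four groupings above can be sketched as task-selection functions. This is a hedged simulation of the semantics described on the slide, not Storm's internals; the task-id lists and tuple shape are assumptions:

```python
import hashlib
import random

# Each grouping maps (candidate task ids, tuple) to the task ids that
# should receive the tuple.

def shuffle_grouping(tasks, tup):
    return [random.choice(tasks)]            # pick a random task

def fields_grouping(tasks, tup, fields):
    # Mod-hash a subset of the tuple's fields, so equal field values
    # always reach the same task.
    key = tuple(tup[f] for f in fields)
    h = int(hashlib.md5(repr(key).encode()).hexdigest(), 16)
    return [tasks[h % len(tasks)]]

def all_grouping(tasks, tup):
    return list(tasks)                       # send to every task

def global_grouping(tasks, tup):
    return [min(tasks)]                      # task with the lowest id

tasks = [3, 7, 11]
t = {"url": "http://a", "count": 1}
# Fields grouping is deterministic for the same field values:
assert fields_grouping(tasks, t, ["url"]) == fields_grouping(tasks, t, ["url"])
```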
SLIDE 31

Topology

Groupings on the edges of the example topology: shuffle, [“url”], shuffle, shuffle, [“id1”, “id2”], all

SLIDE 32

Streaming word count

TopologyBuilder is used to construct topologies in Java

SLIDE 33

Streaming word count

Define a spout in the topology with parallelism of 5 tasks

SLIDE 34

Streaming word count

Split sentences into words with parallelism of 8 tasks

SLIDE 35

Consumer decides what data it receives and how it gets grouped

Streaming word count

Split sentences into words with parallelism of 8 tasks

SLIDE 36

Streaming word count

Create a word count stream
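The counting step can be sketched as follows. The deck's actual code is Java (TopologyBuilder) and is not reproduced in this transcript; this Python sketch only simulates the counting bolt's logic, assuming a fields grouping on "word" so every occurrence of a word reaches the same task:

```python
from collections import defaultdict

# Hedged sketch of a word-count bolt's per-task state: because a fields
# grouping on "word" routes equal words to the same task, a plain
# in-memory counter per task is sufficient.
class WordCountBolt:
    def __init__(self):
        self.counts = defaultdict(int)

    def execute(self, word):
        self.counts[word] += 1
        # Emit the word with its running count.
        return (word, self.counts[word])

bolt = WordCountBolt()
for w in ["storm", "storm", "cluster"]:
    bolt.execute(w)
assert bolt.counts["storm"] == 2
```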

SLIDE 37

Streaming word count

splitsentence.py
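The real splitsentence.py speaks Storm's multilang protocol over stdin/stdout (via the storm Python module); only the core splitting logic is sketched here:

```python
# Core logic of the split-sentence bolt: each sentence tuple is split
# on spaces and each word is emitted as a one-field tuple. The
# multilang plumbing of the real splitsentence.py is omitted.
def split_sentence(sentence):
    for word in sentence.split(" "):
        yield (word,)

assert list(split_sentence("the cow jumped")) == [("the",), ("cow",), ("jumped",)]
```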

SLIDE 38

Streaming word count

SLIDE 39

Streaming word count

Submitting topology to a cluster

SLIDE 40

Streaming word count

Running topology in local mode

SLIDE 41

Demo

SLIDE 42

Distributed RPC

Data flow for Distributed RPC

SLIDE 43

DRPC Example

Computing “reach” of a URL on the fly

SLIDE 44

Reach

Reach is the number of unique people exposed to a URL on Twitter

SLIDE 45

Computing reach

URL → Tweeters → Followers → Distinct followers → Count → Reach

SLIDE 46

Reach topology

SLIDE 47

Reach topology

SLIDE 48

Reach topology

SLIDE 49

Reach topology

Keep set of followers for each request id in memory

SLIDE 50

Reach topology

Update followers set when receive a new follower

SLIDE 51

Reach topology

Emit partial count after receiving all followers for a request id
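The three slides above describe the partial-uniquing bolt's state machine. A hedged Python sketch of that state (class and method names are illustrative, not from the deck):

```python
# Sketch of a partial-uniquing bolt: keep a set of followers per
# request id, and emit the distinct count once all followers for that
# request have arrived.
class PartialUniquer:
    def __init__(self):
        self.followers = {}   # request id -> set of follower ids

    def add(self, request_id, follower):
        # Update the followers set when a new follower arrives;
        # the set de-duplicates automatically.
        self.followers.setdefault(request_id, set()).add(follower)

    def partial_count(self, request_id):
        # Emit the partial count and release the in-memory state.
        return len(self.followers.pop(request_id, set()))

u = PartialUniquer()
for f in ["alice", "bob", "alice"]:
    u.add("req-1", f)
assert u.partial_count("req-1") == 2   # duplicate follower counted once
```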

SLIDE 52

Demo

SLIDE 53

Guaranteeing message processing

“Tuple tree”

SLIDE 54

Guaranteeing message processing

  • A spout tuple is not fully processed until all tuples in the tree have been completed

SLIDE 55

Guaranteeing message processing

  • If the tuple tree is not completed within a specified timeout, the spout tuple is replayed

SLIDE 56

Guaranteeing message processing

Reliability API

SLIDE 57

Guaranteeing message processing

“Anchoring” creates a new edge in the tuple tree

SLIDE 58

Guaranteeing message processing

Acking marks a single node in the tree as complete

SLIDE 59

Guaranteeing message processing

  • Storm tracks tuple trees for you in an extremely efficient way
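The efficient tracking is commonly described as XOR-based: every tuple gets a random 64-bit id, and each anchor and each ack XORs that id into a single per-spout-tuple value, which returns to zero exactly when every anchored tuple has also been acked. A sketch of the idea (the class is illustrative, not Storm's acker):

```python
import os

# Sketch of XOR-based tuple-tree tracking: anchoring and acking both
# XOR a tuple's random 64-bit id into one accumulator. Because x ^ x
# == 0, the accumulator is zero exactly when every anchored tuple has
# been acked, regardless of order, using constant memory per tree.
class TreeTracker:
    def __init__(self):
        self.ack_val = 0

    def anchor(self, tuple_id):
        self.ack_val ^= tuple_id

    def ack(self, tuple_id):
        self.ack_val ^= tuple_id

    def complete(self):
        return self.ack_val == 0

t = TreeTracker()
ids = [int.from_bytes(os.urandom(8), "big") for _ in range(3)]
for i in ids:
    t.anchor(i)
assert not t.complete()     # anchored but not yet acked
for i in ids:
    t.ack(i)
assert t.complete()         # every anchored tuple acked
```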

SLIDE 60

Transactional topologies

How do you do idempotent counting with an at-least-once delivery guarantee?

SLIDE 61

Transactional topologies

Won’t you overcount?

SLIDE 62

Transactional topologies

Transactional topologies solve this problem

SLIDE 63

Transactional topologies

Built completely on top of Storm’s primitives of streams, spouts, and bolts

SLIDE 64

Transactional topologies

Enables fault-tolerant, exactly-once messaging semantics

SLIDE 65

Transactional topologies

Process small batches of tuples

Batch 1 → Batch 2 → Batch 3

SLIDE 66

Transactional topologies

If a batch fails, replay the whole batch

Batch 1 → Batch 2 → Batch 3

SLIDE 67

Transactional topologies

Once a batch is completed, commit the batch

Batch 1 → Batch 2 → Batch 3

SLIDE 68

Transactional topologies

Bolts can optionally be “committers”

Batch 1 → Batch 2 → Batch 3

SLIDE 69

Transactional topologies

Commits are ordered. If there’s a failure during commit, the whole batch + commit is retried

Commit 1 → Commit 2 → Commit 3 → Commit 4 → Commit 4 (retried)

SLIDE 70

Example

SLIDE 71

Example

New instance of this object for every transaction attempt

SLIDE 72

Example

Aggregate the count for this batch

SLIDE 73

Example

Only update database if transaction ids differ

SLIDE 74

Example

This enables idempotency since commits are ordered
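The trick from the last few slides can be sketched in a few lines. A hedged simulation, with a plain dict standing in for the real database:

```python
# Sketch of transactional counting: store the last applied transaction
# id alongside the count, and skip the update when the same transaction
# is replayed. Because commits are ordered, seeing an equal txid means
# this exact batch was already applied.
db = {"count": 0, "txid": None}

def commit(batch_count, txid):
    if db["txid"] != txid:          # only update if transaction ids differ
        db["count"] += batch_count
        db["txid"] = txid

commit(5, txid=1)
commit(5, txid=1)   # replay of the same batch: no double counting
commit(3, txid=2)
assert db["count"] == 8
```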

SLIDE 75

Example

(Credit goes to Kafka devs for this trick)

SLIDE 76

Transactional topologies

Multiple batches can be processed in parallel, but commits are guaranteed to be ordered

SLIDE 77
Transactional topologies

  • Requires a more sophisticated source queue than Kestrel or RabbitMQ
  • storm-contrib has a transactional spout implementation for Kafka

SLIDE 78

Storm UI

SLIDE 79

Storm on EC2

One-click deploy tool: https://github.com/nathanmarz/storm-deploy

SLIDE 80

Starter code

Example topologies: https://github.com/nathanmarz/storm-starter

SLIDE 81

Documentation

SLIDE 82

Ecosystem

  • Scala, JRuby, and Clojure DSLs
  • Kestrel, Redis, AMQP, JMS, and other spout adapters
  • Multilang adapters
  • Cassandra, MongoDB integration
SLIDE 83

Questions?

http://github.com/nathanmarz/storm

SLIDE 84

Future work

  • State spout
  • Storm on Mesos
  • “Swapping”
  • Auto-scaling
  • Higher level abstractions
SLIDE 85

Implementation

KafkaTransactionalSpout

SLIDE 86

Implementation


SLIDE 87

Implementation


TransactionalSpout is a subtopology consisting of a spout and a bolt

SLIDE 88

Implementation


The spout consists of one task that coordinates the transactions

SLIDE 89

Implementation


The bolt emits the batches of tuples

SLIDE 90

Implementation


The coordinator emits a “batch” stream and a “commit” stream

SLIDE 91

Implementation


Batch stream

SLIDE 92

Implementation


Commit stream

SLIDE 93

Implementation


The coordinator reuses the tuple-tree framework to detect success or failure of batches or commits, and replays appropriately