Designing Hybrid Data Processing Systems for Heterogeneous Servers

Peter Pietzuch
Large-Scale Distributed Systems (LSDS) Group, Imperial College London
http://lsds.doc.ic.ac.uk <prp@imperial.ac.uk>
University of Cambridge – Cambridge, United Kingdom – November 2017

Data is the New Oil
• Many new sources of data are becoming available
– Most data is produced continuously
– Sources shown: Internet services and web sites, social feeds, IoT devices, cameras, RFID tags, mobile devices, data repositories, scientific instruments
• Data powers a plethora of new and personalised services

Data-Intensive Systems
• Data analytics over web click streams
– How to maximise user experience with relevant content?
– How to analyse “click paths” to trace most common user routes?
[Figure: web metrics ordered by volume of available data: Hits, Page Views, Visits, Unique Visitors, Uniquely Identified Visitors]
• Machine learning models for online prediction
– E.g. serving adverts on search engines
• Solution: AdPredictor
– Bayesian learning algorithm ranks adverts according to click probabilities
[Figure: features f_1 … f_n feed a predict step; the observed label y ∈ {−1, 1} feeds an update step]

Throughput and Result Freshness Matter
Data-intensive systems need high-throughput processing and low-latency results:
• Facebook Insights: aggregates 9 GB/s with < 10 sec latency
• Feedzai: 40K credit card transactions/s with < 25 ms latency
• Google Zeitgeist: 40K user queries/s with < 1 ms latency
• NovaSparks: 150M trade options/s with < 1 ms latency

Design Space for Data-Intensive Systems
• Tension between performance and algorithmic complexity
[Figure: data amount (MBs, GBs, TBs) plotted against result latency (10 s, 1 s, 100 ms, 10 ms, 1 ms); regions labelled "Easy for most algorithms", "Hard for machine learning algorithms" and "Hard for all algorithms"]

Algorithmic Complexity Increases
[Figure: spectrum of processing models and their workloads]
• Publish/Subscribe: topic-based filtering, content-based filtering
• Complex Event Processing (CEP): complex pattern matching
• Stream processing: stream queries
• Online machine learning, data mining
[Figure annotations: pre-process, share state, iterate, parallelize, aggregate; example inputs T1(a, b, c), T2(c, d, e), T3(g, i, h) and stream tuples (highway, segment, direction, speed)]

Scale Out Model in Data Centres

Task Parallelism vs. Data Parallelism
[Figure: input data flows through servers in the data centre to produce results]
• Task parallelism: multiple data processing jobs
    select distinct W.cid
    from Payments [range 300 seconds] as W,
         Payments [partition-by 1 row] as L
    where W.cid = L.cid and W.region != L.region

    select highway, segment, direction, AVG(speed) as avg
    from Vehicles [range 5 seconds slide 1 second]
    group by highway, segment, direction
    having avg < 40
• Data parallelism: single data processing job
    select distinct W.cid
    from Payments [range 300 seconds] as W,
         Payments [partition-by 1 row] as L
    where W.cid = L.cid and W.region != L.region

Distributed Dataflow Systems
• Idea: Execute data-parallel tasks on cluster nodes
• Tasks organised as dataflow graph
[Figure: dataflow graph whose tasks run with parallelism degree 2 and parallelism degree 3]
• Almost all big data systems do this:
– Apache Hadoop, Apache Spark, Apache Storm, Apache Flink, Google TensorFlow, ...

Nobody Ever Got Fired For Using Hadoop/Spark
• 2012 study of MapReduce workloads (A. Rowstron, D. Narayanan, A. Donnelly, G. O’Shea, A. Douglas, HotCDP’12)
– Microsoft: median job size < 14 GB
– Yahoo: median job size < 12.5 GB
– Facebook: 90% of jobs < 100 GB
• Many data-intensive jobs easily fit into memory
• One server is cheaper and more efficient than a compute cluster

Parallelism of Heterogeneous Servers
• Servers have many parallel CPU cores (10s of CPU cores)
• Heterogeneous servers with GPUs common (1000s of GPU cores)
[Figure: two CPU sockets, each with cores C1 to C8, an L3 cache and DRAM, connected over the PCIe bus to a GPU with a command queue, SMX 1 ... SMX N, an L2 cache, DMA and DRAM]
• New types of compute accelerators: Xeon Phi, Google's TPUs, FPGAs, ...

Servers Are Becoming Increasingly Heterogeneous
Slide courtesy of Torsten Hoefler (Systems Group, ETH Zürich)
→ How can Data-Intensive Systems Exploit Heterogeneous Hardware?

Roadmap
• SABER: Hybrid stream processing engine for heterogeneous servers [SIGMOD’16]
• (1) How to parallelise computation on modern hardware?
• (2) How to utilise heterogeneous servers?
• (3) Experimental performance results

Analytics with Window-based Stream Queries
• Real-time analytics over data streams
• Windows define finite data amount for processing
[Figure: stream of tuples (highway, segment, direction, speed) with a window ending at the current time, "now"]
– Time-based window with size τ at current time t: [t - τ : t], written Vehicles[Range τ seconds]
– Count-based window with size n: last n tuples, written Vehicles[Rows n]
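
The two window types can be sketched in a few lines of Python. This is only an illustration: the function names, the tuple layout (timestamp first) and the choice to treat [t - τ : t] as closed at both ends are assumptions, not something the slides specify.

```python
def time_window(tuples, now, tau):
    """Time-based window: tuples whose timestamp lies in [now - tau, now].

    Assumes each tuple starts with its timestamp and that both interval
    ends are inclusive (an assumption; the slide only shows [t - tau : t]).
    """
    return [t for t in tuples if now - tau <= t[0] <= now]


def count_window(tuples, n):
    """Count-based window: the last n tuples in arrival order."""
    return tuples[-n:] if n > 0 else []


# Hypothetical (timestamp, speed) tuples.
vehicles = [(1, 30.0), (2, 45.0), (3, 50.0), (4, 35.0), (5, 60.0)]
recent_by_time = time_window(vehicles, now=5, tau=2)   # timestamps 3, 4, 5
recent_by_count = count_window(vehicles, n=2)          # last two tuples
```

A time-based window can hold any number of tuples, whereas a count-based window always holds at most n.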

Defining Stream Query Semantics
• Windows convert data streams to dynamic relations (database table)
[Figure: a window specification turns streams into relations; any relational query (select, project, join, group by, etc.) runs over relations; the stream operators Istream, Dstream and Rstream turn relations back into streams]
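
As a rough illustration of the three relation-to-stream operators, the sketch below follows their usual CQL definitions: Istream emits tuples that entered the relation since the previous instant, Dstream emits tuples that left it, and Rstream emits the whole current relation. Modelling a relation as a Python set is a simplifying assumption (CQL relations are bags), and the function signatures are hypothetical.

```python
def istream(prev, curr):
    """Tuples inserted into the relation between two instants."""
    return curr - prev


def dstream(prev, curr):
    """Tuples deleted from the relation between two instants."""
    return prev - curr


def rstream(prev, curr):
    """The full contents of the relation at the current instant."""
    return set(curr)


# Relation contents at two consecutive instants (hypothetical values).
before = {1, 2, 3}
after = {2, 3, 4}
inserted = istream(before, after)
deleted = dstream(before, after)
current = rstream(before, after)
```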

SQL Stream Queries
• SQL provides well-defined declarative semantics for queries
– Based on relational algebra (select, project, join, …)
• Example: Identify slow moving traffic on highway
– Input stream: Vehicles(highway, segment, direction, speed)
– Find highway segments with average speed below 40 km/h
    select highway, segment, direction, AVG(speed) as avg
    from Vehicles [range 5 sec slide 1 sec]
    group by highway, segment, direction
    having avg < 40
[Figure annotations on the query: Output, Input data, Operators]
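
To make the query's meaning concrete, here is a minimal Python sketch that evaluates one window of it. It assumes tuples of the form (timestamp, highway, segment, direction, speed), a window closed at both ends, and a hypothetical function name; a real engine would re-evaluate this once per slide (every second).

```python
from collections import defaultdict


def slow_segments(tuples, now, range_sec=5, threshold=40.0):
    """Average speed per (highway, segment, direction) over one window,
    keeping only groups whose average is below the threshold."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of speeds, count]
    for ts, highway, segment, direction, speed in tuples:
        if now - range_sec <= ts <= now:          # window: [now - range, now]
            acc = totals[(highway, segment, direction)]
            acc[0] += speed
            acc[1] += 1
    averages = {key: s / c for key, (s, c) in totals.items()}   # AVG(speed)
    return {key: avg for key, avg in averages.items() if avg < threshold}


# Hypothetical input stream.
stream = [
    (1, "M1", 3, "N", 30.0),
    (2, "M1", 3, "N", 40.0),
    (3, "M1", 4, "N", 80.0),
    (9, "M1", 3, "N", 20.0),
]
at_5 = slow_segments(stream, now=5)
at_10 = slow_segments(stream, now=10)
```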

(1) How to Parallelise Computation?
• Perform query evaluation across sliding windows in parallel
– Exploit data parallelism across stream
[Figure: stream tuples 1 to 6 covered by overlapping windows w1 to w4; size: 4 sec, slide: 1 sec]

How to use GPUs with Stream Queries?
• Naive strategy parallelises computation along window boundaries
[Figure: stream tuples 1 to 6; size: 4 sec, slide: 1 sec; tasks T1 and T2 each process a window and their partial results are combined]
→ Window-based parallelism results in redundant computation

How to use GPUs with Stream Queries?
• Parallel processing of non-overlapping window data?
[Figure: stream tuples 1 to 6; size: 4 sec, slide: 1 sec; tasks T1 to T5 each process one slide of data and their partial results are combined into windows w1 to w4]
→ Slide-based parallelism limits degree of parallelism
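
The slide-based strategy can be sketched as follows: each slide-sized chunk is aggregated once, and each window result is the combination of consecutive chunk results. The sketch assumes a count-based window, a SUM aggregate and a window size that is a multiple of the slide; all names are hypothetical.

```python
def pane_sums(values, slide):
    """One partial sum per non-overlapping, slide-sized chunk.
    Each chunk is independent, so the chunks could run as parallel tasks."""
    return [sum(values[i:i + slide])
            for i in range(0, len(values) - slide + 1, slide)]


def window_sums(values, size, slide):
    """Combine consecutive chunk sums into one result per complete window.
    Assumes size is a multiple of slide."""
    panes = pane_sums(values, slide)
    panes_per_window = size // slide
    return [sum(panes[i:i + panes_per_window])
            for i in range(len(panes) - panes_per_window + 1)]


values = [1, 2, 3, 4, 5, 6]
small_slide = window_sums(values, size=4, slide=1)   # three windows
large_slide = window_sums(values, size=4, slide=2)   # two windows
```

Every tuple is aggregated exactly once, which removes the redundant work of window-based parallelism, but the amount of work per task is dictated by the slide in the query.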

Apache Spark: Small Slides → Low Throughput
    select AVG(S.1) from S [rows 1024 slide x]
[Figure: throughput (10^6 tuples/s, axis from 0 to 2) plotted against window slide (10^6 tuples, axis from 0 to 9)]
• Spark relates window slide to micro-batch size used for parallelisation
→ Avoid coupling system parameters with query definition

SABER: Parallel Window Processing
• Idea: Parallelise using task size that is best for hardware
[Figure: stream tuples 1 to 15 split into tasks T1, T2 and T3 with 5 tuples/task; windows w1 to w5 with size: 7 tuples, slide: 2 tuples]
• Task contains one or more window fragments
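
The relationship between fixed-size tasks and window fragments can be worked out with simple index arithmetic. The sketch below is an illustration under assumptions (1-based tuple, task and window numbering; a hypothetical function name), not SABER's actual implementation.

```python
def window_fragments(task_index, task_size, size, slide):
    """Map each window overlapping a task to the tuple range it shares
    with that task. Tuples, tasks and windows are numbered from 1.

    Window k covers tuples (k-1)*slide + 1 .. (k-1)*slide + size.
    Task j covers tuples (j-1)*task_size + 1 .. j*task_size.
    """
    first = (task_index - 1) * task_size + 1
    last = task_index * task_size
    fragments = {}
    k = 1
    while (k - 1) * slide + 1 <= last:       # window k starts inside or before the task
        w_first = (k - 1) * slide + 1
        w_last = w_first + size - 1
        if w_last >= first:                  # window k has not ended before the task
            fragments[k] = (max(first, w_first), min(last, w_last))
        k += 1
    return fragments


# Parameters from the slide: 5 tuples/task, size 7 tuples, slide 2 tuples.
t1 = window_fragments(1, task_size=5, size=7, slide=2)
t2 = window_fragments(2, task_size=5, size=7, slide=2)
```

With these parameters, task T1 holds fragments of windows w1 to w3 and task T2 holds fragments of w1 to w5, so the task size can be chosen for the hardware without changing the window definition.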

SABER: Window Fragment Processing
• Process window fragments in parallel
• Reassemble partial results to obtain overall result
[Figure: Worker A processes task T1 and Worker B processes task T2; their partial window results are written to slots (Slot 1, Slot 2, empty slots) of a circular buffer in the result stage, which emits the w1 and w2 results as the output result stream]
• Partial result reassembly must also be done in parallel

API for Operator Implementation
• Fragment function f_f
– Processes window fragments
• Assembly function f_a
– Merges partial window results
• Batch function f_b
– Composes fragment functions within task
– Allows incremental processing
[Figure: tasks T1 and T2 with 5 tuples/task over stream tuples 1 to 10; size: 7 tuples, slide: 2 tuples; f_f is applied to the fragments of w1 and w2, f_b composes the fragment functions within each task, and f_a merges the w1 results and w2 results into the output]
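
A minimal sketch of how the three functions might fit together for an AVG operator is shown below. The function names, signatures and the driver `run` are hypothetical, tuple and window numbering starts at 1, and this simple batch function recomputes each fragment from scratch rather than processing incrementally.

```python
def fragment_fn(values):
    """f_f: partial aggregate (sum, count) for one window fragment."""
    return (sum(values), len(values))


def assembly_fn(partials):
    """f_a: merge the partial results of one window into its average."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


def batch_fn(stream, task_index, task_size, size, slide):
    """f_b: apply fragment_fn to every window fragment inside one task."""
    first = (task_index - 1) * task_size + 1
    last = min(task_index * task_size, len(stream))
    results = {}
    k = 1
    while (k - 1) * slide + 1 <= last:
        w_first = (k - 1) * slide + 1
        w_last = w_first + size - 1
        if w_last >= first:
            lo, hi = max(first, w_first), min(last, w_last)
            results[k] = fragment_fn(stream[lo - 1:hi])   # 1-based to slice
        k += 1
    return results


def run(stream, task_size, size, slide):
    """Run all tasks (sequentially here) and assemble complete windows."""
    partials = {}
    n_tasks = -(-len(stream) // task_size)                # ceiling division
    for j in range(1, n_tasks + 1):
        for k, p in batch_fn(stream, j, task_size, size, slide).items():
            partials.setdefault(k, []).append(p)
    return {k: assembly_fn(ps) for k, ps in partials.items()
            if (k - 1) * slide + size <= len(stream)}     # skip open windows


averages = run(list(range(1, 11)), task_size=5, size=7, slide=2)
```

In this sketch the tasks run one after another; since each `batch_fn` call touches only its own slice of the stream, the calls could be handed to separate workers and only the assembly step needs the combined partial results.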
