

  1. Designing Hybrid Data Processing Systems for Heterogeneous Servers
     Peter Pietzuch
     Large-Scale Distributed Systems (LSDS) Group, Imperial College London
     http://lsds.doc.ic.ac.uk <prp@imperial.ac.uk>
     University of Cambridge – Cambridge, United Kingdom – November 2017

  2. Data is the New Oil
  • Many new sources of data become available – most data is produced continuously: Internet services and web sites, social feeds, IoT devices, cameras, RFID tags, mobile devices, data repositories, scientific instruments
  • Data powers a plethora of new and personalised services
  Peter Pietzuch - Imperial College London

  3. Data-Intensive Systems
  • Data analytics over web click streams
    – How to maximise user experience with relevant content?
    – How to analyse “click paths” to trace the most common user routes?
    (Figure: pyramid of the volume of available data – hits, page views, visits, unique visitors, uniquely identified visitors)
  • Machine learning models for online prediction – e.g. serving adverts on search engines
  • Solution: AdPredictor
    – Bayesian learning algorithm ranks adverts according to click probabilities (predict/update cycle over labels y ∈ {−1, 1})

  4. Throughput and Result Freshness Matter …
  Data-intensive systems need high-throughput processing and low-latency results:
  – Facebook Insights: aggregates 9 GB/s with < 10 sec latency
  – Feedzai: 40K credit card transactions/s with < 25 ms latency
  – Google Zeitgeist: 40K user queries/s with < 1 ms latency
  – NovaSparks: 150M trade options/s with < 1 ms latency

  5. Design Space for Data-Intensive Systems
  • Tension between performance and algorithmic complexity
  (Figure: data amount (MBs to TBs) against result latency (10s down to 1ms) – small data with relaxed latency is easy for most algorithms, tighter latency budgets are hard for machine learning algorithms, and large data at low latency is hard for all algorithms)

  6. Algorithmic Complexity Increases
  • From simple to complex: complex pattern matching (Complex Event Processing), topic-based and content-based filtering (publish/subscribe), stream queries, and online machine learning and data mining (stream processing)
  • Recurring processing patterns: pre-process, share state, iterate, parallelise, aggregate (e.g. tasks T1(a, b, c), T2(c, d, e), T3(g, i, h) over a Vehicles stream with highway, segment, direction and speed fields)

  7. Scale Out Model in Data Centres

  8. Task Parallelism vs. Data Parallelism
  • Input data is processed by servers in the data centre to produce results
  • Example queries shown on the slide – a self-join over a Payments stream and a windowed aggregation over a Vehicles stream:

      select distinct W.cid
      from Payments [range 300 seconds] as W,
           Payments [partition-by 1 row] as L
      where W.cid = L.cid and W.region != L.region

      select highway, segment, direction, AVG(speed)
      from Vehicles [range 5 seconds slide 1 second]
      group by highway, segment, direction
      having avg < 40

  • Task parallelism: multiple data processing jobs
  • Data parallelism: a single data processing job
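The distinction can be sketched in plain Python with a toy aggregation; `avg_speed`, `task_parallel`, and `data_parallel` are illustrative names, not part of any real engine:

```python
def avg_speed(tuples):
    """One data processing 'job': the average of the speed field."""
    speeds = [t["speed"] for t in tuples]
    return sum(speeds) / len(speeds)

def task_parallel(jobs, data):
    # Task parallelism: several independent jobs, each over the full input.
    return [job(data) for job in jobs]

def data_parallel(job, data, degree):
    # Data parallelism: one job, with the input partitioned across
    # `degree` workers and the partial results combined afterwards
    # (averaging the partition averages is valid here because the
    # partitions are equal-sized).
    size = len(data) // degree
    parts = [data[i * size:(i + 1) * size] for i in range(degree)]
    partials = [job(p) for p in parts]
    return sum(partials) / len(partials)

data = [{"speed": s} for s in (30, 50, 40, 60)]
print(task_parallel([avg_speed], data))   # [45.0]
print(data_parallel(avg_speed, data, 2))  # 45.0
```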

  9. Distributed Dataflow Systems
  • Idea: execute data-parallel tasks on cluster nodes
  • Tasks organised as a dataflow graph, with a parallelism degree per operator (e.g. degree 2 or 3)
  • Almost all big data systems do this: Apache Hadoop, Apache Spark, Apache Storm, Apache Flink, Google TensorFlow, ...

  10. Nobody Ever Got Fired For Using Hadoop/Spark
  • 2012 study of MapReduce workloads (A. Rowstron, D. Narayanan, A. Donnelly, G. O’Shea, A. Douglas, HotCDP’12)
    – Microsoft: median job size < 14 GB
    – Yahoo: median job size < 12.5 GB
    – Facebook: 90% of jobs < 100 GB
  • Many data-intensive jobs easily fit into memory
  • One server is cheaper and more efficient than a compute cluster

  11. Parallelism of Heterogeneous Servers
  • Servers have many parallel CPU cores; heterogeneous servers with GPUs are common
  (Figure: two CPU sockets with eight cores and L2/L3 caches each, connected over the PCIe bus to a GPU with many SMX units, a command queue, and DMA access to DRAM)
  • 10s of CPU cores versus 1000s of GPU cores
  • New types of compute accelerators: Xeon Phi, Google's TPUs, FPGAs, ...

  12. Servers Are Becoming Increasingly Heterogeneous
  (Slide courtesy of Torsten Hoefler, Systems Group, ETH Zürich)
  ⇒ How can data-intensive systems exploit heterogeneous hardware?

  13. Roadmap
  • SABER: hybrid stream processing engine for heterogeneous servers [SIGMOD’16]
    (1) How to parallelise computation on modern hardware?
    (2) How to utilise heterogeneous servers?
    (3) Experimental performance results

  14. Analytics with Window-based Stream Queries
  • Real-time analytics over data streams
  • Windows define a finite data amount for processing (a window over the now-ending portion of the Vehicles stream with highway, segment, direction and speed fields)
    – Time-based window with size τ at current time t covers [t − τ : t]: Vehicles[Range τ seconds]
    – Count-based window with size n covers the last n tuples: Vehicles[Rows n]
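The two window types can be sketched as follows, assuming each tuple carries a timestamp in seconds (`time_window` and `count_window` are illustrative names):

```python
def time_window(stream, t, tau):
    """Time-based window [t - tau : t] over timestamped tuples."""
    return [x for x in stream if t - tau <= x["ts"] <= t]

def count_window(stream, n):
    """Count-based window: the last n tuples."""
    return stream[-n:]

stream = [{"ts": i, "speed": 40 + i} for i in range(10)]   # ts 0..9
print(len(time_window(stream, t=9, tau=4)))                # 5 tuples: ts 5..9
print([x["ts"] for x in count_window(stream, 3)])          # [7, 8, 9]
```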

  15. Defining Stream Query Semantics
  • Windows convert data streams into dynamic relations (database tables)
  • Any relational query (select, project, join, group by, etc.) can then be applied to those relations
  • Stream operators convert relations back into streams: Istream, Dstream, Rstream
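A minimal sketch of the three relation-to-stream operators, comparing the window relation at two consecutive instants (list-based for clarity; real engines work over timestamped multisets):

```python
def istream(prev, curr):
    # Istream: tuples newly inserted into the relation at this instant.
    return [t for t in curr if t not in prev]

def dstream(prev, curr):
    # Dstream: tuples deleted from the relation at this instant.
    return [t for t in prev if t not in curr]

def rstream(prev, curr):
    # Rstream: the entire relation at this instant.
    return list(curr)

prev = [1, 2, 3]   # window contents at time t-1
curr = [2, 3, 4]   # window contents at time t
print(istream(prev, curr))  # [4]
print(dstream(prev, curr))  # [1]
print(rstream(prev, curr))  # [2, 3, 4]
```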

  16. SQL Stream Queries
  • SQL provides well-defined declarative semantics for queries
    – Based on relational algebra (select, project, join, ...)
  • Example: identify slow-moving traffic on a highway
    – Input stream: Vehicles(highway, segment, direction, speed)
    – Find highway segments with average speed below 40 km/h

      select highway, segment, direction, AVG(speed) as avg
      from Vehicles [range 5 sec slide 1 sec]
      group by highway, segment, direction
      having avg < 40
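A hand-rolled Python equivalent of the query's per-window work (one window's tuples in, the groups below the threshold out); this is only a sketch of the semantics, not of SABER's execution:

```python
from collections import defaultdict

def slow_segments(window_tuples, threshold=40):
    # Group by (highway, segment, direction), then keep groups whose
    # average speed is below the threshold -- the HAVING clause.
    groups = defaultdict(list)
    for t in window_tuples:
        groups[(t["highway"], t["segment"], t["direction"])].append(t["speed"])
    return {k: sum(v) / len(v) for k, v in groups.items()
            if sum(v) / len(v) < threshold}

window = [
    {"highway": 1, "segment": 7, "direction": 0, "speed": 35},
    {"highway": 1, "segment": 7, "direction": 0, "speed": 30},
    {"highway": 1, "segment": 8, "direction": 0, "speed": 80},
]
print(slow_segments(window))  # {(1, 7, 0): 32.5}
```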

  17. (1) How to Parallelise Computation?
  • Perform query evaluation across sliding windows in parallel
    – Exploit data parallelism across the stream (e.g. windows w1–w4 of size 4 sec with a 1 sec slide over tuples 1–6)

  18. How to use GPUs with Stream Queries?
  • Naive strategy parallelises computation along window boundaries: each task processes one full window (size 4 sec, slide 1 sec), and partial results are combined afterwards
  ⇒ Window-based parallelism results in redundant computation, because overlapping windows are processed repeatedly

  19. How to use GPUs with Stream Queries?
  • Parallel processing of non-overlapping window data: each task T1–T5 processes one slide's worth of new tuples, and partial results are combined across windows w1–w4
  ⇒ Slide-based parallelism limits the degree of parallelism
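Slide-based decomposition can be sketched with panes: aggregate each non-overlapping slide once, then combine pane results per window. This avoids the redundant work of the naive scheme, but the number of independent tasks is fixed by the query's slide rather than by the hardware. Function names are illustrative, and the sketch assumes the slide divides the window size:

```python
def pane_sums(stream, slide):
    # One aggregation task per non-overlapping pane of `slide` tuples.
    return [sum(stream[i:i + slide]) for i in range(0, len(stream), slide)]

def window_sums(stream, size, slide):
    # Each window of `size` tuples is the combination of size // slide
    # consecutive pane results -- no tuple is aggregated twice.
    panes = pane_sums(stream, slide)
    per_window = size // slide
    return [sum(panes[i:i + per_window])
            for i in range(len(panes) - per_window + 1)]

stream = [1, 2, 3, 4, 5, 6]
print(window_sums(stream, size=4, slide=2))  # [10, 18]
print(window_sums(stream, size=4, slide=1))  # [10, 14, 18]
```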

  20. Apache Spark: Small Slides → Low Throughput

      select AVG(S.1) from S [rows 1024 slide x]

  (Figure: throughput in 10^6 tuples/s against the window slide in 10^6 tuples – throughput drops sharply as the slide shrinks)
  • Spark relates the window slide to the micro-batch size used for parallelisation
  ⇒ Avoid coupling system parameters with the query definition

  21. SABER: Parallel Window Processing
  • Idea: parallelise using the task size that is best for the hardware (e.g. tasks T1–T3 of 5 tuples each over a query with a window size of 7 tuples and a slide of 2 tuples, covering windows w1–w5)
  • A task contains one or more window fragments
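The decomposition can be sketched as follows: tasks have a fixed length chosen for the hardware, and each task covers fragments of the overlapping windows that intersect it (`task_fragments` is an illustrative helper, not SABER's API):

```python
def task_fragments(task_start, task_len, size, slide, stream_len):
    """Which (window id, fragment extent) pairs fall inside one task."""
    frags = []
    w = 0
    while w * slide < stream_len:
        w_start, w_end = w * slide, w * slide + size   # window extent
        lo = max(w_start, task_start)                  # overlap with task
        hi = min(w_end, task_start + task_len)
        if lo < hi:
            frags.append((w, (lo, hi)))
        w += 1
    return frags

# First 5-tuple task of a query with size 7 and slide 2 over 15 tuples:
# it holds all of window 0's first 5 tuples plus fragments of windows 1-2.
print(task_fragments(0, 5, size=7, slide=2, stream_len=15))
# [(0, (0, 5)), (1, (2, 5)), (2, (4, 5))]
```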

  22. SABER: Window Fragment Processing
  • Process window fragments in parallel
  • Reassemble partial results to obtain the overall result
  (Figure: workers A and B execute tasks T1 and T2, storing partial results for windows w1–w5 in slots of a circular buffer; a result stage merges them into the output stream)
  • Partial result reassembly must also be done in parallel
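The result stage can be sketched as an in-order emitter over slots indexed by task id: workers may finish out of order, but results are released only as the in-order prefix completes (slot and buffer names are illustrative):

```python
def result_stage(completed):
    """completed: dict task_id -> partial result, possibly out of order."""
    slots, out, next_id = dict(completed), [], 0
    while next_id in slots:          # emit the longest in-order prefix
        out.append(slots.pop(next_id))
        next_id += 1
    return out, slots                # emitted results, still-pending slots

# Worker B finished task 2 before task 1 arrived, so only task 0's
# result can be emitted; task 2's result waits in its slot.
done, pending = result_stage({0: "w1", 2: "w3"})
print(done, pending)   # ['w1'] {2: 'w3'}
```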

  23. API for Operator Implementation
  • Fragment function f_f – processes window fragments (e.g. the fragments of windows w1 and w2 inside 5-tuple tasks T1 and T2, for a window size of 7 tuples and slide of 2 tuples)
  • Assembly function f_a – merges partial window results into the output
  • Batch function f_b – composes fragment functions within a task, allowing incremental processing
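For an AVG operator, the fragment and assembly functions might look as follows; the signatures here are illustrative, and SABER's actual API differs in detail:

```python
def fragment_fn(fragment):
    # f_f: reduce one window fragment to a partial (sum, count) pair.
    speeds = [t["speed"] for t in fragment]
    return (sum(speeds), len(speeds))

def assembly_fn(p1, p2):
    # f_a: merge two partial results for the same window.
    return (p1[0] + p2[0], p1[1] + p2[1])

def finish(partial):
    # Turn the fully merged partial into the window's AVG.
    s, n = partial
    return s / n

left  = fragment_fn([{"speed": 30}, {"speed": 50}])   # fragment in task 1
right = fragment_fn([{"speed": 40}])                  # fragment in task 2
print(finish(assembly_fn(left, right)))               # 40.0
```

Because (sum, count) pairs merge associatively, fragments can be processed in any order on any processor, which is what lets one window span several tasks.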
