

SLIDE 1

CIEL

A universal execution engine for distributed data-flow computing

Murray, Derek G., et al. [1]

LSDPO (2017/2018) Paper Presentation Ioana Bica (ib354)

SLIDE 2

Overview

1. Motivation and related work
2. CIEL’s contributions
3. Dynamic task graph and system architecture
4. Skywriting
5. Fault tolerance
6. Evaluation
7. Final remarks

SLIDE 3

Motivation

  • Existing distributed execution engines (MapReduce and Dryad) were inefficient for iterative algorithms.

[Figure: a MapReduce job [2] and a Dryad job [3]]

SLIDE 4

Related work

Adding iteration capabilities to MapReduce:

  • CGL-MapReduce
  • HaLoop
  • Apache Mahout

  • They do not provide transparent fault tolerance.
  • They do not support task dependency graphs.
  • Consecutive iterations increase job latency.

SLIDE 5

Related work

Providing data-dependent control flow:

  • Pregel (Google’s execution engine)
  • Piccolo (data-centric programming model)

Limitations:

  • Composition of multiple computations is not possible.
  • They only operate on a single dataset.
  • They do not provide transparent scaling.
  • Fault tolerance involves checkpointing.

SLIDE 6

CIEL

  • dynamic control flow
  • dynamic task dependencies
  • transparent fault tolerance
  • transparent scaling
  • data locality

Can execute iterative and recursive algorithms as a single job.

SLIDE 7

Contributions

CIEL:

  • dynamically builds a data-flow DAG as tasks execute
  • increases algorithmic expressibility in execution engines by allowing iterative or recursive algorithms to be executed as a single job

  • implements memoization of task results
  • makes improvements to the fault tolerance mechanism

SLIDE 8

Dynamic task graph

Consists of the following CIEL primitives:

  • objects
      ○ unstructured sequences of bytes
      ○ each with a unique name
  • references
      ○ concrete reference: an object name plus the locations that hold it (loc_1, loc_2, …, loc_n)
      ○ future reference: an object name only, for an object not yet produced
  • tasks
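These primitives can be modelled as plain data structures. A minimal Python sketch (names and layout are illustrative, not CIEL's actual representation):

```python
from dataclasses import dataclass

@dataclass
class ConcreteReference:
    """Names an object that already exists, with the locations that hold it."""
    name: str
    locations: list  # e.g. ["loc_1", "loc_2"]

@dataclass
class FutureReference:
    """Names an object that a task is expected to produce later."""
    name: str

@dataclass
class Task:
    """Atomic computation: consumes input references, promises named outputs."""
    inputs: list    # concrete or future references
    outputs: list   # names of the objects this task will publish

# A task that reads one existing chunk and promises one output object.
data = ConcreteReference("input_chunk", ["loc_1", "loc_2"])
result = FutureReference("partial_sum")
t = Task(inputs=[data], outputs=[result.name])
```

A reference is thus just a name, optionally paired with locations once the object becomes concrete.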

SLIDE 9

Tasks

Tasks are non-blocking atomic computations. A task declares its input dependencies (e.g. object_1, object_2, object_3) and its expected outputs; when it runs, it can publish objects and spawn new tasks.

Cycles cannot be formed in the dependency graph.
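The acyclicity holds because a task may only depend on objects that are already named when it is spawned. A hypothetical sketch of that invariant:

```python
tasks = []            # tasks in spawn order
known = {"input"}     # object names that exist or have been promised so far

def spawn(inputs, outputs):
    """Spawn a task. Its inputs must already be known, so a later task can
    never become a dependency of an earlier one: the graph stays a DAG
    by construction."""
    if not all(name in known for name in inputs):
        raise ValueError("task depends on an unknown object")
    known.update(outputs)
    tasks.append((inputs, outputs))

spawn(["input"], ["stage1"])
spawn(["stage1"], ["stage2"])
```

Attempting to spawn a task on an object that nothing has promised yet fails, which is exactly why no cycle can be closed.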

SLIDE 10

Dynamic task graph example

SLIDE 11

Lazy evaluation of objects

Start from the resulting object and recursively evaluate tasks as their dependencies become concrete.
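That traversal can be sketched as a recursive function (illustrative names, not CIEL's implementation):

```python
def evaluate(name, producer_of, concrete, run):
    """Lazily force the object `name`: if it is not concrete yet, first
    force the inputs of the task that produces it, then run that task."""
    if name in concrete:
        return
    task = producer_of[name]              # task whose outputs include `name`
    for dep in task["inputs"]:
        evaluate(dep, producer_of, concrete, run)
    run(task)
    concrete.update(task["outputs"])

# A two-stage chain: input -> t1 -> mid -> t2 -> out
t1 = {"inputs": ["input"], "outputs": ["mid"]}
t2 = {"inputs": ["mid"], "outputs": ["out"]}
producer_of = {"mid": t1, "out": t2}
ran = []
evaluate("out", producer_of, concrete={"input"}, run=ran.append)
```

Only tasks whose outputs are actually needed for the result are ever executed, and they run in dependency order.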

SLIDE 12

System architecture

The master maintains the current state of the dynamic task graph: it keeps track of the references published by tasks and of newly spawned tasks.


Tasks are dispatched to the worker nearest to the data.

SLIDE 13

Skywriting

  • Turing complete programming language
  • used to write parallelised jobs that can run on CIEL
  • dynamically typed
  • allows data mapping mechanisms through static file referencing

Skywriting can express arbitrary data-dependent control flow.

SLIDE 14

Key features

  • ref(url) — returns a reference to the data at url
  • spawn(f, [args, …]) — spawns a task to evaluate f on args, returning a future reference to its result
  • exec(executor, args, n) — synchronously runs an external executor with n outputs
  • spawn_exec(executor, args, n) — spawns a task that runs an external executor
  • * — the dereference unary operator: yields the value of a reference, blocking until it is concrete
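The spawn/dereference pair behaves much like futures. A rough Python analogy (the mapping to threads is an assumption for illustration; Skywriting's runtime distributes tasks across a cluster):

```python
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=2)

def spawn(f, args):
    """Like Skywriting's spawn(f, [args, ...]): start f asynchronously and
    return a future reference to its not-yet-concrete result."""
    return pool.submit(f, *args)

def deref(future):
    """Like the unary * operator: block until the reference is concrete,
    then yield its value."""
    return future.result()

fut = spawn(lambda x, y: x + y, [3, 4])
value = deref(fut)  # 7
```

Dereferencing is what introduces the implicit data dependency: the caller cannot proceed until the referenced object exists.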

SLIDE 15

Using Skywriting to create tasks

Explicitly:

  • using spawn() or spawn_exec()

Implicitly:

  • using the *-operator

SLIDE 16

Memoisation

  • CIEL memoises task results.
  • Memoisation is enabled by deterministic naming of objects: the i-th output of a task is named from its executor and a hash H(args || n) of its arguments and output count.
  • It also relies on lazy evaluation: tasks are executed only if their outputs are needed to resolve dependencies.
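A toy sketch of deterministic naming and the cache hit it enables (the hash choice and name layout are illustrative assumptions):

```python
import hashlib

cache = {}  # object name -> stored result

def output_name(executor, args, n, i):
    """Name of the i-th of n outputs: derived only from the executor and its
    arguments, so re-submitting the identical task yields identical names."""
    digest = hashlib.sha256(f"{executor}|{args!r}|{n}".encode()).hexdigest()
    return f"{digest}:{i}"

calls = 0
def run_once(executor, args, fn):
    """Execute only if the deterministically named output is not cached."""
    global calls
    name = output_name(executor, args, 1, 0)
    if name not in cache:
        calls += 1
        cache[name] = fn(*args)
    return cache[name]

a = run_once("java", (2, 3), lambda x, y: x * y)
b = run_once("java", (2, 3), lambda x, y: x * y)  # memoised: not re-executed
```

Because names depend only on the task description, a re-submitted computation resolves to the already-stored object instead of running again.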

SLIDE 17

Fault tolerance

  • Worker failures are handled similarly to Dryad:
      ○ re-execute the tasks performed by the failed worker
      ○ re-execute the tasks that used data from the failed worker
  • Master failure does not force the entire job to fail:
      ○ derive the master state from the set of active jobs
      ○ use persistent logging and secondary masters
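The worker-failure rule amounts to selecting affected tasks from bookkeeping the master already holds. A hypothetical one-hop sketch (in CIEL, lazy re-evaluation then handles further transitive dependencies):

```python
def tasks_to_rerun(failed_worker, ran_on, inputs_of, produced_by):
    """Tasks that ran on the failed worker, plus tasks that consumed an
    object produced there (their input data is lost with the worker)."""
    lost_tasks = {t for t, w in ran_on.items() if w == failed_worker}
    rerun = set(lost_tasks)
    for task, inputs in inputs_of.items():
        if any(produced_by.get(obj) in lost_tasks for obj in inputs):
            rerun.add(task)
    return rerun

# Toy bookkeeping: t1 ran on w1 and produced o1, which t2 (on w2) consumed.
ran_on = {"t1": "w1", "t2": "w2", "t3": "w2"}
inputs_of = {"t2": ["o1"], "t3": ["o2"]}
produced_by = {"o1": "t1", "o2": "t2"}
```

If w1 fails, both t1 (it ran there) and t2 (its input o1 was stored there) must be re-executed.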

SLIDE 18

Evaluation

  • grep benchmark
  • k-means clustering
  • dynamic programming
      ○ shows that CIEL has greater algorithmic expressivity than MapReduce
  • impact of master failures on performance
  • No recursive algorithm?

SLIDE 19

Grep

SLIDE 20

k-means clustering

  • CIEL achieves higher cluster utilization and lower constant overhead
  • CIEL is not any more scalable than Hadoop

SLIDE 21

When to use (or not) CIEL?

  • CIEL enables clients to run iterative and recursive algorithms in a highly parallelised manner, with transparent fault tolerance and transparent scaling.
  • CIEL was designed for coarse-grained parallelism across large data sets:
      ○ For fine-grained parallelism, work-stealing schemes are better.
      ○ If the data fits into RAM, Piccolo is more efficient.
      ○ If jobs share a lot of data, OpenMP is more appropriate.
      ○ For better scalability and performance, use MPI.

SLIDE 22

Drawbacks and ideas for improvement

  • CIEL does not control the number of tasks it spawns.
  • Modifications to the data-flow graph during execution are centralized.
  • When a worker fails, all of the tasks that depend on the tasks executed by that worker need to be re-executed.

SLIDE 23

References

[1] Murray, Derek G., et al. "CIEL: a universal execution engine for distributed data-flow computing." Proc. 8th ACM/USENIX Symposium on Networked Systems Design and Implementation. 2011.

[2] www.cdmh.co.uk

[3] www.microsoft.com

[4] Dean, Jeffrey, and Sanjay Ghemawat. "MapReduce: simplified data processing on large clusters." Proc. 6th Symposium on Operating Systems Design and Implementation (OSDI). 2004.

[5] Isard, Michael, et al. "Dryad: distributed data-parallel programs from sequential building blocks." ACM SIGOPS Operating Systems Review 41.3 (2007).

SLIDE 24

Thank you!

SLIDE 25

Questions?
