CS 744: NAIAD Shivaram Venkataraman Fall 2019

Jun 18, 2023

ADMINISTRIVIA
- Course Project Proposal feedback
- Midterm grades
- Checkins?
Applications: Machine Learning, SQL, Streaming, Graph
Computational Engines
Scalable Storage Systems
Resource Management
Datacenter Architecture
DASHBOARDS
Streaming + ITERATIVE COMPUTATION
TIMELY DATAFLOW
VERTEX API
Receiving Messages
- v.OnRecv(e : Edge, m : Msg, t : Time)
- v.OnNotify(t : Timestamp)
Sending Messages
- this.SendBy(e : Edge, m : Msg, t : Time)
- this.NotifyAt(t : Timestamp)
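The four calls above can be illustrated with a small Python sketch. Everything here is a hypothetical stand-in, not Naiad's actual C# API: `ToyRuntime`, `Vertex`, `DistinctCount`, and `Collect` are invented for illustration, method names are snake_cased, and timestamps are plain integer epochs. The toy runtime delivers a notification only when no messages are queued, which is safe for this one-notifying-vertex pipeline but is not a general progress tracker.

```python
from collections import defaultdict


class ToyRuntime:
    """Hypothetical single-threaded scheduler (an assumption, not Naiad's)."""

    def __init__(self):
        self.queue = []        # undelivered (edge, msg, time) messages
        self.requests = set()  # outstanding (time, vertex) notification requests
        self.targets = {}      # edge name -> destination vertex

    def connect(self, edge, target):
        self.targets[edge] = target

    def run(self):
        while self.queue or self.requests:
            if self.queue:
                # Deliver messages first, in arrival order.
                edge, msg, time = self.queue.pop(0)
                self.targets[edge].on_recv(edge, msg, time)
            else:
                # No messages outstanding: deliver the earliest notification.
                request = min(self.requests, key=lambda r: r[0])
                self.requests.remove(request)
                time, vertex = request
                vertex.on_notify(time)


class Vertex:
    """Base class mirroring OnRecv / OnNotify / SendBy / NotifyAt."""

    def __init__(self, runtime):
        self.runtime = runtime

    def on_recv(self, edge, msg, time):   # callback: a message arrived
        pass

    def on_notify(self, time):            # callback: `time` is complete
        pass

    def send_by(self, edge, msg, time):   # call: emit a message on an edge
        self.runtime.queue.append((edge, msg, time))

    def notify_at(self, time):            # call: request a notification
        self.runtime.requests.add((time, self))


class DistinctCount(Vertex):
    """Counts each distinct message per timestamp, emitting on notification."""

    def __init__(self, runtime, out_edge):
        super().__init__(runtime)
        self.out_edge = out_edge
        self.counts = {}

    def on_recv(self, edge, msg, time):
        if time not in self.counts:
            self.counts[time] = defaultdict(int)
            self.notify_at(time)  # ask to be told when `time` is complete
        self.counts[time][msg] += 1

    def on_notify(self, time):
        for msg, n in sorted(self.counts.pop(time).items()):
            self.send_by(self.out_edge, (msg, n), time)


class Collect(Vertex):
    """Sink that records everything it receives."""

    def __init__(self, runtime):
        super().__init__(runtime)
        self.seen = []

    def on_recv(self, edge, msg, time):
        self.seen.append((time, msg))


runtime = ToyRuntime()
sink = Collect(runtime)
counter = DistinctCount(runtime, out_edge="out")
runtime.connect("in", counter)
runtime.connect("out", sink)
for word, epoch in [("a", 0), ("b", 0), ("a", 0), ("a", 1)]:
    runtime.queue.append(("in", word, epoch))
runtime.run()
print(sink.seen)
```

The vertex buffers counts in `on_recv` and emits them only in `on_notify`, so each epoch's output is produced once, after all of that epoch's input has been seen.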
IMPLEMENTING TIMELY DATAFLOW
Need to track when it is safe to notify
- Path Summary: check if (t1, l1) could-result-in (t2, l2)
- Scheduler: Occurrence and Precursor count
- Precursor count = 0 → Frontier
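A minimal sketch of the bookkeeping above, under loud simplifying assumptions: the graph is a hypothetical acyclic three-stage pipeline (`EDGES`), timestamps are plain integer epochs, and could-result-in is reduced to "earlier-or-equal epoch and reachable location". Naiad's real timestamps also carry loop counters, and its path summaries describe how a timestamp changes along a path; none of that is modeled here.

```python
from collections import Counter

# Hypothetical acyclic logical graph: location -> downstream locations.
EDGES = {"input": ["map"], "map": ["reduce"], "reduce": []}


def reachable(src, dst):
    """True if dst can be reached from src along zero or more edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(EDGES[node])
    return False


def could_result_in(p1, p2):
    """Simplified could-result-in test between pointstamps (time, location)."""
    (t1, l1), (t2, l2) = p1, p2
    return t1 <= t2 and reachable(l1, l2)


def frontier(occurrence):
    """Active pointstamps whose precursor count is zero."""
    active = [p for p, n in occurrence.items() if n > 0]
    result = set()
    for p in active:
        # Precursor count: other active pointstamps that could-result-in p.
        precursors = sum(1 for q in active if q != p and could_result_in(q, p))
        if precursors == 0:
            result.add(p)
    return result


# Occurrence counts: outstanding events at each pointstamp.
occurrence = Counter({(1, "input"): 2, (0, "map"): 1, (0, "reduce"): 1})
before = frontier(occurrence)

# Retire the one outstanding event at (0, "map").
occurrence[(0, "map")] -= 1
after = frontier(occurrence)

print(sorted(before))
print(sorted(after))
```

While an event is still outstanding at `(0, "map")`, the pointstamp `(0, "reduce")` has a precursor and is not in the frontier, so a notification there would be unsafe; once that event is retired it enters the frontier.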
ARCHITECTURE
- Workers communicate using Shared Queue
- Batch messages delivered
- Account for cycles
- Vertex single threaded
DISTRIBUTED PROGRESS TRACKING
Broadcast-based approach
- Maintain local precursor count, occurrence count
- Send progress update (p ∈ Pointstamp, δ ∈ Z)
- Local frontier tracks global frontier
Optimizations
- Batch updates and broadcast
- Use projected timestamps from logical graph
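The "batch updates and broadcast" idea can be sketched as follows. This is an illustrative model only: `ProgressBuffer` and `Worker` are hypothetical names, pointstamps are `(epoch, location)` tuples, and the broadcast is a plain loop rather than a network protocol. The point it shows is that buffered deltas for the same pointstamp net out, so fewer updates need to be sent.

```python
from collections import Counter


class ProgressBuffer:
    """Accumulates (pointstamp, delta) updates and nets them out before sending."""

    def __init__(self):
        self.pending = Counter()

    def update(self, pointstamp, delta):
        self.pending[pointstamp] += delta

    def flush(self):
        """Return the non-zero net updates and clear the buffer."""
        batch = {p: d for p, d in self.pending.items() if d != 0}
        self.pending.clear()
        return batch


class Worker:
    """Holds a local view of the global occurrence counts."""

    def __init__(self, initial):
        self.occurrence = Counter(initial)

    def apply(self, batch):
        for pointstamp, delta in batch.items():
            self.occurrence[pointstamp] += delta
            if self.occurrence[pointstamp] == 0:
                del self.occurrence[pointstamp]


# Every worker starts out knowing about one outstanding message at (0, "map").
workers = [Worker({(0, "map"): 1}) for _ in range(3)]

buffer = ProgressBuffer()
# One worker consumes the message at (0, "map") and produces two at (0, "reduce")...
buffer.update((0, "reduce"), +2)
buffer.update((0, "map"), -1)
# ...then consumes one of the (0, "reduce") messages before flushing.
buffer.update((0, "reduce"), -1)

batch = buffer.flush()
for worker in workers:  # broadcast, including to the sender itself
    worker.apply(batch)

print(batch)
print(workers[0].occurrence)
```

Three raw updates collapse into two net updates here; with longer bursts of local activity the saving grows, at the cost of other workers seeing progress slightly later.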
FAULT TOLERANCE
Checkpoint
- Log data as computation goes on
- Write a full checkpoint on demand
- Pause worker threads
- Flush message queues OnRecv
Restore
- Reset all workers to checkpoint
- Reconstruct state
- Resume execution
MICRO STRAGGLERS
What is different from stragglers in MapReduce?
Sources of stragglers
- Network
- Concurrency
- Garbage Collection
Differential DATAFLOW
// 1a. Define input stages for the dataflow.
var input = controller.NewInput<string>();
// 1b. Define the timely dataflow graph.
// Here, we use LINQ to implement MapReduce.
var result = input.SelectMany(y => map(y))
                  .GroupBy(y => key(y), (k, vs) => reduce(k, vs));
// 1c. Define output callbacks for each epoch
result.Subscribe(result => { ... });
// 2. Supply input data to the query.
input.OnNext(/* 1st epoch data */);
input.OnCompleted();
SUMMARY
- Stream processing → Increasingly important workload trend
- Timely dataflow: Principled approach to model batch, streaming together
- Vertex message model
- Compute frontier
- Distributed progress tracking
DISCUSSION https://forms.gle/v3YsW1HvnqsxCuPu5
What are some example scenarios discussed in the dataflow paper that are NOT a good fit for implementation using Naiad?
Suppose you are implementing a micro-batch streaming API on top of Apache Spark. What bottlenecks or challenges might you face in building such a system?
