Making Sense of Performance in Data Analytics Frameworks

Apr 09, 2023

Making Sense of Performance in Data Analytics Frameworks Authors: Kay Ousterhout, Ryan Rasti, Sylvia Ratnasamy, Scott Shenker, Byung-Gon Chun Presenter: Zi Wang
Why? • Commonly accepted mantras about performance bottlenecks: • Network • Disk I/O • Stragglers
Takeaways • Network optimizations can reduce job completion time by at most 2% (median) • I/O optimizations lead to a <19% reduction in completion time • Many straggler causes can be identified and fixed • CPU is in general the bottleneck
Outline • Methodology • Results • Threats to validity
What is the job’s bottleneck? (Figure: tasks plotted over time, colored by resource: network, compute, disk) • Task x may be bottlenecked on different resources at different times • At time t, different tasks may be bottlenecked on different resources
Blocked Time Analysis • Blocked time: time when a task is blocked on one resource (e.g., network) • Blocked time analysis: how much faster would the job complete if tasks never blocked on the resource?
An Example of Blocked Time Analysis for Network • (1) Measure the time when tasks are blocked on the network • (2) Simulate how job completion time would change
Blocked time analysis: how quickly could a job have completed if a resource were infinitely fast? • With the blocked time removed, the scheduler would have moved Task 2 to slot 2
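The two steps above (measure blocked time, then simulate) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tool: the function names, the greedy earliest-free-slot scheduler, and the task numbers are all assumptions made for the example.

```python
import heapq

def completion_time(durations, num_slots):
    """Greedy schedule: each task, in order, starts on the slot that frees up first."""
    slots = [0.0] * num_slots            # time at which each slot becomes free
    heapq.heapify(slots)
    finish = 0.0
    for d in durations:
        start = heapq.heappop(slots)     # earliest-free slot
        end = start + d
        heapq.heappush(slots, end)
        finish = max(finish, end)
    return finish

def blocked_time_analysis(tasks, num_slots):
    """tasks: list of (runtime, blocked_time) pairs in scheduling order (hypothetical format)."""
    original = completion_time([r for r, _ in tasks], num_slots)
    # Pretend the resource is infinitely fast: drop each task's blocked time,
    # then re-run the scheduler so later tasks can move to slots that free up sooner.
    simulated = completion_time([r - b for r, b in tasks], num_slots)
    return original, simulated, 1.0 - simulated / original

# Made-up numbers: task 0 spends 6 of its 10 seconds blocked on the network.
tasks = [(10.0, 6.0), (6.0, 0.0), (5.0, 1.0)]
original, simulated, reduction = blocked_time_analysis(tasks, num_slots=2)
```

With these made-up inputs the third task runs on a different slot in the simulation than in the original run, which is why the simulation re-schedules tasks instead of just subtracting blocked time from the job total.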
Experiment Settings • Big Data Benchmark: 50 queries, 50 GB of data, 5 machines • TPC-DS (scale 5000): 260 queries, 850 GB of data, 20 machines • Production: 30 queries, tens of GB of data, 9 machines
Experiment Settings • All three workloads are Spark SQL workloads • A coarse-grained analysis of traces from Facebook, Google, and Microsoft is used as a sanity check
Are jobs network-light?
Analysis • Queries often shuffle and output much less data than they read • However, this result seems inconsistent with previous work…
Two Reasons • Incomplete metric: previous work looked only at shuffle time • Conflation of CPU and network time: sending data over the network has an associated CPU cost
Analysis for I/O • Data is stored compressed, so CPU is traded for I/O • Spark is written in Scala, so data that is read must be deserialized into Java objects
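The CPU-for-I/O trade can be illustrated with a small Python sketch. It uses the standard-library `zlib` codec as a stand-in (an assumption; the slides do not name the codec the workloads used) and a made-up, highly repetitive payload, so the sizes and timings are only illustrative.

```python
import time
import zlib

def compression_tradeoff(payload: bytes):
    """Return (raw size, stored size, CPU seconds spent decompressing)."""
    compressed = zlib.compress(payload)
    t0 = time.process_time()               # CPU time of this process, not wall-clock time
    restored = zlib.decompress(compressed)
    cpu_seconds = time.process_time() - t0
    if restored != payload:
        raise ValueError("round trip changed the data")
    return len(payload), len(compressed), cpu_seconds

# Hypothetical input: a repetitive CSV-like byte string.
raw_bytes, stored_bytes, cpu_seconds = compression_tradeoff(
    b"user_id,page_url,visit_date\n" * 100_000
)
```

Fewer bytes have to come off disk (`stored_bytes` is smaller than `raw_bytes`), but every read now pays the decompression CPU cost, which is one way a job can end up CPU-bound even when it reads a lot of data.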
Role of Stragglers • The median reduction in job completion time from eliminating stragglers is < 10% • Common causes: garbage collection, I/O • Many stragglers are caused by inherent factors like output size
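One way to estimate what eliminating stragglers would buy is sketched below. The 1.5x-median threshold, the replace-with-median rule, and the one-slot-per-task stage model are assumptions chosen for the sketch, not a description of the authors' exact procedure, and the durations are made up.

```python
from statistics import median

def without_stragglers(durations, threshold=1.5):
    """Replace every task slower than threshold x median with the median duration."""
    m = median(durations)
    return [m if d > threshold * m else d for d in durations]

def stage_time(durations):
    # Assumes one slot per task, so the stage ends when its slowest task ends.
    return max(durations)

# Made-up task durations in seconds; the last task is a straggler.
durations = [10.0, 11.0, 9.0, 10.0, 30.0]
fixed = without_stragglers(durations)
reduction = 1.0 - stage_time(fixed) / stage_time(durations)
```

This toy stage was built to contain an obvious straggler, so its reduction is far larger than the median figure reported on the slide; the point is only the shape of the calculation.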
Threats to Validity • Only One Framework (Spark) • Small cluster sizes • Only three workloads
Related work • With Naiad instead of Spark, going from a 1G network to a 10G network can achieve up to 3x speedups • Spark is also memory-efficient, leveraging “in-memory” computation • Modern hardware (I/O, network links) has also improved more than CPUs have
Comparison to Pivot Tracing • Static vs. dynamic • Resource-directed analysis vs. crossing-boundaries analysis
References • Making Sense of Performance in Data Analytics Frameworks • Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems • The impact of fast networks on graph analytics • Project Tungsten: Bringing Apache Spark Closer to Bare Metal
“The only way to get ahead is to find errors in conventional wisdom.” –Larry Ellison
