SLIDE 1

6.888 Lecture 8: Networking for Data Analytics

Mohammad Alizadeh

Spring 2016

Many thanks to Mosharaf Chowdhury (Michigan) and Kay Ousterhout (Berkeley)

SLIDE 2

“Big Data”

Huge amounts of data being collected daily, from a wide variety of sources

  • Web, mobile, wearables, IoT, scientific
  • Machines: monitoring, logs, etc.

Many applications

  • Business intelligence, scientific research, health care

SLIDE 3

Big Data Systems

[Timeline figure, 2005-2015: MapReduce, Dryad, Hadoop, DryadLINQ, Hive, Pregel, GraphLab, Spark, Storm, Dremel, BlinkDB, Spark-Streaming, GraphX]

SLIDE 4

Data-Parallel Applications

Multi-stage dataflow

  • Computation interleaved with communication

Computation Stage (e.g., Map, Reduce)

  • Distributed across many machines
  • Tasks run in parallel

Communication Stage (e.g., Shuffle)

  • Between successive computation stages

[Figure: Map Stage → Shuffle → Reduce Stage]

A communication stage cannot complete until all the data have been transferred

SLIDE 5

Questions

How to design the network for data-parallel applications?

  • What are good communication abstractions?

Does the network matter for data-parallel applications?

  • What are the bottlenecks for these applications?

SLIDE 6

Efficient Coflow Scheduling with Varys

Slides by Mosharaf Chowdhury (Michigan), with minor modifications

SLIDE 7

Existing Solutions

[Timeline figure, 1980s-2015: GPS, RED, WFQ, CSFQ, ECN, RCP, XCP, DCTCP, D3, D2TCP, PDQ, FCP, DeTail, pFabric; grouped into Per-Flow Fairness and Flow Completion Time approaches]

Independent flows cannot capture the collective communication behavior common in data-parallel applications

Flow: transfer of data from a source to a destination

SLIDE 8

Coflow

Communication abstraction for data-parallel applications to express their performance goals

  • 1. Minimize completion times
  • 2. Meet deadlines
SLIDE 9

[Figure: coflow structures: Single Flow, Parallel Flows, Aggregation, Broadcast, Shuffle (All-to-All)]

SLIDE 10

How to schedule coflows online …

  • #1 … for faster completion of coflows?
  • #2 … to meet more deadlines?

[Figure: datacenter fabric with input ports 1..N and output ports 1..N]

SLIDE 11

Benefits of Inter-Coflow Scheduling

[Figure: two coflows with demands of 3, 6, and 2 units across Link 1 and Link 2; Gantt charts over time 0-6 for three schedules]

  • Fair Sharing: Coflow 1 comp. time = 5, Coflow 2 comp. time = 6
  • Smallest-Flow First [1,2]: Coflow 1 comp. time = 5, Coflow 2 comp. time = 6
  • The Optimal: Coflow 1 comp. time = 3, Coflow 2 comp. time = 6

  • 1. Finishing Flows Quickly with Preemptive Scheduling, SIGCOMM 2012.
  • 2. pFabric: Minimal Near-Optimal Datacenter Transport, SIGCOMM 2013.

SLIDE 12

Inter-Coflow Scheduling is NP-Hard

[Figure: the same two-coflow example on Link 1 and Link 2 (demands of 3, 6, and 2 units); Gantt charts over time 0-6 for three schedules]

  • Fair Sharing: Coflow 1 comp. time = 6, Coflow 2 comp. time = 6
  • Flow-level Prioritization [1,2]: Coflow 1 comp. time = 6, Coflow 2 comp. time = 6
  • The Optimal: Coflow 1 comp. time = 3, Coflow 2 comp. time = 6

  • 1. Finishing Flows Quickly with Preemptive Scheduling, SIGCOMM 2012.
  • 2. pFabric: Minimal Near-Optimal Datacenter Transport, SIGCOMM 2013.

Concurrent Open Shop Scheduling [1]

  • Examples include job scheduling and caching blocks
  • Solutions use an ordering heuristic

  • 1. A Note on the Complexity of the Concurrent Open Shop Problem, Journal of Scheduling, 9(4):389–396, 2006
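To make the gap between flow-level and coflow-level scheduling concrete, here is a small Python sketch. The per-coflow demands are my reading of the figure (Coflow 1: a 3-unit flow on Link 1; Coflow 2: a 2-unit flow on Link 1 and a 6-unit flow on Link 2), and all function names are illustrative, not anything from Varys. Under those assumptions it reproduces the fair-sharing (5 and 6) and optimal (3 and 6) completion times shown two slides back.

    from collections import defaultdict

    def fair_share_finish_times(flows):
        """Per-link max-min fair sharing: flows on the same link split its
        unit capacity equally. Returns {flow_id: finish_time}."""
        finish = {}
        by_link = defaultdict(dict)                  # link -> {flow_id: remaining units}
        for fid, (link, size) in flows.items():
            by_link[link][fid] = float(size)
        for link, remaining in by_link.items():
            t = 0.0
            while remaining:
                rate = 1.0 / len(remaining)          # equal split of unit capacity
                dt = min(remaining.values()) / rate  # run until the next flow ends
                t += dt
                for fid in list(remaining):
                    remaining[fid] -= rate * dt
                    if remaining[fid] <= 1e-9:
                        finish[fid] = t
                        del remaining[fid]
        return finish

    def ordered_finish_times(flows, coflow_of, order):
        """Coflow-aware scheduling: on each link, serve flows in the given
        coflow order, one at a time at full (unit) rate."""
        finish = {}
        by_link = defaultdict(list)
        for fid in flows:
            by_link[flows[fid][0]].append(fid)
        for link, fids in by_link.items():
            t = 0.0
            for fid in sorted(fids, key=lambda f: order.index(coflow_of[f])):
                t += flows[fid][1]
                finish[fid] = t
        return finish

    def cct(finish, coflow_of):
        """Coflow completion time = finish time of the coflow's last flow."""
        out = defaultdict(float)
        for fid, t in finish.items():
            out[coflow_of[fid]] = max(out[coflow_of[fid]], t)
        return dict(out)

    flows = {"f1": ("L1", 3), "f2": ("L1", 2), "f3": ("L2", 6)}   # (link, units)
    coflow_of = {"f1": "C1", "f2": "C2", "f3": "C2"}

    print(cct(fair_share_finish_times(flows), coflow_of))                        # C1 = 5, C2 = 6
    print(cct(ordered_finish_times(flows, coflow_of, ["C1", "C2"]), coflow_of))  # C1 = 3, C2 = 6

The point of the toy model: scheduling coflow C1 ahead of C2 shortens C1 without delaying C2, because C2's completion is pinned by its 6-unit flow on the other link.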

SLIDE 13

Inter-Coflow Scheduling is NP-Hard

[Figure: datacenter modeled as a switch with input links 1-3 and output links 1-3; the two-coflow demands (3, 6, and 2 units) shown on the links]

Concurrent Open Shop Scheduling with Coupled Resources

  • Examples include job scheduling and caching blocks
  • Solutions use an ordering heuristic
  • Consider matching constraints

SLIDE 14

Varys

Employs a two-step algorithm to minimize coflow completion times

  • 1. Ordering heuristic: keep an ordered list of coflows to be scheduled, preempting if needed
  • 2. Allocation algorithm: allocate the minimum required resources to each coflow so it finishes in minimum time
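A minimal sketch of step 1 under assumptions: coflows are ordered by the completion time of their bottleneck link, smallest first, which is a simplified version of the smallest-effective-bottleneck-first idea in the Varys paper. The data layout and function names are illustrative, not the Varys implementation.

    def bottleneck_time(coflow, link_capacity=1.0):
        """Completion time of this coflow if every link served only this coflow:
        max over links of (total units on that link / link capacity)."""
        per_link = {}
        for link, size in coflow["flows"]:               # flows = [(link, units), ...]
            per_link[link] = per_link.get(link, 0.0) + size
        return max(per_link.values()) / link_capacity

    def order_coflows(active_coflows):
        """Step 1: keep the active coflows sorted so that the one with the smallest
        bottleneck completion time is served first; a newly arrived small coflow
        moves ahead of (preempts) larger ones."""
        return sorted(active_coflows, key=bottleneck_time)

    coflows = [
        {"name": "C1", "flows": [("L1", 3)]},
        {"name": "C2", "flows": [("L1", 2), ("L2", 6)]},
    ]
    print([c["name"] for c in order_coflows(coflows)])   # ['C1', 'C2']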

SLIDE 15

Allocation Algorithm

A coflow cannot finish before its very last flow

Finishing flows faster than the bottleneck cannot decrease a coflow's completion time

Allocate minimum flow rates such that all flows of a coflow finish together on time
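A minimal sketch of step 2 under the same toy model: the bottleneck link fixes the coflow's finish time, and every other flow gets just enough rate to finish at that same time (finishing earlier would not help the coflow, and the freed bandwidth can go to other coflows). The names and the unit-capacity link model are illustrative assumptions, not the paper's exact algorithm.

    def allocate_rates(flows, link_capacity=1.0):
        """flows: {flow_id: (link, units)}. Returns ({flow_id: rate}, duration):
        the slowest (bottleneck) link fixes the coflow's finish time, and every
        other flow is paced just enough to finish at that same time."""
        per_link = {}
        for link, size in flows.values():
            per_link[link] = per_link.get(link, 0.0) + size
        duration = max(per_link.values()) / link_capacity    # bottleneck finish time
        rates = {fid: size / duration for fid, (link, size) in flows.items()}
        return rates, duration

    rates, t = allocate_rates({"f1": ("L1", 2), "f2": ("L2", 6)})
    print(rates, t)   # f1 gets rate 1/3, f2 gets rate 1.0; both finish at t = 6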

SLIDE 16

Varys Architecture

Centralized master-slave architecture

  • Applications use a client library to communicate with the master

Actual timing and rates are determined by the coflow scheduler

[Figure: compute tasks (sender, receiver, driver) call the Varys client library (Put/Get/Reg); Varys daemons report to the Varys master, whose coflow scheduler uses a topology monitor and usage estimator and talks to the network interface and the (distributed) file system]

  • 1. Download from http://varys.net
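The Put/Get/Reg labels in the figure hint at how an application hands its communication to Varys. Below is a purely hypothetical usage sketch in Python: the class, method names, and signatures are stand-ins for illustration, not the actual Varys client API.

    # Hypothetical stand-in for the client library that talks to the Varys master.
    class FakeVarysClient:
        def register(self, num_flows):             # driver: announce a new coflow
            return "coflow-0"
        def put(self, coflow_id, data_id, data):   # sender task: publish a partition
            pass
        def get(self, coflow_id, data_id):         # receiver task: fetch a partition
            return b""
        def unregister(self, coflow_id):           # driver: the coflow is finished
            pass

    client = FakeVarysClient()
    coflow = client.register(num_flows=4)          # driver, before the shuffle starts
    client.put(coflow, "map0-part0", b"...")       # each map task, once per partition
    data = client.get(coflow, "map0-part0")        # each reduce task
    client.unregister(coflow)                      # driver, after the shuffle completes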
SLIDE 17

Discussion


SLIDE 18

Making Sense of Performance in Data Analytics Frameworks

Slides by Kay Ousterhout (Berkeley), with minor modifications

SLIDE 19

Stragglers

Scarlett [EuroSys ‘11], SkewTune [SIGMOD ‘12], LATE [OSDI ‘08], Mantri [OSDI ‘10], Dolly [NSDI ‘13], GRASS [NSDI ‘14], Wrangler [SoCC ’14]

Disk

Themis [SoCC ‘12], PACMan [NSDI ’12], Spark [NSDI ’12], Tachyon [SoCC ’14]

Network

  • Load balancing: VL2 [SIGCOMM '09], Hedera [NSDI '10], Sinbad [SIGCOMM '13]
  • Application semantics: Orchestra [SIGCOMM '11], Baraat [SIGCOMM '14], Varys [SIGCOMM '14]
  • Reduce data sent: PeriSCOPE [OSDI '12], SUDO [NSDI '12]
  • In-network aggregation: Camdoop [NSDI '12]
  • Better isolation and fairness: Oktopus [SIGCOMM '11], EyeQ [NSDI '12], FairCloud [SIGCOMM '12]

SLIDE 20

[Same Disk, Stragglers, and Network references as the previous slide]

Missing: what’s most important to end-to-end performance?

SLIDE 21

[Same Disk, Stragglers, and Network references as the previous slide]

Widely-accepted mantras:

  • Network and disk I/O are bottlenecks
  • Stragglers are a major issue with unknown causes

SLIDE 22

This work

  • (1) How can we quantify performance bottlenecks? → Blocked time analysis
  • (2) Do the mantras hold? → Takeaways based on three workloads run with Spark

SLIDE 23

Blocked time analysis

  • (1) Measure the time when tasks are blocked on the network
  • (2) Simulate how job completion time would change

SLIDE 24

(1) Measure the time when tasks are blocked on the network

[Figure: a task's timeline split into network read, compute, and disk write; labels show the original task runtime, the time blocked on network, the time blocked on disk, the task runtime if the network were infinitely fast, and the best case (only the time to handle one record)]
SLIDE 25

(2) Simulate how job completion time would change

[Figure: three tasks (Task 0, 1, 2) packed onto 2 slots; t_o = original job completion time; subtracting the time blocked on the network directly gives an incorrectly computed time because it doesn't account for task scheduling; t_n = job completion time with an infinitely fast network]
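A minimal sketch of this simulation under simplifying assumptions: a fixed task order, greedy assignment to whichever of the 2 slots frees up first, and made-up per-task numbers. It also shows why step (2) needs a replay of the scheduler rather than simply subtracting the total blocked time from the job's runtime.

    import heapq

    def job_completion_time(task_durations, num_slots=2):
        """Greedy list scheduling: each task starts on whichever slot frees up
        first; the job finishes when its last task finishes."""
        slots = [0.0] * num_slots            # time at which each slot becomes free
        heapq.heapify(slots)
        finish = 0.0
        for d in task_durations:
            start = heapq.heappop(slots)
            end = start + d
            finish = max(finish, end)
            heapq.heappush(slots, end)
        return finish

    # Hypothetical measurements: (original runtime, time blocked on network) per task.
    tasks = [(10.0, 4.0), (8.0, 1.0), (6.0, 3.0)]

    t_o = job_completion_time([r for r, _ in tasks])        # original job completion time
    t_n = job_completion_time([r - b for r, b in tasks])    # with an infinitely fast network
    naive = t_o - sum(b for _, b in tasks)                  # incorrect: ignores task scheduling
    print(t_o, t_n, naive)                                  # 14.0 9.0 6.0

In this made-up example the naive subtraction claims the network costs 8 time units, while the replay shows the job would only shrink from 14 to 9, because shortened tasks still queue for the same two slots.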

SLIDE 26

Takeaways based on three Spark workloads:

  • Network optimizations can reduce job completion time by at most 2%
  • CPU (not I/O) is often the bottleneck
  • <19% reduction in completion time from optimizing disk
  • Many straggler causes can be identified and fixed

SLIDE 27

When does the network matter?

The network is important when:

  • (1) Computation is optimized
  • (2) Serialization time is low
  • (3) A large amount of data is sent over the network
SLIDE 28

Discussion


SLIDE 29

What You Said

“I very much appreciated the thorough nature of the "Making Sense of Performance in Data Analytics Frameworks" paper.”

“I see their paper as more of a survey on the performance of current data analytics platforms as opposed to a paper that discusses fundamental tradeoffs between compute and networking resources. I think the question of whether current ‘data-analytics platforms’ are network bound or CPU bound depends heavily on the implementation and design assumptions. As a result, I see their work as somewhat of a self-fulfilling prophecy.”

SLIDE 30

What You Said

“The paper admits its bias in primarily studying instrumented Spark servers. It uses traces from real-world services to back up its conclusions across other types and scales of services, and is reasonably convincing in this analysis. It is easy to agree with the conclusion that services should be more heavily instrumented.”


SLIDE 31

Next Time: Wireless/Optical Data Centers
