

SLIDE 1

Query Processing with Optimal Communication Cost

Magdalena Balazinska and Dan Suciu University of Washington

AITF 2017

SLIDE 2

Context

Past: NSF Big Data grant

  • PhD student Paris Koutris received the ACM SIGMOD Jim Gray Dissertation Award

Current: AiTF Grant

  • PIs: Magda Balazinska, Dan Suciu
  • Student: Walter Cai

SLIDE 3

Basic Question

  • How much communication is needed to compute a query Q on p servers?

  • Parallel data processing
    – Gamma, MapReduce, Hive, Teradata, Aster Data, Spark, Impala, Myria, TensorFlow
    – See Magda Balazinska’s current class

SLIDE 4

Background

  • Q conjunctive query; ρ* = its fractional edge covering number

  • Thm. [Atserias, Grohe, Marx 2011] If every input relation has size ≤ m, then |Output(Q)| ≤ m^{ρ*}

  • Example: Q(x,y,z) :- R(x,y) ∧ S(y,z) ∧ T(z,x)
    If |R|, |S|, |T| ≤ m then |Output(Q)| ≤ m^{3/2}

[Figure: triangle hypergraph on variables x, y, z, weight ½ on each edge; ρ* = 3/2]
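The bound on this example can be sanity-checked in a few lines. A minimal sketch in plain Python (the join routine, domain size, and relation sizes are illustrative choices, not from the slides):

```python
import random

def triangle_join(R, S, T):
    # Q(x,y,z) :- R(x,y) ∧ S(y,z) ∧ T(z,x), via index nested loops
    S_by_y = {}
    for (y, z) in S:
        S_by_y.setdefault(y, []).append(z)
    T_set = set(T)
    return [(x, y, z)
            for (x, y) in R
            for z in S_by_y.get(y, ())
            if (z, x) in T_set]

random.seed(0)
dom = 50  # small domain so that triangles actually occur
R = {(random.randrange(dom), random.randrange(dom)) for _ in range(1000)}
S = {(random.randrange(dom), random.randrange(dom)) for _ in range(1000)}
T = {(random.randrange(dom), random.randrange(dom)) for _ in range(1000)}
m = max(len(R), len(S), len(T))
out = triangle_join(R, S, T)
# AGM with ρ* = 3/2: the output never exceeds m^{3/2}
assert len(out) <= m ** 1.5
```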

SLIDE 5

Massively Parallel Communication Model (MPC)

[Figure: servers 1 … p, each holding O(m/p) of the input (size = m)]

Input data = size m
Number of servers = p
Extends BSP [Valiant]

SLIDE 6

Massively Parallel Communication Model (MPC)

[Figure: Round 1 — servers 1 … p exchange data, ≤ L per server]

Input data = size m
Number of servers = p
One round = Compute & communicate
Extends BSP [Valiant]

SLIDE 7

Massively Parallel Communication Model (MPC)

[Figure: Rounds 1, 2, 3, … — servers 1 … p exchange ≤ L per server per round]

Input data = size m
Number of servers = p
One round = Compute & communicate
Algorithm = Several rounds
Extends BSP [Valiant]

SLIDE 8

Massively Parallel Communication Model (MPC)

[Figure: Rounds 1, 2, 3, … — servers 1 … p exchange ≤ L per server per round]

Input data = size m
Number of servers = p
One round = Compute & communicate
Algorithm = Several rounds
Max communication load / round / server = L
Extends BSP [Valiant]

SLIDE 9

Massively Parallel Communication Model (MPC)

[Figure: Rounds 1, 2, 3, … — servers 1 … p exchange ≤ L per server per round]

Input data = size m
Number of servers = p
One round = Compute & communicate
Algorithm = Several rounds
Max communication load / round / server = L
Extends BSP [Valiant]

Cost:
             Ideal    Practical (ε∈(0,1))   Naïve 1   Naïve 2
  Load L     m/p      m/p^{1-ε}             m         m/p
  Rounds r   1        O(1)                  1         p

SLIDE 14

A Naïve Lower Bound

  • Query Q
  • Inputs R, S, T, … s.t. |Output(Q)| = m^{ρ*}
  • Algorithm with load L
  • After r rounds, one server “knows” ≤ L·r tuples: it can output ≤ (L·r)^{ρ*} tuples (AGM)
  • The p servers together compute |Output(Q)| = m^{ρ*}, hence p·(L·r)^{ρ*} ≥ m^{ρ*}

  • Thm. Any r-round algorithm has L ≥ m / (r·p^{1/ρ*})
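The algebra in the last step can be checked mechanically; a tiny sketch (the function name is illustrative):

```python
def load_lower_bound(m, p, r, rho_star):
    # Solve p * (L*r)**rho_star >= m**rho_star for L
    return m / (r * p ** (1.0 / rho_star))

# Triangle query: ρ* = 3/2, so one round on p = 64 servers needs
# L ≥ m / p^{2/3} = m / 16.
bound = load_lower_bound(m=10**6, p=64, r=1, rho_star=1.5)
```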
SLIDE 15

Speedup

[Figure: speed = O(1/L) vs. # processors (= p)]

A load of L = m/p corresponds to linear speedup.
A load of L = m/p^{1-ε} corresponds to sub-linear speedup.
What is the theoretically optimal load L = f(m,p)? Is this the right question in the field?

SLIDE 16

Join of Two Tables

Join(x,y,z) = R(x,y) ∧ S(y,z)    |R| = |S| = m tuples

In the field:
  • Hash-join on y: L = m/p (w/o skew)
  • Broadcast-join: L ≈ m

In theory: L ≥ m / p^{1/2}

[Figure: path hypergraph x–y–z, weight 1 on each edge; ρ* = 2]
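The first strategy is easy to simulate in a single process. A sketch of hash-join on y (the name `hash_join` and the per-server bookkeeping are illustrative, not from the talk):

```python
from collections import defaultdict

def hash_join(R, S, p):
    # Route R(x,y) and S(y,z) to server hash(y) % p, then join locally.
    frags = [([], []) for _ in range(p)]
    for (x, y) in R:
        frags[hash(y) % p][0].append((x, y))
    for (y, z) in S:
        frags[hash(y) % p][1].append((y, z))
    out = []
    for R_loc, S_loc in frags:
        S_by_y = defaultdict(list)
        for (y, z) in S_loc:
            S_by_y[y].append(z)
        for (x, y) in R_loc:
            out.extend((x, y, z) for z in S_by_y[y])
    # L = max number of tuples received by any one server
    load = max(len(R_loc) + len(S_loc) for R_loc, S_loc in frags)
    return out, load
```

On skew-free inputs each server receives about 2m/p tuples, i.e. L = O(m/p); a single heavy y value drives the load up to m, which is what the skew slides later address.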

SLIDE 17

Triangles

Triangles(x,y,z) = R(x,y) ∧ S(y,z) ∧ T(z,x)    |R| = |S| = |T| = m tuples

State of the art:
  • Hash-join, two rounds
    – Problem: intermediate result too big!
  • Broadcast S, T, one round
    – Problem: two local tables are huge!

SLIDE 18

Triangles in One Round

Triangles(x,y,z) = R(x,y) ∧ S(y,z) ∧ T(z,x)    |R| = |S| = |T| = m tuples
[Afrati&Ullman’10] [Beame’13,’14]

  • Place servers in a cube: p = p^{1/3} × p^{1/3} × p^{1/3}
  • Each server identified by coordinates (i,j,k)

[Figure: cube of side p^{1/3}; server (i,j,k)]

SLIDE 19

Triangles in One Round

Triangles(x,y,z) = R(x,y) ∧ S(y,z) ∧ T(z,x)    |R| = |S| = |T| = m tuples

Round 1:
  Send R(x,y) to all servers (h1(x), h2(y), *)
  Send S(y,z) to all servers (*, h2(y), h3(z))
  Send T(z,x) to all servers (h1(x), *, h3(z))
Output: each server computes R(x,y) ∧ S(y,z) ∧ T(z,x) locally

[Figure: example tuples of R, S, T (Fred, Jim, Jack, Alice, Carol, …) routed to the cube of side p^{1/3}]
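The round above can be simulated in one process. A sketch assuming simple modular hashes for h1, h2, h3 (the function and variable names are illustrative):

```python
from collections import defaultdict

def hypercube_triangles(R, S, T, p_side):
    # p = p_side^3 servers, one per coordinate (i,j,k) in the cube
    h1 = h2 = h3 = lambda v: hash(v) % p_side
    frag = defaultdict(lambda: ([], [], []))
    for (x, y) in R:                     # R(x,y) -> (h1(x), h2(y), *)
        for k in range(p_side):
            frag[(h1(x), h2(y), k)][0].append((x, y))
    for (y, z) in S:                     # S(y,z) -> (*, h2(y), h3(z))
        for i in range(p_side):
            frag[(i, h2(y), h3(z))][1].append((y, z))
    for (z, x) in T:                     # T(z,x) -> (h1(x), *, h3(z))
        for j in range(p_side):
            frag[(h1(x), j, h3(z))][2].append((z, x))
    out = set()
    for (R_loc, S_loc, T_loc) in frag.values():  # local joins
        S_by_y = defaultdict(list)
        for (y, z) in S_loc:
            S_by_y[y].append(z)
        T_set = set(T_loc)
        for (x, y) in R_loc:
            for z in S_by_y.get(y, ()):
                if (z, x) in T_set:
                    out.add((x, y, z))
    return out

R = {(1, 2), (2, 3), (4, 2)}
S = {(2, 3), (3, 1), (2, 5)}
T = {(3, 1), (5, 4), (1, 2)}
assert hypercube_triangles(R, S, T, p_side=2) == {(1, 2, 3), (2, 3, 1), (4, 2, 5)}
```

Each triangle is produced at exactly one server, (h1(x), h2(y), h3(z)), and each input tuple is replicated to only p^{1/3} servers, which is where the m/p^{2/3} load on the next slide comes from.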

SLIDE 20

Communication load per server

Triangles(x,y,z) = R(x,y) ∧ S(y,z) ∧ T(z,x)    |R| = |S| = |T| = m tuples

Can we compute Triangles with L = m/p? No!

Theorem. Assuming “no skew”, HyperCube computes Triangles with L = O(m/p^{2/3}) w.h.p.
Theorem. Any 1-round algorithm has L = Ω(m/p^{2/3}), even on inputs with no skew.

SLIDE 21

1.1M triples of Twitter data → 220k triangles; p = 64

Triangles(x,y,z) = R(x,y) ∧ S(y,z) ∧ T(z,x)    |R| = |S| = |T| = 1.1M

[Figure: wall clock time, total CPU time, and number of tuples shuffled for: 2-round hash-join; 1-round broadcast; 1-round HyperCube with local 1- or 2-step hash-join; 1-round HyperCube with local 1-step Leapfrog Trie-join (a.k.a. Generic-Join)]

SLIDE 22

1.1M triples of Twitter data → 220k triangles; p = 64

Triangles(x,y,z) = R(x,y) ∧ S(y,z) ∧ T(z,x)    |R| = |S| = |T| = 1.1M

SLIDE 23

General Case

Theorem. The optimal load for computing Q in one round on skew-free data is L = O(m / p^{1/τ*}),
where τ* = fractional vertex cover number of Q’s hypergraph.
  – Triangle query: τ* = 3/2 (weight ½ per vertex), L = m/p^{2/3}
  – Two-table join: τ* = 1, L = m/p

Thm. Any r-round algorithm has L ≥ m / (r·p^{1/ρ*}),
where ρ* = fractional edge cover number of Q’s hypergraph.
  – Two-table join: ρ* = 2, L ≥ m/p^{1/2}
  – Triangle query: ρ* = 3/2, L ≥ m/p^{2/3}

SLIDE 24

Skew

  • Skewed data is a major impediment to parallel data processing

  • Practical solutions:
    – Deal with stragglers, hope they eventually terminate
    – Remove heavy hitters from the computation

  • Our approach:
    – Query → Residual Query
    – Join R(x,y) ∧ S(y,z) → Cartesian Product R(x) ∧ S(z)

SLIDE 25

Skewed Values → New Query

Join(x,y,z) = R(x,y) ∧ S(y,z)
  • No skew: τ* = 1, L = m/p
  • Skewed (y = a single value of degree m): Join becomes Product(x,z) = R(x) ∧ S(z); τ* = 2, L = m/p^{1/2}

[Figure: R(x) × S(z) on a p^{1/2} × p^{1/2} grid of servers]
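The rewriting can be sketched as a preprocessing step: count degrees, peel off heavy y-values, and hand each one off as a residual product query. A sketch (`split_by_skew` and `threshold` are illustrative names, not the talk’s exact algorithm):

```python
from collections import Counter

def split_by_skew(R, S, threshold):
    # Split Join(x,y,z) = R(x,y) ∧ S(y,z) on heavy y-values.
    # Light y's keep the ordinary hash-join; each heavy y leaves the
    # residual query Product(x,z) = R'(x) ∧ S'(z).
    deg_R = Counter(y for (_, y) in R)
    deg_S = Counter(y for (y, _) in S)
    heavy = {y for y in deg_R.keys() | deg_S.keys()
             if max(deg_R[y], deg_S[y]) > threshold}
    light_R = [(x, y) for (x, y) in R if y not in heavy]
    light_S = [(y, z) for (y, z) in S if y not in heavy]
    residual = {y: ([x for (x, yy) in R if yy == y],   # R'(x)
                    [z for (yy, z) in S if yy == y])   # S'(z)
                for y in heavy}
    return light_R, light_S, residual
```

The light part proceeds with the usual hash-join (τ* = 1, L = m/p); each residual pair is a Cartesian product, computed on a p^{1/2} × p^{1/2} grid with load m/p^{1/2}.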

SLIDE 26

Summary of Results so Far

  • One round
    – No skew: optimal load = m / p^{1/τ*}
    – Skew: provably higher

  • Multiple rounds
    – Lower bound: load ≥ m / p^{1/ρ*}
    – All relations binary: optimal load = m / p^{1/ρ*} [PODS’2017a]
    – Arbitrary relations: optimal load = ?? Open

  • Additional statistics: keys, degree constraints [PODS’2017b]

Thank you!