Massively Parallel Communication and Query Evaluation (Paul Beame, PowerPoint presentation)
  1. Massively Parallel Communication and Query Evaluation
Paul Beame, U. of Washington
Based on joint work with Paraschos Koutris and Dan Suciu [PODS 13], [PODS 14]

  2. Massively Parallel Systems

  3. MapReduce [Dean, Ghemawat 2004]
Rounds of:
• Map: local and data-parallel on (key, value) pairs, creating (key1, value1), …, (keyk, valuek) pairs
• Shuffle: groups or sorts (key, value) pairs by key (local sorting plus a global communication round)
• Reduce: local and data-parallel per key: (key, value1), …, (key, valuek) reduces to (key, value)
– Data fits jointly in the main memory of 100s/1000s of parallel servers, each with gigabytes of storage
– Fault tolerance
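The Map/Shuffle/Reduce round structure on the slide can be sketched in a few lines of Python. The word-count example and all function names here are illustrative additions, not part of the slides.

```python
from collections import defaultdict

def map_phase(records, mapper):
    # Map: local and data-parallel; each record emits (key, value) pairs.
    return [kv for rec in records for kv in mapper(rec)]

def shuffle_phase(pairs):
    # Shuffle: group (key, value) pairs by key (one global communication round).
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups

def reduce_phase(groups, reducer):
    # Reduce: local and data-parallel per key; (key, v1..vk) -> (key, value).
    return {k: reducer(k, vs) for k, vs in groups.items()}

# Word count, the classic MapReduce illustration.
docs = ["to be or not to be"]
mapped = map_phase(docs, lambda doc: [(w, 1) for w in doc.split()])
counts = reduce_phase(shuffle_phase(mapped), lambda k, vs: sum(vs))
# counts["to"] == 2, counts["be"] == 2
```

In a real deployment each phase runs on many servers; here the three phases just run in sequence on one machine.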

  4. What can we do with MapReduce? Models & Algorithms
• Massive Unordered Distributed Data (MUD) model [Feldman, Muthukrishnan, Sidiropoulos, Stein, Svitkina 2008]
– 1 round can simulate data streams on symmetric functions, using a Savitch-like small-space simulation
– Exact computation of frequency moments in 2 rounds of MapReduce
• MRC model [Karloff, Suri, Vassilvitskii 2010]
– For n^(1-ε) processors and n^(1-ε) storage per processor, O(t) rounds can simulate t PRAM steps, so O(log^k n) rounds can simulate NC^k
– Minimum spanning trees and connectivity on dense graphs in 2 rounds of MapReduce
– Generalization of parameters, sharper simulations, sorting and computational geometry applications [Goodrich, Sitchinava, Zhang 2011]

  5. What can we do with MapReduce? Models & Algorithms
• Communication-processor tradeoffs for 1 round of MapReduce
– Upper bounds for database join queries [Afrati, Ullman 2010]
– Upper and lower bounds for finding triangles, matrix multiplication, finding neighboring strings [Afrati, Sarma, Salihoglu, Ullman 2012]

  6. More than just MapReduce
What can we do with this? Are there limits? Lower bounds? A simple general model?

  7. MapReduce [Dean, Ghemawat 2004], with time per phase
• Map: local and data-parallel on (key, value) pairs, creating (key1, value1), …, (keyk, valuek) pairs (O(n log n) time)
• Shuffle: groups or sorts (key, value) pairs by key: local sorting plus a global communication round (O(n log n) time)
• Reduce: local and data-parallel per key: (key, value1), …, (key, valuem) reduces to (key, value) (unspecified time)
– Data fits jointly in the main memory of 100s/1000s of parallel servers, each with gigabytes of storage
– Fault tolerance essential for efficiency

  8. Properties of a Simple General Model of Massively Parallel Computation
• Organized in synchronous rounds
• Local computation costs per round should be considered free, or nearly so
– No reason to assume that sorting is special compared to other operations
• Memory per processor is the fundamental constraint
– This also limits the number of bits a processor can send or receive in a single round

  9. Bulk Synchronous Parallel (BSP) Model [Valiant 1990]
Local computations separated by global synchronization barriers
• Key notion: an h-relation, in which each processor sends and receives at most h bits
• Parameters:
– periodicity L: time interval between synchronization barriers
– bandwidth g: time to deliver an h-relation is g·h
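Under the BSP parameters above, the cost of one superstep is commonly accounted as local work, plus g time units per word of the h-relation, plus the periodicity L. This tiny helper (the function and argument names are my own, not from the slides) just evaluates that formula.

```python
def bsp_superstep_cost(w, h, g, L):
    # One common BSP accounting: local work w, plus g * h for routing
    # the h-relation, plus the periodicity/synchronization cost L.
    return w + g * h + L

# Hypothetical numbers: 10^6 local ops, a 10^4-relation, g = 4, L = 10^5.
cost = bsp_superstep_cost(10**6, 10**4, 4, 10**5)
# cost == 1_140_000
```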

  10. Massively Parallel Communication (MPC) Model
• Total size of the input = N
• Number of processors = p
• Each processor has:
– unlimited computation power
– L ≥ N/p bits of memory
• A round/step consists of:
– local computation
– global communication of an L-relation (i.e., each processor sends/receives ≤ L bits)
• L stands for the communication/memory load
[Figure: input of size N distributed across servers 1, …, p; steps 1, 2, 3 alternate local computation and global communication]

  11. MPC model continued
• WLOG N/p ≤ L ≤ N: any processor with access to the whole input can compute any function
• Communication: processors pay individually for receiving their L bits per round; total communication cost is up to pL ≥ N per round
• Input distributed uniformly (adversarial or random input distributions are also considered)
• Access to random bits (possibly shared)
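A tiny sanity check of the MPC accounting above, with hypothetical names: the load must lie between N/p and N, and one round then moves at most pL ≥ N bits in total.

```python
def mpc_round_cost(N, p, L):
    # In the MPC model the load L must satisfy N/p <= L <= N;
    # total communication per round is then at most p * L >= N bits.
    assert N / p <= L <= N, "load outside the MPC range"
    return p * L  # worst-case bits moved in one round

# N = 2**20 bits over p = 64 servers at the minimum load L = N/p:
total = mpc_round_cost(2**20, 64, 2**20 // 64)
# total == 2**20: at minimum load, one round moves about N bits in all
```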

  12. Relation to other communication models
• Message-passing (private messages) model
– each message costs the processor receiving it
– WLOG one player is a Coordinator who sends and receives every message
– Many recent results improving Ω(N/p) lower bounds to Ω(N) [WZ 12], [PVZ 12], [WZ 13], [BEOPV 13], …
– Complexity is never larger than N bits, independent of rounds
– No limit on bits per processor, unlike the MPC model
• CONGEST model
– Total communication bounds > N are possible but depend on network diameter and topology
– MPC corresponds to a complete graph, for which the largest communication bound possible is ≤ N

  13. Complexity in the MPC model
• Tradeoffs between rounds r, processors p, and load L
• Try to minimize the load L for each fixed r and p
– Since N/p ≤ L ≤ N, the range of variation in L is a factor p^ε for 0 ≤ ε ≤ 1
• 1 round: still interesting theoretical/practical questions; many open questions
• Multi-round computation is more difficult, e.g. PointerJumping, i.e. st-connectivity in out-degree-1 graphs
– Can achieve load O(N/p) in r = O(log p) rounds by pointer doubling
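The pointer-doubling idea behind the logarithmic-round PointerJumping bound can be sketched sequentially. The MPC details (distributing the successor array across servers) are elided, and the names are illustrative.

```python
def pointer_jumping(succ):
    # succ[i] is the unique out-neighbour of node i (out-degree-1 graph).
    # Pointer doubling: replace succ by succ composed with itself each
    # round, so chains halve in length and every node reaches the end
    # of its chain after O(log n) rounds.
    n = len(succ)
    rounds = 0
    while any(succ[succ[i]] != succ[i] for i in range(n)):
        succ = [succ[succ[i]] for i in range(n)]
        rounds += 1
    return succ, rounds

# chain 0 -> 1 -> 2 -> 3 -> 3 (node 3 is a self-loop)
final, r = pointer_jumping([1, 2, 3, 3])
# final == [3, 3, 3, 3] after r == 2 doubling rounds
```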

  14. Database Join Queries
• Given input relations R1, R2, …, Rm as tables of tuples, of possibly different arities, produce the table of all tuples answering the query
Q(x1, x2, …, xk) = R1(x1, x2, x3), R2(x2, x4), …, Rm(x4, xk)
– Known as full conjunctive queries, since every variable on the RHS appears in the query (no variables projected out)
• Our examples: connected queries only
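A full conjunctive query as above can be evaluated by a straightforward backtracking join. This sequential sketch is purely illustrative (it is not the paper's parallel algorithm, and all names are mine): it extends a partial variable assignment one atom at a time, keeping only consistent extensions.

```python
def evaluate_query(relations, head_vars):
    # relations: list of (var_tuple, set_of_tuples). The query is the
    # natural join of all atoms; since the query is full, every variable
    # appears in the answer tuple.
    answers = []

    def extend(assignment, remaining):
        if not remaining:
            answers.append(tuple(assignment[v] for v in head_vars))
            return
        vars_, tuples = remaining[0]
        for t in tuples:
            # keep t only if it agrees with the variables bound so far
            if all(assignment.get(v, val) == val for v, val in zip(vars_, t)):
                new = dict(assignment)
                new.update(zip(vars_, t))
                extend(new, remaining[1:])

    extend({}, relations)
    return answers

# Q(x, y, z) = R(x, y), S(y, z)
R = (("x", "y"), {(1, 2), (3, 4)})
S = (("y", "z"), {(2, 5)})
# evaluate_query([R, S], ["x", "y", "z"]) == [(1, 2, 5)]
```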

  15. The Query Hypergraph
• One vertex per variable
• One hyperedge per relation
Q(x1, x2, x3, x4, x5) = R(x1, x2, x3), S(x2, x4), T(x3, x5), U(x4, x5)
[Figure: hypergraph on vertices x1, …, x5 with hyperedges R = {x1, x2, x3}, S = {x2, x4}, T = {x3, x5}, U = {x4, x5}]
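Extracting the query hypergraph from a list of atoms is mechanical: one vertex per variable, one hyperedge per relation. A minimal sketch, with assumed names:

```python
def query_hypergraph(atoms):
    # atoms: list of (relation_name, tuple_of_variables).
    vertices = sorted({v for _, vars_ in atoms for v in vars_})
    hyperedges = {name: tuple(vars_) for name, vars_ in atoms}
    return vertices, hyperedges

# Q(x1..x5) = R(x1,x2,x3), S(x2,x4), T(x3,x5), U(x4,x5)
V, E = query_hypergraph([("R", ("x1", "x2", "x3")),
                         ("S", ("x2", "x4")),
                         ("T", ("x3", "x5")),
                         ("U", ("x4", "x5"))])
# V == ["x1", "x2", "x3", "x4", "x5"]
```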

  16. k-partite data graph/hypergraph
[Figure: a query hypergraph on x1, …, x4 and the corresponding data hypergraph; n possible values per variable, kn vertices total]

  17. k-partite data graph/hypergraph (continued)
[Figure: the same query and data hypergraphs, with the query answers highlighted in the data hypergraph; n possible values per variable, kn vertices total]

  18. Some Hard Inputs
• Matching databases
– Number of relations R1, R2, … and size of query is constant
– Each Rj is a perfect aj-dimensional matching on [n]^aj, where aj is the arity of Rj
• i.e., among all the aj-tuples (k1, …, k_aj) ∊ Rj, each value k ∊ [n] appears exactly once in each coordinate
• No skew (all degrees are the same)
• Number of output tuples is at most n
– Total input size is N = O(log(n!)) = O(n log n)
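A matching database as defined above is easy to generate: each coordinate of a relation is an independent random permutation of [n], so every value appears exactly once per coordinate and all degrees are 1. A sketch with illustrative names:

```python
import random

def matching_relation(n, arity, rng=random):
    # A perfect arity-dimensional matching on [n]^arity: n tuples in which
    # each value in [n] appears exactly once in every coordinate.
    columns = [list(range(n)) for _ in range(arity)]
    for col in columns:
        rng.shuffle(col)
    return list(zip(*columns))

R = matching_relation(5, 3)
# R has 5 tuples, and each of its 3 coordinates is a permutation of 0..4
```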

  19. Example in two steps
Algorithm 1: find all triangles C3(x, y, z) = R1(x, y), R2(y, z), R3(z, x)
Input: n/p tuples from each of R1, R2, R3 at each server 1 ≤ u ≤ p
Step 1: send R1(x, y) to server (y mod p); send R2(y, z) to server (y mod p)
Step 2: join R1(x, y) with R2(y, z); send [R1(x, y), R2(y, z)] to server (z mod p); send R3(z, x) to server (z mod p)
Output: join [R1(x, y), R2(y, z)] with R3(z, x); output all triangles R1(x, y), R2(y, z), R3(z, x)
Example: R1(X, Y) = {(a1, b3), (a2, b1), (a3, b2)}, R2(Y, Z) = {(b1, c2), (b2, c3), (b3, c1)}, R3(Z, X) = {(c1, a2), (c2, a1), (c3, a3)}; the only answer is C3(X, Y, Z) = {(a3, b2, c3)}
Load: O(n/p) tuples (i.e. ε = 0); number of rounds: r = 2
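Algorithm 1 can be simulated sequentially by materializing the servers as hash buckets. This sketch follows the two routing steps above, using Python's built-in `hash` in place of `mod p` on raw values so that string-valued examples work; function and variable names are mine.

```python
from collections import defaultdict

def triangles_two_rounds(R1, R2, R3, p):
    # Round 1: route R1(x,y) and R2(y,z) to server hash(y) % p; join locally.
    servers = defaultdict(lambda: ([], []))
    for x, y in R1:
        servers[hash(y) % p][0].append((x, y))
    for y, z in R2:
        servers[hash(y) % p][1].append((y, z))
    partial = []  # joined [R1(x,y), R2(y,z)] tuples
    for r1, r2 in servers.values():
        for x, y in r1:
            for y2, z in r2:
                if y == y2:
                    partial.append((x, y, z))
    # Round 2: route partial tuples and R3(z,x) to server hash(z) % p; join.
    servers2 = defaultdict(lambda: ([], []))
    for x, y, z in partial:
        servers2[hash(z) % p][0].append((x, y, z))
    for z, x in R3:
        servers2[hash(z) % p][1].append((z, x))
    out = []
    for pa, r3 in servers2.values():
        for x, y, z in pa:
            if (z, x) in r3:
                out.append((x, y, z))
    return out

# The matching-database example from the slide:
R1 = [("a1", "b3"), ("a2", "b1"), ("a3", "b2")]
R2 = [("b1", "c2"), ("b2", "c3"), ("b3", "c1")]
R3 = [("c1", "a2"), ("c2", "a1"), ("c3", "a3")]
result = triangles_two_rounds(R1, R2, R3, p=4)
# result == [("a3", "b2", "c3")]
```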

  20. Example in one step [Ganguly '92, Afrati '10]
Servers form a cube: [p] ≅ [p^(1/3)] × [p^(1/3)] × [p^(1/3)], server (i, j, k)
Algorithm 2: find all triangles C3(x, y, z) = R1(x, y), R2(y, z), R3(z, x)
Step 1: choose random hash functions h1, h2, h3; for each server 1 ≤ u ≤ p:
• send R1(x, y) to servers (h1(x) mod p^(1/3), h2(y) mod p^(1/3), *)
• send R2(y, z) to servers (*, h2(y) mod p^(1/3), h3(z) mod p^(1/3))
• send R3(z, x) to servers (h1(x) mod p^(1/3), *, h3(z) mod p^(1/3))
Output all triangles R1(x, y), R2(y, z), R3(z, x)
Example: same R1, R2, R3 as before; C3 = {(a3, b2, c3)}
Load: O(n/p × p^(1/3)) tuples (ε = 1/3); number of rounds: r = 1
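Algorithm 2 (the "hypercube" shuffle) can likewise be simulated sequentially: each relation is hashed on its two variables and replicated along the one missing cube dimension, so every potential triangle meets at exactly one cell. Names are mine; correctness does not depend on the random hash choices.

```python
import random
from collections import defaultdict

def triangles_hypercube(R1, R2, R3, side):
    # Servers form a cube [side] x [side] x [side], so p = side**3.
    # Each tuple is replicated side = p**(1/3) times, matching the
    # O(n/p * p**(1/3)) load on the slide.
    values = {a for t in R1 + R2 + R3 for a in t}
    h = {v: random.randrange(side) for v in values}
    cells = defaultdict(lambda: ([], [], []))
    for x, y in R1:                       # R1(x,y) -> (h(x), h(y), *)
        for k in range(side):
            cells[(h[x], h[y], k)][0].append((x, y))
    for y, z in R2:                       # R2(y,z) -> (*, h(y), h(z))
        for i in range(side):
            cells[(i, h[y], h[z])][1].append((y, z))
    for z, x in R3:                       # R3(z,x) -> (h(x), *, h(z))
        for j in range(side):
            cells[(h[x], j, h[z])][2].append((z, x))
    out = set()
    for r1, r2, r3 in cells.values():
        for x, y in r1:
            for y2, z in r2:
                if y == y2 and (z, x) in r3:
                    out.add((x, y, z))
    return sorted(out)

# Same example relations as the two-round algorithm, one round, 8 servers:
R1 = [("a1", "b3"), ("a2", "b1"), ("a3", "b2")]
R2 = [("b1", "c2"), ("b2", "c3"), ("b3", "c1")]
R3 = [("c1", "a2"), ("c2", "a1"), ("c3", "a3")]
tri = triangles_hypercube(R1, R2, R3, side=2)
# tri == [("a3", "b2", "c3")]
```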

  21. We Show
Example: find all triangles C3(x, y, z) = R1(x, y), R2(y, z), R3(z, x) with load O(n/p × p^(1/3)) tuples (ε = 1/3) in r = 1 round
• The above algorithm is optimal among all randomized 1-round MPC algorithms for the triangle query
• Based on a general characterization of queries in terms of the fractional cover number of their associated hypergraph

  22. Fractional Cover Number τ*
Vertex Cover LP:
τ* = min Σi vi
subject to: Σ_{i : xi ∈ vars(Rj)} vi ≥ 1 ∀j, and vi ≥ 0 ∀i
Edge Packing LP:
τ* = max Σj uj
subject to: Σ_{j : xi ∈ vars(Rj)} uj ≤ 1 ∀i, and uj ≥ 0 ∀j
Examples: τ*(Lk) = ⌈k/2⌉ for the path Lk, τ*(Ck) = k/2 for the cycle Ck
[Figure: optimal fractional covers, with weights from {0, ½, 1} on the path and weight ½ on each vertex of the cycle]
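The edge-packing side of the LP is easy to check by hand for small queries. This illustrative sketch (names are mine) verifies feasibility of a candidate fractional edge packing and returns its value; for the triangle C3, weight 1/2 on every atom is feasible and attains τ*(C3) = 3/2.

```python
from fractions import Fraction

def edge_packing_value(atoms, u):
    # atoms: list of variable-tuples, one per relation; u: one weight per atom.
    # Feasibility: for every variable, the weights of the atoms containing
    # it must sum to at most 1.
    vars_ = {v for vs in atoms for v in vs}
    for v in vars_:
        if sum(u[j] for j, vs in enumerate(atoms) if v in vs) > 1:
            raise ValueError(f"packing constraint violated at {v}")
    return sum(u)

# Triangle C3: atoms R1(x,y), R2(y,z), R3(z,x); u_j = 1/2 for every atom.
atoms = [("x", "y"), ("y", "z"), ("z", "x")]
val = edge_packing_value(atoms, [Fraction(1, 2)] * 3)
# val == Fraction(3, 2), i.e. tau*(C3) = 3/2
```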

  23. 1-Round, No Skew
Theorem: Any 1-round randomized MPC algorithm with p = ω(1) and load o(N / p^(1/τ*(Q))) will fail to compute a connected query Q on some matching-database input with probability Ω(1).
• τ*(C3) = 3/2, so load Ω(N/p^(2/3)) is needed, i.e. ε ≥ 1/3 for C3; the previous 1-round algorithm is optimal
• A matching upper bound holds for all databases without skew, by setting parameters in a randomized algorithm generalizing the triangle case
– exponentially small failure probability
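The theorem's load bound N/p^(1/τ*) corresponds to a replication exponent ε = 1 - 1/τ* relative to the ideal load N/p, since N/p^(1/τ*) = (N/p)·p^(1 - 1/τ*). A one-line computation (names mine) recovers ε = 1/3 for the triangle, matching the 1-round cube algorithm on the slides.

```python
from fractions import Fraction

def one_round_load_exponent(tau_star):
    # Load Omega(N / p**(1/tau*)) rewritten as (N/p) * p**epsilon
    # gives epsilon = 1 - 1/tau*.
    return 1 - 1 / tau_star

# Triangle: tau*(C3) = 3/2, so epsilon = 1/3 (load Omega(N / p**(2/3))).
eps = one_round_load_exponent(Fraction(3, 2))
# eps == Fraction(1, 3)
```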
