
Coded MapReduce, by Mohammad Ali Maddah-Ali (Bell Labs, Alcatel-Lucent)



  1. Coded MapReduce. Mohammad Ali Maddah-Ali, Bell Labs, Alcatel-Lucent. Joint work with Songze Li (USC) and Salman Avestimehr (USC). DIMACS, Dec. 2015.

  2. Infrastructure for big data: Computing, Communication, Storage. The interaction among the major components is the limiting barrier!

  3. In this talk: Computing and Communication. Fundamental tradeoff between computing and communication.

  4. Formulation. What is the minimum communication for a specific computation task? Information Theory (Körner and Marton 1979), Computer Science (Yao 1979). Shortcomings: • Problem oriented • Does not scale. Need a framework that is: • General • Scalable. Challenge: the right formulation. What are data companies using?

  5. Storage: Hadoop Distributed File System (HDFS). Computation: MapReduce. [Figures: Communication Load vs. Storage; Communication Load vs. Computation Load.] Refer to yesterday's talks: • Alexander Barg • Alexander Dimakis

  6. MapReduce: A General Framework. N Subfiles, K Servers, Q Keys. [Figure: an input file is split into N subfiles (1 to 6) placed on K servers. Each server runs Map on its subfiles to produce intermediate (Key, Value) pairs; an example pair has key Blue. The Shuffling Phase exchanges the pairs between servers, and each server then runs Reduce on its share of the Q keys.]

  7. Example: Word Counting. N Subfiles, K Servers, Q Keys. A book with N=6 chapters, K=3 servers, Q=3 keys. [Figure: each server maps two chapters into intermediate (Key, Value) pairs; for example, the pair with key A from chapter 1 holds the number of A's in chapter one. After shuffling, the three Reduce functions output the # of A's, # of B's, and # of C's.]
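The word-counting example can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's code: the function name `wordcount_mapreduce`, the chapter strings, and the choice that server k maps a consecutive block of chapters and reduces the k-th key are all assumptions made for the sketch.

```python
def wordcount_mapreduce(chapters, keys):
    """Toy MapReduce letter count; returns (totals, values_sent_in_shuffle)."""
    K = len(keys)                    # assumption: one server per key, so K = Q
    per_server = len(chapters) // K  # assumption: N is divisible by K

    # Map phase: server k maps its own block of chapters for every key.
    mapped = {}
    for k in range(K):
        for n in range(k * per_server, (k + 1) * per_server):
            mapped[n] = (k, {key: chapters[n].count(key) for key in keys})

    # Shuffle phase: the value for key q goes to server q.
    # It costs one transmission unless the mapper is also the reducer.
    sent = 0
    inbox = {q: [] for q in range(K)}
    for n, (owner, values) in mapped.items():
        for q, key in enumerate(keys):
            if q != owner:
                sent += 1
            inbox[q].append(values[key])

    # Reduce phase: server q sums everything it holds for its key.
    totals = {keys[q]: sum(inbox[q]) for q in range(K)}
    return totals, sent


totals, sent = wordcount_mapreduce(
    ["ABCA", "BBC", "CAB", "AAA", "CCB", "ABC"], "ABC")
print(totals, sent)
```

With N=6, K=3, Q=3, each server already holds 2 of the 6 values it needs, so the shuffle moves 12 of the 18 intermediate values.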

  8. MapReduce: A General Framework. N Subfiles, K Servers, Q Keys. General framework: • Matrix Multiplication • Distributed Optimization • Page Rank • …. Active research area: how to fit different jobs into this framework. [Figure: Map, shuffle, and Reduce stages over subfiles 1 to 6.]

  9. Communication Load. N=6 Subfiles, K=3 Servers, Q=3 Keys. [Figure: each server maps two subfiles and the intermediate values are shuffled between servers; label: Communication Load (MR).] Communication is a bottleneck! Can we reduce communication load at the cost of computation?

  10. Communication Load. N Subfiles, K Servers, Q Keys, Comp. Load r. [Figure: subfiles are mapped repeatedly across servers, so some intermediate values are locally available and need not be shuffled; label: Comm. Load (Uncoded).]

  11. Communication Load. N Subfiles, K Servers, Q Keys, Comp. Load r. [Plot: Communication Load vs. Computation Load, with curves Comm. Load (Map Reduce) and Comm. Load (Uncoded).] Can we do better? Can we get a non-vanishing gain for large K?

  12. Coded MapReduce. N Subfiles, K Servers, Q Keys, Comp. Load r. [Figure: each server sends the XOR (⊕) of two intermediate values; labels: Comm. Load (Uncoded), Comm. Load (Coded).] Each coded (key, value) pair is useful for two servers.
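A small Python sketch can make the XOR idea concrete for K=3 servers and r=2. The subfile placement, the rule that server k reduces key k, the three coded messages, and the stand-in value function `v` are assumptions chosen for illustration; they are one arrangement consistent with the idea on the slide, not necessarily the exact one drawn there.

```python
def v(key, subfile):
    """Hypothetical intermediate value for (key, subfile); any integer works."""
    return 1000 * key + 7 * subfile

# Assumed placement: every subfile is mapped on r = 2 of the K = 3 servers.
stored = {1: {1, 2, 3, 4}, 2: {3, 4, 5, 6}, 3: {5, 6, 1, 2}}

# Each coded message is (sender, (key_a, subfile_a), (key_b, subfile_b)).
messages = [
    (1, (2, 1), (3, 3)),
    (2, (3, 4), (1, 5)),
    (3, (1, 6), (2, 2)),
]

def coded_shuffle():
    """Deliver every missing value using one XOR broadcast per server."""
    recovered = {k: {} for k in stored}
    for sender, a, b in messages:
        # The sender mapped both subfiles, so it can form the XOR.
        assert a[1] in stored[sender] and b[1] in stored[sender]
        payload = v(*a) ^ v(*b)
        for want, side in ((a, b), (b, a)):
            receiver = want[0]                   # server k reduces key k
            assert side[1] in stored[receiver]   # side value is computed locally
            assert want[1] not in stored[receiver]
            recovered[receiver][want[1]] = payload ^ v(*side)
    return recovered

print(coded_shuffle())
```

Three coded transmissions deliver the six values that would otherwise be sent one by one, because each receiver cancels the term it can compute from its own subfiles.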

  13. Communication Load. N Subfiles, K Servers, Q Keys, Comp. Load r. [Plot: Communication Load vs. Computation Load, with curves Comm. Load (Map Reduce), Comm. Load (Uncoded), and Comm. Load (Coded).] Communication Load x Computation Load ~ constant
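The curve expressions did not survive in this transcript. As a hedged reconstruction, the following normalized loads (fraction of the QN intermediate values that must be shuffled) are assumed here because they agree with the "~ constant" statement; the symbols L_MR, L_uncoded, and L_coded are labels introduced for this sketch.

```latex
\[
L_{\mathrm{MR}} = 1 - \frac{1}{K}, \qquad
L_{\mathrm{uncoded}}(r) = 1 - \frac{r}{K}, \qquad
L_{\mathrm{coded}}(r) = \frac{1}{r}\left(1 - \frac{r}{K}\right)
\]
\[
r \cdot L_{\mathrm{coded}}(r) = 1 - \frac{r}{K} < 1
\]
```

For N=6, K=3, Q=3 (so QN=18), these give 12 values for plain MapReduce, 6 for uncoded shuffling with r=2, and 3 for coded shuffling with r=2. The coded load shrinks by a factor of r even when K is large, which is the non-vanishing gain.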

  14. Proposed Scheme. N Subfiles, K Servers, Q Keys, Comp. Load r. Objective: each server can send coded intermediate (Key, Value) pairs that are useful for r other servers. Need to assign the subfiles such that: - for every subset S of r+1 servers, and for every subset T of S with r servers, - the servers in T share an intermediate (Key, Value) pair useful for the server in S \ T. [Figure: sets S and T, with ⊕ combinations.]

  15. Proposed Scheme. N Subfiles, K Servers, Q Keys, Comp. Load r. - N subfiles: W_1, W_2, …, W_N - Split the set of subfiles into batches of subfiles. - Each subset of r servers takes a unique batch of subfiles.
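The assignment rule can be sketched as follows. The function name `assign_batches`, the 1-based numbering, and the requirement that N be divisible by the number of r-subsets are assumptions of this sketch; the slide only states that each r-subset of servers takes a unique batch.

```python
from itertools import combinations

def assign_batches(num_subfiles, num_servers, r):
    """Give every size-r subset of servers its own batch of subfiles.

    Returns {server: [subfile indices]} with 1-based servers and subfiles.
    """
    subsets = list(combinations(range(1, num_servers + 1), r))
    if num_subfiles % len(subsets) != 0:
        # Assumption for simplicity: batches must all have the same size.
        raise ValueError("N must be divisible by the number of r-subsets")
    batch_size = num_subfiles // len(subsets)

    assignment = {k: [] for k in range(1, num_servers + 1)}
    for i, subset in enumerate(subsets):
        batch = list(range(i * batch_size + 1, (i + 1) * batch_size + 1))
        for k in subset:
            assignment[k].extend(batch)  # all r servers in the subset map it
    return assignment

print(assign_batches(6, 3, 2))
```

With N=6, K=3, r=2 there are three server pairs, each taking a batch of two subfiles, so every server maps rN/K = 4 subfiles and every subfile is mapped exactly r times.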

  16. Coded MapReduce: Delay Profile. N=1200 Subfiles, K=10 Servers, Q=10 Keys. [Plot: delay profiles for r = 1, 2, 3, 4, 5, 6, 7.] As soon as r copies of a mapping are done, that mapping is killed on the other servers. Map time duration: exponential random variable.

  17. Connection with Coded Caching. [Figure: files A, B, C, each split into parts 1, 2, 3, with coded combinations of A2 and B1, A3 and C1, B3 and C2.] Maddah-Ali-Niesen, 2012; Ji-Caire-Molisch, 2014. - In coded caching, in the placement phase, the demand of each user is not known. - In coded MapReduce, in job assignment, the server which reduces a key is known!

  18. Why it works! N Subfiles, K Servers, Q Keys, Comp. Load r. Key idea: - When a subfile is assigned to a server, that server computes all (key, value) pairs for that subfile. - This imposes a symmetry on the problem.

  19. Can We Do Better? Theorem: The proposed scheme is optimal within a constant factor in rate. [Plot label: Comm. Load (Coded).]

  20. Outer Bound. N=3 Subfiles, K=3 Servers, Q=3 Keys, Comp. Load r. [Figure: Server 1, Server 2, Server 3.]

  21. Outer Bound. N=3 Subfiles, K=3 Servers, Q=3 Keys, Comp. Load r. [Figure: Server 1, Server 2, Server 3.]

  22. Conclusion • The communication-computation tradeoff is of great interest and challenging • Coded MapReduce provides a near-optimal framework for trading “computing” with “communication” in distributed computing • Communication load x Computation load is approximately constant • Many future directions: – Impact of Coded MapReduce on the overall run-time of MapReduce – General server topologies – Applications to wireless distributed computing (“wireless Hadoop”) • Papers available on arXiv.
