
Trading Communication and Computing for Distributed Matrix - PowerPoint PPT Presentation

Multi-Cell Mobile Edge Coded Computing: Trading Communication and Computing for Distributed Matrix Multiplication. ISIT, June 21-26, 2020.


  1. Multi-Cell Mobile Edge Coded Computing: Trading Communication and Computing for Distributed Matrix Multiplication ISIT, June 21-26, 2020

  2. Emerging Mobile Applications
     ◼ Computation-intensive
     ◼ Delay-sensitive

  3. Mobile Edge Computing (MEC)
     ◼ Provides IT and cloud-computing capabilities within the Radio Access Network (RAN), in close proximity to mobile subscribers [ETSI'14]
     ◼ Improves user experience:
       ◆ Saves energy
       ◆ Reduces latency
     [Figure labels: data center, web, gateway, CDN]
     [ETSI'14] "Mobile-edge computing: Introductory technical white paper," White Paper, ETSI, Sophia Antipolis, France, Sep. 2014.

  4. Challenge
     ◼ Task offloading procedure (computation timeline: uplink, computation, downlink):
       ◆ Input data uploading
       ◆ Distributed edge computing
       ◆ Output data downloading
     ◼ Challenges:
       ◆ Severe interference or deep fading
       ◆ Random server computing times, i.e., stragglers
     ◼ End-to-end times are significantly prolonged
     [Figure: users 1, ..., i, ..., M connected to ENs 1, ..., k, ..., K over the uplink and downlink]

  5. Our Approach
     ◼ Exploit computation replication and coded computing
       ◆ Consider matrix multiplication in a linear inference task, V = AU, where U is the users' input vectors, A is the network-stored model, and V is the desired output vectors
       ◆ Assign the input vectors U from users to multiple ENs
       ◆ Encode the model A by hybrid MDS and repetition codes
     ◼ Investing more time in any one of the three task offloading steps can reduce the time needed for subsequent steps
       ◆ Tradeoffs among upload, computing, and download latencies
     [Diagram labels: repeated assignment of U; MDS-repetition coding for A; create spatial redundancy; overcome stragglers; reduce recovery threshold; transmission cooperation; mitigate interference]

  6. Related Works
     ◼ [Zhang'19] utilizes MDS-repetition codes
       ◆ Assumes input vectors from all users are available at all ENs
       ◆ Proposes a computing-downloading strategy
     ◼ [Li'20] exploits computation replication
       ◆ Assumes computing times of ENs are deterministic; adopts a general task model
       ◆ Characterizes an upload-download latency tradeoff
     ◼ Our work
       ◆ Proposes a joint task assignment, upload, computing, and download policy
       ◆ Studies tradeoffs among upload, computing, and download latencies
       ◆ Converse: our policy is approximately optimal for sufficiently large upload times
     [Zhang'19] J. Zhang and O. Simeone, "On model coding for distributed inference and transmission in mobile edge computing systems," IEEE Commun. Letters, vol. 23, no. 6, pp. 1065-1068, Jun. 2019.
     [Li'20] K. Li, M. Tao, and Z. Chen, "Exploiting computation replication for mobile edge computing: A fundamental computation-communication tradeoff study," IEEE Trans. Wireless Commun., pp. 1-1, 2020.

  7. MEC Network Model
     ◼ Each user i has N input vectors and desires N output vectors
     ◼ Each EN stores [number missing] rows of the m × n matrix A
     ◼ [symbol missing] is the time to compute a row-vector product
     ◼ [symbol missing] is the set of input vectors from all users assigned to EN k
     ◼ The computing time of EN k is [formula missing]
     [Figure: users 1, ..., i, ..., M upload input data over the uplink to ENs holding the stored encoded model and receive the desired outputs over the downlink]

  8. Performance Metric
     ◼ Repetition order r: the average number of ENs that are assigned the same input vector
     ◼ Recovery order q: the number of non-straggling ENs that return outputs (the remaining K-q ENs are stragglers)
       ◆ Feasible region: [formula missing], so that the non-straggling ENs store enough information about A to compute the outputs
     ◼ Normalized uploading time (NULT): [formula missing], normalized by a reference time
     ◼ Normalized computing time (NCT): [formula missing], normalized by a reference time
     ◼ Normalized downloading time (NDLT): [formula missing], normalized by a reference time
     ◼ For a given NULT, the compute-download latency region: [formula missing]
     [Slide annotation on the definitions: avoid rounding complications]
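     The role of the recovery order q can be illustrated numerically: if only the q fastest of K ENs are awaited, the computing phase ends at the q-th smallest finishing time. The transcript does not preserve the computing-time model, so the sketch below assumes, purely for illustration, i.i.d. exponential finishing times with unit mean, for which the expected q-th smallest of K equals 1/K + 1/(K-1) + ... + 1/(K-q+1). The function name `expected_qth_fastest` is hypothetical.

```python
from fractions import Fraction


def expected_qth_fastest(K: int, q: int) -> Fraction:
    """Expected q-th smallest of K i.i.d. unit-mean exponential times.

    The exponential model is an assumption made for illustration only;
    it is not taken from the slides.
    """
    if not 1 <= q <= K:
        raise ValueError("need 1 <= q <= K")
    # Order-statistics identity for exponentials: sum of 1/j for j = K-q+1..K.
    return sum((Fraction(1, j) for j in range(K - q + 1, K + 1)), Fraction(0))


# Waiting for fewer ENs (smaller q) shortens the expected computing phase.
for q in range(1, 6):
    print(q, expected_qth_fastest(5, q))
```

     Under this assumed model, waiting for 3 of 5 ENs costs 47/60 of a mean finishing time, versus 137/60 when all 5 must finish.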

  9. Fundamental Question
     ◼ Given an upload latency, what is the optimal trade-off region between computing and download latencies?
       ◆ Characterize the inner bound and outer bound on the compute-download latency region at any given upload latency
       ◆ Present tradeoffs among upload, computing, and download latencies
     [Figure: compute-download latency region at a given NULT (value missing) for M = K = 10, showing the inner bound and the outer bound; x-axis: NCT, y-axis: NDLT]

  10. Example: Task Assignment & Upload
      ◼ M = 5 users, K = 5 ENs, N = 5 input vectors, m = 40 row vectors, μ = 3/5, (r, q) = (4, 3)
      ◼ Each user divides its input vectors into $\binom{5}{4} = 5$ subsets; each subset has 1 input and is assigned to a distinct subset of 4 ENs
      ◼ Uplink: 5-transmitter 5-receiver X-multicast channel with multicast group size 4
        ◆ Optimal per-receiver DoF: [value missing] (interference alignment)
        ◆ Approximated transmission rate: [formula missing]
        ◆ Upload time: [formula missing]
      ◼ The NULT: [value missing]
      ◼ Any 2 ENs may be stragglers in the edge computing phase
      Input assignment for user i = 1, ..., 5:
        EN 1: u_{i,2}, u_{i,3}, u_{i,4}, u_{i,5}
        EN 2: u_{i,1}, u_{i,3}, u_{i,4}, u_{i,5}
        EN 3: u_{i,1}, u_{i,2}, u_{i,4}, u_{i,5}
        EN 4: u_{i,1}, u_{i,2}, u_{i,3}, u_{i,5}
        EN 5: u_{i,1}, u_{i,2}, u_{i,3}, u_{i,4}
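      The repeated assignment in this example can be reproduced with a short enumeration. This is a minimal sketch, not the authors' procedure: it lists all subsets of r = 4 ENs and, to match the labelling on the slide, gives input j of every user to all ENs except EN j. The variable names are illustrative.

```python
from itertools import combinations

K, N, r = 5, 5, 4  # ENs, inputs per user, repetition order (from the example)

# Every size-r subset of ENs receives exactly one input of each user.
en_subsets = list(combinations(range(1, K + 1), r))
assert len(en_subsets) == N  # C(5, 4) = 5 subsets, one per input

# Labelling read off the slide: input j goes to every EN except EN j.
assignment = {}  # input index j -> tuple of ENs holding u_{i,j}
for subset in en_subsets:
    (missing_en,) = set(range(1, K + 1)) - set(subset)
    assignment[missing_en] = subset

# Invert the map to see what each EN receives from one user.
inputs_at_en = {
    k: sorted(j for j, holders in assignment.items() if k in holders)
    for k in range(1, K + 1)
}
for k, js in inputs_at_en.items():
    print(f"EN {k}: " + ", ".join(f"u_(i,{j})" for j in js))
```

      Each EN ends up with 4 inputs per user, i.e. 20 inputs across the 5 users, which is the count used in the edge computing step of the example.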

  11. Example: Coding & Edge Computing
      ◼ Hybrid MDS-repetition codes
        ◆ Coding rates: MDS code rate [value missing] and repetition code rate [value missing], subject to the storage constraint [formula missing] and the recovery condition [formula missing]; choose the maximum [symbol missing]
        ◆ Encode A into A_c = [a_1, ..., a_60] with a (60, 40) MDS code, then split A_c into submatrices of 6 rows each (60/6 = 10 submatrices), each replicated at 2 ENs
      ◼ Edge computing (q = 3)
        ◆ Each EN computes 24 × 20 row-vector products
        ◆ Waiting for the fastest 3 ENs, the NCT is [value missing]
        ◆ Any 2 ENs may be stragglers
      Stored rows and assigned inputs (r = 4, i = 1, ..., 5):
        EN 1: rows a_1 to a_24; inputs u_{i,2}, u_{i,3}, u_{i,4}, u_{i,5}
        EN 2: rows a_1 to a_6 and a_25 to a_42; inputs u_{i,1}, u_{i,3}, u_{i,4}, u_{i,5}
        EN 3: rows a_7 to a_12, a_25 to a_30, and a_43 to a_54; inputs u_{i,1}, u_{i,2}, u_{i,4}, u_{i,5}
        EN 4: rows a_13 to a_18, a_31 to a_36, a_43 to a_48, and a_55 to a_60; inputs u_{i,1}, u_{i,2}, u_{i,3}, u_{i,5}
        EN 5: rows a_19 to a_24, a_37 to a_42, and a_49 to a_60; inputs u_{i,1}, u_{i,2}, u_{i,3}, u_{i,4}
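      The storage layout on this slide is consistent with placing each of the 10 coded submatrices at one pair of ENs. The sketch below rebuilds that placement and checks, for every choice of 2 stragglers, how many distinct coded rows the remaining ENs can apply to each input; with a (60, 40) MDS code any 40 distinct coded rows suffice. The pairwise placement order and the rule "input j is held by every EN except EN j" are read off the slide, and the helper `distinct_rows` is hypothetical.

```python
from itertools import combinations

K, m, coded_rows, rows_per_sub = 5, 40, 60, 6
ens = range(1, K + 1)

# Submatrix t is replicated at one pair of ENs (pairs in lexicographic order).
pairs = list(combinations(ens, 2))  # 10 pairs -> 10 submatrices
assert len(pairs) * rows_per_sub == coded_rows

stored = {k: [t for t, p in enumerate(pairs) if k in p] for k in ens}
rows_at_en = {k: len(stored[k]) * rows_per_sub for k in ens}


def distinct_rows(input_j, stragglers):
    """Distinct coded rows applied to input j by the non-straggling ENs."""
    helpers = [k for k in ens if k != input_j and k not in stragglers]
    subs = set().union(*(stored[k] for k in helpers))
    return len(subs) * rows_per_sub


# Worst case over all straggler pairs and all inputs.
worst = min(distinct_rows(j, s) for s in combinations(ens, 2) for j in ens)
print(rows_at_en, worst)
```

      Every EN stores 24 rows, and in the worst case (an input held by only two of the three fastest ENs) 42 distinct coded rows are still available, which exceeds the 40 needed.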

  12. Example: Output Data Download
      ◼ Divide the needed outputs into multiple groups
        ◆ Different groups are transmitted using TDMA
        ◆ The downlink channel for transmitting the outputs in each group is a cooperative X channel
      ◼ Computation results of [symbol missing]:
        ◆ 30 outputs: 2-transmitter 5-receiver MISO broadcast channel; optimal per-receiver DoF 2/5 by zero-forcing (ZF) precoding

  13. Example: Output Data Download
      ◼ Divide the needed outputs into multiple groups
        ◆ Different groups are transmitted using TDMA
        ◆ The downlink channel for transmitting the outputs in each group is a cooperative X channel
      ◼ Computation results of [symbol missing]:
        ◆ 30 outputs: 2-transmitter 5-receiver MISO broadcast channel; optimal per-receiver DoF 2/5 by zero-forcing (ZF) precoding
        ◆ The remaining 34 × 5 = 170 needed outputs: 2-transmitter 5-receiver X channel; optimal per-receiver DoF 1/3 by asymptotic interference alignment (IA)
        ◆ NDLT: 3/40 + 51/100 = 117/200
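      The two NDLT terms on this slide can be reproduced from the output counts and per-receiver DoF values. The NDLT definition did not survive in this transcript, so the normalization below (outputs per user, divided by the per-receiver DoF, divided by a reference of N·m = 200) is an assumption chosen because it matches the slide's figures; `ndlt` is a hypothetical helper.

```python
from fractions import Fraction

M, N, m = 5, 5, 40      # users, inputs per user, rows of A (from the example)
REFERENCE = N * m       # assumed normalization (200); matches the slide's numbers


def ndlt(total_outputs, per_receiver_dof):
    """Normalized download time for one group of outputs (assumed definition)."""
    outputs_per_user = Fraction(total_outputs, M)
    return outputs_per_user / per_receiver_dof / REFERENCE


miso_part = ndlt(30, Fraction(2, 5))   # 30 outputs over the MISO broadcast channel
x_part = ndlt(170, Fraction(1, 3))     # 170 outputs over the X channel
print(miso_part, x_part, miso_part + x_part)  # 3/40 51/100 117/200
```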

  14. Example: Output Data Download
      ◼ Computation results of [symbol missing] (MISO broadcast channel and X channel):
        ◆ NDLT: 117/200 × 2 = 117/100

  15. Example: Output Data Download
      ◼ Computation results of [symbol missing] (cooperative X channel and X channel):
        ◆ 3-transmitter 5-receiver cooperative X channel with cooperation group size 2
        ◆ 3-transmitter 5-receiver X channel
        ◆ NDLT: (21/100 + 77/300) × 2 = 14/15
      ◼ Total NDLT: 14/15 + (117/200) × 3 = 1613/600
      Note: the number of outputs remaining in the last round can be regarded as an integer evenly divisible by [symbol missing] when [condition missing].
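      The total can be checked with exact rational arithmetic. This sketch only re-adds the per-result values quoted on the slides (three results served by 2 transmitters, two results served by 3 transmitters); the variable names are illustrative.

```python
from fractions import Fraction

per_result_2tx = Fraction(3, 40) + Fraction(51, 100)    # 117/200
per_result_3tx = Fraction(21, 100) + Fraction(77, 300)  # 7/15

# Groups are sent one after another (TDMA), so their NDLTs add up.
total_ndlt = 2 * per_result_3tx + 3 * per_result_2tx
print(2 * per_result_3tx, total_ndlt)  # 14/15 1613/600
```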

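  16. Achievable Results
      ◼ At a pair (r, q) in [feasible region missing]:
        ◆ NULT: [formula missing]
        ◆ NCT: [formula missing]
        ◆ NDLT: [formula missing], where [symbol missing] and [symbol missing] is determined by [formula missing]
      ◼ For a given NULT, the inner bound of the compute-download latency region, considering all feasible values of q: [formula missing] (time- and memory-sharing)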
