
ACM Symposium on Cloud Computing 2019 - PowerPoint PPT Presentation



  1. ACM Symposium on Cloud Computing 2019

  2. Tenants: rent Virtual Machines (VMs). Cloud providers: operate cloud infrastructures, with great budget expenditure for: • Data center equipment • Power provisioning

  3. Tenants: rent Virtual Machines (VMs). Cloud providers: operate cloud infrastructures, with great budget expenditure for: • Data center equipment • Power provisioning. Ø Virtual resources might be provisioned (by tenants) for peak load Ø Tenants’ VM placement (by providers) is challenging

  4. [Figure: CDF of average CPU and memory usage, Alibaba cluster trace (2018); x-axis = average usage, y-axis = cumulative probability F(x). fg = foreground/online workload]

  5. [Figures: CDF of average CPU and memory usage, Alibaba cluster trace (2018); VM-level CPU usage (%) over time (days), Azure trace (2017). fg = foreground/online workload]

  6. Great opportunity to use cloud idle resources. [Figures: CDF of average CPU and memory usage, Alibaba cluster trace (2018); VM-level CPU usage (%) over time (days), Azure trace (2017). fg = foreground/online workload]

  7. [Figure: CDF of average CPU and memory usage, Alibaba cluster trace (2018); x-axis = average usage, y-axis = cumulative probability F(x)]

  8. [Figure: CDF of average CPU and memory usage, Alibaba cluster trace (2018). bg = background/batch workload]

  9. Problem statement: How to schedule background batch jobs to improve utilization without hurting black-box foreground performance? [Figure: CDF of average CPU and memory usage, Alibaba cluster trace (2018). bg = background/batch workload]


  11. fg: Facebook, bg: FB-Hadoop

  12. [Architecture diagram: physical server, network, Virtual Machine (VM), containers Container [1] … Container [n_socket] each running a worker process, a Scavenger Daemon, and its data sources]

  13. [Setup: a VM (web serving) and a container (DCopy) pinned to CPU cores 0-3 sharing the Last Level Cache (LLC); Ubuntu 16.04, KVM, Docker; pinning uses Linux’s cpuset cgroups]
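The pinning above relies on Linux cpuset cgroups. Below is a minimal sketch of how a background process could be confined to specific cores through the cgroup-v1 cpuset interface; the cgroup name, core list, and PID are illustrative assumptions, not taken from the talk.

```python
import os

CPUSET_ROOT = "/sys/fs/cgroup/cpuset"          # cgroup-v1 cpuset hierarchy

def pin_to_cores(group: str, cpus: str, pid: int, mems: str = "0") -> None:
    """Confine `pid` to the given CPU list (e.g. "2-3") via a cpuset cgroup."""
    path = os.path.join(CPUSET_ROOT, group)
    os.makedirs(path, exist_ok=True)           # create the cpuset group
    with open(os.path.join(path, "cpuset.cpus"), "w") as f:
        f.write(cpus)                           # cores the group may run on
    with open(os.path.join(path, "cpuset.mems"), "w") as f:
        f.write(mems)                           # NUMA nodes the group may allocate from
    with open(os.path.join(path, "tasks"), "w") as f:
        f.write(str(pid))                       # move the process into the group

# Example (hypothetical PID): keep a background DCopy process on cores 2-3,
# leaving cores 0-1 for the web-serving VM.
# pin_to_cores("scavenger_bg", "2-3", 12345)
```

For containers, the same effect can be obtained directly with Docker’s `--cpuset-cpus` flag (e.g., `docker run --cpuset-cpus=2,3 …`).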

  14. [Setup as in slide 13: VM (web serving) and container (DCopy) on CPU cores 0-3 sharing the LLC; Ubuntu 16.04, KVM, Docker; cpuset cgroups. Plot: 95%ile RT degradation (%) vs. background CPU usage (%)]

  15. [Setup as in slide 13. Plots: 95%ile RT degradation (%) and Instructions Per Cycle (IPC) degradation (%) vs. background CPU usage (%)]

  16. IPC is used as the performance proxy. [Setup as in slide 13. Plots: 95%ile RT degradation (%) and IPC degradation (%) vs. background CPU usage (%)]
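Since the foreground VM is treated as a black box, IPC has to be sampled from the host side. A rough sketch of how IPC might be measured for a VM’s host process with `perf stat`; the PID and sampling interval are assumptions for illustration, and this is not necessarily how the authors’ daemon collects the counters.

```python
import subprocess

def sample_ipc(pid: int, seconds: int = 1) -> float:
    """Return instructions-per-cycle for `pid`, measured with `perf stat`."""
    # -x, produces CSV output on stderr: value,unit,event,...
    res = subprocess.run(
        ["perf", "stat", "-x", ",", "-e", "instructions,cycles",
         "-p", str(pid), "--", "sleep", str(seconds)],
        capture_output=True, text=True, check=True)
    counts = {}
    for line in res.stderr.splitlines():
        fields = line.split(",")
        if len(fields) >= 3 and fields[0].isdigit():
            counts[fields[2]] = int(fields[0])   # event name -> raw count
    return counts["instructions"] / counts["cycles"]

# Example (hypothetical QEMU/KVM PID):
# print(sample_ipc(4321))
```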


  18. Ø Our generic online algorithm • Monitor VMs’ perf metric (e.g., memory usage) over a window-size • Calculate the mean ν and standard deviation τ • React based on the VMs’ perf metric relative to ν ± d·τ. [Simplified illustration: normalized metric value (memory usage, network usage) over time; when the metric exceeds ν + d·τ → bg--, when it stays within ν ± d·τ → do nothing, when it drops below ν − d·τ → bg++; background starts at bg = 0 and its headroom is bg = 1 − (ν + d·τ)]
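A minimal Python sketch of this control loop, assuming the foreground metric is already normalized to [0, 1]; `read_fg_metric` and `set_bg_allocation` are hypothetical hooks standing in for the daemon’s actual monitoring and actuation paths, and the step size and period are illustrative.

```python
from collections import deque
from statistics import mean, stdev
import time

def control_loop(read_fg_metric, set_bg_allocation,
                 window_size=30, d=2.0, step=0.05, period=1.0):
    """Regulate the background allocation from a foreground metric
    (normalized to [0, 1]), following the nu +/- d*tau rule above."""
    window = deque(maxlen=window_size)
    bg = 0.0                                   # start with no background share
    while True:
        sample = read_fg_metric()              # e.g., memory or network usage
        window.append(sample)
        if len(window) == window.maxlen:       # wait for a full window first
            nu, tau = mean(window), stdev(window)
            headroom = max(0.0, 1.0 - (nu + d * tau))
            if sample > nu + d * tau:          # foreground ramping up: back off
                bg = max(0.0, bg - step)
            elif sample < nu - d * tau:        # foreground quiet: grow, capped by headroom
                bg = min(headroom, bg + step)
            # otherwise: do nothing
            set_bg_allocation(bg)
        time.sleep(period)
```

The same loop can drive different knobs (CPU shares, memory limits, or a network rate limit); only the actuation hook changes.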


  20. Foreground: Training: CloudSuite (widely used benchmark suite); Testing: TailBench (designed for latency-critical applications). Background (SparkBench): KMeans (a popular clustering algorithm); SparkPi (computes Pi with very high precision).

  21. Foreground: Training: CloudSuite (widely used benchmark suite); Testing: TailBench (designed for latency-critical applications). Background (SparkBench): KMeans (a popular clustering algorithm); SparkPi (computes Pi with very high precision). Training workloads drive the sensitivity analysis; testing workloads drive the experimental evaluation.

  22. The load generators employed in TailBench are open-loop.
      Workload | Domain                     | Tail latency scale
      Xapian   | Online search              | Milliseconds
      Moses    | Real-time translation      | Milliseconds
      Silo     | In-memory database (OLTP)  | Microseconds
      Specjbb  | Java middleware            | Microseconds
      Masstree | Key-value store            | Microseconds
      Shore    | On-disk database (OLTP)    | Milliseconds
      Sphinx   | Speech recognition         | Seconds
      Img-dnn  | Image recognition          | Milliseconds
      http://people.csail.mit.edu/sanchez/papers/2016.tailbench.iiswc.pdf

  23. [Testbed: PM 1 with two processor sockets (cores 0-9 each, 25 MB LLC per socket), 250 GB DRAM, 10 Gb/s network, Ubuntu 16.04, KVM, Docker; PM 2 runs the Resource Manager, Name Node, and Data Node]

  24. [Testbed as in slide 23, with Background VM 1 occupying cores 0-9 of processor socket 0]

  25. [Testbed as in slide 23, with Background VM 1 on cores 0-9 of processor socket 0 and Background VM 2 on cores 0-9 of processor socket 1]


  27. [Results: 95%ile latency degradation (%) for each VM 1 workload || VM 2 workload pair with bg: SparkPi; lower is better]

  28. [Results: 95%ile latency degradation (%) for each VM 1 workload || VM 2 workload pair with bg: SparkPi; lower is better. Background usage gains: CPU 43%↑, memory 201%↑]

  29. [Results: 95%ile latency degradation (%) for each VM 1 workload || VM 2 workload pair with bg: SparkPi (CPU 43%↑, memory 201%↑) and with bg: KMeans; lower is better]

  30. [Results: 95%ile latency degradation (%) for each VM 1 workload || VM 2 workload pair; lower is better. Background usage gains: bg SparkPi: CPU 43%↑, memory 201%↑; bg KMeans: CPU 34%↑, memory 321%↑]

  31. Lab testbed: 2-vCPU foreground VM, 2-core background container. [Plot: increase in transfer time (%) for Sorting and FFT under Baseline, Heracles, Static 160 Mbps, Static 80 Mbps, and Scavenger; lower is better]

  32. Lab testbed: 2-vCPU foreground VM, 2-core background container. [Plot: increase in transfer time (%) for Sorting and FFT under Baseline, Heracles, Static 160 Mbps, Static 80 Mbps, and Scavenger; lower is better.] Scavenger outperforms the static approaches while affording higher background usage: CPU 37%↑, network 180 Mbps↑.
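The “Static 160 Mbps / 80 Mbps” baselines cap background network bandwidth at a fixed rate. A hedged sketch of how such a static cap might be applied with Linux tc; the interface name and token-bucket parameters are illustrative assumptions, not the talk’s configuration.

```python
import subprocess

def cap_bandwidth(iface: str, rate_mbit: int) -> None:
    """Install a token-bucket filter that caps egress on `iface` to rate_mbit."""
    # Replace any existing root qdisc with a tbf limited to the given rate.
    subprocess.run(["tc", "qdisc", "replace", "dev", iface, "root", "tbf",
                    "rate", f"{rate_mbit}mbit",
                    "burst", "32kbit", "latency", "400ms"], check=True)

# Example: emulate the "Static 160Mbps" baseline on a hypothetical
# background-container interface.
# cap_bandwidth("veth-bg0", 160)
```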

  33. Cloud testbed: 4-vCPU foreground VM, 6-core background DCopy container. [Plot: 95%ile latency normalized to the no-background 95%ile latency, Baseline vs. Scavenger, for xapian, moses, silo, specjbb, masstree, shore, sphinx, and img-dnn; lower is better]

  34. Cloud testbed: 4-vCPU foreground VM, 6-core background DCopy container. [Plot as in slide 33.] Scavenger can successfully and aggressively regulate the bg workload to mitigate its impact on fg performance (bg CPU 3-5%↑).


