SOFT CONTAINER
TOWARDS 100% RESOURCE UTILIZATION
ACCELA ZHAO, LAYNE PENG
WHO ARE THOSE GUYS
Accela Zhao, Technologist at EMC OCTO, active OpenStack community contributor, experienced in cloud scheduling and container technologies. Mail: accela.zhao@emc.com
Layne Peng, Principal Technologist at EMC OCTO, experienced cloud architect, one of the earliest contributors to Cloud Foundry in China, 9 patents. Mail: layne.peng@emc.com Twitter: @layne_peng
WHAT IS RESOURCE UTILIZATION?
This is what we buy
This is what we use
The gap between them: $$$ wasted
ENERGY AND RESOURCE UTILIZATION
Energy-related costs are 42% of the total (including buying new machines). An idle server consumes as much as 70% of the energy it uses at full speed.
Low resource utilization is energy inefficient: wasted energy is wasted money.
Real world resource utilization is usually low: around 20% or less
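To make the energy argument concrete, here is a minimal sketch in Python. It assumes a linear power model between idle and full load, which is an assumption of this example and not something the slides state; the function names are hypothetical. The 70% idle figure and the 20% utilization figure are the ones quoted above.

```python
def relative_power(utilization, idle_fraction=0.7):
    """Power draw as a fraction of peak, under an ASSUMED linear model."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_fraction + (1.0 - idle_fraction) * utilization


def energy_per_unit_work(utilization, idle_fraction=0.7):
    """Energy spent per unit of useful work, relative to a fully busy server."""
    if utilization <= 0.0:
        raise ValueError("utilization must be positive")
    return relative_power(utilization, idle_fraction) / utilization


if __name__ == "__main__":
    for u in (0.2, 0.5, 1.0):
        print(f"utilization {u:.0%}: {energy_per_unit_work(u):.2f}x energy per unit of work")
```

Under these assumptions, a server at 20% utilization spends about 3.8 times as much energy per unit of work as a fully utilized one.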
A CLOSER LOOK AT CLOUD
The key advantage: cloud consolidation. Fewer machines, more apps. Energy-efficient and saves money. Improved resource utilization.
Scheduler: placement when the app starts
– Examples: Green Cloud, Paragon. And the schedulers in OpenStack, Kubernetes, Mesos, …
Migration: placement while the app is running
– Examples: OpenStack Watcher, VMware DRS
Soft Container: resource constraints in response to co-located apps
– Related: Google Heracles
RESOURCE UTILIZATION ON CLOUD
– Scheduler: manages resource utilization at app kick-off
– Migration: manages resource utilization across hosts while the app is running
– Soft Container: manages resource utilization at fine granularity inside the host
RESOURCE UTILIZATION ON CLOUD
A battle between putting more apps on each host and guaranteeing app SLAs. The key problem: resource interference
– Apps co-located on one host share resources like CPU, cache, memory, …
– They interfere with each other, resulting in poor performance compared to running standalone
– Resource interference makes SLAs unenforceable
– Google Heracles: an analysis of resource interference
– Paragon: resource interference-aware scheduling
– Bubble-up: a way to measure resource interference
THE KEY PROBLEM: RESOURCE INTERFERENCE
RESOURCE INTERFERENCE: HOW DOES IT LOOK?
MySQL standalone running vs co-located with a CPU & disk hungry task
– The setup
– App tolerated resource interference
– App caused resource interference
– Better resource utilization management – Scheduling, Migration, Soft Container, …
RESOURCE INTERFERENCE: HOW TO MEASURE?
MySQL standalone running, vs co-located with CPU stress, vs disk stress. In my case, MySQL is much more sensitive to CPU interference.
– Increase resource utilization by co-locating more apps
– Respond to the dynamic nature of time-varying workload
– Guarantee the SLA of critical apps
– Resource control and isolation of interference – Respond to dynamic workload change
INTRODUCING SOFT CONTAINER
– Varying a container's resources based upon its neighbors and SLAs. (The container becomes elastic)
– "Expanding" (bubbling up) resources when idle resources exist
– Shrinking the resources of a specific container when another, critical app demands more resources
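The expand and shrink behaviour described above can be sketched as a single control step. This is only an illustration under stated assumptions, not the Soft Container algorithm itself: the `Container` fields, the `adjust_limits` name, and the `step`, `floor`, and `pressure` parameters are all hypothetical, and CPU cores stand in for any limited resource.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    critical: bool  # True for apps whose SLA must be guarded
    limit: float    # current CPU limit, in cores
    usage: float    # measured CPU usage, in cores


def adjust_limits(containers, host_cores, step=0.5, floor=0.5, pressure=0.9):
    """Return new {name: limit} after one control step (hypothetical policy)."""
    limits = {c.name: c.limit for c in containers}
    idle = max(0.0, host_cores - sum(limits.values()))
    best_effort = [c for c in containers if not c.critical]
    # A critical app pressing against its limit is treated as demanding more.
    starved = [c for c in containers
               if c.critical and c.usage >= pressure * c.limit]

    for c in starved:
        need = step
        grant = min(idle, need)  # hand out idle capacity first
        idle -= grant
        need -= grant
        for n in best_effort:    # then shrink non-critical neighbors
            if need <= 0:
                break
            take = min(need, max(0.0, limits[n.name] - floor))
            limits[n.name] -= take
            need -= take
            grant += take
        limits[c.name] += grant

    if not starved and best_effort and idle > 0:
        # Nobody critical is under pressure: let the bubbles expand.
        share = min(step, idle / len(best_effort))
        for n in best_effort:
            limits[n.name] += share
    return limits
```

Calling `adjust_limits` repeatedly with fresh usage numbers gives the elastic behaviour: non-critical containers grow into idle capacity and give it back as soon as a critical app needs it.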
[Figure: container resource bubble; axes: Time, Resource]
THE FEEDBACK CONTROL LOOP
[Figure: feedback control loop; labels: Controller, Watcher, Limiter, Containers, Soft Container]
RESOURCES TO LIMIT
– Core – Time Quota – …
– Size – Bandwidth – …
– IOPS – Throughput – …
RESOURCES TO LIMIT - MISSING
– Core – Time Quota – …
– LLC – …
– Size – Bandwidth* – …
– …
– …
– Ulimit – Bandwidth – …
…
– IOPS – Throughput – …
As of kernel 3.6, support for most of these can be found in the community…
ISOLATE THE RESOURCES - NAMESPACE
/proc/<pid>/ns:
We are still waiting…
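As a small illustration of the `/proc/<pid>/ns` directory named above, the following sketch lists the namespaces a process belongs to. The function names and the `proc_root` parameter are hypothetical, added so the code can be pointed at a test directory; on a real Linux host the default `/proc` applies.

```python
import os


def list_namespaces(pid="self", proc_root="/proc"):
    """Map namespace name (e.g. 'net', 'pid') to its link target.

    Each entry in /proc/<pid>/ns is a symlink; two processes share a
    namespace when the targets of the same entry are identical.
    """
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}


def shares_namespace(pid_a, pid_b, kind, proc_root="/proc"):
    """True if both processes are in the same namespace of the given kind."""
    a = list_namespaces(pid_a, proc_root)
    b = list_namespaces(pid_b, proc_root)
    return kind in a and a.get(kind) == b.get(kind)


if __name__ == "__main__":
    for name, target in list_namespaces().items():
        print(f"{name:10s} {target}")
```

Reading another user's process entries typically requires elevated privileges.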
LIMIT THE RESOURCE - CGROUP
Task, Control Group & Hierarchy Subsystem – Control options
Create a cgroup subsystem Change the limitation… Usage
# echo 524288000 > /sys/fs/cgroup/memory/foo/memory.limit_in_bytes
MISSING - NETWORK
Isolation does not mean resource control.
Suppose two containers share a machine with 100Gbps of total bandwidth.
[Figure: per-container bandwidth; labels: 10, 80, 95, 100Gbps]
If the GREEN container consumes the majority of the bandwidth, it may have a negative impact on the BLUE one… How can we keep this from happening?
MISSING - NETWORK
Community attempts:
Based on Traffic Control (tc)
Nightmare of the PaaS providers…
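Shapers in tc, such as the token bucket filter (tbf), are built on the token-bucket idea. The sketch below illustrates that idea only; it is not how tc is implemented or configured, and the class name, parameters, and units are assumptions made for this example.

```python
class TokenBucket:
    """Minimal token bucket: `rate` tokens per second, at most `burst` stored."""

    def __init__(self, rate, burst):
        self.rate = float(rate)     # refill speed, e.g. bytes per second
        self.burst = float(burst)   # bucket capacity, e.g. bytes
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # timestamp of the last refill

    def allow(self, size, now):
        """Return True and spend tokens if a packet of `size` may pass at `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # over the limit: the caller would queue or drop
```

Giving each container its own bucket caps the bandwidth a greedy neighbor can take, which is the kind of control that isolation alone does not provide.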
MISSING - GPU
Nvidia’s efforts:
MISSING - CACHE
Intel’s efforts:
Cache Monitoring Technology (CMT)
MISSING – MEMORY BANDWIDTH
Monitor
Memory Bandwidth Monitoring (MBM)
Control
Ref: Memory Bandwidth Management for Efficient Performance Isolation in Multi-core Platform: http://pertsserver.cs.uiuc.edu/~mcaccamo/papers/private/IEEE_TC_journal_submitted_C.pdf
Code: https://github.com/heechul/memguard
– App request latency – Disk IO await – Network response time
– CPU load average – Disk request queue size – Network queue length
– CPU util rate – Disk util rate – Network util rate
WATCH THE WORKLOAD CHANGE
– DRAM bandwidth – CPU bandwidth – Disk bandwidth
– App request count – Disk IOPS / req/s – Network IOPS / req/s
– Global level – Per container level
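As one example of such a metric, the global CPU utilization rate can be derived from two samples of the aggregate `cpu` line in `/proc/stat`. This is a minimal sketch, not the Watcher's actual code; the function names are hypothetical, and counting iowait as idle time is a choice made for this example.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into jiffy counters.

    Returned order: user, nice, system, idle, iowait, irq, softirq, steal.
    Guest counters are skipped because they are already included in user time.
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:9]]


def cpu_util_rate(before, after):
    """Fraction of time the CPU was busy between two samples (0.0 to 1.0)."""
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = deltas[3] + deltas[4]  # idle + iowait
    return (total - idle) / total
```

On a Linux host, the two samples would come from reading the first line of `/proc/stat` twice, a short interval apart.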
THE FEEDBACK CONTROL LOOP
[Figure: feedback control loop; labels: Controller, Watcher, Limiter, Containers, Soft Container]
Immediate response
How can we resize the containers immediately?
HOW DO WE LOOK AT RESIZE?
a. Create a new container;
b. Live migrate the contents to the new container:
   1. Transfer existing data to the new container;
   2. Transfer the instant data to the new container.
c. Stop the old container
d. Start the new container
e. Route the traffic to the new container
IN CONTAINER'S WORLD…
9527 /usr/sbin/httpd
Control Groups (cgroup):
We need to take a fresh look at resource management from the container's perspective.
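One reading of this point is that a running container can be resized in place by rewriting its cgroup limit, with no migration step. The sketch below assumes the cgroup v1 memory layout used by the `echo` example earlier (`memory.limit_in_bytes` under `/sys/fs/cgroup/memory/<group>`); on cgroup v2 the file is `memory.max`. The function names and the `cgroup_root` parameter are hypothetical.

```python
import os


def limit_path(group, cgroup_root="/sys/fs/cgroup/memory"):
    """Path of the memory limit file for a cgroup (cgroup v1 layout assumed)."""
    return os.path.join(cgroup_root, group, "memory.limit_in_bytes")


def resize_memory_limit(group, limit_bytes, cgroup_root="/sys/fs/cgroup/memory"):
    """Change the memory limit of a live cgroup by rewriting its limit file."""
    if limit_bytes <= 0:
        raise ValueError("limit must be positive")
    with open(limit_path(group, cgroup_root), "w") as f:
        f.write(str(int(limit_bytes)))


def read_memory_limit(group, cgroup_root="/sys/fs/cgroup/memory"):
    """Read back the current limit, in bytes."""
    with open(limit_path(group, cgroup_root)) as f:
        return int(f.read().strip())
```

With root privileges, `resize_memory_limit("foo", 524288000)` would have the same effect as the `echo` command shown earlier, and the processes in the group keep running while the limit changes.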
SOFT CONTAINER: IMPLEMENTATION
– Controller: Algorithm "expand", Algorithm "pin_idle", Algorithm plugin N
– Watcher: CPU plugin, Disk plugin, Watcher plugin N
– Limiter: RunC plugin, Docker plugin, Limiter plugin N
– Metrics Store: CPU statistics, Disk …, More …
– Container Repo: RunC plugin, Docker plugin, Container type N
– Containers: Auto discovery
SOFT CONTAINER: CURRENT STATUS
Completely runnable!
Demo Time :-)
BENCHMARK RESULTS: BEFORE
If uncontrolled, the MySQL workload suffers severe interference from a co-located low-priority task
BENCHMARK RESULTS: BEFORE
CPU utilization is far from saturation as the workload varies over time (although in my case, disk IO is highly utilized)
BENCHMARK RESULTS: SOFT CONTAINER
With Soft Container (green line), latency impact is controlled. (We can improve the algorithm to cope better with peak workload)
BENCHMARK RESULTS: SOFT CONTAINER
Soft Container helps improve CPU utilization by co-locating new tasks with MySQL
BENCHMARK RESULTS: SOFT CONTAINER
CPU utilization looks close to saturation, after adding in iowait time
to guard app SLA and balance resource utilization
HOW DID SOFT CONTAINER DO THIS?
BENCHMARK RESULTS: SOFT CONTAINER
How the resource bubble floats under the control of Soft Container. (The vibration thresholds are made very sensitive to workload changes)