JustRunIt: Experiment‐Based Management of Virtualized Data Centers
Wei Zheng Ricardo Bianchini Rutgers University Yoshio Turner Renato Santos John Janakiraman HP Labs
Motivation
Managing a data center is a challenging task
– Resource allocation, evaluation of software/hardware upgrades, capacity planning, etc. – Decisions affect performance, availability, energy consumption
– Models give insight into system behavior – Fast exploration of large parameter spaces
– Consumes a very expensive resource: human labor – Needs to be re‐calibrated and re‐validated as the systems evolve
– Consume a cheaper resource: machine time (and energy) – High fidelity
management of virtualized data centers
results to perform management tasks
– Resource management and hardware/software upgrades – Select the best value for software tunables – Evaluate the correctness of administrator actions
– Case study 1: resource management – Case study 2: hardware upgrades
services
application tier, and a database tier
Agreements), e.g. response time
management flexibility
– No effect on on‐line services – Does not replicate entire service – Almost service‐independent
Assess performance and energy of different configurations
[Figure: on-line system running VMs A1 to A3, W1 to W3, and D1 to D3, with sandbox clones S-A2, S-W2, and S-D2]
[Figure: JustRunIt architecture. Components: Experimenter, Driver, Checker, Interpolator, Management Entity. Labeled flows: Ranges, Param values, Experiment results. A param1 by param2 grid has cells marked X, I, or T]
production system to a sandbox
– VM cloning: Modify Xen live migration to resume original VM instead of destroying it – Storage cloning: LVM copy‐on‐write snapshot for sandbox VM – L2/L3 network address translation: implemented in driver domain netback driver to prevent network address conflict
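The storage-cloning step can be illustrated with a minimal sketch that only builds the `lvcreate` command line for a copy-on-write snapshot. The volume path, snapshot name, and size below are hypothetical placeholders, not values from the slides, and the sketch does not show how JustRunIt itself invokes LVM.

```python
from typing import List


def lvm_snapshot_cmd(origin: str, name: str, size: str) -> List[str]:
    """Build an lvcreate command for a copy-on-write snapshot.

    origin: path of the logical volume backing the production VM (hypothetical).
    name:   name for the snapshot volume used by the sandbox VM (hypothetical).
    size:   space reserved for blocks that diverge from the origin.
    """
    return ["lvcreate", "--snapshot", "--size", size, "--name", name, origin]


# Example with made-up names; pass the list to subprocess.run to execute it.
cmd = lvm_snapshot_cmd("/dev/vg0/vm_disk", "sandbox_snap", "1G")
print(" ".join(cmd))
```

Because the snapshot is copy-on-write, only blocks written after the snapshot consume extra space, which is presumably why it suits a short-lived sandbox.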
changes
– E.g., CPU allocation, frequency
following service tiers
– Application protocol level requests/replies (e.g. HTTP)
[Figure: Tier-N VM and its sandbox VM, placed between an in-proxy and an out-proxy]
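One way to picture the proxies is the sketch below. It assumes that the in-proxy duplicates each request to both the on-line VM and the sandbox VM while returning only the on-line reply, and that the out-proxy records replies from the real next tier and replays them to the sandbox. The class names, method names, and string-based requests are illustrative assumptions; the actual proxies are C programs that parse HTTP, mod_jk, and MySQL traffic.

```python
from typing import Callable, Dict, List, Optional, Tuple

Handler = Callable[[str], str]


class InProxy:
    """Duplicates requests; only the on-line reply goes back to the client."""

    def __init__(self, online: Handler, sandbox: Handler) -> None:
        self.online = online
        self.sandbox = sandbox
        self.mismatches: List[Tuple[str, str, str]] = []

    def handle(self, request: str) -> str:
        online_reply = self.online(request)
        sandbox_reply = self.sandbox(request)  # never returned to the client
        if sandbox_reply != online_reply:
            self.mismatches.append((request, online_reply, sandbox_reply))
        return online_reply


class OutProxy:
    """Records next-tier replies and replays them to the sandbox VM."""

    def __init__(self, next_tier: Handler) -> None:
        self.next_tier = next_tier
        self.recorded: Dict[str, str] = {}

    def forward_online(self, request: str) -> str:
        reply = self.next_tier(request)
        self.recorded[request] = reply
        return reply

    def replay_for_sandbox(self, request: str) -> Optional[str]:
        # The sandbox never reaches the real next tier.
        return self.recorded.get(request)


out_proxy = OutProxy(lambda q: "rows for " + q)
in_proxy = InProxy(
    online=lambda r: out_proxy.forward_online(r),
    sandbox=lambda r: out_proxy.replay_for_sandbox(r) or "",
)
print(in_proxy.handle("SELECT 1"))
```

In this sketch, the mismatch list offers a simple hook for checking that the sandbox behaves like the on-line VM.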
within a time limit
– Cancel experiments if gain for a resource addition falls below a threshold – Cancel experiments for tiers that do not produce the largest gains from a resource addition
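The two cancellation heuristics above can be sketched as a single filter. The function name, the per-tier gain dictionary, and the numeric values are assumptions made for illustration; the slides do not say how gain is measured or what threshold is used.

```python
from typing import Dict, List


def tiers_to_keep(gains: Dict[str, float], threshold: float) -> List[str]:
    """Return the tiers whose experiments should continue.

    gains: improvement each tier showed after the last resource addition
           (hypothetical metric, e.g. fractional response-time reduction).
    threshold: minimum gain worth further experiments.
    """
    if not gains:
        return []
    best = max(gains.values())
    # Heuristic 1: stop entirely once even the best gain is too small.
    if best < threshold:
        return []
    # Heuristic 2: keep only the tier(s) that produced the largest gain.
    return sorted(tier for tier, gain in gains.items() if gain == best)


print(tiers_to_keep({"web": 0.02, "app": 0.30, "db": 0.10}, threshold=0.05))
```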
[Figure: grid of experiments (X) over CPU allocation and CPU frequency, with the (min,min) corner labeled]
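The architecture includes an Interpolator that fills in parameter values for which no experiment was run. A minimal sketch of that idea, assuming plain linear interpolation between two measured points, is shown here; the function name and the sample numbers are hypothetical, and the slides do not state which interpolation method is used.

```python
def interpolate(x: float, x0: float, y0: float, x1: float, y1: float) -> float:
    """Linearly interpolate a result at x between measurements (x0, y0) and (x1, y1)."""
    if x1 == x0:
        raise ValueError("the two measured parameter values must differ")
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)


# Hypothetical measurements: 80 ms at 25% CPU allocation, 40 ms at 75%.
estimate_ms = interpolate(50, 25, 80.0, 75, 40.0)
print(estimate_ms)
```

Each interpolated point saves one experiment, at the cost of assuming the metric varies smoothly between the measured points.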
– The most time‐consuming part is the proxy implementation – Current proxies understand the HTTP, mod_jk, and MySQL protocols – Developed from an open source proxy daemon; each proxy needs 800 to 1500 new lines of C code
lines of C in netback driver
service based on the same protocols
– Case study 1: resource management – Case study 2: hardware upgrades
interconnected with Gbit network
– RUBiS: online auction service modeled after eBay.com – TPC‐W: online book store modeled after Amazon.com
3-tier service with one node per tier; two nodes for proxies Overhead exposed – slight RT degradation, no effect on TP
[Figure: response time and throughput (reqs/s)]
Application server at 400 requests/second (similar results for higher load)
[Figure: Management Entity, Data Center, and JustRunIt, with arrows labeled Change and Result]
possible set of nodes, while satisfying all SLAs
SLA is violated, or when SLA is met by a large margin
determine resource needs
annealing) to minimize number of physical machines and number of VM migrations
load balancing and storage service
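The resource-management loop above can be sketched as two pieces: a trigger that fires when the SLA is violated or met by a large margin, and a simulated-annealing search that packs VMs onto few hosts with few migrations. Everything specific here is an assumption for illustration: the `slack` factor, the cost weights, the linear cooling schedule, and the single-resource capacity model are not taken from the slides.

```python
import math
import random
from typing import List


def needs_replanning(response_ms: float, sla_ms: float, slack: float = 0.5) -> bool:
    """True if the SLA is violated or met by a large margin (hypothetical slack)."""
    return response_ms > sla_ms or response_ms < slack * sla_ms


def cost(assign: List[int], initial: List[int], host_weight: float = 10.0) -> float:
    """Weighted count of hosts in use plus the number of VM migrations."""
    migrations = sum(a != b for a, b in zip(assign, initial))
    return host_weight * len(set(assign)) + migrations


def fits(assign: List[int], demands: List[float], n_hosts: int, capacity: float) -> bool:
    """Check that no host's summed VM demand exceeds its capacity."""
    load = [0.0] * n_hosts
    for vm, host in enumerate(assign):
        load[host] += demands[vm]
    return all(l <= capacity + 1e-9 for l in load)


def anneal(demands: List[float], initial: List[int], n_hosts: int,
           capacity: float = 1.0, steps: int = 5000, t0: float = 5.0,
           seed: int = 0) -> List[int]:
    """Search for a feasible placement with lower cost than the initial one."""
    rng = random.Random(seed)
    current, best = list(initial), list(initial)
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-6  # linear cooling
        cand = list(current)
        cand[rng.randrange(len(cand))] = rng.randrange(n_hosts)  # move one VM
        if not fits(cand, demands, n_hosts, capacity):
            continue
        delta = cost(cand, initial) - cost(current, initial)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = cand
            if cost(current, initial) < cost(best, initial):
                best = list(current)
    return best


demands = [0.3, 0.3, 0.3, 0.3]   # hypothetical per-VM resource needs
initial = [0, 1, 2, 3]           # one VM per host to start
if needs_replanning(response_ms=10, sla_ms=50):
    placement = anneal(demands, initial, n_hosts=4)
    print(placement, cost(placement, initial))
```

In the slides, the per-VM resource needs come from JustRunIt experiments; here they are simply hard-coded inputs.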
4 services on 11 nodes; SLA = 50 ms; increase load on S0; run 3 experiments for 3 minutes
[Figure: timelines for JRI (violating SLA, running experiments, migrating) and Modeling (violating SLA, solving model, migrating)]
data centers [Stewart’05, Stewart’08, Padala’07, Padala’09, Cohen’04]
[Nagaraja’04, Tan’05, Oliveira’06]
management systems
– Tier interactions – Different workload mix – Build proxies for a database server