Outline ! Control-theoretic Framework ! Service delay control on Web - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Outline

• Control-theoretic Framework
• Service delay control on Web servers
• On-line data migration in storage servers
• ControlWare: adaptive QoS control middleware

slide-2
SLIDE 2

Online Data Migration in Storage Systems

• Enterprise storage servers need to move data
  - System expansion
  - Application changes
• Always-on operation (e-business, global data centers) → online data migration

[Figure labels: E-mail server; DB …; I/Os; Storage system; data migration; New device]

slide-3
SLIDE 3

State of Practice

[Figure labels: Script; Migration plan; Submover; HP-UX LVM; SAN; E-mail server; DB…; Storage system; storage devices; data migration; New device]

• Slow I/Os! Need to bound the impact on applications.

slide-4
SLIDE 4

The Problem

• Execute a given migration plan on-line
• Challenges
  - Keep data consistent
  - Bound impact on application performance
  - Complete migration quickly

slide-5
SLIDE 5

Adaptive solution

• Feedback control loop: adapts migration speed based on application I/O latency
  - Enforce latency contract: bounded average I/O latency
  - Complete migration in the shortest time allowed by the contract
• Standard control-theoretic design
  - Systematic methodology
  - Robust, analytically proven performance
• Handles different workloads and devices

slide-6
SLIDE 6

Aqueduct

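[Figure: Aqueduct control loop. The Monitor reports the per-store latencies {Li(k)}, the Controller compares them with the application latency contracts {LCi} and outputs the submove rate Rm(k), and the Actuator applies that rate to the Aqueduct migration executor. Other labels: Migration plan; Submover; HP-UX LVM; SAN; E-mail server; DB…; I/Os; Storage system; storage devices; data migration]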

slide-7
SLIDE 7

Monitor

• Measures the applications' average I/O latency on each store over the last sampling window
  - Current implementation: the trace replayer directly monitors I/O latencies
  - Can interface with performance monitoring tools (HP OpenView)
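The per-store averaging performed by the monitor could be sketched as follows. This is only an illustration: the function name `average_latency_per_store`, the `(store_id, latency_ms)` sample format, and the sample values are assumptions, not part of Aqueduct.

```python
from collections import defaultdict


def average_latency_per_store(samples):
    """Average the I/O latencies observed in one sampling window, per store.

    `samples` is an iterable of (store_id, latency_ms) pairs (hypothetical format).
    Returns a dict mapping each store id to its mean latency in milliseconds.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for store_id, latency_ms in samples:
        totals[store_id] += latency_ms
        counts[store_id] += 1
    return {store: totals[store] / counts[store] for store in totals}


# Made-up samples for one sampling window.
window = [("store0", 4.0), ("store0", 6.0), ("store1", 10.0)]
print(average_latency_per_store(window))
```

The resulting dictionary plays the role of {Li(k)} for sampling window k.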

slide-8
SLIDE 8

Actuator

• Fine-grained control of migration speed using HP-UX LVM
  - Divide each store into small (32 MB) substores (LVs)
  - Submover moves a substore using LVM silvering
  - Actuator enforces a submove rate by sleeping

[Figure: a submove (mirror, silvering, split) and timelines of submoves interleaved with sleeps across two sampling windows, at 1 and 2 submoves per sampling window (submv/sw)]
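One way to enforce a submove rate by sleeping is to give each submove an equal time slot within the sampling window and sleep for whatever is left of the slot. The sketch below assumes this even pacing and an integer rate of at least 1; the slides do not say how Aqueduct spaces its sleeps, and `run_window` and its parameters are hypothetical names.

```python
import time


def run_window(rate, window_s, do_submove, sleep=time.sleep, clock=time.monotonic):
    """Run `rate` submoves, evenly paced across one sampling window.

    rate       -- submoves per sampling window (integer >= 1, assumed)
    window_s   -- sampling window length in seconds
    do_submove -- callable that moves one substore (placeholder)
    """
    slot = window_s / rate  # time budget for one submove plus its sleep
    for _ in range(rate):
        start = clock()
        do_submove()
        remaining = slot - (clock() - start)
        if remaining > 0:
            sleep(remaining)  # throttle: idle until the slot ends


# Tiny demo with a no-op submove and a 20 ms window.
run_window(rate=2, window_s=0.02, do_submove=lambda: None)
```

Injecting `sleep` and `clock` keeps the pacing logic testable without real delays.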

slide-9
SLIDE 9

Controller

• Compute the error for each store i:
    Ei(k) = P*LCi - Li(k)
  - 0 < P < 1: safety margin, related to burstiness
  - k: index of the sampling window
• Compute the worst error:
    Emin(k) = min{Ei(k)}
• Integral controller computes the new submove rate:
    Rm(k) = Rm(k-1) + K*Emin(k)
  - Control gain K: aggressiveness of rate change
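The three controller steps above translate almost directly into code. The sketch below is a minimal illustration; the function name, the dictionary inputs, and the example numbers are assumptions. The lower bound of 1 on the rate reflects the design constraint stated on the submove-rate slide, though where Aqueduct applies it is not specified here.

```python
def next_submove_rate(prev_rate, contracts, latencies, K, P, min_rate=1.0):
    """One step of the integral controller.

    prev_rate -- Rm(k-1), the previous submove rate
    contracts -- {store: LCi}, latency contract per store (ms)
    latencies -- {store: Li(k)}, measured average latency per store (ms)
    K         -- control gain
    P         -- safety margin, 0 < P < 1
    min_rate  -- assumed lower bound on the submove rate
    """
    # Ei(k) = P*LCi - Li(k)
    errors = [P * contracts[s] - latencies[s] for s in contracts]
    # Emin(k): the store closest to (or furthest beyond) its contract
    e_min = min(errors)
    # Rm(k) = Rm(k-1) + K*Emin(k), clamped at the minimum rate
    return max(min_rate, prev_rate + K * e_min)


# Made-up numbers: store "b" exceeds 0.8*LC, so the rate is reduced.
rate = next_submove_rate(
    prev_rate=3.0,
    contracts={"a": 10.0, "b": 10.0},
    latencies={"a": 6.0, "b": 9.0},
    K=0.5,
    P=0.8,
)
print(rate)
```

Using the minimum error means the most stressed store alone decides whether migration speeds up or slows down.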

slide-10
SLIDE 10

Tuning controller parameters

• Design goals
  - Stability
  - Tracking: VL(k) = P*LC in steady state
  - Settling time
• Design steps
  - Approximate linear model: VL(k+1) - VL(k) = G*(Rm(k) - Rm(k-1))
  - System profiling: estimate G
  - Construct transfer function
  - Control analysis
  - Compute K to satisfy the design goals
• Definitions
  - Process gain G: impact of submove rate on victim latency
  - Victim latency VL(k): highest average latency among all stores in the kth sampling window
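As a worked illustration of how the linear model and the integral control law combine, the closed loop can be derived by hand. This sketch assumes a single latency contract LC shared by all stores, so that the worst error equals P*LC - VL(k), and it ignores any lower bound on the submove rate.

```latex
\begin{align*}
E_{\min}(k) &= P \cdot LC - VL(k) \\
R_m(k) - R_m(k-1) &= K \, E_{\min}(k) \\
VL(k+1) - VL(k) &= G \bigl( R_m(k) - R_m(k-1) \bigr)
                 = G K \bigl( P \cdot LC - VL(k) \bigr) \\
VL(k+1) &= (1 - GK)\, VL(k) + GK \cdot P \cdot LC
\end{align*}
```

Under these assumptions the tracking error e(k) = P*LC - VL(k) obeys e(k+1) = (1 - GK) e(k). It decays to zero exactly when |1 - GK| < 1, that is 0 < GK < 2, in which case VL(k) settles at P*LC; the closer GK is to 1, the shorter the settling time.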

slide-11
SLIDE 11

Experimental setup

• Enterprise-scale storage server
  - HP 9000-N4000 server, 8 440 MHz processors
  - FC-60 disk array (1.05 TB, 5 RAID5 logical units)
  - HP-UX 11 and LVM
  - Fibre Channel

[Figure labels: Aqueduct; Openmail I/O trace; LU0; LUnew; emails; metadata]

slide-12
SLIDE 12

Experiments

• Baselines: no sleeping between (sub)moves
  - Whole-store: move one store at a time
  - Sub-store: move one substore at a time
• Constant: steady Poisson streams
  - Replace a logical unit; migrate three 640 MB stores
• Openmail: trace of an enterprise e-mail server running HP Openmail
  - Add a logical unit; migrate a 1854 MB store and a 96 MB store

slide-13
SLIDE 13

Measure G → Tune K

[Figure: average victim latency (ms) versus submove rate for the Openmail and synthetic workloads, each with a linear fit: y = 1.41x + 5.80 (R² = 0.98) and y = 1.12x + 7.55 (R² = 0.99)]

• Process gain G: the slope of the curves
• Control gain K
  - Constant: K = 1.09
  - Openmail: K = 0.36
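The profiling step, estimating the process gain G as the slope of victim latency against submove rate, could be sketched with an ordinary least-squares fit. The data points below are made up for illustration and are not the measurements from the slide; `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical profiling run: fixed submove rates and the average
# victim latency (ms) observed at each rate.
rates = [1, 2, 3, 4, 5]
victim_latency_ms = [3.0, 5.0, 7.0, 9.0, 11.0]

G, offset = fit_line(rates, victim_latency_ms)
print(G, offset)  # G is the estimated process gain (ms per unit of submove rate)
```

The slope is then used as G when choosing the control gain K.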

slide-14
SLIDE 14

Openmail: victim latency

[Figure: average victim latency (ms) for Aqueduct, Sub-store, and Whole-store, with reference lines at LC and 0.8*LC]

slide-15
SLIDE 15

Openmail: latency

[Figure: latency (ms) versus time (min) for big0 under Aqueduct, Sub-store, and Whole-store, with LC marked]

• Aqueduct is uniformly better than the baselines, but …

slide-16
SLIDE 16

Openmail: latency & submove rate

[Figure: latency (ms) of big0 and submove rate (submv/min) versus time (min)]

• Load is highest on the new LU towards the end of the migration
• By design, the submove rate must be 1 or higher → the controller is working correctly

slide-17
SLIDE 17

Openmail: average latency

[Figure: average latency for Aqueduct, Sub-store, and Whole-store, with LC marked]

slide-18
SLIDE 18

Openmail: latency CDF

[Figure: CDF of request latency (us, axis from 100 to 100000) for Whole-store, Aqueduct, Sub-store, and No-Migration; annotated values: 76% and 91%]

slide-19
SLIDE 19

Related work

• Simpler versions of the problem
  - Take (parts of) the system offline
  - Migrate data in “quiet periods”
• Silvering in Logical Volume Manager [HP-UX LVM, VxVM]: maintains data consistency, no QoS guarantees
• Proportional I/O scheduling: hard to handle unpredictability
• MS Manners: no guarantee to important tasks
• Control-theory-based: distributed visual tracking, Web servers, e-mail server, database, real-time processor scheduling ...

slide-20
SLIDE 20

Summary

• Migration must be executed adaptively
• Aqueduct is neither overly aggressive
  - Average I/O latency reduced by 76%
  - Contract violation ratio reduced by 78%
• nor overly conservative
  - Average victim latency 15% lower than latency contract
• Future
  - More detailed sensitivity analysis
  - Self-tuning controller
  - Multi-dimensional QoS contracts

slide-21
SLIDE 21

References

• C. Lu, G. A. Alvarez, J. Wilkes, Aqueduct: Online Data Migration with Performance Guarantees, USENIX Conference on File and Storage Technologies (FAST), 2002.
• G. A. Alvarez, C. Lu, J. Wilkes, Method and System for Online Data Migration on Storage Systems with Performance Guarantees, U.S. Patent 7,167,965, January 2007.

slide-22
SLIDE 22

Outline

• Control-theoretic Framework
• Service delay control on Web servers
• On-line data migration in storage servers
• ControlWare: adaptive QoS control middleware

slide-23
SLIDE 23

Adaptive QoS Control Framework

[Figure labels: QoS Guarantee; QoS Mapping; Control Loop Architecture; System Identification; Dynamic Model; Dynamic Response Specs; Controller Design; Controllers; QoS Control Software]

slide-24
SLIDE 24

ControlWare

Isolate programmers from control-theoretic concerns

[Figure labels: QoS contract; QoS Mapper; System ID; Controller Design; Control Configuration; Control Loop Composition; Software QoS Control Loops; ControlWare Library; Controllers; Monitors; Actuators; SoftBus]

slide-25
SLIDE 25

ControlWare: Reference

• Case studies on Squid Web cache and Apache
• R. Zhang, C. Lu, T. F. Abdelzaher, J. A. Stankovic, ControlWare: A Middleware Architecture for Feedback Control of Software Performance, ICDCS, 2002.

slide-26
SLIDE 26

Control-theoretic QoS Framework

• Map QoS guarantees to feedback control loops
• Establish difference equation models for computing systems via system identification
• Build practical QoS control systems
  - Apache Web server
  - Enterprise storage server
  - Avionics image transmission
• Develop middleware for deploying QoS control
  - FCS/nORB, FC-ORB: distributed real-time embedded systems
  - ControlWare: Internet servers