2006/5/8 1
IT As Service
Meichun Hsu (许玫君)
HP Labs China
Outline of Talk
− Technology and Economic Trends
− Selected Research at HP Labs
Virtualization
− De-couples resources from consumption
Without Virtualization:
− Application stacks (silos)
− Duplicated infrastructure
With Virtualization:
− Application stacks on virtual resources
− Consolidated infrastructure
Benefits
− … utilized
− … payroll PeopleSoft instances to 1 single global instance of PeopleSoft 8
  − From 41 to 10 dedicated
  − From 0 to 5 utility servers
− … applications have gone from development to production in 6 weeks
− … application infrastructure specific deployment costs
− … space)
[Table: Before vs. Now; values as extracted: 2, 3, 1, 1, 7 TB, 3 TB]
A Case Study
− … software is primary enabler
Before: Existing Network (100 servers)
− 20 infrastructure servers, 80 capacity servers
− 12 computer racks required
− High maintenance requirements (2.5 FTE)
New Consolidation Solution (100 virtual servers)
− 4 ProLiant DL580 servers manage 100 VMs
− 2 MSA1000 SANs host storage
− Low maintenance requirements (1.0 FTE)
Cost item                           Before    New
Up-front hardware & installation    $700k     $165k
Hosting ($1k/month/rack)            $144k     $12k
Hardware support contracts          $100k     $5k
Maintenance FTE                     $200k     $100k
Annual cost                         $444k     $117k
Typical IT Ratios Today:
− One person : 20 servers
− One person : 2 TB storage
Typical IT Ratios NGDC (fewer people, more capacity):
− One person : 200 servers
− One person : 200 TB storage
Sources: IDC; HP analysis
Net result: almost 75% annual cost savings, 60% labor savings, and a 94% reduction in physical hardware (servers/storage)
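The headline percentages of the case study can be re-derived from its own figures. The following is a minimal Python sketch under two assumptions: the annual cost is the sum of hosting, support contracts, and maintenance FTE (the up-front cost is excluded), and the 94% hardware figure counts the 4 servers and 2 SANs as six physical devices against the original 100 servers. Variable names are illustrative only.

```python
# Annual cost items from the case study, in thousands of US dollars.
before = {"hosting": 144, "hw_support": 100, "maintenance_fte": 200}
new = {"hosting": 12, "hw_support": 5, "maintenance_fte": 100}

# Hosting before consolidation: 12 racks at $1k per rack per month.
assert 12 * 1 * 12 == before["hosting"]

annual_before = sum(before.values())  # 444
annual_new = sum(new.values())        # 117

# Fractional savings on annual cost (a little under 75%).
cost_savings = 1 - annual_new / annual_before

# Labor: 2.5 FTE before, 1.0 FTE after.
labor_savings = 1 - 1.0 / 2.5

# Assumption: 4 servers + 2 SANs counted as 6 devices versus 100 servers.
hardware_reduction = 1 - (4 + 2) / 100

print(f"annual cost: ${annual_before}k -> ${annual_new}k")
print(f"cost savings: {cost_savings:.1%}")
print(f"labor savings: {labor_savings:.0%}")
print(f"hardware reduction: {hardware_reduction:.0%}")
```

The computed cost saving comes out slightly below 75%, which matches the "almost 75%" wording.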
Alex Zhang et al.
− Given a set of old servers and their associated workload traces (e.g. CPU utilization time series), how can we “pack” them into a minimum set of new servers?
[Figure: servers before consolidation vs. servers after consolidation, with workloads hosted on VMMs]
[Figure: CPU and memory workload traces over t = 1, 2, 3, 4, …, T for Servers 1 to 5; Servers 1+3+4 = New Server 1, Servers 2+5 = New Server 2]
Probabilistic Capacity Limit
− E.g. 5-minute CPU utilization < 50% to be satisfied with probability 0.995
− Can workloads be proportionally scaled?
− Are workloads additive?
− Are the metrics independent?
− What is the VMM overhead?
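One way to read the probabilistic capacity limit is as a check on the combined trace of the servers placed together. The sketch below assumes that workloads are simply additive, with no VMM overhead and no scaling between old and new hardware (exactly the questions the slide leaves open). The function name and the sample traces are hypothetical.

```python
def meets_probabilistic_limit(traces, limit, prob):
    """Return True if the summed utilization stays below `limit`
    in at least a fraction `prob` of the time periods.

    traces: list of equal-length utilization series, one per old server.
    Assumption: workloads add up linearly on the new server.
    """
    combined = [sum(values) for values in zip(*traces)]
    periods_ok = sum(1 for u in combined if u < limit)
    return periods_ok / len(combined) >= prob


# Hypothetical 4-period CPU traces (fractions of the new server's capacity).
server_1 = [0.10, 0.20, 0.15, 0.30]
server_3 = [0.05, 0.10, 0.20, 0.10]
server_4 = [0.10, 0.10, 0.10, 0.15]
group = [server_1, server_3, server_4]

# The combined load exceeds 50% in one period out of four.
print(meets_probabilistic_limit(group, limit=0.50, prob=0.995))
print(meets_probabilistic_limit(group, limit=0.50, prob=0.75))
```

With real 5-minute traces, T is large enough that a 0.995 goal tolerates a handful of violating periods rather than none.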
The problem input data is … $\alpha(a)$ (such as 0.99) for satisfying the bin capacity (potentially different across performance metrics $a = 1, 2, \ldots, K$).
$$\text{Minimize } Z = \sum_{j=1}^{m} y(j) \tag{1}$$
Subject to:
$$x(i,j) \le y(j), \quad \text{for } i = 1, 2, \ldots, n \text{ and } j = 1, 2, \ldots, m. \tag{2}$$
$$\sum_{j=1}^{m} x(i,j) = 1, \quad \text{for } i = 1, 2, \ldots, n. \tag{3}$$
$$\sum_{i=1}^{n} w(i,a,t) \cdot x(i,j) - M(j,a,t) \cdot v(j,a,t) \le C(a), \quad \text{for } j = 1, 2, \ldots, m,\ a = 1, 2, \ldots, K, \text{ and } t = 1, 2, \ldots, T. \tag{4p}$$
$$x(i,j) = 0 \text{ or } 1 \text{ (binary variable)}. \tag{5}$$
$$0 \le y(j) \le 1 \text{ (continuous variable)}. \tag{6}$$
$$\sum_{t=1}^{T} v(j,a,t) \le [1 - \alpha(a)]\, T, \quad \text{for } j = 1, 2, \ldots, m \text{ and } a = 1, 2, \ldots, K. \tag{7p}$$
$$v(j,a,t) \le y(j), \quad \text{for } j = 1, 2, \ldots, m,\ a = 1, 2, \ldots, K, \text{ and } t = 1, 2, \ldots, T. \tag{8p}$$
$$v(j,a,t) = 0 \text{ or } 1 \text{ (binary variable)}. \tag{9p}$$
− Given: a set of old servers to be packed; bin capacity (on multiple metrics) with probabilistic goals
− High-Dimensional Bin-Packing Formulation
O(n² m T), where
− n is the number of old servers,
− m is the number of metrics (written K in the formulation, where m indexes the new servers),
− T is the number of time periods (m*T is the dimensionality)
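A greedy pass can illustrate how constraints (4p) and (7p) interact. The following is a first-fit-decreasing heuristic, not the method used in the talk and not an exact solver for the integer program. It handles a single metric (K = 1), treats workloads as additive, and accepts a placement when the number of periods above the capacity C stays within (1 − α)T. All names and data are made up for illustration.

```python
def violation_count(load, capacity):
    """Number of periods where the load exceeds capacity (v = 1 in (4p))."""
    return sum(1 for u in load if u > capacity)


def first_fit_decreasing(traces, capacity, alpha):
    """Greedily pack old-server traces into new servers.

    traces: list of equal-length load series, one per old server.
    Returns a list of bins, each a list of old-server indices.
    Note: a server that violates the goal on its own still gets a bin.
    """
    T = len(traces[0])
    allowed = (1 - alpha) * T  # right-hand side of (7p)
    bins = []
    # Place the servers with the highest peaks first.
    order = sorted(range(len(traces)), key=lambda i: max(traces[i]), reverse=True)
    for i in order:
        for b in bins:
            new_load = [u + w for u, w in zip(b["load"], traces[i])]
            if violation_count(new_load, capacity) <= allowed:
                b["members"].append(i)
                b["load"] = new_load
                break
        else:
            # No existing bin can take this server: open a new one.
            bins.append({"members": [i], "load": list(traces[i])})
    return [b["members"] for b in bins]


# Hypothetical CPU traces in percent of a new server, T = 4 periods.
traces = [[30, 20, 10, 40], [10, 20, 30, 10], [40, 40, 40, 40], [5, 5, 5, 5]]
print(first_fit_decreasing(traces, capacity=50, alpha=0.75))
print(first_fit_decreasing(traces, capacity=50, alpha=1.0))
```

Relaxing α from 1.0 to 0.75 lets one period exceed the capacity, which changes where the smallest workload lands in this toy example.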
[Figure: data center utility architecture. Labels: resource pools (physical in this example); mechanisms and workflows; resource pools (virtual in this example); services catalog (virtual resources); server utility; storage utility; application utility; virtual resource pools; virtual system services; capacity planning; mapping; data fabric; datacenter service bus]
The future Data Center is one that
− … infrastructure that provides appropriate resources, that is highly automated
Quartermaster Core
Quartermaster Model Manager(s)
Quartermaster Tools
− Today: 40 1U boxes in a rack in a data center
− creates a 10 kW rack
− cooling is by intuition
Kevin Lai, Lars Rasmusson, Li Zhang, Eytan Adar, and Bernardo Huberman
[Figure: a host running a resource broker / VMM with client VMs]
− Clients (e.g. Linux VMs) share a physical host, each getting a slice of the physical resource
− The host is managed by a resource broker (e.g. embedded in Xen)
[Figure: Host 0 with a client agent, client A app, and client B app]
− … allocation
− … performance if necessary
− … to maximize own utility
− … bids received during last time period
− … based on bids received; a client gets resource proportional to bid $
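The proportional rule stated above (a client gets resource in proportion to its bid) can be sketched in a few lines. This illustrates proportional share only and is not the resource broker's actual code. The function name, client labels, and bid values are hypothetical.

```python
def proportional_share(bids, capacity=1.0):
    """Split `capacity` among clients in proportion to their bids.

    bids: mapping of client name to the bid received in the last period.
    Returns a mapping of client name to its share of the host resource.
    """
    total = sum(bids.values())
    if total == 0:
        # Nobody bid: nothing is allocated in this sketch.
        return {client: 0.0 for client in bids}
    return {client: capacity * bid / total for client, bid in bids.items()}


# Hypothetical bids from two clients sharing one host.
shares = proportional_share({"client_A": 3.0, "client_B": 1.0})
print(shares)
```

A client that raises its bid while others stay fixed gets a larger slice in the next period, which is the lever a client agent uses to trade money for performance.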
− … scene being rendered on cluster
− … different hosts in cluster
− … changing bidding interval
− … in <1s
− Initial bid was $10 … seconds, resulting in being allocated a small % of a server
− Updated bid to $10 … resulting in increase in throughput
http://tycoon.hpl.hp.com/pulse/
George Forman, Jaap Suermondt et al.
[Figure: call-data analysis workflow]
− Only once: Initial Call Data → Explore → Train
− Ongoing - automated: New Call Data → Classify → Quantify → Monthly results
[Figure: weekly chart, 6/20 to 7/25, by issue category: synchronizing, damage, battery, hung, screen issues, network, misc, question]
− An open source implementation of the Web Services Resource Framework, Web Services Notification, and Web Services Distributed Management (Management Using Web Services) family
− Known as Apache WSRF, Pubscribe, and MUSE
− (Just) now reached version 1.0!
− For more information on these projects …
− … hardware
  − Provide secured storage of cryptographic keys
− … Infrastructure (GSI)
  − Enhanced trust between Grid resources and users
[Figure: user proxy, resource proxies, and processes, with TPMs on client and server]
− TPM on server enables remote attestation
− TPM on client protects keys used by user proxy
We are working with the China Grid team to integrate TPM into GSI
− … existing monitoring solutions
− … to monitoring service.
− Focus: data access models and model parser
− … universities and research institutes worldwide
− U.S.
− U.K.
− Israel
− Japan
− India
− China
[Map labels: Palo Alto, Bristol, Israel, Japan, India, China]