Benchmarking for Power and Performance
Heather Hanson (UT-Austin) Karthick Rajamani (IBM/ARL) Juan Rubio (IBM/ARL) Soraya Ghiasi (IBM/ARL) Freeman Rawson (IBM/ARL)
January 20, 2007
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
WASHINGTON, D.C. 20460
OFFICE OF AIR AND RADIATION
December 28, 2006
Dear Enterprise Server Manufacturer or Other Interested Stakeholder,
“The purpose of this letter is to inform you that the U.S. Environmental Protection Agency (EPA) is initiating its process to develop an ENERGY STAR specification for enterprise computer servers. In the coming months, EPA will conduct an analysis to determine whether such a specification for servers is viable given current market dynamics, the availability and performance of energy-efficient designs, and the potential energy savings…”
Measurement Protocol
– http://www.energystar.gov/ia/products/d
– Recommendation is that system vendors provide curves showing power consumption under different loads
reaches 0%
– Expect consumers to use this curve to estimate their own overall energy consumption
point on the curve to get costs
– End-to-end numbers can give the wrong results
– Averaged utilization doesn’t necessarily correlate to average power
used?
Figure 1 from the EPA Server Energy Measurement Protocol document
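The point that averaged utilization does not necessarily correlate to average power can be illustrated with a small sketch. The curve values below are hypothetical and are not taken from the EPA document; `power_at` and `energy_kwh` are illustrative helper names, and linear interpolation between the published points is an assumption.

```python
from bisect import bisect_right

# Hypothetical vendor curve: (utilization %, power in watts).
# These numbers are illustrative only, not measured data.
CURVE = [(0, 200.0), (25, 230.0), (50, 250.0), (75, 265.0), (100, 275.0)]

def power_at(util, curve=CURVE):
    """Linearly interpolate power (W) at a utilization given in percent."""
    xs = [u for u, _ in curve]
    if util <= xs[0]:
        return curve[0][1]
    if util >= xs[-1]:
        return curve[-1][1]
    i = bisect_right(xs, util)
    (u0, p0), (u1, p1) = curve[i - 1], curve[i]
    return p0 + (p1 - p0) * (util - u0) / (u1 - u0)

def energy_kwh(util_trace, interval_hours=1.0, curve=CURVE):
    """Sum energy over a trace with one utilization sample per interval."""
    watts = sum(power_at(u, curve) for u in util_trace)
    return watts * interval_hours / 1000.0

# Three idle hours followed by one fully loaded hour.
trace = [0, 0, 0, 100]
avg_util = sum(trace) / len(trace)
power_of_avg = power_at(avg_util)                          # power read at the average utilization
avg_power = sum(power_at(u) for u in trace) / len(trace)   # true average power over the trace

print(avg_util, power_of_avg, avg_power, energy_kwh(trace))
```

With this hypothetical curve, reading the curve at the average utilization overstates the average power, so a consumer who uses only an end-to-end utilization figure would misestimate energy cost. Walking the curve sample by sample, as `energy_kwh` does, avoids that.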
Modified to show the behavior of a power-managed system
Sample aggressive management savings
– Power and thermal management techniques have significant implications for system performance.
experimentally evaluate new ideas.
the power/performance benchmarking issues.
power/performance research.
– Variability and its effect on system power management
– Collecting correlated power and performance data
features distinct from traditional performance benchmarking.
– Workload variation in system utilization
– Workload variation in program characteristics
– How quickly does the system respond to changing characteristics?
– How well does the system respond to changing characteristics?
– Specifically for power-managed systems, how well does the system
– “Identical” components are not actually identical in power
exploited to reduce power consumption without harming performance unduly?
– DVFS is the most typical solution
response time criterion
– Other techniques can be employed instead
– Different techniques will give different results depending on the system design and workload
Workload A while CPU Packing provides better results for System 2 running Workload B
time varying nature of system utilization.
– Any new power/performance benchmarks should capture this behavior.
– Response to time-varying behavior is a key feature
[Figure: Utilization (%) versus Time (minutes) for CPU 0, CPU 1, CPU 2 and CPU 3; vertical axis from 50 to 400, horizontal axis from 1 to 56.]
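As a rough illustration of exploiting utilization slack with DVFS, the sketch below picks the lowest frequency that keeps projected utilization under a ceiling. Everything here is an assumption made for illustration: the frequency steps, the `pick_frequency` name, the use of a utilization ceiling as a stand-in for a response-time criterion, and the simplification that utilization scales inversely with frequency (true only for CPU-bound work).

```python
# Hypothetical frequency steps in GHz; real processors expose their own set.
FREQS_GHZ = [1.0, 1.5, 2.0, 2.5]

def pick_frequency(util_at_max, target_util=85.0, freqs=FREQS_GHZ):
    """Return the lowest frequency whose projected utilization stays at or
    below target_util.

    util_at_max is the utilization (%) observed at the highest frequency.
    Assumes CPU-bound work, so utilization scales as f_max / f.
    """
    f_max = max(freqs)
    for f in sorted(freqs):
        projected = util_at_max * f_max / f
        if projected <= target_util:
            return f
    # No step satisfies the ceiling, so stay at full speed.
    return f_max

for util in (30, 50, 80, 95):
    print(util, pick_frequency(util))
```

A policy like this reacts only to utilization, which is why a benchmark with time-varying load is needed to see how quickly and how well it tracks changes.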
– Let max permissible utilization = 85% per processor (340% on previous graph)
– 10% of the max permissible utilization is 34%
management methods are.
– The total number of transactions is the same in the 3 cases below.
– A distribution centered on the average time is probably the most realistic option
certain techniques are responsive enough both in the rapidity of response and the range of response.
[Figure: Different Transaction Injection Types. Transactions versus time (Seconds) for three series: Fixed Injection Rate, Scaled Injection Rate and Normally Distributed Injection Rate.]
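The three injection types can be sketched as per-second transaction counts that all sum to the same total. This is only one possible reading of the slide: the linear ramp used for the "scaled" case, the bell curve over the window for the "normally distributed" case, and all function names and parameters are assumptions, since the original figure data is not recoverable.

```python
import math

def scale_to_total(weights, total):
    """Turn non-negative weights into integer counts that sum exactly to
    total, using largest-remainder rounding."""
    s = sum(weights)
    raw = [w * total / s for w in weights]
    counts = [int(math.floor(r)) for r in raw]
    remainder = total - sum(counts)
    # Hand the leftover transactions to the slots with the largest fractions.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:remainder]:
        counts[i] += 1
    return counts

def fixed_rate(total, seconds):
    """Same number of transactions in every second (evenly split)."""
    return scale_to_total([1.0] * seconds, total)

def scaled_rate(total, seconds):
    """Linearly increasing rate; a hypothetical reading of 'scaled'."""
    return scale_to_total([i + 1 for i in range(seconds)], total)

def normal_rate(total, seconds, sigma_fraction=0.15):
    """Bell-shaped rate centered on the middle of the window (assumed shape)."""
    mean = (seconds - 1) / 2
    sigma = sigma_fraction * seconds
    weights = [math.exp(-0.5 * ((i - mean) / sigma) ** 2) for i in range(seconds)]
    return scale_to_total(weights, total)

total, seconds = 360, 60
profiles = {
    "fixed": fixed_rate(total, seconds),
    "scaled": scaled_rate(total, seconds),
    "normal": normal_rate(total, seconds),
}
for name, counts in profiles.items():
    print(name, sum(counts), max(counts))
```

Holding the total constant isolates the effect of the injection pattern: any difference in measured power or response time then reflects how the power management technique handles the shape of the load rather than its volume.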
significant impact on power consumption.
– On the Pentium M, power consumption at the same system utilization can vary by a factor of 2.
– Processors will have more clock gating and will make more extensive use of other, more aggressive power-saving techniques.
– Additional components are adding power-reducing techniques such as memory power-down, disk idle power reduction and so on
gcc
time scales
– Must respond on microarchitectural-level time scales
gzip
phases at OS-level time scales
– Can be slower to respond
– Can have some associated overhead which gets amortized
– What system and processor resources are being used by the workload?
– How rapidly is the behavior of the workload changing?
– How rapidly can the various power management techniques respond to changing conditions and when?
different power management solutions of different systems
application relative to its maximum
– Ignores differences between and within applications
– Has no variation in intensity over very long periods of time
– Can breed dependence on slow-response techniques only
– Process variation, part binning
amounts of power
– Different design criteria may produce same functional spec, but different implementations
– Different loads
– Different power supplies
power
– Temperature, humidity, type of heat sink, etc.
– Ex: Temperature of the datacenter can cause power consumption to increase as
[Figure: Normalized Power for Part 5, Part 4, Part 3, Part 2 and Part 1; axis from 0.94 to 1.12. Data labels in extraction order: 1.10, 1.08, 1.03, 1.08, 1.00.]
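The size of the part-to-part variation can be read directly from the data labels in the figure above. The short sketch below uses those five normalized values as extracted; which part each label belongs to is not needed for the spread, and the variable names are illustrative.

```python
# Normalized power data labels from the figure, in extraction order.
normalized_power = [1.10, 1.08, 1.03, 1.08, 1.00]

# Relative gap between the highest-power and lowest-power part.
spread = max(normalized_power) / min(normalized_power) - 1.0

# Average power of the sampled parts relative to the lowest-power part.
mean = sum(normalized_power) / len(normalized_power)

print(f"spread between parts: {spread:.1%}")
print(f"mean normalized power: {mean:.3f}")
```

A benchmark result measured on a single low-power part would therefore understate what a customer sees on average, which is the reason component-level variation needs to be reported alongside the headline number.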
[Figure: Normalized Current/Power for Max Active (Idd7) and Max Idle (Idd3N) across Vendor 1 through Vendor 5; axis from 0.0 to 2.5. Data labels in extraction order: 1.00, 1.00, 1.52, 1.16, 1.55, 1.28, 2.07, 1.51, 1.52, 1.51.]
Example: A vendor benchmarks a system with 100 processors of type A fabricated by manufacturer X, 200 DIMMs of memory of type B fabricated by manufacturer Y and placed in an “ideal” datacenter…
purchased by the customer.
that consume more energy.
which it cools efficiently.
temperatures go up in response to rising system temperatures.
power/performance benchmarking
– Intensity Variation
– Nature of activity variation
– Component-level variation
– System response
since 2002.