Energy Awareness: Power is a critical, limited, and shared system resource




slide-1
SLIDE 1

Energy Awareness

Philipps-Universität Marburg, Fachbereich 12 SE 12086: Aktuelle Betriebssystemtechnologien Michael Engel, Prof. Bernd Freisleben

Niels Fallenbeck, Christoph Scheid

energy_awareness@webkommune.de

“Power is a critical, limited, and shared system resource”

slide-2
SLIDE 2
Outline
  • The Past
  • The Present
  • Technologies in Research
  • A Green TCP/IP
  • Scheduling for reduced CPU energy
  • PAVM - power aware virtual memory
  • Binary Rewriting
slide-3
SLIDE 3

Who eats the energy?

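[Bar chart: CPU power consumption, axis 0 W to 120 W. Processors: Intel Pentium 4, Intel Mobile Pentium 4, Intel Pentium M, FreeScale MPC7447A (G4), IBM PowerPC 970FX (G5), AMD Athlon XP, AMD Athlon64 FX. Bar values as extracted (pairing with processors not preserved): 89 W, 77 W, 100 W, 30 W, 9 W, 35 W, 115 W]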

slide-4
SLIDE 4

Who eats the energy?

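[Chart: share of system energy by component (Display, CPU, Hard disk, Memory, Graphics, Other) for 1994, 1998 and 2004, axis 0% to 70%. Values as extracted (pairing with components and years not preserved): 31%, 22%, 9%, 3%, 1%, 1%, 4%, 12%, 20%, 15%, 18%, 12%, 34%, 39%, 68%]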

slide-5
SLIDE 5

Measures

  • CPU Frequency Scaling
  • Turn Off Display/Hard Disk Drive
  • Deactivate unused hardware
  • any more?
slide-6
SLIDE 6

“Energy Stars”

  • In 2000, PCs in the USA used approx. 21.9 TWh
  • US$1.75 billion at US$0.08 per kWh
  • less than half: system units
  • more than half: monitors
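The cost figure follows directly from the consumption and the electricity price. A quick check, as a minimal sketch (the function and variable names are illustrative, not from the slides):

```python
# Figures taken from the slide: 21.9 TWh consumed at US$0.08 per kWh.
consumption_twh = 21.9
price_per_kwh = 0.08  # US$

KWH_PER_TWH = 1_000_000_000  # 1 TWh = 10^9 kWh


def annual_cost_usd(twh, usd_per_kwh):
    """Return the electricity cost in US$ for a consumption given in TWh."""
    return twh * KWH_PER_TWH * usd_per_kwh


cost = annual_cost_usd(consumption_twh, price_per_kwh)
print(f"US$ {cost / 1e9:.2f} billion")
```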
slide-7
SLIDE 7

“Energy Stars II”

  • PC on but not used
  • 74% on during daytime
  • used 12% of time
  • 21% on overnight/weekends w/ no use
slide-8
SLIDE 8

EnergyStar [1992]

  • Environmental Protection Agency (EPA)
  • EnergyStar-Certificate (=Green PC)
  • <60 W during periods of inactivity
  • power-off processors
  • software state: save to disk
slide-9
SLIDE 9

EnergyStar* [1995]

  • 1995: President Clinton: all PCs purchased by government agencies must be EnergyStar compliant.*
  • two challenges:
  • network connections are dropped
  • resources are inaccessible
  • 11% of EnergyStar PCs fully enabled for Energy Star operation

* There are no requirements that Energy Star features must remain enabled following installation of the PC.

slide-10
SLIDE 10

TCP/IP

slide-11
SLIDE 11
a green TCP/IP

  • connection sleep option TCP_SLEEP

slide-12
SLIDE 12
modifications (TCP code in Linux kernel)

  • server-side:
  • connections are not dropped after timeout
  • sleeping connections are blocked (to prevent client from being flooded)
  • client-side:
  • TCP_SLEEP option
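The server-side behaviour can be pictured with a toy model. This is only a sketch under assumptions: the class `ServerConnection`, its method names, and the choice to queue blocked data and deliver it on wake-up are hypothetical and are not taken from the actual kernel modification.

```python
class ServerConnection:
    """Toy model of one server-side connection with a sleep state."""

    def __init__(self, timeout):
        self.timeout = timeout   # idle seconds before a normal connection is dropped
        self.idle = 0
        self.sleeping = False
        self.open = True
        self.pending = []        # data held back while the client sleeps
        self.sent = []           # data actually transmitted

    def on_sleep_notice(self):
        # Client announced (via the TCP_SLEEP option) that it is powering down.
        self.sleeping = True

    def on_wake_notice(self):
        # Client is back: unblock the connection and flush held data (assumption).
        self.sleeping = False
        self.idle = 0
        self.sent.extend(self.pending)
        self.pending.clear()

    def tick(self, seconds):
        # Idle timeout applies only to connections that are not sleeping.
        self.idle += seconds
        if not self.sleeping and self.idle > self.timeout:
            self.open = False

    def send(self, data):
        if not self.open:
            raise ConnectionError("connection was dropped")
        if self.sleeping:
            self.pending.append(data)   # blocked: nothing reaches the client
        else:
            self.sent.append(data)
            self.idle = 0


conn = ServerConnection(timeout=60)
conn.on_sleep_notice()
conn.tick(3600)          # far beyond the timeout, but the connection survives
conn.send("new mail")    # held back instead of flooding the sleeping client
conn.on_wake_notice()
```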

slide-13
SLIDE 13

testing green TCP/IP

  • two telnet-sessions, biff (email notification)
  • TCP_SLEEP packet + power-down sequence + power-off (cable) + emails to servers

  • one session crashed (timeout), one OK.
slide-14
SLIDE 14

green TCP/IP todo

  • Advanced Power Management (APM) firmware interface
  • maximum sleep period
  • three-way handshake for both entering and exiting connection sleep

slide-15
SLIDE 15

Scheduling

  • Running CPU at full speed: 100% energy used
  • Halving CPU frequency and voltage: 25% energy used

  • Idle Times
  • hard (disk wait, ...)
  • soft (waiting for user input, ...)
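The 25% figure is what the usual CMOS approximation gives. The sketch below assumes that energy per cycle is proportional to the square of the supply voltage and that voltage is scaled linearly with clock speed. Both are modelling assumptions, and the function names are illustrative.

```python
def relative_energy(speed):
    """Energy for a fixed number of cycles, relative to running at full speed.

    Assumes energy per cycle ~ V^2 and V scaled linearly with speed,
    where speed = 1.0 means full speed.
    """
    if not 0.0 < speed <= 1.0:
        raise ValueError("speed must be in (0, 1]")
    return speed ** 2


def relative_runtime(speed):
    """Time to finish the same work, relative to full speed."""
    if not 0.0 < speed <= 1.0:
        raise ValueError("speed must be in (0, 1]")
    return 1.0 / speed


print(relative_energy(0.5), relative_runtime(0.5))
```

Under these assumptions the same work costs a quarter of the energy but takes twice as long, which is why idle time is what the scheduler trades away.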
slide-16
SLIDE 16

Scheduling

  • Algorithms
  • OPT
  • FUTURE
  • PAST
  • PAST Algorithm
  • CPU utilization
  • Energy computation
  • Speed adjustment
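The three PAST steps (CPU utilization, energy computation, speed adjustment) can be sketched as follows. This is a simplified illustration and not the published algorithm verbatim: the thresholds `high` and `low`, the `target` and `step` values, and `min_speed` are assumed parameters chosen for the example.

```python
def interval_energy(cycles, speed):
    """Relative energy for `cycles` executed at `speed` (1.0 = full speed),
    assuming energy per cycle scales with speed squared."""
    return cycles * speed ** 2


def past_next_speed(speed, run, idle, excess=0.0,
                    min_speed=0.2, high=0.7, low=0.5, target=0.6, step=0.2):
    """Choose the speed for the next interval from the interval just ended.

    run    -- busy cycles in the past interval
    idle   -- idle cycles in the past interval
    excess -- work left unfinished because the CPU ran too slowly
    """
    total = run + excess + idle
    if total <= 0:
        return speed                              # nothing observed: keep speed
    utilization = (run + excess) / total          # step 1: CPU utilization
    if excess > idle:
        new_speed = 1.0                           # backlog too large: full speed
    elif utilization > high:
        new_speed = speed + step                  # busy: speed up
    elif utilization < low:
        new_speed = speed - (target - utilization)  # mostly idle: slow down
    else:
        new_speed = speed
    return max(min_speed, min(1.0, new_speed))    # step 3: clamp to valid range


speed = past_next_speed(1.0, run=20, idle=80)
print(speed, interval_energy(20, speed))
```

The predictor simply assumes the next interval will look like the last one, which is what distinguishes PAST from the OPT and FUTURE variants that look ahead.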
slide-17
SLIDE 17

Scheduling

slide-18
SLIDE 18

PAVM

Power Aware Virtual Memory

  • Organized in Modules (= Bunch of Devices)
  • SDR/DDR
  • wide data bus, low clock rate
  • RDRAM
  • narrow data bus, high clock rate
  • 4 different power states in memory controller: attention, standby, nap, powerdown

  • power saving policy uses attention and standby
slide-19
SLIDE 19

PAVM

Power Aware Virtual Memory

State       Power    Active components
Attention   313 mW   refresh, clock, row and column decoders
Standby     225 mW   refresh, clock, row decoder
Nap          11 mW   refresh, clock
Powerdown     7 mW   refresh
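To see what the per-state figures mean for a whole module, the sketch below sums device power for a given mix of states. It assumes the values pair up so that the state with the most active components draws the most power (Attention 313 mW down to Powerdown 7 mW). The 8-device module is a made-up example.

```python
# Per-device power in mW, assuming Attention is the most expensive state.
POWER_MW = {"attention": 313, "standby": 225, "nap": 11, "powerdown": 7}


def total_power_mw(device_states):
    """Sum the power drawn by a list of devices, one state name per device."""
    return sum(POWER_MW[state] for state in device_states)


# Hypothetical module with 8 devices.
all_standby = ["standby"] * 8
mostly_nap = ["standby"] * 2 + ["nap"] * 6

print(total_power_mw(all_standby), total_power_mw(mostly_nap))
```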

slide-20
SLIDE 20

PAVM

Power Aware Virtual Memory

  • Memory Nodes
  • Tracking Active Nodes
  • Reducing Active Set
  • NUMA (Non-Uniform Memory Access)

  • Hiding Latency

[Chart: context switching time (ns), axis ticks 300 to 1200, against a 0% to 100% scale]

slide-21
SLIDE 21

PAVM

Power Aware Virtual Memory

  • Revision #1: DLL Aggregation
  • Revision #2: Page Migration
  • private page, shared page
  • kmigrated daemon
  • Revision #3: Reducing Migration Overhead
  • ignoring short-lived processes (as done by kmigrated)
slide-22
SLIDE 22

PAVM

Power Aware Virtual Memory

Power consumption in mW by workload (chart axis: 900 mW to 4,500 mW)

          Light   Poweruser   Multimedia
Base       4100        4118         4230
On/Off      892        2324         3991
PAVM        465         986         2687
PAVMr1      397         791         2442
PAVMr2      237         646         1725

slide-23
SLIDE 23

Binary Rewriting

to Improve Energy Efficiency through Post-pass Register Re-allocation

  • binaries are optimized for energy efficiency
  • reduce cache power consumption by reducing dynamic activities (dynamic load/stores)
  • find dead & unused registers and re-allocate them on hot paths

slide-24
SLIDE 24

dead/unused register problem

  • inefficiencies in register allocation (long live ranges of variables)
  • large regions where variable occupies register but is not used

slide-25
SLIDE 25

types of allocators / methods of allocation

  • spilling vs. splitting
slide-26
SLIDE 26

framework overview

slide-27
SLIDE 27

register re-allocation

  • Overview
slide-28
SLIDE 28

definitions

  • dead register: register which does not contain a live value
  • unused register: register which contains a live value, but is neither defined nor used in the current basic block
  • dead registers need not be stored
  • unused registers (if carrying a live value) must be stored before a new value is loaded
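The two definitions translate into a simple per-block classification. This is a minimal sketch: the function name and the set-based representation of liveness, definitions and uses are assumptions for illustration, not the framework's actual data structures.

```python
def classify_register(reg, live, defined, used):
    """Classify one register for one basic block.

    live    -- registers that hold a live value in this block
    defined -- registers written by an instruction in this block
    used    -- registers read by an instruction in this block
    """
    if reg not in live:
        return "dead"      # no live value: can be reused without a store
    if reg not in defined and reg not in used:
        return "unused"    # live but idle here: store before loading a new value
    return "in use"        # not available for re-allocation in this block


live = {"eax", "esi"}
defined = {"eax"}
used = {"eax"}

for reg in ("eax", "esi", "edi"):
    print(reg, classify_register(reg, live, defined, used))
```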

slide-29
SLIDE 29

basic blocks

slide-30
SLIDE 30

hot region identification

  • hot region = hot basic block
  • hot if average dynamic execution frequency exceeds threshold
  • adjacent hot basic blocks: hot region
  • two hot regions separated by cold basic block (disjoint regions: register re-allocation carried out independently) → algorithm
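Grouping hot blocks into regions can be sketched as below. As a simplifying assumption, "adjacent" is taken to mean consecutive in a linear block order. A real implementation would follow control-flow edges. The frequencies and the threshold are made-up values.

```python
def hot_regions(frequencies, threshold):
    """Group consecutive hot basic blocks into regions.

    frequencies -- average dynamic execution frequency per block, in layout order
    threshold   -- a block is hot if its frequency exceeds this value
    Returns a list of regions, each a list of block indices.
    """
    regions = []
    current = []
    for index, freq in enumerate(frequencies):
        if freq > threshold:
            current.append(index)      # extend the region being built
        elif current:
            regions.append(current)    # a cold block closes the region
            current = []
    if current:
        regions.append(current)
    return regions


print(hot_regions([5, 120, 300, 10, 250, 2], threshold=100))
```

Each returned region can then be handed to the re-allocation step on its own, since regions separated by a cold block are treated independently.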

slide-31
SLIDE 31

spill identification

  • identify spills to be removed from hot blocks

  • alias analysis
slide-32
SLIDE 32

weighted bipartite graph matching

  • spilled variables in hot region: bipartition A (same variable in different basic block: different vertices)
  • set of dead registers in hot region: bipartition B
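For a small hot region the matching can be illustrated by exhaustive search. This is only a sketch: the edge weights (assumed here to be the number of dynamic load/stores saved), the vertex names, and the brute-force search are illustrative assumptions, and a production tool would use a proper assignment algorithm instead.

```python
from itertools import permutations


def best_matching(weights, spills, registers):
    """Maximum-weight matching between spilled variables (A) and dead registers (B).

    weights -- dict mapping (spill, register) to the benefit of that assignment;
               a missing pair means the register cannot hold that spill
    Returns (total_benefit, assignment) where assignment maps spill -> register.
    """
    # Pad with None so that any spill may stay unmatched.
    slots = list(registers) + [None] * len(spills)
    best_total, best_assignment = 0, {}
    for choice in permutations(slots, len(spills)):
        total, assignment, valid = 0, {}, True
        for spill, reg in zip(spills, choice):
            if reg is None:
                continue
            if (spill, reg) not in weights:
                valid = False
                break
            total += weights[(spill, reg)]
            assignment[spill] = reg
        if valid and total > best_total:
            best_total, best_assignment = total, assignment
    return best_total, best_assignment


# Hypothetical region: variable x spilled in blocks B1 and B2, y spilled in B1.
spills = ["x@B1", "x@B2", "y@B1"]
registers = ["esi", "edi"]
weights = {
    ("x@B1", "esi"): 40,
    ("x@B2", "esi"): 25,
    ("x@B2", "edi"): 25,
    ("y@B1", "esi"): 10,
    ("y@B1", "edi"): 30,
}

total, assignment = best_matching(weights, spills, registers)
print(total, assignment)
```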

slide-33
SLIDE 33

experiment

  • x86 (Pentium)
  • 6 general purpose registers (eax, ecx, edx, ebx, esi, edi)
  • Machine SUIF compiler (does not support live range splitting)

  • benchmarks: SPEC2000, MediaBench
slide-34
SLIDE 34

results

  • number of hot basic blocks is small
  • sizes of hot regions are small
  • number of spills removed is small

slide-35
SLIDE 35

more results

  • number of static load/stores is reduced by 0.4% (average)
  • number of dynamic spill load/stores is reduced significantly: 0 - 26.4%

slide-36
SLIDE 36

Resources

  • Home of “energy awareness”

http://www.mathematik.uni-marburg.de/~fallenbe/energy/