KOALA A PLATFORM FOR OS-LEVEL POWER MANAGEMENT D.C. Snowdon E. Le - - PowerPoint PPT Presentation



slide-1
SLIDE 1

KOALA A PLATFORM FOR OS-LEVEL POWER MANAGEMENT

  • D.C. Snowdon
  • E. Le Sueur
  • S.M. Petters
  • G. Heiser
slide-2
SLIDE 2

Image by Diliff under CC license

slide-3
SLIDE 3
slide-4
SLIDE 4

KNOBS

  • CPU frequency
  • CPU voltage
  • CPU sleep states
  • memory and bus frequency
  • power states of IO devices (not considered here)
slide-5
SLIDE 5

SIMPLISTIC POLICIES

[Figure: CPU speed plotted against time]

slide-6
SLIDE 6

SIMPLISTIC POLICIES

[Figure: CPU speed plotted against time]

slide-7
SLIDE 7

REALITY CHECK

[Figure: normalised energy of gzip and swim (separate y-axes) versus CPU frequency, 600 to 1800 MHz]

slide-8
SLIDE 8

CYCLE COUNT

[Figure: normalised cycles versus CPU frequency, 600 to 1800 MHz, for swim, equake, mgrid, and gzip]

slide-9
SLIDE 9

OVERHEAD

  • switching frequency and power incurs CPU downtime
  • Pentium-M
      • frequency change: 10 µs
      • voltage change is asynchronous
  • Opteron
      • frequency and voltage change: 2 ms (140 µs out of spec)
slide-10
SLIDE 10

TEMPERATURE

[Figure: input power (W) versus temperature (degrees C) at high, medium, and low fan settings]

slide-11
SLIDE 11

POWER SUPPLY

[Figure: input power (W) versus predicted input power (W), 18 to 38 W, at 1.3V, 1.2V, 1.1V, and 1.0V, with the expected line]

slide-12
SLIDE 12

TIME MODEL

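$$T = \frac{C_{cpu}}{f_{cpu}} + \frac{C_{bus}}{f_{bus}} + \frac{C_{mem}}{f_{mem}} + \frac{C_{io}}{f_{io}} + \dots$$

The coefficients are workload-specific, and can be

$$C_{bus} = \alpha_1 PMC_1 + \alpha_2 PMC_2 + \dots$$

$$C_{mem} = \beta_1 PMC_1 + \beta_2 PMC_2 + \dots$$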
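The time model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Koala code: the function names, the counter readings, the weights, and the frequencies are all made-up values, and the CPU cycle count is simply passed in.

```python
def component_cycles(weights, pmc_counts):
    """Linear PMC model: C = w1*PMC1 + w2*PMC2 + ..."""
    return sum(w * n for w, n in zip(weights, pmc_counts))


def predict_time(cycles, freqs_hz):
    """T = sum over components of C_x / f_x, in seconds."""
    return sum(cycles[name] / freqs_hz[name] for name in cycles)


# Hypothetical counter readings for one time slice (illustrative only).
pmc = [2_000_000, 500_000]
cycles = {
    "cpu": 800_000_000,                          # assumed to be known
    "bus": component_cycles([10.0, 4.0], pmc),   # alpha weights (made up)
    "mem": component_cycles([20.0, 40.0], pmc),  # beta weights (made up)
}

fixed = {"bus": 400e6, "mem": 200e6}             # assumed bus/memory clocks
t_fast = predict_time(cycles, {"cpu": 1.6e9, **fixed})
t_slow = predict_time(cycles, {"cpu": 0.8e9, **fixed})
```

With these numbers, halving the CPU frequency lengthens the predicted run by less than a factor of two, because the bus and memory terms do not depend on the CPU clock.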

slide-13
SLIDE 13

ENERGY MODEL

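$$E_{dyn} \propto cyc \times V^2$$

$$E = V_{cpu}^2 (\gamma_1 f_{cpu} + \gamma_2 f_{bus} + \gamma_3 f_{mem})\,\Delta t + V_{cpu}^2 (\alpha_0 PMC_0 + \dots + \alpha_m PMC_m) + \gamma_4 f_{mem}\,\Delta t + \beta_0 PMC_0 + \dots + \beta_m PMC_m + P_{static}\,\Delta t$$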
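To make the structure of the energy expression easier to follow, here is a minimal sketch that evaluates it term by term. The function name, argument layout, and the unit-free sample values are assumptions for illustration; real coefficients would come from characterising a specific machine.

```python
def predict_energy(v_cpu, f_cpu, f_bus, f_mem, dt, pmc,
                   gamma, alpha, beta, p_static):
    """Evaluate the slide's energy expression for one interval of length dt."""
    v2 = v_cpu ** 2
    # Voltage-scaled, frequency-dependent term.
    freq_term = v2 * (gamma[0] * f_cpu + gamma[1] * f_bus + gamma[2] * f_mem) * dt
    # Voltage-scaled, event-dependent term (alpha_0..alpha_m).
    cpu_event_term = v2 * sum(a * n for a, n in zip(alpha, pmc))
    # Terms that do not scale with CPU voltage.
    mem_freq_term = gamma[3] * f_mem * dt
    other_event_term = sum(b * n for b, n in zip(beta, pmc))
    static_term = p_static * dt
    return (freq_term + cpu_event_term + mem_freq_term
            + other_event_term + static_term)


# Made-up, unit-free inputs chosen so the arithmetic is easy to check by hand.
example = predict_energy(
    v_cpu=2.0, f_cpu=3, f_bus=2, f_mem=1, dt=1, pmc=[10, 20],
    gamma=[1, 1, 1, 1], alpha=[0.5, 0.25], beta=[1, 1], p_static=5,
)
```

Only the first two terms carry the V² factor, so lowering the CPU voltage leaves the memory, event, and static contributions unchanged.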

slide-14
SLIDE 14

UNIFIED POLICY

$$\eta = P^{1-\alpha}\, T^{1+\alpha}, \quad -1 \le \alpha \le 1$$

α = 1 forces highest performance (η = T²)

slide-15
SLIDE 15
UNIFIED POLICY

$$\eta = P^{1-\alpha}\, T^{1+\alpha}, \quad -1 \le \alpha \le 1$$

α = 1/3 minimises energy-delay product (η = (P T²)^(2/3))

slide-16
SLIDE 16
UNIFIED POLICY

$$\eta = P^{1-\alpha}\, T^{1+\alpha}, \quad -1 \le \alpha \le 1$$

α = 0 minimises energy (η = P T)

slide-17
SLIDE 17
UNIFIED POLICY

$$\eta = P^{1-\alpha}\, T^{1+\alpha}, \quad -1 \le \alpha \le 1$$

α = −1 minimises power consumption (η = P²)
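The effect of the single parameter α can be seen by minimising η = P^(1−α) T^(1+α) over a handful of candidate settings. The sketch below is an assumption-laden illustration: the setting names and the predicted (power, time) pairs are invented, and in practice they would come from the time and energy models.

```python
def eta(power_w, time_s, alpha):
    """Policy metric: eta = P^(1-alpha) * T^(1+alpha), alpha in [-1, 1]."""
    return power_w ** (1 - alpha) * time_s ** (1 + alpha)


def best_setting(predictions, alpha):
    """Return the setting whose predicted (P, T) pair minimises eta."""
    return min(predictions, key=lambda s: eta(*predictions[s], alpha))


# Hypothetical model output: setting -> (average power in W, run time in s).
predictions = {
    "0.8GHz": (12.0, 1.80),
    "1.2GHz": (15.0, 1.30),
    "1.8GHz": (24.0, 1.00),
}

choices = {a: best_setting(predictions, a) for a in (1, 1 / 3, 0, -1)}
```

For these invented numbers the fastest setting wins for α = 1 and α = 1/3, the middle setting uses the least energy (α = 0), and the slowest setting draws the least power (α = −1).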

slide-18
SLIDE 18

IMPLEMENTATION

  • recent Linux kernel (2.6.24.4)
  • per-process collection of relevant statistics
  • policy decision when a process blocks or is preempted
  • use data from the previous time slice to predict the optimal setting
      • assumes temporal locality
  • uses logarithmic tables to simplify calculation (no floating point)
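One way a logarithmic table can remove floating point from the policy decision is to compare log η = (1−α) log P + (1+α) log T in fixed-point integers, since the logarithm preserves the ordering of η. The sketch below is a guess at the general idea, not the kernel implementation: the table size, the fixed-point scale, the quantised inputs, and all names are assumptions.

```python
import math

SCALE = 1 << 10   # fixed-point scale for table entries and for alpha (assumed)

# The table would be generated offline and stored as constants; floating
# point is used only to build it, never on the decision path.
LOG2_TABLE = [0] + [round(math.log2(i) * SCALE) for i in range(1, 4097)]


def log_eta(power_q, time_q, alpha_fp):
    """Integer-only value proportional to log2(eta); alpha_fp = alpha * SCALE."""
    return ((SCALE - alpha_fp) * LOG2_TABLE[power_q]
            + (SCALE + alpha_fp) * LOG2_TABLE[time_q])


def pick_setting(candidates, alpha_fp):
    """candidates: (name, power_q, time_q) tuples with values in 1..4096."""
    return min(candidates, key=lambda c: log_eta(c[1], c[2], alpha_fp))[0]


# Hypothetical predictions from the previous time slice, quantised to integers.
candidates = [
    ("0.8GHz", 1200, 1800),
    ("1.2GHz", 1500, 1300),
    ("1.8GHz", 2400, 1000),
]
```

Calling `pick_setting(candidates, 0)` selects the candidate with the lowest predicted energy, and `pick_setting(candidates, SCALE)` selects the fastest one.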
slide-19
SLIDE 19

EVALUATION

The Laptop

  • Dell Latitude D600
  • Pentium-M 0.8 – 1.8 GHz
  • 0.98 – 1.34 V
  • three sleep states
  • measured at battery

The Server

  • AMD Opteron 246
  • 0.8 – 2 GHz
  • 0.9 – 1.5 V
  • high switching overhead
  • measured at wall socket
slide-20
SLIDE 20

CHARACTERISATION

The Laptop

  • number of completed burst transactions
  • number of lines removed from L2 cache
  • correlation 0.98 / 0.96

The Server

  • quadword write transfers
  • L2 cache misses
  • dispatch stalls due to reorder buffer being full
  • DRAM accesses due to page conflicts
  • correlation 0.98 / 0.98
slide-21
SLIDE 21

MODEL ACCURACY

[Figure: estimated (Est E) versus actual (Act E) energy consumption (%), per benchmark: lbm test, mcf test, equake ref, swim ref, povray train, gzip graphic ref, milc test, libquantum test, dealII test, sjeng test, gcc train, cactusADM test, bzip2 test 2, bzip2 test 1, omnetpp train, bwaves test, gromacs test, xalancbmk test, wrf test, namd test, calculix train, zeusmp test, astar test, tonto test, hmmer test, h264 test, sphinx train]

slide-22
SLIDE 22

MODEL ACCURACY

[Figure: estimated (Est T) versus actual (Act T) performance (%), per benchmark: lbm test, mcf test, equake ref, swim ref, povray train, gzip graphic ref, milc test, libquantum test, dealII test, sjeng test, gcc train, cactusADM test, bzip2 test 2, bzip2 test 1, omnetpp train, bwaves test, gromacs test, xalancbmk test, wrf test, namd test, calculix train, zeusmp test, astar test, tonto test, hmmer test, h264 test, sphinx train]

slide-23
SLIDE 23

UNIFIED POLICY

[Figure: actual energy (%) across policy settings from -1 to 1, for lbm_test, mcf_ref, swim_ref, gzip_graphic_ref, milc_test, povray_test, and equake_ref]

slide-24
SLIDE 24

UNIFIED POLICY

[Figure: actual performance (%) across policy settings from -1 to 1, for lbm_test, mcf_ref, swim_ref, gzip_graphic_ref, milc_test, povray_test, and equake_ref]

slide-25
SLIDE 25

BATTERY-AWARE POLICY

[Figure: normalised battery state and normalised energy/time versus iteration; series: battery level, execution time, energy]

slide-26
SLIDE 26

DISCUSSION

  • practicality
  • cooperation from vendors, built-in power measurement
  • energy management by hardware or software
  • hints from applications
  • is this too fine-grained?
  • dumb component shutdown (both software and hardware)