
Power and energy: how low can we go?

Lars Svensson larssv@chalmers.se

1

Motivation

  • Will not motivate low power topic …
  • You are here, aren’t you!?
  • Scaling and limits set the context for insights and for good competitive design
  • … and there is an intellectual challenge too, of course :-)

2

Overview / Outline

  • Will start from a CMOS perspective
  • Gradually, circuits and examples will become less and less like the ordinary gates
  • Practical design methods in later lectures (Per Larsson-Edefors)
  • End at fundamental physical limit

3

CMOS energetics

4

CMOS inverter

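  • Input goes low, output is charged
  • $Q = C_L V_{dd}$
  • $E_{inj} = V_{dd} \cdot Q = C_L V_{dd}^2$
  • $E_{diss} = V_{avg} \cdot Q = C_L V_{dd}^2 / 2$
  • $E_{C_L} = E_{inj} - E_{diss} = C_L V_{dd}^2 / 2$
  • $E_{C_L}$ is stored on $C_L$ and dissipated during discharge

[Figure: CMOS inverter with supply $V_{dd}$, load $C_L$, and charge $Q$]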
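The energy bookkeeping on this slide can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function name `charge_energies` and the example values (10 fF, 1.0 V) are assumptions, not taken from the lecture.

```python
def charge_energies(c_load, v_dd):
    """Energy bookkeeping for charging c_load from 0 V to v_dd through a switch."""
    q = c_load * v_dd                      # charge delivered to the load
    e_inj = v_dd * q                       # energy drawn from the supply
    e_stored = 0.5 * c_load * v_dd ** 2    # energy left on the capacitor
    e_diss = e_inj - e_stored              # dissipated in the pull-up device
    return q, e_inj, e_diss, e_stored

# Assumed example values: 10 fF load, 1.0 V supply.
q, e_inj, e_diss, e_stored = charge_energies(10e-15, 1.0)
print(f"Q = {q:.2e} C, E_inj = {e_inj:.2e} J, "
      f"E_diss = {e_diss:.2e} J, E_CL = {e_stored:.2e} J")
```

Half of the injected energy ends up on the capacitor and half is dissipated, whatever values are chosen.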

5

CMOS inverter

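  • Half of injected energy dissipated during charge, other half during discharge
  • $E_{cycle} = E_{inj}$
  • Repeat with frequency $f$:
  • $P = f C_L V_{dd}^2 = \beta f_{ck} C_L V_{dd}^2$ (activity factor $\beta$)
  • Several capacitive nodes:
  • $P = f_{ck} V_{dd}^2 \sum_i \beta_i C_{L,i}$

[Figure: CMOS inverter with supply $V_{dd}$, load $C_L$, and charge $Q$]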
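The multi-node power formula can be sketched as a weighted sum over nodes. The function name `switching_power`, the node list, the 1 GHz clock, and the 1.0 V supply are all assumed for illustration.

```python
def switching_power(f_ck, v_dd, nodes):
    """P = f_ck * Vdd^2 * sum(beta_i * C_i) over (beta_i, C_i) pairs."""
    return f_ck * v_dd ** 2 * sum(beta * c for beta, c in nodes)

# Assumed example: two nodes, each given as (activity factor, capacitance in F).
nodes = [(0.1, 10e-15), (0.5, 2e-15)]
p = switching_power(1e9, 1.0, nodes)
print(f"P = {p * 1e6:.2f} uW")
```

Because the supply voltage enters squared, halving `v_dd` in this model cuts the switching power to a quarter.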

6

How to reduce E and P?

  • Reduce f, C, and/or V
  • V looks especially promising since it is squared
  • Reduce swing with respect to $V_{dd}$
  • One-time gain, but OK
  • Used in memories, I/O
  • Logic typically full-swing
  • Reduce $V_{dd}$

7

Vdd reduction

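  • Gate-source voltage $V_{GS}$ of device is reduced too
  • Drain current reduced
  • $I_d \sim (V_{dd} - V_T)^x$, $1 < x \le 2$
  • Time to charge $C_L$:
  • $t \sim Q / I_d \sim V_{dd} C_L / (V_{dd} - V_T)^x$
  • Goes toward $+\infty$ as $V_{dd}$ approaches threshold voltage $V_T$

[Figure: CMOS inverter with supply $V_{dd}$, load $C_L$, and charge $Q$]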
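The divergence of the charging time near the threshold is easy to see numerically. A minimal sketch, assuming $x = 2$ and a threshold of 0.3 V (both illustrative), with the constant factor $C_L$ dropped:

```python
def relative_delay(v_dd, v_t, x=2.0):
    """Charging time up to a constant factor: t ~ Vdd / (Vdd - VT)^x."""
    if v_dd <= v_t:
        raise ValueError("model only applies for Vdd > VT")
    return v_dd / (v_dd - v_t) ** x

V_T = 0.3  # assumed threshold voltage in volts
for v_dd in (1.0, 0.6, 0.4, 0.31):
    print(f"Vdd = {v_dd:.2f} V -> relative delay {relative_delay(v_dd, V_T):8.1f}")
```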

8

Further Vdd reduction

  • To avoid speed reduction, $V_T$ must be reduced too
  • Keep $V_{dd} / (V_{dd} - V_T)^x$ approx. constant
  • Reduce $V_T$ as fast as $V_{dd}$
  • How far?

9

Leakage currents

  • Devices will conduct even below $V_T$
  • Causes leakage power when idle
  • Exponential increase of leakage with lower $V_T$ will eventually defeat any switching power gain
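The exponential dependence is commonly modelled with a subthreshold swing: the gate voltage needed to change the current by one decade. The sketch below assumes a swing of 100 mV/decade and an arbitrary reference current `i0`; both numbers are illustrative and not from the slides.

```python
def leakage_current(v_t, i0=1e-6, swing=0.1):
    """Off-state current at VGS = 0: falls one decade per `swing` volts of VT."""
    return i0 * 10 ** (-v_t / swing)

for v_t in (0.4, 0.3, 0.2, 0.1):
    print(f"VT = {v_t:.1f} V -> I_leak = {leakage_current(v_t):.1e} A")
```

Under these assumptions, every 100 mV taken off the threshold voltage multiplies the idle current by ten.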

10

Optimal voltages

  • Sakurai surfaces of power and delay
  • Same delay at A and B, but lower power at B

11

Optimal choice?

  • Minimum moves with activity factor $\beta$
  • Weighted sum of active and idle powers

12

Metrics

  • Sakurai minimum steers towards low performance!
  • Does not care about time
  • Other metrics of frugality?
  • Power = Energy per time
  • Energy
  • EDP = Energy × time
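The difference between the metrics can be sketched with the simple models used earlier: energy $\sim V_{dd}^2$ and delay $\sim V_{dd} / (V_{dd} - V_T)^x$. The grid search, the function names, and the values $V_T = 0.3$ V and $x = 2$ are assumptions made for illustration.

```python
def energy(v_dd):
    return v_dd ** 2                        # switching energy ~ Vdd^2 (C normalised to 1)

def delay(v_dd, v_t, x=2.0):
    return v_dd / (v_dd - v_t) ** x         # t ~ Vdd / (Vdd - VT)^x

def best_vdd(metric, v_t, hi=3.0, n=10_000):
    """Grid search for the Vdd between 1.01*VT and hi that minimises metric(Vdd)."""
    lo = 1.01 * v_t
    grid = [lo + (hi - lo) * k / n for k in range(n + 1)]
    return min(grid, key=metric)

V_T = 0.3  # assumed threshold voltage in volts
v_energy = best_vdd(energy, V_T)
v_edp = best_vdd(lambda v: energy(v) * delay(v, V_T), V_T)
print(f"Energy alone is minimised at Vdd = {v_energy:.2f} V (lowest allowed)")
print(f"EDP is minimised near Vdd = {v_edp:.2f} V")
```

Minimising energy alone pushes the supply to the lowest allowed value, whereas EDP settles at an interior point; for $x = 2$ this model puts it at $V_{dd} = 3 V_T$.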

13

Energy-delay curves

  • General way of relating energy and delay for a design
  • Useful to invert one axis

[Figure: Delay vs energy when Vdd/Vt is varied (axes: Energy, Delay)]
[Figure: Inverse delay vs energy when Vdd/Vt is varied (axes: Energy, Inverse delay)]

14

Fast processing with slow circuits?

  • Energy per operation falls faster than performance with reduced $V_{dd}$
  • Opportunity!
  • Reduce $V_{dd}$ until performance is halved, then duplicate hardware
  • Examples: pipelining, parallel processing, multicore
  • Overheads
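The voltage-for-parallelism trade can be sketched with the same delay model. The bisection below finds the supply at which the delay doubles and then compares energy per operation. All names and the values $V_{dd} = 1.0$ V, $V_T = 0.3$ V, $x = 2$ are assumptions, and the overheads of duplication are ignored.

```python
def delay(v_dd, v_t, x=2.0):
    return v_dd / (v_dd - v_t) ** x          # t ~ Vdd / (Vdd - VT)^x

def vdd_for_slowdown(v_nom, v_t, slowdown, x=2.0, iters=60):
    """Bisect for the Vdd in (VT, v_nom) whose delay is `slowdown` times nominal."""
    target = slowdown * delay(v_nom, v_t, x)
    lo, hi = v_t + 1e-9, v_nom               # delay falls monotonically as Vdd rises
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if delay(mid, v_t, x) > target:
            lo = mid                         # too slow: raise the voltage
        else:
            hi = mid
    return 0.5 * (lo + hi)

V_NOM, V_T = 1.0, 0.3                        # assumed nominal supply and threshold
v_low = vdd_for_slowdown(V_NOM, V_T, slowdown=2.0)
energy_ratio = (v_low / V_NOM) ** 2          # energy per operation ~ Vdd^2
print(f"Vdd = {v_low:.2f} V, energy per operation = {energy_ratio:.0%} of nominal")
```

With these numbers, two half-speed units deliver the original throughput at roughly half the energy per operation, before overheads are counted.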

15

Moore / Dennard scaling

16

[Figure: transistors per die vs. year, 1960 to 2010, logarithmic scale (source: Intel); series: 1965 actual data, MOS arrays and MOS logic 1975 actual data, 1975 projection, memory, microprocessor; slide annotation: $10^5$]

17

[Figure: Processor power (watts), active and leakage, vs. year, 1970 to 2010; logarithmic power axis from 0.001 to 1000]

$10^3$ … while clocks have gone from kHz to GHz!

18

What happened?

  • Reduced average activity factors
  • Large memories
  • Reduced capacitances
  • Scaling

19

Dennard scaling

  • 1974 classic paper on scaling ion-implanted MOS devices

20

Scaling limit?

  • Carver Mead’s 1994 predictions

22 nm

21

Limits

22

Oxide thickness

  • Single atom layers!
  • Scaling cannot go much further…

23

Energetics revisited

24

Charging $C_L$

  • Charging + discharging $C_L$ to $V_{dd}$ costs $C_L V_{dd}^2$, always. Right?
  • No.
  • Replace $V_{dd}$ with a ramp from 0 to $V_{dd}$!
  • Same charge transferred
  • … but $V_{avg}$ across the switch may be $< V_{dd} / 2$
  • So $E_{diss} < C_L V_{dd}^2 / 2$

[Figure: circuit with load $C_L$ and charge $Q$]

25

$E_{diss} < C_L V_{dd}^2 / 2$

  • But how big is it?
  • Model switch device as resistor $R$
  • Then, if ramp duration is $T$
  • $I = Q / T = C_L V_{dd} / T$
  • $P = I^2 R = (R C_L / T) \, C_L V_{dd}^2 / T$
  • $E_{diss} = P \cdot T = (R C_L / T) \, C_L V_{dd}^2$
  • $2 R C_L < T \Rightarrow E_{diss} < C_L V_{dd}^2 / 2$

[Figure: circuit with load $C_L$ and charge $Q$]
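The ramp-charging result can be checked with a small time-domain simulation of a resistor charging a capacitor from a ramped supply. This is a rough sketch, not the lecture's method: the forward-Euler integration, the function name `ramp_dissipation`, and the component values (1 kΩ, 10 fF, 1 V) are all assumptions.

```python
def ramp_dissipation(r, c, v_dd, t_ramp, steps=100_000):
    """Energy dissipated in r while the supply ramps 0 -> v_dd over t_ramp, then holds."""
    t_end = t_ramp + 10 * r * c              # extra time for the load to settle
    dt = t_end / steps
    v_load, e_diss = 0.0, 0.0
    for n in range(steps):
        v_supply = v_dd * min(n * dt / t_ramp, 1.0)
        i = (v_supply - v_load) / r          # current through the switch
        e_diss += i * i * r * dt
        v_load += i * dt / c                 # forward-Euler update of the load voltage
    return e_diss

R, C, VDD = 1e3, 10e-15, 1.0                 # assumed: 1 kOhm, 10 fF, 1 V
cv2 = C * VDD ** 2
for ratio in (0.01, 2, 10, 100):             # ramp time in units of R*C
    e = ramp_dissipation(R, C, VDD, ratio * R * C)
    print(f"T = {ratio:>6} RC: simulated {e / cv2:.4f} CV^2, "
          f"(RC/T) CV^2 = {1 / ratio:.4f} CV^2")
```

The formula $(R C_L / T) \, C_L V_{dd}^2$ treats the current as constant, so it is a good approximation only when $T \gg R C_L$; the simulation approaches it for long ramps and approaches $C_L V_{dd}^2 / 2$ for a near-step supply.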

26

Observations

  • It is good for R to be small
  • R does not matter for power in “standard” charging
  • $E_{diss}$ arbitrarily small for large T
  • “Adiabatic charging”
  • No heat delivered to environment

$E_{diss} = (R C_L / T) \, C_L V_{dd}^2$

[Slide annotation: No heat recovery]

27

How to size the switch?

  • Small R is good!
  • … but requires a wide transistor …
  • … which means a large gate capacitance …
  • … which is expensive to charge / discharge
  • To reap benefits, must charge and discharge gate capacitance adiabatically
  • etc…

28

  • Yes. The discharge.
  • To adiabatically charge and discharge a capacitance, the controlling device must be on throughout
  • … so its gate must be charged before and discharged after the load …
  • The pipeline problem

[Figure: circuit with load $C_L$]

29

The pipeline problem

  • Every logic stage is controlled by a previous stage
  • How can the previous stage start a new computation before the next stage is done?
  • How to control the discharge?

30

Invertible stages

  • For each logic function $H$ in pipeline, there must exist a logical inverse $H^{-1}$
  • Then, inverse may control discharge of inputs to function
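What it means for a logic function to have an inverse can be shown on a two-bit example. The controlled-NOT gate used here is an assumed illustration, not a circuit from the slides; the AND gate is included as a function with no inverse.

```python
from itertools import product

def h(a, b):
    """Controlled-NOT: an invertible two-bit function (assumed example)."""
    return a, a ^ b

def h_inv(a, y):
    """Inverse of h; for this gate it happens to be the same mapping."""
    return a, a ^ y

def and_gate(a, b):
    """Two inputs, one output: different inputs collide, so no inverse exists."""
    return (a & b,)

def is_invertible(f, n_bits):
    """A function on n_bits is invertible iff no two inputs share an output."""
    outputs = {f(*bits) for bits in product((0, 1), repeat=n_bits)}
    return len(outputs) == 2 ** n_bits

print(is_invertible(h, 2), is_invertible(and_gate, 2))
```

Since `h_inv` recovers the inputs of `h` from its outputs, a stage computing it has the information needed to steer the inputs back down in a controlled way.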

31

Reversibility

  • Symmetry
  • Pipeline can be run in reverse
  • Reverse order of power ramps (or power clocks)

32

Reversibility, cont.

  • Next, build a reversible processor
  • No operations that cannot be undone
  • No overwrite to memory or registers…
  • Next, use a reversible programming language
  • Inputs must be computable from outputs
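The flavour of programming without overwrites can be sketched with a toy register machine. Each step updates a register in place using another register, so every step has an exact inverse. The instruction set and names below are hypothetical and do not describe any particular reversible language.

```python
FORWARD = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "xor": lambda x, y: x ^ y,
}
INVERSE = {"add": "sub", "sub": "add", "xor": "xor"}

def run(program, regs):
    """Apply each (op, dst, src) step as regs[dst] = op(regs[dst], regs[src])."""
    regs = dict(regs)
    for op, dst, src in program:
        if dst == src:
            # e.g. x -= x or x ^= x would erase x, which is an overwrite in disguise
            raise ValueError("dst and src must differ")
        regs[dst] = FORWARD[op](regs[dst], regs[src])
    return regs

def reverse(program):
    """Undo a program: the inverse of each step, in reverse order."""
    return [(INVERSE[op], dst, src) for op, dst, src in reversed(program)]

program = [("add", "a", "b"), ("xor", "b", "a"), ("sub", "a", "b")]
start = {"a": 5, "b": 3}
end = run(program, start)
print(end, run(reverse(program), end) == start)
```

A plain assignment such as `a = b` has no entry in this instruction set, because the old value of `a` could not be recomputed from the result.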

33

Irreversibility

  • Irreversibility means that information is lost
  • Information is connected to entropy:
  • Entropy is a measure of disorder, or randomness, so lack of information
  • Thermodynamic arguments show that destroying one bit of information necessarily dissipates at least energy $kT \ln(2)$ [Landauer 1961]

34

kT ln(2)?

  • So, $E_{min} = kT \ln(2)$
  • Boltzmann constant: $k = 1.38 \times 10^{-23}$ J/K
  • $\ln(2) = 0.693$
  • When $T = 300$ K, $E_{min} = 2.9 \times 10^{-21}$ J
  • Scale is zeptojoules!
  • Cf. $CV^2$ @ 0.1 V and 0.1 fF: attojoules
  • Between two and three orders of magnitude to go!
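The arithmetic behind this slide can be reproduced directly. The sketch below evaluates $kT \ln(2)$ at 300 K and compares it with $CV^2$ for 0.1 fF at 0.1 V; the function name `landauer_limit` is an assumption.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K (exact SI value)

def landauer_limit(temperature_k):
    """Minimum energy dissipated when one bit is erased: k T ln 2."""
    return K_BOLTZMANN * temperature_k * math.log(2)

e_min = landauer_limit(300.0)
e_cv2 = 0.1e-15 * 0.1 ** 2    # C * V^2 for 0.1 fF at 0.1 V
print(f"E_min = {e_min:.2e} J, CV^2 = {e_cv2:.1e} J, ratio = {e_cv2 / e_min:.0f}")
```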

35

Practical adiabatics

  • Small circuits research subfield since ’90s
  • Focus less on reversibility, more on energy recovery
  • Reversibility less important when load capacitance mostly wiring
  • Resonant clock distribution
  • Lowest-hanging fruit!

36

Summary

  • Moore’s-law scaling gives more computation per joule
  • End in sight?
  • Voltage reductions for traditional CMOS bottom out rather soon
  • Long way to go to thermodynamic limits
  • Something other than CMOS needed…

37