
G.J.M. Smit: Efficient architectures (PDF document)



  1. G.J.M. Smit

     Efficient architectures
     energy-efficient systems for streaming applications
     Gerard Smit
     University of Twente, Faculty EEMCS / CTIT, the Netherlands
     e-mail: G.J.M.Smit@ewi.utwente.nl

     Contents
     • Introduction
     • Streaming DSP applications
     • Tiled architectures
     • Dynamic reconfiguration
     • Run-time mapping of streams to architecture
     • Conclusion

     Motivations for energy-efficient systems
     • Portable devices that rely on batteries
         • Batteries are heavy and large
         • There is no Moore’s law for batteries
     • Exponential increase in demand for (streaming) communication and computation
         • Multimedia, wireless communication, etc.
     • High-performance computing
         • High cost of cooling and packaging
         • Reliability: a 10°C increase doubles the component failure rate
         • Power dissipation 100 W to 2000 W
         • Supply current 100 A (per chip: Itanium) to 2000 A (per board)!
     • Environmental concerns
         • Pollution, EMC radiation
         • e.g. there are 7,000 GSM antennas (x 5 providers) in the Netherlands
         • energy supply is a large portion of the exploitation budget
         • Energy bill of Google: 50 M$ per year

     Sources of energy drain in a system
     • Communication
         • energy spent by the wireless interface
         • internal traffic between various parts of the system
     • Computation
         • applications
         • operating system
         • wireless protocol processing
     • Storage / memory
         • disk
     • Display
     → there is no single primary source to target for energy reduction

     Energy profile of two mobile systems
     [Figure: energy profiles of a mobile audio player and a mobile audio/video player; source: Philips Research]

     Basic rules of energy reduction
     • Do not do more than necessary
         • avoid overhead
         • do not optimize for the ‘worst case’ but for the ‘current case’
         • React to the environment: adaptability (QoS)
     • Use locality of reference
         • avoid communication over long distances
         • avoid off-chip communication (1000 times more expensive)
         • can we wait until the connection is better/cheaper?
         • can we prefetch information when the connection is cheap?
     • Take a holistic view
         • Be energy aware at all levels of your system (QoS)
         • technological, system architecture, operating system, applications
     • Do the tasks on the most energy-efficient platform / in the most energy-efficient way
         • Heterogeneous architectures
         • Match algorithm with architecture
         • Migrate functions from mobile to wired system(?)

     Chameleon

  2. Myths and facts
     • Myths in energy reduction
         • energy consumption is only a hardware problem
         • time will solve the energy problem
         • new battery technology will solve the problem
     • Facts
         • functionality of a device is often limited by its required energy consumption
         • batteries are the largest single source of weight in a portable
         • help from IC technology will slow down
         • energy is a ‘vertical’ parameter and involves all layers
         • the solution might be in the higher levels: system architecture, communication protocols, operating system, applications
         • the gap between battery energy and required energy grows
         • communication will require relatively more energy than processing

     Battery gap
     • Battery energy content improves by approx. 10% per year
     • In most portables, batteries contribute 1/3 of the weight
     • 2 AAA batteries have an energy content of ~ 3.3 Wh
     • Required energy grows by far more than 10% per year

     Metrics
     • Power
         • Energy dissipated in a certain period of time
         • $E = P \cdot t$ [Ws]
     [Figure: power P versus time for two runs, labelled "More power, less energy" and "Less power, more energy"]

     Energy efficiency
     • $\text{energy efficiency} = \dfrac{\text{essential energy dissipation for a certain function}}{\text{actually used total energy dissipation}}$

     Design for energy efficiency
     abstraction level: examples
     system: dynamic power management, compression method, scheduling, communication error control, medium access protocols, hierarchical memory systems, application specific modules
     logic: logic encoding, data guarding, clock management, reversible logic, asynchronous design
     technological: reducing voltage, chip layout, packaging

     Technology and logic design level
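
The "more power, less energy" point in the Metrics slide follows directly from E = P·t. The sketch below works it through with made-up numbers: the two operating points (2 W for 1 s, 1.5 W for 2 s) and the 1 W load are illustrative assumptions, not values from the slides. Only the 3.3 Wh battery content comes from the deck.

```python
WS_PER_WH = 3600.0  # 1 Wh = 3600 Ws


def energy_ws(power_w, time_s):
    """E = P * t, in watt-seconds (joules)."""
    return power_w * time_s


def energy_efficiency(essential_ws, total_ws):
    """Ratio of essential energy for a function to total energy actually used."""
    return essential_ws / total_ws


def battery_runtime_s(capacity_wh, load_w):
    """Seconds a battery of the given capacity lasts at a constant load."""
    return capacity_wh * WS_PER_WH / load_w


# Hypothetical operating points for the same task:
fast = energy_ws(2.0, 1.0)  # higher power, finishes sooner
slow = energy_ws(1.5, 2.0)  # lower power, runs twice as long

print(f"fast run: {fast} Ws, slow run: {slow} Ws")
print(f"2 AAA cells (3.3 Wh) at an assumed 1 W load: {battery_runtime_s(3.3, 1.0):.0f} s")
```

Here the lower-power run uses more energy in total, which is why power alone is a poor metric for battery-operated devices.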

  3. Where does energy go in CMOS digital logic?
     • Dynamic power consumption
         • Charging and discharging capacitors
         • Most dominant (80-95%) in 130 nm technology
     • Short circuit current
         • Short circuit path between supply rails when the logic level changes
         • 10% to 15%
     • Leakage
         • Leaking diodes and transistors
         • Problem even in standby
         • Effect is increasing with smaller feature size!
         • Will soon become a significant/dominant portion of the total

     CMOS inverter
     • CMOS
         • Inherently low power
         • Cost effective
     • Level change: charge external loads
     • Level change: short current
     [Figure: CMOS inverter with supply Vdd, input V_i, output V_o, load capacitance C_l, load current I_load and crowbar current I_crowbar]

     Power consumption approximation
     • Dynamic power consumption
         $P = \sum \alpha \, C \, V^2$
         with: α = switching activity, C = total capacitance, V = voltage swing
     • Semiconductor trend
         • V drops: 5 V → 1.8 V → 1.2 V → 0.8 V
         • smaller technologies
         • C decreases (on-chip C, not off-chip C)

     Minimise capacitance
     • On-chip: 10-50 femtofarads
         • Internal C reduced by technology scaling
         • E.g. MIPS: 25% reduction in power due to migration from 0.8 um to 0.64 um
     • Off-chip: 14 pF
     • Energy required for a 32 x 32 multiplication: 5.7 nJ
     • 32 bits of data to memory (24 address lines): 20 nJ
     • With smaller feature size the gap will increase!

     Technology and logic: summary
     • Numerous techniques are available
         • Packaging
         • Technology scaling
         • Circuits
         • Clock gating
         • Data guarding
         • Architecture
     • Technological level gain is limited to x2
     • Reducing switching activity is the most effective technique

     Reordering logic inputs
     • Input probabilities: P(A=1) = 0.5, P(B=1) = 0.2, P(C=1) = 0.1
     • Circuit a: X = A & B, Z = X & C, with P(X=1) = 0.1
     • Circuit b: Y = B & C, Z = Y & A, with P(Y=1) = 0.02
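
The input-reordering example can be checked numerically. The sketch below recomputes P(X=1) and P(Y=1) from the slide's input probabilities and then estimates how often each node toggles. The toggle estimate 2·p·(1−p) is an assumption added here (inputs statistically independent of each other and from cycle to cycle); the slides only give the signal probabilities.

```python
from math import prod


def and_prob(*input_probs):
    """P(output=1) of an AND gate with independent inputs."""
    return prod(input_probs)


def toggle_rate(p_one):
    """Probability that a node changes value between two cycles,
    assuming consecutive values are independent (illustrative model)."""
    return 2.0 * p_one * (1.0 - p_one)


P_A, P_B, P_C = 0.5, 0.2, 0.1  # from the slide

# Circuit a: X = A & B, Z = X & C
p_x = and_prob(P_A, P_B)
# Circuit b: Y = B & C, Z = Y & A
p_y = and_prob(P_B, P_C)
# The output Z has the same probability in both circuits
p_z = and_prob(P_A, P_B, P_C)

activity_a = toggle_rate(p_x) + toggle_rate(p_z)
activity_b = toggle_rate(p_y) + toggle_rate(p_z)

print(f"P(X=1) = {p_x:.2f}, P(Y=1) = {p_y:.2f}")
print(f"summed switching activity: circuit a = {activity_a:.4f}, circuit b = {activity_b:.4f}")
```

With equal node capacitance and voltage, the α term in P = Σ α C V² is the only difference between the two circuits, so feeding the low-probability inputs into the first gate (circuit b) gives the lower estimate under this model.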

  4. System architectural level

     System level
     • Potentially high gain
     • Three major mechanisms
         • Avoid unnecessary activity
         • Exploit locality of reference
         • Use the most efficient platform

     System architecture tricks
     • Gated clocks and power shutdown
     • Dynamic power management
     • More efficient algorithms and architectures
     • Proper I/O interconnect design and packaging
         • Single package
     • Coding of data
     • Interconnection network
         • CPU centric / connection centric / NoC
     • Memory access
         • Recompute rather than refetch from memory
         • Local memories/cache (locality of reference)
         • Memory allocation and garbage collector

     Memories & busses
     • A significant fraction of the total energy budget is consumed in busses and memories
     • Minimise bus access
     • Minimise memory access
         • caching
         • clustering
         • compression
         • reorder access
     • Bus encoding techniques
     • Break memory into smaller sub-arrays/banks
         • Each bank can be individually powered down

     Encoding
     • A large amount of energy goes into off-chip I/O
     • Encoding bus data and addresses can reduce power significantly
     • Examples
         • Gray code: addresses usually increment sequentially by one
         • Compression
         • Bus invert coding
             • Transmit the original or the inverted data, whichever results in fewer transitions from the previous value
             • An extra signal indicates polarity
         • 2’s complement versus signed magnitude

     Dynamic power management
     • Natural focus of designers
         • Worst-case conditions
         • Peak performance
         • Peak utilisation
     • Consequence: the system is not fully utilised
     • Dynamic power management exploits periods of idleness caused by system under-utilisation
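
Bus invert coding, as described in the Encoding slide, can be sketched in a few lines. This is a minimal illustration rather than the deck's own implementation: the function names, the 8-bit bus width, the all-zero initial bus state and the test words are assumptions chosen for the example. The polarity line is counted as one extra wire that can also toggle.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, width):
    """Return (bus_value, polarity) pairs; polarity 1 means 'inverted'."""
    mask = (1 << width) - 1
    prev_bus, prev_inv = 0, 0  # assumed initial state of the wires
    encoded = []
    for word in words:
        inverted = ~word & mask
        # Cost = toggling data wires + toggling of the polarity wire
        plain_cost = hamming(prev_bus, word) + (prev_inv != 0)
        inv_cost = hamming(prev_bus, inverted) + (prev_inv != 1)
        if inv_cost < plain_cost:
            prev_bus, prev_inv = inverted, 1
        else:
            prev_bus, prev_inv = word, 0
        encoded.append((prev_bus, prev_inv))
    return encoded


def bus_invert_decode(encoded, width):
    """Recover the original words from (bus_value, polarity) pairs."""
    mask = (1 << width) - 1
    return [(~bus & mask) if inv else bus for bus, inv in encoded]


def count_transitions(encoded):
    """Total wire toggles, including the polarity wire, from an all-zero start."""
    prev_bus, prev_inv = 0, 0
    total = 0
    for bus, inv in encoded:
        total += hamming(prev_bus, bus) + (prev_inv != inv)
        prev_bus, prev_inv = bus, inv
    return total


words = [0x00, 0xFF, 0x01, 0xFE]  # illustrative worst-case-like pattern
encoded = bus_invert_encode(words, width=8)
plain = [(w, 0) for w in words]

print("uncoded transitions:", count_transitions(plain))
print("bus-invert transitions:", count_transitions(encoded))
```

The saving depends strongly on the data. Patterns that flip most of the bus between words benefit the most, while sequential addresses are better served by Gray code.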

  5. Problems with power management
     • Cost of restarting
         • Latency (e.g. time to spin up)
         • Extra energy, e.g. the higher start-up current of a disk
         • Disk: 2 W when active, 1 W when idle, 3 W during spin-up
     • Two main questions
         • When to shut down
         • When to wake up

     Barriers to voltage scaling
     • Voltage scaling requires the threshold voltage to be scaled as well
         • Decrease in noise margin
         • Leakage power will increase
         • Requires special circuits
     • Soft error rate will increase
         • Caused by alpha particles and cosmic rays
         • Reduced capacitance implies lower energy to flip a bit
     • Delay increases

     Why Dynamic Voltage Scaling?
     • Execute only as fast as necessary to meet deadlines
     • Workload in devices is typically bursty
       [Figure: bursty work versus time]
     • We can save energy by slowing down and thus utilising the idle time
       [Figure: the same work spread evenly over time]

     Speed vs. voltage
     [Figure: normalised delay (1 to 7) versus supply voltage (1.5 V to 3.0 V)]

     Traditional
     [Figure: energy consumption versus time for Task 1, Task 2, Task 3, Task 1, showing peak, useful computation, inactivity threshold, sleep and wake-up time]

     Voltage scaling under deadline constraints
     • Example: a task with a 100 ms deadline needs 50 ms of CPU time at full speed
     • Traditional: 50 ms computation, 50 ms idle
     • Half speed, voltage scaled: 100 ms computation, 0 ms idle
     • Ideal situation: energy is reduced to ¼
     [Figure: speed/voltage versus time for Task 1, Task 2, Task 3; S_x = initiation time of task x, D_x = deadline of task x]
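
The ¼ figure in the deadline example can be reproduced with a small calculation. The sketch below assumes an idealised model that the slides imply but do not spell out: clock speed scales linearly with supply voltage, energy per cycle scales with V², idle time costs nothing, and leakage is ignored. The function name and the returned dictionary are hypothetical.

```python
def scale_to_deadline(work_ms_at_full_speed, deadline_ms):
    """Slow the task down so that it finishes exactly at its deadline.

    Idealised assumptions (not from the slides): speed is proportional to
    voltage, energy per cycle is proportional to V^2, idle energy is zero.
    """
    speed = work_ms_at_full_speed / deadline_ms   # fraction of full speed
    voltage = speed                               # normalised supply voltage
    exec_time_ms = work_ms_at_full_speed / speed  # stretched execution time
    # Same number of cycles, each costing V^2 relative to full voltage
    relative_energy = voltage ** 2
    return {
        "speed": speed,
        "exec_time_ms": exec_time_ms,
        "idle_ms": deadline_ms - exec_time_ms,
        "relative_energy": relative_energy,
    }


# Slide example: 100 ms deadline, 50 ms of CPU time at full speed
result = scale_to_deadline(work_ms_at_full_speed=50.0, deadline_ms=100.0)
print(result)
```

In practice the saving is smaller, for the reasons listed under "Barriers to voltage scaling": the delay curve is not linear in the supply voltage, and leakage grows when the threshold voltage is scaled.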
