Helping Moore’s Law: Architectural Techniques to Address Parameter Variation



  1. Helping Moore’s Law: Architectural Techniques to Address Parameter Variation • Radu Teodorescu • Computer Science Department, University of Illinois at Urbana-Champaign • http://iacoma.cs.uiuc.edu/~teodores

  2. Technology scaling continues • [Figure: number of transistors and transistor size across Pentium 3, Pentium 4, Core 2 Duo, and Quad Opteron] • Slide footer: Architectural Techniques to Address Parameter Variation, Radu Teodorescu

  3. Challenges to scaling • Manufacturing process: sub-wavelength lithography (45nm features, 192nm light), dopant density fluctuations • Environmental: temperature variation, supply voltage fluctuations • [Figures: chip temperature and supply voltage maps; labels not recoverable]

  4. Variation in transistor parameters • Affects frequency, power, and reliability • [Figure: pdf of transistor parameters around the nominal value, labeled with switching speed and leakage power; die photos of the AMD Quad-core Opteron and the Intel 80-core Polaris]

  5. Process variation effects (130nm) • One generation of process technology is lost to process variation. • [Figure: scatter plot; axis labels and tick values not recoverable] • Source: Shekhar Borkar et al., Intel, DAC 2003

  6. Variation components • Die-to-die • Within-die • [Figure: four-core dies (C1 to C4) with regions of fast, leaky transistors and regions of slower, less leaky transistors]

  7. Addressing parameter variation • Variation reduction: dynamic fine-grain body biasing (reduce power of high power cells, speed up slow cells) • Variation tolerance: variation-aware application scheduling and power management • [Figure: computing stack of runtime system, microarchitecture, and circuits, with variation tolerance marked toward the runtime system and variation reduction toward the circuits; a 4-core die (C1 to C4) and a 20-core CMP (C1 to C20) with L2 caches]

  8. Outline • Two solutions: • Dynamic fine-grain body biasing [MICRO’07] • Variation-aware scheduling and power management [ISCA’08] • Evaluation • Future work

  9. Outline • Two solutions: • Dynamic fine-grain body biasing • Variation-aware scheduling and power management • Evaluation • Future work

  10. Body biasing • A voltage is applied between source/drain and substrate of a group of transistors • Forward body bias (FBB): frequency increases, leakage increases • Reverse body bias (RBB): frequency decreases, leakage decreases • Key knob to trade off frequency for leakage power • [Figure: BB trades frequency against leakage power; DVFS trades frequency against dynamic power]
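
The frequency/leakage trade-off on this slide can be sketched with a simple first-order model. This is a minimal illustration, not the model used in the talk: it assumes a linearized threshold-voltage shift with body bias, the alpha-power law for frequency, and exponential subthreshold leakage. Every constant (`VDD`, `VTH0`, `K_BODY`, `ALPHA`, `N_VT`) is a made-up value chosen only to show the direction of the effect.

```python
import math

# All constants are illustrative assumptions, not values from the slides.
VDD = 1.0            # supply voltage (V)
VTH0 = 0.30          # threshold voltage at zero body bias (V)
K_BODY = 0.10        # threshold shift per volt of body bias (linearized)
ALPHA = 1.3          # exponent of the alpha-power delay model
N_VT = 1.5 * 0.026   # subthreshold slope factor times thermal voltage (V)


def threshold(vbb: float) -> float:
    """Threshold voltage under body bias vbb (positive = FBB, negative = RBB)."""
    return VTH0 - K_BODY * vbb


def drive(vth: float) -> float:
    """Alpha-power law: frequency is proportional to (Vdd - Vth)^alpha / Vdd."""
    return (VDD - vth) ** ALPHA / VDD


def relative_frequency(vbb: float) -> float:
    """Frequency relative to the zero-bias case."""
    return drive(threshold(vbb)) / drive(VTH0)


def relative_leakage(vbb: float) -> float:
    """Subthreshold leakage relative to zero bias: I ~ exp(-Vth / (n * vT))."""
    return math.exp((VTH0 - threshold(vbb)) / N_VT)


if __name__ == "__main__":
    for vbb in (-0.4, -0.2, 0.0, 0.2, 0.4):
        print(f"Vbb={vbb:+.1f} V  frequency x{relative_frequency(vbb):.3f}  "
              f"leakage x{relative_leakage(vbb):.3f}")
```

Under these assumptions, leakage changes exponentially with the bias while frequency changes only polynomially, which is why body bias is a strong leakage knob with a comparatively small frequency cost.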

  11. Static fine-grain body biasing (S-FGBB) [Tschanz et al, Intel] • RBB reduces static power of leaky cells • FBB speeds up slow cells • The result is reduced within-die (WID) variation • Improved processor frequency, lower power • Additional control over a chip’s frequency and power • [Figure: four-core die (C1 to C4) divided into body-bias cells; FGBB trades frequency against leakage power]

  12. Static fine-grain body biasing • BB values fixed for the lifetime of the chip • Worst case conditions (temperature, power) are assumed • S-FGBB has to be conservative • [Figure: frequency vs. leakage, showing frequency bins 1 to 4, F max, a high-power region, and the leakage power limit]

  13. Dynamic fine-grain body biasing (D-FGBB) • Significant temperature variation: • Space: across different cores • Time: as the activity factor of the workload changes • Circuit delay increases with temperature • [Figures: chip temperature map; delay vs. T, from fast to slow]

  14. Dynamic fine-grain body biasing • S-FGBB: target F max at max T; BB fixed; higher power consumption • D-FGBB: target F max at average T; BB variable; lower power consumption • [Figure: each BB setting shown on a scale from RBB to FBB, for slow and fast cells]

  15. Dynamic fine-grain body biasing (build of the previous slide) • S-FGBB: target F max at max T; BB fixed; higher power consumption • D-FGBB: target F max at average T; BB variable; lower power consumption • The goal of D-FGBB is to keep the body bias optimal as temperature changes

  16. Finding the optimal BB • Dynamically measure the delay of each BB cell • Delay sampling circuit: CLK drives a critical path replica; a phase detector compares the replica output against the clock and requests FBB or RBB • BB for each cell is adjusted as temperature changes • Until optimal delay is reached
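
The adjustment loop on this slide can be sketched in software as follows. This is a rough illustration under stated assumptions, not the circuit from the talk: the `BBCell` class, its linear delay model, the 60 °C reference temperature, the bias step, and the bias limits are all hypothetical stand-ins for the critical path replica and phase detector.

```python
from dataclasses import dataclass


@dataclass
class BBCell:
    """Hypothetical body-bias cell; the delay model is an illustrative assumption."""
    base_delay_ns: float         # replica delay at 60 C with zero bias
    temp_coeff: float = 0.001    # fractional delay increase per degree C (assumed)
    bias_coeff: float = 0.25     # fractional delay decrease per volt of FBB (assumed)
    vbb: float = 0.0             # current bias: positive = FBB, negative = RBB

    def replica_delay_ns(self, temp_c: float) -> float:
        """Stand-in for the critical path replica: hotter is slower, FBB is faster."""
        thermal = 1.0 + self.temp_coeff * (temp_c - 60.0)
        return self.base_delay_ns * thermal * (1.0 - self.bias_coeff * self.vbb)


def adjust_bias(cell: BBCell, temp_c: float, period_ns: float,
                step: float = 0.032, vmin: float = -0.5, vmax: float = 0.5) -> float:
    """Step the bias until the replica just meets the clock period.

    Mimics a phase detector: a late replica asks for more FBB, an early
    replica asks for more RBB as long as timing is still met afterwards.
    """
    # Too slow: add FBB one step at a time (stops at vmax even if still late).
    while cell.replica_delay_ns(temp_c) > period_ns and cell.vbb + step <= vmax:
        cell.vbb += step
    # Timing slack available: add RBB while the next step still meets timing.
    while cell.vbb - step >= vmin:
        cell.vbb -= step
        if cell.replica_delay_ns(temp_c) > period_ns:
            cell.vbb += step  # undo the step that broke timing
            break
    return cell.vbb


if __name__ == "__main__":
    for temp in (30.0, 60.0, 100.0):
        cell = BBCell(base_delay_ns=1.0)
        vbb = adjust_bias(cell, temp, period_ns=1.0)
        print(f"T={temp:5.1f} C  Vbb={vbb:+.3f} V  "
              f"delay={cell.replica_delay_ns(temp):.4f} ns")
```

In this toy model a hot cell ends up with forward bias and a cool cell with reverse bias, which is the behavior the slide describes: the bias tracks temperature so that each cell runs with just enough speed and no more leakage than needed.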

  17. D-FGBB environments (environment: goal) • Standard: improve frequency and power • High performance: maximize frequency • Low power: minimize leakage power

  18. Standard environment • S-FGBB finds and sets F max • Under average conditions (T avg), D-FGBB saves leakage power compared to S-FGBB at F max • [Figure: frequency vs. leakage, marking the original chip (F orig), S-FGBB at T avg, D-FGBB at T avg, F max, and the power limit]

  19. D-FGBB Summary • D-FGBB is very effective at reducing WID variation • 40% lower leakage • 10% higher frequency • [Figure: frequency vs. leakage power plots for NoBB, S-FGBB, and D-FGBB; panel label S-FGBB64]

  20. Outline • Two solutions: • Dynamic fine-grain body biasing • Variation-aware scheduling and power management [ISCA’08] • Evaluation • Future work

  21. Motivation • Large CMPs will have significant core-to-core variation • We model a 20-core CMP, 32nm • Fastest core (C2) vs. slowest core (C20): leakage power 2X, total power 40%, frequency 30% • Design-identical cores will have significantly different properties • [Figure: 20-core CMP (C1 to C20) with L2 caches]

  22. How can we exploit this variation? • Current CMPs run at the frequency of the slowest core • We can run each core at the maximum frequency it can achieve • 15% average frequency increase • Heterogeneous system • Variation-aware scheduling • Variation-aware power management • [Figure: 20-core CMP (C1 to C20) with L2 caches]
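
As a small worked example of the comparison on this slide, the snippet below computes the average frequency gain from clocking each core at its own maximum instead of at the slowest core's frequency. The function name and the four frequencies are made up for illustration; they are not the modeled 20-core data behind the 15% figure.

```python
def average_frequency_gain(core_fmax_ghz):
    """Average gain from per-core clocking over clocking at the slowest core.

    Baseline: every core runs at min(core_fmax_ghz).
    Per-core: each core runs at its own maximum frequency.
    """
    slowest = min(core_fmax_ghz)
    per_core_mean = sum(core_fmax_ghz) / len(core_fmax_ghz)
    return per_core_mean / slowest - 1.0


# Made-up maximum frequencies (GHz) for a hypothetical 4-core chip.
example_fmax = [4.0, 3.6, 3.2, 3.4]
gain = average_frequency_gain(example_fmax)
print(f"average frequency increase: {gain:.1%}")
```

The gain grows with the spread between the slowest core and the rest, so chips with more within-die variation have more to recover this way.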

  23. Variation-aware scheduling • Inputs: • Variation in core frequency and power • Application behavior: dynamic power consumption, instructions per cycle (IPC) • System goals: • reduce power • improve performance • [Figure: applications mapped onto a 20-core CMP (C1 to C20) with L2 caches]
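
One way to combine these inputs for the "improve performance" goal is to place the highest-IPC applications on the fastest cores. The sketch below is a hypothetical greedy policy written for illustration; it is not claimed to be the scheduling algorithm from the talk, and the core names, application names, and numbers are invented.

```python
def schedule_for_performance(core_freq_ghz, app_ipc):
    """Pair the highest-IPC applications with the fastest cores.

    core_freq_ghz: dict mapping core name -> maximum frequency (GHz)
    app_ipc:       dict mapping application name -> average IPC
    Returns a dict mapping application -> core. If there are more
    applications than cores, the lowest-IPC applications stay unplaced.
    """
    cores = sorted(core_freq_ghz, key=core_freq_ghz.get, reverse=True)
    apps = sorted(app_ipc, key=app_ipc.get, reverse=True)
    return dict(zip(apps, cores))


def throughput(mapping, core_freq_ghz, app_ipc):
    """Total billions of instructions per second: sum of IPC x frequency."""
    return sum(app_ipc[app] * core_freq_ghz[core] for app, core in mapping.items())


# Invented example data.
cores = {"C1": 4.0, "C2": 3.2, "C3": 3.6}
apps = {"app_a": 1.8, "app_b": 0.6, "app_c": 1.2}

aware = schedule_for_performance(cores, apps)
naive = {"app_a": "C2", "app_b": "C1", "app_c": "C3"}  # ignores variation

print("variation-aware:", aware, round(throughput(aware, cores, apps), 2))
print("naive:          ", naive, round(throughput(naive, cores, apps), 2))
```

Pairing both sorted lists maximizes the sum of IPC times frequency under this simple throughput metric. A power-oriented policy would instead rank cores by power and applications by dynamic power consumption.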
