
RESISTIVE COMPUTATION: AVOIDING THE POWER WALL WITH LOW-LEAKAGE, STT-MRAM BASED COMPUTING


  1. RESISTIVE COMPUTATION: AVOIDING THE POWER WALL WITH LOW-LEAKAGE, STT-MRAM BASED COMPUTING
     Xiaochen Guo, Engin Ipek, and Tolga Soyata
     Rochester Computer Systems Architecture Laboratory

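  2. Multicore Scaling Limited by Power
     - Traditional MOSFET scaling theory relies on reducing $V_{DD}$ in proportion to device dimensions
     - $I_{leak} \propto e^{-V_{th}}$
     - $P = P_{dynamic} + P_{static} = N \cdot (C_{eff} \cdot V_{DD}^2 \cdot f + I_{leak} \cdot V_{DD})$
     - $P_{dynamic} = N \cdot (C_{eff} \cdot V_{DD}^2 \cdot f)$
     - Scaling-factor annotations on the equations (their attachment to individual terms is not recoverable from the extracted text): 2x, 1.4x, 1.4x, 1.4x, 2x
     - $V_{DD}$ has scaled very slowly since 90nm
     - Multicore scaling severely challenged by power
     6/21/12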
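
The power expression on this slide can be evaluated numerically to see why slow $V_{DD}$ scaling hurts. The following is a minimal sketch, not taken from the presentation: the function names are hypothetical, the numeric inputs are placeholders, and the per-generation factors (2x cores, about 1.4x for $C_{eff}$, $V_{DD}$ and $f$, modelled here as $\sqrt{2}$) assume classical Dennard-style scaling.

```python
def chip_power(n, c_eff, v_dd, f, i_leak):
    """Return (dynamic, static) power using P = N * (C_eff * V_DD^2 * f + I_leak * V_DD)."""
    dynamic = n * c_eff * v_dd ** 2 * f
    static = n * i_leak * v_dd
    return dynamic, static


def next_generation(n, c_eff, v_dd, f, scale_vdd=True):
    """Apply one assumed technology generation to (N, C_eff, V_DD, f).

    Assumption: core count doubles, capacitance shrinks by ~1.4x,
    frequency grows by ~1.4x, and V_DD shrinks by ~1.4x only when
    scale_vdd is True.
    """
    s = 2 ** 0.5  # the "1.4x" factor
    new_v_dd = v_dd / s if scale_vdd else v_dd
    return 2 * n, c_eff / s, new_v_dd, f * s


if __name__ == "__main__":
    base = (1, 1e-9, 1.0, 1e9)  # placeholder values: N, C_eff (F), V_DD (V), f (Hz)
    p_base, _ = chip_power(*base, i_leak=0.0)
    p_ideal, _ = chip_power(*next_generation(*base), i_leak=0.0)
    p_stuck, _ = chip_power(*next_generation(*base, scale_vdd=False), i_leak=0.0)
    print("dynamic power ratio, V_DD scaled:", p_ideal / p_base)
    print("dynamic power ratio, V_DD fixed: ", p_stuck / p_base)
```

Under these assumptions, dynamic power stays roughly constant per generation when $V_{DD}$ scales and roughly doubles when it does not, which is the pressure the slide describes.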

  3. Our Approach: Resistive Computation
     - Opportunity: spin-torque transfer magnetoresistive RAM (STT-MRAM)
       - Near-zero leakage power
       - Low-energy read operation
     - Goal: selectively migrate on-chip storage and combinational logic to STT-MRAM to reduce power
       - On-chip storage: caches, TLBs, RF, queues
       - Combinational logic: lookup-table (LUT) based computing

  4. STT-MRAM
     - Desirable properties
       - CMOS compatibility
       - Read speed as fast as SRAM
       - Density comparable to DRAM
       - Unlimited write endurance
     - Key challenge: expensive writes
       - Long switching latency (6.7ns @ 32nm)
       - High switching energy (0.3pJ/bit @ 32nm)
     - [Figure: STT-MRAM cell with access transistor and MTJ; labels: V write, V read, Value = 0, Value = 1]

  5. Switching Time vs. Cell Size
     - Faster switching with wider access transistors
       - (+) Faster writes
       - (-) Slower reads
       - (-) Lower density
       - (-) Higher read energy
     - [Figure: switching time vs. cell size; annotated groups: "L2$, L1I$, LUTs, TLBs, MC Queues" and "RF, L1D$"]

  6. Fundamental Building Blocks: RAM Arrays and Lookup Tables

  7. STT-MRAM Arrays
     - Problem: low write throughput
     - Existing solutions (multiporting, banking) incur high overheads to sustain adequate write throughput in STT-MRAM arrays

  8. STT-MRAM Arrays
     - CMOS subbank buffers
       - Latch in addr/data and release H-tree; complete write locally
       - Allow forwarding from ongoing writes
       - Facilitate local differential writes
     - Reads access subbank via exclusive read port
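
The subbank-buffer behaviour listed on this slide can be illustrated with a toy timing model. This is a sketch under stated assumptions, not the authors' design: the class name `Subbank`, the one-entry buffer, and the 4-cycle write latency are hypothetical. "Differential write" is modelled as counting only the bits that differ from the stored word.

```python
class Subbank:
    """Toy model of one STT-MRAM subbank with a single CMOS write buffer."""

    def __init__(self, write_latency=4):  # latency in cycles is a placeholder
        self.cells = {}            # addr -> word already committed to the array
        self.buffer = None         # (addr, data) of the in-flight write, if any
        self.busy_until = 0        # cycle at which the in-flight write completes
        self.write_latency = write_latency
        self.bits_flipped = 0      # cell switches actually performed

    def _retire(self, now):
        # Commit the buffered write once its switching time has elapsed.
        if self.buffer is not None and now >= self.busy_until:
            addr, data = self.buffer
            self.cells[addr] = data
            self.buffer = None

    def write(self, now, addr, data):
        """Latch addr/data locally; return False on a buffer conflict."""
        self._retire(now)
        if self.buffer is not None:
            return False           # caller has to stall or retry
        old = self.cells.get(addr, 0)
        # Differential write: only bits that change need to be switched.
        self.bits_flipped += bin(old ^ data).count("1")
        self.buffer = (addr, data)
        self.busy_until = now + self.write_latency
        return True

    def read(self, now, addr):
        """Reads use their own port, so they proceed during a write."""
        self._retire(now)
        if self.buffer is not None and self.buffer[0] == addr:
            return self.buffer[1]  # forward from the ongoing write
        return self.cells.get(addr, 0)
```

In this model the shared interconnect is free again as soon as `write` returns, a read to the address being written is served from the buffer, and a second write to the same subbank fails until the first one retires.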

  9. STT-MRAM LUTs [Suzuki09, Matsunaga08]
     - Store truth tables of logic functions directly in STT-MRAM
     - Benefits
       - Leakage confined to peripheral circuitry
       - Low-power (low-swing) lookups
       - Fast lookups using sense amp
     - Logic functions with many minterms can utilize LUTs effectively

  10. Case Study: 3-bit Adder
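
The slide text gives only the title of this case study, so the following is an illustrative sketch of the general idea of storing an adder's truth table in a lookup table. It assumes that "3-bit adder" means two 3-bit operands plus a carry-in; the function names and the index layout are hypothetical.

```python
def build_adder_lut(width=3):
    """Precompute a + b + cin for every input combination.

    Index layout (an assumption): bits [0, width) hold a,
    bits [width, 2*width) hold b, and the top bit holds cin.
    """
    mask = (1 << width) - 1
    lut = []
    for index in range(1 << (2 * width + 1)):
        a = index & mask
        b = (index >> width) & mask
        cin = index >> (2 * width)
        lut.append(a + b + cin)  # (width + 1)-bit result: sum plus carry-out
    return lut


def lut_add(lut, a, b, cin=0, width=3):
    """Add by a single table lookup instead of evaluating gate logic."""
    return lut[(cin << (2 * width)) | (b << width) | a]


if __name__ == "__main__":
    lut = build_adder_lut()
    print(len(lut), "entries;", "5 + 6 + 1 =", lut_add(lut, 5, 6, 1))
```

The table has 2^7 = 128 entries of 4 bits each, so the whole function is replaced by one read of a small array.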

  11. Pipeline Organization

  12. Hybrid CMT Pipeline
      - Small arrays and simple logic in CMOS
      - Large arrays and complex logic in STT-MRAM

  13. Front End
      - LUT-based carry-select adder to compute PC+4
      - LUT-based front-end thread selection logic
      - SRAM-based refill queue to avoid I$ conflicts
      - Predecode and back-end thread selection with MRAM-related stall conditions
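
A carry-select adder built from lookup tables can be sketched in software as follows. This is an illustration of the general technique, not the authors' circuit: the 4-bit block width, the 32-bit PC, and all names are assumptions. Each block looks up its sum for both possible carry-in values, and the carry from the previous block selects one of the two results. Hardware would perform the lookups of all blocks in parallel, whereas this loop visits them in order.

```python
BLOCK = 4                    # hypothetical block width in bits
MASK = (1 << BLOCK) - 1

# Table indexed by (cin, b, a); each entry is the (BLOCK + 1)-bit value a + b + cin.
LUT = [
    (i & MASK) + ((i >> BLOCK) & MASK) + (i >> (2 * BLOCK))
    for i in range(1 << (2 * BLOCK + 1))
]


def carry_select_add(x, y, width=32):
    """Add x and y modulo 2**width using per-block table lookups."""
    result, carry = 0, 0
    for shift in range(0, width, BLOCK):
        a = (x >> shift) & MASK
        b = (y >> shift) & MASK
        base = (b << BLOCK) | a
        sum0 = LUT[base]                        # result assuming carry-in = 0
        sum1 = LUT[(1 << (2 * BLOCK)) | base]   # result assuming carry-in = 1
        chosen = sum1 if carry else sum0        # select on the incoming carry
        result |= (chosen & MASK) << shift
        carry = chosen >> BLOCK                 # carry-out feeds the next block
    return result


def next_pc(pc):
    """Compute PC + 4 with the LUT-based adder."""
    return carry_select_add(pc, 4)
```

For example, `next_pc(0x1000)` yields `0x1004`, and the final carry-out is dropped so the result wraps at 32 bits.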

  14. Register File
      - Architectural registers of all threads aggregated in a unified STT-MRAM array to amortize subbank buffers
      - Registers of a single thread striped across subbanks to reduce subbank buffer conflicts
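
One possible way to stripe a thread's registers across subbanks is sketched below. The slide does not give the actual mapping, so the modulo scheme, the function name, and the parameters (8 subbanks, 32 registers per thread) are assumptions chosen for illustration; the sketch also assumes the register count is a multiple of the subbank count.

```python
def locate_register(thread, reg, num_subbanks=8, regs_per_thread=32):
    """Map (thread, architectural register) to (subbank, row) in a unified array.

    Consecutive registers of one thread land in different subbanks, so
    back-to-back writes from that thread tend to use different buffers.
    """
    rows_per_thread = regs_per_thread // num_subbanks
    subbank = reg % num_subbanks
    row = thread * rows_per_thread + reg // num_subbanks
    return subbank, row


if __name__ == "__main__":
    for reg in range(4):
        print("thread 0, r%d ->" % reg, locate_register(0, reg))
```

With these parameters every (thread, register) pair gets a distinct location, and registers r0 through r7 of any thread fall into eight different subbanks.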

  15. Floating-Point Unit
      - Add, Sub, Mult: 24 cycles (STT-MRAM FPU), 12 cycles (CMOS FPU)
      - Div: 64 cycles (STT-MRAM FPU), 64 cycles (CMOS FPU)

  16. Memory System
      - Use store buffers to avoid L1 D$ subbank conflicts
      - L1s optimized for fast writes using 30F² cells
      - L2 and memory controllers optimized for density using 10F² cells

  17. Evaluation

  18. Performance

  19. Power

  20. Contributions and Findings
      - New technique to reduce leakage and dynamic power in a deep-submicron microprocessor
        - Selectively migrate on-chip storage and combinational logic from CMOS to STT-MRAM
        - Use subbank buffers to alleviate long write latency
      - STT-MRAM is an attractive low-power solution beyond 32nm
        - Dramatically lower leakage power
        - Modest loss in performance
