
The IEEE Rebooting Computing Initiative and the International - PowerPoint PPT Presentation

The IEEE Rebooting Computing Initiative and the International Roadmap for Devices and Systems. Tom Conte, Co-Chair, IEEE Rebooting Computing Initiative; Vice Chair, International Roadmap for Devices and Systems; Schools of CS & ECE,


  1. The IEEE Rebooting Computing Initiative and the International Roadmap for Devices and Systems. Tom Conte, Co-Chair, IEEE Rebooting Computing Initiative; Vice Chair, International Roadmap for Devices and Systems; Schools of CS & ECE, Georgia Institute of Technology. tom@conte.us

  2. Why does computing need a “reboot”? Moore's Law for 2D is predicted to end in 2021, but few architects care, because transistors have been getting smaller yet cannot be clocked faster. From an architect’s perspective, 10nm isn’t any better than 14nm, which was only marginally better than 22nm. The Power Wall: single-thread exponential performance scaling already ended in 2005.

  3. A history of modern computing: How we got here. 1945: Von Neumann’s report describing computer arch.; 1955: Manchester Transistor Computer, IBM 709T; 1965: Software industry begins (IBM 360), Moore #1; 1975: Moore’s Law update; Dennard’s geo. scaling rule; 1985: “Killer micros”: HPC, general-purpose hitch a ride on Moore’s law; 1995: Slowdown in CMOS wires: superscalar era begins.

  4. In 1995, wire delays impact pipelining: Superscalar begins. [Figure: processor performance and Moore’s law. Source: Sanjay Patel, UIUC (used with permission)]

  5. We hid parallelism extraction with Superscalar Processor Microarchitectures. [Diagram blocks: branch predictor, instruction cache, instruction fetch, decode & dispatch, register file, schedule, issue N independent instructions, execute in parallel (ALUs), data cache, reorder instructions.] Very few of these “tricks” are energy efficient.

  6. How we got here, part 2. 1945: Von Neumann’s report describing computer arch.; 1955: Manchester Transistor Computer, IBM 709T; 1965: Software industry begins (IBM 360); 1975: Moore’s Law; Dennard’s geometric scaling rule; 1985: “Killer micros”: HPC, general-purpose hitch a ride on Moore’s law; 1995: Slowdown in CMOS wires: superscalar era begins; 2005: The Power Wall: Single thread exponential scaling ends (Intel Prescott) …

  7. Intel P4 Prescott: Q1 2004, 200 W/cm²

  8. Multicore era begins. Dilemma: could not clock a single core aggressively, yet continued to get more transistors per chip. Solution: clock multiple cores conservatively.

  9. How we got here, part 3. 1945: Von Neumann’s report describing computer arch.; 1955: Manchester Transistor Computer, IBM 709T; 1965: Software industry begins (IBM 360); 1975: Moore’s Law; Dennard’s geometric scaling rule; 1985: “Killer micros”: HPC, general-purpose hitch a ride on Moore’s law; 1995: Slowdown in CMOS wires: superscalar era begins; 2005: The Power Wall: Single thread exponential scaling ends (Intel Prescott); 2012: Realizing the problem: IEEE Rebooting Computing Initiative founded.

  10. IEEE Rebooting Computing. Goal: Rethink Everything, from Turing & Von Neumann to now. Why IEEE? It encompasses the whole computing stack. [Shown: Circuits & Systems Society; Council on Electronic Design Automation.]

  11. IEEE Rebooting Computing Summit 1: 2013 Dec. 12-13 (summary online). Invitation only. Three Pillars: Energy Efficiency, Security, Applications/HCI.

  12. IEEE Rebooting Computing Summit 2: 2014 May 14-16. Engines of Computation: Adiabatic/Reversible Computing; Approximate Computing; Neuromorphic Computing; Augmentation of CMOS. [Diagram: Rebooting Computing pillars (Energy Efficiency, Security, Applications/HCI) with the Engine Room.]

  13. RCI Summit 2: Ways to compute. Many alternatives: new switch; 3D integration; adiabatic/reversible logic; unreliable switch; approximate, stochastic; cryogenic; neuromorphic accelerators; analog neuromorphic; quantum; … Not all are general-purpose drop-ins (nor do they need to be).

  14. There was a common phenomenon we discovered… “You talking to me?!?” The phenomenal success of von Neumann caused all other approaches to be labeled as “lunatic fringe”. Biases against taking risks remain today.

  15. IEEE Rebooting Computing Summit 3: 2014 Oct. 23-24. Algorithms and Architectures: random algorithms; HCI and Applications; also Security, Approximate Computing. ITRS joins forces with RCI. [Diagram: Rebooting Computing pillars (Energy Efficiency, Security, Applications/HCI) with Algorithms & Architectures and the Engine Room.]

  16. IEEE Rebooting Computing Summit 4: 2015 Dec. 10-11. Goal: coordinating efforts between Industry (HP, Intel, NVIDIA) and US: DOE, DARPA, IARPA, NSF. Goal 2: How to roadmap the future. [Diagram: Rebooting Computing pillars (Energy Efficiency, Security, Applications/HCI) with Algorithms & Architectures and the Engine Room.]

  17. RCI: “Software drives the computer industry”. Questions for the software industry: How valuable is legacy software? What computing resources do the emerging applications need? How long, and how much investment, will it take to train a new generation of programmers? Degrees of Pain vs. Gain…

  18. Potential Approaches vs. Disruption in Computing Stack. Stack layers: Algorithm, Language, API, Architecture, ISA, Microarchitecture, FU, logic, device. Approaches, Level 1 to Level 4: “More Moore”; Hidden changes; Architectural changes; Non von Neumann computing. Legend: shading from No Disruption to Total Disruption.

  19. Level 1: More Moore. Software: Legacy code works without issue. New switch candidates. Logic examples: Tunneling FET, CNFET, superconducting electronics. Memory examples: MRAM, memristor, PCM, …

  20. More Moore: A better switch? Courtesy Dimitri Nikonov and Ian Young

  21. 3D Architecture example:

  22. 3D vs. 2D Cost Reduction. [Figure labels: deposition and etch; lithography and etch; gate; bulk Si.]

  23. Level 1: More Moore. Software: Legacy code works without issue. New switch candidates. Logic examples: Tunneling FET, CNFET, superconducting electronics. Memory examples: MRAM, memristor, PCM. Moore’s law will go to 3D.

  24. Potential Approaches vs. Disruption in Computing Stack. Stack layers: Algorithm, Language, API, Architecture, ISA, Microarchitecture, FU, logic, device. Approaches, Level 1 to Level 4: “More Moore”; Hidden changes; Architectural changes; Non von Neumann computing. Legend: shading from No Disruption to Total Disruption.

  25. Level 2: Not CMOS, but hidden. Software: Legacy code works, but may require performance tuning. Superscalar in 1995 was an example. Microarchitectural changes to: use unreliable switch logic, and/or use cryogenic superconducting; reversible computing.

  26. Lowering voltage gives a quadratic improvement in power, but devices become unreliable below 1 V. The probability of signal error grows as the energy of the signal is reduced below 20 kT.
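
The trade-off on this slide can be illustrated with a rough sketch. It assumes the textbook dynamic-power relation P ∝ C·V²·f for the "quadratic improvement", and a simple Boltzmann-style estimate exp(-E/kT) for the error probability; the slide gives neither formula, so both are assumptions, as are the function names.

```python
import math

def relative_dynamic_power(v, v_ref=1.0):
    """Dynamic power relative to v_ref, assuming P ~ C * V^2 * f
    with capacitance C and frequency f held fixed (assumption)."""
    return (v / v_ref) ** 2

def thermal_error_probability(energy_in_kt):
    """Assumed Boltzmann-style estimate exp(-E/kT); the signal energy
    is given in units of kT. This is a rough model, not the slide's."""
    return math.exp(-energy_in_kt)

if __name__ == "__main__":
    for v in (1.0, 0.7, 0.5):
        print(f"V = {v:.1f} V -> relative power {relative_dynamic_power(v):.2f}")
    for e in (40, 20, 10, 5):
        print(f"E = {e:2d} kT -> error probability ~ {thermal_error_probability(e):.1e}")
```

Under these assumptions, halving the voltage cuts dynamic power to a quarter, while the estimated error probability rises steeply once the signal energy drops below about 20 kT.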

  27. Traditional Fault Tolerant Computing Reliability: “Triple Modular Redundancy” (TMR). ~200% overhead in area and energy to correct an error due to a single bit flip. Lose all power benefit of lower voltage.
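
A minimal sketch of the TMR idea, assuming a bitwise majority voter over three copies of an integer result. The `run_tmr` helper and its `flip_mask` fault injection are hypothetical names used only for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant results: each output bit
    takes the value held by at least two of the three inputs."""
    return (a & b) | (a & c) | (b & c)

def run_tmr(fn, x, flip_mask=0):
    """Run fn three times and vote. flip_mask (hypothetical fault
    injection) flips bits in exactly one of the three copies."""
    results = (fn(x), fn(x) ^ flip_mask, fn(x))
    return majority_vote(*results)

def overhead_percent(copies=3):
    """Extra area/energy relative to one copy, ignoring the voter."""
    return (copies - 1) * 100

if __name__ == "__main__":
    # One copy suffers a single bit flip; the voter masks it.
    print(run_tmr(lambda v: v + 1, 41, flip_mask=0b100))  # -> 42
    print(overhead_percent())                             # -> 200
```

Two extra copies of the logic are what produce the roughly 200% overhead quoted on the slide.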

  28. Redundant Residue Numbers can also correct errors. Range = 3*5*2*7 = 210. The mod 11 and mod 13 residues are redundant. In the table, "a->b" marks an erroneous residue a whose correct value is b.

      | decimal | mod 3 | mod 5 | mod 2 | mod 7 | mod 11 | mod 13 |
      |---|---|---|---|---|---|---|
      | 13 | 1 | 3 | 1 | 6 | 2 | 0 |
      | 14 | 2 | 4 | 0 | 0 | 3 | 1 |
      | add case 1: 27 | 0 | 2 | 1 | 6 | 7->5 | 1 |
      | add case 2: 27 | 0 | 1->2 | 1 | 6 | 5 | 1 |
      | add case 3: 27 | 0 | 1->2 | 1 | 6 | 7->5 | 1 |

      How to correct? Apply the Chinese Remainder Theorem to the non-redundant residues to get X’, compute |X’|11 and |X’|13, and test |X’|mc == |X|mc (here m5 = 11 and m6 = 13).

      Case 1: (0,2,1,6) ⇔ 27; X’ = 27; |27|11 = 5; |27|13 = 1. |X’|m5 = 5 but |X|m5 = 7; |X’|m6 = 1 and |X|m6 = 1. Replace |X|m5 with |X’|m5.

      Case 2: (0,1,1,6) ⇔ 111; X’ = 111; |111|11 = 1; |111|13 = 7. |X’|m5 = 1 but |X|m5 = 5; |X’|m6 = 7 but |X|m6 = 1. Check error correction table.

      Case 3: Two errors. A double-error detection algorithm could be used to detect the errors, but cannot correct them.

      This has been around for a long time (1968); time to look again.
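
The three cases can be reproduced with a short sketch. It uses the slide's moduli, but it swaps in a different correction technique: instead of the slide's error correction table, it drops one modulus at a time and accepts a reconstruction that falls inside the legitimate range (a projection-style approach). The function names are assumptions, and `math.prod` and `pow(x, -1, m)` need Python 3.8 or later.

```python
from math import prod

MODULI = (3, 5, 2, 7)       # non-redundant moduli from the slide
REDUNDANT = (11, 13)        # redundant moduli from the slide
ALL = MODULI + REDUNDANT
RANGE = prod(MODULI)        # 210 legitimate values

def encode(x):
    """Residues of x with respect to all six moduli."""
    return tuple(x % m for m in ALL)

def add(a, b):
    """Residue-wise addition; no carries between residue digits."""
    return tuple((x + y) % m for x, y, m in zip(a, b, ALL))

def crt(residues, moduli):
    """Chinese Remainder Theorem for pairwise coprime moduli."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)   # modular inverse of m_i mod m
    return total % big_m

def correct(word):
    """Return (value, index of the bad residue or None).
    Raises ValueError if no single-residue fix lands in range."""
    x = crt(word, ALL)
    if x < RANGE:
        return x, None
    for i in range(len(ALL)):
        candidate = crt(word[:i] + word[i + 1:], ALL[:i] + ALL[i + 1:])
        if candidate < RANGE:
            return candidate, i
    raise ValueError("more than one residue appears to be corrupted")

if __name__ == "__main__":
    print(add(encode(13), encode(14)))      # -> (0, 2, 1, 6, 5, 1)
    print(correct((0, 2, 1, 6, 7, 1)))      # case 1 -> (27, 4)
    print(correct((0, 1, 1, 6, 5, 1)))      # case 2 -> (27, 1)
    try:
        correct((0, 1, 1, 6, 7, 1))         # case 3: two errors
    except ValueError as err:
        print("case 3:", err)
```

In this sketch the case 3 word is detected but not corrected, matching the slide. That outcome is specific to this word: with only two redundant moduli, some double errors could be miscorrected rather than detected.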

  29. RRNS Core Microarchitecture: 50% overhead (<< 200%!), 100x more power efficient. B. Deng, et al., “Computationally-redundant energy-efficient processing for y'all (CREEPY),” Proceedings of the 2016 IEEE International Conference on Rebooting Computing (ICRC), San Diego, CA, Oct. 17-19, 2016.

  30. Superconducting: smaller, lower power, same performance. [Figure: same scale comparison; 2’ x 2’.]

      | | Titan at ORNL | Superconducting Supercomputer | Ratio |
      |---|---|---|---|
      | Performance | 17.6 PFLOP/s (#2 in world*, #2 of Top500) | 20 PFLOP/s | ~1x |
      | Memory | 710 TB (0.04 B/FLOPS) | 5 PB (0.25 B/FLOPS) | 7x |
      | Power | 8,200 kW avg. (not included: cooling, storage memory) | 80 kW total power (includes cooling) | 0.01x |
      | Space | 4,350 ft² (404 m², not including cooling) | ~200 ft² (includes cooling) | 0.05x |
      | Cooling | additional power, space and infrastructure required | All cooling shown | |

      Courtesy of M. Manheimer, IARPA Cryogenic Computing Complexity (C3) Program
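
The ratio column can be checked with a few lines of arithmetic. This sketch only re-derives the ratios from the figures quoted on the slide; the dictionary keys are hypothetical names, and 5 PB is taken as 5,000 TB (decimal units, an assumption).

```python
# Figures as quoted on the slide (Titan power and space exclude cooling).
titan = {"performance_pflops": 17.6, "memory_tb": 710,
         "power_kw": 8200, "space_ft2": 4350}
superconducting = {"performance_pflops": 20, "memory_tb": 5000,
                   "power_kw": 80, "space_ft2": 200}

# Ratio of the superconducting design to Titan for each metric.
ratios = {key: superconducting[key] / titan[key] for key in titan}

def bytes_per_flops(memory_tb, performance_pflops):
    """Memory per unit of compute, in bytes per FLOP/s."""
    return (memory_tb * 1e12) / (performance_pflops * 1e15)

if __name__ == "__main__":
    for key, value in ratios.items():
        print(f"{key}: {value:.3f}x")
    print(f"Titan: {bytes_per_flops(710, 17.6):.2f} B/FLOPS")
    print(f"Superconducting: {bytes_per_flops(5000, 20):.2f} B/FLOPS")
```

The power and space ratios are conservative toward Titan, since its figures exclude cooling while the superconducting figures include it.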

  31. MIT-LL Fully-Planarized Nb Josephson Junction Process. Target: 10-Nb-layer process. Process features: • Nb/AlOx/Nb JJ technology • 10 kA/cm² (100 µA/µm²) baseline • 200-mm Si substrates • 4-, 8- & 10-Nb layer nodes • Feature sizes to 500 nm • Full planarization for uniformity • Transition to stacked/stud vias. [Micrograph: SFQ-4ee (8-Nb-layer), scale bar 2 µm.]
