

SLIDE 1


Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

Computational Complexity and New Computing Approaches

Wildly Heterogeneous Post-CMOS Technologies Meet Software

Erik P. DeBenedictis, Center for Computing Research, Sandia National Laboratories

Approved for unlimited release, SAND2017-0924 C


SLIDE 2

Overview

Logic devices fall into categories by potential upside:

  • A large class of devices is limited by thermodynamics.
    – CMOS is in this large class and has a big head start.
    – A common limit precludes any of them from being much better than the others.
  • However, the differences are worth exploiting.
  • How do we compare within the large class? Solution: complexity theory based on a kT measure.
  • This discounts CMOS’s maturity advantage by assessing physical limits of energy efficiency in units of kT.
  • Use algorithmic complexity to assess devices’ ability to combine into useful functions.
  • Analog vs. digital kT comparisons need to work.


SLIDE 3

Scope of Talk is the Red Class (the thermodynamically limited rows of the table below)

| Name of approach | Performance limit or other capability | Investment to date |
|---|---|---|
| Neural networks (irrespective of implementation) | Learning and maybe intelligence¹ | Billions |
| Quantum computing (superconducting electronics) | Quantum speedup | Billions |
| Neuromorphic computing, i.e. implementations of neural networks | Thermodynamic (kT)¹ | Billions |
| Novel devices: spintronics, carbon nanotubes, Josephson junctions, new memories, etc. | Thermodynamic (kT)² | Millions (each) |
| Analog computing | Thermodynamic (kT)³ | Millions |
| “3D + architecture,” i.e. continuation of Moore’s law | Thermodynamic (kT)⁴ | Trillion |
| Reversible computing | Arbitrarily low energy/op | Millions |

1. DeBenedictis, Erik P. "Rebooting Computers as Learning Machines." Computer 49.6 (2016): 84-87.
2. DeBenedictis, Erik P. "The Boolean Logic Tax." Computer 49.4 (2016): 79-82.
3. DeBenedictis, Erik P. "Computational Complexity and New Computing Approaches." Computer 49.12 (2016): 76-79.
4. DeBenedictis, Erik P. "It's Time to Redefine Moore's Law Again." Computer 50.2 (2017): 40-43 (in press).


SLIDE 4

Overview of Example

Memristor-based neural networks as an example

  • Analog memristor-based neural networks are claimed to be more energy-efficient than a digital implementation.
  • Difficulties in comparison:
    – Scale: measured memristor circuits are small, but a GPU cluster can execute billions of synapses.
    – Precision: memristors typically have a dozen levels, but GPUs use floating point.

Analyzing via complexity theory based on a kT measure

  • Let’s compare limits:
    – Digital kT limits via Landauer, etc.
    – Analog kT limits from circuit theory.
  • Result (below; will derive).
  • Interpretation (will discuss): there is a parameter space of scale and precision where each is best.

$E_{digital} \approx 24\,\ln(1/p_{error})\,\log_2^2(L)\,N\,kT$

$E_{analog} \approx \tfrac{1}{24}\,\ln(1/p_{error})\,L^2 N^2\,kT$
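As a quick sanity check on these two expressions, here is a minimal numeric sketch (my own illustration, not from the talk; the prefactors and functional forms are the slide’s, while the sample values of L, N, and p_error are assumptions):

```python
import math

def e_digital(L, N, p_error):
    """Digital bound from the slide: ~24 ln(1/p_error) log2(L)^2 N, in kT."""
    return 24 * math.log(1 / p_error) * math.log2(L) ** 2 * N

def e_analog(L, N, p_error):
    """Analog bound from the slide: ~(1/24) ln(1/p_error) L^2 N^2, in kT."""
    return (1 / 24) * math.log(1 / p_error) * L ** 2 * N ** 2

# A dozen distinguishable levels (memristor-like), dot products of growing length
for N in (4, 16, 64, 256):
    d, a = e_digital(12, N, 1e-3), e_analog(12, N, 1e-3)
    print(f"N={N:4d}: digital ~{d:9.0f} kT, analog ~{a:9.0f} kT")
```

For these assumed numbers the analog bound wins at small N and loses at large N, previewing the parameter-space picture on SLIDE 9.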


SLIDE 5

Novelty of Next Few Slides

  • To compare digital and analog, we need $p_{error}$, the probability that the answer will be wrong. Reliability goes up with energy, so we need a common reference point.
  • Analog circuits are limited by thermal noise of magnitude kT, but the theory is not organized in the same way as digital minimum energy.
  • The terminology has to line up.


SLIDE 6

Digital Minimum Energy

Digital circuit

  • Vectors v and w are inputs.

Minimum energy

  • $p_{error}$ per input is $e^{-E_{signal}/kT}$.
  • Leading to gate energy $E_{gate} \approx 2\,\ln(1/p_{error})\,kT$, assuming 2 inputs.
  • L distinguishable levels require $\log_2 L$-bit binary numbers.
  • A multiplier array for $\log_2 L$-bit numbers is about $6\log_2^2(L)$ gates, and the dot product needs N multiplies; assume 100% overhead (2×).

$E_{digital} \approx 24\,\ln(1/p_{error})\,\log_2^2(L)\,N\,kT$
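Collecting the factors in one line (a worked restatement of the bullets above; grouping the terms this way is my reading of the slide):

$$E_{digital} \approx \underbrace{2}_{\text{overhead}} \times \underbrace{6\,\log_2^2(L)\,N}_{\text{gate count}} \times \underbrace{2\,\ln(1/p_{error})\,kT}_{\text{energy per gate}} = 24\,\ln(1/p_{error})\,\log_2^2(L)\,N\,kT$$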


SLIDE 7

Analog Minimum Energy I

Analog circuit

  • Inputs v and w = 1/g.
  • $L = 2V/V_{pn}$, where V is the supply voltage and $V_{pn}$ is the peak noise at the amplifier.

Circuit analysis

  • $P_n = 4kTf = V_n^2\,\tfrac{1}{2}g_{max}N$, where $P_n$ ($V_n$) is the noise power (voltage) at the amplifier, f is the amplifier bandwidth, and conductivities range over $0 \ldots g_{max}$.
  • $V_{pn} = V_n\,A_v\,\sqrt{\ln(1/p_{error})}$, where $V_{pn}$ is the peak noise.
  • $P_{dot} = \tfrac{1}{6}\,V^2 g_{max} N$, where $P_{dot}$ is the power of the dot product.
  • $E_{dot} = P_{dot}/(2f)$, where $E_{dot}$ is the energy at the Nyquist frequency.


SLIDE 8

Analog Minimum Energy II

So now what happens?

  • Landauer’s contribution was to establish implementation-independent minimum energies for computation.
  • The previous slide was just a bunch of circuit equations:
    – Two equations with $g_{max}$
    – Two equations with V
    – Two equations with f
  • If Landauer was right, the circuit values should cancel.
  • Hmm. Let’s try…

$E_{analog} \approx \tfrac{1}{24}\,\ln(1/p_{error})\,L^2 N^2\,kT$
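The cancellation can be checked symbolically. The sketch below substitutes the SLIDE 7 relations into $E_{dot}$; note that making the algebra land exactly on the slide’s result requires a summing-amplifier gain of $A_v = N/2$, which is my assumption, not something stated on the slides:

```python
import sympy as sp

kT, f, gmax, N, L, perr, Av = sp.symbols('kT f g_max N L p_err A_v', positive=True)

# Circuit relations from SLIDE 7
Vn   = sp.sqrt(4 * kT * f / (sp.Rational(1, 2) * gmax * N))  # P_n = 4kTf = Vn^2 (1/2) g_max N
Vpn  = Vn * Av * sp.sqrt(sp.log(1 / perr))                   # peak noise at the amplifier
V    = L * Vpn / 2                                           # from L = 2V / V_pn
Pdot = sp.Rational(1, 6) * V**2 * gmax * N                   # dot-product power
Edot = Pdot / (2 * f)                                        # energy at the Nyquist rate

print(sp.simplify(Edot))
# proportional to A_v^2 L^2 kT ln(1/p_err) / 6: g_max, f, and N have cancelled,
# as Landauer's implementation-independence argument predicts
print(sp.simplify(Edot.subs(Av, N / 2)))
# with the assumed gain A_v = N/2: L^2 N^2 kT ln(1/p_err) / 24
```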


SLIDE 9

Comparison of Minimum Energies

Each “wins” in a region of the parameter space. How can this be right? The human brain is misplaced.

  • Well, actually, the human brain is digital.
  • Tell story: neuroscientist Brad looked at the result and said “oh yeah, biology uses level-based signaling in C. elegans and retinas…” but only at small scale. So maybe god/evolution figured this out already.

$E_{digital} \approx 24\,\ln(1/p_{error})\,\log_2^2(L)\,N\,kT$

$E_{analog} \approx \tfrac{1}{24}\,\ln(1/p_{error})\,L^2 N^2\,kT$
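Setting the two bounds equal shows where the regions meet; $\ln(1/p_{error})$ cancels, leaving a crossover vector length $N = 576\,\log_2^2(L)/L^2$ (my own arithmetic on the slide’s formulas):

```python
import math

def crossover_N(L):
    """Vector length at which the analog and digital bounds cross.
    Below this N the analog bound is lower; above it, digital wins."""
    return 576 * math.log2(L) ** 2 / L ** 2

for L in (2, 4, 8, 12, 16, 32):
    print(f"L={L:3d} levels: analog bound wins for N < {crossover_N(L):6.1f}")
```

Low precision (small L) favors analog out to larger scales, consistent with the C. elegans/retina remark above.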


SLIDE 10


What’s Different?

  • Variable energy per multiply, at equal precision.
  • Divide by N for the energy per arithmetic operation (equations below).
  • The energy consumed by an analog multiply depends on how many times the result is added up.
  • …or maybe, multiplies are free, but adds are not?
  • Why? Circuit equations rule, but intuitively, signals flow backwards through the memristor array (show the audience).
  • Consequence: algorithms do not readily transport from analog to digital and vice versa.

$E_{digital}/N \approx 24\,\ln(1/p_{error})\,\log_2^2(L)\,kT$

$E_{analog}/N \approx \tfrac{1}{24}\,\ln(1/p_{error})\,L^2\,N\,kT$ ← look here: the analog energy per operation still grows with N

SLIDE 11

Second Example: Ultra Low-energy Synapse


  • The kT-limits approach can be applied to a quasi-analog neural synapse.
  • It achieves much less than kT energy dissipation per training cycle.
  • Why?
    – Most neural network learning is merely verifying that the system has learned what it needs to know.
    – Only state changes need to dissipate energy.
  • Ref: DeBenedictis, Erik P., et al. "A Path Toward Ultra-Low-Energy Computing." Rebooting Computing (ICRC), IEEE International Conference on. IEEE, 2016.

SLIDE 12

Landauer’s Method Extracted From his Paper

| prob | inputs p q r | output state | Si contribution (k's) | Sf contribution (k's) |
|---|---|---|---|---|
| 0.125 | 1 1 1 | α | 0.25993 | 0.25993 |
| 0.125 | 1 1 0 | β | 0.25993 | 0.25993 |
| 0.125 | 1 0 1 | γ | 0.25993 | 0.367811 |
| 0.125 | 1 0 0 | δ | 0.25993 | 0.367811 |
| 0.125 | 0 1 1 | γ | 0.25993 | (merged above) |
| 0.125 | 0 1 0 | δ | 0.25993 | (merged above) |
| 0.125 | 0 0 1 | γ | 0.25993 | (merged above) |
| 0.125 | 0 0 0 | δ | 0.25993 | (merged above) |

Totals: Si = 2.079442 k's; Sf = 1.255482 k's; Si − Sf = 0.823959 k's.

[Figure: system mapping inputs (p, q, r) to outputs (p1, q1, r1), from the source.]

[Landauer 61] Landauer, Rolf. "Irreversibility and Heat Generation in the Computing Process." IBM Journal of Research and Development 5.3 (1961): 183-191.

…typically of the order of kT for each irreversible function

SLIDE 13

Backup: Details

  • Each input combination gets a row.
  • Each input combination k has probability pk, the pk's summing to 1.
  • Si (i for input) is the sum of all −pk ln pk's, in units of k.
  • Each unique output combination is analyzed.
  • Rows merge if the machine produces the same output.
  • Each output combination k has probability pk, the pk's summing to 1.
  • Sf (f for final) is the sum of all −pk ln pk's, in units of k.
  • Minimum energy is (Si − Sf) × T.
  • Notes:
    – Input states that don't merge do not raise minimum energy.
    – Inputs that merge raise minimum energy based on their probability.
    – Assumption: all input combinations equally probable.
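The recipe mechanizes directly. A short script (my own illustration; the entropy bookkeeping follows the bullets above, and the example output states reproduce the SLIDE 12 table):

```python
import math
from collections import defaultdict

def landauer_min_energy(outputs, probs=None):
    """(Si - Sf) in units of kT: entropies of the input distribution and of
    the distribution after rows with equal outputs merge."""
    n = len(outputs)
    probs = probs if probs is not None else [1.0 / n] * n  # equiprobable default
    s_i = -sum(p * math.log(p) for p in probs)
    merged = defaultdict(float)
    for out, p in zip(outputs, probs):
        merged[out] += p              # rows merge if the machine's output is equal
    s_f = -sum(p * math.log(p) for p in merged.values())
    return s_i - s_f

# SLIDE 12: eight equiprobable 3-bit inputs collapse to states
# alpha, beta (probability 1/8 each) and gamma, delta (3/8 each)
outputs = ['a', 'b', 'g', 'd', 'g', 'd', 'g', 'd']
print(f"Si - Sf = {landauer_min_energy(outputs):.6f} k's")  # ~0.823959
```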


SLIDE 14

Example: a Learning Machine

[Figure: an input stream of 0s and 1s that continues indefinitely drives an array of old-style magnetic cores; signals create currents, and a core flips.]

This “learning machine” example exceeds the energy-efficiency limits of Boolean logic. The learning machine monitors the environment for knowledge, yet usually just verifies that it has learned what it needs to know. Say “causes” (lion, apple, and night) and “effects” (danger, food, and sleep) have value 1.

Example input: {lion, danger} {apple, food} {night, sleep} {lion, danger} {apple, food} {night, sleep} {lion, danger} {apple, food} {night, sleep} {lion, danger, food} {apple, food} {night, sleep} {lion, danger} {lion, danger}

Functional example: the machine continuously monitors the environment for {1, 1} or {−1, −1} pairs and remembers them in the state of a magnetic core. Theoretically, there is no need for energy consumption unless state changes.

[Figure labels: lion, apple, night (causes); danger, food, sleep (effects).]
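A toy simulation of the learning machine (entirely my own sketch; the event stream copies the slide’s example input, and “energy” is simply a count of actual core flips):

```python
# One magnetic core per (cause, effect) pair; an observed {1, 1} pair sets the
# core, and only a genuine state change dissipates energy.
causes, effects = ("lion", "apple", "night"), ("danger", "food", "sleep")
core = {(c, e): 0 for c in causes for e in effects}

events = [{"lion", "danger"}, {"apple", "food"}, {"night", "sleep"}] * 3 + [
    {"lion", "danger", "food"}, {"apple", "food"}, {"night", "sleep"},
    {"lion", "danger"}, {"lion", "danger"}]

flips = checks = 0
for event in events:
    for c in causes:
        for e in effects:
            if c in event and e in event:   # a {1, 1} coincidence observed
                checks += 1
                if core[(c, e)] == 0:       # state change: must dissipate
                    core[(c, e)] = 1
                    flips += 1
                # else: merely verifying what was already learned; free in principle

print(f"{flips} dissipative flips in {checks} coincidences")  # most checks are free
```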


SLIDE 15

Analysis of One Synapse

[Table: Landauer-style analysis of one synapse. Each row lists the probability of an input combination (left wire, right wire, field direction), the resulting outputs, the Si contribution, the merged state, and the Sf contribution. Sixteen non-learning combinations have probability 0.062438 each (Si contribution 0.173176 k's apiece); two learning events have probability 0.0005 each (Si contribution 0.0038 k's apiece) and merge into existing states (Sf contribution 0.174061 k's for those states). Totals: Si = 2.778417 k's; Sf = 2.772585 k's. With a learning-event probability of 0.001, the minimum energy is Si − Sf = 0.005831 k's, far below 1 kT.]

Boolean logic equivalent system: [Figure: the same indefinitely continuing input stream monitored by an old-style magnetic core.]
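The table’s totals can be re-derived in a few lines (my own check; the 16-states-plus-2-rare-learning-events layout is read off the table summary above):

```python
import math

entropy = lambda ps: -sum(p * math.log(p) for p in ps)  # in units of k

# Inputs: 16 verify-only combinations plus 2 rare learning events (p = 0.0005 each)
s_i = entropy([0.062438] * 16 + [0.0005, 0.0005])
# Outputs: each learning event merges into one of the 16 existing states
s_f = entropy([0.062438 + 0.0005] * 2 + [0.062438] * 14)

print(f"Si = {s_i:.6f} k's, Sf = {s_f:.6f} k's")
print(f"minimum energy = {s_i - s_f:.6f} kT")   # ~0.0058 kT, far below 1 kT
```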

SLIDE 16

Why is the “Limit” so Low? Probabilities, Aggregation, and PIM Principles

  • Synapses usually just verify that they have learned what they need to know, and actually change state with low probability. Only state changes need to dissipate.
  • The Landauer minimum energy stays the same or rises when a function is broken up into pieces; it cannot decrease.
    – If splitting into pieces produces intermediate variables that have to be erased, minimum energy will increase.
    – If the pieces digitally restore signals, they can't be aggregated.
  • Logic-memory integration helps. If you have to ship data a long distance, you probably can't use a single Landauer table.

[Figure: like Landauer's "machine," but l and r are trits and s, s1 are state; trit inputs (l, r, s) map to (l1, r1, s1).]


SLIDE 17

Can We Find a Device or Circuit that Might be Able to Reach the Limit Described?

  • Requirements:
    – Row, column addressable (i.e. the array).
    – The addressed cell can be set to 1 or −1; all other cells are unchanged.
    – Zero dissipation if a cell is unaddressed or its value is already correct.
    – Minimum energy (TΔS) if a cell changes state.
  • Literature:
    – P. Zulkowski and M. DeWeese, "Optimal Finite-Time Erasure of a Classical Bit," Physical Review E 89.5 (2014): 052140.
    – Uses a protocol for raising/lowering barriers and tilt.
    – Dissipation is TΔS + O(1/tf), approaching Landauer's minimum in the limit of large protocol time tf (we can have a lot of discussion on this if you like).
  • Is there a circuit that does this?
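A toy illustration of that asymptotic (sketch only; kT ln 2 is the Landauer minimum for a bit erasure, while the O(1/tf) coefficient below is an arbitrary assumed constant, not a number from the paper):

```python
import math

LANDAUER = math.log(2)   # minimum erasure cost for one bit, in kT units
C = 10.0                 # assumed coefficient of the O(1/tf) excess (illustrative)

for tf in (1, 10, 100, 1000):            # protocol duration, arbitrary units
    print(f"tf = {tf:5d}: dissipation ~ {LANDAUER + C / tf:.4f} kT")
# The excess falls off as 1/tf, so Landauer's minimum is reached only in the
# infinite-time limit, per Zulkowski & DeWeese.
```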


SLIDE 18

Semenov’s nSQUID circuit

  • A. Circuit
  • B. Measurements
  • C. Micrograph
  • D. Behavior: 2.5 ln 2 kT for 16 devices (~1/3 kT/device)
  • Ref: V. K. Semenov, G. V. Danilov, and D. V. Averin, "Negative-Inductance SQUID as the Basic Element of Reversible Josephson-Junction Circuits," IEEE Transactions on Applied Superconductivity 13.2 (2003): 938-943.


SLIDE 19


Addition of Addressing

[Figure: array addressing with currents Icol0–Icol2, Irow0–Irow2, and Idata; energy-vs-I(data) curves for selected, half-selected, and unselected cells.]

  • The author proposes addressing, which was not present in Semenov's work.
  • Excel spreadsheet of the potential wells:
    – Top: the addressed cell.
    – Lower: un-addressed and half-addressed cells.
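In code, the addressing requirement from SLIDE 17 looks like a classic coincident-current discipline (my own sketch; the drive levels and threshold are arbitrary illustrative numbers):

```python
# Coincident-current addressing: a cell is driven by its row and column lines.
# Only the fully selected cell exceeds the switching threshold; half-selected
# and unselected cells hold their state and, ideally, dissipate nothing.
ROW_DRIVE = COL_DRIVE = 0.5   # arbitrary units; each alone is a half select
THRESHOLD = 0.75              # switching threshold between half and full select

def drive(row_selected: bool, col_selected: bool) -> float:
    return ROW_DRIVE * row_selected + COL_DRIVE * col_selected

for r, c, label in [(True, True, "selected"),
                    (True, False, "half select"),
                    (False, False, "unselected")]:
    d = drive(r, c)
    print(f"{label:12s}: drive {d:.2f} -> {'switches' if d > THRESHOLD else 'holds'}")
```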

SLIDE 20

Conclusions I

What did we do?

  • Analog and digital devices/gates can be compared by the minimum number of kT to compute a result:
    – given a probability p_error that the computation gives the wrong answer;
    – the circuit has to be set up properly.
  • It took many tries to get the terminology to line up.
  • Two examples were given: memristor and learning.

Memristor

  • The memristor is a straight classical device.
  • The circuit is analyzed as an algorithm, using minimum energy in units of kT as the measure.
  • It could beat digital at low precision and low complexity.


SLIDE 21

Conclusions II

Superconducting circuit

  • An absolutely bizarre technology, albeit physically demonstrated.
  • Classical, and beats Landauer's limit without being reversible.
  • However, analysis at the algorithm level is essential to the low-energy result:
    – there is no way to enter the probabilities otherwise.

What next?

  • Circuits are algorithms that create complex functions out of multiple devices.
  • The circuit-algorithm minimum energy is measurable as a function of problem parameters × kT.
  • If we assume digital will approach physical limits, the approach tells us when analog can compete.
