
Computational memory: A stepping stone to non-von Neumann computing?
Abu Sebastian, IBM Research – Zurich
Stanford EE380, 7th March 2018


Outline
- Motivation for in-memory computing
- Constituent elements of computational memory
- Computational memory: Logical operations
- Computational memory: Arithmetic operations
- Computational memory: Computing with device dynamics
- Mixed-precision in-memory computing
- Summary & Outlook


Internet of Things (IoT)
- An estimated 30 billion internet-connected devices by 2020 (35% CAGR)
- The amount of data produced is expected to exceed 40 trillion gigabytes
[Figure: billions of devices, 2013 to 2020 (0B to 30B), by category: Internet of Things, connected cars, wearables, connected / smart TVs, tablets, smartphones, personal computers. Source: BI Intelligence Estimates]

The AI revolution
[Figure: timeline from 1700 to today: The Industrial Revolution; Steam and Railways; Steel, Electricity and Heavy Engineering; Oil, Automobiles and Mass Production; Information and Telecommunications; Artificial Intelligence, powered by data]

The computing challenge
- ~20 W versus ~80,000 W
- IBM's Watson in Jeopardy!: 2880 processor threads, 16 terabytes of RAM, 80 kW of power, 20 tons of air-conditioned cooling capacity
[Figure: conventional von Neumann computing architecture, with input data and results moving between MEMORY and CPU]

The computing challenge: Landscape of AI algorithms
- Cognitive / AI: "Human intelligence" exhibited by machines
- Machine Learning: Learning without explicit programming
- Deep Learning: Many-layer neural networks
- Advanced Analytics: NoSQL, Hadoop & Analytics
- Hardware labels in the figure: largely CPUs; CPUs, FPGAs, GPUs; GPUs to train; CPUs, FPGAs for inference; race to ASICs
- WEEKs to train certain deep neural networks!

Advances in von Neumann computing
- Processor-in-memory (near-memory computing)
- Storage class memory
- Monolithic 3D integration
[Figure: memory hierarchy by access time: hard disk (~5 ms) and Flash (~100 µs) as slow, non-volatile STORAGE; DRAM (<100 ns) as fast, volatile MEMORY; CPU (<1 ns); STORAGE CLASS MEMORY in between; CMOS processing units]
- Still confined within the von Neumann paradigm
- Minimize the time and distance to memory access
Vermij et al., Proc. ACM CF, 2016; Wong, Salahuddin, Nature Nano., 2015; Burr et al., IBM J. Res. Dev., 2008

Beyond von Neumann: In-memory computing
[Figure: processing unit & conventional memory versus processing unit & computational memory]
- Perform "certain" computational tasks using "certain" memory cores/units without the need to shuttle data back and forth in the process
  - Logical operations
  - Arithmetic operations
  - Machine learning algorithms
- Exploits the physical attributes and state dynamics of the memory devices

Constituent elements of computational memory
[Figure: "Charge on a capacitor" (capacitor; floating gate with control gate, BL, WL on p-Si) versus "Alternate atomic arrangements" (phase-change material; metal-oxide)]
- Difference in atomic arrangements induced by the application of electrical pulses and measured as a difference in electrical resistance
- Resistive memory devices or memristive devices
- Based on physical mechanisms such as ionic drift and phase transition

Phase-change memory
[Figure: composition diagram with corners Ge (In, Ag, Sn), Te and Sb (Bi, Au, As), marking GeTe and Sb2Te3]
- A nanometric volume of phase-change material between two electrodes
- "WRITE" process: By applying a voltage pulse, the material can be changed from the crystalline phase (SET) to the amorphous phase (RESET)
- "READ" process: Low-field electrical resistance

Multi-level storage capability
[Figure: four programmed levels labelled "11", "10", "01", "00"]
- Possible to achieve intermediate phase configurations
- Can achieve a continuum of resistance/conductance levels
- Essentially an analog storage device!
Burr et al., IEEE JETCAS, 2016; Sebastian et al., Proc. E\PCOS, 2016
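The four labelled levels can be read as a 2-bit symbol stored in one analog device. A minimal sketch of that idea follows, assuming evenly spaced target conductances and nearest-level decoding. The conductance window (`G_MIN`, `G_MAX`), the function names, and the direction of the symbol-to-conductance mapping are illustrative assumptions, not values from the talk.

```python
# Hypothetical conductance window in microsiemens (assumed, not from the talk).
G_MIN, G_MAX = 1.0, 25.0
# Assumed ordering: "00" is the least conductive level, "11" the most.
LEVELS = ["00", "01", "10", "11"]


def target_conductance(symbol: str) -> float:
    """Return the nominal conductance (uS) assigned to a 2-bit symbol."""
    step = (G_MAX - G_MIN) / (len(LEVELS) - 1)
    return G_MIN + LEVELS.index(symbol) * step


def decode(conductance: float) -> str:
    """Map a read conductance back to the symbol with the nearest nominal level."""
    return min(LEVELS, key=lambda s: abs(target_conductance(s) - conductance))
```

A read value that has drifted slightly, such as 9.5 uS, still decodes to the symbol whose nominal level (9 uS in this sketch) is closest.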

Rich dynamic behavior
[Figure panels: strong field and temperature dependence; nanoscale thermal transport, thermoelectric effects; phase transitions, structural relaxation]
- Feedback interconnection of electrical, thermal and structural dynamics
Sebastian et al., Nature Comm., 2014; Le Gallo et al., New J. Phys., 2015; Le Gallo et al., JAP, 2016; Sebastian et al., IRPS, 2015

Logic design using resistive memory devices
[Figure: logic circuit with inputs X, Y and outputs S, C; device states: high resistance (logic "0"), low resistance (logic "1")]
- Voltage serves as the single logic state variable in conventional CMOS
- CMOS gates regenerate this state variable during computation
- How about using the resistance state of memristive devices as a state variable?
- Can toggle the states by applying voltage signals; only binary storage required
- Logical operations enabled by the interaction between voltage and resistance state variables
Borghetti et al., Nature, 2010; Vourkas, Sirakoulis, IEEE CAS Magazine, 2017

Stateful logic
[Figure: NOR gate built from two input devices (IN1, IN2) and one output device (OUT) driven by V_c, shown for several input combinations]

IN1  IN2  OUT
 0    0    1
 0    1    0
 1    0    0
 1    1    0

- Stateful logic exhibited by certain memristive logic families
- The Boolean variable is represented only in terms of the resistance state
Kvatinsky et al., IEEE TCAS, 2014
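The NOR truth table can be mimicked with a purely behavioural model in which every Boolean value is a resistance state. The sketch below assumes a scheme in which the output device starts in the low-resistance state and is switched to high resistance whenever at least one input device conducts. That switching rule and all names are assumptions for illustration; no circuit-level behaviour (voltages, thresholds) is modelled.

```python
# Encoding from the logic-design slide: high resistance = logic "0", low resistance = logic "1".
HIGH_R, LOW_R = 0, 1


def stateful_nor(in1: int, in2: int) -> int:
    """Behavioural NOR: the result is held as the resistance state of the output device."""
    out = LOW_R  # assumed initialisation step: output device set to logic "1"
    # Assumed evaluation step: if either input device is in the low-resistance
    # state, the drive voltage switches the output device to high resistance.
    if in1 == LOW_R or in2 == LOW_R:
        out = HIGH_R
    return out


truth_table = [(a, b, stateful_nor(a, b)) for a in (0, 1) for b in (0, 1)]
```

Here `truth_table` holds one `(IN1, IN2, OUT)` tuple per input combination, in the same row order as the table on the slide.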

Bulk bitwise operations
[Figure: cross-bar array of binary states with selected lines driven by V_C and the remaining lines held at V_ISO]
- Can perform bulk bit-wise operations in a cross-bar array
- Each processing task can be divided into a sequence of such operations
Talati et al., IEEE Trans. on Nanotech., 2016
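To illustrate how a task decomposes into a sequence of bulk operations, the sketch below treats each row of bits as one array line and uses an element-wise NOR as the only primitive. NOT, OR, AND and XOR are then built from NOR steps alone. The function names are hypothetical, and the lists stand in for crossbar rows; nothing here models the array electrically.

```python
from typing import List

Row = List[int]


def bulk_nor(a: Row, b: Row) -> Row:
    """Element-wise NOR of two equally long rows of bits (the only primitive used)."""
    return [int(not (x or y)) for x, y in zip(a, b)]


def bulk_not(a: Row) -> Row:
    return bulk_nor(a, a)                      # NOT a = NOR(a, a)


def bulk_or(a: Row, b: Row) -> Row:
    return bulk_not(bulk_nor(a, b))            # OR = NOT(NOR)


def bulk_and(a: Row, b: Row) -> Row:
    return bulk_nor(bulk_not(a), bulk_not(b))  # De Morgan


def bulk_xor(a: Row, b: Row) -> Row:
    return bulk_nor(bulk_nor(a, b), bulk_and(a, b))


# Half adder applied to every bit position at once: sum = XOR, carry = AND.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
sum_bits, carry_bits = bulk_xor(x, y), bulk_and(x, y)
```

Each helper is a fixed number of NOR steps regardless of row length, which is the sense in which the operations are "bulk".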

Matrix-vector multiplication
[Figure: matrix elements MAP to conductance values; vector elements MAP to read voltages; the result is DECIPHERed from the currents]
- By arranging the memristive devices in a cross-bar configuration, one can perform a matrix-vector multiplication with O(1) time complexity
- Exploits multi-level storage capability and Kirchhoff's circuit laws
- Can also implement multiplication with the matrix transpose
Burr et al., Adv. Phys: X, 2017; Zidan et al., Nature Electronics, 2018
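The mapping can be written out numerically. In the sketch below, each device contributes a current equal to its conductance times the applied voltage (Ohm's law), and the currents on a shared line add up (Kirchhoff's current law). The row/column orientation, the function names, and the restriction to non-negative matrix entries are assumptions made for illustration. In the array all products form in parallel; the loops here only emulate that in software.

```python
from typing import List

Matrix = List[List[float]]


def crossbar_mvm(G: Matrix, v: List[float]) -> List[float]:
    """Voltages v[n] on the columns, currents read on the rows: I[m] = sum_n G[m][n] * v[n]."""
    return [sum(g * vn for g, vn in zip(row, v)) for row in G]


def crossbar_mvm_transpose(G: Matrix, v: List[float]) -> List[float]:
    """Voltages v[m] on the rows, currents read on the columns: I[n] = sum_m G[m][n] * v[m]."""
    currents = [0.0] * len(G[0])
    for row, vm in zip(G, v):
        for n, g in enumerate(row):
            currents[n] += g * vm
    return currents


# Conductances in microsiemens and voltages in volts give currents in microamperes.
G = [[1.0, 2.0],
     [3.0, 4.0]]
row_currents = crossbar_mvm(G, [0.1, 0.2])
column_currents = crossbar_mvm_transpose(G, [0.1, 0.2])
```

The same stored array `G` serves both calls, which mirrors the slide's point that multiplication with the transpose needs no second copy of the matrix.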

Storing a matrix element in a PCM device
[Figure: iterative programming algorithm shown as a feedback loop (+/-); distribution of conductance values in a large array]
- An iterative programming scheme is typically used to store the matrix elements in a PCM device
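An iterative programming scheme can be pictured as a program-and-verify loop: read the device, compare with the target, apply a corrective pulse, repeat. The sketch below assumes a toy device whose conductance moves by a fraction of the error plus Gaussian noise on every pulse. The device model, parameter values and names are hypothetical and do not describe the algorithm used in the talk.

```python
import random
from typing import Tuple


def program_device(target: float, tolerance: float = 0.1, gain: float = 0.5,
                   noise: float = 0.02, max_pulses: int = 100,
                   seed: int = 0) -> Tuple[float, int, bool]:
    """Return (final conductance, pulses applied, converged) for a toy device model."""
    rng = random.Random(seed)
    conductance = 0.0
    for pulses in range(max_pulses + 1):
        error = target - conductance            # verify: read back and compare
        if abs(error) <= tolerance:
            return conductance, pulses, True
        if pulses == max_pulses:
            break
        # program: corrective pulse proportional to the error, with write noise
        conductance += gain * error + rng.gauss(0.0, noise)
        conductance = max(conductance, 0.0)     # conductance cannot be negative
    return conductance, max_pulses, False


final_g, pulses_used, converged = program_device(target=10.0)
```

The loop stops as soon as the read-back value is within `tolerance` of the target, so devices that respond well need fewer pulses.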

Scalar multiplication using PCM devices
- Experimental characterization of scalar multiplication based on Ohm's law

Application: Compressed sensing and recovery
[Figure: high-dimensional signal → compressed measurements → high-dimensional signal (recovered)]
- Compressed sensing: Acquire a large signal at sub-Nyquist sampling rates and subsequently reconstruct that signal accurately
- Sampling and compression are done simultaneously
- Used in applications such as MRI, facial recognition, holography, audio restoration and mobile-phone camera sensors (where it allows a significant reduction in the acquisition energy per image)

Compressed sensing using computational memory
[Figure: measurement step and iterative reconstruction (AMP algorithm) sharing one cross-bar array]
- Store the measurement matrix in a cross-bar array of resistive memory devices
- The same array is used for both compression and reconstruction
- Reconstruction complexity reduction: O(NM) → O(N)
Le Gallo et al., Proc. IEDM, 2017
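A software sketch of the measurement and reconstruction steps shows where the stored matrix is used. It assumes a textbook approximate message passing (AMP) iteration with soft thresholding and a random Gaussian measurement matrix. The threshold rule, problem sizes and all names are illustrative choices; the talk does not specify the variant used. The two helpers `matvec` and `matvec_t` are the O(NM) products that a cross-bar array would evaluate in place, using the matrix and its transpose.

```python
import math
import random


def matvec(A, x):
    """Compute A x (compression, and the residual update during recovery)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matvec_t(A, z):
    """Compute A^T z (back-projection during recovery)."""
    out = [0.0] * len(A[0])
    for row, zi in zip(A, z):
        for j, a in enumerate(row):
            out[j] += a * zi
    return out


def soft_threshold(r, theta):
    return [math.copysign(max(abs(v) - theta, 0.0), v) for v in r]


def amp_recover(A, y, iterations=30, alpha=1.5):
    """Basic AMP for sparse recovery; alpha scales the threshold (assumed value)."""
    m, n = len(A), len(A[0])
    x, z = [0.0] * n, list(y)
    for _ in range(iterations):
        theta = alpha * math.sqrt(sum(v * v for v in z) / m)
        pseudo = [xi + ci for xi, ci in zip(x, matvec_t(A, z))]
        x = soft_threshold(pseudo, theta)
        onsager = sum(1 for v in x if v != 0.0) / m   # Onsager correction factor
        z = [yi - ai + onsager * zi for yi, ai, zi in zip(y, matvec(A, x), z)]
    return x


rng = random.Random(0)
N, M, K = 256, 128, 8                                 # signal size, measurements, nonzeros
A = [[rng.gauss(0.0, 1.0 / math.sqrt(M)) for _ in range(N)] for _ in range(M)]
x_true = [0.0] * N
for idx in rng.sample(range(N), K):
    x_true[idx] = rng.choice([-1.0, 1.0])
y = matvec(A, x_true)                                 # compressed measurements
x_hat = amp_recover(A, y)
rel_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_hat, x_true))
                    / sum(b * b for b in x_true))
```

Apart from the two matrix-vector products, every step in the loop is element-wise and costs O(N), which is consistent with the O(NM) → O(N) reduction stated on the slide once those products are moved into the array.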
