Emerging Non-Volatile Memory: Resistive Memory Technologies



SLIDE 1

Emerging Non Volatile Memory

SLIDE 2

Resistive Memory Technologies

¨ Key concept: replace the DRAM cell capacitor with a programmable resistor.

  • 1T-1C DRAM: charge-based sensing, volatile
  • 1T-1R STT-MRAM, PCM, RRAM: resistance-based sensing, non-volatile
SLIDE 3

Leading Contenders

STT-MRAM

  + High endurance (~10^15 writes)
  + ~4 ns switching time
  + ~50 µW switching power
  - Limited to single-level cell
  - Not 3D-stackable

PCM-RAM

  + Multi-level cell capable
  + 4F^2 3D-stackable cell
  - Endurance: ~10^9 writes
  - ~100 ns switching time
  - ~300 µW switching power

R-RAM

  + Multi-level cell capable
  + 4F^2 3D-stackable cell
  + ~5 ns switching time
  + ~50 µW switching power
  - Endurance: 10^6 to 10^12 writes

Sources: [ITRS’13] [Halupka, et al. ISSCC’10] [Pronin. EETime’13] [Henderson. InfoTracks’11]

SLIDE 4

Positioning of Resistive Memories

[Figure: RRAM, PCM, and STT positioned relative to SRAM, DRAM, FLASH, and HDD. Axis labels: Lower Cost, Capacity, Higher Speed, Higher Endurance.]

SLIDE 5

In-Memory Processing

SLIDE 6

Example Research Question

¨ Can we reduce the cost of data movement between memory and the processor core?

[Figure: Processor and Memory, labeled 1X and 500X. How to reduce data movement energy? Data source: Nvidia]

SLIDE 9

Combinatorial Optimization

¨ Numerous critical problems in science and engineering can be cast within the combinatorial optimization framework.

  • Applications: Communication Networks, Data Mining, DNA Analysis, Artificial Intelligence, Pharmaceuticals
  • Combinatorial Optimization Problems: Traveling Salesman, Knapsack, Scheduling, Machine Learning, Bin Packing
  • Approximate Heuristic Algorithms: Genetic Algorithms, Ant Colony Optimization, Semi-Definite Programming, Simulated Annealing, Tabu Search
  • Massively Parallel Boltzmann Machine

SLIDE 11

The Boltzmann Machine

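¨ Two-state units connected with real-valued edge weights form a stochastic neural network.

¨ Goal: iteratively update the state or weight variables to minimize the network energy (E).

[Figure: The Boltzmann Machine. Unit x_j computes a sum (Σ) over inputs such as x_0 and x_3, weighted by w_{0,j} and w_{3,j}.]

$E = -\frac{1}{2} \sum_{i} \sum_{j} x_i x_j w_{i,j}$

$\delta = (2 x_j - 1) \sum_{i} x_i w_{i,j}$

$\frac{1}{1 + e^{\delta / C}}$, where $C$ is the control parameter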
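The expressions above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: it assumes symmetric weights with a zero diagonal, reads δ as the energy change caused by flipping unit j, and reads 1/(1 + e^(δ/C)) as the probability of accepting that flip. The function names, the cooling schedule, and the example weights are hypothetical.

```python
import math
import random

def energy(x, w):
    """Network energy E = -1/2 * sum_i sum_j x_i * x_j * w[i][j]."""
    n = len(x)
    return -0.5 * sum(x[i] * x[j] * w[i][j] for i in range(n) for j in range(n))

def delta(x, w, j):
    """delta = (2*x_j - 1) * sum_i x_i * w[i][j].

    With symmetric weights and a zero diagonal (an assumption of this
    sketch), this equals the change in E if unit j flips.
    """
    return (2 * x[j] - 1) * sum(x[i] * w[i][j] for i in range(len(x)))

def accept_probability(d, c):
    """1 / (1 + e^(d / c)), where c is the control parameter."""
    z = d / c
    if z > 700.0:  # math.exp would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def anneal(x, w, c=2.0, cooling=0.95, c_min=0.05, steps=200, seed=0):
    """Stochastic state updates with a hypothetical geometric cooling schedule."""
    rng = random.Random(seed)
    x = list(x)
    for _ in range(steps):
        j = rng.randrange(len(x))
        if rng.random() < accept_probability(delta(x, w, j), c):
            x[j] = 1 - x[j]
        c = max(c * cooling, c_min)
    return x

# Hypothetical 3-unit network: symmetric weights, zero diagonal.
W = [[0, 1, -2],
     [1, 0, 3],
     [-2, 3, 0]]
X = [1, 0, 1]
```

For this example network, `delta(X, W, 1)` is -4: flipping unit 1 lowers the energy from 2.0 to -2.0, so the flip is accepted with probability above one half.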

SLIDE 12

Computational Model

¨ Network energy is minimized by either adjusting the edge weights or recomputing the states.

¨ Iterative matrix-vector multiplication between weights and states is critical to finding the minimal network energy.

[Figure: The Boltzmann Machine. Data movement between the memory arrays, which hold the weights (w_{0,0}, w_{0,1}, …, w_{1,0}, …) and states (x_0, x_1, …), and the functional units (Σ, 1/(1 + e^x)).]
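Every unit's δ depends on a weighted sum of the states, so one matrix-vector product yields the sums for all units at once. The sketch below shows that step in plain Python, assuming binary states and a square weight matrix; the helper names and example values are illustrative only.

```python
def matvec(w, x):
    """Return s with s[j] = sum_i x[i] * w[i][j] (weights times states)."""
    n = len(x)
    return [sum(x[i] * w[i][j] for i in range(n)) for j in range(n)]

def all_deltas(w, x):
    """Apply delta_j = (2*x_j - 1) * s_j to every unit, reusing one product."""
    s = matvec(w, x)
    return [(2 * xj - 1) * sj for xj, sj in zip(x, s)]

# Hypothetical example values.
W = [[0, 1, -2],
     [1, 0, 3],
     [-2, 3, 0]]
X = [1, 0, 1]
print(all_deltas(W, X))  # [-2, -4, -2]
```

The inner sums in `matvec` are the part that dominates each iteration, which is why the location of the weights relative to the functional units matters.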

SLIDE 13

Resistive Random Access Memory

¨ An RRAM cell comprises an access transistor and a resistive switching medium.

[Figure: RRAM cell connected to a wordline and a bitline, with voltage V; the Boltzmann machine's functional units and RRAM arrays. RRAM: Resistive RAM (source: HP, 2009)]

SLIDE 14

Resistive Random Access Memory

¨ A read is performed by activating a wordline and measuring the bitline current (I).

[Figure: wordline driven to ‘1’, voltage V applied across cell resistance R1; bitline current $I = V / R_1$.]
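The read operation reduces to Ohm's law followed by a comparison. The sketch below assumes that a low-resistance cell is read as 1 and uses made-up voltage, resistance, and reference-current values; none of these numbers come from the slides.

```python
def bitline_current(v_read, r_cell):
    """Ohm's law for one activated cell: I = V / R."""
    return v_read / r_cell

def read_bit(v_read, r_cell, i_ref):
    """Compare the bitline current with a reference current.

    Assumption: a low-resistance cell (large current) encodes 1.
    """
    return 1 if bitline_current(v_read, r_cell) > i_ref else 0

# Hypothetical operating point: 0.2 V read voltage, 1 uA reference current.
V_READ = 0.2
I_REF = 1e-6
LOW_R, HIGH_R = 10_000.0, 1_000_000.0  # ohms, illustrative only
print(read_bit(V_READ, LOW_R, I_REF), read_bit(V_READ, HIGH_R, I_REF))  # 1 0
```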

SLIDE 15

Memristive Boltzmann Machine

¨ Key idea: exploit current summation on the RRAM bitlines to compute a dot product.

[Figure: four wordlines driven to ‘1’ with voltage V; bitline current $I = \sum_i V / R_i$.]
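Current summation can be checked numerically. With the same voltage on every active cell, the bitline current is V times the sum of the active cell conductances, which is a dot product between the wordline activations and the conductance vector. A small sketch, with hypothetical resistance values:

```python
def summed_bitline_current(v, resistances, active):
    """I = sum of V / R_i over the cells whose wordline is active (1)."""
    return sum(a * v / r for r, a in zip(resistances, active))

def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

# Hypothetical values: 1 V on the cells, four cells on one bitline.
V = 1.0
R = [1000.0, 2000.0, 4000.0, 8000.0]  # ohms
ACTIVE = [1, 1, 1, 1]                 # all four wordlines driven to '1'
G = [1.0 / r for r in R]              # conductances in siemens

current = summed_bitline_current(V, R, ACTIVE)
same_as_dot = abs(current - V * dot(ACTIVE, G)) < 1e-12
```

Deactivating a wordline simply drops its term from the sum, so the activation pattern selects which conductances take part in the product.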

SLIDE 16

Memristive Boltzmann Machine

¨ Memory cells represent the weights, and the state variables are used to control the bitlines and wordlines.

[Figure: cells storing w01, w02, w03, w04; states X0 and X1 to X4 control the bitline and wordlines; voltage V; bitline current $I = \sum_i V / R_i$.]

SLIDE 17

Memristive Boltzmann Machine

¨ Memory cells represent the weights, and the state variables are used to control the bitlines and wordlines.

[Figure: cells storing w01, w02, w03, w04; states X0 and X1 to X4 control the bitline and wordlines; voltage V; bitline current $I = \sum_i X_0 X_i W_{0i}$.]
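A hedged sketch of how the gated current relates to the weighted sum: each weight is assumed to map linearly to a cell conductance, X0 enables the bitline, and each Xi enables one wordline, so only terms with X0 = Xi = 1 contribute. The scaling constants and the restriction to non-negative weights are assumptions of this example, not details from the slides.

```python
G_UNIT = 1e-6  # hypothetical conductance per unit of weight, in siemens
V_READ = 0.2   # hypothetical read voltage, in volts

def gated_bitline_current(x0, xs, weights):
    """I = sum_i x0 * x_i * w_0i, expressed as a current.

    Each weight w_0i is modeled as a conductance w_0i * G_UNIT
    (assumes non-negative weights).
    """
    return sum(x0 * xi * (w * G_UNIT) * V_READ for xi, w in zip(xs, weights))

def current_to_sum(current):
    """Undo the hypothetical scaling to recover sum_i x0 * x_i * w_0i."""
    return current / (V_READ * G_UNIT)

# Hypothetical states and weights for one bitline.
XS = [1, 0, 1, 1]
W0 = [3, 5, 2, 4]
recovered = current_to_sum(gated_bitline_current(1, XS, W0))
```

Here the second cell is skipped because its state is 0, so the recovered value is 3 + 2 + 4; with X0 = 0 the bitline carries no current at all.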

SLIDE 18

System Integration

Software configures the on-chip data layout and initiates the optimization by writing to a memory-mapped control register. To maintain ordering, accesses to the accelerator are made uncacheable by the processor. DDR3 reads and writes are used for configuration and data transfer.

  1. Configure the DIMM
  2. Write weights and states
  3. Compute
  4. Read the outcome

[Figure: CPU and controller connected to DRAM and the accelerator DIMM]
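The four steps can be mirrored by a toy software model. Everything in this sketch is hypothetical: the class, its method names, and the `START` command value stand in for the memory-mapped interface, and the compute step only evaluates the network energy as a placeholder for the optimization that the accelerator performs.

```python
START = 1  # hypothetical command written to the control register

class MockAcceleratorDimm:
    """Toy model that only mirrors the order of the four steps."""

    def __init__(self):
        self.n_units = None
        self.weights = None
        self.states = None
        self.outcome = None

    def configure(self, n_units):
        """Step 1: configure the DIMM (here, just the problem size)."""
        self.n_units = n_units

    def write_data(self, weights, states):
        """Step 2: write weights and states."""
        if self.n_units is None:
            raise RuntimeError("configure the DIMM before writing data")
        if len(weights) != self.n_units or len(states) != self.n_units:
            raise ValueError("data does not match the configured size")
        self.weights = [list(row) for row in weights]
        self.states = list(states)

    def write_control(self, command):
        """Step 3: a write to the control register starts the computation."""
        if command != START:
            raise ValueError("unknown command")
        if self.states is None:
            raise RuntimeError("write weights and states before computing")
        x, w, n = self.states, self.weights, self.n_units
        # Placeholder computation: E = -1/2 * sum_i sum_j x_i * x_j * w_ij.
        self.outcome = -0.5 * sum(
            x[i] * x[j] * w[i][j] for i in range(n) for j in range(n)
        )

    def read_outcome(self):
        """Step 4: read the outcome."""
        if self.outcome is None:
            raise RuntimeError("no outcome available yet")
        return self.outcome

dimm = MockAcceleratorDimm()
dimm.configure(3)
dimm.write_data([[0, 1, -2], [1, 0, 3], [-2, 3, 0]], [1, 0, 1])
dimm.write_control(START)
print(dimm.read_outcome())  # 2.0
```

The state checks in each method play the role that uncacheable, ordered accesses play in the described system: a step cannot run before the one it depends on.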

SLIDE 19

Summary of Results

[Figure: two plots with axis ticks 0.01, 0.1, 1. Panels: "Execution Time Normalized to the Single Threaded Kernel" and "System Energy Normalized to the Single Threaded Baseline". Series: Multi-threaded Kernel, PIM Accelerator, Memristive Accelerator. Annotations: 60x, 34x, 9x, 6x.]