Emerging Non-Volatile Memory
Resistive Memory Technologies
• Key concept: replace the DRAM cell capacitor with a programmable resistor
- 1T-1C DRAM
  - Charge-based sensing
  - Volatile
- 1T-1R STT-MRAM, PCM, RRAM
  - Resistance-based sensing
  - Non-volatile
Leading Contenders
STT-MRAM
+ High endurance (~10^15 writes)
+ ~4 ns switching time
+ ~50 µW switching power
- Limited to single-level cell
- Not 3D-stackable

PCM-RAM
+ Multi-level cell capable
+ 4F² 3D-stackable cell
- Endurance: ~10^9 writes
- ~100 ns switching time
- ~300 µW switching power

R-RAM
+ Multi-level cell capable
+ 4F² 3D-stackable cell
- Endurance: 10^6 to 10^12 writes
+ ~5 ns switching time
+ ~50 µW switching power

[ITRS’13] [Halupka, et al. ISSCC’10] [Pronin. EETime’13] [Henderson. InfoTracks’11]
Positioning of Resistive Memories
[Figure: RRAM, PCM, and STT-MRAM positioned among SRAM, DRAM, Flash, and HDD. Axis labels: Lower Cost, Capacity, Higher Speed, Higher Endurance.]
In-Memory Processing
Example Research Question
• Can we reduce the cost of data movement between memory and the processor core?
[Figure: processor and memory, with relative energy costs labeled 1X and 500X. How can data movement energy be reduced? Data source: Nvidia]
Combinatorial Optimization
• Numerous critical problems in science and engineering can be cast within the combinatorial optimization framework.
Approximate Heuristic Algorithms: Genetic Algorithms, Ant Colony Optimization, Semi-Definite Programming, Simulated Annealing, Tabu Search
Application areas: Communication Networks, Data Mining, DNA Analysis, Artificial Intelligence, Pharmaceuticals
Combinatorial Optimization Problems: Traveling Salesman, Knapsack, Scheduling, Machine Learning, Bin Packing
Massively Parallel Boltzmann Machine
The Boltzmann Machine
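• Two-state units connected with real-valued edge weights form a stochastic neural network.
• Goal: iteratively update the state or weight variables to minimize the network energy (E).
[Figure: unit $x_j$ sums ($\Sigma$) the inputs $x_0, \dots, x_3$ weighted by $w_{0,j}, \dots, w_{3,j}$ and applies the function $\frac{1}{1 + e^{\delta/C}}$, where $C$ is the control parameter.]
$\delta = (2x_j - 1) \sum_i x_i w_{i,j}$
$E = -\frac{1}{2} \sum_i \sum_j x_i x_j w_{i,j}$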
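The two formulas on this slide can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation. It assumes symmetric weights with a zero diagonal, in which case δ equals the change in E caused by flipping x_j. Reading the expression 1/(1 + e^(δ/C)) as the probability of flipping x_j is also an assumption; the slide shows only the expression. The function names are hypothetical.

```python
import math

def energy(x, w):
    """Network energy E = -1/2 * sum_i sum_j x_i * x_j * w[i][j]."""
    n = len(x)
    return -0.5 * sum(x[i] * x[j] * w[i][j] for i in range(n) for j in range(n))

def delta(x, w, j):
    """delta = (2*x_j - 1) * sum_i x_i * w[i][j]."""
    return (2 * x[j] - 1) * sum(x[i] * w[i][j] for i in range(len(x)))

def flip_probability(d, c):
    """1 / (1 + e^(delta / C)); interpreted here (assumption) as the
    probability of flipping the unit, with C as the control parameter."""
    return 1.0 / (1.0 + math.exp(d / c))

# Small symmetric example with a zero diagonal (values are illustrative).
x = [1, 0, 1]
w = [[0, 2, -1],
     [2, 0, 3],
     [-1, 3, 0]]
d = delta(x, w, 1)                     # energy change if x_1 is flipped
flipped = [1, 1, 1]
energy_change = energy(flipped, w) - energy(x, w)
```

Under the stated assumptions `d` and `energy_change` agree, so a negative δ marks a flip that lowers the network energy and is accepted with probability above one half.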
Computational Model
• Network energy is minimized by either adjusting the edge weights or recomputing the states.
• Iterative matrix-vector multiplication between the weights and the states is critical to finding the minimal network energy.
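[Figure: the Boltzmann machine mapped to hardware. Memory arrays hold the weights ($w_{0,0}, w_{0,1}, \dots, w_{1,0}, \dots$) and states ($x_0, x_1, \dots$); functional units compute $\Sigma$ and $\frac{1}{1 + e^{x}}$; data moves between the two.]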
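To make the role of the matrix-vector product concrete, here is a hedged sketch of an iterative state update. It assumes symmetric weights with a zero diagonal and a simulated-annealing-style schedule in which the control parameter C shrinks each sweep. The schedule, the parameter names (`c`, `cooling`, `sweeps`, `seed`), and the flip rule are illustrative assumptions, not details taken from the slides.

```python
import math
import random

def matvec(w, x):
    """Weights-times-states product: s[j] = sum_i x[i] * w[i][j]."""
    n = len(x)
    return [sum(x[i] * w[i][j] for i in range(n)) for j in range(n)]

def anneal(w, x, c=2.0, cooling=0.95, sweeps=50, seed=0):
    """Stochastically update the states while lowering the control parameter."""
    rng = random.Random(seed)
    x = list(x)
    s = matvec(w, x)                      # one full matrix-vector product
    for _ in range(sweeps):
        for j in range(len(x)):
            d = (2 * x[j] - 1) * s[j]     # delta for unit j
            # Clamp the exponent so math.exp cannot overflow.
            p_flip = 1.0 / (1.0 + math.exp(min(d / c, 700.0)))
            if rng.random() < p_flip:
                change = 1 - 2 * x[j]     # +1 for 0 -> 1, -1 for 1 -> 0
                x[j] += change
                # Keep s consistent with the new state vector.
                for k in range(len(x)):
                    s[k] += change * w[j][k]
        c *= cooling                      # assumed cooling schedule
    return x

w = [[0, 2, -1],
     [2, 0, 3],
     [-1, 3, 0]]
result = anneal(w, [0, 0, 0])
```

Every update depends on a weighted sum over all states, which is why moving weights and states between the memory arrays and the functional units dominates the cost in a conventional system.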
Resistive Random Access Memory
• An RRAM cell comprises an access transistor and a resistive switching medium.
[Figure: an RRAM cell connected to a wordline and a bitline, with voltage V applied. RRAM: Resistive RAM (source: HP, 2009). A sidebar shows the Boltzmann machine with functional units and RRAM arrays.]
• A read is performed by activating a wordline and measuring the bitline current (I).
[Figure: reading a cell that stores '1' with resistance R1 gives I = V/R1.]
Memristive Boltzmann Machine
• Key Idea: exploit current summation on the RRAM bitlines to compute a dot product.
[Figure: four cells storing '1' share one bitline; with voltage V applied, the bitline current is $I = \sum_i V/R_i$.]
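The current summation can be modeled with Ohm's law per cell. The sketch below is an idealized illustration that ignores wire resistance, access-transistor effects, and sensing limits; the function name and the example resistance values are hypothetical.

```python
def bitline_current(v, resistances, active):
    """Ideal bitline current: I = sum of V / R_i over the activated cells."""
    return sum(v / r for r, on in zip(resistances, active) if on)

# Single-cell read: I = V / R1.
single = bitline_current(1.0, [2.0], [True])

# Several activated cells on one bitline: the currents add up, which is
# a dot product between the activation vector and the conductances 1/R_i.
total = bitline_current(1.0, [2.0, 4.0, 8.0], [True, True, False])
```

With one cell active this reduces to the ordinary read; with several cells active the bitline performs the addition in the analog domain.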
Memristive Boltzmann Machine
• Memory cells represent the weights, and the state variables control the bitline and wordlines.
[Figure: cells on one bitline store the weights w01, w02, w03, w04; the state variables X0 and X1 to X4 control the lines; with voltage V applied, the bitline current is $I = \sum_i V/R_i$.]
[Figure: with the weights w01 to w04 stored in the cells and the states X0 to X4 controlling the bitline and wordlines, the bitline current becomes $I = \sum_i X_0 X_i W_{0i}$.]
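A small model shows how gating by the states turns the bitline current into the weighted sum the Boltzmann machine needs. Several things here are assumptions made for illustration: X0 gates the bitline while X1 to X4 gate the wordlines, each weight is non-negative and proportional to a cell conductance, and `g_unit` is a hypothetical scale factor. The slides do not specify how weights are encoded in the cells.

```python
def gated_bitline_current(x0, xs, ws, v=1.0, g_unit=1.0):
    """Ideal current I = V * sum_i x0 * x_i * G_i, with G_i = w_i * g_unit.

    x0 -- state gating the bitline (assumed), 0 or 1
    xs -- states gating the wordlines (assumed), each 0 or 1
    ws -- weights stored in the cells on this bitline
    """
    return v * sum(x0 * xi * wi * g_unit for xi, wi in zip(xs, ws))

weights = [2, 5, 3, 1]          # w01..w04, illustrative values
states = [1, 0, 1, 1]           # X1..X4
current_on = gated_bitline_current(1, states, weights)
current_off = gated_bitline_current(0, states, weights)
```

With unit voltage and unit conductance scale, the current equals the sum of the weights whose wordline state is 1, and it vanishes entirely when X0 is 0.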
System Integration
Software configures the on-chip data layout and initiates the optimization by writing to a memory-mapped control register. To maintain ordering, accesses to the accelerator are made uncacheable by the processor. DDR3 reads and writes are used for configuration and data transfer.
[Figure: the CPU and its memory controller connected to DRAM and to the accelerator DIMM.]
1. Configure the DIMM
2. Write weights and states
3. Compute
4. Read the outcome
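The four steps can be walked through with a software stand-in for the DIMM. Everything in this sketch is hypothetical: the class, its method names, and the register behavior are invented for illustration and do not describe the real interface. The compute step is a simple greedy pass (flip a unit whenever δ is negative), used only as a placeholder for whatever the accelerator actually runs, and it assumes symmetric weights with a zero diagonal.

```python
class MockAcceleratorDIMM:
    """Software stand-in for the accelerator DIMM (all names hypothetical)."""

    def __init__(self):
        self.n_units = 0
        self.weights = []
        self.states = []
        self.done = False

    def configure(self, n_units):
        """Step 1: configure the data layout."""
        self.n_units = n_units
        self.done = False

    def write_data(self, weights, states):
        """Step 2: write weights and states (DDR3 writes on real hardware)."""
        assert len(states) == self.n_units
        self.weights = [list(row) for row in weights]
        self.states = list(states)

    def write_control_register(self, start):
        """Step 3: a write to the control register starts the optimization."""
        if start:
            self._compute()
            self.done = True

    def read_outcome(self):
        """Step 4: read the resulting states back."""
        return list(self.states)

    def _compute(self):
        # Placeholder greedy descent: flip any unit whose delta is negative.
        changed = True
        while changed:
            changed = False
            for j in range(self.n_units):
                s = sum(self.states[i] * self.weights[i][j]
                        for i in range(self.n_units))
                if (2 * self.states[j] - 1) * s < 0:
                    self.states[j] = 1 - self.states[j]
                    changed = True

dimm = MockAcceleratorDIMM()
dimm.configure(3)
dimm.write_data([[0, 2, -1], [2, 0, 3], [-1, 3, 0]], [1, 0, 0])
dimm.write_control_register(True)
outcome = dimm.read_outcome()
```

On real hardware the host would issue these steps as uncacheable loads and stores so that they reach the DIMM in program order.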
Summary of Results
[Figure: two bar charts on logarithmic axes (ticks 0.01, 0.1, 1). Left: execution time normalized to the single-threaded kernel. Right: system energy normalized to the single-threaded baseline. Series: Multi-threaded Kernel, PIM Accelerator, Memristive Accelerator. Labeled improvements: 60x, 34x, 9x, 6x.]