Onyx: A Prototype Phase-Change Memory Storage Array Ameen Akel * - - PowerPoint PPT Presentation


Onyx: A Prototype Phase-Change Memory Storage Array Ameen Akel * Adrian Caulfield, Todor Mollov, Rajesh Gupta, Steven Swanson Non-Volatile Systems Laboratory, Department of Computer Science and Engineering University of California, San Diego *


SLIDE 1

Onyx: A Prototype Phase-Change Memory Storage Array

Ameen Akel*

Adrian Caulfield, Todor Mollov, Rajesh Gupta, Steven Swanson

Non-Volatile Systems Laboratory, Department of Computer Science and Engineering University of California, San Diego

*Now at Micron Technology

SLIDE 2

4 KB Operation Request Latencies

[Figure: 4 KB request latency in us (log scale, 0.01 to 10000) for write and read operations; series: Disk, Flash, Current PCM, Projected PCM]

SLIDE 3

Advantages of Studying PCM SSDs

  • Understand current PCM performance

– With current storage infrastructure
– Versus other NV tech: e.g. Flash SSDs

  • PCM performance may differ from simulation

– Variance in write latency due to data
– Wear-out characteristics

  • Use real applications to gauge performance
  • Understand how software should change for PCM
  • Prepare to integrate future-generation PCM

SLIDE 4

Overview

  • Motivation
  • PCM Devices

– Technology Overview
– Micron P8P Devices

  • Onyx Architecture

– Logical Architecture
– PCM DIMMs
– Physical Architecture

  • Performance Analysis
  • Applications and Conclusions

SLIDE 5

PCM: The Device Level

  • PCM storage medium: chalcogenide

– Resistance depends on molecular phase

  • Writes

– Heaters are attached to the chalcogenide
– Current passed through heaters to change phase
– Allows bit-alterable writes

  • Reads

– Measure resistance through chalcogenide area
– Resistance sensed by ability to sink current

[Figure credit: M. Breitwisch et al., VLSI '07]

SLIDE 6

PCM: The Device Level

  • PCM storage medium: chalcogenide

– Resistance depends on molecular phase

  • Writes

– Heaters are attached to the chalcogenide
– Current passed through heaters to change phase
– Allows bit-alterable writes

  • Reads

– Measure resistance through chalcogenide area
– Resistance sensed by ability to sink current

[Figure: XRD measurements; panels labeled amorph, fcc, hexagonal. Credit: M. Wuttig et al., FP6 Project CAMELS]

SLIDE 7

PCM Write Operations in Depth

  • Material heated to…

– > 600 °C then cooled quickly → Amorphous
– ~ 350 °C then cooled slowly → Crystalline

  • Set and reset

– Reset: 0 state
– Set: 1 state

[Figure timing labels: 10 ns, 50-150 ns]

SLIDE 8

PCM Projections

  • Future PCM latency projections*:

Operation   Latency
Read        48 ns
Set         150 ns
Reset       40 ns

  • Process node progression: 90, 45, 32, 20, 9 nm

*B. C. Lee et al. Architecting Phase Change Memory as a Scalable DRAM Alternative. ISCA 2009.

SLIDE 9

P8P PCM

  • First-generation NOR-flash replacement
  • Part: NP8P128A13B1760E (P8P)
  • Process Node: 90 nm
  • Capacity: 16 MB
  • Per Device Bandwidth, Latency, Current

– Write (64 bytes): 0.5 MB/s, 120 us, 35 mA
– Read (16 bytes): 48.6 MB/s, 314 ns, 15 mA

  • Lifetime: One million writes until first bit error
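
The per-device bandwidth figures on this slide follow from dividing the access size by the access latency. A quick check is sketched below; it assumes that "MB/s" here means 2^20 bytes per second, which is an assumption that happens to reproduce the slide's numbers, and the helper name is hypothetical.

```python
# Hypothetical helper: bandwidth implied by one access of `size_bytes`
# that takes `latency_s` seconds, reported in units of 2**20 bytes/s.
def bandwidth_mib_per_s(size_bytes, latency_s):
    return size_bytes / latency_s / 2**20

write_bw = bandwidth_mib_per_s(64, 120e-6)   # 64-byte write in 120 us
read_bw = bandwidth_mib_per_s(16, 314e-9)    # 16-byte read in 314 ns

print(f"write: {write_bw:.1f} MB/s")   # write: 0.5 MB/s
print(f"read:  {read_bw:.1f} MB/s")    # read:  48.6 MB/s
```

The two results differ by roughly a factor of 100, which is the read/write asymmetry of a single P8P device.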

SLIDE 10

Overview

  • Motivation
  • PCM Devices

– Technology Overview
– Micron P8P Devices

  • Onyx Architecture

– Logical Architecture
– PCM DIMMs
– Physical Architecture

  • Performance Analysis
  • Applications and Conclusions

SLIDE 11

Moneta: SSD for Emulated Fast NVMs

  • DRAM-based NV-SSD emulator
  • Learn by building

– Hardware: controller & interconnect
– Software: driver, file system, apps

  • Uses optimized software stack

– Decreases request latency
– Improves request concurrency

[Figure: system diagram with labels CPU, DRAM (x8), Moneta, Moneta Driver, OS IO Stack, Application, File System, PCIe]

SLIDE 12

Onyx: Phase-Change Memory SSD

  • Based on Moneta*

– Shares hardware
– Shares software stack

  • PCM replaces DRAM

– Uses real PCM
– Custom PCM controller

[Figure: system diagram with labels CPU, DRAM (x2), PCM (x6), Onyx, Onyx Driver, OS IO Stack, Application, File System, PCIe]

*A. M. Caulfield et al. Moneta: A high-performance storage array architecture for next-generation, non-volatile memories. MICRO 2010.

SLIDE 13

Moneta/Onyx Architecture

[Figure: block diagram with labels 2GB PCM (x4), Ring (4 GB/s), Ring Control, Transfer Buffers, DMA Control, Scoreboard, Tag Status Registers, Request Queue, Host via PIO, Host via DMA]

SLIDE 14

Onyx PCM Controller

  • Request Completion

– Late Completion: on PCM write completion
– Early Completion: on request reception

  • Start-Gap Wear Leveling*

– Low overhead wear leveling (two registers + logic)
– Prevents hot spots from wearing out memory
– Rotates line in memory every gap interval

*M. K. Qureshi et al. Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling. MICRO 42.
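
A minimal sketch of the Start-Gap idea as described by Qureshi et al., not the Onyx controller's actual logic. N logical lines are stored in N+1 physical lines; a Start register and a Gap register define the mapping, and the spare "gap" line moves by one position every gap interval. The class name, parameter names, and the Python list standing in for the PCM array are assumptions made for illustration.

```python
class StartGap:
    """Toy model of Start-Gap remapping: N logical lines, N+1 physical lines."""

    def __init__(self, num_lines, gap_interval):
        self.n = num_lines                     # number of logical lines
        self.gap_interval = gap_interval       # writes between gap movements
        self.start = 0                         # Start register
        self.gap = num_lines                   # Gap register: index of the spare line
        self.lines = [None] * (num_lines + 1)  # stand-in for the PCM array
        self.writes_since_move = 0

    def translate(self, logical):
        """Map a logical line to its current physical line."""
        physical = (logical + self.start) % self.n
        if physical >= self.gap:               # lines at or past the gap sit one slot later
            physical += 1
        return physical

    def _move_gap(self):
        if self.gap == 0:
            # Wrap around: the last physical line moves into slot 0 and
            # Start advances, completing one rotation step for every line.
            self.lines[0] = self.lines[self.n]
            self.lines[self.n] = None
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            # Copy the line just before the gap into the gap.
            self.lines[self.gap] = self.lines[self.gap - 1]
            self.lines[self.gap - 1] = None
            self.gap -= 1

    def write(self, logical, value):
        self.lines[self.translate(logical)] = value
        self.writes_since_move += 1
        if self.writes_since_move >= self.gap_interval:
            self.writes_since_move = 0
            self._move_gap()

    def read(self, logical):
        return self.lines[self.translate(logical)]


sg = StartGap(num_lines=8, gap_interval=4)
for line in range(8):
    sg.write(line, f"data{line}")

touched = set()
for i in range(200):                           # hammer a single hot logical line
    touched.add(sg.translate(0))
    sg.write(0, f"hot{i}")

print(len(touched) > 1, sg.read(3))            # True data3
```

In this model the hot logical line lands on several physical lines over time while every other line stays readable at its logical address. The cost is one extra line copy per gap interval, so a larger interval means less overhead but slower rotation.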

SLIDE 15

Closer Look at a PCM DIMM

  • 8 Ranks of 5 PCM devices

– 64 data bits + 16 ECC bits
– Effectively 16 ranks per memory interface

  • Shared control and data lines
  • Capacity: 640 MB / DIMM

[Figure: rank wiring with five devices (labels: Device, Device 1, Device 2, Device 3, Device 4), data lanes Data[0:15], Data[16:31], Data[32:47], Data[48:63], Data[64:79], and shared Address[0:25]]

SLIDE 16

Prototyping Advanced SSDs

  • Built on RAMP’s BEE3 board

– Four FPGAs connected in a ring
– Four DIMM slots per FPGA
– PCIe 1.1 x8 host connection

  • System capacity: 10 GB
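
The capacity figures can be cross-checked from the per-device and per-DIMM numbers given on the earlier slides. The sketch below assumes that all 16 DIMM slots are populated and that the quoted capacities are raw, counting the devices that carry ECC bits; both are assumptions, and the constant names are made up for this example.

```python
DEVICE_MB = 16          # P8P capacity per device
DEVICE_BUS_BITS = 16    # each device drives a 16-bit slice (Data[0:15], ...)
DEVICES_PER_RANK = 5
RANKS_PER_DIMM = 8
FPGAS = 4
DIMM_SLOTS_PER_FPGA = 4

bus_bits = DEVICES_PER_RANK * DEVICE_BUS_BITS            # 64 data + 16 ECC
dimm_mb = RANKS_PER_DIMM * DEVICES_PER_RANK * DEVICE_MB  # raw MB per DIMM
system_gb = FPGAS * DIMM_SLOTS_PER_FPGA * dimm_mb / 1024

print(bus_bits, dimm_mb, system_gb)   # 80 640 10.0
```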

SLIDE 17

Overview

  • Motivation
  • PCM Devices

– Technology Overview
– Micron P8P Devices

  • Onyx Architecture

– Logical Architecture
– PCM DIMMs
– Physical Architecture

  • Performance Analysis
  • Applications and Conclusions

SLIDE 18

Read Performance

[Figure: read bandwidth (MB/s, axis to 2000) versus request size (0.5 KB to 1024 KB); series: Onyx, FusionIO, Moneta]

SLIDE 19

Write Performance

[Figure: write bandwidth (MB/s, axis to 2000) versus request size (0.5 KB to 1024 KB); series: Onyx-Late, Onyx-Early, FusionIO, Moneta]

SLIDE 20

BerkeleyDB Performance

[Figure: BerkeleyDB transactions per second (axis to 8000) for the BTree and HashTable benchmarks; series: Onyx, FusionIO, Moneta]

SLIDE 21

Potential PCM Applications

  • As a read cache

– First-gen PCM read speeds compete with flash
– Next-gen PCM should improve read performance

  • Replace DRAM in high-performance apps

– PCM cost will likely drop below DRAM
– Will scale aggressively past DRAM

  • Outpace flash in high-performance SSDs

– Reduces complexity of management
– Provides higher-rated lifetime
– Saves power, logic, and design time

SLIDE 22

Conclusions

  • Onyx designed to maximize PCM performance
  • More improvements possible as PCM scales

– Onyx architecture will scale with PCM
– Onyx will benefit from faster reads and writes

  • PCM simplifies SSD management relative to flash and improves small access performance

SLIDE 23

Thank You!

Questions?
