SLIDE 1

Romolo Marotta

Sapienza, University of Rome

An overview on solid-state-drives architectures and enterprise solutions

SLIDE 2

Why are SSDs so attractive?

Capacity vs Access Time

[Chart: capacity (MB, 10^-2 to 10^8) versus access time (ms, 10^-6 to 10^4). SRAM, DRAM and FLASH sit at low access times; HDD, magnetic tape and optical disk sit at high access times. A "mechanical gap" separates the two groups, and SSDs fill it.]

2 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 7

What is a flash memory cell?

  • Flash memory was invented by Dr. Fujio Masuoka in 1984
  • The name "Flash" was adopted because the process of erasing the memory contents reminded him of the flash of a camera
  • A flash memory cell is a Floating-Gate Metal-Oxide-Semiconductor Field-Effect Transistor (MOSFET)

[Diagram: cross-section of a floating-gate MOSFET — control gate, floating gate, SiO2 and ONO insulating layers, N+ source and drain regions in a P-well]

3 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 8

What is a flash memory cell? Anatomy

  • The N+ regions are silicon lattices with phosphorus impurities, creating an excess of electrons
  • The P- region is a silicon lattice with boron impurities, creating an absence of electrons
  • The floating gate is surrounded by insulating layers

[Diagram: the same floating-gate MOSFET cross-section, highlighting the SiO2 and ONO insulating layers that surround the floating gate]

4 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 9

How does a flash memory cell work?

  • The Source (S) and Drain (D) are disconnected, thus no current can flow between them
  • Applying a voltage between the Control Gate (CG) and the Body (B) creates a concentration of electrons between S and D
  • If the voltage is high enough (V1), it creates a channel between S and D which allows the current ID to flow between them

[Diagram: the cell under a gate voltage V1 — electrons accumulate beneath the gate and form an N-channel between source and drain]

5 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 12

How does a flash memory cell work? Program operation

  • Applying an appropriate voltage (VPROGRAM) to CG, electrons become trapped in FG
  • Those electrons remain in FG even when no voltage is applied to CG
  • We call this state "0"
  • In state 0, a voltage V0 > V1 is required in order to establish the N-channel

[Diagram: programming the cell — under VPROGRAM electrons tunnel into the floating gate and remain there after the voltage is removed]

6 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 17

How does a flash memory cell work? Erase operation

  • In state 0, electrons are stored in FG
  • A voltage (VERASE) is required in order to remove them from FG
  • At this point no charges are left on FG
  • We call this state "1"

[Diagram: erasing the cell — under VERASE the trapped electrons leave the floating gate]

7 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 21

How does a flash memory cell work? States of a Single-Level Cell

  • The two states allow a cell to store one bit

STATE 1:

  • No charges in FG
  • A VCG > V1 is required in order to set up the N-channel ⇒ ID > 0

STATE 0:

  • Charges are in FG
  • A VCG > V0 > V1 is required in order to set up the N-channel ⇒ ID > 0

Since the cell stores ONE bit, it is called a Single-Level Cell

[Diagram: ID vs VCG curves for the two states — the state-0 curve is shifted to the higher threshold voltage V0 > V1]

8 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 22

How does a flash memory cell work? How to write a cell?

  • Erase the cell
  • To write 1: nothing more is needed (the erased state already encodes 1)
  • To write 0: program the cell

[Diagram: erase with VERASE, then optionally program with VPROGRAM]

9 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 26

How does a flash memory cell work? How to read a cell?

  • Reading a cell consists in inferring the cell state
  • Apply an intermediate voltage V1 < VI < V0 to CG
  • Read the actual value ID* of the current ID
  • If ID* = 0, the bit value is 0
  • If ID* > 0, the bit value is 1

[Diagram: ID vs VCG curves for the two states — at VCG = VI only a cell in state 1 conducts]

10 of 47 - An overview on solid-state-drives architectures and enterprise solutions
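The read decision above can be sketched as a tiny simulation. The threshold values below are hypothetical, and a real chip senses current rather than comparing voltages in software; this only illustrates the logic.

```python
# Minimal sketch of an SLC read, assuming hypothetical threshold voltages.
# A programmed cell (state 0) has a raised threshold V0; an erased cell
# (state 1) keeps the lower threshold V1. Reading applies an intermediate
# voltage VI and checks whether drain current flows.

V1, V0 = 1.0, 4.0          # threshold voltages for state 1 / state 0 (volts)
VI = (V1 + V0) / 2         # intermediate read voltage, V1 < VI < V0

def drain_current(threshold: float, vcg: float) -> float:
    """The cell conducts only when the gate voltage exceeds its threshold."""
    return 1.0 if vcg > threshold else 0.0  # normalized current

def read_slc(programmed: bool) -> int:
    """Infer the stored bit: no current => 0, current flows => 1."""
    threshold = V0 if programmed else V1
    return 0 if drain_current(threshold, VI) == 0.0 else 1

print(read_slc(programmed=True))   # programmed cell stores 0
print(read_slc(programmed=False))  # erased cell stores 1
```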

SLIDE 31

Multi-Level Cell

  • If we partially charge FG, we need a lower threshold voltage for creating a channel
  • We can store 2 bits by using 1 programmed state, 2 partially programmed states and 1 erased state
  • A flash cell storing multiple bits is a Multi-Level Cell (MLC)
  • A Triple-Level Cell (TLC) stores 3 bits

[Diagram: the four charge levels of a 2-bit MLC — erased, two partially programmed, and fully programmed]

11 of 47 - An overview on solid-state-drives architectures and enterprise solutions
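The four charge levels can be distinguished with three read voltages placed between consecutive thresholds. A sketch, with hypothetical voltages and a level-to-bits mapping that is an assumption (real devices use vendor-specific encodings):

```python
# Sketch of a 2-bit MLC read, assuming four hypothetical threshold voltages
# (one per charge level) and three read voltages placed between them.
# Counting how many read voltages the cell's threshold exceeds pins down
# which of the four levels -- and hence which 2-bit value -- the cell holds.

# Threshold voltage per level: erased, partial-1, partial-2, programmed.
LEVEL_THRESHOLDS = [1.0, 2.5, 4.0, 5.5]
# Read voltages between consecutive levels.
READ_VOLTAGES = [1.75, 3.25, 4.75]
# Gray-style mapping from level to 2-bit value (illustrative assumption).
LEVEL_TO_BITS = ["11", "10", "00", "01"]

def read_mlc(level: int) -> str:
    """Infer the 2-bit value from the number of read voltages crossed."""
    threshold = LEVEL_THRESHOLDS[level]
    crossed = sum(1 for v in READ_VOLTAGES if threshold > v)
    return LEVEL_TO_BITS[crossed]

for lvl in range(4):
    print(lvl, read_mlc(lvl))
```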

SLIDE 32

Why does a flash cell deteriorate over time?

  • Writing a flash cell involves an erase and a program ⇒ electrons move out of / into FG
  • Electrons collide with and damage the insulating layer, creating traps ⇒ a Stress-Induced Leakage Current (SILC) can flow through these traps
  • Many traps can build a path from the body to FG ⇒ electrons can flow through that path ⇒ the cell can no longer be programmed ⇒ the flash cell is unusable

[Diagram: the < 10 nm SiO2 tunnel oxide between anode and cathode interfaces — electron traps accumulate in the damaged oxide, first enabling SILC and eventually forming a breakdown path]

12 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 35

Why does a flash cell deteriorate over time? P/E-Cycles

  • A flash cell can be programmed and erased a limited number of times before a breakdown ⇒ this number is called P/E-Cycles
  • Vendors design firmware capable of recomputing the voltage thresholds for read/write operations ⇒ enterprise-MLC (eMLC)

Typical P/E-cycle counts: SLC 100000, eMLC 30000, MLC 10000, TLC 5000

13 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 36

MLC vs SLC

SLC:

  • lower density
  • higher cost
  • faster write
  • faster read
  • higher endurance

MLC:

  • higher density
  • lower cost
  • erase time is similar to SLC
  • the level of charges in FG has to be set carefully ⇒ slower program ⇒ slower write
  • state is not simply 0/1 ⇒ slower read
  • eMLC has 3x shorter endurance
  • MLC has 10x shorter endurance
  • TLC has 20x shorter endurance

14 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 37

How are flash chips organized?

  • Flash Cell: 1 bit
  • Page: 16384 + 512 bits — the additional bits store ECC to recover from runtime read errors
  • Block: 64 pages = 128KB + 4KB — some bits are used to mark the block as faulty
  • Plane: 2048 blocks = 256MB + 8MB
  • Die: 2 planes = 512MB + 16MB
  • Chip: 4 dies = 2048MB + 64MB

15 of 47 - An overview on solid-state-drives architectures and enterprise solutions
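The capacities in the hierarchy follow from the per-page figures. A quick check of the slide's arithmetic:

```python
# Sketch: deriving the capacities above from the per-page figures.
# 16384 data bits + 512 spare bits per page, then the multipliers
# given on the slide (64 pages/block, 2048 blocks/plane, 2 planes/die,
# 4 dies/chip).

PAGE_DATA_BYTES = 16384 // 8    # 2 KB of user data per page
PAGE_SPARE_BYTES = 512 // 8     # 64 B of ECC/metadata per page

def capacity(pages: int) -> tuple:
    """(data bytes, spare bytes) for a structure holding `pages` pages."""
    return pages * PAGE_DATA_BYTES, pages * PAGE_SPARE_BYTES

block = capacity(64)                    # 128 KB + 4 KB
plane = capacity(64 * 2048)             # 256 MB + 8 MB
die   = capacity(64 * 2048 * 2)         # 512 MB + 16 MB
chip  = capacity(64 * 2048 * 2 * 4)     # 2048 MB + 64 MB

KB, MB = 1024, 1024 * 1024
print(block[0] // KB, block[1] // KB)   # 128 4
print(chip[0] // MB, chip[1] // MB)     # 2048 64
```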

SLIDE 45

How are flash cells organized?

  • Flash cells are connected to form an array called a string
  • According to the strategy used to connect multiple cells, we can distinguish at least two kinds of configuration:
  • NOR — flash cells are connected in parallel, resembling a NOR gate
  • NAND — flash cells are connected in series, resembling a NAND gate

[Diagram: two-input NOR and NAND gate schematics with inputs A, B, supply Vcc and output Vout]

16 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 46

How are flash cells organized?

  • Let F be the CG side length

NOR:

  • occupies area 10F²
  • reads/writes a single cell

NAND:

  • occupies area 4F²
  • reads/writes a single page
  • erases a single block

[Diagram: NOR architecture with per-cell bit-line contacts vs NAND architecture with cells in series between two select gates, organized in pages within a block]

17 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 47

NOR vs NAND

NOR:

  • fast random-byte read
  • slower page read
  • slower write
  • lower density

⇒ good for storing code

NAND:

  • no random-byte read
  • slow partial-page read, when supported
  • faster page read
  • faster page write
  • higher density

⇒ good for storage

We focus on NAND flash technology

18 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 48

How is a page written?

  • Write-in-place strategy:
  • 1. read the block
  • 2. erase the block
  • 3. program the block with the updated page
  • 1 page write = N page reads + 1 block erase + N page writes (N = number of pages in a block) ⇒ very slow writes
  • If we update the same page 40 times per second (every 25ms), the block wears out completely in:
  • SLC: PECycles / UpdateRate = 10^5 / (40/s) = 2500s ≈ 40m
  • MLC: 10^4 / (40/s) ≈ 4m
  • TLC: 5·10^3 / (40/s) ≈ 2m

ALERT! In our example the write rate is only 80KBps

19 of 47 - An overview on solid-state-drives architectures and enterprise solutions
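The wear-out times above can be reproduced directly, since under write-in-place every page update burns one P/E cycle of the whole block:

```python
# Sketch: block wear-out time under write-in-place, reproducing the
# slide's numbers. Updating one page rewrites (and erases) the whole
# block, so the block consumes one P/E cycle per page update.

PE_CYCLES = {"SLC": 100_000, "MLC": 10_000, "TLC": 5_000}
UPDATE_RATE = 40          # page updates per second (one every 25 ms)

def block_lifetime_s(cell: str) -> float:
    """Seconds until the block exhausts its P/E cycles."""
    return PE_CYCLES[cell] / UPDATE_RATE

for cell in PE_CYCLES:
    print(cell, round(block_lifetime_s(cell) / 60, 1), "minutes")
# SLC lasts ~42 minutes, MLC ~4 minutes, TLC ~2 minutes
```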

SLIDE 52

Write Amplification

  • Write amplification occurs when 1 user page write leads to multiple flash writes
  • Write amplification makes flash blocks deteriorate faster
  • Let F be the number of additional flash writes induced by U user writes ⇒ the write amplification A is:

A = (F + U) / U = 1 + F/U = 1 + Af

where Af is the write amplification factor

20 of 47 - An overview on solid-state-drives architectures and enterprise solutions
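The definition above in two lines, with illustrative numbers:

```python
# Sketch of the write-amplification definition:
# A = (F + U) / U = 1 + F/U = 1 + Af, where F counts the extra
# flash writes beyond the U user writes.

def write_amplification(user_writes: int, extra_flash_writes: int) -> float:
    return (extra_flash_writes + user_writes) / user_writes

# 100 user page writes that triggered 150 extra flash page writes:
print(write_amplification(100, 150))   # 2.5, i.e. Af = 1.5
```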

SLIDE 53

How is a page written? Relocation-on-write

  • Write-in-place is inadequate in terms of reliability and performance (Af ≈ number of pages in a block)
  • Updated pages are re-written to new locations
  • The logical address of the updated page is mapped to a different physical page
  • The previous physical page is invalidated

⇒ 1 user page write = 1 page read (obtain an empty page) + 2 page writes (update data + invalidate old page) ⇒ faster writes

[Diagram: the OS sees logical page IDs 1-9; the SSD maps them to physical pages. A user write to page 3 lands on a fresh physical page, and the old copy is marked invalid.]

21 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 58

Flash Translation Layer

  • Assigns logical addresses to pages
  • Stores the association between physical and logical addresses in a Translation Mapping Table
  • Stores the number of erase operations performed on physical pages in an Erase Count Table
  • Tables are:
  • maintained in SRAM (highly efficient) at runtime
  • stored on flash during shutdown to ensure durability
  • loaded at boot-up

22 of 47 - An overview on solid-state-drives architectures and enterprise solutions
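Relocation-on-write through a translation mapping table can be sketched with a dictionary. The class and its structure are illustrative, not any vendor's design:

```python
# Minimal sketch of an FTL doing relocation-on-write, assuming a
# dict-based translation mapping table and a simple free-page list.

class SimpleFTL:
    def __init__(self, n_physical_pages: int):
        self.mapping = {}                       # logical -> physical (TM table)
        self.free = list(range(n_physical_pages))
        self.invalid = set()                    # pages awaiting erase
        self.data = {}                          # physical page -> contents

    def write(self, logical: int, payload: bytes) -> int:
        """Relocate the logical page to a fresh physical page."""
        new_phys = self.free.pop(0)             # obtain an empty page
        old_phys = self.mapping.get(logical)
        if old_phys is not None:
            self.invalid.add(old_phys)          # invalidate the old copy
        self.mapping[logical] = new_phys        # update the TM table
        self.data[new_phys] = payload
        return new_phys

ftl = SimpleFTL(n_physical_pages=8)
ftl.write(3, b"v1")
ftl.write(3, b"v2")                             # update: relocated, old invalidated
print(ftl.mapping[3], sorted(ftl.invalid))      # 1 [0]
```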

SLIDE 59

Wear-Leveling

  • Free pages for relocation can be retrieved from the whole SSD
  • Wear-leveling guarantees that the number of P/E-Cycles is uniformly distributed among all blocks ⇒ wear-leveling extends the lifetime of each block and of the whole SSD
  • Thanks to wear-leveling, all blocks wear out at roughly the same time

[Plot: erase count per block ID for 32GB plus free blocks — the counts stay in a narrow band across all blocks]

23 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 60

Wear-Leveling

  • In order to guarantee that enough free pages are available for write relocation, wear-leveling needs:
  • Over-provisioning - keep free a percentage of the raw capacity
  • Garbage collection - gather invalid pages into the same block
  • DRAM buffers - keep valid pages in a buffer in order to write full blocks and reduce fragmentation
  • We can distinguish at least two kinds of wear-leveling algorithms:
  • Dynamic wear-leveling
  • Static wear-leveling

24 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 61

Wear-Leveling

Dynamic Algorithm

  • It is called dynamic because it is executed every time the OS replaces a block of data
  • A small percentage (e.g. 2%) of the raw capacity is reserved as a free-block pool
  • When the buffer is flushed, it chooses from the free pool the block with the minimum erase count
  • The replaced block is erased and added to the free pool

⇒ Only frequently-updated blocks are consumed

25 of 47 - An overview on solid-state-drives architectures and enterprise solutions
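The core step of the dynamic algorithm is a minimum-erase-count selection over the free pool. A sketch with illustrative data structures:

```python
# Sketch of the dynamic wear-leveling step: on each block replacement,
# pick the free block with the minimum erase count, then erase the
# replaced block and recycle it into the pool. Illustrative only.

def replace_block(free_pool: set, replaced: int, erase_counts: dict) -> int:
    """Return the least-worn free block; recycle the replaced one."""
    chosen = min(free_pool, key=lambda b: erase_counts[b])
    free_pool.remove(chosen)
    erase_counts[replaced] += 1          # erasing costs one P/E cycle
    free_pool.add(replaced)              # replaced block joins the pool
    return chosen

free = {10, 11, 12}
counts = {10: 5, 11: 2, 12: 7, 3: 1}
print(replace_block(free, 3, counts))    # 11 (lowest erase count in pool)
print(sorted(free))                      # [3, 10, 12]
```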

SLIDE 62

Wear-Leveling

Static Algorithm

  • Periodically scan the metadata of each block
  • Identify inactive data blocks with a lower erase count than the free blocks
  • Copy their content into free blocks and exchange them

⇒ this guarantees that static blocks participate in wear-leveling

26 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 63

Wear-Leveling

Impact on reliability

To a first approximation, wear-leveling eliminates the write amplification generated by the different sizes of the erase and write units ⇒ the block time to fault is:

BlockTTF ≈ (Ndie · Nplanes · Nblocks · N · PECycles) / PageWriteRate
BlockTTF ≈ (Ndie · Nplanes · Nblocks · N · PageSize · PECycles) / (PageWriteRate · PageSize)
BlockTTF ≈ (CapacitySSD · PECycles) / WriteRate

  • Blocks deteriorate uniformly, thus: BlockTTF ≈ SSDTTF

27 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 67

Wear-Leveling

Example

  • Take an SSD with capacity C and a write rate W
  • Depending on the flash cells used, we get different times to fault:
  • C = 4GB, W = 80KBps
  • SLC ⇒ SSDTTF = C·PECyclesSLC / W = 4GB·10^5 / 80KBps ≈ 158 years
  • MLC ⇒ SSDTTF = C·PECyclesMLC / W ≈ 15.8 years
  • TLC ⇒ SSDTTF = C·PECyclesTLC / W ≈ 7.9 years
  • C = 128GB, W = 4MBps
  • SLC ⇒ SSDTTF = C·PECyclesSLC / W = 128GB·10^5 / 4MBps ≈ 101 years
  • MLC ⇒ SSDTTF = C·PECyclesMLC / W ≈ 10 years
  • TLC ⇒ SSDTTF = C·PECyclesTLC / W ≈ 5 years

28 of 47 - An overview on solid-state-drives architectures and enterprise solutions
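The figures above can be reproduced with the formula SSDTTF ≈ C · PECycles / W, using decimal units (which is what the 158-year figure implies):

```python
# Sketch reproducing the SSDTTF examples: lifetime in years for a drive
# of a given capacity, write rate and cell technology.

PE_CYCLES = {"SLC": 100_000, "MLC": 10_000, "TLC": 5_000}
YEAR_S = 3600 * 24 * 365

def ssd_ttf_years(capacity_bytes: float, write_rate_bps: float, cell: str) -> float:
    return capacity_bytes * PE_CYCLES[cell] / write_rate_bps / YEAR_S

GB, KB, MB = 1e9, 1e3, 1e6
print(round(ssd_ttf_years(4 * GB, 80 * KB, "SLC"), 1))    # ~158.5 years
print(round(ssd_ttf_years(128 * GB, 4 * MB, "TLC"), 1))   # ~5.1 years
```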

SLIDE 70

Wear-Leveling

Impact on Reliability 2

  • As said before, wear-leveling makes flash blocks deteriorate uniformly. However:
  • Garbage collection increases the number of flash writes
  • Static wear-leveling increases the number of flash writes

⇒ they re-introduce a write amplification factor:

SSDTTF ≈ (CapacitySSD · PECycles) / ((1 + Af) · WriteRate)

29 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 71

SSD Architecture

[Diagram: the components of an SSD, built up step by step]

  • Durable storage: an array of NAND flash chips
  • Flash bus: connects the flash chips to the flash controller
  • Flash controller: read & write, wear-leveling, FTL
  • SRAM: holds the TM tables and EC tables, reached over the control bus
  • CPU: garbage collection, ECC error handling
  • Host interface: PATA, SATA, SCSI, etc.
  • DRAM buffer: write cache, connected over the data bus

30 of 47 - An overview on solid-state-drives architectures and enterprise solutions

SLIDE 79

Data Reduction

  • Reducing the amount of user data effectively stored in the flash chips reduces the write rate and increases the life of flash drives
  • Data reduction techniques are:
  • Compression
  • Deduplication

31 of 47 - An overview on solid-state-drives architectures and enterprise solutions

slide-80
SLIDE 80
  • Compression reduces the number of bits needed to store data
  • Lossless compression allows data to be restored to its original state
  • Lossy compression permanently eliminates bits of data that are redundant, unimportant or imperceptible
  • CompressionRatio = UncompressedSize / CompressedSize

⇒ Data reduction is DRc = 1 / CompressionRatio

SSD_TTF ≈ (Capacity_SSD · PECycles) / (WriteRate · (1 + Af) · DRc)
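As a minimal numerical illustration of the endurance formula above (the function name and all sample values below are invented for this sketch, not taken from the slides):

```python
def ssd_ttf(capacity_gb, pe_cycles, write_rate_gb_day, af, dr_c):
    """Approximate SSD time-to-failure (in days) following the slide's formula:
    SSD_TTF ~ (Capacity * PECycles) / (WriteRate * (1 + Af) * DRc)."""
    return (capacity_gb * pe_cycles) / (write_rate_gb_day * (1 + af) * dr_c)

# A 2:1 compression ratio gives DRc = 0.5 and doubles the expected lifetime.
no_compression = ssd_ttf(400, 3000, 100, 0.2, 1.0)    # 10000.0 days
with_compression = ssd_ttf(400, 3000, 100, 0.2, 0.5)  # 20000.0 days
```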

Data Reduction

Data Compression


slide-81
SLIDE 81
  • Deduplication looks for redundancy of sequences of bytes across very large comparison windows
  • Sequences of data are compared to the history of other such sequences
  • The first uniquely stored version of a sequence is referenced rather than stored again
  • Let DD be the average fraction of deduplicable data

SSD_TTF ≈ (Capacity_SSD · PECycles) / (WriteRate · (1 + Af) · DRc · (1 − DD))
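A hedged sketch of how DD could be measured on a block stream (the digest-set approach mirrors what a deduplicating controller does; the function name and sample blocks are illustrative):

```python
import hashlib

def dedup_fraction(blocks):
    """DD from the slide's formula: the fraction of incoming blocks whose
    content (identified by digest) has already been stored."""
    seen, duplicates = set(), 0
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            duplicates += 1
        else:
            seen.add(digest)
    return duplicates / len(blocks)

# 2 of the 5 blocks repeat earlier content, so DD = 0.4
dd = dedup_fraction([b"A", b"B", b"A", b"A", b"C"])
```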

Data Reduction

Data Deduplication


slide-82
SLIDE 82
  • RAID uses redundancy (e.g. a parity code) to increase reliability
  • Any RAID solution increases the amount of data physically written to disks (RAID overhead) ⇒ adopting a RAID solution with flash technology reduces the lifetime of the whole storage system by a factor at most equal to the RAID overhead

RAID Solutions on flash technology


slide-88
SLIDE 88
  • N flash disks of capacity C and cells supporting L P/E-cycles.
  • Write load rate equal to W.

RAID0:

  • stripes data
  • no fault tolerance

RAID10:

  • stripes data
  • replicates each disk
  • W is uniformly distributed on disks (thanks to striping)

⇒ TTL_RAID0 = (N·C·L) / W

⇒ TTL_RAID10 = (N·C·L) / (2W)

Alert!

In order to increase reliability we halve the time to live of flash cells
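The two time-to-live expressions above can be checked numerically (all parameter values below are made up for illustration):

```python
def ttl_raid0(n, c, l, w):
    # Striping spreads W uniformly, so the array wears out after N*C*L / W.
    return n * c * l / w

def ttl_raid10(n, c, l, w):
    # Mirroring writes every byte twice, doubling the effective write rate.
    return n * c * l / (2 * w)

# With any parameters, RAID10 halves the time to live relative to RAID0.
t0 = ttl_raid0(8, 400, 3000, 100)    # 96000.0
t10 = ttl_raid10(8, 400, 3000, 100)  # 48000.0
```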

RAID Solutions on flash technology

Example


slide-92
SLIDE 92

[Diagram: SSDs and the system workload combine into the SSD system model]

Modeling SSD endurance in a complex system


slide-93
SLIDE 93
  • We have learned that any redundancy reduces the maximum time to live of all SSDs

  • The answer is YES, but why?

Does it still make sense to use RAID?


slide-95
SLIDE 95

ALL THESE COMPONENTS MAY HAVE A FAULT

[SSD architecture diagram: CPU, Flash Controller, DRAM Buffer, Host Interface (PATA, SATA, SCSI, etc.), SRAM, Control Bus, Data Bus, and NAND Flash chips on the Flash Bus]

Does it still make sense to use RAID? YES! Why?


slide-96
SLIDE 96
  • The system building block is called XBrick:
  • 25 800GB eMLC SSDs
  • Two 1U Storage Controllers (redundant storage processors)
  • Scale-out is achieved by adding more XBricks (up to six in a rack), connected through InfiniBand ports

  • The system performs inline data reduction by:
  • deduplication
  • compression

EMC XtremIO


slide-97
SLIDE 97
  • The system checks for deduplicated data:
  • 1. subdivide the write stream into 4KB blocks
  • 2. for each block in the stream:
       2.1 compute a digest
       2.2 check in a shared mapping table the presence of the block
       2.3 if present, update a reference counter
       2.4 else use the digest to determine the location of the block and send the block to the respective controller node
  • The addressing of blocks should uniformly distribute the data on all nodes
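The steps above can be sketched as follows (a hypothetical illustration: the digest function, table layout and node-placement rule are assumptions, not XtremIO's actual implementation):

```python
import hashlib

BLOCK_SIZE = 4096   # 4KB blocks, as in step 1
mapping_table = {}  # digest -> [node, refcount]; stand-in for the shared table

def node_for(digest, n_nodes):
    # Digests are uniformly distributed, so using their bits for placement
    # spreads blocks evenly across controller nodes.
    return int.from_bytes(digest[:8], "big") % n_nodes

def write_stream(data, n_nodes=4):
    for off in range(0, len(data), BLOCK_SIZE):
        block = data[off:off + BLOCK_SIZE]
        digest = hashlib.sha256(block).digest()  # step 2.1
        entry = mapping_table.get(digest)        # step 2.2
        if entry is not None:
            entry[1] += 1                        # step 2.3: bump refcount only
        else:
            node = node_for(digest, n_nodes)     # step 2.4: place by digest
            mapping_table[digest] = [node, 1]
            # ...send the block to controller node `node` here
```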

EMC XtremIO

Deduplication


slide-98
SLIDE 98
  • The XtremIO system implements a proprietary data protection algorithm called XtremIO Data Protection (XDP)
  • Disks in a node are arranged in 23+2 columns
  • 1 row parity and 1 diagonal parity
  • Each stripe is subdivided into 28 rows and 29 diagonals

EMC XtremIO

XtremIO Data Protection


slide-99
SLIDE 99
  • In order to compute the diagonal parity efficiently and to spread writes across all disks, XDP waits to fill in memory the emptiest stripe
  • When the stripe is full, it is committed to disks
  • The emptiest-stripe selection implies that free space is linearly distributed across stripes
  • XDP can:
  • survive 2 concurrent failures (2 parities)
  • have a write overhead smaller than other RAID solutions

EMC XtremIO

XtremIO Data Protection


slide-100
SLIDE 100

Suppose a system that is 80% full:

  • The emptiest stripe is 40% empty (free space is linearly distributed across stripes, so the emptiest stripe has twice the average 20% free space)
  • A stripe can handle 28 · 23 = 644 writes
  • The emptiest stripe can handle 644 · 40% ≈ 257 writes

#parities = 28 (rows) + 29 (diagonals) = 57

RAIDoverhead = #writes / #userwrites = (257 + 57) / 257 ≈ 1.22
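The arithmetic above, as a small check (variable names are invented; the 40%-free figure follows from free space being linearly distributed across stripes):

```python
rows, cols = 28, 23
stripe_blocks = rows * cols       # 644 user writes fit in a full stripe
parities = rows + (rows + 1)      # 28 row + 29 diagonal parities = 57

system_fullness = 0.80
# Linearly distributed free space: the emptiest stripe has twice the
# average free space, i.e. 2 * 20% = 40% of its blocks free.
emptiest_free = 2 * (1 - system_fullness)
user_writes = int(stripe_blocks * emptiest_free)   # ~257

overhead = (user_writes + parities) / user_writes  # ~1.22
```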

EMC XtremIO

XtremIO Data Protection


slide-101
SLIDE 101
  • The system building block is made of:
  • One Storage Enclosure of 12 4TB eMLC SSDs
  • Two Control Enclosures (redundant storage processors) with 8-core Intel Xeon and 32GB of RAM

  • The system performs inline data reduction by:
  • compression with two dedicated hardware accelerators

IBM FlashDrive V840


slide-102
SLIDE 102
  • FlashDrive V840 offers two levels of RAID protection:
  • RAID5 in configuration 10 + 1 parity + 1 spare among disks
  • RAID5 in configuration 9 + 1 among chips in a disk
  • RAID overhead = 4

IBM FlashDrive V840

2D RAID


slide-103
SLIDE 103
  • One storage enclosure equipped with:
  • 2 controller nodes with 2 Intel eight-core processors and 32GB of RAM
  • 24 SSD drives
  • according to the type of flash cells, drive capacity is 1920GB (MLC), 400GB (eMLC), or 200GB (SLC)

  • The system performs inline data reduction by:
  • deduplication, which uses a hashing engine capability built into ASICs

HP 3PAR STORE 7450


slide-107
SLIDE 107

[Data protection diagram: 1st level — disks are divided into chunklets; 2nd level — chunklets are combined into logical disks, protected with RAID10 or RAID5; 3rd level — logical disks are exported as a virtual volume visible to hosts]

HP 3PAR STORE 7450

Data Protection


slide-108
SLIDE 108

More detailed info can be found in the main references:

  • http://www.csee.umbc.edu/~squire/images/ssd1.pdf
  • XtremIO, FlashDrive v840, HP 3PAR white papers

If you want to play, there is an interesting tool by Intel:

  • http://estimator.intel.com/ssdendurance

Additional Material


slide-109
SLIDE 109

Quick look to the (not-far) future


slide-110
SLIDE 110

Questions?

marotta@diag.uniroma1.it www.dis.uniroma1.it/~marotta

Thanks for your attention
