[PPT] - Coding for Unreliable Flash Memory Cells Ryan Gabrys, Lara Dolecek PowerPoint Presentation

SLIDE 1

Outline Background Empirical Data Data Analysis Error-Correction Model Performance Results Conclusion

Coding for Unreliable Flash Memory Cells

Ryan Gabrys, Lara Dolecek

Laboratory for Robust Information Systems (LORIS) Department of Electrical Engineering, UCLA

0 / 19

SLIDE 2

1. Background
2. Empirical Data
3. Data Analysis
4. Error-Correction Model
5. Performance Results
6. Conclusion

SLIDE 9

Flash Memory Basics

Flash memory consists of floating-gate cells. Information is stored by controlling the number of electrons held in each cell.

Density per cell:

Single-Level Cell (SLC): 1 bit per cell.
Multi-Level Cell (MLC): 2 bits per cell.
Triple-Level Cell (TLC): 3 bits per cell.

SLIDE 10

Mapping Information to Voltage Levels in TLC

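Each TLC cell stores three bits: the Most Significant Bit (MSB), the Center Significant Bit (CSB), and the Least Significant Bit (LSB).

[Figure: the eight 3-bit symbols assigned to the eight TLC voltage levels, with the low-voltage and high-voltage ends marked.]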

SLIDE 16

Data Collection

On the first of every 100 P/E cycles, the following steps were performed:

1. Erase the block (1 block = $2^{20}$ cells).
2. Read back the errors.
3. Write random data.
4. Read back the errors.

On the other 99 cycles, the block was erased and all zeros were written.
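
The schedule above can be sketched as a small helper. This is only an illustration: it assumes P/E cycles are counted from 1 and that "the first of every 100" means cycles 1, 101, 201, and so on. The function name and the operation labels are hypothetical, not taken from the authors' test setup.

```python
def cycle_plan(cycle, period=100):
    """Return the list of operations for a 1-indexed P/E cycle.

    Assumption: measurement cycles are 1, 1 + period, 1 + 2 * period, ...
    The operation names are illustrative labels only.
    """
    if cycle < 1:
        raise ValueError("cycles are counted from 1")
    if (cycle - 1) % period == 0:
        # Measurement cycle: errors are read after the erase and after the write.
        return ["erase", "read_errors", "write_random", "read_errors"]
    # Wear-only cycle: erase, then program all zeros.
    return ["erase", "write_zeros"]


# Number of measurement cycles in a 5000-cycle experiment under this assumption.
measurement_cycles = sum(
    1 for c in range(1, 5001) if cycle_plan(c)[-1] == "read_errors"
)
```

Under these assumptions, 1 in every 100 cycles produces error data, and the remaining cycles only wear the block.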

SLIDE 17

Raw Error Rate

[Figure: Error Rates for TLC Flash. Error rate (log scale, $10^{-5}$ to $10^{-2}$) versus P/E cycles (500 to 5000), with curves for the LSB, CSB, MSB, and symbol error rate.]

SLIDE 18

Observation 1: Error Patterns Within a Symbol

Number of bits in symbol that err    Fraction of errors
1                                    0.9617
2                                    0.0314
3                                    0.0069
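
As a quick arithmetic check on this table (a sketch, not part of the original analysis), the three values can be treated as a distribution over the number of erroneous bits in an erroneous symbol. The variable names are illustrative.

```python
# Values copied from the table: bits in error -> share of symbol errors.
share_by_bits = {1: 0.9617, 2: 0.0314, 3: 0.0069}

# The shares should cover all symbol errors.
total_share = sum(share_by_bits.values())

# Average number of erroneous bits in a symbol that is in error.
mean_bits_in_error = sum(bits * share for bits, share in share_by_bits.items())
```

The mean is about 1.05 bits per erroneous symbol, which is why a code that treats every symbol error as a full 3-bit error spends redundancy on patterns that rarely occur.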

SLIDE 24

Observation 2: Unreliable Flash Cells

We organized the Flash cells into two categories:

1. reliable Flash cells,
2. unreliable Flash cells.

An unreliable Flash cell is a cell that experienced ≥ 50 errors across the lifetime of the device. We identified 62,659 unreliable Flash cells among the 134,217,728 ($2^{27}$) cells tested, about 0.047% of the cells. These cells accounted for over 10% of the total errors measured.
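
The classification rule can be sketched as follows. The error-log format (one cell index per observed error) and the function name are assumptions made for illustration; only the threshold of 50 and the two cell counts come from the slide.

```python
from collections import Counter

UNRELIABLE_THRESHOLD = 50  # ">= 50 errors across the lifetime", from the slide


def unreliable_cells(error_log, threshold=UNRELIABLE_THRESHOLD):
    """Return the set of cell indices with at least `threshold` errors.

    Assumption: `error_log` is an iterable of cell indices, with one entry
    per observed error over the whole experiment.
    """
    counts = Counter(error_log)
    return {cell for cell, n in counts.items() if n >= threshold}


# Toy log: cell 7 errs 60 times, cell 3 errs 49 times, cell 9 errs 50 times.
toy_log = [7] * 60 + [3] * 49 + [9] * 50 + [1]
toy_unreliable = unreliable_cells(toy_log)

# Share of unreliable cells reported on the slide.
unreliable_fraction = 62659 / 134217728
```

With the reported counts, fewer than 5 in 10,000 cells are unreliable, yet they account for over 10% of the measured errors.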

SLIDE 25

Observation 3: Unreliable Flash Cells Have Prominent Error Patterns

Programmed state    Fraction of errors
000                 0.4745
000                 0.1630
010                 0.0711
000                 0.0676
001                 0.0558
001                 0.0525
011                 0.0186

[Figure: the eight 3-bit symbols assigned to the eight TLC voltage levels, with the low-voltage and high-voltage ends marked.]

SLIDE 28

Main Idea

Design an error-correction scheme that takes into account Observations 1, 2, and 3.

1. The code corrects error patterns in which most errors affect only a single bit within each Flash cell (Observation 1).
2. The code allows unreliable Flash cells to be programmed at low voltage levels (Observations 2 and 3).

SLIDE 32

Code Properties

Codes are over an alphabet of size $q = 2^m$, where $m$ is a positive integer and each symbol represents a Flash cell. A symbol is a binary length-$m$ vector. A codeword consists of $n$ binary length-$m$ vectors, so it is a binary vector of length $nm$.

Example over an alphabet of size 8: (4, 5, 7, 0, 2) → (100 101 111 000 010)
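
The symbol-to-bits view can be reproduced with a short sketch. The function name is illustrative, and the mapping assumed here is the plain binary expansion of each symbol, most significant bit first, which matches the example on the slide.

```python
def symbols_to_bits(symbols, m):
    """Expand each symbol of an alphabet of size 2**m into an m-bit string."""
    bit_strings = []
    for s in symbols:
        if not 0 <= s < 2 ** m:
            raise ValueError(f"symbol {s} is outside the alphabet of size {2 ** m}")
        # Zero-padded binary expansion, most significant bit first.
        bit_strings.append(format(s, f"0{m}b"))
    return bit_strings


# The slide's example: n = 5 symbols over an alphabet of size 8 (m = 3).
example_bits = symbols_to_bits([4, 5, 7, 0, 2], m=3)
```

Joining the five strings gives the length-15 binary vector, that is, $nm = 5 \cdot 3$ bits.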

SLIDE 36

Error Vectors

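Definition (Graded Bit-Error Vector). The length-$nm$ vector $e = (e_0, e_1, \ldots, e_{n-1})$, where each $m$-bit vector $e_i$ represents a symbol over an alphabet of size $2^m$, is a $[t_1, t_2; \ell_1, \ell_2]$-bit-error-vector if

1. $|\{i : e_i \neq 0\}| \le t_1 + t_2$,
2. $\forall i,\ \mathrm{wt}(e_i) \le \ell_2$,
3. $|\{i : \mathrm{wt}(e_i) > \ell_1\}| \le t_2$.

Here $\mathrm{wt}(e_i)$ denotes the Hamming weight of $e_i$.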
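
The three conditions can be checked mechanically. The sketch below reads condition 1 as a bound on the number of nonzero symbols of the error vector and takes wt as Hamming weight; representing each symbol as a Python integer is an illustrative choice, not something the slides prescribe.

```python
def is_graded_bit_error_vector(e, t1, t2, l1, l2):
    """Check the three conditions of a [t1, t2; l1, l2]-bit-error-vector.

    `e` is a sequence of symbols, each an integer whose binary expansion is
    the m-bit error pattern of one cell (an assumed representation).
    """
    weights = [bin(symbol).count("1") for symbol in e]
    symbols_in_error = sum(1 for w in weights if w > 0)   # condition 1
    max_weight = max(weights, default=0)                  # condition 2
    heavy_symbols = sum(1 for w in weights if w > l1)     # condition 3
    return (
        symbols_in_error <= t1 + t2
        and max_weight <= l2
        and heavy_symbols <= t2
    )


# Error pattern with three symbols in error, one of which has weight 2.
pattern = [0b101, 0b000, 0b010, 0b000, 0b000, 0b100]
fits = is_graded_bit_error_vector(pattern, t1=2, t2=1, l1=1, l2=2)
```

Lowering `t2` to 0 in this call makes the check fail, because one symbol has more than `l1 = 1` bits in error.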

SLIDE 42

Graded Bit-Error Example

Suppose the vector x of length 6 with 3-bit symbols shown below was transmitted.

x = (000 110 010 101 000 111)

The vector y was received:

y = (101 110 000 101 000 011)

The error vector is e = x ⊕ y = (101 000 010 000 000 100), so [2, 1; 1, 2]-bit-errors occurred:

There are 2 + 1 = 3 symbols in error.
At most 2 bits are in error in each symbol.
There is 1 symbol that has more than 1 bit in error.
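
The counts in this example can be reproduced directly from x and y. The sketch below only restates the slide's arithmetic; the variable names are illustrative.

```python
x = "000 110 010 101 000 111".split()
y = "101 110 000 101 000 011".split()

# Symbol-wise XOR of transmitted and received words gives the error vector.
e = [int(a, 2) ^ int(b, 2) for a, b in zip(x, y)]

# Hamming weight of each error symbol.
weights = [bin(symbol).count("1") for symbol in e]

symbols_in_error = sum(1 for w in weights if w > 0)  # bounded by t1 + t2
max_bits_per_symbol = max(weights)                   # bounded by l2
multi_bit_symbols = sum(1 for w in weights if w > 1) # bounded by t2 when l1 = 1
```

The three quantities correspond to $t_1 + t_2 = 3$, $\ell_2 = 2$, and $t_2 = 1$ with $\ell_1 = 1$, matching the $[2, 1; 1, 2]$ label.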

SLIDE 47

Dynamic Bit-Error-Correcting Codes

Let $C$ be a $[t_1, t_2; \ell_1, \ell_2]_{2^m}$ code. The encoder chooses a message $w \in \mathbb{Z}_M$ for some positive integer $M$. For any collection of indices $S$ where $|S| \le s_1$, there exists a codeword $c$ that can be used to represent $w$ such that:

For any $j \in S$, the voltage level of cell $c_j$ is at most $s_2$.

The code $C$ is referred to as a $[t_1, t_2; \ell_1, \ell_2; s_1, s_2]_{2^m}$ dynamic-bit-error-correcting code.
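
The "dynamic" requirement can be checked by brute force on a toy example. The sketch below covers only the voltage-level condition and ignores error correction entirely. It assumes each codeword entry is already a voltage-level index and that every message owns a list of candidate codewords; the codebook shown is hypothetical and is not a code from the presentation.

```python
from itertools import combinations


def has_dynamic_property(codebook, n, s1, s2):
    """Check that every message has, for every index set S with |S| <= s1,
    a codeword whose levels at the positions in S are all at most s2.

    `codebook` maps each message to a list of length-n tuples of
    voltage-level indices (an assumed representation).
    """
    for codewords in codebook.values():
        for size in range(s1 + 1):
            for S in combinations(range(n), size):
                if not any(all(c[j] <= s2 for j in S) for c in codewords):
                    return False
    return True


# Hypothetical toy codebook: 2 cells, 4 levels, two codewords per message.
toy_codebook = {
    0: [(0, 3), (3, 0)],
    1: [(1, 2), (2, 1)],
}
one_cell_ok = has_dynamic_property(toy_codebook, n=2, s1=1, s2=1)
two_cells_ok = has_dynamic_property(toy_codebook, n=2, s1=2, s2=1)
```

In this toy case, either single cell can be held at level 1 or below for any message, but both cells cannot be held low at the same time.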

SLIDE 53

Evaluation

We compared a dynamic graded-bit-error-correcting code against the following classes of codes:

1. A non-binary code over GF(8).
2. A binary BCH code applied to the MSB, CSB, and LSB in parallel.
3. Three different binary BCH codes, one applied to each of the MSB, CSB, and LSB.
4. A $[t_1, t_2; \ell_1, \ell_2]_{2^3}$-bit-error-correcting code.

For each graph, the codes compared have the same length and the same rate.

SLIDE 54

Results for block length 4096

[Figure: Error Rates of Codes Applied to TLC Flash. Page error rate (log scale, $10^{-6}$ to $10^{-1}$) versus P/E cycles (2800 to 4400). Annotation: 40% improvement.]

Codes compared:

Non-binary BCH $[4096, 3534, 80]_8$
Binary BCH $[4096, 3531, 47]_2$
Graded-bit-error-correcting code $[81, 7; 1, 3]$
Binary BCH (three codes) $[4096, 3351, 62]_2$, $[4096, 3339, 63]_2$, $[4096, 3915, 15]_2$
Dynamic graded-bit-error-correcting code $[82, 6; 1, 3; 2, 3]$
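
The claim that the compared codes have the same rate can be checked from the BCH parameters listed for block length 4096. The sketch assumes that the first two entries of each bracket are the code length and dimension, and that the three separate binary BCH codes together protect three equal-length bit planes, so their combined rate is the average of the three.

```python
n = 4096  # block length shared by all BCH codes in this comparison

# Assumption: brackets are read as [length, dimension, ...].
rate_nonbinary = 3534 / n
rate_binary_parallel = 3531 / n
rate_binary_three = (3351 + 3339 + 3915) / (3 * n)

rates = [rate_nonbinary, rate_binary_parallel, rate_binary_three]
rate_spread = max(rates) - min(rates)
```

Under this reading, all three rates are approximately 0.863 and differ by less than 0.001.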

SLIDE 55

Results for block length 8192

[Figure: Error Rates of Codes Applied to TLC Flash. Page error rate (log scale, $10^{-5}$ to $10^{-1}$) versus P/E cycles (2400 to 3800). Annotation: 39% improvement.]

Codes compared:

Non-binary BCH $[8192, 3622, 120]_8$
Binary BCH $[8192, 7242, 73]_2$
Graded-bit-error-correcting code $[120, 8; 1, 3]$
Binary BCH (three codes) $[8192, 6904, 99]_2$, $[8192, 6891, 100]_2$, $[8192, 7931, 20]_2$
Dynamic graded-bit-error-correcting code $[121, 7; 1, 3; 2, 3]$

SLIDE 56

Results for block length 16384

[Figure: Error Rates of Codes Applied to TLC Flash. Page error rate (log scale, $10^{-6}$ to $10^{-1}$) versus P/E cycles (2600 to 4000). Annotation: 39% improvement.]

Codes compared:

Non-binary BCH $[16384, 14593, 205]_8$
Binary BCH $[16384, 14591, 128]_2$
Graded-bit-error-correcting code $[242, 8; 1, 3]$
Binary BCH (three codes) $[16384, 13905, 177]_2$, $[16384, 13863, 180]_2$, $[16384, 15963, 30]_2$
Dynamic graded-bit-error-correcting code $[244, 6; 1, 3; 4, 3]$

SLIDE 59

Conclusion

Newer generations of Flash memory continue to demand more efficient error-correction schemes. Errors that occur within these newer Flash devices tend to follow certain patterns. By taking into account the dominant error patterns observed in a TLC Flash cell, we designed more efficient error-correction codes.

SLIDE 60

Thank You

New center on Coding for Storage at UCLA: http://www.loris.ee.ucla.edu/codess Kick-off day on 9/19/2013! Registration is free. Register early, space is limited.
