

slide-1
SLIDE 1

An Efficient Wear-level Architecture using Self-adaptive Wear Leveling

Jianming Huang, Yu Hua, Pengfei Zuo, Wen Zhou, Fangting Huang Huazhong University of Science and Technology

ICPP 2020

slide-2
SLIDE 2

Non-volatile Memory

2

Intel Optane DC Persistent Memory

  • NVM features
    ‒ Non-volatility
    ‒ Large capacity
    ‒ Byte-addressability
    ‒ DRAM-scale latency

  • NVM drawbacks
    ‒ Limited endurance
    ‒ High write energy consumption

slide-3
SLIDE 3

Multi-level Cell NVM

3

Compared with SLC NVM, MLC NVM has:

  • Higher storage density
  • Lower cost
  • Comparable read latency
  • Weaker endurance

The MLC technique has been used in different kinds of NVM, including PCM, RRAM, and STT-RAM.

[Figure: threshold-voltage (Vth) distributions — a single-level cell (SLC) stores 1 bit (states 0/1), while a multi-level cell (MLC) stores 2 bits (states 00/01/10/11)]

Endurance: ~10^7 writes for SLC PCM vs ~10^5 writes for MLC PCM

Wear-leveling schemes are necessary and important

slide-4
SLIDE 4

Wear-leveling Schemes

4

  • Table based wear-leveling scheme (TBWL)
  • Algebraic based wear-leveling scheme (AWL)
  • Hybrid wear-leveling scheme (HWL)
slide-5
SLIDE 5

Wear-leveling Schemes

5

  • Table based wear-leveling scheme (TBWL)

Segment swapping triggers wear-leveling

[Figure: segment swapping example — a table maps each logical address (LA) to a physical address (PA) and tracks a per-line write count (WC); when a hot line's WC crosses the threshold, its data are swapped with a cold line and the LA→PA mappings and WCs are updated]
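The table-based swap logic can be sketched as follows; the class name, the modulo-threshold trigger, and the choice of the least-written line as the swap partner are illustrative assumptions, not the exact TBWL design:

```python
class TBWL:
    """Sketch of table-based wear leveling via segment swapping."""

    def __init__(self, num_segments, swap_threshold):
        self.l2p = list(range(num_segments))   # logical -> physical mapping table
        self.writes = [0] * num_segments       # write count per physical segment
        self.threshold = swap_threshold

    def write(self, logical_seg):
        """Record one write; swap the segment when its count hits the threshold."""
        phys = self.l2p[logical_seg]
        self.writes[phys] += 1
        if self.writes[phys] % self.threshold == 0:
            self._swap(logical_seg)
        return phys

    def _swap(self, hot_logical):
        # Exchange the hot segment with the least-written (coldest) one,
        # then fix up both logical -> physical entries.
        cold_phys = min(range(len(self.writes)), key=self.writes.__getitem__)
        hot_phys = self.l2p[hot_logical]
        cold_logical = self.l2p.index(cold_phys)
        self.l2p[hot_logical], self.l2p[cold_logical] = cold_phys, hot_phys
```

Because every translation goes through the table, the table itself must be cached; this is the cost that the later slides contrast with algebraic schemes.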

slide-6
SLIDE 6

Wear-leveling Schemes

6

  • Table based wear-leveling scheme (TBWL)
  • Algebraic based wear-leveling scheme (AWL)

region-based Start-Gap (RBSG)

[Figure: Start-Gap example — four data lines A–D plus one gap line; in each step (Steps 1–4) the line adjacent to the gap is copied into the gap (A B C D → A B C E D → A B E C D → A E B C D → E A B C D in the final state), and the Start line and Gap line registers define the algebraic LA→PA mapping]
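The rotation that RBSG applies inside each region follows the classic Start-Gap scheme; a minimal simulation (the class layout below is an assumption for illustration) shows how the mapping stays purely algebraic:

```python
class StartGap:
    """N logical lines stored in N+1 physical slots; one slot is the gap."""

    def __init__(self, n):
        self.n = n
        self.slots = list(range(n)) + [None]   # physical memory; None = gap line
        self.gap = n                           # physical index of the gap line
        self.start = 0                         # advances once per full rotation

    def move_gap(self):
        """Advance wear leveling by one step (done every few writes)."""
        if self.gap == 0:
            # Gap wraps back to the top; one full rotation completed.
            self.slots[0] = self.slots[self.n]
            self.slots[self.n] = None
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            # Copy the line just below the gap into the gap.
            self.slots[self.gap] = self.slots[self.gap - 1]
            self.slots[self.gap - 1] = None
            self.gap -= 1

    def translate(self, la):
        """Algebraic logical-to-physical mapping: no table lookup needed."""
        pa = (la + self.start) % self.n
        if pa >= self.gap:
            pa += 1                            # skip over the gap line
        return pa
```

The appeal of AWL is visible here: `translate` needs only the `start` and `gap` registers, so no mapping table has to be cached.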

slide-7
SLIDE 7

Wear-leveling Schemes

7

  • Table based wear-leveling scheme (TBWL)
  • Algebraic based wear-leveling scheme (AWL)
  • Hybrid wear-leveling scheme (HWL)

PCM-S

[Figure: PCM-S example — Step 1: read an NVM line into SRAM in the memory controller; Step 2: shift its contents by a pseudo-random amount (prn0, prn2); Step 3: write the shifted line (e.g. A B C D → D C B A, E F G H → H G F E) back to NVM]

slide-8
SLIDE 8

RAA Attack for TBWL

8

  • Repeated Address Attack (RAA)
    ‒ Repeatedly write data to the same address
  • The lifetime of TBWL under an RAA attack
    ‒ One region contains a limited number of lines
    ‒ All lines in one region are repeatedly written
    ‒ NVM is worn out at an early stage
    ‒ Lifetime ≈ (number of lines within a region) × (endurance of a line)
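Plugging illustrative numbers into this bound shows how little of the ideal lifetime survives the attack; the capacity, region size, and endurance values below are assumptions for the sketch, not figures from the paper:

```python
# Back-of-envelope lifetime of TBWL under a Repeated Address Attack.
# All concrete values are illustrative assumptions.
lines_per_region = 64        # lines that TBWL swaps among within one region
cell_endurance = 10**5       # writes per line (MLC PCM scale)
total_lines = 2**26          # e.g. 4 GB of NVM with 64 B lines

# The attacker pins all writes inside a single region, so the first
# region (and hence the device) fails after only:
attack_writes = lines_per_region * cell_endurance

# Perfect wear leveling would spread the same writes over every line:
ideal_writes = total_lines * cell_endurance

lifetime_fraction = attack_writes / ideal_writes
print(f"lifetime under RAA: {lifetime_fraction:.8%} of ideal")
```

With these numbers the attacker reaps a device failure after only 64 × 10^5 writes, a tiny fraction of the ideal lifetime.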

slide-9
SLIDE 9

BPA Attack for AWL

9

  • Birthday Paradox Attack (BPA)
    ‒ Randomly select logical addresses and repeatedly write to each
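The birthday-paradox intuition behind the attack can be made concrete with the standard collision approximation (the region count below is an assumption for illustration): with R regions, uniformly random logical addresses start landing in the same region after only about √R picks.

```python
import math

regions = 2**16  # illustrative number of physical regions

def collision_probability(picks, bins):
    """P(at least two of `picks` uniform choices share a bin),
    using the standard 1 - exp(-k(k-1)/2n) approximation."""
    return 1 - math.exp(-picks * (picks - 1) / (2 * bins))

# Around sqrt(2^16) = 256 picks, a collision is already likely;
# with only a handful of picks it is very unlikely.
print(collision_probability(300, regions))
print(collision_probability(10, regions))
```

Once two attacked addresses share a region, their writes concentrate on the same small set of physical lines, which is what defeats coarse-grained AWL.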
slide-10
SLIDE 10

BPA Attack for AWL

10

  • The lifetime of AWL under a BPA attack
    ‒ Lifetime is low
slide-11
SLIDE 11

BPA Attack for HWL

11

  • The lifetime of HWL under a BPA attack
    ‒ Smaller wear-leveling granularity increases the NVM lifetime
slide-12
SLIDE 12

Problems of Existing Wear-leveling Schemes

12

  • TBWL and AWL fail to defend against attacks
    ‒ An RAA attack leads to low lifetime in TBWL
    ‒ A BPA attack leads to low lifetime in AWL
slide-13
SLIDE 13

Problems of Existing Wear-leveling Schemes

13

  • HWL obtains high lifetime with small granularity
    ‒ Large granularity leads to low lifetime
    ‒ Small granularity leads to high lifetime
  • TBWL and AWL fail to defend against attacks
slide-14
SLIDE 14

Problems of Existing Wear-leveling Schemes

14

  • HWL obtains high lifetime with small granularity
  • The cache hit ratio of HWL is affected by the granularity
    ‒ The wear-leveling entries stored in the cache are limited
    ‒ Entries with large granularity cover a large NVM area → high cache hit ratio
    ‒ Entries with small granularity cover a small NVM area → low cache hit ratio
  • TBWL and AWL fail to defend against attacks
slide-15
SLIDE 15

Problems of Existing Wear-leveling Schemes

15

  • HWL obtains high lifetime with small granularity
  • The cache hit ratio of HWL is affected by the granularity
  • TBWL and AWL fail to defend against attacks

Addressing the conflict between lifetime and performance is important

  • High performance and high lifetime are in conflict
slide-16
SLIDE 16

SAWL

16

SAWL: self-adaptive wear-leveling scheme

  • High hit ratio & unbalanced write distribution → split regions to decrease the granularity
  • Low hit ratio & uniform write distribution → merge regions to increase the granularity
  • Goal: achieve both high lifetime and high performance

slide-17
SLIDE 17

Architecture of SAWL

17

  • IMT (translation lines): record the locations in which the user data are actually stored

wear-leveling the data lines

[Figure: SAWL architecture — the memory controller (SRAM) holds the Cached Mapping Table (CMT, entries of the form lrn, wlg, prn, key), the Global Translation Directory (GTD, mapping tlma → tpma), and the Address Translation, Data Exchange, and Region Split/Merge logic; the NVM holds the data lines and the translation lines of the Integrated Mapping Table (IMT, entries of the form prn, key), both organized into regions of lines and kept in sync with the CMT]

slide-18
SLIDE 18

Architecture of SAWL

18

  • CMT: buffer the recently used IMT entries

reduce translation latency

[Figure: SAWL architecture — same diagram as the previous slide]

slide-19
SLIDE 19

[Figure: SAWL architecture — same diagram as the previous slides]

Architecture of SAWL

19

  • GTD: record the relationships between logical/physical addresses of translation lines

wear-leveling the translation lines

slide-20
SLIDE 20

Operations in SAWL

20

  • Merge the region
    ‒ 1. Choose two unmerged neighboring logical regions.
    ‒ 2. Physically exchange the data to satisfy the algebraic mapping between the logical and physical regions.
    ‒ 3. Update the relevant CMT entries in the SRAM and the IMT table in the NVM.

[Figure: merge example — two neighboring logical regions holding A–F are combined; data are physically exchanged (e.g. A B C D E F → A B F E D C in the physical region) and the corresponding CMT entries (lrn, wlg, prn, key) and IMT entries (lrn, prn, key) are updated]

slide-21
SLIDE 21

Operations in SAWL

21

  • Merge the region
  • Split the region
    ‒ 1. Logically split the region without moving data. The data already satisfy the algebraic mapping.
    ‒ 2. Update the CMT entries and the IMT table.

[Figure: split example — a logical region holding A–D is split in two; the physical layout (e.g. D C B A) is unchanged, and only the CMT entries (lrn, wlg, prn, key) and IMT entries (lrn, prn, key) are updated]

slide-22
SLIDE 22

When to Adjust the Region

22

  • We use the hit ratio as the trigger to merge/split regions
    ‒ Hit ratio ≥ 95% → split the regions
    ‒ Hit ratio ≤ 90% → merge the regions
  • A hit ratio below 90% significantly decreases performance
  • A hit ratio above 95% only slightly impacts performance
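This trigger is a simple hysteresis rule on the measured hit ratio; a sketch follows, where the 95%/90% thresholds come from the slide and the region-size bounds and halving/doubling policy are illustrative assumptions:

```python
def adjust_region(hit_ratio, region_lines, min_lines=1, max_lines=64):
    """Hysteresis trigger sketch for SAWL-style region adjustment."""
    if hit_ratio >= 0.95 and region_lines > min_lines:
        return region_lines // 2   # split: finer granularity, longer lifetime
    if hit_ratio <= 0.90 and region_lines < max_lines:
        return region_lines * 2    # merge: coarser granularity, better hit ratio
    return region_lines            # 90%-95% band: leave the regions alone
```

The 90%–95% dead band prevents the region size from oscillating when the hit ratio hovers near a single threshold.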
slide-23
SLIDE 23

Parameter in SAWL

23

  • SOW: size of the observation window
  • Small SOW -> cache hit rate frequently fluctuates
slide-24
SLIDE 24

Parameter in SAWL

24

  • SOW: size of the observation window
  • Small SOW -> cache hit rate frequently fluctuates
  • Large SOW -> the system misses important trigger points
slide-25
SLIDE 25

Parameter in SAWL

25

  • SOW: size of the observation window
  • Small SOW -> cache hit rate frequently fluctuates
  • Large SOW -> the system misses important trigger points
  • We use 2^22 as the SOW value
slide-26
SLIDE 26

Parameter in SAWL

26

  • SSW: size of the settling window
  • Small SSW -> frequently adjust region size
slide-27
SLIDE 27

Parameter in SAWL

27

  • SSW: size of the settling window
  • Small SSW -> frequently adjust region size
  • Large SSW -> fail to sufficiently adjust the region size
slide-28
SLIDE 28

Parameter in SAWL

28

  • SSW: size of the settling window
  • Small SSW -> frequently adjust region size
  • Large SSW -> fail to sufficiently adjust the region size
  • We use 2^22 as the SSW value
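Putting SOW and SSW together, the sampling-and-settling behavior can be sketched as below; the class and method names are assumptions, and the window sizes default to the 2^22 accesses used in the slides:

```python
class WindowedTrigger:
    """Sketch: the cache hit ratio is sampled once per observation window
    (SOW) of accesses, and after a region adjustment the trigger pauses
    for one settling window (SSW) so the new granularity can take effect."""

    def __init__(self, sow=2**22, ssw=2**22):
        self.sow, self.ssw = sow, ssw
        self.accesses = 0     # accesses in the current observation window
        self.hits = 0         # cache hits in the current observation window
        self.settling = 0     # accesses remaining in the settling window

    def record(self, hit):
        """Record one access; return 'split', 'merge', or None."""
        self.accesses += 1
        self.hits += bool(hit)
        if self.settling > 0:
            self.settling -= 1
        if self.accesses < self.sow:
            return None
        ratio = self.hits / self.accesses
        self.accesses = self.hits = 0          # start a new observation window
        if self.settling == 0:
            if ratio >= 0.95:
                self.settling = self.ssw
                return "split"                 # high hit ratio: shrink regions
            if ratio <= 0.90:
                self.settling = self.ssw
                return "merge"                 # low hit ratio: grow regions
        return None
```

A small SOW makes `ratio` noisy (frequent fluctuation), while a large SOW delays the decision past the useful trigger point, matching the trade-off on the previous slides.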
slide-29
SLIDE 29

Experimental Setup

29

  • Configuration of the simulated system via gem5
  • Comparisons
    ‒ Baseline: an NVM system without a wear-leveling scheme
    ‒ NWL-4: naive wear-leveling scheme with a region consisting of 4 memory lines
    ‒ NWL-64: naive wear-leveling scheme with a region consisting of 64 memory lines
    ‒ AWL schemes: RBSG and TLSR
    ‒ HWL schemes: PCM-S and MWSR
  • Benchmark: SPEC CPU2006
slide-30
SLIDE 30

Cache Hit Rate

30

  • SAWL has high cache hit rate, and adjusts the region size according to the hit rate.
slide-31
SLIDE 31

Lifetime under BPA Program

31

  • Smaller swapping period increases the NVM lifetime at the cost of high write overhead.
  • SAWL efficiently defends against the BPA attack.

[Figures: lifetime results with 10^6 and 10^5 cell endurance]

slide-32
SLIDE 32

Lifetime under Applications

32

  • SAWL achieves high lifetime in all applications.
slide-33
SLIDE 33

Conclusion

33

  • Existing wear-leveling schemes fail to work efficiently on MLC NVM
  • SAWL dynamically tunes the wear-leveling granularity
    ‒ Low cache hit ratio with uniform write distribution leads to merging regions
    ‒ High cache hit ratio with unbalanced write distribution leads to splitting regions
  • SAWL achieves high lifetime under attacks and general applications with low performance overhead

slide-34
SLIDE 34

Thanks! Q&A

Email: jmhuang@hust.edu.cn