
Ouroboros Wear-leveling: A Two-level Hierarchical Wear-leveling - PowerPoint PPT Presentation

Ouroboros Wear-leveling: A Two-level Hierarchical Wear-leveling Model for NVRAM. Qingyue Liu, Peter Varman, ECE Department, Rice University. May 18, 2017.


  1. Ouroboros Wear-leveling: A Two-level Hierarchical Wear-leveling Model for NVRAM Qingyue Liu Peter Varman ECE Department, Rice University May 18, 2017

  2. New Challenges for New Technologies (RRAM, PCM, 3D XPoint) • Advantages: – High density: easy to scale down under 10 nm – Non-volatile – In-place update – Low leakage power • Major drawback: – Lifetime endurance problem – PCM: 10^7 to 10^8 writes per cell – In practice, lifetime is around 20x shorter without wear-leveling

  3. Wear-leveling (WL) • A technique for prolonging the service life of some kinds of erasable computer storage media • Block migration across the memory according to certain rules – Move high-usage blocks to low-usage frames • Aim: distribute writes evenly across the memory [Figure: frames 0 to 7 holding blocks A to H, each with a write count, and a write to block A]

  4. SSD WL vs. NVRAM WL • Solid State Disk (SSD) – Written out-of-place – Granularity: ➢ Read/write: page ➢ Erase: block – Requires garbage collection • NVRAM – In-place writing – Granularity: ➢ Read/write: byte ➢ No erase – No garbage collection • NVRAM has more freedom and can do better – No complex design for garbage collection – Fine-grained wear-leveling – Allows both algebraic and fully-associative logical-to-physical mappings

  5. Outline • Background • Previous Work • Our Contributions – Hierarchical Ouroboros Wear-leveling – System Design • Architecture • Parameter selection – Experiments and Results • Conclusion

  6. Previous Work: NVRAM • Wear-leveling using restricted algebraic mappings – No address mapping table – Granularity: memory line (cache line) – Example: Start-Gap Wear-leveling [1] • Wear-leveling using fully-associative mappings – Additional address mapping table needed – Granularity: block – Example: Segment Swapping [2], PCM-aware swap [3] [1] Qureshi et al., "Enhancing lifetime and security of PCM-based main memory with start-gap wear-leveling," MICRO, 2009. [2] Zhou et al., "A durable and energy efficient main memory using phase change memory technology," ISCA, 2009. [3] A. P. Ferreira, M. Zhou, S. Bock, B. Childers, R. Melhem, and D. Mossé, "Increasing PCM main memory lifetime," in Proceedings of the Conference on Design, Automation and Test in Europe. European Design and Automation Association, 2010, pp. 914-919.

  7. Start-Gap Method Analysis • Advantages: – Distributes writes smoothly within the frame – Small space overhead – Simple algorithm • Disadvantages: – Region size is limited since only 1 line is relocated at a time – May not use all of the region to distribute the writes [Figure: memory lines in a frame with the Start and Gap pointers and the GapLine position after successive writes]
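The Start-Gap rule analysed on this slide can be sketched as follows. This is a minimal sketch based on the published description of Start-Gap [1], not code from the presentation: the class name, the `psi` parameter (writes between gap movements), and the wear counters are illustrative assumptions.

```python
class StartGap:
    """Sketch of Start-Gap: n logical lines stored in n + 1 physical lines."""

    def __init__(self, n_lines, psi=100):
        self.n = n_lines
        self.psi = psi                    # assumed: gap moves once every psi writes
        self.start = 0                    # Start register
        self.gap = n_lines                # Gap register: index of the empty line
        self.mem = [None] * (n_lines + 1)
        self.wear = [0] * (n_lines + 1)   # per-line write counts (for illustration)
        self.writes = 0

    def translate(self, la):
        """Algebraic logical-to-physical mapping; no mapping table is needed."""
        pa = (la + self.start) % self.n
        return pa + 1 if pa >= self.gap else pa

    def _move_gap(self):
        """Relocate exactly one line into the gap (one extra write)."""
        if self.gap == 0:
            # Wrap around: the last physical line moves to line 0.
            self.mem[0] = self.mem[self.n]
            self.mem[self.n] = None
            self.wear[0] += 1
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            self.mem[self.gap] = self.mem[self.gap - 1]
            self.mem[self.gap - 1] = None
            self.wear[self.gap] += 1
            self.gap -= 1

    def write(self, la, value):
        pa = self.translate(la)
        self.mem[pa] = value
        self.wear[pa] += 1
        self.writes += 1
        if self.writes % self.psi == 0:
            self._move_gap()

    def read(self, la):
        return self.mem[self.translate(la)]


if __name__ == "__main__":
    sg = StartGap(n_lines=8, psi=3)
    for la in range(8):
        sg.write(la, la * 10)
    for _ in range(500):                  # hammer a single logical line
        sg.write(3, "hot")
    print(sg.read(3), sg.wear)
```

Because only one line moves per gap step, a hot line drifts through the region slowly, which is the region-size limitation listed under the disadvantages.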

  8. Previous Work: NVRAM • Wear-leveling using restricted algebraic mappings – No address mapping table – Granularity: memory line (cache line) – Example: Start-Gap Wear-leveling [1] • Wear-leveling using fully-associative mappings – Additional address mapping table needed – Granularity: block – Example: Segment Swapping [2], PCM-aware swap [3]

  9. Segment Swap vs. PCM-aware Swap • Segment Swap: – Periodically swap the content of the highest-usage frame with the content of the lowest-usage frame • PCM-aware Swap: – Periodically swap the content of the highest-usage frame with the content of a random frame • Advantages: – Can involve all memory space in wear-leveling – Can be implemented easily [Figure: frames 0 to 7 holding blocks A to H, with blocks A and G swapped]

  10. Analysis of 2 Swap Methods: A* Pattern • A* Pattern: write to the same logical block A continuously • Deterministic swap is better than randomized swap under correct conditions [Figure panels: Without Wear-leveling, Segment Swap, PCM-aware Swap]
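A toy simulation of the two swap policies under the A* pattern might look like the sketch below. It is illustrative only: the function name, the swap period, the frame count, and the tie-breaking rule (lowest frame index wins) are assumptions, and the cost of the swap writes themselves is ignored.

```python
import random


def simulate(policy, pattern, n_frames=8, n_writes=8000, period=100, seed=0):
    """Return per-frame write counts after replaying `pattern` cyclically.

    policy: "none", "segment" (hottest frame <-> coldest frame), or
            "random" (hottest frame <-> random frame, PCM-aware style).
    """
    rng = random.Random(seed)
    frame_of = list(range(n_frames))      # frame_of[block] = frame holding it
    block_in = list(range(n_frames))      # block_in[frame] = block stored there
    usage = [0] * n_frames
    for t in range(n_writes):
        block = pattern[t % len(pattern)]
        usage[frame_of[block]] += 1
        if policy != "none" and (t + 1) % period == 0:
            hot = max(range(n_frames), key=usage.__getitem__)
            if policy == "segment":
                other = min(range(n_frames), key=usage.__getitem__)
            else:
                other = rng.randrange(n_frames)
            if other != hot:
                # Exchange the contents of the two frames.
                b_hot, b_other = block_in[hot], block_in[other]
                block_in[hot], block_in[other] = b_other, b_hot
                frame_of[b_hot], frame_of[b_other] = other, hot
    return usage


if __name__ == "__main__":
    a_star = [0]                          # A*: always write logical block 0
    for policy in ("none", "segment", "random"):
        usage = simulate(policy, a_star)
        print(f"{policy:8s} max wear = {max(usage)}")
```

Only the A* pattern is replayed here; other patterns can be passed as a list of block indices.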

  11. Analysis of 2 Swap Methods: AB* Pattern • AB* Pattern: alternate writes to two logical blocks A and B (catastrophic pattern for Segment Swap) • Randomized swap is better than deterministic swap in bad cases [Figure panels: Without Wear-leveling, Segment Swap, PCM-aware Swap]

  12. Outline • Background • Previous Work • Our Contributions – Hierarchical Ouroboros Wear-leveling – System Design • Architecture • Parameter selection – Experiments and Results • Conclusion

  13. NVRAM Model • Memory is partitioned into frames • Each frame holds a block • A block holds a set of memory lines • A block is assumed to have a consecutive address range [Figure: frames holding blocks, with each block made up of memory lines]
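As a small illustration of this model, a logical line address can be split into a block number and a line offset, and the block number looked up in a block-to-frame table. The block size of 4 lines and the function names are hypothetical, and the local (within-frame) remapping is left out.

```python
LINES_PER_BLOCK = 4  # hypothetical block size, in memory lines


def split_address(line_addr, lines_per_block=LINES_PER_BLOCK):
    """Split a logical line address into (block number, line offset).

    Relies on each block covering a consecutive address range.
    """
    return divmod(line_addr, lines_per_block)


def physical_line(line_addr, frame_of_block, lines_per_block=LINES_PER_BLOCK):
    """Map a logical line to a physical line via a block-to-frame table."""
    block, offset = split_address(line_addr, lines_per_block)
    return frame_of_block[block] * lines_per_block + offset


if __name__ == "__main__":
    frame_of_block = [3, 2, 1, 0]         # block i currently lives in this frame
    print(split_address(9))               # block 2, offset 1
    print(physical_line(9, frame_of_block))
```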

  14. Hierarchical Ouroboros Wear-leveling • Aim: – Distribute writes as smoothly as possible • Level 1: Local WL within frames – Start-Gap-like rule – Smooth distribution of writes within a frame – Granularity: memory line – Aim: make the expensive large-block global WL less frequent

  15. Hierarchical Ouroboros Wear-leveling • Level 2: Global WL across frames – Exploit demand prediction to direct global wear-leveling – Use randomization in block migration to avoid worst-case behavior – Smooth distribution of writes across frames – Granularity: frame – Aim: involve all memory space in wear-leveling

  16. Global Wear-Leveling Framework: Demand-based Ouroboros Migration • Inputs: 1. Usage counter of each physical frame (U) 2. Prediction of the number of future writes to each logical block (P) – Repetitive workloads – Program analysis (embedded applications) – Use recent activity (demand) as predictor
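The slide does not say how recent activity is turned into the prediction vector P. One possible reading, shown purely as an assumption, is an exponentially weighted average of per-epoch write counts; the class name, the `decay` parameter, and the epoch notion are all hypothetical.

```python
class DemandPredictor:
    """Assumed predictor: exponential moving average of per-epoch write counts."""

    def __init__(self, n_blocks, decay=0.5):
        self.decay = decay                 # weight given to older history
        self.current = [0] * n_blocks      # writes seen in the current epoch
        self.P = [0.0] * n_blocks          # predicted demand per logical block

    def record_write(self, block):
        self.current[block] += 1

    def end_epoch(self):
        """Fold the finished epoch into P and start a new epoch."""
        self.P = [self.decay * p + (1 - self.decay) * c
                  for p, c in zip(self.P, self.current)]
        self.current = [0] * len(self.current)
        return self.P


if __name__ == "__main__":
    pred = DemandPredictor(n_blocks=3)
    for _ in range(4):
        pred.record_write(1)
    print(pred.end_epoch())
```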

  17. Global Wear-Leveling Framework 1. Collect statistics: • Estimate future demand of each block to form a vector P • Collect current usage for each frame to form a vector U 2. Generate raw block migration mapping: • Aim: map the i-th hottest (highest-demand) block to the i-th coldest (lowest-usage) frame

  18. Raw Block Migration [Figure: at initialization, logical blocks are sorted hot-to-cold by demand P and physical frames are sorted cold-to-hot by usage U, then paired in order. Final block order: D C F E B A]
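The raw mapping can be sketched as two sorts followed by a pairwise assignment. The demand and usage values below are a reconstruction read off the garbled slide figure and should be treated as illustrative; the assumption that blocks A to F start in frames 0 to 5, the function names, and the tie-breaking by original order are not confirmed by the slides.

```python
def raw_migration(demand, usage):
    """Return new_frame[block]: i-th hottest block goes to i-th coldest frame."""
    # Python's sort is stable, so ties keep their original (index) order.
    blocks_hot_to_cold = sorted(range(len(demand)), key=lambda b: -demand[b])
    frames_cold_to_hot = sorted(range(len(usage)), key=lambda f: usage[f])
    new_frame = [None] * len(demand)
    for block, frame in zip(blocks_hot_to_cold, frames_cold_to_hot):
        new_frame[block] = frame
    return new_frame


def layout(new_frame, names):
    """Render the block stored in each frame, in frame order."""
    order = [None] * len(new_frame)
    for block, frame in enumerate(new_frame):
        order[frame] = names[block]
    return " ".join(order)


# Reconstructed example values (assumed): blocks A..F, frames 0..5.
NAMES = "ABCDEF"
P = [5, 10, 15, 0, 0, 0]        # predicted demand per logical block
U = [20, 5, 100, 40, 6, 10]     # usage per physical frame

if __name__ == "__main__":
    print(layout(raw_migration(P, U), NAMES))   # D C F E B A
```

Note that the raw mapping relocates every block, including cold ones whose placement barely matters.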

  19. Global Wear-Leveling Framework 1. Collect statistics: • Estimate future demand of each block to form a vector P • Collect current usage for each frame to form a vector U 2. Generate raw block migration mapping: • Aim: map the i-th hottest (highest-demand) block to the i-th coldest (lowest-usage) frame 3. Classification step: • Identify a hot pool with up to K hottest blocks that meet a minimum demand threshold 4. Pruning step: • Move only blocks in the hot pool to deterministic frames

  20. Block Migration with Pruning Method [Figure: same initialization as the raw block migration example (blocks sorted hot-to-cold by demand, frames sorted cold-to-hot by usage), with |H| = |C| = 2. Final block order: A C E D B F]
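A sketch of the classification and pruning steps follows, under stated assumptions: the example values are reconstructed from the slide figure, blocks A to F are assumed to start in frames 0 to 5, the `min_demand` threshold of 10 is hypothetical, and the way displaced blocks are sent to the vacated frames is one plausible way to close the deterministic ring rather than the authors' confirmed rule.

```python
def pruned_migration(frame_of, demand, usage, k, min_demand):
    """Move only the hot pool; return the new frame_of[block] list."""
    n = len(frame_of)
    # Classification: up to k hottest blocks that meet the demand threshold.
    hot = [b for b in sorted(range(n), key=lambda b: -demand[b])
           if demand[b] >= min_demand][:k]
    # The matching number of coldest frames, coldest first.
    cold = sorted(range(n), key=lambda f: usage[f])[:len(hot)]
    block_in = {f: b for b, f in enumerate(frame_of)}
    new = list(frame_of)
    for b, f in zip(hot, cold):
        new[b] = f
    # Frames left empty by hot blocks, and blocks pushed out of cold frames.
    vacated = [frame_of[b] for b in hot if frame_of[b] not in cold]
    displaced = [block_in[f] for f in cold if block_in[f] not in hot]
    # Close the ring deterministically: displaced blocks fill vacated frames.
    for b, f in zip(displaced, vacated):
        new[b] = f
    return new


def layout(new_frame, names):
    order = [None] * len(new_frame)
    for block, frame in enumerate(new_frame):
        order[frame] = names[block]
    return " ".join(order)


NAMES = "ABCDEF"
P = [5, 10, 15, 0, 0, 0]        # reconstructed demand values (assumed)
U = [20, 5, 100, 40, 6, 10]     # reconstructed usage values (assumed)

if __name__ == "__main__":
    start = list(range(6))      # block i starts in frame i (assumed)
    print(layout(pruned_migration(start, P, U, k=2, min_demand=10), NAMES))
    # A C E D B F
```

With K = 2 only three blocks move (C, B, and the displaced E), compared with all six in the raw mapping.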

  21. Deterministic Block Migration Ring [Figure: deterministic block migration among hot blocks C and B and cold block E, using frames 1 and 4]

  22. Ouroboros Block Migration Ring [Figure: the deterministic block migration (hot blocks C and B, cold block E) shown next to the Ouroboros block migration ring, which adds a random free block drawn from the free frame pool]

  23. Ouroboros Block Migration Ring [Figure: continuation of the previous slide, showing the completed Ouroboros block migration ring through hot blocks C and B, cold block E, and random free block F from the free frame pool]

  24. Global Wear-Leveling Framework 1. Collect statistics: • Estimate future demand of each block to form a vector P • Collect current usage for each frame to form a vector U 2. Generate raw block migration mapping: • Aim: map the i-th hottest (highest-demand) block to the i-th coldest (lowest-usage) frame 3. Classification step: • Identify a hot pool with up to K hottest blocks that meet a minimum demand threshold 4. Pruning step: • Move only blocks in the hot pool to deterministic frames 5. Randomization step: • Identify a free frame pool with more than K free frames for randomization 6. Form the Ouroboros block migration ring for block relocation

  25. Block Migration with Randomization [Figure: same initialization as the previous migration examples (blocks sorted hot-to-cold by demand, frames sorted cold-to-hot by usage), plus a free frame pool, with |H| = |C| = 2 and |F| = 2. Final block order: A C F D B E]
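The randomization step can be sketched by extending the pruned migration: each displaced cold block goes to a randomly chosen frame from the free frame pool, and the block already in that frame closes the ring by moving into the vacated frame. The example values are reconstructed from the slide figure; how the free frame pool is selected is not stated in the slides, so `free_pool` is passed in as a hypothetical parameter, as are `min_demand` and the function names.

```python
import random


def ouroboros_migration(frame_of, demand, usage, k, min_demand, free_pool, rng):
    """Hot blocks go to the coldest frames; displaced blocks take a random detour."""
    n = len(frame_of)
    hot = [b for b in sorted(range(n), key=lambda b: -demand[b])
           if demand[b] >= min_demand][:k]
    cold = sorted(range(n), key=lambda f: usage[f])[:len(hot)]
    block_in = {f: b for b, f in enumerate(frame_of)}
    new = list(frame_of)
    for b, f in zip(hot, cold):
        new[b] = f
    vacated = [frame_of[b] for b in hot if frame_of[b] not in cold]
    displaced = [block_in[f] for f in cold if block_in[f] not in hot]
    # Free frames must not already take part in the deterministic moves.
    busy = set(cold) | {frame_of[b] for b in hot}
    candidates = [f for f in free_pool if f not in busy]
    # Raises ValueError if the pool is too small for the displaced blocks.
    picks = rng.sample(candidates, len(displaced))
    for b, v, r in zip(displaced, vacated, picks):
        new[b] = r                 # displaced block -> random free frame
        new[block_in[r]] = v       # that frame's old block -> vacated frame
    return new


def layout(new_frame, names):
    order = [None] * len(new_frame)
    for block, frame in enumerate(new_frame):
        order[frame] = names[block]
    return " ".join(order)


NAMES = "ABCDEF"
P = [5, 10, 15, 0, 0, 0]        # reconstructed demand values (assumed)
U = [20, 5, 100, 40, 6, 10]     # reconstructed usage values (assumed)

if __name__ == "__main__":
    start = list(range(6))      # block i starts in frame i (assumed)
    new = ouroboros_migration(start, P, U, k=2, min_demand=10,
                              free_pool=[0, 5], rng=random.Random())
    print(layout(new, NAMES))   # "A C F D B E" when frame 5 is drawn
```

Because the detour frame is drawn at random, an adversarial write pattern cannot predict where the displaced block lands, which is the motivation given for randomization on the earlier slides.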

  26. Outline • Background • Previous Work • Our Contributions – Hierarchical Ouroboros Wear-leveling – System Design • Architecture • Parameter selection – Experiments and Results • Conclusion
