Ouroboros Wear-leveling: A Two-level Hierarchical Wear-leveling Model for NVRAM

Qingyue Liu, Peter Varman
ECE Department, Rice University
May 18, 2017

New Challenges for New Technologies

Technologies: RRAM, PCM, 3D XPoint

• Advantages
– High density: easy to scale down below 10 nm
– Non-volatile
– In-place update
– Low leakage power
• Major drawback
– Lifetime endurance problem
– PCM: 10^7 to 10^8 writes per cell
– In practice, lifetime is around 20x shorter without wear-leveling

Wear-leveling (WL)

• A technique for prolonging the service life of some kinds of erasable computer storage media
• Block migration across the memory according to certain rules
– Move high-usage blocks to low-usage frames
• Aim: distribute writes evenly across the memory

[Figure: example of a write to block A, with blocks A to H held in frames 0 to 7 and a usage count shown for each frame.]

SSD WL vs. NVRAM WL

• Solid State Disk (SSD)
– Written out-of-place
– Granularity:
➢ Read/write: page
➢ Erase: block
– Requires garbage collection
• NVRAM
– In-place writing
– Granularity:
➢ Read/write: byte
➢ No erase
– No garbage collection
• NVRAM has more freedom and can do better
– No complex design for garbage collection
– Fine-grained wear-leveling
– Allows both algebraic and fully associative logical-to-physical mappings

Outline

• Background
• Previous Work
• Our Contributions
– Hierarchical Ouroboros Wear-leveling
– System Design
➢ Architecture
➢ Parameter selection
– Experiments and Results
• Conclusion

Previous Work: NVRAM

• Wear-leveling using restricted algebraic mappings
– No address mapping table
– Granularity: memory line (cache line)
– Example: Start-Gap Wear-leveling [1]
• Wear-leveling using fully associative mappings
– Additional address mapping table needed
– Granularity: block
– Example: Segment Swapping [2], PCM-aware swap [3]

[1] Qureshi et al., "Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling," MICRO, 2009.
[2] Zhou et al., "A durable and energy efficient main memory using phase change memory technology," ISCA, 2009.
[3] A. P. Ferreira, M. Zhou, S. Bock, B. Childers, R. Melhem, and D. Mossé, "Increasing PCM main memory lifetime," in Proceedings of the Conference on Design, Automation and Test in Europe. European Design and Automation Association, 2010, pp. 914-919.

Start-Gap Method Analysis

• Advantages:
– Distributes writes smoothly within the frame
– Small space overhead
– Simple algorithm
• Disadvantages:
– Region size is limited since only 1 line is relocated at a time
– May not use the whole region to distribute the writes

[Figure: Start-Gap example with the Start and Gap pointers and the GapLine, shown for regions holding lines A to D and lines Q to X.]
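The Start-Gap rule can be illustrated with a toy model. The sketch below follows the scheme as commonly described for Start-Gap [1]: n logical lines live in n + 1 physical slots, and two registers (start, gap) define the mapping without any table. The class name, the `gap_interval` parameter, and the `slots` bookkeeping list are assumptions made for illustration, not details taken from the slides.

```python
class StartGap:
    """Toy model of one Start-Gap region: n logical lines in n + 1 physical slots."""

    def __init__(self, n_lines, gap_interval):
        self.n = n_lines
        self.interval = gap_interval          # writes between gap movements (assumed parameter)
        self.start = 0
        self.gap = n_lines                    # the spare slot starts at the end
        self.writes = 0
        # slots[s] = logical line stored in physical slot s (None marks the gap)
        self.slots = list(range(n_lines)) + [None]

    def physical(self, logical):
        """Algebraic mapping: no table, only the two registers."""
        pa = (logical + self.start) % self.n
        return pa + 1 if pa >= self.gap else pa

    def _move_gap(self):
        if self.gap == 0:
            # Wrap around: the line in the last slot moves to slot 0, start advances.
            self.slots[0], self.slots[self.n] = self.slots[self.n], None
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            # The line just before the gap moves into the gap; the gap shifts down by one.
            self.slots[self.gap] = self.slots[self.gap - 1]
            self.slots[self.gap - 1] = None
            self.gap -= 1

    def write(self, logical):
        """Return the slot that receives this write, then move the gap if it is due."""
        slot = self.physical(logical)
        self.writes += 1
        if self.writes % self.interval == 0:
            self._move_gap()
        return slot
```

Because only one line is relocated per gap movement, a hot line needs on the order of n gap movements to drift by a single slot, which is one way to read the region-size limitation listed above.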

Segment Swap vs. PCM-aware Swap

• Segment Swap:
– Periodically swap the content of the highest-usage frame with the content of the lowest-usage frame
• PCM-aware Swap:
– Periodically swap the content of the highest-usage frame with the content of a random frame
• Advantages:
– Can involve all of the memory space in wear-leveling
– Easy to implement

[Figure: blocks A to H in frames 0 to 7; the content of block A's frame is swapped with the content of another frame.]

Analysis of 2 Swap Methods: A* Pattern

[Figure: three panels: Without Wear-leveling, Segment Swap, PCM-aware Swap.]

• A* pattern: write to the same logical block A continuously
• Deterministic swap is better than randomized swap under the right conditions

Analysis of 2 Swap Methods: AB* Pattern

[Figure: three panels: Without Wear-leveling, Segment Swap, PCM-aware Swap.]

• AB* pattern: alternate writes to two logical blocks A and B (a catastrophic pattern for Segment Swap)
• Randomized swap is better than deterministic swap in bad cases
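The two swap policies and the A* and AB* patterns can be explored with a small simulator. This is a minimal sketch under stated assumptions, not the experiment behind the slides: the swap period, the frame count, and the cost of one extra write per frame for each swap are all hypothetical choices, and the function name `simulate` is made up for illustration.

```python
import random


def simulate(pattern, n_frames=8, n_writes=8000, period=100, policy="segment", seed=0):
    """Return per-frame write counts after replaying a cyclic write pattern.

    pattern: logical block ids written round-robin, e.g. [0] for A*, [0, 1] for AB*.
    policy:  "segment" swaps the highest-usage frame with the lowest-usage frame,
             "random" swaps it with a uniformly random frame.
    """
    rng = random.Random(seed)
    frame_of = list(range(n_frames))   # logical block -> physical frame
    block_in = list(range(n_frames))   # physical frame -> logical block
    usage = [0] * n_frames
    for i in range(n_writes):
        usage[frame_of[pattern[i % len(pattern)]]] += 1
        if (i + 1) % period == 0:
            hot = max(range(n_frames), key=usage.__getitem__)
            if policy == "segment":
                other = min(range(n_frames), key=usage.__getitem__)
            else:
                other = rng.randrange(n_frames)
            if other != hot:
                a, b = block_in[hot], block_in[other]
                block_in[hot], block_in[other] = b, a
                frame_of[a], frame_of[b] = other, hot
                usage[hot] += 1        # assumed swap cost: one write to each frame
                usage[other] += 1
    return usage
```

Calling `simulate([0], policy="segment")` and `simulate([0, 1], policy="random")` and comparing `max(usage)` against the no-wear-leveling case (all writes landing on one or two frames) gives a rough feel for the trade-off; the exact numbers depend heavily on the assumed period and swap cost.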

NVRAM Model

• Memory is partitioned into frames
• Each frame holds a block
• A block holds a set of memory lines
• A block is assumed to have a consecutive address range

[Figure: memory divided into frames; each frame holds one block, and each block consists of memory lines (for example, lines A to D and Q to X).]

Hierarchical Ouroboros Wear-leveling

• Aim:
– Guarantee a write distribution that is as smooth as possible
• Level 1: Local WL within frames
– Start-Gap-like rule
– Smooth distribution of writes within a frame
– Granularity: memory line
– Aim: make the expensive, large-block global WL less frequent

Hierarchical Ouroboros Wear-leveling

• Level 2: Global WL across frames
– Exploit demand prediction to direct global wear-leveling
– Use randomization in block migration to avoid worst-case behavior
– Smooth distribution of writes across frames
– Granularity: frame
– Aim: involve all of the memory space in wear-leveling
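One way to picture how the two levels could compose during address translation is sketched below. The slides do not spell out the translation path, so everything here is an assumption: a Level-2 table `frame_of_block` maps logical blocks to frames, and each frame carries its own Start-Gap-style `start` and `gap` registers plus one spare slot.

```python
def translate(logical_line, lines_per_block, frame_of_block, start, gap):
    """Map a logical line address to (frame, slot) under an assumed two-level scheme.

    frame_of_block: Level-2 table, logical block id -> physical frame id.
    start, gap:     Level-1 Start-Gap-style registers, one pair per frame.
    Each frame is assumed to hold lines_per_block lines plus one spare slot.
    """
    block, offset = divmod(logical_line, lines_per_block)
    frame = frame_of_block[block]                        # Level 2: table lookup
    slot = (offset + start[frame]) % lines_per_block     # Level 1: algebraic remap
    if slot >= gap[frame]:
        slot += 1                                        # skip over the gap slot
    return frame, slot
```

The table has one entry per block rather than per line, which is why a coarse global granularity keeps the mapping overhead small while the algebraic rule handles the fine-grained level.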

Global Wear-Leveling Framework: Demand-based Ouroboros Migration

• Inputs
1. Usage counter of each physical frame (U)
2. Prediction of the number of future writes to each logical block (P)
– Repetitive workloads
– Program analysis (embedded applications)
– Use recent activity (demand) as predictor
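As an illustration of the "recent activity as predictor" option, the sketch below counts writes per logical block during an epoch and blends them into the prediction vector P with exponential smoothing. The smoothing itself, the `alpha` weight, and the class and method names are assumptions for this example; the slides only say that recent demand is used.

```python
class DemandPredictor:
    """Predict per-block demand P from write counts observed in recent epochs."""

    def __init__(self, n_blocks, alpha=0.5):
        self.alpha = alpha                  # weight of the most recent epoch (assumed)
        self.counts = [0] * n_blocks        # writes seen in the current epoch
        self.P = [0.0] * n_blocks           # predicted demand per logical block

    def record_write(self, block):
        self.counts[block] += 1

    def end_epoch(self):
        """Fold the finished epoch into P and start a new epoch."""
        self.P = [self.alpha * c + (1 - self.alpha) * p
                  for c, p in zip(self.counts, self.P)]
        self.counts = [0] * len(self.counts)
        return self.P
```

Setting `alpha=1.0` reduces this to using the last epoch's write counts directly.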

Global Wear-Leveling Framework

1. Collect statistics:
• Estimate the future demand of each block to form a vector P
• Collect the current usage of each frame to form a vector U
2. Generate the raw block migration mapping
• Aim: map the i-th hottest (highest-demand) block to the i-th coldest (lowest-usage) frame

Raw Block Migration

[Figure: logical blocks (demand P) sorted hot-to-cold are matched with physical frames (usage U) sorted cold-to-hot. Final block order: D C F E B A.]
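The raw mapping amounts to two sorts followed by a pairwise match. In the sketch below, the demand and usage values are illustrative assumptions, chosen so that the result reproduces the final block order D C F E B A from the example above; the function name is also made up.

```python
def raw_migration(P, U):
    """Pair the i-th hottest block with the i-th coldest frame.

    P[b] is the predicted demand of logical block b; U[f] is the usage of frame f.
    Returns target, where target[b] is the frame assigned to block b.
    """
    blocks = sorted(range(len(P)), key=lambda b: -P[b])   # hot to cold; ties keep block order
    frames = sorted(range(len(U)), key=lambda f: U[f])    # cold to hot; ties keep frame order
    target = [None] * len(P)
    for b, f in zip(blocks, frames):
        target[b] = f
    return target


# Illustrative values (assumed): blocks A..F are indices 0..5.
P = [5, 10, 15, 0, 0, 0]
U = [20, 5, 100, 40, 6, 10]
target = raw_migration(P, U)

# Read the layout back frame by frame.
layout = [None] * len(U)
for block, frame in enumerate(target):
    layout[frame] = "ABCDEF"[block]
```

Under this raw mapping every block may move, including blocks with zero predicted demand, which is what the later pruning step avoids.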

Global Wear-Leveling Framework

1. Collect statistics:
• Estimate the future demand of each block to form a vector P
• Collect the current usage of each frame to form a vector U
2. Generate the raw block migration mapping
• Aim: map the i-th hottest (highest-demand) block to the i-th coldest (lowest-usage) frame
3. Classification step:
• Identify a hot pool with up to K hottest blocks that meet a minimum demand threshold
4. Pruning step:
• Move only the blocks in the hot pool to deterministic frames
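The classification and pruning steps can be sketched as a filter on top of the raw mapping. The parameter name `min_demand` and the sample values are assumptions; the slides give K and a minimum demand threshold but no concrete numbers.

```python
def hot_pool(P, K, min_demand):
    """Return up to K hottest logical blocks whose predicted demand meets the threshold."""
    ranked = sorted(range(len(P)), key=lambda b: -P[b])
    return [b for b in ranked[:K] if P[b] >= min_demand]


def pruned_targets(P, U, K, min_demand):
    """Assign only hot-pool blocks to frames: i-th hottest block -> i-th coldest frame."""
    hot = hot_pool(P, K, min_demand)
    cold_frames = sorted(range(len(U)), key=lambda f: U[f])[:len(hot)]
    return dict(zip(hot, cold_frames))


# Illustrative values (assumed): blocks A..F are indices 0..5.
P = [5, 10, 15, 0, 0, 0]
U = [20, 5, 100, 40, 6, 10]
moves = pruned_targets(P, U, K=2, min_demand=1)   # {2: 1, 1: 4}: C -> frame 1, B -> frame 4
```

Blocks outside the hot pool receive no deterministic target, so they move only if a hot block displaces them.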

Block Migration with Pruning Method

[Figure: logical blocks (demand P) sorted hot-to-cold and physical frames (usage U) sorted cold-to-hot; only the hot blocks are assigned to cold frames, with |H| = |C| = 2. Final block order: A C E D B F.]

Deterministic Block Migration Ring

[Figure: deterministic block migration ring linking hot blocks C and B and cold block E through frames 1, 4, and 2.]
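A deterministic ring can be derived from the hot blocks' targets by following each displaced cold block back to the frame that its chain leaves empty. The sketch below is one plausible way to do this, assuming an initial layout of blocks A to F in frames 0 to 5; with the moves C to frame 1 and B to frame 4 it yields the final block order A C E D B F given for the pruning example. Function and variable names are made up.

```python
def deterministic_migration(block_in, targets):
    """Apply hot-block moves and close each chain into a ring.

    block_in[f] is the logical block currently stored in frame f.
    targets maps each hot block to its destination (cold) frame.
    Returns the new frame -> block layout; untouched frames keep their block.
    """
    frame_of = {b: f for f, b in enumerate(block_in)}
    mover_into = {f: b for b, f in targets.items()}   # frame -> hot block arriving there
    new_layout = list(block_in)
    for block, frame in targets.items():
        new_layout[frame] = block
    for frame in targets.values():
        displaced = block_in[frame]
        if displaced in targets:
            continue                                  # a hot block; it has its own target
        # Walk back along the chain until reaching a frame that nobody moves into.
        vacated = frame_of[mover_into[frame]]
        while vacated in mover_into:
            vacated = frame_of[mover_into[vacated]]
        new_layout[vacated] = displaced               # the cold block closes the ring
    return new_layout


layout = deterministic_migration(list("ABCDEF"), {"C": 1, "B": 4})
```

Here the ring is C to frame 1, B to frame 4, and the displaced cold block E to frame 2, the frame C vacated.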

Ouroboros Block Migration Ring

[Figure: the deterministic block migration ring (hot blocks C and B, cold block E) shown next to the Ouroboros block migration ring, which extends it with a block held in a frame chosen at random from the free frame pool.]

Global Wear-Leveling Framework

1. Collect statistics:
• Estimate the future demand of each block to form a vector P
• Collect the current usage of each frame to form a vector U
2. Generate the raw block migration mapping
• Aim: map the i-th hottest (highest-demand) block to the i-th coldest (lowest-usage) frame
3. Classification step:
• Identify a hot pool with up to K hottest blocks that meet a minimum demand threshold
4. Pruning step:
• Move only the blocks in the hot pool to deterministic frames
5. Randomization step:
• Identify a free frame pool with more than K free frames for randomization
6. Form the Ouroboros block migration ring for block relocation

Block Migration with Randomization

[Figure: logical blocks (demand P) sorted hot-to-cold and physical frames (usage U) sorted cold-to-hot, plus a free frame pool, with |H| = |C| = 2 and |F| = 2. Final block order: A C F D B E.]
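The randomized variant can be sketched by routing each displaced cold block through a frame drawn at random from the free frame pool before the ring closes. This is an interpretation rather than the authors' stated procedure: the slides do not define how the free pool is chosen, so the pool is passed in by the caller, and the pool [3, 5] in the example is an assumption. With frame 5 drawn, the result matches the final block order A C F D B E from the example.

```python
import random


def ouroboros_migration(block_in, targets, free_frames, rng=random):
    """Ring migration with one random detour per displaced cold block.

    block_in[f]: logical block currently stored in frame f.
    targets:     hot block -> destination (cold) frame.
    free_frames: frames eligible for randomization (assumed disjoint from the hot
                 blocks' current and target frames, at least one per ring).
    """
    frame_of = {b: f for f, b in enumerate(block_in)}
    mover_into = {f: b for b, f in targets.items()}
    pool = list(free_frames)
    new_layout = list(block_in)
    for block, frame in targets.items():
        new_layout[frame] = block
    for frame in targets.values():
        displaced = block_in[frame]
        if displaced in targets:
            continue                                   # hot blocks follow their own targets
        vacated = frame_of[mover_into[frame]]
        while vacated in mover_into:
            vacated = frame_of[mover_into[vacated]]
        detour = pool.pop(rng.randrange(len(pool)))    # random free frame, used at most once
        new_layout[detour] = displaced                 # cold block lands in the random frame
        new_layout[vacated] = block_in[detour]         # its old occupant closes the ring
    return new_layout


layout = ouroboros_migration(list("ABCDEF"), {"C": 1, "B": 4}, free_frames=[3, 5])
```

Because the detour frame is not predictable from P and U alone, a workload cannot easily steer a cold block back onto the same frame, which is the role randomization plays in avoiding worst-case behavior.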
