 
Exploiting Managed Language Semantics to Mitigate Wear-out in Persistent Memory
Shoaib Akram, Ghent University, Belgium
Flash Memory Summit 2019, Santa Clara, CA

Main memory capacity expansion
Charge storage in DRAM is a scaling limitation
Manufacturing complexity makes DRAM pricing volatile
[Figure: DRAM price per Gb ($), Jan '17 to Jan '18. Source: WSTS, IC Insights]

Phase change memory (PCM)
Advantages:
Scalable → more Gb for the same price
Byte addressable like DRAM
Latency closer to DRAM
Drawback:
Low write endurance

Why does PCM have low write endurance?
Information is stored as a change in resistance
Crystalline is set and amorphous is reset
Electric pulses to program PCM cells wear them out
[Figure: temperature versus time for the amorphous and crystalline programming pulses]

Mitigating PCM wear-out
Wear-leveling to spread writes across PCM
Problem: PCM-only with wear-leveling wears out in a few months

Hybrid DRAM-PCM memory
[Figure: DRAM and PCM, with the labels Capacity, Endurance, and Persistence]
This talk → Use DRAM to limit PCM writes

OS to limit PCM writes
[Figure: DRAM and PCM]
Page migrations hurt performance and PCM lifetime

Managed runtimes
Platform independence: abstract hardware/OS → aka virtual machine
Ease programmer's burden: garbage collection (GC)
[Figure: software stack of Application, Managed Runtime, Operating System, Hardware]

GC to limit PCM writes
GC is aware of heap semantics → pro-active allocation
GC operates with objects → fine-grained management
[Figure: software stack with Application, Operating System, Hardware]

Write distribution in GC heap
[Figure: GC heap with nursery and mature spaces]
Nursery: 70% of writes
Mature: 22% of writes go to 2% of objects

Write-rationing garbage collection
Limit PCM writes by discovering highly written objects
Kingsguard → dynamic monitoring
Crystal Gazer → prediction

Kingsguard-Nursery (KG-N)
[Figure: heap layout showing the nursery, mature, and large spaces placed in DRAM and PCM]

Kingsguard-Writers (KG-W)
[Figure: heap layout showing the nursery and observer spaces, with mature and large spaces in both DRAM and PCM]

Metadata optimization
[Figure: object layout with meta and payload]
Full-heap GC: mark a bit in the meta of all live objects
Meta Opt: place object metadata in DRAM

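The effect of the metadata optimization can be illustrated with a small model. This is a sketch under stated assumptions, not the Kingsguard implementation: the dictionary-based heap, the `space` field, and the function names are all hypothetical. It counts how many PCM writes a full-heap mark phase issues when the mark bit lives in the object header versus in a side table held in DRAM.

```python
def mark_in_header(live_objects):
    """Baseline: the mark bit lives in each object's own metadata.

    Marking a live object that resides in PCM costs one PCM write.
    """
    pcm_writes = 0
    for obj in live_objects:
        obj["marked"] = True
        if obj["space"] == "PCM":
            pcm_writes += 1
    return pcm_writes


def mark_in_dram_table(live_objects):
    """Meta Opt (assumed model): mark bits are kept in a DRAM side table.

    The objects themselves are never written during marking.
    """
    dram_mark_table = set()
    for obj in live_objects:
        dram_mark_table.add(obj["id"])
    pcm_writes = 0
    return dram_mark_table, pcm_writes


def make_heap():
    # Hypothetical heap: three live objects in PCM, one in DRAM.
    return [
        {"id": 1, "space": "PCM", "marked": False},
        {"id": 2, "space": "PCM", "marked": False},
        {"id": 3, "space": "DRAM", "marked": False},
        {"id": 4, "space": "PCM", "marked": False},
    ]


print(mark_in_header(make_heap()))          # PCM writes with header marks
print(mark_in_dram_table(make_heap())[1])   # PCM writes with DRAM metadata
```

In this model the cost of the baseline grows with the number of live PCM objects, while the side table moves those writes to DRAM at the price of extra DRAM space.
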
KG-W drawbacks
Monitoring overhead
Limited opportunity to predict writes
Fixed DRAM consumption

Write-rationing garbage collection
Limit PCM writes by discovering highly written objects
Kingsguard → monitoring
Crystal Gazer → prediction

Allocation site as a write predictor
  a = new Object()
  b = new Object()
  c = new Object()
  d = new Object()
One allocation site produces highly written objects
Uniform distribution of writes across sites: bad for prediction
Skewed distribution of writes across sites: good for prediction

Write distribution by allocation site
Few sites capture the majority of writes
[Figure: % of mature objects (series: Writes, Volume) versus sites sorted by writes]

Crystal Gazer operation
Application Profiling → Advice Generation → Bytecode Compilation
Before:
  a = new Object()
  …
  b = new Object()
After:
  a = new Object()
  …
  b = new_dram Object()

Advice generation
Generate <alloc-site, advice> pairs
advice → DRAM or PCM
Input is a write-intensity trace
Two heuristics to classify allocation sites as DRAM

DRAM allocation sites
Frequency: more writes than a threshold
✔ Aggressively limits writes
✗ 1-byte and 1024-byte objects are treated similarly
Density: higher write density than a threshold
✔ Optimizes for writes and DRAM capacity

Classification examples
Frequency threshold = 1

Object identifier   # Writes   # Bytes   Allocation site
O1                  0          4         A() + 10
O2                  0          4         A() + 10
O3                  128        4         A() + 10
O4                  128        4096      B() + 4

→ O3 and O4 exceed the write threshold
Result: PCM writes = 0/256, DRAM bytes = 5008

Classification examples
Density threshold = 1

Object identifier   # Writes   # Bytes   Allocation site
O1                  0          4         A() + 10
O2                  0          4         A() + 10
O3                  128        4         A() + 10
O4                  128        4096      B() + 4

→ O3 has a write density of 32 (128 writes / 4 bytes)
→ O4 has a write density below 1 (128 writes / 4096 bytes)
Result: PCM writes = 128/256, DRAM bytes = 12

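The two heuristics can be replayed on the four-object example with a short script. This is a minimal sketch, not Crystal Gazer's advice generator: the function names are hypothetical, and it assumes that writes and bytes are aggregated per allocation site before comparing against the threshold. Under that assumption the density case reproduces the slide's result (128 PCM writes, 12 DRAM bytes). For the frequency case the script reports 4108 DRAM bytes, the sum of the listed object sizes, whereas the slide states 5008.

```python
from collections import defaultdict

# (object id, writes, bytes, allocation site), taken from the example table.
TRACE = [
    ("O1", 0, 4, "A() + 10"),
    ("O2", 0, 4, "A() + 10"),
    ("O3", 128, 4, "A() + 10"),
    ("O4", 128, 4096, "B() + 4"),
]


def classify(trace, heuristic, threshold):
    """Return the set of allocation sites advised to allocate in DRAM."""
    writes = defaultdict(int)
    size = defaultdict(int)
    for _, w, b, site in trace:
        writes[site] += w
        size[site] += b

    dram_sites = set()
    for site in writes:
        if heuristic == "frequency":
            score = writes[site]               # total writes from the site
        elif heuristic == "density":
            score = writes[site] / size[site]  # writes per allocated byte
        else:
            raise ValueError(f"unknown heuristic: {heuristic}")
        if score > threshold:
            dram_sites.add(site)
    return dram_sites


def evaluate(trace, dram_sites):
    """Return (writes that land in PCM, bytes that occupy DRAM)."""
    pcm_writes = sum(w for _, w, _, s in trace if s not in dram_sites)
    dram_bytes = sum(b for _, _, b, s in trace if s in dram_sites)
    return pcm_writes, dram_bytes


for heuristic in ("frequency", "density"):
    sites = classify(TRACE, heuristic, threshold=1)
    print(heuristic, sorted(sites), evaluate(TRACE, sites))
```

The comparison shows the trade-off named on the heuristics slide: frequency removes all PCM writes but pulls the 4096-byte object into DRAM, while density keeps DRAM use at 12 bytes and accepts half of the writes in PCM.
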
Object placement in Crystal Gazer
new_dram() → set a bit in the object header
GC → inspect the bit on nursery collection to copy the object to DRAM or PCM

Object placement in Crystal Gazer
[Figure: heap layout with the nursery plus mature and large spaces in DRAM and PCM; on nursery collection the GC asks "Is marked highly written?" to choose between DRAM and PCM]

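The placement mechanism can be sketched in a few lines. This is an illustrative model only, not Jikes RVM code: the `Obj` class, the `DRAM_BIT` value, and the list-based spaces are assumptions made for the example. It shows `new_dram` setting a header bit at allocation and the nursery collection reading that bit to pick the destination space.

```python
from dataclasses import dataclass

DRAM_BIT = 0x1  # hypothetical position of the "highly written" header bit


@dataclass
class Obj:
    name: str
    header: int = 0
    live: bool = True


def new(name):
    """Ordinary allocation: header bit left clear."""
    return Obj(name)


def new_dram(name):
    """Allocation from a site advised as DRAM: set the header bit."""
    obj = Obj(name)
    obj.header |= DRAM_BIT
    return obj


def collect_nursery(nursery):
    """Copy live nursery objects to the DRAM or PCM mature space."""
    dram_mature, pcm_mature = [], []
    for obj in nursery:
        if not obj.live:
            continue  # dead objects are not copied
        if obj.header & DRAM_BIT:
            dram_mature.append(obj)
        else:
            pcm_mature.append(obj)
    nursery.clear()
    return dram_mature, pcm_mature


nursery = [new("a"), new_dram("b"), new("c")]
nursery[2].live = False  # "c" dies before the collection
dram, pcm = collect_nursery(nursery)
print([o.name for o in dram], [o.name for o in pcm])
```

Because the decision is a single bit test per surviving object, this style of placement needs no run-time write monitoring, which is the contrast with KG-W drawn earlier in the talk.
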
Persistence
Persistent parent → copy child objects to PCM
VM startup → move highly written objects to DRAM
Write barrier tracks writes and persistent candidates

Evaluation methodology
15 applications → DaCapo, GraphChi, SPECjbb
Medium-end server platform
Different inputs for production and advice
Jikes RVM

Emulation platform
[Figure: App, Jikes RVM, and OS stacked on a two-CPU platform]

PCM write rates → lifetime
PCM-only write rate is up to 1.8 GB/s
Safe operation is 200 MB/s for a 5-10 year lifetime

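The gap between the two rates can be made concrete with a little arithmetic. The sketch below rests on an assumption that is not stated in the talk: that lifetime is inversely proportional to the sustained write rate, with capacity, cell endurance, and wear-leveling held fixed. The function names are hypothetical, and 1.8 GB/s is taken as 1800 MB/s.

```python
SAFE_RATE_MB_S = 200.0          # from the slide
SAFE_LIFETIME_YEARS = (5.0, 10.0)  # from the slide


def required_reduction(write_rate_mb_s, safe_rate_mb_s=SAFE_RATE_MB_S):
    """Factor by which the write rate must drop to reach the safe rate."""
    return write_rate_mb_s / safe_rate_mb_s


def scaled_lifetime_years(write_rate_mb_s):
    """Lifetime range at the given rate, assuming lifetime ~ 1 / write rate."""
    scale = SAFE_RATE_MB_S / write_rate_mb_s
    low, high = SAFE_LIFETIME_YEARS
    return low * scale, high * scale


pcm_only_rate = 1800.0  # 1.8 GB/s expressed in MB/s
print(required_reduction(pcm_only_rate))
print(scaled_lifetime_years(pcm_only_rate))
```

Under this proportionality assumption, the peak PCM-only rate is a factor of nine above the safe rate, so the 5-10 year range shrinks to roughly 0.6 to 1.1 years.
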
PCM write rates
[Figure: write rate in MB/s (0 to 800) for KG-N, KG-W, Dens, and Freq]

Performance
[Figure: execution time normalized to KG-N (0.0 to 1.5) for KG-W, Dens, and Freq; annotations: 30%, 8%]

DRAM capacity
[Figure: % of heap in DRAM (0 to 75) for KG-W, Dens, and Freq; annotation: 25%]

KG-W versus Crystal Gazer
Crystal Gazer opens up Pareto-optimal trade-offs
[Figure: PCM writes normalized to KG-N (0.3 to 0.8) versus DRAM MB (100 to 250) for KG-W and Crystal Gazer]

Write-rationing garbage collection
Hybrid memory is inevitable
Each layer can play a role in wider adoption
Write-rationing GC is pro-active and fine-grained
[Figure: DRAM and PCM]

More information
PLDI 2018 → Write-rationing garbage collection for hybrid memories
SIGMETRICS 2019 → Crystal Gazer: Profile-driven write-rationing garbage collection for hybrid memories
ISPASS 2019 → Emulating and evaluating hybrid memory for managed languages on NUMA platform