Exploiting Managed Language Semantics to Mitigate Wear-out in Persistent Memory
Shoaib Akram Ghent University, Belgium
Flash Memory Summit 2019, Santa Clara, CA
Main memory capacity expansion
Charge storage in DRAM is a scaling limitation
Manufacturing complexity makes DRAM pricing volatile
[Figure: DRAM price per Gb ($), Jan '17 to Jan '18. Source: WSTS, IC Insights]
Phase change memory (PCM)
Pros: scalable (more Gb for the same price), byte addressable like DRAM, latency closer to DRAM
Con: low write endurance
Store information as a change in resistance
Crystalline is set and amorphous is reset
[Figure: temperature versus time for the amorphous and crystalline states]
Why does PCM have low write endurance?
The electric pulses that program PCM cells wear them out
Mitigating PCM wear-out
Wear-leveling to spread writes across PCM
Problem: PCM-Only with wear-leveling wears out in a few months
Hybrid DRAM-PCM memory
This talk: use DRAM to limit PCM writes
Endurance Capacity Persistence
OS to limit PCM writes
Page migrations hurt performance and PCM lifetime
Managed runtimes
[Figure: software stack of Application, Managed Runtime, Operating System, Hardware]
Platform independence: abstract hardware/OS (aka virtual machine)
Ease programmer's burden: garbage collection (GC)
GC to limit PCM writes
GC is aware of heap semantics: pro-active allocation
GC operates on objects: fine-grained management
[Figure: software stack of Application, Operating System, Hardware]
Write Distribution in GC heap
to 2% of objects
Write-Rationing Garbage Collection
Limit PCM writes by discovering highly written objects
Kingsguard: dynamic monitoring
Crystal Gazer: prediction
Kingsguard-Nursery (KG-N)
[Figure: heap layout showing the nursery]
Kingsguard-Writers (KG-W)
[Figure: heap layout showing the nursery]
Metadata optimization
[Figure: object layout with payload and meta fields]
Full-heap GC: mark a bit in the meta of all live objects
Meta Opt: place object metadata in DRAM
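To make the effect of the metadata optimization concrete, here is a minimal simulation sketch. It assumes a toy model in which a full-heap GC writes exactly one mark bit per live object, and that "metadata in DRAM" means the mark bits sit in a DRAM-resident side table. The `ToyHeap` class and its counters are hypothetical and are not taken from Jikes RVM.

```python
from dataclasses import dataclass


@dataclass
class ToyHeap:
    """Toy model (hypothetical): counts where mark-bit writes land."""
    live_objects_in_pcm: int
    meta_in_dram: bool          # True models the metadata optimization
    pcm_writes: int = 0
    dram_writes: int = 0

    def full_heap_mark(self) -> None:
        # A full-heap GC marks one bit in the metadata of every live object.
        for _ in range(self.live_objects_in_pcm):
            if self.meta_in_dram:
                self.dram_writes += 1   # assumed: mark bit kept in DRAM
            else:
                self.pcm_writes += 1    # mark bit kept in the PCM object header


baseline = ToyHeap(live_objects_in_pcm=1000, meta_in_dram=False)
optimized = ToyHeap(live_objects_in_pcm=1000, meta_in_dram=True)
baseline.full_heap_mark()
optimized.full_heap_mark()
print("PCM mark writes:", baseline.pcm_writes, "vs", optimized.pcm_writes)
```

In this toy model every mark write moves from PCM to DRAM. A real collector would also have to account for the DRAM space that the relocated metadata occupies.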
KG-W drawbacks
Monitoring overhead
Limited opportunity to predict writes
Fixed DRAM consumption
Write-Rationing Garbage Collection
Limit PCM writes by discovering highly written objects
Crystal Gazer: prediction
Kingsguard: monitoring
Allocation site as a write predictor
a = new Object()
b = new Object()
c = new Object()
d = new Object()
Uniform distribution
Skewed distribution
Produces highly written objects
Write distribution by allocation site
Few sites capture majority of writes
[Figure: % mature objects (y-axis) against sites sorted by writes (x-axis); series: Writes, Volume]
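The claim that a few sites capture most writes can be checked on any profile by sorting sites by write count and accumulating their share. The sketch below does this on a made-up profile. The site names and counts are hypothetical, chosen only to illustrate a skewed distribution, and `cumulative_write_share` is an illustrative helper rather than part of any tool from the talk.

```python
def cumulative_write_share(writes_per_site):
    """Cumulative % of writes captured by sites, sorted by descending writes."""
    ordered = sorted(writes_per_site.values(), reverse=True)
    total = sum(ordered)
    if total == 0:
        return [0.0 for _ in ordered]
    shares = []
    running = 0
    for writes in ordered:
        running += writes
        shares.append(100.0 * running / total)
    return shares


# Hypothetical profile: allocation site -> number of writes observed.
profile = {"A() + 10": 700, "B() + 4": 200, "C() + 2": 60, "D() + 8": 40}
shares = cumulative_write_share(profile)
for rank, share in enumerate(shares, start=1):
    print(f"top {rank} site(s): {share:.1f}% of writes")
```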
Crystal Gazer operation
Stages: application profiling, advice generation, bytecode compilation
Original: a = new Object() … b = new Object()
With advice: a = new Object() … b = new_dram Object()
Advice generation
Generate <alloc-site, advice> pairs
advice: DRAM or PCM
Input is a write-intensity trace
Two heuristics to classify allocation sites as DRAM
DRAM allocation sites
Frequency: more than a threshold number of writes
  Aggressively limits writes
  1-byte and 1024-byte objects are treated similarly
Density: more than a threshold write density
  Optimizes for writes and DRAM capacity
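The two heuristics can be sketched as a small advice generator. This is a simplified illustration, not Crystal Gazer's actual implementation. It assumes that a site is classified from the aggregate writes and bytes of all objects it allocated, and that density means writes per byte. The trace format, the `generate_advice` function and the sample sites are all hypothetical.

```python
from collections import defaultdict


def generate_advice(trace, heuristic, threshold):
    """Map each allocation site to "DRAM" or "PCM".

    trace: iterable of (alloc_site, writes, size_bytes), one entry per object.
    Assumption: sites are scored on per-site aggregates, and sizes are > 0.
    """
    writes = defaultdict(int)
    size = defaultdict(int)
    for site, w, b in trace:
        writes[site] += w
        size[site] += b

    advice = {}
    for site in writes:
        if heuristic == "frequency":
            score = writes[site]                # ignores object size
        elif heuristic == "density":
            score = writes[site] / size[site]   # writes per byte
        else:
            raise ValueError(f"unknown heuristic: {heuristic}")
        advice[site] = "DRAM" if score > threshold else "PCM"
    return advice


# Hypothetical write-intensity trace.
trace = [
    ("X() + 1", 500, 16),
    ("X() + 1", 300, 16),
    ("Y() + 7", 900, 8192),
    ("Z() + 3", 2, 64),
]
advice_freq = generate_advice(trace, "frequency", 100)
advice_dens = generate_advice(trace, "density", 1.0)
print(advice_freq)
print(advice_dens)
```

On this sample, the frequency heuristic sends the large but heavily written `Y() + 7` objects to DRAM, while the density heuristic keeps them in PCM because they absorb few writes per byte. That is the capacity trade-off the slide describes.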
Classification examples
Object Identifier   # Writes   # Bytes   Allocation site
O1                  0          4         A() + 10
O2                  0          4         A() + 10
O3                  128        4         A() + 10
O4                  128        4096      B() + 4
Frequency threshold = 1: PCM writes = 0/256, DRAM bytes = 5008
Density threshold = 1: PCM writes = 128/256, DRAM bytes = 12 (densities annotated on the slide: 32 and <1)
Object placement in Crystal Gazer
new_dram(): set a bit in the object header
GC: inspect the bit on nursery collection to copy the object to DRAM or PCM
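The placement step can be sketched as a toy nursery collection. This is a simulation under stated assumptions rather than Jikes RVM code: the `Obj` class, the `dram_bit` field standing in for the header bit, and the explicit `live` set are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    name: str
    dram_bit: bool = False   # stands in for the header bit set by new_dram


def new(name):
    """Ordinary allocation: header bit left clear."""
    return Obj(name)


def new_dram(name):
    """Allocation from a site advised as DRAM: header bit set."""
    return Obj(name, dram_bit=True)


def nursery_collection(nursery, live):
    """Copy each surviving object to DRAM or PCM according to its header bit."""
    dram, pcm = [], []
    for obj in nursery:
        if obj.name not in live:
            continue             # dead objects are simply reclaimed
        (dram if obj.dram_bit else pcm).append(obj)
    nursery.clear()              # the nursery is empty after the collection
    return dram, pcm


nursery = [new("a"), new_dram("b"), new("c")]
dram, pcm = nursery_collection(nursery, live={"a", "b"})
print([o.name for o in dram], [o.name for o in pcm])
```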
[Figure: on nursery collection, the GC checks whether each object is marked highly written]
Persistence
Persistent parent: copy child objects to PCM
VM startup: move highly written objects to DRAM
Write barrier tracks writes and persistent candidates
Evaluation methodology
15 applications: DaCapo, GraphChi, SpecJBB
Medium-end server platform
Different inputs for production and advice
Jikes RVM
Emulation platform
[Figure: two-CPU platform; software stack labels: App, Jikes RVM, OS]
PCM write rates and lifetime
PCM-Only write rate is up to 1.8 GB/s
Safe operation is 200 MB/s for a 5-10 year lifetime
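To relate the two numbers above, the sketch below scales the quoted lifetime by write rate. It assumes lifetime is inversely proportional to sustained write rate (ideal wear-leveling, fixed capacity and endurance) and treats 1.8 GB/s as 1800 MB/s. Both are simplifying assumptions, and `scaled_lifetime_years` is a hypothetical helper.

```python
def scaled_lifetime_years(ref_rate_mb_s, ref_lifetime_years, rate_mb_s):
    """Scale a reference lifetime to another write rate.

    Assumption: lifetime is inversely proportional to the write rate.
    """
    return ref_lifetime_years * ref_rate_mb_s / rate_mb_s


# Reference point from the slide: 200 MB/s gives a 5-10 year lifetime.
low = scaled_lifetime_years(200, 5, 1800)
high = scaled_lifetime_years(200, 10, 1800)
print(f"at 1800 MB/s: {low:.2f} to {high:.2f} years")
```

Under these assumptions a sustained 1.8 GB/s would cut the 5-10 year figure by a factor of nine, to roughly 0.56 to 1.11 years.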
PCM write rates
[Figure: write rate in MB/s for KG-N, KG-W, Dens, and Freq]
Performance
[Figure: execution time normalized to KG-N for KG-W, Dens, and Freq; annotations: 30%, 8%]
DRAM capacity
[Figure: % of heap in DRAM for KG-W, Dens, and Freq; annotation: 25%]
KG-W versus Crystal Gazer
[Figure: PCM writes normalized to KG-N (y-axis) against DRAM MB (x-axis); series: KG-W, Crystal Gazer]
Pareto-optimal trade-offs
Write-rationing garbage collection
Hybrid memory is inevitable
Each layer can play a role in wider adoption
Write-rationing GC is pro-active and fine-grained
More information
PLDI 2018: Write-rationing garbage collection for hybrid memories
SIGMETRICS 2019: Crystal Gazer: Profile-driven write-rationing garbage collection for hybrid memories
ISPASS 2019: Emulating and evaluating hybrid memory for managed languages on NUMA platform