Ahmad Hassan (SAP), Hans Vandierendonck (QUB) and Dimitrios S. Nikolopoulos (QUB) May 2015
Energy-Efficient In-Memory Data Stores on Hybrid Memory Hierarchies
Eleventh International Workshop on Data Management on New Hardware June 2015
Presentation structure
Cores
Demand
Technology Limitations Technology Evolution
Research problem Proposed solution Methodology Evaluation Conclusion
Every two years, main memory (DRAM) capacity per processor core decreases by 30% in relative terms
ISCA 2009: web.eecs.umich.edu/~twenisch/papers/isca09-disaggregate.pdf
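To make the trend concrete, the short sketch below projects relative DRAM capacity per core over time. It assumes the 30% drop compounds every two years, which is an interpretation of the slide and not something the slide states; the function and parameter names are hypothetical.

```python
def relative_capacity_per_core(years: float,
                               drop_per_period: float = 0.30,
                               period_years: float = 2.0) -> float:
    """Capacity per core relative to year 0, assuming a compounding drop."""
    periods = years / period_years
    return (1.0 - drop_per_period) ** periods


if __name__ == "__main__":
    for y in (0, 2, 4, 6, 8):
        print(f"after {y} years: {relative_capacity_per_core(y):.2f}x")
```

Under this assumption, roughly half of the per-core capacity is lost after four years.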
DRAM has technology limitations: physical scalability limits and inefficient power consumption
Scalability: Technology scaling for large memory capacity. DRAM has hit a scaling limit (hard to scale below 40 nm) [ITRS, International Technology Roadmap for Semiconductors, 2011]
Power inefficiency: Main memory subsystem energy. DRAM-based main memory consumes 30-40% of the total server power [L. A. Barroso et al., Synthesis Lectures on Computer Arch., 2009]
Different Main Memory Technologies
Feature            DRAM      RRAM        STTRAM      PCM
Cell size          6-8 F^2   > 5 F^2     37 F^2      8-16 F^2
Read latency       ~30 ns    ~116 ns     ~105 ns     ~151 ns
Write latency      ~30 ns    ~145 ns     ~77 ns      ~396 ns
Read energy*       5.90      4.81        16.60       80.41
Write energy*      12.70     13.80       21.05       418.6
Static energy      Yes       Negligible  Negligible  Negligible
Byte-addressable   Yes       Yes         Yes         Yes
Write endurance    > 10^15   > 10^5      > 10^15     > 10^8

* Read/write energy is given in nanojoules per 32-byte access.
http://www3.pucrs.br/pucrs/files/uni/poa/facin/pos/relatoriostec/tr060.pdf
http://dl.acm.org/citation.cfm?id=2742854.2742886
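As a quick way to read the comparison above, the following sketch normalizes each technology's latency and access energy to DRAM. The numbers are copied from the table; the dictionary layout and the function name `relative_to_dram` are illustrative choices, not part of the original slides.

```python
from typing import Dict

# Latencies in ns, energies in nJ per 32-byte access (values from the table).
TECH: Dict[str, Dict[str, float]] = {
    "DRAM":   {"read_ns": 30.0,  "write_ns": 30.0,  "read_nj": 5.90,  "write_nj": 12.70},
    "RRAM":   {"read_ns": 116.0, "write_ns": 145.0, "read_nj": 4.81,  "write_nj": 13.80},
    "STTRAM": {"read_ns": 105.0, "write_ns": 77.0,  "read_nj": 16.60, "write_nj": 21.05},
    "PCM":    {"read_ns": 151.0, "write_ns": 396.0, "read_nj": 80.41, "write_nj": 418.6},
}


def relative_to_dram(metric: str) -> Dict[str, float]:
    """Return each technology's value for `metric` divided by DRAM's value."""
    base = TECH["DRAM"][metric]
    return {name: round(vals[metric] / base, 2) for name, vals in TECH.items()}


if __name__ == "__main__":
    for metric in ("read_ns", "write_ns", "read_nj", "write_nj"):
        print(metric, relative_to_dram(metric))
```

Read this way, RRAM is the only listed alternative whose read energy is below DRAM's, while all three alternatives have higher read and write latency.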
All this means is that,
And our research problem becomes...
NVM (Non-volatile memory) is an emerging main memory technology that is byte-addressable like DRAM
Advantage: Lower leakage power than DRAM
Advantage: Larger capacity and better scalability than DRAM
Disadvantage: Higher latency and dynamic energy than DRAM
Because of the higher latency, and
Challenge! How can NVM be used as a main memory technology without hitting the NVM high-latency bottleneck, while also reducing the main memory subsystem's energy?
Proposed Solution: Hybrid NVM/DRAM main memory system... and we'll explain how...
For such hybrid memory schemes, application-level data management is useful because it provides a hardware-independent way to manage data
One key finding was that objects provide a more accurate granularity of data than pages
Application Instrumentation
Application Source -> Profiling Tool -> Instrumented Executable -> Run Benchmark / Collect profiling data -> Apply Analytical Models for object* placement -> Modified Application Source
* Objects are individual program variables and memory allocations.
Profiling Tool
Collected metrics:
- Memory loads
- Memory stores
- Off-chip memory accesses
- Memory allocations
- Allocation sizes
- Callpath
- Lifetime
[Figure: profiling tool. Labels: Application Source Code; LLVM pass (adds new instructions to profile loads and stores); Instrumented Exe; Memory Profiling Library; Register Allocations; Splay tree; Loads/Stores; Cache simulator; All Accesses; Off-chip Accesses; Stats File.]
Performance and Energy Models
$AMAT_{DRAM} = \mu_r L_r + \mu_w L_w + (1 - \mu_r) L_{LLC}$
where $\mu_r$ and $\mu_w$ are the numbers of main memory read and write accesses respectively, $L_r$ and $L_w$ are the DRAM read and write latencies respectively, and $L_{LLC}$ is the last-level cache latency.
$AMAE_{DRAM} = \mu_r E_r + \mu_w E_w + \text{(static energy term for DRAM)}$
where $\mu_r$ and $\mu_w$ are the numbers of DRAM read and write accesses respectively, and $E_r$ and $E_w$ are the read and write energies respectively.
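The two models can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it treats the read and write access terms as per-access rates (so that the "1 minus read rate" factor is meaningful), it covers only the dynamic part of the energy model and leaves out the static term, and the 10 ns last-level cache latency in the usage example is a hypothetical value. The DRAM and RRAM parameters are taken from the technology table earlier in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryTech:
    read_latency_ns: float
    write_latency_ns: float
    read_energy_nj: float   # per 32-byte access
    write_energy_nj: float  # per 32-byte access


DRAM = MemoryTech(30.0, 30.0, 5.90, 12.70)
RRAM = MemoryTech(116.0, 145.0, 4.81, 13.80)


def amat(read_rate: float, write_rate: float,
         tech: MemoryTech, llc_latency_ns: float) -> float:
    """Average memory access time for one object on the given technology."""
    return (read_rate * tech.read_latency_ns
            + write_rate * tech.write_latency_ns
            + (1.0 - read_rate) * llc_latency_ns)


def amae_dynamic(read_rate: float, write_rate: float, tech: MemoryTech) -> float:
    """Dynamic part of the average memory access energy (static term omitted)."""
    return read_rate * tech.read_energy_nj + write_rate * tech.write_energy_nj


if __name__ == "__main__":
    # Hypothetical object: 2% of accesses read main memory, 1% write it.
    for name, tech in (("DRAM", DRAM), ("RRAM", RRAM)):
        print(name, amat(0.02, 0.01, tech, 10.0), amae_dynamic(0.02, 0.01, tech))
```

Evaluating both functions per object and per technology gives the inputs needed to compare a DRAM placement against an NVM placement.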
Object Placement Algorithm
1. $\Delta AMAE = AMAE_{DRAM} - AMAE_{NVM}$
2. $\Delta AMAT = AMAT_{DRAM} - AMAT_{NVM}$
3. Sort all objects on $\Delta AMAT$
4. $\sum_{i=s+1}^{N} \Delta AMAT \le \lambda \sum_{i=1}^{N} AMAT_{DRAM}$
Where $\lambda$ is a user-configurable parameter
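One way to read the four steps above is as a budgeted greedy split, sketched below. Several points are assumptions rather than facts from the slides: the sketch uses the latency penalty of moving an object to NVM (NVM time minus DRAM time, i.e. the magnitude of the latency difference), it assumes that penalty is non-negative, it ignores the energy difference from step 1 because the constraint shown only involves latency, and the function name `place_objects` and its dictionary inputs are hypothetical.

```python
from typing import Dict, List, Tuple


def place_objects(amat_dram: Dict[str, float],
                  amat_nvm: Dict[str, float],
                  lam: float) -> Tuple[List[str], List[str]]:
    """Split objects into (dram_objects, nvm_objects) under a latency budget."""
    # Latency penalty of placing each object in NVM instead of DRAM.
    penalty = {name: amat_nvm[name] - amat_dram[name] for name in amat_dram}
    # Total extra latency allowed: lambda times the all-DRAM access time.
    budget = lam * sum(amat_dram.values())
    # Visit objects from smallest to largest penalty.
    order = sorted(penalty, key=lambda name: penalty[name])
    nvm: List[str] = []
    used = 0.0
    for name in order:
        if used + penalty[name] > budget:
            break  # every remaining object has an equal or larger penalty
        nvm.append(name)
        used += penalty[name]
    dram = order[len(nvm):]
    return dram, nvm


if __name__ == "__main__":
    dram_t = {"a": 10.0, "b": 10.0, "c": 10.0}
    nvm_t = {"a": 10.5, "b": 12.0, "c": 20.0}
    print(place_objects(dram_t, nvm_t, lam=0.1))
```

A larger value of the user-configurable parameter allows more objects to move to NVM at the cost of more slowdown; a value of zero keeps every object with a positive penalty in DRAM.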
Benchmarks and Simulation
Benchmarks
- MonetDB: in-memory column store
  - TPC-H analytical queries
- Memcached: in-memory key-value store
  - Twitter and Yahoo Cloud Serving Benchmark
Simulation
- GEM5 syscall emulation. 512 MB DRAM, 8 GB RRAM
- Custom application-level memory allocators for DRAM and RRAM
MonetDB Analysis
MonetDB: Performance Degradation vs Energy Savings
[Figure, two panels: Performance Degradation (%) and Energy Savings (%), each for queries Q9, Q18 and Q21 with series NVM, SWP and RaPP.]
Conclusion
demands.
manage data on hybrid memories.
data management on hybrid memory.
Acknowledgements
centres/HPDC/Articles/EUMarieCurieFellowshipNovosoft/)
Contact information: Ahmad Hassan ahmad.hassan@sap.com