Energy-Efficient In-Memory Data Stores on Hybrid Memory Hierarchies


Energy-Efficient In-Memory Data Stores on Hybrid Memory Hierarchies. Eleventh International Workshop on Data Management on New Hardware, June 2015. Ahmad Hassan (SAP), Hans Vandierendonck (QUB) and Dimitrios S. Nikolopoulos (QUB), May 2015.


SLIDE 1

Ahmad Hassan (SAP), Hans Vandierendonck (QUB) and Dimitrios S. Nikolopoulos (QUB) May 2015

Energy-Efficient In-Memory Data Stores on Hybrid Memory Hierarchies

Eleventh International Workshop on Data Management on New Hardware June 2015

SLIDE 2
Presentation structure

  • Research Problem
  • Proposed Solution
  • Methodology
  • Evaluation
  • Conclusion

SLIDE 3

Research Problem

SLIDE 4

Processor technology has evolved faster than Main Memory

  • 1. More Processor Cores
  • 2. More Parallelism
  • 3. More Capacity Demand
  • 4. Main Memory DRAM Technology Limitations

Technology Evolution

Research problem Proposed solution Methodology Evaluation Conclusion

SLIDE 5

Every 2 years, there is a 30% relative decrease in Main Memory DRAM capacity per processor core

ISCA 2009: web.eecs.umich.edu/~twenisch/papers/isca09-disaggregate.pdf
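To make the compounding effect concrete, the short sketch below projects the relative DRAM capacity per core over time. It assumes the 30% decrease applies uniformly every two years; the function name and the chosen time horizons are illustrative and not taken from the slides.

```python
def relative_capacity_per_core(years, decrease=0.30, period_years=2):
    """Fraction of the initial DRAM capacity per core left after `years`.

    Assumes the same relative decrease is applied once per period.
    """
    return (1.0 - decrease) ** (years / period_years)


for years in (2, 4, 10):
    fraction = relative_capacity_per_core(years)
    print(f"after {years:2d} years: {fraction:.2f} of the initial capacity per core")
```

Under this assumption, roughly one sixth of the initial per-core capacity remains after a decade.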

SLIDE 6

DRAM has technology limitations – physical scalability limits and inefficient power consumption


Scalability

Technology Scaling for Large Memory Capacity: DRAM has hit its scaling limit (hard to scale below 40 nm) [ITRS. International Technology Roadmap for Semiconductors, 2011]

Power inefficiency

Main memory subsystem energy: DRAM-based main memory consumes 30-40% of the total server power [L. A. Barroso et al. Synthesis Lectures on Computer Arch. 2009]

SLIDE 7

Different Main Memory Technologies

| Feature          | DRAM    | RRAM       | STTRAM     | PCM        |
|------------------|---------|------------|------------|------------|
| Cell Size        | 6–8 F²  | > 5 F²     | 37 F²      | 8–16 F²    |
| Read Latency     | ~30 ns  | ~116 ns    | ~105 ns    | ~151 ns    |
| Write Latency    | ~30 ns  | ~145 ns    | ~77 ns     | ~396 ns    |
| Read Energy*     | 5.90    | 4.81       | 16.60      | 80.41      |
| Write Energy*    | 12.70   | 13.80      | 21.05      | 418.6      |
| Static Energy    | YES     | Negligible | Negligible | Negligible |
| Byte-Addressable | YES     | YES        | YES        | YES        |
| Write Endurance  | > 10^15 | > 10^5     | > 10^15    | > 10^8     |

* Read/write energy is given in nanojoules per 32-byte access.
http://www3.pucrs.br/pucrs/files/uni/poa/facin/pos/relatoriostec/tr060.pdf
http://dl.acm.org/citation.cfm?id=2742854.2742886
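As a quick way to read the comparison on this slide, the sketch below expresses each technology's latency and energy as a multiple of the DRAM value. The numbers are copied from the slide; the dictionary layout and the function name `relative_to_dram` are illustrative choices only.

```python
# Per technology: (read latency ns, write latency ns,
#                  read energy nJ, write energy nJ), copied from the slide.
TECH = {
    "DRAM":   (30, 30, 5.90, 12.70),
    "RRAM":   (116, 145, 4.81, 13.80),
    "STTRAM": (105, 77, 16.60, 21.05),
    "PCM":    (151, 396, 80.41, 418.6),
}


def relative_to_dram(tech):
    """Return each metric of `tech` divided by the corresponding DRAM value."""
    base = TECH["DRAM"]
    return tuple(value / ref for value, ref in zip(TECH[tech], base))


for name in ("RRAM", "STTRAM", "PCM"):
    r_lat, w_lat, r_en, w_en = relative_to_dram(name)
    print(f"{name}: read latency {r_lat:.1f}x, write latency {w_lat:.1f}x, "
          f"read energy {r_en:.1f}x, write energy {w_en:.1f}x")
```

Ratios of this kind only cover dynamic behaviour; the static energy row is qualitative on the slide and is therefore left out.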

SLIDE 8

All this means that

DRAM is not a viable choice for applications that demand large memory

SLIDE 9

And our research problem becomes…

DRAM is not a viable choice for applications that demand large memory.
Can Non-Volatile Memories (NVM) present a better alternative?

SLIDE 10

Proposed Solution

SLIDE 11

NVM (Non-volatile memory) is an emerging main memory technology that is byte-addressable like DRAM

Before we dive in further, let’s quickly recap what NVM is

SLIDE 12

Using NVM over DRAM has key advantages – such as power efficiency

  • Advantage: Lower leakage power than DRAM
SLIDE 13

Using NVM over DRAM has key advantages – such as power efficiency and better scalability

  • Advantage: Lower leakage power than DRAM
  • Advantage: Large capacity and better scalability than DRAM
SLIDE 14

However, it has its downsides too – NVM has higher latency than DRAM

  • Advantage: Lower leakage power than DRAM
  • Advantage: Large capacity and better scalability than DRAM
  • Disadvantage: Higher latency and dynamic energy than DRAM
SLIDE 15

So we gather that a pure NVM-based approach is not viable either

Pure NVM-based solution
SLIDE 16

Because of the higher latency, and


Pure NVM-based solution

Challenge! How can NVM be used as a main memory technology without hitting NVM’s high-latency bottleneck, while still reducing the main memory subsystem’s energy?
SLIDE 17

So instead a hybrid NVM/DRAM approach could be the answer we are looking for...


Pure NVM-based solution

Challenge! How can NVM be used as a main memory technology without hitting NVM’s high-latency bottleneck, while still reducing the main memory subsystem’s energy?

Proposed Solution: Hybrid NVM/DRAM main memory system… and we’ll explain how…
SLIDE 18

For such hybrid memory schemes, application-level data management is useful – because it provides a hardware-independent way to manage data

Data management on hybrid memory at:
  • 1. Application Level
  • 2. Operating System Level
  • 3. Hardware Level

One key finding was that objects provide a more accurate granularity for data management than pages
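The granularity point can be illustrated with a toy example that is not from the slides. The sketch compares how many bytes end up in DRAM when hot data is placed per object and when it is placed per page. The object list, the 4096-byte page size and the hotness threshold are all made-up values.

```python
PAGE_SIZE = 4096  # bytes; a common page size, assumed here

# Hypothetical objects: (name, start address, size in bytes, access count)
objects = [
    ("index",  0,     512, 90_000),
    ("log",    512,  3584,     10),
    ("buffer", 4096, 4096,     20),
]


def dram_bytes_object_level(objs, hot_threshold):
    """Bytes placed in DRAM when each hot object is placed individually."""
    return sum(size for _, _, size, hits in objs if hits >= hot_threshold)


def dram_bytes_page_level(objs, hot_threshold):
    """Bytes placed in DRAM when every page touched by a hot object is placed."""
    hot_pages = set()
    for _, start, size, hits in objs:
        if hits >= hot_threshold:
            first = start // PAGE_SIZE
            last = (start + size - 1) // PAGE_SIZE
            hot_pages.update(range(first, last + 1))
    return len(hot_pages) * PAGE_SIZE


print("object level:", dram_bytes_object_level(objects, 1000), "bytes in DRAM")
print("page level:  ", dram_bytes_page_level(objects, 1000), "bytes in DRAM")
```

In this toy layout the rarely accessed `log` object shares a page with the hot `index` object, so page-level placement pulls it into DRAM as well.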

SLIDE 19

Methodology

SLIDE 20

Application Instrumentation

Application Source → Profiling Tool → Instrumented Executable → Run Benchmark / Collect profiling data → Apply Analytical Models for object* placement → Modified Application Source

* Objects are individual program variables and memory allocations.
SLIDE 21

Profiling Tool

Collected Metric

  • Memory Loads
  • Memory Stores
  • Off-chip Memory accesses
  • Memory Allocations
  • Allocation sizes
  • Callpath
  • Lifetime

[Figure: profiling tool data flow. Labels: Application Source Code, LLVM PASS (adds new instructions to profile loads and stores), Register Allocations, Instrumented Exe, Memory Profiling Library, Splay tree, Loads/Stores, All Accesses, Cache simulator, Off-chip Accesses, Stats File]
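One way to picture what a memory profiling library of this kind does is the sketch below. It registers allocations and attributes each load or store to the object whose address range contains the accessed address. This is not the authors' implementation: the slide names a splay tree for the lookup, while the sketch substitutes a sorted list with binary search to stay short, and all class, method and object names are hypothetical.

```python
import bisect
from collections import Counter


class AccessProfiler:
    """Attributes loads and stores to the allocation containing the address."""

    def __init__(self):
        self._starts = []   # sorted start addresses of registered allocations
        self._objects = {}  # start address -> (name, size)
        self.loads = Counter()
        self.stores = Counter()

    def register_allocation(self, name, start, size):
        bisect.insort(self._starts, start)
        self._objects[start] = (name, size)

    def _lookup(self, address):
        # Find the allocation with the greatest start address <= address.
        i = bisect.bisect_right(self._starts, address) - 1
        if i < 0:
            return None
        start = self._starts[i]
        name, size = self._objects[start]
        return name if address < start + size else None

    def record(self, address, is_store):
        name = self._lookup(address)
        if name is not None:
            counter = self.stores if is_store else self.loads
            counter[name] += 1


profiler = AccessProfiler()
profiler.register_allocation("column_a", start=0x1000, size=256)
profiler.register_allocation("hash_table", start=0x2000, size=1024)
profiler.record(0x1010, is_store=False)
profiler.record(0x2004, is_store=True)
profiler.record(0x9000, is_store=False)  # outside every registered object
print(dict(profiler.loads), dict(profiler.stores))
```

A self-adjusting structure such as a splay tree keeps recently accessed objects near the root, which suits the locality of real access streams; the binary search here is only a stand-in.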

SLIDE 22

Performance and Energy Models

  • Performance Model

$AMAT_{DRAM} = \mu_r L_r + \mu_w L_w + (1 - \mu_r) L_{LLC}$

$\mu_r$ and $\mu_w$ are the number of main memory read and write accesses respectively, $L_r$ and $L_w$ are the DRAM read and write latencies respectively, and $L_{LLC}$ is the last level cache latency.

  • Energy Model

$AMAE_{DRAM} = \mu_r E_r + \mu_w E_w + S \, P_{DRAM} \, T$

$\mu_r$ and $\mu_w$ are the DRAM read and write accesses respectively. $E_r$ and $E_w$ are the read and write energies respectively.
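The two models can be evaluated numerically with a short sketch. The latency and energy values for DRAM and RRAM come from the technology comparison earlier in the deck. The access terms and the last level cache latency are invented for illustration. The third term of the energy model is exposed only as an optional `static_term`, because the slide does not define its factors.

```python
def amat(mu_r, mu_w, lat_r, lat_w, lat_llc):
    """Performance model: read term + write term + last-level-cache term."""
    return mu_r * lat_r + mu_w * lat_w + (1 - mu_r) * lat_llc


def amae(mu_r, mu_w, e_r, e_w, static_term=0.0):
    """Energy model: read term + write term + optional static term."""
    return mu_r * e_r + mu_w * e_w + static_term


# Assumed inputs, not from the slides.
MU_R, MU_W, LLC_NS = 0.2, 0.1, 10.0

# Latencies in ns and energies in nJ per 32-byte access, from the deck's table.
amat_dram = amat(MU_R, MU_W, lat_r=30, lat_w=30, lat_llc=LLC_NS)
amat_rram = amat(MU_R, MU_W, lat_r=116, lat_w=145, lat_llc=LLC_NS)
amae_dram = amae(MU_R, MU_W, e_r=5.90, e_w=12.70)
amae_rram = amae(MU_R, MU_W, e_r=4.81, e_w=13.80)

print(f"AMAT: DRAM {amat_dram:.1f} ns, RRAM {amat_rram:.1f} ns")
print(f"AMAE (dynamic part only): DRAM {amae_dram:.2f} nJ, RRAM {amae_rram:.2f} nJ")
```

With the static term left at zero the comparison covers dynamic energy only, so it understates any advantage NVM gains from negligible static energy.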

SLIDE 23

Object Placement Algorithm

1. $\Delta AMAE = AMAE_{DRAM} - AMAE_{NVM}$

2. $\Delta AMAT = AMAT_{DRAM} - AMAT_{NVM}$

3. Sort total objects on $\Delta AMAT$

4. $\sum_{i=c+1}^{N} \Delta AMAT \le \lambda \sum_{i=1}^{N} AMAT_{DRAM}$

Where $\lambda$ is a user-configurable parameter
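The four steps can be sketched as a greedy placement loop. This is a sketch under stated assumptions, not the authors' algorithm. It treats the constraint as a budget on the total added latency (the magnitude of the AMAT difference) of objects moved to NVM. It also moves an object only if its energy difference is a saving, which is a guess about how the energy difference is used. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ObjectProfile:
    name: str
    amat_dram: float
    amat_nvm: float
    amae_dram: float
    amae_nvm: float


def place_objects(objects, lam):
    """Return (dram_names, nvm_names) for a latency budget of lam * total DRAM AMAT."""
    budget = lam * sum(o.amat_dram for o in objects)
    # Consider the objects that lose the least latency on NVM first.
    ordered = sorted(objects, key=lambda o: o.amat_nvm - o.amat_dram)
    dram, nvm, spent = [], [], 0.0
    for o in ordered:
        penalty = max(o.amat_nvm - o.amat_dram, 0.0)
        saves_energy = (o.amae_dram - o.amae_nvm) > 0  # assumed extra criterion
        if saves_energy and spent + penalty <= budget:
            nvm.append(o.name)
            spent += penalty
        else:
            dram.append(o.name)
    return dram, nvm


profiles = [
    ObjectProfile("A", amat_dram=10, amat_nvm=12, amae_dram=5, amae_nvm=3),
    ObjectProfile("B", amat_dram=10, amat_nvm=30, amae_dram=5, amae_nvm=2),
    ObjectProfile("C", amat_dram=10, amat_nvm=11, amae_dram=5, amae_nvm=6),
]
print(place_objects(profiles, lam=0.1))
```

A larger value of the user-configurable parameter loosens the latency budget and lets more objects move to NVM, trading performance for energy.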

SLIDE 24

Evaluation

SLIDE 25

Benchmarks and Simulation

Benchmarks

  • MonetDB – In-memory column store
      • TPC-H analytical queries
  • Memcached – In-memory key-value store
      • Twitter and Yahoo Cloud Serving Benchmark

Simulation

  • GEM5 Syscall emulation. 512 MB DRAM, 8 GB RRAM
  • Custom application-level memory allocators for DRAM and RRAM
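To show what routing allocations between two memory pools can look like at the application level, the sketch below simulates two bump allocators sized like the simulated DRAM and RRAM. It is a toy model in Python, not the allocators used in the study. The `Pool` class, the `hybrid_alloc` function, the base addresses and the 64-byte alignment are all assumptions.

```python
MB, GB = 1 << 20, 1 << 30


class Pool:
    """A bump allocator over a fixed-size, simulated address range."""

    def __init__(self, name, base, capacity):
        self.name = name
        self.base = base
        self.capacity = capacity
        self.used = 0

    def alloc(self, size, align=64):
        # Round the current offset up to the next multiple of `align`.
        start = (self.used + align - 1) // align * align
        if start + size > self.capacity:
            raise MemoryError(f"{self.name} pool exhausted")
        self.used = start + size
        return self.base + start


# Capacities follow the simulated configuration; base addresses are assumed.
dram = Pool("DRAM", base=0, capacity=512 * MB)
rram = Pool("RRAM", base=512 * MB, capacity=8 * GB)


def hybrid_alloc(size, placement):
    """Allocate from the pool chosen by the placement decision for this object."""
    pool = dram if placement == "DRAM" else rram
    return pool.alloc(size)


hot = hybrid_alloc(100, "DRAM")
also_hot = hybrid_alloc(100, "DRAM")
cold = hybrid_alloc(4096, "RRAM")
print(hex(hot), hex(also_hot), hex(cold))
```

The placement argument stands in for the output of an object placement step, so each allocation site can be bound to one memory type.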

SLIDE 26

MonetDB Analysis


SLIDE 27

MonetDB: Performance Degradation vs Energy Savings

[Figure: bar charts for queries Q9, Q18 and Q21 comparing NVM, SWP and RaPP. Panels: Performance Degradation (%), axis ticks 5 to 50; Energy Savings (%), axis ticks 82 to 94.]
SLIDE 28

Conclusion

  • Use of NVM as main memory is inevitable for meeting main memory capacity demands.
  • Application-level data management provides a hardware-independent way to manage data on hybrid memories.
  • For the workloads we studied, objects provide better granularity than pages for data management on hybrid memory.
  • Hybrid DRAM / NVM main memory was found promising for in-memory data stores.
  • Future work: dynamic data placement techniques through operator-level rules.
SLIDE 29

Acknowledgements

  • Nanostreams Project (http://www.nanostreams.eu)
  • NovoSoft Project (http://www.qub.ac.uk/research-centres/HPDC/Articles/EUMarieCurieFellowshipNovosoft/)

SLIDE 30

Thank you!

Contact information: Ahmad Hassan ahmad.hassan@sap.com