Getting More Performance with Polymorphism from Emerging Memory Technologies
Iyswarya Narayanan, Aishwarya Ganesan, Anirudh Badam, Sriram Govindan, Bikash Sharma, Anand Sivasubramaniam
Slide 2
[Figure: latency PDF with a trimmed tail — high-performance and high-capacity SSD tiers serving both volatile and persistent accesses]
Slide 3
Cloud applications differ in their capacity needs for both volatile reads and persistent writes.
[Figure: fraction (0.2–0.8) of unique pages accessed within a time window, reads vs. writes, for Cloud Storage, MapReduce, Search-Index, and Search-Serve — some workloads are write intensive, some read intensive, and some both]
Slide 4
Applications also differ in the volume of read and write accesses — both across different applications and temporally within the same application (e.g., Search-Serve).
How to effectively provision memory and storage resources for diverse cloud storage applications?
Slide 5
Today's resources occupy fixed points in the latency/cost/capacity space: the volatile tier offers low latency but low capacity, while the persistent tier offers high capacity at high latency.
Slide 6
The volatile and persistent tiers are fixed in function (DRAM is memory, SSD is storage), in latency (only DRAM is fast), and in capacity (set by the server SKU) — low latency comes with low capacity, and high capacity with high latency.
Slide 7
Emerging memory technologies (e.g., 3D XPoint) are non-volatile with lower latencies than SSD, and compression makes their capacity and latency larger and flexible — they fall between the volatile (low-latency, low-capacity) and persistent (high-latency, high-capacity) extremes of the latency/cost/capacity space.
Can we exploit these emerging memory technologies to overcome the drawbacks of existing resources?
Slide 8
Existing ways to use NVM:
- Persistent memory programming: benefits both volatile and persistent accesses, but requires intrusive code changes to applications.
- NVM-based file systems: need no application changes, but require intrusive changes to the OS and file system, and incur a high NVM provisioning cost to cover the entire storage need.
- Transparent caches: low cost and transparent — either a volatile memory extension (direct access via loads/stores) or a persistent write cache above the SSD (via the block interface) — but each benefits reads or writes, not both.
Can we exploit a functional-polymorphism knob?
Slide 10
Partitioning NVM between the memory and storage functions reduces latency. In a MySQL TPC-C experiment, dm-cache uses part of the NVM as a write cache, while the rest is exposed as additional memory accessible via loads/stores.
[Figure: tail latency vs. % of NVM used as write cache (5, 10, 15, 50, 100)]
What if the working set exceeds physical memory/write-cache capacity?
Slide 11
When an application's working set is split between two fixed-latency tiers, tail latency is determined by the slowest tier: the 95th percentile of the access-latency distribution lands on the SSD, not on DRAM.
[Figure: access-latency probability distribution with the 95th percentile in the SSD region]
Slide 12
If the faster tier morphs to hold more of the working set, the 95th percentile shifts toward it and tail latency drops.
[Figure: the same access-latency distribution with more of the working set in the faster tier]
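The way the 95th percentile snaps from the slow tier to the fast tier once the fast tier covers enough of the working set can be sketched with a toy model (all latencies and fractions below are hypothetical, not the talk's measurements):

```python
import random

def p95_latency(fast_fraction, fast_us=1.0, slow_us=100.0, n=100_000, seed=0):
    """Sample access latencies for a working set split between a fast tier
    (serving `fast_fraction` of accesses) and a slow tier, and return the
    95th-percentile latency in microseconds."""
    rng = random.Random(seed)
    samples = sorted(fast_us if rng.random() < fast_fraction else slow_us
                     for _ in range(n))
    return samples[int(0.95 * n)]

# While the fast tier serves less than 95% of accesses, the tail sits at the
# slow tier's latency; once coverage crosses 95%, the 95th percentile
# collapses to the fast tier's latency.
for frac in (0.80, 0.90, 0.96):
    print(frac, p95_latency(frac))
```

The cliff at the percentile of interest is why morphing capacity into the faster tier pays off so sharply for tail latency.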
Slide 13
Representational polymorphism trades latency for capacity: compressed accesses cost roughly 2–12 µs depending on access granularity (512–4096 bytes) and on whether the access is a read or a write — still much lower latency than SSD — while compression yields a 2x to 8x increase in effective NVM capacity (e.g., for MapReduce and Search-Serve).
Our goal: effectively serve diverse cloud applications using a polymorphic emerging-memory-based cache.
[Figures: compressed access latency vs. access granularity; % increase in effective capacity per workload]
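How much effective capacity compression buys depends entirely on how compressible the cached pages are; a minimal sketch using zlib on synthetic pages (the page contents and sizes here are illustrative assumptions, not the workloads measured in the talk):

```python
import os
import zlib

def effective_capacity_gain(pages):
    """Ratio of raw bytes stored to compressed bytes actually used: the
    factor by which compression inflates the cache's effective capacity."""
    raw = sum(len(p) for p in pages)
    compressed = sum(len(zlib.compress(p)) for p in pages)
    return raw / compressed

# Repetitive content compresses well; random bytes do not.
compressible = [b"key=value;" * (4096 // 10) for _ in range(64)]
incompressible = [os.urandom(4096) for _ in range(64)]
print(effective_capacity_gain(compressible))    # large gain
print(effective_capacity_gain(incompressible))  # roughly 1x (no gain)
```

A real design would track the observed ratio online, since a workload whose pages stop compressing erases the capacity benefit.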
Slide 14
Functional polymorphism: memory vs. storage? An unmodified application reaches DRAM (ns) through the memory interface and the SSD (100 µs) through the block interface; NVM (roughly 1–10 µs) can serve either function.
Cloud applications are diverse: one partition size does not fit all!
The NVM can be battery-backed DRAM, 3D XPoint, etc.
Slide 15
Representational polymorphism: capacity vs. latency? Adding a compressed NVM representation to the same architecture introduces a second knob alongside the memory-vs.-storage split.
We need to navigate the performance trade-off across the capacity, latency, and persistence dimensions!
Slide 16
What is the most significant bottleneck for a generic application with a mix of reads and writes?
[Figure: average and 95th-percentile latency (ms) for reads and writes]
Slide 17
The persistent tier is much slower, and SSDs are asymmetric in their read/write latencies.
[Diagram: read misses from DRAM and persistent writes both reach the SSD through the block file system]
Use BB-DRAM as a write cache to the SSD: persistent writes are absorbed by a battery-backed DRAM write cache above the SSD, while read misses from DRAM still go to the SSD.

Slide 18
BB-DRAM can serve as a write cache alone, or as both write cache and memory extension: in the combined configuration, part of the BB-DRAM also absorbs read misses as additional memory.
How do we apportion the NVM capacity between the memory and storage functions? The resource is byte-addressable!
Slide 21
Incrementally repurpose Write-Cache blocks as memory pages to balance read/write performance.
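One way such incremental repurposing could look is a hill-climbing step that grows whichever function currently has the worse tail; this is a hypothetical sketch of the idea, not the talk's actual policy (block counts and latencies are invented):

```python
def rebalance(nvm_blocks, mem_blocks, read_tail_us, write_tail_us, step=1):
    """One repurposing step: if reads (served by the memory extension) show
    the worse tail, grow the memory side by `step` blocks at the write
    cache's expense; otherwise grow the write cache. Returns the new number
    of blocks used as memory; the remainder stays write cache."""
    if read_tail_us > write_tail_us:
        return min(nvm_blocks, mem_blocks + step)
    return max(0, mem_blocks - step)

# Repeated steps drift toward the split that balances the two tails.
mem = 0
for read_us, write_us in [(120, 40), (100, 45), (80, 60), (55, 70)]:
    mem = rebalance(nvm_blocks=1024, mem_blocks=mem,
                    read_tail_us=read_us, write_tail_us=write_us)
print(mem)
```

Because the NVM is byte-addressable, a block can change function without copying its contents anywhere.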
Slide 22
Functional polymorphic cache vs. functional + representational polymorphic cache:
[Diagram: in both, a BB-DRAM write cache above the block file system absorbs persistent writes and BB-DRAM absorbs read misses; the representational variant adds a compressed BB-DRAM region for each function]
There is no latency benefit to separating the memory and storage functions!
Slide 23
Since separating the functions buys no latency, both can share a single compressed representation:
[Diagram: one shared compressed BB-DRAM pool serves read misses (memory extension) and persistent writes (write cache) above the SSD]
A shared compression layer reduces compute requirements too!
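A minimal sketch of the shared-pool idea: one compressed store backs both the memory-extension and write-cache functions, and only the persistent entries are flushed to the SSD. The class, keys, and API below are invented for illustration:

```python
import zlib

class SharedCompressedPool:
    """Hypothetical shared compressed BB-DRAM pool: one compressed store
    serves both read-miss pages (volatile) and persistent writes, so a
    single compression layer covers both functions."""
    def __init__(self):
        self._store = {}     # key -> compressed bytes
        self._dirty = set()  # keys that must eventually reach the SSD

    def put(self, key, data, persistent=False):
        self._store[key] = zlib.compress(data)
        if persistent:
            self._dirty.add(key)  # persistent writes join the flush set

    def get(self, key):
        return zlib.decompress(self._store[key])

    def flush(self, write_to_ssd):
        """Drain dirty (persistent) entries via the `write_to_ssd` callback."""
        for key in sorted(self._dirty):
            write_to_ssd(key, zlib.decompress(self._store[key]))
        self._dirty.clear()

pool = SharedCompressedPool()
pool.put("page:1", b"volatile read-miss page " * 100)                     # memory extension
pool.put("blk:7", b"persistent database record " * 100, persistent=True)  # write cache
ssd = {}
pool.flush(ssd.__setitem__)
print(sorted(ssd))  # only the persistent entry reaches the SSD
```

Sharing one pool also means free space migrates automatically between the two functions instead of sitting idle in a fixed partition.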
Slide 24
PolyEMT's capacity management adjusts the polymorphism both on scheduling a new application and on dynamic phase changes within an application.
Slide 25
Persist data to the SSD in the background.
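Background persistence can be sketched as a producer/consumer pair: writes are acknowledged once they land in the battery-backed cache, and a background thread drains them to the SSD so the cache does not fill up. This is a hypothetical illustration, not the talk's implementation:

```python
import queue
import threading

write_cache = queue.Queue()  # stands in for the battery-backed write cache
ssd = []                     # stands in for the SSD

def background_flusher(stop):
    """Drain cached blocks to the 'SSD' until stopped and the cache is empty."""
    while not stop.is_set() or not write_cache.empty():
        try:
            block = write_cache.get(timeout=0.01)
        except queue.Empty:
            continue
        ssd.append(block)  # stand-in for an SSD block write
        write_cache.task_done()

stop = threading.Event()
flusher = threading.Thread(target=background_flusher, args=(stop,))
flusher.start()

for i in range(100):
    write_cache.put(f"block-{i}")  # foreground: acknowledged after the cache write

write_cache.join()  # wait until every block has been persisted to the SSD
stop.set()
flusher.join()
print(len(ssd))  # 100
```

The foreground path only pays the cache's latency; the SSD's slower, asymmetric write latency is hidden in the flusher.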
Slide 26
[Figure: mean throughput for workloads a–f, normalized to DRAM-Extension: Write-Cache 2.5x, Functional 4.55x, Functional+Representational 5x]
Addressing the most significant bottleneck improves performance by 2.5x; exploiting the polymorphisms further improves it by 70% and 90%.
[Figure: update and read latency for Write-Cache, Functional, and Functional+Representational, normalized to DRAM-Extension]
The EMT-based write cache reduces write and read tail latency by 30% and 40%; functional polymorphism reduces them by 60% and 80%; combining both morphings reduces them by 85% and 78%.
[Figure: EMT allocation (%) split across persistent, compressed, and volatile uses for workloads a–f, under F-only and F+R configurations]
PolyEMT benefits diverse cloud applications via careful apportioning of the polymorphic cache across three dimensions.
To conclude: PolyEMT improves performance per cost.