
Dynamic and Transparent Data Tiering for In-Memory Databases in Mixed Workload Environments

Carsten Meyer (carsten.meyer@hpi.de), Hasso Plattner Institute, Potsdam, Germany
Martin Boissier (martin.boissier@hpi.de), Hasso Plattner Institute, Potsdam, Germany
Adrian Michaud (adrian.michaud@emc.com), EMC2 Corporation, Hopkinton, USA
Jan Ole Vollmer (jan.vollmer@student.hpi.de), Hasso Plattner Institute, Potsdam, Germany
Ken Taylor (ken.taylor@emc.com), EMC2 Corporation, Hopkinton, USA
David Schwalb (david.schwalb@hpi.de), Hasso Plattner Institute, Potsdam, Germany
Matthias Uflacker (matthias.uflacker@hpi.de), Hasso Plattner Institute, Potsdam, Germany
Kurt Roedszus (kurt.roedszus@emc.com), EMC2 Corporation, Hopkinton, USA

ABSTRACT

Current in-memory databases clearly outperform their disk-based counterparts. In parallel, recent PCIe-connected NAND flash devices provide significantly lower access latencies than traditional disks, making it possible to re-introduce classical memory paging as a cost-efficient alternative to storing all data in main memory. This is further eased by new, dedicated APIs which bypass the operating system, optimizing the way data is managed and transferred between a DRAM caching layer and NAND flash. In this paper, we present a new approach for in-memory databases that leverages such an API to improve data management without jeopardizing the original performance superiority of in-memory databases. The approach exploits data relevance and places less relevant data onto a NAND flash device. For real-world data access skews, the approach is able to efficiently evict a substantial share of the data stored in memory while suffering a performance loss of less than 30%.

1. INTRODUCTION

Storage Class Memory (SCM) is a class of solid state memory whose performance characteristics set it apart from main memory as well as classical disk drives. The latest generation of PCIe-connected NAND flash cards has considerably narrowed the performance gap between main memory as the fastest storage layer and disks, promising improved I/O latency and bandwidth [21]. These characteristics make SCM especially attractive as a memory extension for main memory-intensive systems such as main memory-resident databases.

Main memory-resident databases are databases whose primary persistence is main memory; they are therefore also called in-memory databases (IMDBs). In-memory databases have recently been in the focus of database research [8, 11, 12] as well as of commercial database vendors such as Microsoft [5], SAP [7], Oracle [15], or IBM [17]. While the first in-memory databases were optimized for transactional enterprise workloads (so-called Online Transaction Processing, OLTP), more recent approaches focus on mixed workloads. A mixed workload combines a transactional workload with a more complex and computation-intensive analytical workload (so-called Online Analytical Processing, OLAP).

Observations of production enterprise systems have shown that data is kept over a period of five to ten years for regulatory or 'just-in-case' purposes. However, looking at the actual workload reveals that accesses are highly skewed towards small portions of the data, while the larger part remains rarely accessed or even untouched. While storing all data in dynamic random-access memory (DRAM) is viable when bandwidth requirements are high [2], storing irrelevant data in DRAM can be considered a waste of resources, as DRAM is expensive and limited in capacity. Consequently, one of the major research questions of this paper is: "How to place less relevant data on an SCM tier in a mixed workload scenario with minimal performance impact?". The topic of how to allocate data on different tiers is well known in the context of transactional workloads, where recently accessed tuples or blocks are tracked [4, 6]. But mixed workloads pose entirely new challenges, as analytical queries often access data that is of low relevance for the daily transactional business but of high relevance for analytical tasks.

If the database is able to evict substantial parts of the database to secondary storage without sacrificing the performance advantages of in-memory databases, the total cost of ownership (TCO) can be reduced. Not only are large main memory-based server systems more expensive to acquire than their disk-based counterparts, they are also more expensive in terms of operational costs, as large DRAM installations contribute a substantial part to the energy consumption of a server [18].


[Figure 1: bar chart of costs per gigabyte [USD], acquisition costs plus three-year operational costs, for a Main Memory Server (1024 GB DRAM) and a Tiered Memory Server (384 GB DRAM + 700 GB PCIe NAND flash).]

Figure 1: Cost calculations for an in-memory-only system and a tiered memory system (costs per GB of database storage).

Figure 1 shows an exemplary TCO calculation for two server systems (calculated using a configuration tool from a large server vendor). The main memory system has 1024 GB of main memory. The tiered memory system has 384 GB of main memory and a 700 GB PCIe-connected NAND flash SSD. Comparing the price per GB of database storage shows that the three-year TCO of the full main memory system is higher by almost a factor of two.

In this paper, we present a new approach called Relevance-Based Partitioning (RBP) using tiered memory. The approach improves memory utilization by placing less relevant data on SCM using the EMC Memory Tiering API¹ and is aware of mixed workload access patterns. Instead of a plain horizontal partitioning of a whole table, RBP is able to partition each column of a table individually, tuning the allocation as well as the data eviction policy for each column exactly to the workload. This way, the partitioning of a table can reflect both the OLTP and the OLAP access patterns. Our concept is integrated into HYRISE [8], an open source research database. In this paper, we make the following contributions:

• Introduction of the SCM-optimized EMC Memory Tiering API (henceforth referred to as EMT) in Section 2, and an evaluation of its performance compared with malloc and mmap under memory pressure for typical database access patterns in Section 3.

• In Section 4, we present a data tiering concept for HYRISE, called Relevance-Based Partitioning, that uses so-called hot data views to classify data as well as to optimize data accesses. The result is a transparent data tiering approach that places less frequently accessed data onto PCIe-connected NAND flash devices while hot data is pinned in main memory.

• Performance of the tiered memory approach is evaluated for a mixed enterprise workload in Section 5.

The remainder of the paper presents related work in Section 6 and a discussion and future work in Section 7, and concludes with Section 8.

2. BACKGROUND

The EMT API provides an optimized software interface for accessing an SCM device as secondary storage by means of virtual memory. By bypassing the OS layer, EMT's design goal is to provide a more predictable alternative to the standard Linux mmap and paging implementation in terms of access latency. This section discusses SCM in general, the EMC Memory Tiering API, and the system model of an in-memory database for mixed workloads.

2.1 Storage Class Memory Characteristics

In a traditional, disk-based database architecture, the engine performs its own read/write I/O to the underlying storage layer that contains the database files. In order to avoid performance penalties due to that direct I/O access, the database engine will typically employ one or more database buffer pools to hold recently used database blocks/pages. These buffer pools are typically resident in DRAM, and the database engine manages when pages need to be written to the underlying media to make room for new pages.

For in-memory databases, traditional buffer pools, which are responsible for a substantial part of the execution time of disk-based databases [9], are no longer necessary. Even though direct I/O to the persistence layer still occurs to handle data changes and inserts, all data is expected to reside in volatile memory.

Today, applications do not address physical memory directly but instead use the operating system to translate between the application's virtual address space and the system's physical address space. In this approach, every program has its own private address space and thus can run independently from other programs on the system. The memory is organized in pages of typically 4 kB, and the translation between virtual and physical address space is done using a page table. This mechanism would theoretically allow an in-memory database system to extend its storage beyond the installed memory by using a swapfile on disk. In practice, however, the system suffers from unpredictable slowdowns due to the transparent execution of the page fault handler and the swap subsystem.

In an SCM scenario, the desired functionality is to have multiple page caches of various sizes, page caches backed by multiple types of physical media, functional improvements to the memory management that take next-generation flash media into account, and a way for the application to interact with these improvements.

2.2 Storage Class Memory Tiering API

The EMC Memory Tiering (EMT) API gives the database software granular control over OS memory tiering; beneath the API level, a fundamental re-write of the Linux virtual memory management sub-system enables this functionality. It provides an alternative to the Linux mmap, msync, runtime malloc, and page fault handler implementations [19]. The EMT mmap promises better performance, more predictable access latencies, and additional control over the paging process. It is not intended as a general-purpose page fault handler, but as an efficient interface for creating SCM tiers to extend memory and for accessing and caching SCM tiers by means of virtual memory. Rather than allocating physical pages from the entire memory when needed, EMT provides a facility to pre-allocate one or more system-wide fixed-size page caches.


Applications control which page cache to use. This results in a more predictable execution time per process, because Linux no longer manages a single system-wide page cache between competing processes. EMT supports pluggable mmap and extensible page cache management policies. Two different policies for deciding which pages to evict from a cache are currently supported: a simple first-in, first-out (FIFO) principle and a more complex least recently used (LRU) mechanism. In addition, the application can tune the caching behavior by setting a low water level and an eviction size. Each page cache maintains the availability of free physical pages via these two settings: the low water level specifies a threshold for the free memory in a page cache below which an eviction is triggered, and the eviction size determines the number of pages evicted in such an event. This eviction strategy attempts to ensure page slot availability upon page fault. The page fault latency is further reduced by bypassing the Linux virtual file system and directly accessing the storage device driver when combined with a compatible storage device.

EMT supports coloring of individual pages to maximize page cache residency times and minimize the number of page faults. A page color (or temperature) is represented as a 16-bit integer, where higher values mean the page is accessed more frequently and should be kept in the page cache when possible. Individual pages may also be pinned, which maintains residency. It is the application's responsibility to set the colors appropriately according to its access pattern. In addition to this explicit specification, EMT tracks accesses to pages and dynamically adjusts page colors based on those statistics.

Furthermore, EMT employs a technique called dynamic read ahead, where it reads a number of subsequent pages starting from the faulting page, comparable to the read-ahead mechanisms of modern operating systems. In contrast to the OS's read ahead, EMT automatically adapts the number of read-ahead pages to the application's access patterns. The algorithm starts reading ahead with a given minimum value. Every successive read ahead doubles the amount until the defined maximum value is reached; the amount is reset to the minimum for any out-of-order major page fault.

These features promise better performance and control for accessing secondary storage in an in-memory database. They may form the basis of an effective memory tier containing colder data, where the classification of data (e.g., hot and cold) by the database is mapped onto page colors. The underlying EMT library can use this information as a hint for which data should be kept in memory and thus reduce the number of page faults.
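To make these mechanisms concrete, the following C++ sketch shows how a database might configure such a tiered page cache. Since the EMT API itself is proprietary [19], every name and signature here (emt_cache_create, emt_mmap, emt_set_color, emt_pin, and the no-op stubs) is our own illustrative assumption, not the real interface; the doubling read-ahead policy is sketched as a plain function.

#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Illustrative stand-ins for the proprietary EMT interface [19].
enum class EvictionPolicy { FIFO, LRU };

struct EmtCache {
    std::size_t cache_bytes, low_water_bytes, eviction_bytes;
    EvictionPolicy policy;
};

// Pre-allocate a fixed-size page cache; an eviction of 'eviction_bytes'
// is triggered whenever free memory drops below 'low_water_bytes'.
EmtCache* emt_cache_create(std::size_t cache, std::size_t low_water,
                           std::size_t evict, EvictionPolicy policy) {
    return new EmtCache{cache, low_water, evict, policy};  // stub
}

// Map an SCM-backed region that is demand-paged through 'cache'.
void* emt_mmap(EmtCache*, const char* /*scm_device*/, std::size_t length) {
    return std::malloc(length);  // stub; the real call would map the device
}

void emt_set_color(void*, std::size_t, std::uint16_t /*color*/) {}  // stub
void emt_pin(void*, std::size_t) {}                                 // stub

// Dynamic read ahead as described above: double the window on sequential
// faults, reset to the minimum on an out-of-order major page fault.
std::size_t next_read_ahead(std::size_t current, bool sequential_fault,
                            std::size_t min_pages, std::size_t max_pages) {
    if (!sequential_fault) return min_pages;
    return current * 2 < max_pages ? current * 2 : max_pages;
}

int main() {
    // 2 GB DRAM page cache, LRU policy, refill below 256 MB free,
    // 16 MB evicted per event (all sizes are arbitrary examples).
    EmtCache* cache = emt_cache_create(2ull << 30, 256ull << 20,
                                       16ull << 20, EvictionPolicy::LRU);

    // A column-store region residing on the NAND flash device.
    char* region = static_cast<char*>(emt_mmap(cache, "/dev/scm0", 4ull << 30));

    // Color the first 3 GB cold and the last 1 GB hot; pin the hottest pages.
    emt_set_color(region, 3ull << 30, 1);
    emt_set_color(region + (3ull << 30), 1ull << 30, 0xFFFF);
    emt_pin(region + (3ull << 30), 256ull << 20);
}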

2.3 In-Memory Databases for Mixed Workloads

The data tiering concepts presented in this paper are based on the HYRISE² database system [8]. HYRISE is an open source research database, sharing many concepts with SAP HANA, and is a column-oriented in-memory database system designed for mixed enterprise workloads [7].

As opposed to row-oriented databases, columnar databases store all values of a column in a contiguous block. This layout exploits the locality principle of CPU caches in analytical queries, as it eliminates gaps between the values of a column. Full column scans and aggregations on arbitrary columns can be performed very fast, especially when the operation is split across multiple CPU cores.

A table in HYRISE is separated into two main data structures: the immutable main store contains the majority of the data, while the delta store (also called differential buffer) contains recently changed data. The delta store is periodically merged with the main store, thereby creating a new main store that replaces the current one. This separation allows optimizing the main store for read accesses and enables higher compression, while the delta store is optimized for modifying accesses. Both stores are dictionary encoded, with the addition that the main store has an order-preserving dictionary for improved read performance. Both HYRISE and SAP HANA are insert-only databases using multiversion concurrency control (MVCC). Tuple updates implicitly result in the invalidation of a tuple and its reinsertion into the delta partition.
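As a minimal C++ illustration of this architecture (a sketch of the general main/delta idea, not HYRISE's actual code), consider a single column with an immutable dictionary-encoded main store, an append-only delta store, and insert-only updates realized as invalidate-and-reinsert:

#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Minimal model of one column in a main/delta architecture. For brevity
// the main store starts empty; the periodic delta merge would fill it.
struct Column {
    std::vector<std::string>   main_dict;    // order-preserving dictionary
    std::vector<std::uint32_t> main_values;  // value IDs into main_dict
    std::vector<std::string>   delta_dict;   // unsorted, append-only
    std::vector<std::uint32_t> delta_values;
    std::vector<bool>          valid;        // MVCC-style validity per row

    void insert(const std::string& v) {
        delta_dict.push_back(v);  // naive: no dictionary de-duplication
        delta_values.push_back(static_cast<std::uint32_t>(delta_dict.size() - 1));
        valid.push_back(true);
    }

    // Insert-only update: invalidate the old row, reinsert into the delta.
    void update(std::size_t row, const std::string& v) {
        valid[row] = false;
        insert(v);
    }

    // Every read considers main and delta, skipping invalidated rows.
    std::size_t count(const std::string& v) const {
        std::size_t n = 0, main_rows = main_values.size();
        for (std::size_t i = 0; i < main_rows; ++i)
            if (valid[i] && main_dict[main_values[i]] == v) ++n;
        for (std::size_t i = 0; i < delta_values.size(); ++i)
            if (valid[main_rows + i] && delta_dict[delta_values[i]] == v) ++n;
        return n;
    }
};

int main() {
    Column c;
    c.insert("open"); c.insert("open");
    c.update(0, "closed");  // row 0 invalidated, new value lands in the delta
    return c.count("open") == 1 ? 0 : 1;
}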

2.4 Characteristics of Mixed Workloads

Mixed workloads combine the workloads of typical transactional OLTP systems with the workloads of analytical OLAP systems (e.g., data warehouses). The two workloads have very different characteristics. OLTP queries typically request rather small numbers of tuples and project many attributes. A typical system would be a sales system with an OLTP query retrieving all items of a given sales order. Most OLTP queries select on primary key columns (at least partially) and rarely trigger scans on a table.

In contrast, OLAP queries access many rows but few attributes. A typical analytical query would request the sum of sold items over the past six months. Such analytical queries project very few columns but trigger large scans on (multiple) tables. Combining both workloads not only poses challenges to the database in general, but especially to the question of how to evict less relevant data, because the definition of relevant data differs between the two workloads: while hot data for transactional workloads is usually a list of most recently accessed tuples, hot data for analytical queries consists of columns that are typically scanned or aggregated.

3. EMT PERFORMANCE EVALUATION

In order to obtain an overview of the performance impact of using the EMT API instead of OS functionality such as malloc and mmap, this section presents two evaluations. First, different allocators are benchmarked under increasing memory pressure. Second, the allocators are benchmarked with varying access skewness.

The experiments presented in this section simulate a single table column (5 GB) that can be split into various partitions. For each partition, possible configuration options include different memory allocation strategies, workloads, and access skewness, as well as the possibility to limit the physical memory available for the benchmark. The performance metrics used for the experiments are execution time and major page faults.

We run sequential and random access tests to reflect the access patterns of mixed workloads on a columnar database. The sequential access tests reflect typical patterns of analytical workloads; the random access tests reflect patterns of transactional workloads.

All tests have been executed on a Hewlett-Packard ProLiant DL580 G7 machine with four Intel Xeon X7560 CPUs


and 512 GB DDR3 memory at 1066 MHz. A Micron P320 PCIe card with 350 GB of SLC-flash was used as SCM for cold data, either via the EMT API or configured as Linux swap space. The system ran on Ubuntu Server 12.04 with Kernel 3.2. For all tests, the default read-ahead settings of 128 kB have been used.

3.1 Increasing Memory Pressure

This first benchmark of the EMT API evaluates sequential and random reads under different main memory limitations. It provides a baseline representing the cached SCM performance without partitioning and without access skewness. The benchmark provides a side-by-side performance comparison of the EMT API, malloc, and mmap. The data vector consists of a single partition without coloring. During the benchmark, 50 sequential scans of the entire data set are performed, as well as 10,000,000 random accesses reading 4 kB each.

Figure 2a shows the total execution time of the sequential scans (y-axis) for all configurations. As expected, malloc and mmap perform equally, because malloc is implemented as an anonymous mmap call in most kernels. Both exhibit drastic performance losses by factors of 28-81× when insufficient main memory is available (0% evicted data corresponds to unlimited memory), and the effect worsens with increasing memory pressure (x-axis). The loss in performance is less pronounced for EMT, ranging around a 6.5× slowdown for all three evicted data ratios (20%, 33%, and 50%) compared to unlimited memory.

The major page fault counts in Figure 2b confirm this observation: for malloc and mmap, lower limits lead to more page faults, causing the longer execution time. The results for EMT demonstrate the effects of the dynamic read ahead mechanism, which adjusts to read-access patterns. Due to the strictly sequential access, EMT's LRU cache acts like a window moving across the data, effectively degrading to a FIFO queue. This explains the constant performance of EMT under memory pressure. In case of unlimited memory, all three allocators perform similarly, with major page fault counts tending towards zero.

Similar to sequential reads, mmap and malloc also perform equally for random reads (Figure 2c). The execution times for mmap and malloc grow exponentially with increasing memory pressure in the tested range, resulting in a slowdown of up to 120× at 50% evicted data compared to unlimited memory. EMT outperforms the Linux page fault handler again, although the difference is less pronounced. Since no access prediction is possible for fully randomized accesses, the number of page faults is expected to grow linearly with a decreasing memory limit. Figure 2d confirms this expectation for all three allocators in the case of major faults. Furthermore, all three allocators exhibit similar numbers of major page faults, suggesting that EMT's dynamic read-ahead mechanism was automatically disabled. The reason that EMT outperforms mmap and malloc despite the same number of page faults is the direct I/O bypassing the OS's virtual file system.

The results obtained from this test show a significantly improved performance of EMT compared to malloc and mmap under increasing memory pressure.
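The two access patterns of this benchmark can be reproduced with a few lines of C++. The harness below is our own minimal sketch; the 5 GB size and access counts mirror the setup above (shrink them for a quick test run), and no EMT-specific calls are involved.

#include <chrono>
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

int main() {
    // 5 GB data vector as in the benchmark; reduce for a quick test run.
    const std::size_t bytes = 5ull << 30;
    std::vector<std::uint8_t> data(bytes, 1);

    std::uint64_t checksum = 0;
    auto t0 = std::chrono::steady_clock::now();

    // Analytical pattern: 50 sequential scans over the entire vector.
    for (int scan = 0; scan < 50; ++scan)
        for (std::size_t i = 0; i < bytes; ++i) checksum += data[i];

    // Transactional pattern: 10,000,000 random reads of 4 kB each.
    std::mt19937_64 rng(42);
    std::uniform_int_distribution<std::size_t> pos(0, bytes - 4096);
    for (std::uint64_t r = 0; r < 10000000; ++r) {
        const std::size_t off = pos(rng) & ~std::size_t{4095};  // page-aligned
        for (std::size_t i = 0; i < 4096; ++i) checksum += data[off + i];
    }

    auto t1 = std::chrono::steady_clock::now();
    std::cout << "checksum " << checksum << ", "
              << std::chrono::duration_cast<std::chrono::seconds>(t1 - t0).count()
              << " s\n";
}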

3.2 Performance for Skewed Accesses

The proposed approach of Relevance-Based Partitioning using EMT (see Section 4) builds on the assumption that accesses are skewed towards hot/relevant data. Typical examples are financial systems, where data of the current fiscal year is of much higher relevance for transactional processing than that of previous fiscal years. In contrast, analytical accesses usually cover a large range of data (e.g., comparing the current fiscal year's performance with previous years). Our approach strives to statistically identify relevant data and partition a table accordingly (not part of this paper) in order to eventually prune query processing to hot or warm partitions. Random reads and especially sequential scans are expected to gain performance benefits, while correct query results must be ensured. To measure the effect of such an access skewness and subsequent query pruning under increasing memory pressure, we have benchmarked different configurations.

3.2.1 EMT and Partition Pruning

In order to better reflect real-world access patterns, the benchmark in Section 3.1 was extended to simulate a database workload that exhibits skewness towards a small partition of hot (10%) and warm (30%) data within the 5 GB vector. The benchmark assumes that 80% of all random and sequential reads (partial scans) access only the hot data partition. The remaining 20% of random reads access the warm partition. The remaining 20% of sequential reads have to scan the entire vector, including the 60% cold data (i.e., all three partitions). This execution path is based on the assumption that the database is able to pin 80% of all column scan operations to the hot partition (see hot-only pruning in Section 4.1.1) and prune the other partitions. We compare three scenarios:

• All partitions allocated with malloc

• Warm and cold partitions allocated using EMT with coloring

• Warm and cold partitions allocated using EMT without coloring

For the latter two configurations, the hot partition is allocated using malloc. The configurations are again tested with different memory limits. The hot partition (10% of the data) is always allocated with malloc and pinned in main memory. The remaining main memory can be used as a cache; e.g., a memory limit of 50% with 10% hot data means that the cache size is 40% of the overall data.

The execution times of the sequential read operation in Figure 3a show a significant performance benefit of EMT over malloc under memory pressure. The latter suffers from a 21-44× slowdown when less memory than data is available, compared to unlimited memory. As in the test in Section 3.1, EMT's performance without coloring is rather constant for all three actual memory limits, with a 3.4× slowdown compared to unlimited memory. This confirms the expected effect of query pruning for sequential reads, as the slowdown is considerably lower than the 6.5× slowdown of the test without query pruning in Figure 2a. When EMT is used with explicit coloring, performance improves further, resulting in a 2.9× speedup for higher memory limits.


[Figure 2: four panels plotting (a) sequential read execution time [s], (b) sequential read major page faults [millions], (c) random read execution time [s], and (d) random read major page faults [millions] over evicted data [%] for malloc, EMT, and mmap.]

Figure 2: Results in terms of execution time ((a), (c)) and major page faults ((b), (d)) for the benchmark described in Section 3.1.

3.2.2 Impact of Access Skewness

In order to evaluate the impact of access skewness towards the hot data partition, a setup with changing access distributions was benchmarked. Again, execution times for sequential reads (Figure 4a) and random reads (Figure 4b) are compared for different shares of queries that can be pinned to the hot partition only (50%, 65%, 80%, and 95%) and different memory limits (30%, 60%, and 90%). All benchmarks assume a hot data share of 10% that is always allocated with malloc and pinned in main memory.

Figures 4a and 4b reveal that the access distribution has a considerable impact on overall runtime for random as well as for sequential reads. The impact of the memory limit is less significant if the skewness towards hot data is high: the cache size can be reduced without severely impacting performance. Accordingly, the additional cache size that comes with a higher memory limit does not improve performance in Figure 4b. For sequential reads, however, both the cache size and the memory limit have a high impact when the query skewness towards hot data is low, as the whole column needs to be processed. As shown in Figure 4a, performance losses from low query skewness can be compensated by bigger cache sizes. If an application is able to prune accesses in 95% of the cases, memory limits can be set more aggressively.

3.3 Results

The presented results reveal that memory tiering using the standard Linux paging mechanism is clearly outperformed by EMT in scenarios where the amount of data grows beyond the available memory. In addition, the application has little control over the standard paging process, which is a major drawback for applications that are aware of relevant and less relevant data. Besides EMT's performance gains through dynamic read-ahead and virtual file system bypassing, the additional support for coloring can improve performance further, given that data accesses are skewed. The results in Section 3.2.2 show that an in-memory database system using the EMT library can potentially handle data sets larger than the available main memory when skewness is high. Consequently, if the application – in our case HYRISE – pins 10% of hot data in memory and is able to limit ∼90% of the transactional query accesses to the hot partition, the resulting performance impact is within a factor of 2-3× of a full in-memory database. In Section 4, we present an approach that provides such pruning rates.

4. DATA TIERING IN HYRISE

This section describes the concept of data tiering for in-memory databases focusing on mixed workloads. First, we introduce the basic concept of data tiering for HYRISE in Section 4.1. Afterwards, we briefly outline how we implemented the concept in Section 4.2.

The data tiering concept is built on the idea of reorganizing and splitting tables into hot and cold data partitions based on workload relevance. Hot data comprises the relevant data that is required to process the major portion of the workload. For optimal performance, hot data is allocated using malloc, while cold data is allocated on secondary storage using the EMT API.


[Figure 4: execution times [s] for (a) sequential reads and (b) random reads over access skewness towards hot data [%] (50-95%), for 10% hot data and memory limits (ML) of 30%, 60%, and 90%.]

Figure 4: Execution run times for sequential reads (a) and random reads (b) for changing memory limits (ML), hot-only access ratios, and hot/cold ratios.

Our concept of data tiering is the periodic optimization of storage allocation and of the data structures that help HYRISE optimize memory utilization and performance. The most differentiating aspect of our approach is the individual, horizontal split of each column instead of applying a horizontal partitioning across all columns of a whole table. This partitioning scheme is explained in more detail in Section 4.1.2.

4.1 Data Tiering Concept

Tracking the relevance of single data blocks, tuples, or even individual data items causes considerable overhead during query processing, especially when many pages or tuples need to be processed during query execution. Instead of classifying fine-granular data packages, our tiering approach defines the hot data of a table using a set of conjunct views, so-called hot data views (HDVs). Before a query is executed, it is matched against the corresponding HDVs. This operation, called the tiering check, concludes whether a query can be executed solely on the hot partition while still guaranteeing correct query results.

The foundation of our data tiering concept are workload statistics on the usage of query templates and attribute bindings that are extracted asynchronously from the workload. These statistics are used to create the HDVs. In this paper, we do not go into the details of workload tracing and parsing, and assume that these characteristics are already available to us.

4.1.1 Hot Data Views

As the name suggests, HDVs define hot data within a table. These views are conceptually similar to SQL views, as they define a subset of a given table. Hot data views leverage patterns in the application workload, such as time-related, recurring query templates and binding attributes. Logically, an HDV consists of a view definition and a range of binding attributes extracted from the workload. Once defined, HDVs are used for two purposes: first, the separation of hot and cold data in order to partition the data; second, the subsequent check of whether a query can be partition-pruned and only needs to be executed on the hot partition.

Listing 1 shows an exemplary HDV for the columns ol_amount, ol_w_id, and ol_delivery_d, which declares all values of these attributes with a delivery date starting from '2014-05-25' as hot.

CREATE VIEW hot_data_view_001 AS
SELECT ol_amount, ol_w_id, ol_delivery_d
FROM order_line
WHERE ol_delivery_d >= '2014-05-25'

Listing 1: Hot Data View OLAP.

CREATE VIEW hot_data_view_002 AS
SELECT *
FROM order_line
WHERE ol_o_id IN (88573, 87736, ...)

Listing 2: Hot Data View OLTP.

To create such an HDV, the binding attributes of the predicate on ol_delivery_d are constantly sampled and analyzed right before the tiering run explained in Section 4.2. The number of HDVs necessary to classify the hot data of a table depends on the diversity (different selection and projection paths) of the workload.

A hot data view is a promise that all queries matching this view can be answered from hot data only. Scan and search operations on cold data can then be pruned during query execution, which is intended to reduce both the overall system load and single-query runtime. Because of the insert-only main/delta architecture of HYRISE (see Section 2.3), update operations on cold data cannot break this promise, because the delta partition is always part of query execution.
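Conceptually, such an HDV boils down to a covered column set plus a predicate range over the binding attribute. The following C++ sketch is our own minimal model of the range-predicate case from Listing 1 (not HYRISE's implementation); the encoding of dates as integers is an assumption for brevity.

#include <cstdint>
#include <set>
#include <string>

// Minimal model of a range-predicate HDV such as hot_data_view_001:
// a set of covered columns plus a lower bound on the binding attribute.
struct HotDataView {
    std::set<std::string> columns;        // e.g. {ol_amount, ol_w_id, ol_delivery_d}
    std::string           predicate_col;  // e.g. ol_delivery_d
    std::int64_t          lower_bound;    // e.g. integer encoding of '2014-05-25'

    // A query is answerable from hot data alone if it touches only covered
    // columns and its selection is at least as restrictive as the view's.
    bool covers(const std::set<std::string>& query_cols,
                const std::string& query_pred_col,
                std::int64_t query_lower_bound) const {
        if (query_pred_col != predicate_col) return false;
        if (query_lower_bound < lower_bound) return false;
        for (const auto& c : query_cols)
            if (!columns.count(c)) return false;
        return true;
    }
};

int main() {
    HotDataView hdv{{"ol_amount", "ol_w_id", "ol_delivery_d"},
                    "ol_delivery_d", 20140525};
    // A query selecting ol_amount for deliveries since 2014-06-01 is hot-only.
    return hdv.covers({"ol_amount"}, "ol_delivery_d", 20140601) ? 0 : 1;
}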

4.1.2 Hybrid Table Layout

Mixed workloads may include data access characteristics of typical OLTP as well as of OLAP workloads. OLTP queries select single or few tuples and have wide projections on many attributes. While they rely on the existence of corresponding indices in order to avoid full column scans and to provide predictable response times, OLAP queries process large data ranges and require scanning entire columns. However, OLAP queries select and project on only a small number of attributes.


[Figure 3: execution time [s] over evicted data [%] for (a) sequential and (b) random reads, comparing malloc, EMT without coloring, and EMT with coloring.]

Figure 3: Execution times for sequential (a) and random (b) read operations comparing malloc to EMT with and without coloring (80% of queries skewed towards 10% hot data).

For this mixed set of access patterns, a strict horizontal or vertical partitioning is not optimal. Plain horizontal partitioning results either in a very low eviction rate, when tuples required for analytical queries are considered entirely hot, or in a high number of cold accesses, when many tuples are evicted. Listings 1 and 2 show two exemplary HDVs. While Listing 1 is used to classify few columns over many tuples as hot, Listing 2 is used to classify few distinct tuples as hot. This way, each column can be split individually into a single hot and multiple cold partitions, as shown in Figure 5.

4.2 Implementation

In order to implement memory tiering in HYRISE, several components had to be adapted or newly implemented. First of all, the data store needs to be aware of the secondary storage in order to selectively allocate memory for cold column partitions. A new process called the Tiering Run is introduced for managing the classification as well as the physical sorting and partitioning of the data using HDVs. Before a query is executed, the Tiering Check evaluates whether pruning to hot data only is possible. Finally, a "hot-data-only" processing mode was implemented for operations such as the Tiering Scan, a modified full column scan that leverages query pruning.

Tiering Store. The Tiering Store exploits HYRISE's hybrid layout capabilities [8] to partition each column individually. That means the Tiering Store provides the capability to partition a column horizontally into a single hot partition and multiple cold partitions, with each partition having its own memory allocator and colorization. While the hot partition still supports reallocation during a merge from delta to main, cold partitions have a fixed size and only support the invalidation of single tuples during update and delete operations. These invalidated tuples need to be collected and cleaned from time to time. However, this is not part of the current implementation, as we optimize for a workload with few updates and deletes [14].

[Figure 5: schematic of the hybrid table layout; table columns c1, c2, c3 before and after tiering, two tiering bit-vector columns (bv_oltp, bv_olap), hot data allocated with malloc(), cold data allocated with EMT, and data eviction during the tiering run.]

Figure 5: Hybrid table layout, showing one OLTP tiering column, one OLAP tiering column as well as hot and cold data partitions.

Tiering Columns. Tiering Columns are bit-vectors (as shown in Figure 5) that are added to each table that is subject to tiering. They are used during the tiering run to mark tuples as being part of a hot data view. Each HDV requires its own Tiering Column. The example in Figure 5 illustrates a table that has one Tiering Column for its OLTP hot data view and one for its OLAP hot data view. Depending on the complexity of the workload, there can be more than two Tiering Columns. Tiering Columns are not themselves subject to tiering, i.e., they are always kept completely in main memory.
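A tiering run can populate such a bit-vector with one pass per HDV. The following C++ sketch (our illustration, reusing the integer-encoded binding attribute from the HDV sketch above) marks all tuples that satisfy a >= predicate:

#include <cstddef>
#include <cstdint>
#include <vector>

// Build one Tiering Column: one bit per tuple, set if the tuple's binding
// attribute satisfies the HDV's >= predicate. The bit-vector itself always
// stays in DRAM. (Illustrative sketch, not HYRISE's implementation.)
std::vector<bool> build_tiering_column(
        const std::vector<std::int64_t>& binding_attr,  // e.g. ol_delivery_d
        std::int64_t hdv_lower_bound) {
    std::vector<bool> hot(binding_attr.size(), false);
    for (std::size_t row = 0; row < binding_attr.size(); ++row)
        hot[row] = binding_attr[row] >= hdv_lower_bound;
    return hot;
}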

Tiering Run. The Tiering Run is the core operation added to the existing system model. It is responsible for partitioning and allocating a given table. The data of a table is divided into one hot and several cold partitions using HDVs (one new cold partition during each Tiering Run). The Tiering Run decides which allocator to use for each partition and, given that the allocator is EMT, adjusts the colorization of the allocation. The overall process consists of the following steps: First, evaluate recent workload statistics and extract hot data views.


Second, scan the table using the hot data views, updating the Tiering Columns accordingly. Third, sort the table using the Tiering Columns and allocate new cold partitions.

Because physically re-sorting a table and creating new cold partitions is a potentially expensive operation, we integrated the tiering run into the merge process [14] of HYRISE. Steps one and two can be done offline without affecting the table's data structures. Step three is then executed in conjunction with the delta merge. New tuples from the delta are considered hot. Tuples in a hot partition that are not marked as relevant by a Tiering Column are moved into a new cold partition of the column, allocated on secondary storage. Existing cold partitions do not need to be sorted. As a result, a continuous hot data partition arises for each column with as few cold data elements as possible. With the ability to separate the tiering run into steps that are partially not time-critical and to combine the final allocation with the already highly optimized merge process, the overall overhead is kept low.
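The third step can be sketched as a stable split of each column over the OR of its Tiering Columns; the real allocation through the EMT API (fixed size, low page color) is only noted in comments, since this C++ sketch is our illustration rather than HYRISE's merge code.

#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative third step of the tiering run for a single column: a stable
// split, so the relative order within each partition is preserved and one
// contiguous hot range per column emerges.
struct SplitColumn {
    std::vector<std::int64_t> hot_partition;   // DRAM, malloc-backed
    std::vector<std::int64_t> cold_partition;  // fixed size, EMT/SCM-backed
};

SplitColumn tiering_run_split(const std::vector<std::int64_t>& column,
                              const std::vector<bool>& hot) {  // OR of all Tiering Columns
    SplitColumn out;
    for (std::size_t row = 0; row < column.size(); ++row)
        (hot[row] ? out.hot_partition : out.cold_partition).push_back(column[row]);
    // In the real system the cold partition would now be allocated through
    // the EMT API with a fixed size and a low page color.
    return out;
}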

Tiering Check. Before a plan operation of a query can be limited to hot-only data, it must be matched against a corresponding HDV. This is the most time-critical and defining step of the tiering concept. In order to evaluate whether the binding variables of the query embody a hot-only data selection, the HDVs are checked. If the query can be matched to an HDV and the query's selection criteria match the HDV's selection criteria, the query is partition-pruned to the hot partition only.

Tiering Scan. Fast full column scans are essential for analytical queries. Basic plan operations, and especially the column scan, need to be adapted in order to utilize the Tiering Check. The Tiering Scan extends the full column scan plan operation. It is able to limit scan and compare expressions to the hot part of a column only. Depending on the Tiering Check, only the hot partition or the entire column is processed.
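Combined with the tiering check, a Tiering Scan then dispatches as follows; again a minimal C++ sketch of the idea rather than HYRISE's plan operation, shown for a >= predicate over the partitions produced above:

#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative Tiering Scan: if the Tiering Check matched the query against
// an HDV (hot_only == true), only the hot partition is touched; otherwise
// the cold partitions on SCM must be scanned as well.
std::size_t tiering_scan(
        const std::vector<std::int64_t>& hot_partition,
        const std::vector<std::vector<std::int64_t>>& cold_partitions,
        std::int64_t bound, bool hot_only) {
    std::size_t matches = 0;
    for (auto v : hot_partition)
        if (v >= bound) ++matches;
    if (!hot_only)
        for (const auto& part : cold_partitions)  // may trigger page faults
            for (auto v : part)
                if (v >= bound) ++matches;
    // In HYRISE the delta store is always scanned in addition, which is why
    // updates cannot break an HDV's promise.
    return matches;
}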

5. DATA TIERING PERFORMANCE

In this section, the implemented data tiering approach for HYRISE is evaluated using the well-known mixed-workload CH-benCHmark.

5.1 Benchmark

The dataset and queries for the benchmark are derived from the CH-benCHmark, which is a combination of the transactional TPC-C benchmark and the analytical TPC-H benchmark [3]. We focus on the ORDER_LINE table containing the items of each order. It is a table well suited for tiering, since it is comparatively large and stores transactional data, most of which is rarely accessed in transactional workloads (e.g., closed orders from past years).

To evaluate our current implementation for mixed workloads, we have chosen four queries: three analytical queries of the CH-benCHmark and one transactional query accessing tuples via the primary key (see Listings 3 and 4). The benchmark execution is divided into rounds, whereby in each round the three analytical queries are executed once, followed by 100 executions of the transactional query. As with the benchmarks in Section 3, we vary the hot data ratio, the hot query ratio, and the memory allocations.

-- CH-benCHmark Query 1
SELECT ol_number,
       SUM(ol_quantity) AS sum_qty,
       SUM(ol_amount) AS sum_amount,
       AVG(ol_quantity) AS avg_qty,
       AVG(ol_amount) AS avg_amount,
       COUNT(*) AS count_order
FROM order_line
WHERE ol_delivery_d > ?
GROUP BY ol_number
ORDER BY ol_number

-- Custom Query 1
SELECT SUM(ol_amount) AS sum_amount
FROM order_line
WHERE ol_w_id = ?
  AND ol_delivery_d >= ?

-- Custom Query 2
SELECT SUM(ol_amount) AS sum_amount
FROM order_line
WHERE ol_i_id = ?

Listing 3: Analytical benchmark queries.

SELECT *
FROM order_line
WHERE ol_o_id = ?
  AND ol_d_id = ?
  AND ol_w_id = ?

Listing 4: Transactional benchmark query.

The benchmarks are executed on a dataset with a TPC-C scale factor of 1000, resulting in a table size of about 19 GB. The database was configured to use 15 query execution threads and an EMT-managed secondary storage region of 110 GB, accessed via a page cache with a size of 2.5 GB, a low water level of 260 MB, and an eviction size of 13 MB. The benchmark was executed for 200 rounds using seven concurrent requests and 4× parallelization within the plan operations. For each scenario, the runtime was measured using the time utility shipped with typical Linux distributions for four different access skews (hot query ratios): 50%, 65%, 80%, and 95% (i.e., the share of queries that need to access the hot partition only). The remaining queries require data from the cold partition as well and thus trigger page faults in both tiered configurations. The benchmarks were executed on the same hardware as the benchmarks in Section 3.
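Expressed in terms of the illustrative EMT-style interface sketched in Section 2.2 (the real API and the device path may differ), this storage configuration corresponds to roughly the following fragment, which reuses the declarations from that sketch:

// Continuation of the illustrative EMT-style sketch from Section 2.2.
EmtCache* benchmark_cache = emt_cache_create(
    2560ull << 20,   // 2.5 GB DRAM page cache
    260ull << 20,    // low water level: evict below 260 MB free
    13ull << 20,     // eviction size: 13 MB per event
    EvictionPolicy::LRU);
void* cold_region = emt_mmap(benchmark_cache, "/dev/scm0", 110ull << 30);  // 110 GB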

5.2 Benchmark Configurations

In order to compare the performance impact of, first, the usage of the EMT API compared to a fully memory-resident database and, second, the impact of hot and cold partitioning, we evaluate the following four configurations:

Full In-Memory: represents the default HYRISE scenario in which all data is kept in main memory and is allocated via malloc.

Full In-Memory with RBP: similar to the Full In-Memory configuration, but with the partitioning and query pruning of the data tiering concept applied.


[Figure 6: execution time [s] over access skewness towards hot data [%] (50-95%) for the four configurations, with hot partition sizes of (a) 10% and (b) 20%.]

Figure 6: Run time comparison for the four configurations presented in Section 5.2 with hot data sizes of 10% and 20% of the overall data.

Tiered Memory: simulates the worst-case scenario where the system is forced to swap out parts of the data. For fairness, data is stored in EMT-managed memory, due to its more efficient paging mechanism, rather than in default OS-managed memory.

Tiered Memory with RBP: the full implementation of the data tiering concept, combining malloc to allocate the hot data partition and EMT for the cold data allocation.

5.3 Results

Figure 6 shows the performance results for the HYRISE benchmarks. The status quo of HYRISE, i.e. the Full In-Memory configuration, sees only minor performance improvements for a growing access skew, while all Tiered Memory configurations see clear improvements, as accessed data is more likely to be cached by EMT. Most interestingly, given a certain skew, the Tiered Memory with RBP configuration outperforms the Full In-Memory configuration. Also, performance decreases only by a factor of 1.3-1.8× for 10% hot data and by a factor of 1.2-1.7× for 20% hot data compared to the Full In-Memory with RBP configuration.

6. RELATED WORK

Recently, several publications have presented concepts for extending in-memory databases with secondary storage. In contrast to the approach presented in this paper, with its focus on mixed workloads, most approaches are aimed at OLTP workloads. The following paragraphs discuss the most recent work in that direction.

OLTP-optimized in-memory databases. Stoica et al. propose an extension to the in-memory database VoltDB (based on H-Store [11]) which rearranges data to move frequently accessed records to the beginning of the mmap'ed allocation [22]. The separation of hot and cold data is done using access statistics on tuple level and an offline analysis of these statistics. While the first part of the allocation, storing the frequently accessed data, is pinned in memory, the remainder is handled by the OS's paging mechanism. The approach uses a memory layout comparable to ours, but it does not leverage different allocators and relies on mmap, which has been shown to be outperformed by the EMT API (see Section 3.1).

DeBrabant et al. describe a strategy called Anti-Caching, implemented in the H-Store database, in which the database system itself manages tuple movements and accesses to the secondary storage [4]. In contrast to [22], Anti-Caching directly handles data movements for a more predictable behavior than the OS's mmap. Tuple eviction is done based on a sampled LRU chain, comparable to our tracking of tuple accesses for the OLTP bit vector (see Section 4.1.2). Their approach outperforms a combination of MySQL and Memcached by a factor of ∼7× for a modified and H-Store-optimized TPC-C benchmark.

Eldawy et al. present Siberia [6], an extension to Hekaton, the in-memory OLTP engine of MS SQL Server [5]. Eviction is done tuple-wise based on workload traces [16]. In contrast to Anti-Caching, Siberia stores indices for hot data exclusively and uses adapted Bloom filters and adaptive range filters (ARFs [1]) to avoid accesses to the cold storage for OLTP workloads. In cases where no HDV is able to prune a query, additional indices such as space-efficient Bloom filters to further avoid cold accesses might improve the performance of our implementation.

Columnar in-memory databases. Höppner et al. describe a hybrid-memory approach for the columnar in-memory database SAP HANA, separating the columns of a columnar in-memory database into hot, warm, and cold columns [10]. While hot columns reside in main memory, warm columns are stored on a PCIe-connected solid state disk and are partially buffered in main memory. Cold columns are moved to hard disk. While this approach is theoretically suited for mixed workloads, it is not well suited for OLTP-dominated workloads, as updates or full-width selects might require access to cold columns. Furthermore, tables which show a strong age-access correlation (i.e., only recent data is accessed) will not yield significant storage savings, as several columns are kept entirely in main memory without any horizontal partitioning.

7. DISCUSSION & FUTURE WORK



Taking a look at recent enterprise systems [20] and general trends in data science, we believe that the focus on mixed workloads is very promising and not sufficiently covered in current research. However, projecting how upcoming workloads might look is challenging. At this point, more studies examining upcoming applications and their workloads are required.

As future work, several aspects are of high interest to us. The strong correlation of major page faults and execution time in all tests suggests that new emerging SCM devices will reduce the penalty of accessing cold data further. Therefore, we want to evaluate next-generation SCM devices as discussed by Kim et al. [13]. Such new technologies continue to close the gap between main memory and non-volatile storage, potentially making the approach presented in this paper increasingly viable and feasible. Besides further research on topics surrounding our approach, such as the automated generation of hot data views or improved merging of cold data, we see the following issues as especially worth looking into:

• Boot and recovery time: tiering large portions of data to SCM allows for much faster boot and recovery times for in-memory databases, as only hot data has to be loaded.

• Partitioning and profiling of cold data: create multiple cold partitions and keep basic statistics for each partition to allow query pruning not only for hot/cold queries but also for cold-only queries.

• Further evaluation of the coloring feature offered by the EMT API to introduce different levels of cold data based on their access probability.

• Exploration of EMT malloc for finer-grained allocations and of EMT persistent malloc for persisting cold data structures.

• Tiered memory interactions in hyper-virtualized and containerized environments.

8. CONCLUSION

The presented results of data tiering for an in-memory database, combined with the SCM-optimized tier management provided by the EMT API, show that it is viable to place substantial parts of a data set on an SCM tier without significantly sacrificing performance, even for mixed workloads. Furthermore, using memory tiering with an in-memory database can improve memory utilization and thus reduce costs.

In fact, the usage of an SCM-optimized API to efficiently access tiered storage not only improves performance compared to OS functionality such as mmap but also introduces a clean separation of concerns: the database is responsible for classifying data relevance and efficient query pruning, while the tiered memory API is responsible for handling storage accesses and efficient memory tier management. We think that modern in-memory databases can greatly benefit from the approach presented here, both from a workload-aware partitioning based on data relevance and from an optimized handling of an SCM-based memory tier. Looking at recent developments in the area of storage technology, as presented by Kim et al. [13], leads us to the conclusion that tiered memory will become increasingly important for main memory-resident applications as the performance gap between main memory and non-volatile storage increasingly shrinks.

9. REFERENCES

[1] K. Alexiou, D. Kossmann, and P.-Å. Larson. Adaptive range filters for cold data: Avoiding trips to Siberia. PVLDB, 6(14):1714–1725, 2013.

[2] M. Boissier, C. Meyer, M. Uflacker, and C. Tinnefeld. And all of a sudden: Main memory is less expensive than disk. In T. Rabl, K. Sachs, M. Poess, C. K. Baru, and H. Jacobsen, editors, Big Data Benchmarking – 5th International Workshop, WBDB 2014, Potsdam, Germany, August 5–6, 2014, Revised Selected Papers, volume 8991 of Lecture Notes in Computer Science, pages 132–144. Springer, 2014.

[3] R. Cole, F. Funke, L. Giakoumakis, W. Guy, A. Kemper, S. Krompass, H. Kuno, R. Nambiar, T. Neumann, M. Poess, K.-U. Sattler, M. Seibold, E. Simon, and F. Waas. The mixed workload CH-benCHmark. In Proceedings of the Fourth International Workshop on Testing Database Systems, DBTest '11, pages 8:1–8:6, New York, NY, USA, 2011. ACM.

[4] J. DeBrabant, A. Pavlo, S. Tu, M. Stonebraker, and S. B. Zdonik. Anti-caching: A new approach to database management system architecture. PVLDB, 6(14):1942–1953, 2013.

[5] C. Diaconu, C. Freedman, E. Ismert, P.-Å. Larson, P. Mittal, R. Stonecipher, N. Verma, and M. Zwilling. Hekaton: SQL Server's memory-optimized OLTP engine. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 22–27, 2013, pages 1243–1254, 2013.

[6] A. Eldawy, J. J. Levandoski, and P.-Å. Larson. Trekking through Siberia: Managing cold data in a memory-optimized database. PVLDB, 7(11):931–942, 2014.

[7] F. Färber, N. May, W. Lehner, P. Große, I. Müller, H. Rauhe, and J. Dees. The SAP HANA database – an architecture overview. IEEE Data Eng. Bull., 35(1):28–33, 2012.

[8] M. Grund, J. Krüger, H. Plattner, A. Zeier, P. Cudré-Mauroux, and S. Madden. HYRISE – a main memory hybrid storage engine. PVLDB, 4(2):105–116, 2010.

[9] S. Harizopoulos, D. J. Abadi, S. Madden, and M. Stonebraker. OLTP through the looking glass, and what we found there. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10–12, 2008, pages 981–992, 2008.

[10] B. Höppner, A. Waizy, and H. Rauhe. An approach for hybrid-memory scaling columnar in-memory databases. In International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures – ADMS 2014, Hangzhou, China, September 1, 2014.

[11] R. Kallman, H. Kimura, J. Natkins, A. Pavlo, A. Rasin, S. B. Zdonik, E. P. C. Jones, S. Madden, M. Stonebraker, Y. Zhang, J. Hugg, and D. J. Abadi. H-Store: A high-performance, distributed main memory transaction processing system. PVLDB, 1(2):1496–1499, 2008.

[12] A. Kemper and T. Neumann. HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots. In S. Abiteboul, K. Böhm, C. Koch, and K. Tan, editors, Proceedings of the 27th International Conference on Data Engineering, ICDE 2011, April 11–16, 2011, Hannover, Germany, pages 195–206. IEEE Computer Society, 2011.

[13] H. Kim, S. Seshadri, C. L. Dickey, and L. Chiu. Evaluating phase change memory for enterprise storage systems: A study of caching and tiering approaches. In B. Schroeder and E. Thereska, editors, Proceedings of the 12th USENIX Conference on File and Storage Technologies, FAST 2014, Santa Clara, CA, USA, February 17–20, 2014, pages 33–45. USENIX, 2014.

[14] J. Krüger, C. Kim, M. Grund, N. Satish, D. Schwalb, J. Chhugani, H. Plattner, P. Dubey, and A. Zeier. Fast updates on read-optimized databases using multi-core CPUs. PVLDB, 5(1):61–72, 2011.

[15] T. Lahiri, M. Neimat, and S. Folkman. Oracle TimesTen: An in-memory database for enterprise applications. IEEE Data Eng. Bull., 36(2):6–13, 2013.

[16] J. J. Levandoski, P.-Å. Larson, and R. Stoica. Identifying hot and cold data in main-memory databases. In Data Engineering (ICDE), 2013 IEEE 29th International Conference on, pages 26–37. IEEE, 2013.

[17] J. Lindström, V. Raatikka, J. Ruuth, P. Soini, and K. Vakkila. IBM solidDB: In-memory database optimized for extreme speed and availability. IEEE Data Eng. Bull., 36(2):14–20, 2013.

[18] K. T. Malladi, B. C. Lee, F. A. Nothaft, C. Kozyrakis, K. Periyathambi, and M. Horowitz. Towards energy-proportional datacenter memory with mobile DRAM. SIGARCH Comput. Archit. News, 40(3):37–48, June 2012.

[19] A. Michaud. EMC Memory Tiering API Software Functional Specification (SFS). EMC, Inc., 2013.

[20] H. Plattner. The impact of columnar in-memory databases on enterprise systems. PVLDB, 7(13):1722–1729, 2014.

[21] M. K. Qureshi, V. Srinivasan, and J. A. Rivers. Scalable high performance main memory system using phase-change memory technology. In Proceedings of the 36th Annual International Symposium on Computer Architecture, ISCA '09, pages 24–33, New York, NY, USA, 2009. ACM.

[22] R. Stoica and A. Ailamaki. Enabling efficient OS paging for main-memory OLTP databases. In Proceedings of the Ninth International Workshop on Data Management on New Hardware, DaMoN '13, pages 7:1–7:7, New York, NY, USA, 2013. ACM.

Notes

¹This publication makes reference to EMC proprietary technology, and any use, copying, and distribution of any EMC software or technology described in this publication requires an applicable license from EMC.

²HYRISE project: https://github.com/hyrise/hyrise
