Squall: Fine-Grained Live Reconfiguration for Partitioned Main Memory Databases
AARON J. ELMORE, VAIBHAV ARORA, REBECCA TAFT, ANDY PAVLO, DIVY AGRAWAL, AMR EL ABBADI
Higher OLTP Throughput
Demand for high-throughput transactional systems (OLTP), especially due to web-based services
Let’s use Main-Memory
Growth in the scale of data
Data partitioning enables managing scale via scaling out
Highly concurrent, latch-free data structures – Hekaton, Silo
Partitioned data with single-threaded executors – H-Store, VoltDB
*Excuse the generalization
[Diagram: H-Store architecture – the client application invokes stored procedures by procedure name and input parameters]
Slide credits: Andy Pavlo
High skew increases latency by 10x and decreases throughput by 4x
Partitioned shared-nothing systems are especially susceptible
Possible solutions: Provision resources for peak load (very expensive and brittle!)
[Chart: demand vs. provisioned capacity over time, showing unused resources]
Possible solutions: Limit load on the system (poor performance!)
[Chart: demand exceeding the capped capacity over time]
Slide credits: Berkeley RAD Lab
Enable the system to elastically scale in or out to dynamically adapt to changes in load
Reconfiguration
Change the partition plan
Add nodes
Remove nodes
Need to migrate tuples between partitions to reflect the updated partition plan. Would like to do this without bringing the system offline:
Old plan (partitioned by Warehouse):
  Partition 1: [0,2)   Partition 2: [2,4)   Partition 3: [4,6)
New plan (partitioned by Warehouse):
  Partition 1: [0,1)   Partition 2: [2,3)   Partition 3: [1,2), [3,6)
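As a rough illustration (not code from the paper), a range-based partition plan like the one above could be represented as a map from partitions to half-open warehouse-ID ranges; all class and method names here are hypothetical:

import java.util.*;

// Hypothetical sketch of a range-based partition plan: each partition owns
// a set of half-open warehouse-ID ranges [low, high).
class PartitionPlan {
    private final Map<Integer, List<int[]>> ranges = new HashMap<>();

    void addRange(int partition, int low, int high) {
        ranges.computeIfAbsent(partition, p -> new ArrayList<>()).add(new int[]{low, high});
    }

    // Route a warehouse ID to the partition whose range contains it.
    int partitionFor(int warehouseId) {
        for (Map.Entry<Integer, List<int[]>> e : ranges.entrySet()) {
            for (int[] r : e.getValue()) {
                if (warehouseId >= r[0] && warehouseId < r[1]) {
                    return e.getKey();
                }
            }
        }
        throw new IllegalArgumentException("no partition owns warehouse " + warehouseId);
    }
}

// Example (the new plan above):
//   plan.addRange(1, 0, 1); plan.addRange(2, 2, 3);
//   plan.addRange(3, 1, 2); plan.addRange(3, 3, 6);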
Normal operation: high-level monitoring
Load imbalance detected → tuple-level monitoring (E-Monitor)
Hot tuples, partition-level access counts → tuple placement planning (E-Planner)
New partition plan → online reconfiguration (Squall)
Reconfiguration complete → back to normal operation
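A minimal sketch of this control loop, assuming simple event-driven phase transitions; the phase names mirror the slide, but the controller and its method names are hypothetical:

// Hypothetical sketch of the monitoring/planning/reconfiguration loop.
enum Phase { NORMAL, TUPLE_LEVEL_MONITORING, PLANNING, RECONFIGURING }

class ElasticController {
    Phase phase = Phase.NORMAL;

    void onLoadImbalanceDetected()   { phase = Phase.TUPLE_LEVEL_MONITORING; } // E-Monitor
    void onAccessCountsCollected()   { phase = Phase.PLANNING; }               // E-Planner
    void onNewPartitionPlan()        { phase = Phase.RECONFIGURING; }          // Squall
    void onReconfigurationComplete() { phase = Phase.NORMAL; }
}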
Prior work is predicated on disk-based solutions with traditional concurrency control and recovery:
Zephyr: relies on concurrency control (2PL) and disk pages
ProRea: relies on concurrency control (SI and OCC) and disk pages
Albatross: relies on replication and shared disk storage; also introduces strain on the source
Squall must instead handle:
A single-threaded execution model
More than a single source and destination (and the destination is not cold)
The presence of distributed transactions and replication
Given a plan from the E-Planner, Squall physically moves the data while the system is live
Pull-based mechanism – the destination pulls from the source
Conforms to H-Store's single-threaded execution model
To avoid performance degradation, Squall moves small chunks of data at a time, interleaved with regular transaction execution
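A minimal sketch of such interleaving under the single-threaded executor model; the executor structure and names below are assumptions for illustration, not Squall's actual code:

import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical single-threaded partition executor loop: regular transactions
// take priority, and one small migration chunk is processed in between.
class PartitionExecutor {
    private final Queue<Runnable> txnQueue = new ArrayDeque<>();
    private final Queue<Runnable> migrationChunks = new ArrayDeque<>();
    private boolean reconfigActive = true;

    void runLoop() {
        while (reconfigActive || !txnQueue.isEmpty()) {
            // Drain pending transactions first so transaction latency stays low.
            Runnable txn = txnQueue.poll();
            if (txn != null) {
                txn.run();
                continue;
            }
            // Otherwise, spend the idle cycle on one small migration chunk.
            Runnable chunk = migrationChunks.poll();
            if (chunk != null) {
                chunk.run();
            } else {
                reconfigActive = false; // nothing left to migrate
            }
        }
    }
}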
[Diagram: reconfiguration initialization – a client request triggers the reconfiguration, (new plan, leader ID) is announced to Partitions 1–4, and destination partitions issue pulls such as Pull W_ID=2 and Pull W_ID>5 against the source partitions]
[Diagram: warehouses 1–10 partitioned by warehouse ID; during reconfiguration each partition tracks incoming and outgoing ranges, e.g., one partition marks warehouse 2 incoming and warehouse 5 outgoing, while its counterpart marks warehouse 2 outgoing and warehouse 5 incoming]
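A minimal sketch, with hypothetical names, of how a partition might track incoming and outgoing keys during reconfiguration and classify each access as local, needing a demand pull, or needing a redirect:

import java.util.HashSet;
import java.util.Set;

// Hypothetical per-partition tracking of migrating warehouse IDs: "incoming"
// keys are owned here under the new plan but have not yet arrived; "outgoing"
// keys are owned by another partition under the new plan.
class MigrationTracker {
    private final Set<Integer> incoming = new HashSet<>();
    private final Set<Integer> outgoing = new HashSet<>();

    enum Access { LOCAL, PULL_FROM_SOURCE, REDIRECT_TO_DESTINATION }

    void markIncoming(int warehouseId) { incoming.add(warehouseId); }
    void markReceived(int warehouseId) { incoming.remove(warehouseId); }
    void markOutgoing(int warehouseId) { outgoing.add(warehouseId); }

    // Decide how a transaction touching this warehouse should be handled.
    Access classify(int warehouseId) {
        if (incoming.contains(warehouseId)) return Access.PULL_FROM_SOURCE;
        if (outgoing.contains(warehouseId)) return Access.REDIRECT_TO_DESTINATION;
        return Access.LOCAL;
    }
}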
The amount of data to move is unknown when a table is not partitioned by a clustered index (e.g., Customers partitioned by W_ID in TPC-C). Time spent extracting is time not spent on transactions.
Periodically pull chunks of cold data
These pulls are answered lazily – they start at a lower priority than transactions, and priority increases with time
Execution is interwoven with extracting and sending data (transactions may dirty the range!)
[Diagram: the destination issues an async pull request; the source extracts and returns the data in chunks]
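A minimal sketch of a lazily answered asynchronous pull request whose priority grows with waiting time; the fields, chunk size, and scaling factor are assumptions for illustration:

import java.util.concurrent.TimeUnit;

// Hypothetical asynchronous pull request for a range of cold data. The source
// answers it lazily: it starts below transaction priority and is boosted the
// longer it has been waiting, and the range is returned one chunk at a time.
class AsyncPullRequest {
    final int lowKey, highKey;     // half-open key range [lowKey, highKey)
    final int chunkSizeBytes;      // how much data to extract per answer
    final long createdAtNanos = System.nanoTime();

    AsyncPullRequest(int lowKey, int highKey, int chunkSizeBytes) {
        this.lowKey = lowKey;
        this.highKey = highKey;
        this.chunkSizeBytes = chunkSizeBytes;
    }

    // Priority grows with waiting time; a pull is answered once it has waited
    // long enough to outrank the queued transactions.
    double priority() {
        long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - createdAtNanos);
        return waitedMs / 100.0;   // assumed scaling factor, purely illustrative
    }
}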
Properly size reconfiguration granules and space them apart
Split large reconfigurations to limit demands on a single partition
Redirect or pull only if needed
Tune what gets pulled – sometimes pull a little extra
1. Split by pairs of source and destination – avoids contention on a single partition
2. Split large objects and migrate one piece at a time
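A minimal sketch of splitting a reconfiguration into sub-plans, grouping moves by (source, destination) pair and breaking large ranges into smaller pieces; the types and names are hypothetical, not the actual Squall/E-Planner code:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical decomposition of a reconfiguration. Each move says "range
// [low, high) goes from partition src to partition dst"; moves are grouped by
// (src, dst) pair and oversized ranges are split so they migrate one piece at a time.
class PlanSplitter {
    record Move(int src, int dst, int low, int high) {}

    static Map<String, List<Move>> split(List<Move> moves, int maxRangeWidth) {
        Map<String, List<Move>> subPlans = new HashMap<>();
        for (Move m : moves) {
            String pair = m.src() + "->" + m.dst();
            List<Move> plan = subPlans.computeIfAbsent(pair, k -> new ArrayList<>());
            for (int lo = m.low(); lo < m.high(); lo += maxRangeWidth) {
                int hi = Math.min(lo + maxRangeWidth, m.high());
                plan.add(new Move(m.src(), m.dst(), lo, hi));
            }
        }
        return subPlans;
    }
}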
Workloads: YCSB, TPC-C
Baselines:
  Stop & Copy
  Purely Reactive – only demand-based pulling
  Zephyr+ – Purely Reactive + asynchronous chunking with pull prefetching (semantically equivalent to Zephyr)
YCSB cluster consolidation: 4 to 3 nodes
YCSB data shuffle: 10% pairwise
TPC-C load balancing: hotspot warehouses
There is a trade-off between time to complete the migration and performance degradation. Future work will consider automating this trade-off based on service-level objectives.
Partitioned, single-threaded, main-memory environments are susceptible to hotspots. Elastic data management is a solution, and Squall provides a mechanism for executing fine-grained live reconfiguration.
Questions?
Static analysis is used to set chunk sizes; future work will set sizing and scheduling dynamically. [Chart: impact of chunk size on a 10% reconfiguration during a YCSB workload]
Delay at the destination between new async pull requests. [Chart: impact of this delay on a 10% reconfiguration during a YCSB workload with an 8 MB chunk size]
Set a cap on sub-plan splits; splitting is done on (source, destination) pairs, with the ability to decompose migrating objects.