Making Non-Distributed Databases, Distributed Ioannis - PowerPoint PPT Presentation

Making Non-Distributed Databases, Distributed Ioannis Papapanagiotou, PhD Shailesh Birari

Dynomite Ecosystem ● Dynomite - Proxy layer ● Dyno - Client ● Dynomite-manager - Ecosystem orchestrator ● Dynomite-explorer - UI

Problems & Observations ● Needed a data store: o Scalable & highly available o High throughput, low latency o Netflix use case is active-active ● Master-slave storage engines: o Do not support bi-directional replication o Cannot withstand a Monkey attack o Cannot easily perform maintenance

What is Dynomite? A framework that makes non-distributed data stores, distributed. It can be used with many key-value storage engines. Features: highly available, automatic failover, node warmup, tunable consistency, backups/restores

Dynomite @ Netflix ● Running around 2.5 years in PROD ● 70 clusters ● ~1000 nodes used by internal microservices ● Microservices based on Java, Python, NodeJS

Pluggable Storage Engines ● Layer on top of a non-distributed key-value data store ○ Peer-peer, Shared Nothing ○ Auto-Sharding ○ Multi-datacenter ○ Linear scale ○ Replication ○ Gossiping [Diagram labels: RESP, RESP]

Topology ● Each rack contains one copy of data, partitioned across multiple nodes in that rack ● Multiple Racks == Higher Availability (HA)

Replication ● A client can connect to any node on the Dynomite cluster when sending requests. o If the node owns the data, the data are written to the local data store and asynchronously replicated. o If the node does not own the data, the node acts as a coordinator: it sends the data to the owning node in the same rack and replicates to other nodes in other racks and DCs.

Dyno Client - Java API ● Connection Pooling ● Load Balancing ● Effective failover ● Pipelining ● Scatter/Gather ● Metrics, e.g. Netflix Insights

Dyno Load Balancing ● The Dyno client employs token-aware load balancing. ● The Dyno client is aware of the Dynomite cluster topology within the region and can write to a specific node using consistent hashing.
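
The token-aware idea above can be illustrated with a small sketch. This is not Dyno's code: the `TokenRing` class, the MD5-based `key_token` hash, the 32-bit token space, and the "first token at or after the key" ownership rule are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class TokenRing:
    """Maps keys to the nodes of one rack by token (illustrative only)."""

    def __init__(self, node_tokens):
        # node_tokens: {node_name: token}; tokens are positions on the ring.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def key_token(key):
        # Hypothetical hash choice: first 4 bytes of MD5 as an unsigned int.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big")

    def owner_of_token(self, token):
        # Assumed rule: the owner is the first node whose token is >= the
        # given token, wrapping around to the first node past the last token.
        idx = bisect.bisect_left(self._tokens, token)
        return self._ring[idx % len(self._ring)][1]

    def owner(self, key):
        return self.owner_of_token(self.key_token(key))


ring = TokenRing({"node-a": 100, "node-b": 200, "node-c": 300})
target = ring.owner("user:42")  # the client would send the request here
```

Because every client computes the same owner for a key, a request can go straight to the node that holds the data instead of passing through a coordinator.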

Dyno Failover ● Dyno will route requests to different racks in failure scenarios.
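
A minimal sketch of rack failover, assuming the client already knows which node in each rack holds the key's token. The function name, the `replicas_by_rack` mapping, and the `is_healthy` callback are hypothetical and do not come from the Dyno API.

```python
def route_with_failover(replicas_by_rack, local_rack, is_healthy):
    """Return (rack, node) for the first healthy replica, local rack first."""
    # Try the local rack first, then the remaining racks in a stable order.
    racks = [local_rack] + sorted(r for r in replicas_by_rack if r != local_rack)
    for rack in racks:
        node = replicas_by_rack.get(rack)
        if node is not None and is_healthy(node):
            return rack, node
    raise RuntimeError("no healthy replica in any rack")


replicas = {"rack-a": "a1", "rack-b": "b1", "rack-c": "c1"}
down = {"a1"}  # pretend the local replica is unavailable
chosen = route_with_failover(replicas, "rack-a", lambda n: n not in down)
```

Since each rack holds a full copy of the data, any healthy rack can serve the request.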

Dynomite on the Cloud [Diagram label: RESP]

Moving across engines [Diagram labels: Rack A, Rack B]

Dynomite-manager: Warm up 1. Dynomite-manager identifies which node has the same token in the same DC 2. It leverages master/slave replication from that node 3. It checks for peer syncing a. the difference between master and slave offset 4. Once master and slave are in sync, Dynomite is set to allow writes only 5. Dynomite is set back to normal state 6. It checks the health of the node. Done!
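
The offset check in step 3 could look roughly like the sketch below when the storage engine is Redis. It assumes the text output of the Redis `INFO replication` command, using its `master_repl_offset` and `slave_repl_offset` fields. The helper names and the `max_lag` tolerance are hypothetical, and this is not Dynomite-manager's actual code.

```python
def parse_info(text):
    """Parse 'name:value' lines from Redis INFO output into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and section headers such as '# Replication'.
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, _, value = line.partition(":")
        fields[name] = value
    return fields


def replication_lag(master_info, replica_info):
    """Difference between the master offset and the replica offset."""
    master = int(parse_info(master_info)["master_repl_offset"])
    replica = int(parse_info(replica_info)["slave_repl_offset"])
    return master - replica


def is_in_sync(master_info, replica_info, max_lag=0):
    # max_lag is an assumed tolerance; 0 means the offsets must match.
    return replication_lag(master_info, replica_info) <= max_lag


master_text = "# Replication\nrole:master\nmaster_repl_offset:1500\n"
replica_text = "# Replication\nrole:slave\nslave_repl_offset:1480\n"
lag = replication_lag(master_text, replica_text)
```

In practice the two INFO texts would come from the peer node and the node being warmed up.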

Dynomite-Explorer (UI) • Node.js web app with a Polymer-based user-interface • Support Redis’ rich data types • Avoid operations that can negatively impact Redis server performance • Extended for Dynomite awareness • Allow extension of the server to integrate with the Netflix ecosystem

Dynomite-Explorer

Roadmap ● Data reconciliation & repair v2 ● Optimizations of RocksDB configuration ● Optimizing backups through SST ● Others….

More information • Netflix OSS: • https://github.com/Netflix/dynomite • https://github.com/Netflix/dyno • https://github.com/Netflix/dynomite-manager • Chat: https://gitter.im/Netflix/dynomite

Dynomite: S3 backups/restores ● Why? o Disaster recovery o Data corruption ● How? o Storage dumps data on the instance drive o Dynomite-manager sends data to S3 buckets ● Data per node are not large, so no need for incrementals. ● Use case: o Clusters that use Dynomite as a storage layer o Not enabled in clusters that have short TTL or use Dynomite as a cache

Dynomite-manager ● Token management for multi-region deployments ● Support AWS environment ● Automated security group update in multi-region environment ● Monitoring of Dynomite and the underlying storage engine ● Node cold bootstrap (warm up) ● S3 backups and restores ● REST API

Performance Setup ● Instance Type: ○ Dynomite: i3.2xlarge with NVMe ○ NDBench: m2.2xls (typical of an app@Netflix) ● Replication factor: 3 ○ Deployed Dynomite in 3 zones in us-east-1 ○ Every zone had the same number of servers ● Demo app used simple key/value workloads ○ Redis: GET and SET ● Payload ○ Size: 1024 Bytes ○ 80% reads / 20% writes

Throughput

Latencies

Consistency ● DC_ONE: Reads and writes are propagated synchronously only to the node in the local rack and asynchronously replicated to other racks and data centers. ● DC_QUORUM: Reads and writes are propagated synchronously to a quorum number of nodes in the local data center and asynchronously to the rest. The DC_QUORUM configuration writes to the number of nodes that make up a quorum. A quorum is calculated, and then rounded down to a whole number. If all responses are different, the first response that the coordinator received is returned. ● DC_SAFE_QUORUM: Similar to DC_QUORUM, but the operation succeeds only if the read/write succeeded on a quorum number of nodes and the data checksum matches. If the quorum has not been achieved, Dynomite generates an error response.
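
The difference between DC_QUORUM and DC_SAFE_QUORUM can be sketched as follows. The slide does not give the quorum formula, so the majority rule `replicas // 2 + 1` is an assumption, as are the function names and the `QuorumError` exception. Comparing response values directly stands in for the checksum comparison.

```python
from collections import Counter


class QuorumError(Exception):
    """Raised when a safe-quorum operation cannot reach agreement."""


def quorum_size(replicas):
    # Assumed majority formula: half the replicas, rounded down, plus one.
    return replicas // 2 + 1


def resolve(responses, replicas, safe=False):
    """Pick a result from responses listed in arrival order.

    safe=False models DC_QUORUM, safe=True models DC_SAFE_QUORUM.
    """
    if not responses:
        raise QuorumError("no responses received")
    needed = quorum_size(replicas)
    counts = Counter(responses)
    for value in responses:  # arrival order
        if counts[value] >= needed:
            return value
    if safe:
        raise QuorumError("quorum not achieved")
    # DC_QUORUM fallback described on the slide: return the first response.
    return responses[0]


agreed = resolve(["v2", "v1", "v2"], replicas=3)    # two of three agree
fallback = resolve(["v1", "v2", "v3"], replicas=3)  # all differ
```

With three replicas the assumed quorum is two, so `agreed` is the value two nodes share, while `fallback` is simply the first response received. The same disagreement under `safe=True` raises `QuorumError` instead.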

Deploying Dynomite in PROD ● Unit testing in Github ● Building EC2 AMI in “experimental” ● Pipelines for performance analysis ● Promotion to “candidate” ● Beta Testing ● Promotion to “release”

Reconciliation ● Reconciliation is based on timestamps (newest wins) and is performed by a Spark cluster ● Jenkins job to avoid clock skew
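
The newest-wins rule can be shown with a small sketch. It is plain Python rather than a Spark job, and the data layout (one dict per replica, mapping each key to a `(timestamp, value)` pair) is an assumption made for illustration.

```python
def reconcile(replica_views):
    """Merge per-replica views, keeping the newest value for each key.

    replica_views: list of {key: (timestamp, value)} dicts, one per replica.
    """
    merged = {}
    for view in replica_views:
        for key, (timestamp, value) in view.items():
            # Keep the entry with the larger timestamp; on a tie the
            # entry seen first is kept.
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    return merged


rack_a = {"k1": (10, "old"), "k2": (5, "x")}
rack_b = {"k1": (12, "new"), "k3": (7, "y")}
result = reconcile([rack_a, rack_b])
```

Because the decision rests entirely on timestamps, the result is only as good as the clocks behind them, which is why clock skew has to be kept under control.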

Reconciliation: Design Principles We would prefer to take the processing load of reconciliation off each node in the cluster and offload it to a high-performance, in-memory computation cluster based on Spark.

Reconciliation: Architecture ● Forcing Redis (or any other storage engine) to dump data to the disk ● Encrypted communication between Dynomite and Spark cluster ● Chunking the data - retry in case of a failure. ● Bandwidth Throttler
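
Chunking with retry might be sketched as below. The `send` callback, the chunk size, and the retry limit are hypothetical placeholders. Encryption and bandwidth throttling are left out.

```python
def chunks(data, size):
    """Yield consecutive slices of data, each at most size bytes long."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def send_with_retry(data, send, chunk_size=4, max_attempts=3):
    """Send data chunk by chunk, retrying only the chunk that failed.

    Returns the number of chunks delivered.
    """
    delivered = 0
    for chunk in chunks(data, chunk_size):
        for attempt in range(1, max_attempts + 1):
            try:
                send(chunk)
                break
            except OSError:
                # Give up on this transfer after the last allowed attempt.
                if attempt == max_attempts:
                    raise
        delivered += 1
    return delivered


received = []
count = send_with_retry(b"0123456789", received.append)
```

Retrying per chunk means a transient failure costs one chunk of retransmission rather than the whole dump.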
