CockroachDB: Scalable, survivable, strongly consistent, SQL



slide-1
SLIDE 1

Scalable, survivable, strongly consistent, SQL

CockroachDB

presented by Ben Darnell / CTO

slide-2
SLIDE 2

@cockroachdb

About Me

  • Co-founder of Cockroach Labs
  • Previously at Google, Dropbox, Square
slide-3
SLIDE 3

@cockroachdb

Agenda

  • Motivation
  • High-level architecture
  • Some CockroachDB Features
  • Q & A
  • Interruptions are encouraged!

slide-4
SLIDE 4

@cockroachdb

Motivation

slide-5
SLIDE 5

@cockroachdb

Limitations of Existing Databases

Relational: Hard to scale horizontally

  • Scalability: manual sharding results in high operational complexity and application rewrites
  • Replication: wasted resources (stand-by servers) or lost consistency (asynchronous replication)

OR

NoSQL: Scalability with strings attached

  • Limited transactions: developer burden due to complex data modeling
  • Limited indexes: lost flexibility with querying and analytics
  • Eventual consistency: correctness issues and higher risk of data corruption

slide-6
SLIDE 6

@cockroachdb

CockroachDB: The Best of Both Worlds

  • Single binary/symmetric nodes
  • Applications see one logical DB, including cross-datacenter, global
  • Self-healing/self-balancing
  • Scale out is as simple as adding nodes
  • SQL
slide-7
SLIDE 7

@cockroachdb

High-Level Architecture

slide-8
SLIDE 8

@cockroachdb

Abstraction Stack

  • SQL
  • Transactional KV
  • Distribution
  • Replication
  • Storage

slide-9
SLIDE 9

@cockroachdb

Transactional KV

  • Monolithic sorted key-value map
  • Automatically replicated and distributed
  • Consistent
  • Self-healing
slide-10
SLIDE 10

@cockroachdb

Transactional KV: ACID

  • Atomicity. All of a transaction's operations take effect, or none do.
  • Consistency. No constraint violations.
  • Isolation. Each transaction runs as if it had exclusive database access.
  • Durability. Committed data survives crashes.
slide-11
SLIDE 11

@cockroachdb

SQL: Structured Data Model

Inventory

  • Tables
slide-12
SLIDE 12

@cockroachdb

SQL: Structured Data Model

  • Tables
  • Rows

Inventory

slide-13
SLIDE 13

@cockroachdb

SQL: Structured Data Model

  • Tables
  • Rows
  • Columns

Inventory

ID  Name    Quantity
1   Glove   1
2   Ball    4
3   Shirt   2
4   Shorts  12
5   Bat
6   Shoes   4

slide-14
SLIDE 14

@cockroachdb

SQL: Structured Data Model

  • Tables
  • Rows
  • Columns
  • Indexes

Inventory

ID  Name    Quantity
1   Glove   1
2   Ball    4
3   Shirt   2
4   Shorts  12
5   Bat
6   Shoes   4

Name_Idx

Name
Ball
Bat
Glove
Shirt
Shoes
Shorts

slide-15
SLIDE 15

@cockroachdb

SQL

CREATE TABLE inventory (
    id INTEGER PRIMARY KEY,
    name VARCHAR,
    quantity INTEGER,
    INDEX name_index (name)
);

slide-16
SLIDE 16

@cockroachdb

SQL: Key anatomy

INSERT INTO inventory VALUES (1, 'Apple', 12);
INSERT INTO inventory VALUES (2, 'Orange', 15);

id  name    quantity
1   Apple   12
2   Orange  15

=

Key: /<table>/<index>/<key>/<column>    Value
/inventory/primary/1/name               Apple
/inventory/primary/1/quantity           12
/inventory/primary/2/name               Orange
/inventory/primary/2/quantity           15
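
The mapping from rows to keys can be sketched in a few lines of Python. This is only an illustration of the /<table>/<index>/<key>/<column> layout shown on the slide; the function name `encode_row` and the use of plain strings are assumptions, and the real encoding is binary and more involved.

```python
def encode_row(table, index, key_columns, row):
    """Yield one (key, value) pair per non-key column of a row.

    Hypothetical helper: keys are readable strings here purely for
    illustration of the slide's layout.
    """
    key_part = "/".join(str(row[c]) for c in key_columns)
    for column, value in row.items():
        if column in key_columns:
            continue  # key columns live in the key itself
        yield f"/{table}/{index}/{key_part}/{column}", value


rows = [
    {"id": 1, "name": "Apple", "quantity": 12},
    {"id": 2, "name": "Orange", "quantity": 15},
]

kv = {}
for row in rows:
    kv.update(encode_row("inventory", "primary", ["id"], row))

# Print the pairs in sorted key order, as a sorted KV map would store them.
for key in sorted(kv):
    print(key, kv[key])
```

Because all columns of a row share the same key prefix, they sort next to each other in the key-value map.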

slide-17
SLIDE 17

@cockroachdb

Distribution: Sharding

The data is split into ~64MB ranges. Each holds a contiguous range of the key space.

  • Range Ø-lem: apricot, banana, blueberry, cherry, grape
  • Range lem-pea: lemon, lime, mango, melon
  • Range pea-∞: peach, pear, pineapple, raspberry, strawberry

slide-18
SLIDE 18

@cockroachdb

Distribution: Index

An index maps from key to range ID.

  • Shard index: Ø-lem, lem-pea, pea-∞
  • Range Ø-lem: apricot, banana, blueberry, cherry, grape
  • Range lem-pea: lemon, lime, mango, melon
  • Range pea-∞: peach, pear, pineapple, raspberry, strawberry
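
A key-to-range lookup over sorted start keys can be sketched with a binary search. The range IDs 1 to 3 and the name `lookup_range` are assumptions made for the example; the empty string stands in for the minimum key Ø.

```python
import bisect

# Sorted start keys of the three ranges on the slide.
range_starts = ["", "lem", "pea"]
range_ids = [1, 2, 3]  # hypothetical IDs for Ø-lem, lem-pea, pea-∞


def lookup_range(key):
    """Return the ID of the range whose [start, next start) span holds key."""
    # bisect_right gives the position of the first start key greater than
    # `key`; the range just before that position contains the key.
    i = bisect.bisect_right(range_starts, key) - 1
    return range_ids[i]


for fruit in ["banana", "lemon", "mango", "peach", "strawberry"]:
    print(fruit, "-> range", lookup_range(fruit))
```

Start keys are treated as inclusive here, so the key "lem" itself would fall in the second range.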

slide-19
SLIDE 19

@cockroachdb

Distribution: Split

Split when a range is too large (or too hot, or…)

  • Shard index: Ø-lem, lem-pea, pea-str, str-∞
  • Range Ø-lem: apricot, banana, blueberry, cherry, grape
  • Range lem-pea: lemon, lime, mango, melon
  • Range pea-str: peach, pear, pineapple, raspberry
  • Range str-∞: strawberry, tamarillo, tamarind
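
A size-based split might look like the sketch below. It is a simplification under stated assumptions: size is counted in keys rather than bytes, the split point is the median key (the slide splits at "str" instead), and `Range` and `maybe_split` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Range:
    start: str                 # inclusive start key ("" = minimum key)
    end: Optional[str]         # exclusive end key (None = no upper bound)
    keys: List[str] = field(default_factory=list)  # sorted keys in the range


def maybe_split(rng, max_keys):
    """Split rng at its median key if it holds more than max_keys keys."""
    if len(rng.keys) <= max_keys:
        return [rng]
    split_key = rng.keys[len(rng.keys) // 2]
    left = Range(rng.start, split_key, [k for k in rng.keys if k < split_key])
    right = Range(split_key, rng.end, [k for k in rng.keys if k >= split_key])
    return [left, right]


big = Range("pea", None, ["peach", "pear", "pineapple", "raspberry",
                          "strawberry", "tamarillo", "tamarind"])
for part in maybe_split(big, max_keys=5):
    print(part.start, part.end, part.keys)
```

After a split, the shard index gains one entry: the new range's start key is the old range's split key.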

slide-20
SLIDE 20

@cockroachdb

Replication: Survivability

  • Each range is replicated to three or more nodes
  • Consensus via Raft
  • "Leaseholder" optimization to allow reads to be served without consensus
  • Multi-Version Concurrency Control
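
Why "three or more" replicas: a majority-based consensus protocol such as Raft needs more than half of the replicas to agree, so the number of failures a range survives follows from simple arithmetic. The function names below are made up for this sketch.

```python
def quorum(replicas):
    """Smallest majority of a replica group."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """Replicas that can be lost while a majority remains reachable."""
    return replicas - quorum(replicas)


for n in (3, 5, 7):
    print(f"{n} replicas: quorum {quorum(n)}, survives {tolerated_failures(n)} failures")
```

With three replicas a range survives one failure; with five it survives two.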
slide-21
SLIDE 21

@cockroachdb

Data Distribution: Placement

[Figure: replicas of Ranges 1, 2, and 3 distributed across Nodes 1, 2, and 3]

Each range is replicated to three or more nodes

slide-22
SLIDE 22

@cockroachdb

Data Distribution: Rebalancing

[Figure: replicas of Ranges 1, 2, and 3 on Nodes 1 to 3, with Node 4 joining the cluster]

Adding a new (empty) node

slide-23
SLIDE 23

@cockroachdb

Data Distribution: Rebalancing

[Figure: Nodes 1 to 4; an additional replica of Range 3 appears]

A new replica is allocated, data is copied.

slide-24
SLIDE 24

@cockroachdb

Data Distribution: Rebalancing

[Figure: Nodes 1 to 4; the new Range 3 replica is live alongside the one it replaces]

The new replica is made live, replacing another.

slide-25
SLIDE 25

@cockroachdb

Data Distribution: Rebalancing

[Figure: Nodes 1 to 4; the replaced Range 3 replica is gone]

The old (inactive) replica is deleted.

slide-26
SLIDE 26

@cockroachdb

Data Distribution: Rebalancing

[Figure: replicas of Ranges 1, 2, and 3 spread across Nodes 1 to 4]

Process continues until nodes are balanced.
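
The rebalancing steps can be approximated with a greedy loop that keeps moving one replica from the fullest node to the emptiest node. This is a toy model, not the actual allocator: it balances replica counts only, and `rebalance` and the node names are assumptions.

```python
def rebalance(placement):
    """Even out replica counts. placement maps node name -> set of range IDs.

    Mutates placement and returns the list of (range, from_node, to_node) moves.
    """
    moves = []
    while True:
        fullest = max(placement, key=lambda n: len(placement[n]))
        emptiest = min(placement, key=lambda n: len(placement[n]))
        if len(placement[fullest]) - len(placement[emptiest]) <= 1:
            return moves  # counts differ by at most one: balanced
        # Only move a range the target does not already hold, so a node
        # never ends up with two replicas of the same range.
        candidates = sorted(placement[fullest] - placement[emptiest])
        if not candidates:
            return moves
        rng = candidates[0]
        placement[emptiest].add(rng)     # allocate the new replica, copy data
        placement[fullest].remove(rng)   # then delete the old replica
        moves.append((rng, fullest, emptiest))


placement = {"n1": {1, 2, 3}, "n2": {1, 2, 3}, "n3": {1, 2, 3}, "n4": set()}
moves = rebalance(placement)
print(moves)
print(placement)
```

The new replica is added before the old one is removed, mirroring the order on the slides, so the range never drops below its replication factor during a move.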

slide-27
SLIDE 27

@cockroachdb

Data Distribution: Recovery

[Figure: replicas of Ranges 1, 2, and 3 across Nodes 1 to 4; one node is marked X as lost]

Losing a node causes recovery of its replicas.

slide-28
SLIDE 28

@cockroachdb

Data Distribution: Recovery

[Figure: Nodes 1 to 4 with the lost node marked X; new replicas of Range 1 and Range 3 appear on surviving nodes]

A new replica gets created on an existing node.

slide-29
SLIDE 29

@cockroachdb

Data Distribution: Recovery

[Figure: Nodes 1, 3, and 4 holding the replicas of Ranges 1, 2, and 3]

Once at full replication, the old replicas are forgotten.
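
Recovery can be sketched the same way: drop the lost node, find ranges that are now under-replicated, and create new replicas on surviving nodes. The function `recover`, the default replication factor of 3, and the "least loaded node first" choice are assumptions for this example.

```python
def recover(placement, dead_node, replication_factor=3):
    """Re-replicate the ranges held by dead_node onto surviving nodes."""
    lost = placement.pop(dead_node)  # forget the dead node's replicas
    for rng in sorted(lost):
        holders = [n for n in placement if rng in placement[n]]
        while len(holders) < replication_factor:
            candidates = [n for n in placement if rng not in placement[n]]
            if not candidates:
                break  # not enough surviving nodes to restore full replication
            # Prefer the node that currently holds the fewest replicas.
            target = min(candidates, key=lambda n: len(placement[n]))
            placement[target].add(rng)
            holders.append(target)
    return placement


cluster = {"n1": {2, 3}, "n2": {1, 3}, "n3": {1, 2, 3}, "n4": {1, 2}}
recover(cluster, "n2")
print(cluster)
```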

slide-30
SLIDE 30

@cockroachdb

Some CockroachDB Features

slide-31
SLIDE 31

@cockroachdb

Geographic Zone Configurations

  • Control where your data is
  • Nodes are tagged with attributes and hierarchical localities
  • Rules target these
  • Zero downtime data migrations
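
As a rough illustration of rules targeting node tags, the sketch below filters nodes by locality and attributes. The data layout and the function `matching_nodes` are hypothetical and do not reflect CockroachDB's actual zone configuration syntax.

```python
# Hypothetical node descriptions: a hierarchical locality plus free-form attributes.
nodes = {
    "n1": {"locality": {"region": "us-east", "zone": "a"}, "attrs": {"ssd"}},
    "n2": {"locality": {"region": "us-east", "zone": "b"}, "attrs": {"hdd"}},
    "n3": {"locality": {"region": "eu-west", "zone": "a"}, "attrs": {"ssd"}},
}


def matching_nodes(nodes, locality=None, attrs=()):
    """Return the names of nodes satisfying every locality and attribute rule."""
    locality = locality or {}
    result = []
    for name, node in nodes.items():
        locality_ok = all(node["locality"].get(k) == v for k, v in locality.items())
        attrs_ok = set(attrs) <= node["attrs"]
        if locality_ok and attrs_ok:
            result.append(name)
    return sorted(result)


print(matching_nodes(nodes, locality={"region": "us-east"}))
print(matching_nodes(nodes, locality={"region": "eu-west"}, attrs=["ssd"]))
```
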
slide-32
SLIDE 32

@cockroachdb

Geo-Partitioning

■ Domicile data according to customer
  ○ Meet regulatory constraints
  ○ Low-latency reads / writes
■ One logical database
  ○ Simplified app development

slide-33
SLIDE 33

@cockroachdb

Distributed SQL

SELECT l_shipmode, AVG(l_extendedprice) FROM lineitem GROUP BY l_shipmode;
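
One common way to run such a query across nodes is two-stage aggregation: each node computes a partial (sum, count) per group over its local rows, and a final stage merges the partials. The sketch below shows that general technique with made-up data; it is not a description of CockroachDB's actual execution engine.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Per-node stage: (sum, count) of price for each ship mode in local rows."""
    acc = defaultdict(lambda: [0.0, 0])
    for shipmode, price in rows:
        acc[shipmode][0] += price
        acc[shipmode][1] += 1
    return acc


def final_aggregate(partials):
    """Merge stage: combine per-node partials, then divide to get averages."""
    total = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for shipmode, (s, c) in part.items():
            total[shipmode][0] += s
            total[shipmode][1] += c
    return {shipmode: s / c for shipmode, (s, c) in total.items()}


# Hypothetical (l_shipmode, l_extendedprice) rows held by three nodes.
node_rows = [
    [("AIR", 100.0), ("MAIL", 50.0)],
    [("AIR", 200.0)],
    [("MAIL", 150.0), ("AIR", 300.0)],
]
result = final_aggregate(partial_aggregate(rows) for rows in node_rows)
print(result)
```

Shipping sums and counts rather than per-node averages matters: averaging the averages would give the wrong answer when nodes hold different numbers of rows per group.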

slide-34
SLIDE 34

@cockroachdb

Online Schema Changes

  • Based on Google's F1 Paper
  • State machine, possibly with backfill
  • Zero downtime
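
The state machine can be sketched as below. The state names follow the intermediate states described in the F1 schema change paper (delete-only, write-only, then public after a backfill); the exact names and steps used inside CockroachDB are not given on the slide, so treat this as an assumption.

```python
# Schema element states, in the order a newly added index passes through them.
STATES = ["absent", "delete-only", "write-only", "public"]

# Operations that apply to the element in each state.
ALLOWED = {
    "absent": set(),
    "delete-only": {"delete"},
    "write-only": {"delete", "insert", "update"},
    "public": {"delete", "insert", "update", "read"},
}


def advance(state):
    """Move one step forward; 'public' is the final state."""
    i = STATES.index(state)
    return STATES[min(i + 1, len(STATES) - 1)]


def add_index_steps():
    """List the steps for adding an index, including the backfill."""
    steps, state = [], "absent"
    while state != "public":
        if state == "write-only":
            # Existing rows are backfilled before the index becomes readable.
            steps.append("backfill existing rows")
        nxt = advance(state)
        steps.append(f"{state} -> {nxt}")
        state = nxt
    return steps


for step in add_index_steps():
    print(step)
```

Moving one state at a time is what allows nodes that briefly disagree about the schema version to keep operating without corrupting the index.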
slide-35
SLIDE 35

Questions?

  • jobs@cockroachlabs.com
  • github.com/cockroachdb
  • www.cockroachlabs.com

slide-36
SLIDE 36

@cockroachdb

Other Topics

  • (New in 2.1) Query optimizer
  • Testing with Jepsen
  • Graphical admin UI
  • Distributed import
slide-37
SLIDE 37

@cockroachdb

Backup/Restore

  • Distributed
  • Consistent to a point in time
  • Incremental