
CockroachDB: Scalable, survivable, strongly consistent, SQL



  1. CockroachDB: Scalable, survivable, strongly consistent, SQL. Presented by Ben Darnell / CTO

  2. About Me • Co-founder of Cockroach Labs • Previously at Google, Dropbox, Square @cockroachdb

  3. Agenda ● Motivation ● High-level architecture ● Some CockroachDB Features ● Q & A ● Interruptions are encouraged!

  4. Motivation

  5. Limitations of Existing Databases
     Relational: hard to scale horizontally
     ● Scalability: manual sharding results in high operational complexity and application rewrites
     ● Replication: wasted resources (stand-by servers) or lost consistency (asynchronous replication)
     NoSQL: scalability with strings attached
     ● Limited transactions: developer burden due to complex data modeling
     ● Limited indexes: lost flexibility with querying and analytics
     ● Eventual consistency: correctness issues and higher risk of data corruption

  6. CockroachDB: The Best of Both Worlds • Single binary/symmetric nodes • Applications see one logical DB, including cross-datacenter, global • Self-healing/self-balancing • Scale out is as simple as adding nodes • SQL

  7. High-Level Architecture

  8. Abstraction Stack: SQL / Transactional KV / Distribution / Replication / Storage

  9. Transactional KV • Monolithic sorted key-value map • Automatically replicated and distributed • Consistent • Self-healing

  10. Transactional KV: ACID • Atomicity. All operations or no operations. • Consistency. No violating constraints. • Isolation. Exclusive database access. • Durability. Committed data survives crashes.

  11. SQL: Structured Data Model ● Tables (example table: Inventory)

  12. SQL: Structured Data Model ● Tables ● Rows (example table: Inventory)

  13. SQL: Structured Data Model ● Tables ● Rows ● Columns
     Inventory
     ID | Name | Quantity
     1 | Glove | 1
     2 | Ball | 4
     3 | Shirt | 2
     4 | Shorts | 12
     5 | Bat | 0
     6 | Shoes | 4

  14. SQL: Structured Data Model ● Tables ● Rows ● Columns ● Indexes
     Inventory: same table as on slide 13
     Name_Idx (sorted by Name): Ball, Bat, Glove, Shirt, Shoes, Shorts

  15. SQL
     CREATE TABLE inventory (
       id INTEGER PRIMARY KEY,
       name VARCHAR,
       quantity INTEGER,
       INDEX name_index (name)
     );

  16. SQL: Key anatomy
     INSERT INTO inventory VALUES (1, 'Apple', 12);
     INSERT INTO inventory VALUES (2, 'Orange', 15);
     Each row (id, name, quantity) is stored as one key-value pair per non-key column:
     Key: /<table>/<index>/<key>/<column> | Value
     /inventory/primary/1/name | Apple
     /inventory/primary/1/quantity | 12
     /inventory/primary/2/name | Orange
     /inventory/primary/2/quantity | 15
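
The row-to-key mapping on slide 16 can be sketched in a few lines of Python. This is only an illustration of the /<table>/<index>/<key>/<column> layout; the helper name `row_to_kv` is hypothetical, and readable strings stand in for whatever byte-level encoding the database actually uses, which the slides do not show.

```python
def row_to_kv(table, pk_column, row):
    """Flatten one SQL row into (key, value) pairs, one per non-key column."""
    pk = row[pk_column]
    return [
        (f"/{table}/primary/{pk}/{column}", value)
        for column, value in row.items()
        if column != pk_column
    ]


rows = [
    {"id": 1, "name": "Apple", "quantity": 12},
    {"id": 2, "name": "Orange", "quantity": 15},
]

# Keys are unique, so sorting the pairs orders them by key, as in a sorted KV map.
kv = sorted(pair for row in rows for pair in row_to_kv("inventory", "id", row))
for key, value in kv:
    print(key, "=", value)
```

Because the primary key value is part of every key, all columns of one row end up next to each other in the sorted map.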

  17. Distribution: Sharding. The data is split into ~64MB ranges. Each holds a contiguous range of the key space.
     Ø-lem: apricot, banana, blueberry, cherry, grape
     lem-pea: lemon, lime, mango, melon, orange
     pea-∞: peach, pear, pineapple, raspberry, strawberry

  18. Distribution: Index. An index maps from key to range ID.
     Index: Ø-lem | lem-pea | pea-∞
     Shards: the same three ranges and keys as on slide 17
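
A minimal sketch of the key-to-range lookup described on slide 18, assuming the index is just the sorted list of split points ("lem", "pea") from the example. The range IDs 1 to 3 and the function name `lookup_range` are made up for illustration; a binary search over the split points finds the range that contains a key.

```python
import bisect

# Split points from the slide: ranges are [Ø, "lem"), ["lem", "pea"), ["pea", ∞).
SPLIT_KEYS = ["lem", "pea"]
RANGE_IDS = [1, 2, 3]  # hypothetical range IDs, one per range


def lookup_range(key, split_keys=SPLIT_KEYS, range_ids=RANGE_IDS):
    """Return the ID of the range whose key span contains `key`."""
    # bisect_right counts the split points <= key, which is the range's position.
    return range_ids[bisect.bisect_right(split_keys, key)]


for fruit in ("apricot", "lemon", "orange", "peach", "strawberry"):
    print(fruit, "-> range", lookup_range(fruit))
```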

  19. Distribution: Split. Split when a range is too large (or too hot, or…)
     Index: Ø-lem | lem-pea | pea-str | str-∞
     Ø-lem: apricot, banana, blueberry, cherry, grape
     lem-pea: lemon, lime, mango, melon, orange
     pea-str: peach, pear, pineapple, raspberry
     str-∞: strawberry, tamarillo, tamarind
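
The effect of the split on slide 19 can be sketched as adding one split point to the sorted index. The helper names `split_range` and `range_spans` are hypothetical, and how the database decides when and where to split is outside this sketch; the split key "str" is taken from the slide.

```python
import bisect


def split_range(split_keys, new_split_key):
    """Return a new sorted list of split points with `new_split_key` added."""
    updated = list(split_keys)
    if new_split_key not in updated:
        bisect.insort(updated, new_split_key)
    return updated


def range_spans(split_keys):
    """Render the key spans implied by a list of split points."""
    bounds = ["Ø"] + list(split_keys) + ["∞"]
    return [f"{lo}-{hi}" for lo, hi in zip(bounds, bounds[1:])]


before = ["lem", "pea"]
after = split_range(before, "str")
print(range_spans(before))
print(range_spans(after))
```

Only the range containing the new split point changes; the other ranges and their data are untouched.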

  20. Replication: Survivability ● Each range is replicated to three or more nodes ● Consensus via Raft ● "Leaseholder" optimization to allow reads to be served without consensus ● Multi-Version Concurrency Control
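
Why "three or more" replicas: Raft commits a write once a majority of replicas acknowledge it, so the number of failures a range survives follows from simple arithmetic. The function names below are illustrative only.

```python
def quorum_size(replicas: int) -> int:
    """Smallest majority of `replicas` voters."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority is still reachable."""
    return replicas - quorum_size(replicas)


for n in (3, 5):
    print(n, "replicas: quorum", quorum_size(n), "tolerates", tolerated_failures(n), "failure(s)")
```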

  21. Data Distribution: Placement. Each range is replicated to three or more nodes. [Diagram: replicas of Ranges 1-3 placed on Nodes 1-3]

  22. Data Distribution: Rebalancing. Adding a new (empty) node. [Diagram: Node 4 joins, holding no replicas]

  23. Data Distribution: Rebalancing. A new replica is allocated, data is copied. [Diagram: a replica of Range 3 appears on Node 4]

  24. Data Distribution: Rebalancing. The new replica is made live, replacing another. [Diagram: Range 3 on Node 4 is live]

  25. Data Distribution: Rebalancing. The old (inactive) replica is deleted. [Diagram: one Range 3 replica removed from the original nodes]

  26. Data Distribution: Rebalancing. Process continues until nodes are balanced. [Diagram: Node 4 holds Range 2 and Range 3]
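
The rebalancing steps on slides 22 to 26 can be approximated with a toy loop: copy a replica to the emptiest node, drop it from the fullest, and repeat until replica counts are even. This is a simplified sketch, not the database's actual allocator; the function name `rebalance`, the node names, and the "counts differ by at most one" stopping rule are all assumptions.

```python
def rebalance(placement):
    """Move replicas from the fullest node to the emptiest until counts differ by at most 1.

    `placement` maps node name -> set of range IDs stored on that node.
    """
    placement = {node: set(ranges) for node, ranges in placement.items()}
    while True:
        fullest = max(placement, key=lambda n: len(placement[n]))
        emptiest = min(placement, key=lambda n: len(placement[n]))
        if len(placement[fullest]) - len(placement[emptiest]) <= 1:
            return placement
        # Never put two replicas of the same range on one node.
        movable = sorted(placement[fullest] - placement[emptiest])
        if not movable:
            return placement
        moved = movable[0]
        placement[emptiest].add(moved)    # new replica is created and made live
        placement[fullest].remove(moved)  # then the old replica is deleted


cluster = {"node1": {1, 2, 3}, "node2": {1, 2, 3}, "node3": {1, 2, 3}, "node4": set()}
balanced = rebalance(cluster)
for node in sorted(balanced):
    print(node, sorted(balanced[node]))
```

Adding the new replica before removing the old one mirrors the order on the slides, so a range never drops below its replica count during a move.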

  27. Data Distribution: Recovery. Losing a node causes recovery of its replicas. [Diagram: Node 2 is marked as failed]

  28. Data Distribution: Recovery. A new replica gets created on an existing node. [Diagram: replacement replicas appear on surviving nodes, e.g. Range 1 on Node 4]

  29. Data Distribution: Recovery. Once at full replication, the old replicas are forgotten. [Diagram: Nodes 1, 3, and 4 remain]
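
A toy version of the recovery on slides 27 to 29: when a node is lost, each range it held is re-created on a surviving node that does not already have it. The starting placement, the function name `recover`, and the "least-loaded survivor first" choice are assumptions made for illustration, not the database's actual policy.

```python
def recover(placement, dead_node, replication_factor=3):
    """Re-create replicas lost with `dead_node` on the least-loaded survivors."""
    survivors = {n: set(r) for n, r in placement.items() if n != dead_node}
    for range_id in sorted(placement[dead_node]):
        holders = [n for n in survivors if range_id in survivors[n]]
        missing = max(0, replication_factor - len(holders))
        # Candidate nodes do not hold this range yet; prefer the emptiest.
        candidates = sorted(
            (n for n in survivors if range_id not in survivors[n]),
            key=lambda n: (len(survivors[n]), n),
        )
        for node in candidates[:missing]:
            survivors[node].add(range_id)
    return survivors


placement = {
    "node1": {1, 2},
    "node2": {1, 3},
    "node3": {1, 2, 3},
    "node4": {2, 3},
}
healed = recover(placement, "node2")
for node in sorted(healed):
    print(node, sorted(healed[node]))
```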

  30. Some CockroachDB Features

  31. Geographic Zone Configurations ● Control where your data is ● Nodes are tagged with attributes and hierarchical localities ● Rules target these ● Zero downtime data migrations

  32. Geo-Partitioning ■ Domicile data according to customer ○ Meet regulatory constraints ○ Low-latency reads / writes ■ One logical database ○ Simplified app development

  33. Distributed SQL SELECT l_shipmode, AVG(l_extendedprice) FROM lineitem GROUP BY l_shipmode;
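
The slide shows only the query, so the following is a sketch of one common way a distributed engine can evaluate such a GROUP BY, not a description of CockroachDB's planner: each node computes a partial (sum, count) per l_shipmode over its local rows, and a final stage merges the partials and divides. The sample rows and the function names `partial_avg` and `merge_avg` are invented for illustration.

```python
from collections import defaultdict


def partial_avg(rows):
    """Per-node stage: accumulate [sum, count] of l_extendedprice per l_shipmode."""
    acc = defaultdict(lambda: [0.0, 0])
    for shipmode, price in rows:
        acc[shipmode][0] += price
        acc[shipmode][1] += 1
    return acc


def merge_avg(partials):
    """Final stage: add up the partial sums and counts, then divide."""
    total = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for shipmode, (s, c) in partial.items():
            total[shipmode][0] += s
            total[shipmode][1] += c
    return {shipmode: s / c for shipmode, (s, c) in total.items()}


# Hypothetical (l_shipmode, l_extendedprice) rows held by two different nodes.
node_a = [("AIR", 100.0), ("MAIL", 50.0)]
node_b = [("AIR", 300.0), ("AIR", 200.0), ("MAIL", 150.0)]

result = merge_avg([partial_avg(node_a), partial_avg(node_b)])
print(result)
```

Shipping sums and counts rather than per-node averages matters: averaging the averages would give the wrong answer whenever nodes hold different numbers of rows per group.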

  34. Online Schema Changes • Based on Google's F1 paper • State machine, possibly with backfill • Zero downtime
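
A sketch of the kind of state machine slide 34 refers to, using the intermediate states described in the F1 schema-change paper for adding an index (delete-only, then write-only with a backfill, then public). The names and the operation sets below are an illustration of that idea; the states CockroachDB uses internally are not given in the slides and may differ.

```python
# States an index moves through while it is being added, in order.
ADD_INDEX_STATES = ["absent", "delete_only", "write_only", "public"]

# Which operations nodes may apply to the index in each state (illustrative).
ALLOWED_OPERATIONS = {
    "absent": set(),
    "delete_only": {"delete"},
    "write_only": {"delete", "write"},  # existing rows are backfilled in this state
    "public": {"delete", "write", "read"},
}


def next_state(state):
    """Advance one step; each step is safe even if some nodes still use the previous state."""
    position = ADD_INDEX_STATES.index(state)
    if position == len(ADD_INDEX_STATES) - 1:
        raise ValueError("index is already public")
    return ADD_INDEX_STATES[position + 1]


state = "absent"
while state != "public":
    state = next_state(state)
    print(state, sorted(ALLOWED_OPERATIONS[state]))
```

The index only becomes readable in the last step, after every write path already maintains it, which is what allows the change to happen without downtime.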

  35. Questions? jobs@cockroachlabs.com github.com/cockroachdb www.cockroachlabs.com

  36. Other Topics • (New in 2.1) Query optimizer • Testing with Jepsen • Graphical admin UI • Distributed import

  37. Backup/Restore • Distributed • Consistent to a point in time • Incremental
