CockroachDB's Survivability Model


  1. CockroachDB’s Survivability Model: Scalable, Survivable, Consistent, SQL. Presented by Marc Berhault / Engineer @cockroachdb

  2. CockroachDB: Make Data Easy ■ Scalable ■ Survivable ■ Strongly Consistent ■ SQL ■ And... Open Source

  3. Agenda ■ Architecture: SQL layer, Transactions, Sharding, Replication ■ Survivability: Rebalancing, Repairs

  4. Architecture

  5. Architecture (high-level) ■ Abstraction stack: SQL, Transactional KV, Distribution, Replication, Storage ■ In the network: [Diagram: Node 1 and Node 2, each with a SQL layer over storage; the nodes contain stores, and each store holds several ranges]

  6. SQL: CREATE TABLE inventory (id INTEGER PRIMARY KEY, name VARCHAR, quantity INTEGER, INDEX name_index (name)); INSERT INTO inventory VALUES (1, 'Apple', 3);

  7. SQL: Data model ■ Tables ■ Rows ■ Columns ■ Indexes
     inventory (id, name, quantity): (1, Apple, 3), (2, Orange, 12), (3, Cherry, 5), (4, Banana, 7)
     name_index (name, id): (Apple, 1), (Banana, 4), (Cherry, 3), (Orange, 2)

  8. SQL: Key anatomy. INSERT INTO inventory VALUES (1, 'Apple', 3);
     inventory, key format /<table>/<index>/<key>/<column>: /inventory/primary/1/name → Apple; /inventory/primary/1/quantity → 3
     name_index, key format /<table>/<index>/<key>: /inventory/name_index/Apple → 1
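
The key layout on slide 8 can be sketched as a small encoder. This only illustrates the mapping from a row to key-value pairs: the helper name `encode_row` and the slash-separated string keys are assumptions made for readability, not CockroachDB's actual key encoding.

```python
def encode_row(table, primary_key_col, row, indexes):
    """Map one SQL row to KV pairs following the slide's key layout.

    Hypothetical helper for illustration. `indexes` maps an index name
    to the single column it covers.
    """
    pk = row[primary_key_col]
    kvs = {}
    # Primary index: one KV pair per non-key column.
    # Key: /<table>/<index>/<key>/<column>
    for column, value in row.items():
        if column != primary_key_col:
            kvs[f"/{table}/primary/{pk}/{column}"] = value
    # Secondary index: the indexed value is part of the key and the
    # primary key is stored as the value.
    # Key: /<table>/<index>/<key>
    for index_name, column in indexes.items():
        kvs[f"/{table}/{index_name}/{row[column]}"] = pk
    return kvs


kvs = encode_row(
    "inventory",
    "id",
    {"id": 1, "name": "Apple", "quantity": 3},
    {"name_index": "name"},
)
for key in sorted(kvs):
    print(key, "->", kvs[key])
```

The single INSERT from the slide yields three pairs: two under the primary index and one under name_index.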

  9. Transactional KV: consistency ■ Update all keys atomically ■ Track across multiple SQL commands ■ Retry when necessary

  10. Optimistic Concurrency ■ CockroachDB uses optimistic concurrency control for lock-free transactions ■ In case of conflict: the losing transaction restarts
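
The restart-on-conflict behavior can be illustrated with a generic validate-at-commit scheme. This is a minimal sketch of optimistic concurrency control in general, not CockroachDB's actual protocol; the `Store` class, the per-key version counters, and `run_transaction` are all hypothetical.

```python
class ConflictError(Exception):
    """Raised at commit when a key read by the transaction has changed."""


class Store:
    def __init__(self):
        self.data = {}  # key -> (value, version)

    def read(self, key):
        return self.data.get(key, (None, 0))

    def commit(self, read_versions, writes):
        # Validate: every key that was read must still be at the version seen.
        for key, version in read_versions.items():
            if self.read(key)[1] != version:
                raise ConflictError(key)
        # Apply all writes together, bumping each key's version.
        for key, value in writes.items():
            self.data[key] = (value, self.read(key)[1] + 1)


def run_transaction(store, body, max_attempts=5):
    """Run body(read, write) without locks; restart from scratch on conflict."""
    for attempt in range(1, max_attempts + 1):
        read_versions, writes = {}, {}

        def read(key):
            if key in writes:
                return writes[key]
            value, version = store.read(key)
            read_versions.setdefault(key, version)
            return value

        def write(key, value):
            writes[key] = value

        body(read, write)
        try:
            store.commit(read_versions, writes)
            return attempt
        except ConflictError:
            continue  # the losing transaction restarts
    raise RuntimeError("too many restarts")


store = Store()
store.commit({}, {"quantity": 3})
interfered = []


def decrement(read, write):
    current = read("quantity")
    if not interfered:
        # Simulate a concurrent writer committing between our read and commit.
        store.commit({}, {"quantity": 10})
        interfered.append(True)
    write("quantity", current - 1)


attempts = run_transaction(store, decrement)
print(attempts, store.read("quantity")[0])  # 2 9
```

The first attempt fails validation because the value it read was overwritten, so the transaction reruns against the new value and commits on the second attempt.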

  11. Distribution: scalability ■ Route KV commands to the appropriate shards ■ Split batches if necessary

  12. Sharding: Index. Each shard holds a contiguous span of the keyspace.
     Ø-lem: apricot, banana, blueberry, cherry, grape
     lem-pea: lemon, lime, mango, melon, orange
     pea-∞: peach, pear, pineapple, raspberry, strawberry

  13. Sharding: Index. An index maps from key to range ID.
     Index entries: Ø-lem, lem-pea, pea-∞, each pointing to the shard with the same span
     Ø-lem: apricot, banana, blueberry, cherry, grape
     lem-pea: lemon, lime, mango, melon, orange
     pea-∞: peach, pear, pineapple, raspberry, strawberry

  14. Sharding: Split. Split when a shard is too large: here pea-∞ is split into pea-str and str-∞.
     Index entries: Ø-lem, lem-pea, pea-str, str-∞
     Ø-lem: apricot, banana, blueberry, cherry, grape
     lem-pea: lemon, lime, mango, melon, orange
     pea-str: peach, pear, pineapple, raspberry
     str-∞: strawberry, tamarillo, tamarind
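
The index lookup and split on slides 12 to 14 can be sketched with a sorted list of span start keys and a binary search. The `RangeIndex` class is a hypothetical illustration; the empty string stands in for Ø, `None` stands in for ∞, and the size threshold that triggers a split is left out.

```python
import bisect


class RangeIndex:
    """Maps a key to the shard whose span [start, next start) contains it."""

    def __init__(self, split_points):
        # "" stands in for the slide's Ø (start of the keyspace);
        # the last shard runs to the end of the keyspace.
        self.starts = [""] + sorted(split_points)

    def lookup(self, key):
        # Find the last start key that is <= key.
        i = bisect.bisect_right(self.starts, key) - 1
        end = self.starts[i + 1] if i + 1 < len(self.starts) else None
        return self.starts[i], end

    def split(self, split_key):
        # Splitting a shard only adds one boundary to the index.
        if split_key not in self.starts:
            bisect.insort(self.starts, split_key)


index = RangeIndex(["lem", "pea"])
print(index.lookup("banana"))      # ('', 'lem')
print(index.lookup("lemon"))       # ('lem', 'pea')
print(index.lookup("strawberry"))  # ('pea', None)

index.split("str")
print(index.lookup("raspberry"))   # ('pea', 'str')
print(index.lookup("tamarind"))    # ('str', None)
```

Because each shard covers a contiguous span, a lookup is a single binary search over the boundaries, and keys outside the split shard keep resolving to the same span.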

  15. Replication: survivability ■ Each range is replicated to three or more nodes ■ One replica of each range is the leader

  16. Replication ■ Each set of replicas is a Raft group ■ Consistency provided by quorum [Diagram: replicas of Ranges 1 to 3 spread across Nodes 1 to 4]
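
The arithmetic behind quorum-based consistency is short enough to write out. A Raft group makes progress as long as a majority of its replicas is reachable; the function names below are illustrative only.

```python
def quorum(replicas):
    """Smallest majority of a replica set."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """Replicas that can be lost while a majority remains reachable."""
    return replicas - quorum(replicas)


for n in (3, 5, 7):
    print(n, quorum(n), tolerated_failures(n))
# 3 2 1
# 5 3 2
# 7 4 3
```

With three replicas a range survives the loss of one node. Going from three to four replicas adds no extra tolerance, which is one reason odd replica counts are the usual choice.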

  17. Replication: Node storage ■ Data is stored locally in RocksDB ■ Embedded KV database ■ Provides atomic writes to multiple keys ■ Supports ordered scans

  18. Reliability

  19. Reliability ■ Symmetric nodes ■ Auto-balancing ■ Self-healing

  20. Reliability: Rebalancing [Diagram: replicas of Ranges 1 to 3 on Node 1, Node 2, and Node 3]

  21. Reliability: Rebalancing. Adding a new (empty) node. [Diagram: Node 4 joins Nodes 1 to 3 and holds no replicas yet]

  22. Reliability: Rebalancing. A new replica is allocated, data is copied. [Diagram: a replica of Range 3 appears on Node 4]

  23. Reliability: Rebalancing. The new replica is made live, replacing another. [Diagram: the Range 3 replica on Node 4 is live]

  24. Reliability: Rebalancing. The old (inactive) replica is deleted. [Diagram: one of the original Range 3 replicas is gone; Node 4 holds Range 3]

  25. Reliability: Rebalancing. Process continues until nodes are balanced. [Diagram: Node 4 now holds Range 2 and Range 3]
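
The rebalancing sequence on slides 20 to 25 can be simulated with a greedy loop that moves one replica at a time from the fullest node to the emptiest. The count-based heuristic and the starting placement are assumptions for illustration; the slides do not describe how CockroachDB actually chooses which replica to move.

```python
def rebalance(nodes):
    """Move replicas one at a time from the fullest node to the emptiest.

    `nodes` maps node name -> set of range IDs it holds a replica of.
    Returns the list of (range_id, source, target) moves performed.
    """
    moves = []
    while True:
        source = max(nodes, key=lambda n: (len(nodes[n]), n))
        target = min(nodes, key=lambda n: (len(nodes[n]), n))
        if len(nodes[source]) - len(nodes[target]) <= 1:
            return moves  # nodes are balanced
        # Only move a range the target does not already hold. The source
        # holds more replicas than the target, so one always exists.
        range_id = min(nodes[source] - nodes[target])
        nodes[target].add(range_id)     # allocate the new replica, copy data
        nodes[source].remove(range_id)  # then delete the old replica
        moves.append((range_id, source, target))


# Assumed starting point: three full nodes plus a new, empty node.
nodes = {
    "node1": {1, 2, 3},
    "node2": {1, 2, 3},
    "node3": {1, 2, 3},
    "node4": set(),
}
moves = rebalance(nodes)
print(moves)  # [(1, 'node3', 'node4'), (2, 'node2', 'node4')]
print(nodes)
```

Each move adds the new replica before removing the old one, mirroring the allocate, make live, delete order of the slides, so every range keeps three replicas throughout.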

  26. Reliability: Recovery [Diagram: replicas of Ranges 1 to 3 spread across Nodes 1 to 4; Node 4 holds Range 2 and Range 3]

  27. Reliability: Recovery. Losing a node causes recovery of its replicas. [Diagram: one node is marked with an X]

  28. Reliability: Recovery. A new replica gets created on an existing node. [Diagram: the lost node is still marked with an X; Node 4 now holds Ranges 1, 2, and 3]

  29. Reliability: Recovery. Once at full replication, the old replicas are forgotten. [Diagram: only Node 1, Node 3, and Node 4 remain; Node 4 holds Ranges 1, 2, and 3]
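
The recovery steps on slides 26 to 29 can be sketched in the same style: for each replica on the lost node, create a replacement on a surviving node that does not already hold that range, then forget the lost node. The placement rule (least-loaded eligible node) and the starting layout are assumptions, not a description of CockroachDB's repair logic.

```python
def recover(nodes, dead_node):
    """Re-create the replicas of `dead_node` on surviving nodes.

    `nodes` maps node name -> set of range IDs. Returns the list of
    (range_id, target) repairs performed.
    """
    survivors = [n for n in nodes if n != dead_node]
    repairs = []
    for range_id in sorted(nodes[dead_node]):
        # Eligible targets: live nodes without a replica of this range.
        candidates = [n for n in survivors if range_id not in nodes[n]]
        if not candidates:
            continue  # too few nodes; the range stays under-replicated
        target = min(candidates, key=lambda n: (len(nodes[n]), n))
        nodes[target].add(range_id)
        repairs.append((range_id, target))
    # Once the new replicas exist, the old ones are forgotten.
    del nodes[dead_node]
    return repairs


# Assumed layout before the failure; node2 is the node that is lost.
nodes = {
    "node1": {1, 2},
    "node2": {1, 2, 3},
    "node3": {1, 3},
    "node4": {2, 3},
}
repairs = recover(nodes, "node2")
print(repairs)  # [(1, 'node4'), (2, 'node3'), (3, 'node1')]
print(nodes)
```

After the repairs every range is back to three replicas on the three surviving nodes.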

  30. Zone configuration ■ Replication factor (default 3) ■ Geographical location (e.g. 2 in Europe, 1 in US) ■ Machine attributes (SSD vs. disk)
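
The three zone configuration settings can be modeled as constraints that a replica placement either satisfies or not. The `ZoneConfig` structure and the `satisfies` check below are hypothetical and do not reflect CockroachDB's configuration format; they only show how the replication factor, location, and attribute settings combine.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ZoneConfig:
    # Hypothetical structure for illustration only.
    replicas_per_location: dict  # e.g. {"europe": 2, "us": 1}
    required_attributes: set = field(default_factory=set)  # e.g. {"ssd"}

    @property
    def replication_factor(self):
        return sum(self.replicas_per_location.values())


def satisfies(config, placement):
    """`placement` is a list of (location, attributes) pairs, one per replica."""
    if len(placement) != config.replication_factor:
        return False
    # Every replica must sit on a machine with all required attributes.
    if any(not config.required_attributes <= attrs for _, attrs in placement):
        return False
    # The number of replicas per location must match the configuration.
    per_location = Counter(location for location, _ in placement)
    return per_location == Counter(config.replicas_per_location)


config = ZoneConfig({"europe": 2, "us": 1}, {"ssd"})
good = [("europe", {"ssd"}), ("europe", {"ssd"}), ("us", {"ssd"})]
bad = [("europe", {"ssd"}), ("us", {"ssd"}), ("us", {"ssd"})]
print(config.replication_factor, satisfies(config, good), satisfies(config, bad))
# 3 True False
```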

  31. Status: BETA

  32. Status: Beta. Ready for development testing. Roadmap: ■ Stability ■ Performance ■ Distributed SQL ■ Optimized JOINs

  33. Thank You ■ github.com/cockroachdb/cockroach ■ CockroachLabs.com ■ Gitter: cockroachdb ■ @cockroachdb
