
Cassandra: A Decentralized Structured Storage System



  1. Cassandra: A Decentralized Structured Storage System

  2. Motivation
     • Facebook Inbox Search:
       – Billions of writes per day
       – Geographical distribution of servers and users

  3. Data Model
     • A table is a distributed multi-dimensional map indexed by a key
     • Columns are grouped together into sets called column families
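The map-of-maps idea on this slide can be pictured with nested dictionaries. This is only a single-process illustration; the row key, column family names, and column names below are invented for the example.

```python
from collections import defaultdict

# Hypothetical layout: row key -> column family -> column name -> value.
table = defaultdict(lambda: defaultdict(dict))

# One row, indexed by its key, with columns grouped into two families.
table["user42"]["profile"]["name"] = "Alice"
table["user42"]["profile"]["city"] = "Paris"
table["user42"]["inbox"]["msg:1001"] = "hello"

# Reading a whole column family for one row key.
profile = dict(table["user42"]["profile"])
families = sorted(table["user42"])
```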

  4. API
     • insert(table, key, rowMutation)
     • get(table, key, columnName)
     • delete(table, key, columnName)
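A toy, in-memory sketch of what these calls might look like. The three-operation shape (insert, get, delete) follows the Cassandra paper. Everything else is an assumption made for illustration: the table is a plain dict, `row_mutation` is a dict of column families, and `column_name` is a `(family, column)` pair. This is not the real client API.

```python
def insert(table, key, row_mutation):
    """Apply a mutation {family: {column: value}} to the row stored at key."""
    row = table.setdefault(key, {})
    for family, columns in row_mutation.items():
        row.setdefault(family, {}).update(columns)

def get(table, key, column_name):
    """Return one column value, or None when the row or column is missing."""
    family, column = column_name
    return table.get(key, {}).get(family, {}).get(column)

def delete(table, key, column_name):
    """Remove one column from the row; a missing column is ignored."""
    family, column = column_name
    table.get(key, {}).get(family, {}).pop(column, None)

inbox = {}  # hypothetical table
insert(inbox, "user42", {"messages": {"msg:1": "hi", "msg:2": "bye"}})
first = get(inbox, "user42", ("messages", "msg:1"))
delete(inbox, "user42", ("messages", "msg:1"))
after_delete = get(inbox, "user42", ("messages", "msg:1"))
```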

  5. System Architecture: Partitioning
     • Data is partitioned across the cluster using consistent hashing
     • Each node in the system is assigned a random value that represents its position on the ring
     • A data item belongs to the first node whose position is larger than the item’s position
     • Only the direct neighbours are affected when a node joins or leaves
     • An incoming node can alleviate a heavily loaded node
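The lookup rule on this slide can be sketched with a sorted list of node positions. The `Ring` class, the MD5-based `position` function, and the 32-bit ring size are assumptions chosen for the example, not details taken from the slides.

```python
import bisect
import hashlib

RING_SIZE = 2**32  # assumed ring size for this sketch

def position(key: str) -> int:
    """Hash a key to a position on the ring (MD5 is an arbitrary choice here)."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE

class Ring:
    def __init__(self):
        self.tokens = []  # sorted node positions
        self.owner = {}   # position -> node name

    def add_node(self, node, token):
        bisect.insort(self.tokens, token)
        self.owner[token] = node

    def node_for_position(self, pos):
        # First node whose position is larger than pos, wrapping past the end.
        idx = bisect.bisect_right(self.tokens, pos) % len(self.tokens)
        return self.owner[self.tokens[idx]]

    def node_for(self, key):
        return self.node_for_position(position(key))

ring = Ring()
for name, token in [("A", 100), ("B", 200), ("C", 300)]:
    ring.add_node(name, token)
```

Adding a node "D" at position 250 would take over only the positions between 200 and 250 from "C"; lookups that resolve to "A" or "B" are unchanged, which is the "only direct neighbours are affected" property.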

  6. System Architecture: Replication
     • Each data item is replicated at N hosts
     • The coordinator node is in charge of replicating the data items that fall within its range
     • “Rack Unaware”: replicas are placed on the coordinator’s N-1 successors on the ring
     • “Rack Aware” or “Data Centre Aware”: the nodes elect a leader, which assigns a replica range to every node
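A minimal sketch of the "Rack Unaware" placement: the coordinator for a position plus its N-1 clockwise successors. The function name and the use of integer tokens to identify nodes are assumptions for illustration; the rack-aware and data-centre-aware strategies are not covered.

```python
import bisect

def replicas(tokens, key_position, n):
    """Return the tokens of the coordinator and its n-1 successors on the ring."""
    ring = sorted(tokens)
    # Coordinator: first node whose position is larger than the key's position.
    start = bisect.bisect_right(ring, key_position) % len(ring)
    count = min(n, len(ring))  # cannot place more replicas than there are nodes
    return [ring[(start + i) % len(ring)] for i in range(count)]

placement = replicas([100, 200, 300, 400], key_position=250, n=3)
```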

  7. System Architecture: Membership
     • Membership is based on Scuttlebutt, an anti-entropy, gossip-based mechanism
     • Failure detection is used to avoid attempts to communicate with unreachable nodes
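A much-simplified sketch of the two ideas on this slide. The `merge` step shows the core of anti-entropy gossip: two peers exchange versioned state and each keeps the newer entry. It omits Scuttlebutt's digest exchange and flow control. The `unreachable` helper uses a fixed timeout, a deliberate simplification rather than the failure detector Cassandra actually uses. All names and the `(version, value)` state shape are assumptions.

```python
def merge(local, remote):
    """Keep, for each node, the entry with the higher version number."""
    merged = dict(local)
    for node, (version, value) in remote.items():
        if node not in merged or merged[node][0] < version:
            merged[node] = (version, value)
    return merged

def unreachable(last_heard, now, timeout):
    """Nodes not heard from within `timeout` seconds (fixed-timeout simplification)."""
    return {node for node, seen in last_heard.items() if now - seen > timeout}

# State held by two peers before they gossip: node -> (version, status).
peer_a = {"n1": (3, "up"), "n2": (1, "up")}
peer_b = {"n2": (4, "leaving"), "n3": (2, "up")}
view = merge(peer_a, peer_b)

suspects = unreachable({"n1": 10.0, "n2": 2.0}, now=12.0, timeout=5.0)
```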

  8. System Architecture: Bootstrapping
     • When a node starts for the first time, it chooses a random token for its position in the ring
     • This information is then gossiped around the cluster
     • When a node needs to join the cluster, it reads its configuration file, which contains a few contact points within the cluster

  9. System Architecture: Scaling
     • When a new node is added, it is assigned a token such that it can alleviate a heavily loaded node
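One way to pick such a token is to split the range of the most heavily loaded node in half. The slide does not say how the token is chosen, so the midpoint rule, the `load` mapping, and the function name are all assumptions.

```python
def token_for_new_node(tokens, load, ring_size):
    """Pick a token halfway through the range owned by the most loaded node."""
    ring = sorted(tokens)
    heavy = max(ring, key=lambda t: load[t])
    # A node owns the range from its predecessor's token up to its own token.
    prev = ring[ring.index(heavy) - 1]  # index -1 wraps to the last token
    span = (heavy - prev) % ring_size
    if span == 0:  # a single node owns the whole ring
        span = ring_size
    return (prev + span // 2) % ring_size

new_token = token_for_new_node(
    tokens=[100, 200, 300],
    load={100: 5, 200: 50, 300: 7},  # hypothetical per-node load figures
    ring_size=1000,
)
```

With these numbers the node at 200 is the busiest, so the new node lands at 150 and takes over the lower half of that node's range.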

  10. System Architecture: Local Persistence
      • Write:
        – Uses an in-memory data structure
        – The write to the in-memory structure is performed only after a successful write to the commit log
        – When the in-memory data structure grows past a threshold, it dumps itself to disk
      • Read:
        – First look at the in-memory data
        – Then check a bloom filter for each on-disk file in which the key could be
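The write and read paths above can be sketched in a single process. Lists and dicts stand in for the commit log and on-disk files, and the `Store` and `BloomFilter` classes, their sizes, and the entry-count threshold are assumptions for illustration, not Cassandra's implementation.

```python
import hashlib

class BloomFilter:
    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = bytearray(size)

    def _indexes(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for i in self._indexes(key):
            self.bits[i] = 1

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(self.bits[i] for i in self._indexes(key))

class Store:
    def __init__(self, threshold=2):
        self.commit_log = []  # stand-in for an append-only log on disk
        self.memtable = {}    # the in-memory data structure
        self.files = []       # flushed (bloom_filter, data) pairs, newest last
        self.threshold = threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # log first, then memory
        self.memtable[key] = value
        if len(self.memtable) > self.threshold:
            self._flush()

    def _flush(self):
        bloom = BloomFilter()
        for key in self.memtable:
            bloom.add(key)
        self.files.append((bloom, dict(self.memtable)))
        self.memtable = {}

    def read(self, key):
        if key in self.memtable:  # in-memory data first
            return self.memtable[key]
        for bloom, data in reversed(self.files):  # newest file first
            if bloom.might_contain(key) and key in data:
                return data[key]
        return None

store = Store(threshold=2)
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    store.write(k, v)  # the third write pushes the memtable past the threshold
store.write("a", 9)    # a newer value for "a" now lives in memory
```

The bloom filter lets a read skip files that definitely do not hold the key, so only files that might contain it are actually consulted.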
