Cassandra
Jonathan Ellis
Motivation
Scaling reads to a relational database is hard
Scaling writes to a relational database is virtually impossible
and when you do, it usually isn't relational anymore
The new face of data
CAP theorem
Consistency
Availability
Partition tolerance
Two famous papers
Bigtable: A distributed storage system for structured data, 2006
Dynamo: Amazon's highly available key-value store, 2007
Two approaches
Bigtable: "... db on top of GFS?"
Dynamo: "... hash table appropriate for the data center?"
10,000 ft summary
similar to Bigtable's
Cassandra highlights
Tunable tradeoffs between consistency and latency
Dynamo architecture & Lookup
Architecture details
Architecture layers
Messaging service, Gossip, Failure detection, Cluster state, Partitioner, Replication
Commit log, Memtable, SSTable, Indexes, Compaction
Tombstones, Hinted handoff, Read repair, Bootstrap, Monitoring, Admin tools
Writes
SSTable
Memtable / SSTable
Commit log
Disk
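The write path named on these slides (commit log, then memtable, then a flush to an SSTable on disk) can be illustrated with a toy in-memory model. This is a minimal sketch, not Cassandra's implementation: the class name `ToyStore`, the `flush_threshold` parameter, and the use of Python lists in place of files are all assumptions made for illustration.

```python
class ToyStore:
    """Toy log-structured store: commit log -> memtable -> immutable SSTables."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical knob: keys per memtable
        self.commit_log = []  # stands in for the append-only log file
        self.memtable = {}    # in-memory key -> value
        self.sstables = []    # each entry is a sorted list of (key, value) pairs

    def write(self, key, value):
        # 1. Append to the commit log first, so the write survives a crash.
        self.commit_log.append((key, value))
        # 2. Apply the write to the memtable.
        self.memtable[key] = value
        # 3. Flush once the memtable holds enough keys.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memtable out in key order as a new SSTable, then start fresh.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log = []  # flushed entries no longer need replay

    def read(self, key):
        # Check the memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None
```

Writes never modify an existing SSTable in this sketch; a newer value simply shadows the older one until the tables are merged.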
SSTable format
SSTable Indexes
(Similar to Hadoop MapFile / TFile)
Compaction
Remove
to suppress data in older SSTables, until compaction
more
tombstone GC, after which tombstones are not repaired
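The interaction between tombstones, older SSTables, and compaction can be sketched as follows. This is a simplified illustration under stated assumptions, not Cassandra's code: each SSTable is modeled as a dict of key -> (value, timestamp), and the names `TOMBSTONE`, `read`, `compact`, `now`, and `gc_grace` are hypothetical.

```python
TOMBSTONE = object()  # marker value meaning "this key was removed"


def read(sstables, key):
    """Return the newest value for key, or None if absent or tombstoned."""
    newest = None
    for table in sstables:
        if key in table and (newest is None or table[key][1] > newest[1]):
            newest = table[key]
    if newest is None or newest[0] is TOMBSTONE:
        return None
    return newest[0]


def compact(sstables, now, gc_grace):
    """Merge SSTables, keeping only the newest entry per key.

    Tombstones older than gc_grace are dropped from the merged result.
    """
    merged = {}
    for table in sstables:
        for key, (value, ts) in table.items():
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
    return {
        key: (value, ts)
        for key, (value, ts) in merged.items()
        if not (value is TOMBSTONE and now - ts > gc_grace)
    }
```

Until compaction runs, the tombstone in the newer table is what hides the value in the older one; once the tombstone itself is collected, nothing remains to propagate the delete to a replica that missed it.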
Cassandra write properties
Read path
background and perform read repair
Cassandra read properties
SSTables
Consistency in a BASE world
vs MySQL with 50GB of data
Data model
ColumnFamilies
keyA: column1, column2, column3
keyC: column1, column7, column11
Column: Byte[] Name, Byte[] Value, I64 timestamp
Super ColumnFamilies
keyF: Super1 (column, column, column), Super2 (column, column, column)
keyJ: Super1 (column, column, column), Super5 (column, column, column)
Types of queries
Range queries
Modification
Thrift
struct Column {
  1: binary name,
  2: binary value,
  3: i64 timestamp,
}

struct SuperColumn {
  1: binary name,
  2: list<Column> columns,
}

Column get_column(table, key, column_path, block_for=1)
list<string> get_key_range(table, column_family, start_with="", stop_at="", max_results=100)
void insert(table, key, column_path, value, timestamp, block_for=0)
void remove(tablename, key, column_path_or_parent, timestamp)
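To show why every column and every modification carries a timestamp, here is a toy in-memory model of `insert`, `remove`, and `get_column`. It is a sketch under assumptions, not a Thrift client: the `ToyTable` class is hypothetical, it takes a plain column name instead of a `column_path`, it ignores `block_for`, and it uses `None` as the value of a removed column.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Column:
    name: bytes
    value: Optional[bytes]  # None marks a removed column in this toy model
    timestamp: int


class ToyTable:
    def __init__(self):
        self.rows = {}  # row key -> {column name -> Column}

    def insert(self, key, name, value, timestamp):
        self._apply(key, Column(name, value, timestamp))

    def remove(self, key, name, timestamp):
        self._apply(key, Column(name, None, timestamp))

    def _apply(self, key, column):
        # A change takes effect only if its timestamp is strictly newer
        # than what is already stored for that column.
        row = self.rows.setdefault(key, {})
        current = row.get(column.name)
        if current is None or column.timestamp > current.timestamp:
            row[column.name] = column

    def get_column(self, key, name):
        column = self.rows.get(key, {}).get(name)
        if column is None or column.value is None:
            return None
        return column
```

Because the outcome depends only on timestamps, the same set of inserts and removes gives the same result in whatever order they arrive, as long as no two conflicting changes share a timestamp.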
Honestly, Thrift kinda sucks
Example: a multiuser blog
Two queries
given blog, in reverse chronological order
chronological order
First try
JBE blog: "Cassandra is teh awesome" (post, comment, comment), "BASE FTW" (post, comment, comment)
Evan blog: "I like kittens" (post, comment, comment), "And Ruby" (post, comment, comment)
<ColumnFamily Type="Super" CompareWith="TimeString" CompareSubcolumnsWith="UUID" Name="Blog"/>
Second try
<ColumnFamily CompareWith="UUIDType" Name="Blog"/>
JBE blog: "Cassandra is teh awesome", "BASE FTW"
Evan blog: "I like kittens", "And Ruby"
<ColumnFamily CompareWith="UUIDType" Name="Comment"/>
Cassandra is teh awesome: comment, comment
BASE FTW: comment, comment
I like kittens: comment, comment
And Ruby: comment, comment
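The two-ColumnFamily layout can be mimicked with plain dicts to show how time-ordered column names answer both queries. This is a rough sketch under assumptions: the helper names are hypothetical, Python's version 1 (time-based) UUIDs stand in for the UUID column names, comment rows are keyed by the post's UUID, and sorting happens at read time instead of in the store.

```python
import uuid


def add_post(blog_cf, blog_key, title):
    # Column name is a time-based UUID, column value is the post.
    post_id = uuid.uuid1()
    blog_cf.setdefault(blog_key, {})[post_id] = title
    return post_id


def add_comment(comment_cf, post_id, text):
    # One row per post (keyed by post UUID here), one column per comment.
    comment_id = uuid.uuid1()
    comment_cf.setdefault(post_id, {})[comment_id] = text
    return comment_id


def latest_posts(blog_cf, blog_key, count=10):
    """Posts for a blog as (uuid, title) pairs, newest first."""
    row = blog_cf.get(blog_key, {})
    ordered = sorted(row.items(), key=lambda item: item[0].time, reverse=True)
    return ordered[:count]


def comments_for(comment_cf, post_id):
    """Comment texts for a post, oldest first."""
    row = comment_cf.get(post_id, {})
    return [text for _, text in sorted(row.items(), key=lambda item: item[0].time)]
```

Splitting comments into their own ColumnFamily means fetching the latest posts does not have to pull every comment along with them, which is what the single super ColumnFamily of the first try would require.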
Roadmap
Cassandra 0.3
Test suite
Cassandra 0.4
Cassandra 0.5
Users
Production: Facebook, RocketFuel
Production RSN: Digg, Rackspace
No date yet: IBM Research, Twitter
Evaluating: 50+ in #cassandra on freenode
More
http://www.allthingsdistributed.com/2008/1
http://www.vimeo.com/5145059
http://wiki.apache.org/cassandra/ArticlesAn