The State of Databases in 2019
Dinesh A. Joshi
@dineshjoshi dinesh.joshi@gatech.edu
apache cassandra
About Me
Senior Software Engineer
Apache Cassandra Committer
> 10 years of experience in Distributed Systems
MS CS (Distributed Systems),
Data Trends 📋
Data Growth
Source: https://www.seagate.com/www-content/our-story/trends/files/Seagate-WP-DataAge2025-March-2017.pdf
Data Criticality
Source: https://www.seagate.com/www-content/our-story/trends/files/Seagate-WP-DataAge2025-March-2017.pdf
Data Growth Fuel ⛽
Time Series!
Database Landscape 2019
Choices? 🧑
Operators & Developers
Developers Operators Both
Not always aligned!
Cascading Costs
Layers: DB, Access Layer, Services (REST, gRPC), UI / Presentation
Cost scale: $$$ to $
Polyglot Persistence
"Polyglot persistence is the concept of using different data storage technologies to handle different data storage needs within a given software application" – Wikipedia
Source: https://en.wikipedia.org/wiki/Polyglot_persistence
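To make the definition concrete, here is a minimal sketch of one application using a different store for each data need. Everything in it is an assumption for illustration and does not come from the talk: the `OrderService` class, its method names, and the choice of stores are hypothetical. An in-memory SQLite database stands in for the relational store, a dict for a key-value store, and a list for a time-series store.

```python
import sqlite3
import time


class OrderService:
    """Hypothetical service that picks a different store per data need."""

    def __init__(self):
        # Relational store: orders need transactional integrity.
        self.sql = sqlite3.connect(":memory:")
        self.sql.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
        )
        # Key-value stand-in (a dict): fast lookups by session id.
        self.sessions = {}
        # Time-series stand-in (an append-only list): timestamped metrics.
        self.metrics = []

    def place_order(self, session_id, customer, total):
        # The connection context manager commits on success
        # and rolls back if the block raises.
        with self.sql:
            cur = self.sql.execute(
                "INSERT INTO orders (customer, total) VALUES (?, ?)",
                (customer, total),
            )
        order_id = cur.lastrowid
        self.sessions[session_id] = {"last_order": order_id}
        self.metrics.append((time.time(), "order_total", total))
        return order_id

    def order_total(self, order_id):
        row = self.sql.execute(
            "SELECT total FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        return row[0] if row else None


svc = OrderService()
oid = svc.place_order("s1", "alice", 42.5)
print(oid, svc.order_total(oid), svc.sessions["s1"])
```

In a real deployment each stand-in would be a separate database product, which is where the operational cost of polyglot persistence comes from: every extra store has to be deployed, monitored, and backed up.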
Polyglot Persistence
Source: https://www.infoq.com/presentations/microservices-polyglot-persistence
Database Landscape 2019
Landscape 2019
Relational Databases
NoSQL Databases
LevelDB
Industry Trends
SQL
Source: Google Trends
Relational
Source: https://db-engines.com/en/ranking_categories
Graph vs Relational
Source: https://db-engines.com/en/ranking_categories
Time Series, Wide Column DBs
Source: https://db-engines.com/en/ranking_categories
Popularity Trends
Source: https://db-engines.com/en/ranking_categories
Popularity Trends (All)
Source: https://db-engines.com/en/ranking_categories
Apache Cassandra
Manage massive amounts of data, fast, without losing sleep!
Source: http://cassandra.apache.org/
What is MASSIVE Scale?
DURABLE
Source: http://cassandra.apache.org/
What is FAST?
Source: http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf
LINEAR SCALABILITY
RELIABILITY?
Cassandra Origins
Dynamo, BigTable
CAP Theorem
Consistency, Availability, Partition Tolerance
Apache Cassandra 4.0
What's new?
Cassandra 4.0 Changes
Reliability & Stability 🐏
Scalability
Throughput vs Cluster Size
[Chart: Throughput (RPS) vs. Cluster Size (# of nodes); annotation: ~1000 nodes]
Time to recover (4.0 vs 3.x)
[Chart: Time to recover (minutes) by AWS Instance Type (i3.2xl, i3.4xl, i3.8xl); series: trunk, 3.x]
Source: https://issues.apache.org/jira/browse/CASSANDRA-14765
Netty OpenSSL vs JDK SSL
Source: https://speakerdeck.com/normanmaurer/netty-one-framework-to-rule-them-all?slide=29
Cassandra Networking (4.0 vs Pre 4.0)
Contribute
Questions?