A Cloud-native Architecture for Replicated Data Services
Hemant Saxena, Jeffery Pound University of Waterloo, SAP Labs Waterloo
Outline
➢ Problem overview
➢ Solution overview
○ Kafka
○ Cassandra
➢ Evaluation
Problem overview
➢ The Cloud has become the de facto standard for deploying applications
➢ However, applications designed for on-premise infrastructure find it challenging to leverage Cloud storage efficiently, because:
○ On-premise, data replication provides fault-tolerance (FT) and high availability (HA)
○ Cloud storage already uses replication to provide FT and HA
○ This makes the application's replication redundant, resulting in additional storage cost
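The cost of redundant replication can be illustrated with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the number of copies the storage service keeps internally (`storage_copies`) is an assumed value, not a figure from the talk.

```python
def physical_copies(app_replicas: int, storage_copies: int) -> int:
    """Copies of each record on disk when every application replica
    writes its own copy to storage that replicates internally."""
    return app_replicas * storage_copies


def savings_factor(app_replicas: int, storage_copies: int) -> float:
    """Ratio between the redundant layout and a layout in which the
    application stores a single copy on the (already replicated) storage."""
    redundant = physical_copies(app_replicas, storage_copies)
    deduplicated = physical_copies(1, storage_copies)
    return redundant / deduplicated


# Assumed example: application replication factor 3, storage keeps 3 copies.
print(physical_copies(3, 3))   # copies per record with redundant replication
print(savings_factor(3, 3))    # equals the application's replication factor
```

Whatever the storage service does internally, the ratio reduces to the application's replication factor, which is why removing application-level copies is where the saving comes from.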
[Diagram: client; replicated application; replica-set]
[Diagram: client; replicated application with replication across its replica-set; storage service with its own replication; redundant replicas; additional storage cost]
We ask the following research question...
application-level replication)
redundant replication
available.
➢ We show how the well-known main-delta architecture can be used to leverage cloud storage efficiently
○ i.e. ensure no redundant replication
○ while maintaining the fault-tolerance and availability guarantees of the applications
➢ We show that incorporating the main-delta architecture into existing on-premise applications is easy
○ by controlling how buffers are managed and flushed to storage
○ and it is compatible with the whole spectrum of replication strategies
➢ The main-delta architecture was originally designed for efficiently handling mixed read/update workloads
➢ Two parts
○ Static, read-only, read-optimized main
○ Small, write-optimized delta
○ Deltas are merged with the main at regular intervals
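A minimal sketch of the two-part structure may help make it concrete. The class and parameter names below are hypothetical, and the merge is triggered by a size threshold rather than a timer purely to keep the example deterministic; the slides only say that deltas are merged at regular intervals.

```python
class MainDeltaStore:
    """Toy key-value store with a read-optimized main and a small delta."""

    def __init__(self, merge_threshold: int = 4):
        self.main = {}    # static between merges, only replaced by merge()
        self.delta = {}   # small buffer that absorbs all writes
        self.merge_threshold = merge_threshold  # assumed merge trigger

    def put(self, key, value):
        # Writes never touch the main directly.
        self.delta[key] = value
        if len(self.delta) >= self.merge_threshold:
            self.merge()

    def get(self, key):
        # The delta holds the newest version, so it is consulted first.
        if key in self.delta:
            return self.delta[key]
        return self.main.get(key)

    def merge(self):
        # Build a new main from the old main plus the delta, then reset the delta.
        merged = dict(self.main)
        merged.update(self.delta)
        self.main = merged
        self.delta = {}


store = MainDeltaStore(merge_threshold=2)
store.put("a", 1)
store.put("b", 2)      # reaches the threshold, so the delta is merged
store.put("a", 3)      # newer version lives in the delta until the next merge
print(store.get("a"), store.main, store.delta)
```

Reads stay correct between merges because the delta shadows the main for any key it contains.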
[Diagram labels: replica-set; maintained by application; Cloud storage (which is fault-tolerant); M M M]
How to merge the deltas?
➢ The details are in how the delta is merged into the main, such that
○ No data is lost from any delta
○ And applications have the same guarantees as in an on-premise deployment
➢ The delta-merge strategy depends on the replication strategy
○ A single primary node means a single delta to merge
○ Multiple primary nodes mean multiple deltas to merge
▪ Write to primary, read from any
▪ Write to any, read from any (e.g. quorum)
▪ Write to primary, read from primary
[Diagram: request-handler and replica-set for each strategy]
deltas, on-disk data as main.
delta to main. Other replicas will discard their deltas when they are full.
new primary node takes the responsibility of merging deltas.
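The single-primary behaviour described here (only the primary merges its delta, the other replicas discard theirs when full, and a new primary takes over merging after a failure) can be sketched as follows. This is a simplified model and not Kafka code: the `Replica` and `ReplicaSet` classes are hypothetical, and it assumes every replica receives every record synchronously and in the same order.

```python
class Replica:
    """One node of the replica-set with an in-memory delta buffer."""

    def __init__(self, name: str, capacity: int):
        self.name = name
        self.capacity = capacity   # assumed buffer size that triggers a flush
        self.delta = []


class ReplicaSet:
    def __init__(self, names, capacity: int = 3):
        self.replicas = {name: Replica(name, capacity) for name in names}
        self.primary = names[0]
        self.main = []   # the single copy kept on fault-tolerant cloud storage

    def append(self, record):
        for replica in self.replicas.values():
            replica.delta.append(record)
            if len(replica.delta) >= replica.capacity:
                if replica.name == self.primary:
                    # Only the primary merges its delta into the main.
                    self.main.extend(replica.delta)
                # Primary and followers alike drop the full delta.
                replica.delta.clear()

    def fail_primary(self):
        # The next surviving replica becomes primary; its delta still holds
        # every record that has not yet been merged into the main.
        del self.replicas[self.primary]
        self.primary = next(iter(self.replicas))


rs = ReplicaSet(["a", "b", "c"], capacity=2)
rs.append("r1")
rs.fail_primary()      # "a" fails before its delta was merged
rs.append("r2")        # new primary "b" fills its delta and merges it
print(rs.primary, rs.main)
```

In this model no record is lost on failover because a follower only discards its delta at the same point at which the primary merges the identical records.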
be easily leveraged as delta and main.
the delta is tricky:
○ Each node can have different set
[Diagram labels: replica-set; M M M; cloud storage]
combines the deltas and merges them into the main
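When writes can go to any node, each node's delta may hold a different set of updates, so the deltas have to be combined before they are merged. The sketch below shows one possible way to do that, assuming each update carries a timestamp and the newest timestamp wins; the slides do not specify the reconciliation rule, and the function names and data layout are hypothetical.

```python
def combine_deltas(deltas):
    """Union several deltas, keeping the newest version of each key.

    Each delta maps key -> (timestamp, value). Ties keep the first version seen.
    """
    combined = {}
    for delta in deltas:
        for key, (timestamp, value) in delta.items():
            if key not in combined or timestamp > combined[key][0]:
                combined[key] = (timestamp, value)
    return combined


def merge_into_main(main, deltas):
    """Return a new main containing the old main plus all combined deltas."""
    return combine_deltas([main, *deltas])


main = {"x": (1, "old")}
node_deltas = [
    {"x": (3, "new")},               # node 1 saw the latest write to x
    {"x": (2, "mid"), "y": (1, "b")},  # node 2 saw an older x and a write to y
    {},                              # node 3 saw neither
]
print(merge_into_main(main, node_deltas))
```

Because the merge takes the union of all deltas, an update that reached only one node still ends up in the main.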
keeping the performance the same
○ Used real Cloud infrastructure: Amazon Web Services (AWS)
○ Tested different storage types: EBS and EFS
○ md-kafka: main-delta architecture based Kafka implementation
○ kafka: vanilla Kafka
○ Replication factor 3x
○ Savings by design
storage (EBS)
EFS storage, due to batching
○ md-cassandra-efs: main-delta based Cassandra using EFS storage
○ cassandra-ebs: vanilla Cassandra using EBS
○ cassandra-efs: vanilla Cassandra using EFS
○ With replication factor of 3x
types of workloads
➢ Existing on-premise applications (with replication) end up with redundant replication when deployed on the cloud
➢ We proposed a main-delta based cloud-native architecture to solve this problem
○ Allowing for storage cost savings of up to a factor of k (the application's replication factor)
➢ We show our approach is general enough to work with the complete spectrum of replication strategies
○ Simplest strategy: single primary (Kafka case study)
○ Complex strategy: quorum-based systems (Cassandra case study)
Contact for any follow-up questions: Hemant Saxena email: hemant.saxena@uwaterloo.ca