Cloud Data Management, Felix Gessert, December 18, 2018, Universität Hamburg - PowerPoint PPT Presentation



SLIDE 1

Low Latency for Cloud Data Management

Felix Gessert December 18, 2018, Universität Hamburg, DBIS Group


SLIDE 3

Web Performance

An Open Challenge With a Huge Impact

100 ms faster → +1% Revenue = $1.7 Billion

500 ms faster → +20% Ad Sales = $19 Billion

Greg Linden. Make Data Useful. 2006.

Tammy Everts. Time Is Money: The Business Value of Web Performance. O’Reilly Media, 2016.

SLIDE 4

Cloud-Based Web Applications

Three Sources of Page Load Time

  • 1. Backend Processing
  • 2. Network Delays
  • 3. Frontend Rendering

SLIDE 5

Latency is the Problem

Throughput vs. Latency

[Figure: Page Load Time (ms) vs. Bandwidth in MBit/s, from 1 to 10 MBit/s at 60 ms latency]

[Figure: Page Load Time (ms) vs. Latency in ms, from 240 ms down to 20 ms at 5 MBit/s bandwidth]

Mike Belshe. More Bandwidth Doesn’t Matter (much). Technical report, Google Inc., 2010.

SLIDE 6

Latency is the Problem

Throughput vs. Latency

2× Throughput = Same Load Time

½ Latency ≈ ½ Load Time

Mike Belshe. More Bandwidth Doesn’t Matter (much). Technical report, Google Inc., 2010.

SLIDE 7

Problem Statement

Four Challenges

  • 1. Latency of Dynamic Data
  • 2. Direct Client Access
  • 3. Transaction Abort Rates
  • 4. Polyglot Persistence

SLIDE 8

Problem Statement

Research Question

How can the latency of retrieving dynamic data from cloud services be minimized in an application- and database-independent way while maintaining strict consistency guarantees?

SLIDE 9

Problem Statement

Four Challenges

  • 1. Latency
  • 2. Direct Access
  • 3. Transactions
  • 4. Polyglot Persistence

SLIDE 10

Outline

  • 1. Background & Motivation: End-to-End Latency in Cloud-based Architectures
  • 2. Cloud Data Management Middleware: Providing Modern NoSQL Systems as a Low-Latency DBaaS
  • 3. Caching Dynamic Data: Cache Sketches: Solving Staleness of Reads and Queries
  • 4. Outlook and Summary: The Future of Polyglot Data Management in the Cloud

SLIDE 11

Why is end-to-end latency an open problem?

SLIDE 12

Background & Motivation

Frontend, Network, Cloud, Data Management

SLIDE 13

Background & Motivation

Frontend, Network, Cloud, Data Management

State of the Art:
  • Web interface based on HTML, files, APIs, and application logic
  • Performance defined by critical rendering path

Problems:
  • No direct access or integration to data management
  • Storage maintained manually

SLIDE 14

Background & Motivation

Frontend, Network, Cloud, Data Management

State of the Art:
  • Service interfaces use REST & HTTP
  • Web caching can reduce end-to-end latency

Problem:
  • Web caching not compatible with consistent dynamic data

SLIDE 15

Background & Motivation

Frontend, Network, Cloud, Data Management

State of the Art:
  • SaaS, PaaS, and IaaS models
  • Scalability & multi-tenancy

Problems:
  • Combination of cloud services entails high latency
  • Common application building blocks often re-implemented

SLIDE 16

Background & Motivation

Frontend, Network, Cloud, Data Management

State of the Art:
  • Scalability & high availability through NoSQL systems
  • Sharding & replication
  • Database-as-a-Service (DBaaS) model

Problems:
  • Lack of common data management abstractions
  • DBaaS model not supported
  • Polyglot persistence manual & error-prone

SLIDE 17

Background & Motivation

Data Management

Data: Volatile Data, Large Data Sets, Critical Data, Static Files, Nested Data

Databases: RDBMS, Document Store, Key-Value Store, Wide-Column Store, Distributed File System

Challenge: How to tackle the data → database mapping problem?

SLIDE 18

Data Management Techniques

Functional Requirements: Scan Queries, ACID Transactions, Conditional or Atomic Writes, Joins, Sorting, Filter Queries, Full-text Search, Aggregation and Analytics

Non-Functional Requirements: Data Scalability, Write Scalability, Read Scalability, Elasticity, Consistency, Write Latency, Read Latency, Write Throughput, Read Availability, Write Availability, Durability

Techniques:
  • Sharding: Range-Sharding, Hash-Sharding, Entity-Group Sharding, Consistent Hashing, Shared-Disk
  • Replication: Commit/Consensus Protocol, Synchronous, Asynchronous, Primary Copy, Update Anywhere
  • Storage Management: Logging, Update-in-Place, Caching, In-Memory Storage, Append-Only Storage
  • Query Processing: Global Secondary Indexing, Local Secondary Indexing, Query Planning, Analytics Framework, Materialized Views

[GWFR16]

SLIDE 19

NoSQL Toolbox

Decision Tree

Access:
  • Fast Lookups, Volume: RAM → Redis, Memcache (example application: Cache)
  • Fast Lookups, Volume: Unbounded, CAP: AP → Cassandra, Riak, Voldemort, Aerospike (example application: Shopping basket)
  • Fast Lookups, Volume: Unbounded, CAP: CP → HBase, MongoDB, CouchBase, DynamoDB (example application: Order History)
  • Complex Queries, Volume: HDD-Size, Consistency: ACID → RDBMS, Neo4j, RavenDB, MarkLogic (example application: OLTP)
  • Complex Queries, Volume: HDD-Size, Consistency: Availability → CouchDB, MongoDB, SimpleDB (example application: Website)
  • Complex Queries, Volume: Unbounded, Query Pattern: Ad-hoc → MongoDB, RethinkDB, HBase, Accumulo, ElasticSearch, Solr (example application: Social Network)
  • Complex Queries, Volume: Unbounded, Query Pattern: Analytics → Hadoop, Spark, Parallel DWH, Cassandra, HBase, Riak, MongoDB (example application: Big Data)

[GWFR16]
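
The decision tree can be read as a nested lookup. The following is a minimal sketch under one assumed reading of the slide: access pattern first, then volume, then CAP, consistency, or query pattern. The dictionary layout and the `candidates` helper are illustrative names and not part of the talk.

```python
# Hypothetical encoding of the decision tree: inner dicts are decisions,
# lists are the candidate systems named on the slide.
DECISION_TREE = {
    "fast lookups": {
        "ram": ["Redis", "Memcache"],
        "unbounded": {
            "ap": ["Cassandra", "Riak", "Voldemort", "Aerospike"],
            "cp": ["HBase", "MongoDB", "CouchBase", "DynamoDB"],
        },
    },
    "complex queries": {
        "hdd-size": {
            "acid": ["RDBMS", "Neo4j", "RavenDB", "MarkLogic"],
            "availability": ["CouchDB", "MongoDB", "SimpleDB"],
        },
        "unbounded": {
            "ad-hoc": ["MongoDB", "RethinkDB", "HBase", "Accumulo",
                       "ElasticSearch", "Solr"],
            "analytics": ["Hadoop", "Spark", "Parallel DWH", "Cassandra",
                          "HBase", "Riak", "MongoDB"],
        },
    },
}


def candidates(*answers):
    """Walk the tree with one answer per decision and return the leaf list."""
    node = DECISION_TREE
    for answer in answers:
        if isinstance(node, list):
            raise ValueError(f"unexpected extra answer: {answer!r}")
        node = node[answer.lower()]
    if isinstance(node, dict):
        raise ValueError(f"more answers needed, options: {sorted(node)}")
    return node
```

For example, `candidates("Fast Lookups", "RAM")` yields the in-memory stores, while `candidates("Fast Lookups")` raises because the volume decision is still open.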

SLIDE 20

Backend-as-a-Service, Database-as-a-Service

NoSQL Toolbox, Decision Tree, Unified Data Management API

Functionality:
  • Object Persistence
  • Query Support
  • Transaction Processing
  • Data Validation
  • Partial Updates
  • Schema Management
  • Code Execution
  • Access Control
  • Indexing & Configuration

[GFW+14]

SLIDE 21

How can cloud data management be unified & combined with low latency?

SLIDE 22

Orestes: Goals

A Data Management Middleware for Low Latency

  • Database Independence
  • Low Latency with Tunable Consistency
  • Scalable, Available, Multi-Tenant
  • DBaaS & BaaS Functionality

SLIDE 23

Orestes Concept

Overview

[Figure: Orestes architecture overview with the following labels]
  • Web and Mobile Applications
  • Unified REST API
  • DBaaS/BaaS Middleware
  • Data and Default Modules
  • Web Caching for Low Latency
  • Scalable Data Management Platform (Multi-Tenancy, Scaling, Caching, Failover, …)
  • Heterogeneous Data Stores
  • Unmodified Database Systems

[GBR14, GB13]

SLIDE 24

How can dynamic data be accelerated through web caching?

SLIDE 25

The Web’s Caching Model

Expiration-Based Caches:
  • An object x is considered fresh for TTL_x seconds
  • Server assigns TTLs for each object
  • Examples: Browser Caches, Forward Proxies, ISP Caches

Invalidation-Based Caches:
  • Expose object eviction operation to the server
  • Examples: Content Delivery Networks, Reverse Proxies

Request Path: Client → Expiration-based Caches → Invalidation-based Caches → Server/DB

[Figure: request path with the labels "Cache Hits" and "Invalidations, Objects"]

[GSW+15]

SLIDE 26

Web Caching for Data Management

Overview of Cache Sketch Method

Data Cached for Fixed TTL. Without Cache Sketch: Stale Cached Data

[Figure: client, Expiration Cache, Invalidation Cache, and server, with numbered steps labeled "Validate Freshness" (Compact Cache Sketch), "Add to Server Cache Sketch", and "invalidate"]

[GSW+15, GSW+17]

SLIDE 27

The Cache Sketch Approach

Request Path: Client → Expiration-based Caches → Invalidation-based Caches → Server/DB

Client Cache Sketch (Needs Revalidation?):
  • Bloom filter (e.g. 10101010)
  • Retrieved at connect, periodically every Δ seconds, and at transaction begin
  • Goal: Minimize Staleness

Server Cache Sketch (Needs Invalidation?):
  • Counting Bloom Filter (e.g. 10201040) of Non-expired Object Keys
  • Report Expirations and Writes
  • Goal: Minimize Invalidations

Applications:
  • 1. Initialization from Cache
  • 2. Δ-Atomic Consistency
  • 3. Cache-Aware Transactions
  • 4. Invalidation Minimization

[GSW+15]
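
A counting Bloom filter keeps a small counter per position instead of a single bit, which makes it possible to remove a key again once its cached copies have expired. The sketch below is a generic counting Bloom filter and not the Orestes implementation. The class name, the SHA-256-based hashing, and the default sizes are assumptions chosen for illustration.

```python
import hashlib


class CountingBloomFilter:
    """Generic counting Bloom filter (illustrative, not the Orestes code)."""

    def __init__(self, m=64, k=3):
        self.m = m                  # number of counters
        self.k = k                  # number of hash functions
        self.counters = [0] * m

    def _positions(self, key):
        # Derive k positions by salting one hash function with the index i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        # Called when an object is written while cached copies may still exist.
        for p in self._positions(key):
            self.counters[p] += 1

    def remove(self, key):
        # Called once every cached copy has expired. Only remove keys that
        # were added before, otherwise other entries get corrupted.
        for p in self._positions(key):
            if self.counters[p] > 0:
                self.counters[p] -= 1

    def to_bits(self):
        # Flatten to a plain Bloom filter, the form a client would download.
        return [1 if c > 0 else 0 for c in self.counters]
```

Flattening via `to_bits()` corresponds to the difference between the two example patterns on the slide: counters such as 10201040 on the server become bits such as 10101010 for the client.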

SLIDE 28

Cache Sketch

Main Properties

To ensure Δ-atomicity, the Cache Sketch at time t contains key(x) of every object x that was written before it expired in all caches.

[Figure: timeline with a read r(x) at t1 that caches x until t1 + TTL, a write w(x) at t2, the Cache Sketch c_t retrieved at time t, and a read r(x) at t3; Δ is the staleness bound]

timespan for which x c

[GSW+15, GSW+17]

SLIDE 29

Cache Sketch

Construction

To ensure compactness, the Cache Sketch stores n keys in a Bloom filter with m bits, k hash functions, and a false positive rate of $f \approx \left(1 - e^{-kn/m}\right)^{k}$.

Example: 20 000 entries & 5% false positives → 11 KB in size

Lookup: key → find(key) in the Client Cache Sketch, where the hash functions h1 ... hk select k of the m Bloom filter bits
  • Bits = 1? yes → Revalidation
  • Bits = 1? no → GET request (Cache Hit or Miss)

[GSW+15, GSW+17]
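
The client-side lookup can be sketched as follows: if all k bits for a key are set, the object may be stale and is revalidated; otherwise a normal GET is allowed to hit the caches. This is an illustration under assumptions. The function names, the bit-list representation, and the SHA-256-based hashing are hypothetical, and `false_positive_rate` evaluates the standard Bloom filter approximation.

```python
import hashlib
import math


def positions(key, m, k):
    """k bit positions for a key, derived from one salted hash function."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big") % m
        for i in range(k)
    ]


def add(bits, key, k):
    """Mark a key as potentially stale (normally done by the server)."""
    for p in positions(key, len(bits), k):
        bits[p] = 1


def plan_request(bits, key, k):
    """Decide how the client should load a key given its Cache Sketch."""
    if all(bits[p] for p in positions(key, len(bits), k)):
        return "revalidate"      # possibly stale: bypass or revalidate caches
    return "cached GET"          # definitely not stale: caches may answer


def false_positive_rate(n, m, k):
    """Approximate false positive rate for n keys, m bits, k hash functions."""
    return (1.0 - math.exp(-k * n / m)) ** k
```

A false positive only costs an unnecessary revalidation, while a missing key would cause a stale read, which is why a Bloom filter (no false negatives) fits this role.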

SLIDE 30

Controllable Consistency Levels

Cache Sketch Guarantees

Legend: Controllable Staleness, Default Guarantees, Opt-in Guarantees, With Cache Bypassing

Consistency levels: Δ-Atomicity, (Δ,p)-Atomicity, Read Your Writes, Monotonic Reads, Monotonic Writes, Writes Follow Reads, PRAM, Causal Consistency, Sequential Consistency, Linearizability

P. Viotti and M. Vukolić. Consistency in Non-Transactional Distributed Storage Systems. ACM Computing Surveys, 2016.

[WGW+18, GWR17, GWFR16, GR16, FWGR14]

SLIDE 31

Determining TTLs

Trade-Off

Longer TTLs: ⇩ Cache Misses

VS

Shorter TTLs: ⇩ Invalidations, ⇩ False Positive Rate

[GSW+17, GSW+15]

SLIDE 32

TTL Estimation

Optimizing Expiration & Cacheability

  • 1. Collect workload statistics for reads and writes
  • 2. Estimate time to next write $E[T_w]$ or mark uncacheable

Estimators: Constrained Adaptive TTL Estimator, C-LM Model, LWMA Estimator, EWMA Estimator

Properties: Ideal for Poisson Processes, Quick Adaptation to Changes, Converges for Static Workloads, Highly Space-Efficient

[GSW+15]
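
To illustrate the two steps, the following sketch estimates the time to the next write with an exponentially weighted moving average (EWMA) over observed write gaps. It is a generic EWMA and not the estimator from [GSW+15]. The class name, `alpha`, and the `min_ttl`/`max_ttl` bounds are assumptions chosen for illustration.

```python
class EwmaTtlEstimator:
    """Per-object TTL estimate from write inter-arrival times (illustrative)."""

    def __init__(self, alpha=0.5, min_ttl=1.0, max_ttl=3600.0):
        self.alpha = alpha        # weight of the most recent write gap
        self.min_ttl = min_ttl    # below this, caching is not worthwhile
        self.max_ttl = max_ttl    # upper bound, also used without history
        self.last_write = None
        self.mean_gap = None

    def record_write(self, timestamp):
        """Step 1: collect statistics, here the gap between two writes."""
        if self.last_write is not None:
            gap = timestamp - self.last_write
            if self.mean_gap is None:
                self.mean_gap = gap
            else:
                self.mean_gap = self.alpha * gap + (1 - self.alpha) * self.mean_gap
        self.last_write = timestamp

    def ttl(self):
        """Step 2: estimated time to next write, or None for uncacheable."""
        if self.mean_gap is None:
            return self.max_ttl
        if self.mean_gap < self.min_ttl:
            return None
        return min(self.mean_gap, self.max_ttl)
```

The estimator needs only two numbers per object, which matches the space-efficiency property listed on the slide; a larger `alpha` adapts faster to workload changes at the price of a noisier estimate.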

SLIDE 33

Evaluation Results

Simulation & Benchmarking

Setup: Client, CDN, Orestes, MongoDB

[Figure: Average throughput for YCSB workloads A and B (YCSB benchmark)]

[Figure: Average latency for YCSB workloads A and B (YCSB benchmark)]

[Figure: Page load times with cached initialization (Simulation); Ireland, California]

[GSW+15]

SLIDE 34

Evaluation Results

Industry Evaluation of Commercial Implementation

[Figure: page load times per provider and location]

Providers: Baqend, Azure, Parse, Firebase, Kinvey, Apiomat

Locations: California, Frankfurt, Tokyo, Sydney

Page load times:
  • 247 ms, 3837 ms, 2763 ms, 1456 ms, 4442 ms, 1576 ms
  • 260 ms, 4226 ms, 2645 ms, 2122 ms, 753 ms, 1836 ms
  • 266 ms, 5263 ms, 3214 ms, 1622 ms, 6573 ms, 1944 ms
  • 277 ms, 9963 ms, 6505 ms, 6325 ms, 9321 ms, 4697 ms

[GSW+17, Ges17]

SLIDE 35

How can object caching be extended to query results?

SLIDE 36

Query Caching

Challenges

  • Invalidation Detection: When do query results change?
  • Cache Coherence: How to apply Cache Sketches to queries?
  • Query Result Representation: What is the best result structure for caching?

[GSW+17]

SLIDE 37

Invalidation Detection

Cache Coherence for Query Results

[Figure: an Update reaches Orestes; the Scalable Streaming System (InvaliDB) matches it against Real-Time Queries and emits Query Events (Add, Change, Remove), which lead to a Cache Invalidation of the Cached Query Result and an Updated Cache Sketch; example objects: Product A, Product B]

Query Expression → Normalized String

[WRG18, WGF+17, GSW+17, WGFR16]
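
The two ideas on this slide, a normalized query string as cache key and add/change/remove events per update, can be sketched in a few lines. This is a simplified illustration and not InvaliDB. It assumes queries are plain equality filters given as dictionaries, and all function names are hypothetical.

```python
import json


def normalize(query):
    """Serialize a filter so that equivalent dicts yield the same cache key."""
    return json.dumps(query, sort_keys=True, separators=(",", ":"))


def matches(query, obj):
    """Equality-only predicate: every queried field must have the given value."""
    return all(obj.get(field) == value for field, value in query.items())


def query_event(query, before, after):
    """Classify an update (before -> after image) relative to one query.

    `before` is None for inserts and `after` is None for deletes.
    Returns "add", "change", "remove", or None if the result is unaffected.
    """
    was_match = before is not None and matches(query, before)
    is_match = after is not None and matches(query, after)
    if not was_match and is_match:
        return "add"
    if was_match and not is_match:
        return "remove"
    if was_match and is_match and before != after:
        return "change"
    return None
```

Any non-`None` event means the cached result stored under `normalize(query)` may be outdated and is a candidate for invalidation.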

SLIDE 38

Learning Result Representations

Handling Changes to Query Results

Object Lists: [{id: obj1, name: "alice"}, {id: obj2, name: "bob"}, {id: obj3, name: "eve"}]
  • ⇩ Round-Trips

VS

ID Lists: [url1, url2, url3]
  • ⇩ Invalidations

Solution: Cost-based decision model weighs expected round-trips vs. invalidations

[GSW+17]
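
The trade-off can be made concrete with a toy cost comparison. The model below is a hypothetical simplification and not the decision model from [GSW+17]: it assumes an object list is invalidated by membership changes and by changes to contained objects, while an ID list is invalidated only by membership changes but costs an extra round-trip per read to fetch the objects. All parameter names are illustrative.

```python
def choose_representation(membership_rate, change_rate, read_rate,
                          invalidation_cost=1.0, round_trip_cost=1.0):
    """Pick the cheaper result representation under a toy cost model.

    membership_rate: add/remove events per second for the query result
    change_rate:     changes per second to objects inside the result
    read_rate:       reads per second of the query result
    """
    # Object lists embed the objects, so every kind of event invalidates them.
    object_list_cost = (membership_rate + change_rate) * invalidation_cost
    # ID lists survive object changes but need a second round-trip per read.
    id_list_cost = (membership_rate * invalidation_cost
                    + read_rate * round_trip_cost)
    return "object-list" if object_list_cost <= id_list_cost else "id-list"
```

Under these assumptions, results whose objects change often but are read rarely favor ID lists, while frequently read and rarely changing results favor object lists.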

SLIDE 39

Evaluation Results

Query Caching for YCSB-Based Workloads

[Figure: Throughput with growing request parallelism]

[Figure: Average end-to-end query and read latency]

11×

Throughput Improvement

47×

Lower Query Latency

Lower Read Latency

[GSW+17]

SLIDE 40

Can Cache Sketches improve transaction performance?

SLIDE 41

Problem of Optimistic Transactions

Abort Rates Depend on Latency

[Figure: transaction abort rates at latencies of 10 ms, 50 ms, 100 ms, and 150 ms]

Transaction Abort Rates Increase Exponentially with Latency

[GBR14]

SLIDE 42

Distributed Cache-Aware Transactions

DCAT Solves Latency Problem

[Figure: a Client, Caches, several Orestes Servers, a Coordinator, and the DB, with the following steps]
  • Begin Transaction: Cache Sketch
  • Reads: via Caches
  • Writes: Buffer
  • Commit: read-set and updates
  • Coordinator: Mutual Exclusion, Read all, Writes
  • Result: Committed OR aborted + stale objects

  • 1. Cache Sketch: staleness barrier at transaction begin
  • 2. Shorter duration through cached reads

[GBR14]
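
The commit step follows the usual pattern of optimistic concurrency control: the client submits the versions it read together with its buffered writes, and the coordinator either applies the writes or reports which objects were stale. The sketch below shows this backward validation in a single process. It is not the DCAT implementation. The `Coordinator` class and its in-memory store are hypothetical, and a real coordinator would run `commit` under mutual exclusion.

```python
class Coordinator:
    """Single-process sketch of optimistic commit validation (illustrative)."""

    def __init__(self):
        self.store = {}   # key -> (version, value)

    def read(self, key):
        """Return (version, value); unknown keys have version 0."""
        return self.store.get(key, (0, None))

    def commit(self, read_set, writes):
        """Validate read versions, then apply buffered writes.

        read_set: key -> version the transaction observed (possibly from a cache)
        writes:   key -> new value buffered on the client
        Returns (True, []) on success or (False, stale_keys) on abort.
        """
        stale = [key for key, version in read_set.items()
                 if self.read(key)[0] != version]
        if stale:
            return False, stale          # client can refresh exactly these keys
        for key, value in writes.items():
            version = self.read(key)[0]
            self.store[key] = (version + 1, value)
        return True, []
```

Returning the stale keys on abort mirrors the "aborted + stale objects" label on the slide: a retry can refresh only those objects instead of bypassing all caches.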

SLIDE 43

Results

Simulation-Based Abort Analysis

15×

Faster Transactions

More Objects Before Exceeding 2 Seconds

SLIDE 44

Can polyglot data management be automated in the future?

SLIDE 45

Vision

Automated Choice of Databases

[Figure: an Application with an Annotated Schema (e.g. Latency < 20ms) uses the Polyglot Persistence Mediator, which sits in front of DB1, DB2, and DB3]

[SGR15]

SLIDE 46

Towards Automated Polyglot Persistence

Three-Step Process

  • 1. Requirements: specified as SLA annotations for schemas (based on NoSQL Toolbox)
  • 2. Resolution: find or provision a suitable combination of databases through ranking algorithm
  • 3. Mediation: mediate data allocation and database operations between applications and databases

[Figure: example with a Counter annotated with "Top-k Query" and "20 ms Write Latency", the databases Redis and MongoDB, and a Counter Update sent to Redis]

[SGR15]
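
The resolution step can be pictured as filtering and ranking a catalog of stores against an annotation. The sketch below is an assumption-laden illustration and not the ranking algorithm from [SGR15]. The `resolve` function, the annotation and catalog layouts, and the field names are hypothetical, and catalog entries would have to come from real measurements.

```python
def resolve(annotation, catalog):
    """Return names of stores that satisfy an annotation, fastest writes first.

    annotation: {"features": set of required features,
                 "max_write_latency_ms": upper bound from the SLA}
    catalog:    store name -> {"features": set, "write_latency_ms": number}
    """
    suitable = [
        (spec["write_latency_ms"], name)
        for name, spec in catalog.items()
        # The store must offer every required feature ...
        if annotation["features"] <= spec["features"]
        # ... and stay within the annotated write latency bound.
        and spec["write_latency_ms"] <= annotation["max_write_latency_ms"]
    ]
    return [name for _, name in sorted(suitable)]
```

An empty result would correspond to the "provision" branch of step 2: no existing store qualifies, so a new one has to be provisioned.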

SLIDE 47

Evaluation Results

Case Study

Scenario: News Articles With Impression Counts

[Figure: Article (ID, Title, …, Imp.) stored as a Document; impressions (Imp., ID) stored in a Sorted Set]

2.1× Higher Throughput

10 ms Predictable Write Latency

[SGR15]

SLIDE 48

Future Work

Three Promising Areas

Proactive SLA Enforcement
  • Monitor & predict database behavior
  • Action: change routing, live migration, polyglot scaling

Reinforcement Learning of Caching Decisions
  • Learn best TTLs for any workload
  • Applications define goals

Polyglot Transaction Processing
  • Optimal choice of concurrency & commit protocol across DBs

[SKE+18, SG16]

SLIDE 49

What are the main contributions?

SLIDE 50

Publications (1/4)

[SKE+18] Michael Schaarschmidt, Alexander Kuhnle, Ben Ellis, Kai Fricke, Felix Gessert, and Eiko Yoneki. LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations. arXiv preprint arXiv:1808.07903 (under submission), 2018.

[WRG18] Wolfram Wingerath, Norbert Ritter, and Felix Gessert. Real-Time & Stream Data Management: Push-Based Data in Research & Practice. Springer, book to be published in late 2018.

[WGW+18] Wolfram Wingerath, Felix Gessert, Erik Witt, Steffen Friedrich, and Norbert Ritter. Real-time Data Management for Big Data. In Proceedings of the 21st International Conference on Extending Database Technology, EDBT 2018, Vienna, Austria, March 26-29, 2018. OpenProceedings.org, 2018.

[GSW+17] Felix Gessert, Michael Schaarschmidt, Wolfram Wingerath, Erik Witt, Eiko Yoneki, and Norbert Ritter. Quaestor: Query Web Caching for Database-as-a-Service Providers. Proceedings of the VLDB Endowment, 2017.

[GWR17] Felix Gessert, Wolfram Wingerath, and Norbert Ritter. Scalable Data Management: An In-Depth Tutorial on NoSQL Data Stores. In BTW (Workshops), volume P-266 of LNI, pages 399–402. GI, 2017.

SLIDE 51

Publications (2/4)

[WGF+17] Wolfram Wingerath, Felix Gessert, Steffen Friedrich, Erik Witt, and Norbert Ritter. The Case for Change Notifications in Pull-Based Databases. In Datenbanksysteme für Business, Technologie und Web (BTW 2017) - Workshopband, 2.-3. März 2017, Stuttgart, Germany, 2017.

[GR17] Felix Gessert and Norbert Ritter. SCDM 2017 - Vorwort. In BTW (Workshops), volume P-266 of LNI, pages 211–213. GI, 2017.

[Ges17] Felix Gessert. Lessons Learned Building a Backend-as-a-Service. Baqend Tech Blog, May 2017. (Accessed on 08/11/2017).

[GWFR16] Felix Gessert, Wolfram Wingerath, Steffen Friedrich, and Norbert Ritter. NoSQL Database Systems: A Survey and Decision Guidance. Computer Science - Research and Development, November 2016.

[GR16] Felix Gessert and Norbert Ritter. Scalable Data Management: NoSQL Data Stores in Research and Practice. In 32nd IEEE International Conference on Data Engineering, ICDE, 2016.

[SG16] Michael Schaarschmidt and Felix Gessert. Learning Runtime Parameters in Computer Systems with Delayed Experience Injection. In Deep Reinforcement Learning Workshop, NIPS, 2016.

SLIDE 52

Publications (3/4)

[WGFR16] Wolfram Wingerath, Felix Gessert, Steffen Friedrich, and Norbert Ritter. Real-Time Stream Processing for Big Data. it - Information Technology, 58(4), January 2016.

[GSW+15] Felix Gessert, Michael Schaarschmidt, Wolfram Wingerath, Steffen Friedrich, and Norbert Ritter. The Cache Sketch: Revisiting Expiration-based Caching in the Age of Cloud Data Management. In Datenbanksysteme für Business, Technologie und Web (BTW), 16. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme". GI, 2015.

[GR15a] Felix Gessert and Norbert Ritter. Polyglot Persistence. Datenbank-Spektrum, 15(3):229–233, November 2015.

[GR15b] Felix Gessert and Norbert Ritter. Skalierbare NoSQL- und Cloud-Datenbanken in Forschung und Praxis. In Datenbanksysteme für Business, Technologie und Web (BTW 2015) - Workshopband, 2.-3. März 2015, Hamburg, Germany, pages 271–274, 2015.

[Ges15] Felix Gessert. Low Latency Cloud Data Management through Consistent Caching and Polyglot Persistence. In Proceedings of the 9th Advanced Summer School on Service Oriented Computing, SummerSOC, 2015.

[SGR15] Michael Schaarschmidt, Felix Gessert, and Norbert Ritter. Towards Automated Polyglot Persistence. In Datenbanksysteme für Business, Technologie und Web (BTW), 16. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme", 2015.

SLIDE 53

Publications (4/4)

[WFGR15] Wolfram Wingerath, Steffen Friedrich, Felix Gessert, and Norbert Ritter. Who Watches the Watchmen? On the Lack of Validation in NoSQL Benchmarking. In Datenbanksysteme für Business, Technologie und Web (BTW), 16. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme", 2015.

[GBR14] Felix Gessert, Florian Bücklers, and Norbert Ritter. ORESTES: a Scalable Database-as-a-Service Architecture for Low Latency. In CloudDB, Data Engineering Workshops (ICDEW), pages 215–222. IEEE, 2014.

[GFW+14] Felix Gessert, Steffen Friedrich, Wolfram Wingerath, Michael Schaarschmidt, and Norbert Ritter. Towards a Scalable and Unified REST API for Cloud Data Stores. In 44. Jahrestagung der Gesellschaft für Informatik, Informatik 2014, Big Data - Komplexität meistern, 22.-26. September 2014 in Stuttgart, Deutschland, volume 232 of LNI, pages 723–734. GI, 2014.

[FWGR14] Steffen Friedrich, Wolfram Wingerath, Felix Gessert, and Norbert Ritter. NoSQL OLTP Benchmarking: A Survey. In 44. Jahrestagung der Gesellschaft für Informatik, Informatik 2014, Big Data - Komplexität meistern, 22.-26. September 2014 in Stuttgart, Deutschland, volume 232 of LNI, pages 693–704. GI, 2014.

[GB13] Felix Gessert and Florian Bücklers. ORESTES: ein System für horizontal skalierbaren Zugriff auf Cloud-Datenbanken. In Informatiktage. GI, March 2013.

SLIDE 54

Main Contributions

Summary

[Figure: Client (Browser), Expiration-based Caches, Invalidation-based Caches, Cloud Backend (DBaaS/BaaS), and Database Systems; Cached Data: Files; Records, Documents; Query Results; further labels: Expiration (TTL), Best Cacheable Structure]

  • 1. Cache Coherence for Files, Records & Queries
  • 2. TTL Estimation & Result Structure
  • 3. Unified Data Management Interface
  • 4. Database-Independent DBaaS and BaaS
  • 5. Scalable Cache-Aware Transactions
  • 6. Polyglot Persistence Mediation