Query Fresh: Log Shipping on Steroids



slide-1
SLIDE 1

Query Fresh: Log Shipping on Steroids

Tianzheng Wang* Ryan Johnson Ippokratis Pandis

*Currently at Simon Fraser University

slide-2
SLIDE 2

High availability through log shipping

Primary: Read + Write
Backup(s): Read + Failover

Primary log → network → backup log → replay into the “real” database

Widely used in practice

slide-3
SLIDE 3

Desirable properties

  • Easy implementation & maintenance
  • Fresh
  • Fast primary
  • High resource utilization
  • Safe

slide-4
SLIDE 4

Strong safety and freshness

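Timeline, primary to backup(s):

  • Primary, at commit: persist + ship + wait for ack
  • Backup(s): persist log → ack → primary reports the transaction committed
  • Backup(s): replay (sync or async) → ack

Requirements: synchronous log shipping (safety) and fast log replay (freshness)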

I/O, network and/or replay on the critical path
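
The commit path above can be sketched in a few lines of Python. This is a minimal in-process model, not the system's implementation: the `Primary` and `Backup` classes, their methods, and the `synchronous` flag are hypothetical names, and "persisting" and "shipping" are modeled as plain list appends.

```python
from dataclasses import dataclass, field


@dataclass
class Backup:
    log: list = field(default_factory=list)  # records persisted on the backup

    def receive(self, record):
        """Persist a shipped record, then acknowledge it."""
        self.log.append(record)
        return True  # the ack


@dataclass
class Primary:
    backups: list
    log: list = field(default_factory=list)        # records persisted locally
    unshipped: list = field(default_factory=list)  # async records not yet shipped

    def commit(self, record, synchronous=True):
        self.log.append(record)  # persist locally
        if synchronous:
            # Ship and wait for every backup's ack before reporting commit:
            # safe, but shipping sits on the critical path.
            return all(b.receive(record) for b in self.backups)
        # Asynchronous: report commit now and ship later.
        self.unshipped.append(record)
        return True

    def ship_pending(self):
        """Ship records that were committed asynchronously."""
        for record in self.unshipped:
            for b in self.backups:
                b.receive(record)
        self.unshipped.clear()
```

With `synchronous=False`, a record reported as committed is missing from every backup until `ship_pending()` runs, which is the window in which a primary failure loses it.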

slide-5
SLIDE 5
Synchronous log shipping: infeasible

  • ERMIA* TPC-C, 2-socket, 16 physical cores, 10GbE

* K. Kim, T. Wang, R. Johnson, I. Pandis, ERMIA: Fast Memory-Optimized Database System for Heterogeneous Workloads, SIGMOD 2016

slide-6
SLIDE 6

Synchronous log shipping: infeasible

  • ERMIA* TPC-C, 2-socket, 16 physical cores, 10GbE

Log generation rate > network bandwidth

Network + I/O: major bottleneck

* K. Kim, T. Wang, R. Johnson, I. Pandis, ERMIA: Fast Memory-Optimized Database System for Heterogeneous Workloads, SIGMOD 2016

slide-7
SLIDE 7

Reality: asynchronous log shipping → freshness gap

Primary log → network → backup log → replay

Balance as seen on the primary:
  • 9:40  $0
  • 9:41  $50

Balance as seen on the backup(s):
  • 9:41  $0
  • 9:42  $0
  • 9:43  $0
  • …
  • 9:50  $50

Safety and freshness traded for primary speed
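
The balance timeline can be reproduced with a small model of replay lag. This is an illustrative sketch only: `BalanceUpdate`, `visible_balance`, and the `replay_lag` parameter are hypothetical names, times are minutes past 9:00, and the initial $0 record at 9:30 is an assumed starting point that is not on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BalanceUpdate:
    commit_minute: int  # minutes past 9:00 at which the primary committed
    balance: int        # balance in dollars after the update


def visible_balance(updates, minute, replay_lag=0):
    """Return the newest balance visible at `minute`.

    An update becomes visible `replay_lag` minutes after it commits on the
    primary; a lag of 0 models reading on the primary itself.
    """
    visible = [u for u in updates if u.commit_minute + replay_lag <= minute]
    if not visible:
        return None
    return max(visible, key=lambda u: u.commit_minute).balance


# Assumed history: $0 from 9:30 (hypothetical), $50 deposited at 9:41.
updates = [BalanceUpdate(30, 0), BalanceUpdate(41, 50)]
```

Under these assumptions, `visible_balance(updates, 41)` gives 50 on the primary, while a backup with `replay_lag=9` still returns 0 at 9:43 and only catches up at 9:50.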

slide-8
SLIDE 8
Query Fresh

  • Synchronous log shipping: leverage modern hardware
  • Fast replay: append-only storage + indirection

slide-9
SLIDE 9

Modern HW: synchronous log shipping possible

Non-volatile RAM (NVRAM): NV-DIMM, memristor, 3D XPoint

slide-10
SLIDE 10

Network no longer the biggest bottleneck

Trend: network tracks memory speed

* https://www.infinibandta.org/infiniband-roadmap/

slide-11
SLIDE 11

Modern HW: synchronous log shipping possible

  • NVRAM → fast persistence (NV-DIMM, memristor, 3D XPoint)
  • High-bandwidth network → fast transfer (InfiniBand, Converged Ethernet, 56Gbps+)

RDMA over NVRAM: fast synchronous log shipping

See the paper for challenges and solutions.

slide-12
SLIDE 12

Desirable properties

  • Easy implementation & maintenance
  • Fresh
  • Fast primary
  • High resource utilization
  • Safe

slide-13
SLIDE 13
Sync. Shipping != Fresh Reads

The log → replay (often serial) → “real” database

  • Two durable copies
  • Create actual tuples
  • Memory allocation
  • Many index operations (esp. secondary indexes)

Heavyweight record creation + serial replay = stale

slide-14
SLIDE 14
Append-only storage: freshness possible

  • Only keep one durable copy of data – the log
  • Redo-only logging, log record == data tuple
  • LSN == position in the log, directly comparable

* K. Kim, T. Wang, R. Johnson, I. Pandis, ERMIA: Fast Memory-Optimized Database System for Heterogeneous Workloads, SIGMOD 2016
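
A toy version of the three properties above, assuming an in-memory byte buffer stands in for the durable log. `AppendOnlyLog` and its 4-byte length header are hypothetical choices for illustration, not ERMIA's actual record format.

```python
class AppendOnlyLog:
    """Toy append-only log: the log is the only copy of the data."""

    HEADER = 4  # bytes used to store each record's payload length

    def __init__(self):
        self._buf = bytearray()

    def append(self, payload: bytes) -> int:
        """Append a record and return its LSN, i.e. its byte offset."""
        lsn = len(self._buf)
        self._buf += len(payload).to_bytes(self.HEADER, "little") + payload
        return lsn

    def read(self, lsn: int) -> bytes:
        """Fetch the tuple stored at `lsn`; the log record is the tuple."""
        start = lsn + self.HEADER
        size = int.from_bytes(self._buf[lsn:start], "little")
        return bytes(self._buf[start:start + size])
```

Because an LSN is simply an offset into the log, comparing two LSNs with `<` says which version was written first, and `read(lsn)` returns the tuple without consulting a second copy of the data.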

slide-15
SLIDE 15

Query Fresh: Log == Database with RDMA + NVRAM

Primary → RDMA over NVRAM → Secondary; the log is replayed in parallel (see paper)

  • Sync. commit: safe
  • Log tail in NVRAM
  • Indexes: key → RID
  • Queries check both arrays

Replay:

  • Extract tuple location
  • Little memory allocation
  • No index operation (except for inserts)

(New) Per-table replay array: indexed by RID, each entry answers “where?” with an LSN (e.g. LSN 10, LSN 20, …)

Fast sync log shipping + append-only = safe & fresh
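
One way to picture the indirection on a backup is sketched below. It rests on several assumptions: the class `BackupTable`, its method names, and the use of Python dicts in place of arrays are hypothetical, the shipped log is modeled as an LSN-to-payload mapping, and the "both arrays" are taken to be the replay array plus an array of materialized tuples.

```python
class BackupTable:
    """Toy backup-side table with a replay array and a data array."""

    def __init__(self, log):
        self.log = log          # LSN -> tuple payload (stand-in for the shipped log)
        self.index = {}         # key -> RID
        self.replay_array = {}  # RID -> LSN of the newest replayed version
        self.data_array = {}    # RID -> (LSN, materialized tuple)

    def replay(self, lsn, key, rid):
        """Replay one log record by recording only where the tuple lives."""
        if key not in self.index:  # only inserts touch the index
            self.index[key] = rid
        self.replay_array[rid] = lsn

    def read(self, key):
        """Check both arrays and materialize the tuple only when needed."""
        rid = self.index.get(key)
        if rid is None:
            return None
        newest = self.replay_array[rid]
        cached = self.data_array.get(rid)
        if cached is None or cached[0] < newest:  # LSNs compare directly
            cached = (newest, self.log[newest])
            self.data_array[rid] = cached
        return cached[1]
```

In this sketch replay does no tuple creation and, for updates, no index work, so the cost of materializing a tuple moves to the first query that actually reads it.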

slide-16
SLIDE 16

Query Fresh vs. Existing

Query Fresh balances all aspects

slide-17
SLIDE 17
Evaluation

  • 8 x 16-core (2-socket) nodes
  • 1 primary + up to 7 backups
  • Xeon E5-2650 v2, 64GB RAM, logs in tmpfs
  • Target NV-DIMM: DRAM as log buffer + CLWB/FLUSH emulation
  • Network
    • Query Fresh: 56Gbps InfiniBand FDR 4x + RDMA
    • Other schemes: 10Gbps Ethernet + TCP
  • Benchmarks in ERMIA
    • Primary: Full TPC-C, low contention
    • Backups: StockLevel + OrderStatus

slide-18
SLIDE 18
Query Fresh: maintains fast primary

  • 16 workers on primary, 4 replay threads + 12 workers on backups
  • Utilization = 75% (12 workers out of 16 total)

Network saturated

slide-19
SLIDE 19
Query Fresh: fresh and high utilization

  • Freshness: backup read view / primary read view * 100%
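
The freshness metric is a simple ratio. The helper below is a sketch that assumes both read views are expressed as LSNs (positions in the log); the function name and its arguments are hypothetical.

```python
def freshness_pct(backup_read_view: int, primary_read_view: int) -> float:
    """Freshness = backup read view / primary read view * 100%."""
    if primary_read_view <= 0:
        raise ValueError("primary read view must be positive")
    return 100.0 * backup_read_view / primary_read_view
```

For example, a backup whose read view is at LSN 950 while the primary's is at LSN 1000 scores 95% under this definition, and a fully caught-up backup scores 100%.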

slide-20
SLIDE 20
Conclusions

  • Slow network + Fast OLTP = Stale and Unsafe
  • Redundant data copies (dual-copy architecture)
  • Often serial, heavyweight log replay
  • Query Fresh = Fast network + NVRAM (fast, sync, safe)
    + Append-only storage with indirection (fast replay → fresh reads)

Thank you! Find out more in our paper and code repo!

https://github.com/ermia-db