Low-Latency Communication for Fast DBMS Using RDMA and Shared Memory

SLIDE 1

Low-Latency Communication for Fast DBMS Using RDMA and Shared Memory

Philipp Fent, Alexander van Renen, Andreas Kipf, Viktor Leis∗, Thomas Neumann, Alfons Kemper

Technische Universität München, Friedrich-Schiller-Universität Jena∗

April 23, 2020

SLIDE 2

Communication Performance

DBMS
  • Data serialization format
  • Network & transport protocol
  • Physical interconnect

Client Application
  • Data serialization format
  • Network & transport protocol
  • Physical interconnect

ODBC

Philipp Fent et al. Low-Latency Communication for Fast DBMS Using RDMA and Shared Memory 2 / 11

SLIDE 5

Communication Performance

Figure: TPC-C throughput using Silo, in transactions / second

                              Linux Sockets
                 In-process   Domain   TCP
(a) 1 Thread     58 K         2.7 K    1.5 K
(b) 8 Threads    344 K        16 K     8.9 K

SLIDE 7

Understanding the Bottleneck

  • Misconception: Network is slow
  • Twofold actual bottleneck:
  • TCP
  • Through kernel communication

Figure: Kernel based communication. The DBMS calls write(), the data passes through the kernel, and the client calls read(). Each syscall costs > 10 k cycles.

Figure: Direct memory access. DBMS and client each call mmap() once to map a shared message buffer; afterwards each side only needs a memcpy() of ≈ 100 cycles, without going through the kernel.
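The two communication paths on this slide can be sketched in a few lines of Python. This is only an illustration of the mechanism, not code from L5: the function names and the 4096-byte buffer size are assumptions, and interpreter overhead hides the cycle counts quoted on the slide.

```python
import mmap
import socket


def via_kernel(message: bytes) -> bytes:
    """Kernel-based path: every message costs a send and a receive syscall."""
    dbms, client = socket.socketpair()  # connected pair of local sockets
    with dbms, client:
        dbms.sendall(message)  # write-style syscall on the DBMS side
        # read-style syscall on the client side; MSG_WAITALL waits for all bytes
        return client.recv(len(message), socket.MSG_WAITALL)


def via_shared_buffer(message: bytes, size: int = 4096) -> bytes:
    """Direct path: the buffer is mapped once, then messages are plain copies."""
    with mmap.mmap(-1, size) as buffer:  # one-time mmap() of an anonymous region
        buffer[:len(message)] = message  # sender side: memory copy, no syscall
        return bytes(buffer[:len(message)])  # receiver side: memory copy


if __name__ == "__main__":
    print(via_kernel(b"result row"))
    print(via_shared_buffer(b"result row"))
```

In the sketch both ends live in one process for brevity; between two processes the mapping would have to be shared explicitly.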

SLIDE 11

Low-Latency Communication Using Shared Memory

  • Co-hosted on the same machine
  • Latency similar to embedded DBs, e.g. SQLite
  • Ideal interconnect for container / Docker environment
  • Bootstrapped via Domain Sockets
  • Pass message buffer via cmsg ancillary data
  • Ringbuffer with polling to transfer serialized data

SLIDE 13

Low-Latency Communication Using Shared Memory

Available bandwidth depends on transmission parameters

Figure: Heatmap of the measured bandwidth (color scale 2 GB/s to 5 GB/s, cell values between 1.5 and 5.3) over the size of transmitted chunks [log] (4 kB to 1 GB) and the size of the transmission buffer [log] (4 kB to 1 GB), with the L2 cache and L3 cache sizes marked.
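The two parameters of the heatmap, buffer size and chunk size, correspond to the capacity of a ring buffer and the amount written per step. The following single-process sketch shows one possible layout; the class, the two 8-byte counters in front of the data area, and the `transfer` helper are assumptions for illustration. A real cross-process ring would also need atomic counter updates and memory fences, which Python does not express.

```python
import mmap
import struct

COUNTERS = struct.Struct("<QQ")  # bytes written so far, bytes read so far


class RingBuffer:
    """Single-producer, single-consumer byte ring inside one mapping."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.mem = mmap.mmap(-1, COUNTERS.size + capacity)  # anonymous mapping
        COUNTERS.pack_into(self.mem, 0, 0, 0)

    def try_write(self, chunk: bytes) -> bool:
        written, read = COUNTERS.unpack_from(self.mem, 0)
        if len(chunk) > self.capacity - (written - read):
            return False  # not enough free space yet, the producer polls again
        base, start = COUNTERS.size, written % self.capacity
        first = min(len(chunk), self.capacity - start)
        self.mem[base + start:base + start + first] = chunk[:first]
        if first < len(chunk):  # wrap around to the start of the data area
            self.mem[base:base + len(chunk) - first] = chunk[first:]
        # Publish the new write counter only after the data has been copied.
        struct.pack_into("<Q", self.mem, 0, written + len(chunk))
        return True

    def try_read(self, max_bytes: int) -> bytes:
        written, read = COUNTERS.unpack_from(self.mem, 0)
        length = min(written - read, max_bytes)
        if length == 0:
            return b""  # nothing available yet, the consumer polls again
        base, start = COUNTERS.size, read % self.capacity
        first = min(length, self.capacity - start)
        data = self.mem[base + start:base + start + first]
        data += self.mem[base:base + length - first]  # wrapped part, may be empty
        struct.pack_into("<Q", self.mem, 8, read + length)
        return data


def transfer(payload: bytes, buffer_size: int, chunk_size: int) -> bytes:
    """Move payload through the ring, alternating producer and consumer polls."""
    ring = RingBuffer(buffer_size)
    chunk_size = min(chunk_size, buffer_size)  # a chunk must fit into the ring
    received, sent = bytearray(), 0
    while len(received) < len(payload):
        chunk = payload[sent:sent + chunk_size]
        if chunk and ring.try_write(chunk):
            sent += len(chunk)
        received += ring.try_read(chunk_size)
    return bytes(received)
```

Calling `transfer(data, buffer_size=4096, chunk_size=1000)` exercises the wrap-around path, since 1000 does not divide the capacity evenly.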

SLIDE 14

Low-Latency Communication Using RDMA

  • Co-located in the same datacenter
  • Bootstrapped via regular TCP/IP
  • Similar ringbuffer as with Shared Memory
  • RDMA intricacies:

Figure (two plots): Sync. Throughput [msgs/s] (axis 200 K to 600 K) over Size of message [Byte] (1 to 256, cache line size marked), with a "+70%" annotation. Series in the first plot: Write + Polling, Two Writes, Write + Immediate, Send + Receive. Series in the second plot: Two Writes, Chained Writes, Immediate Data.

SLIDE 17

Low-Latency Communication Using RDMA

  • Asymmetric connections
  • Many message buffers → random accesses for polling
  • Cache efficient mailbox polling
  • Two writes to separate memory regions
  • Scales up to the limit of RDMA’s reliable connections

Figure: A mailbox (label: "efficient polling") with two entries marked X, next to the message buffer holding "[28] SELECT a FROM r WHERE x = 28" and "[30] SELECT e FROM r WHERE x = 81". Further labels: "in-flight" and "written after message".
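The mailbox idea can be simulated without RDMA hardware. In this sketch, plain memory writes stand in for the two RDMA writes, and the class name, the one-byte length prefix, and the 64-byte slot size are assumptions made for illustration. The point is that the server scans one small contiguous array of flags instead of touching every client's message buffer.

```python
class MailboxServer:
    """Server side: one message buffer per client plus one shared mailbox."""

    def __init__(self, num_clients: int, slot_size: int = 64):
        self.slot_size = slot_size
        # Contiguous flags, one byte per client: cheap to scan sequentially.
        self.mailbox = bytearray(num_clients)
        # Separate memory regions that hold the actual messages.
        self.message_buffers = [bytearray(slot_size) for _ in range(num_clients)]

    def poll(self) -> list[tuple[int, bytes]]:
        """Scan the mailbox and fetch messages only for flagged clients."""
        ready = []
        for client_id, flag in enumerate(self.mailbox):
            if flag:
                buffer = self.message_buffers[client_id]
                length = buffer[0]  # length prefix written by the client
                ready.append((client_id, bytes(buffer[1:1 + length])))
                self.mailbox[client_id] = 0  # mark the slot as consumed
        return ready


def client_send(server: MailboxServer, client_id: int, message: bytes) -> None:
    """Client side: two writes to separate memory regions."""
    if len(message) >= server.slot_size:
        raise ValueError("message does not fit into the slot")
    buffer = server.message_buffers[client_id]
    # First write: the length-prefixed message into this client's buffer.
    buffer[0] = len(message)
    buffer[1:1 + len(message)] = message
    # Second write: the mailbox flag, written only after the message.
    server.mailbox[client_id] = 1


if __name__ == "__main__":
    server = MailboxServer(num_clients=4)
    client_send(server, 2, b"SELECT a FROM r WHERE x = 28")
    print(server.poll())
```

Writing the flag last is what lets the server trust that a flagged buffer already contains a complete message.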

SLIDE 20

Results

Local YCSB-C [sync. tx/s]

              TCP      SHM      NP       DS       RDMA
Silo + L5     50.5 K   685 K    -        72.1 K   364 K
DBMS X        7.56 K   11.5 K   11.5 K   -        -
MySQL         10.0 K   45.9 K   27.6 K   -        -
SQLite        -        378 K    -        -        -

Remote YCSB-C [sync. tx/s]

              1 G Eth   56 G IB   RDMA
Silo + L5     15 K      27 K      302 K
DBMS X        3.1 K     3.7 K     -
MySQL         7.1 K     8.0 K     -
PostgreSQL    6.3 K     7.5 K     -
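To put the Silo + L5 rows into perspective, the relative speedups can be computed directly from the numbers on this slide. The dictionary layout and the `speedup` helper are only an illustration; the values are copied from the results and given in thousands of sync. tx/s.

```python
# Silo + L5 throughput in thousands of sync. tx/s, copied from the Results slide.
LOCAL = {"TCP": 50.5, "SHM": 685.0, "DS": 72.1, "RDMA": 364.0}
REMOTE = {"1 G Eth": 15.0, "56 G IB": 27.0, "RDMA": 302.0}


def speedup(results: dict[str, float], transport: str, baseline: str) -> float:
    """Throughput of one transport relative to a baseline transport."""
    return results[transport] / results[baseline]


if __name__ == "__main__":
    print(f"local SHM vs TCP:       {speedup(LOCAL, 'SHM', 'TCP'):.1f}x")      # 13.6x
    print(f"remote RDMA vs 1 G Eth: {speedup(REMOTE, 'RDMA', '1 G Eth'):.1f}x")  # 20.1x
    print(f"remote RDMA vs 56 G IB: {speedup(REMOTE, 'RDMA', '56 G IB'):.1f}x")  # 11.2x
```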

SLIDE 22

Results – Scale Out

Figure: Sync. Throughput [tx/s] (axis 1 M to 2 M) over Number of clients (1 to 20). Series: RDMA 4 Servers, RDMA 2 Servers, RDMA 1 Server, TCP.

SLIDE 23

Conclusion

  • L5 – Low-Level, Low-Latency Library
    https://github.com/pfent/L5RDMA
  • Shared Memory and RDMA bring OLTP performance to clients
  • fent@in.tum.de

Figure: Transactions / second

                              Linux Sockets      L5
                 In-process   Domain   TCP       SHM     RDMA
(a) 1 Thread     58 K         2.7 K    1.5 K     32 K    14 K
(b) 8 Threads    344 K        16 K     8.9 K     166 K   75 K