SLIDE 1

More Than A Network: Distributed OLTP on Clusters of Hardware Islands

Danica Porobic, Pınar Tözün, Raja Appuswamy, Anastasia Ailamaki

SLIDE 2


Multisocket multicores

[Figure: two sockets (socket 0 and socket 1) connected by inter-socket links. On each socket, every core has private L1 and L2 caches, and the cores share an L3 cache and a memory controller. The figure labels an Island and annotates thread-to-thread communication on a 2x12-core Intel Xeon E5-2650L v3 with 21 cycles, 72 cycles, and 237 cycles.]

Challenge: non-uniform communication

SLIDE 3

OLTP on Hardware Islands

  • Shared-everything: stable, not optimal
  • Shared-nothing: fast, sensitive to workload
  • Island shared-nothing: robust middle ground

Optimal configuration depends on workload and hardware

SLIDE 4

Rack-scale hardware platforms

  • Abundant non-uniform parallelism

– Need to scale across many cores

  • Large main memories

– Datasets are memory resident

  • Network & DRAM converge

– Need to scale across multiple nodes


Complex hierarchy of Hardware Islands

SLIDE 5

How different are clusters of Islands?

  • Does Island topology still matter in the cluster environment?
  • Does faster communication always improve throughput?
  • How do scale-up designs perform when used in distributed deployments?

SLIDE 6

Experimental setup

  • Shore-MT
  • 2 x 6-core Intel Xeon X5660
  • 10 Gbps Ethernet
  • TCP/IP and shared memory communication
  • TPC-C and partition-sensitive microbenchmark

  • Silo
  • 8 x 10-core Intel Xeon E7-L8867
  • Unix sockets and shared memory communication
  • Partition-sensitive microbenchmark

SLIDE 7

Distributed deployments

[Figure: three deployment types: scale-up, scale-out, and hybrid.]

SLIDE 8

Scaling out across the cluster

[Figure: throughput (KTps) versus number of servers (1, 2, 4, 8) for scale-up, hybrid, and scale-out deployments. Panels: TPC-C New Order and TPC-C Payment. System: Shore-MT, TCP/IP.]

No configuration is optimal for every cluster

SLIDE 9

Impact of placement

[Figure: throughput (KTps) of scale-out and scale-up deployments, comparing "OS" and "Bound" thread placement. Panels: TPC-C Payment and TPC-C New Order. System: Shore-MT, TCP/IP.]

Thread migrations hurt performance and predictability

SLIDE 10

Partition sensitive microbenchmark

  • Single-site version

– probe/update N rows from the local site

  • Multisite version

– probe/update 1 row from the local site

– probe/update N-1 rows uniformly from any site

– sites may reside on the same instance
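
The two transaction types above can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the function names `access_sites` and `next_transaction` are hypothetical, and the sketch assumes that "any site" includes the local site and that a workload of "X% multisite" picks the multisite version with probability X/100.

```python
import random

def access_sites(n_rows, local_site, n_sites, multisite, rng):
    """Return the site that each of the N probed/updated rows is taken from."""
    if not multisite:
        # Single-site version: all N rows come from the local site.
        return [local_site] * n_rows
    # Multisite version: 1 row from the local site, and N-1 rows drawn
    # uniformly from any site (assumed here to include the local one).
    others = [rng.randrange(n_sites) for _ in range(n_rows - 1)]
    return [local_site] + others

def next_transaction(n_rows, local_site, n_sites, pct_multisite, rng):
    """Pick the multisite version with probability pct_multisite / 100
    (an assumed way of producing the workload mix)."""
    multisite = rng.random() * 100 < pct_multisite
    return access_sites(n_rows, local_site, n_sites, multisite, rng)

rng = random.Random(42)
# One transaction touching 20 rows in a hypothetical 8-site deployment.
print(next_transaction(n_rows=20, local_site=0, n_sites=8,
                       pct_multisite=20, rng=rng))
```

Whether a non-local site is reached through shared memory or over the network depends on the deployment, since several sites may reside on the same instance.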

SLIDE 11

Impact of fast communication

[Figure: throughput (KTps) of scale-out and scale-up deployments for local and 20% multisite workloads, comparing TCP/IP and shared memory communication. Panels: reading 20 rows and updating 20 rows. System: Shore-MT.]

Read-only: helps. Updates: little impact.

SLIDE 12

Scaling out a scale-up system

[Figure: throughput (MTps) of scale-out and scale-up deployments, comparing Unix sockets and shared memory communication. Panels: reading 2 rows (local and 20% multisite) and updating 2 rows (local and 1% multisite). System: Silo.]

Distributed updates cause severe throughput drop

SLIDE 13

Why don’t updates scale out?

[Figure: normalized throughput of read-write and read-only transactions, and % aborts of read-write (RW) transactions, versus % distributed transactions (10 to 100).]

Multicore-optimized OCC is very sensitive to delays
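
To illustrate why delays matter under optimistic concurrency control (OCC), here is a minimal single-threaded sketch that assumes version-based read validation at commit time. The `Record` and `Transaction` classes are hypothetical and heavily simplified; they are not Silo's actual protocol.

```python
from dataclasses import dataclass

@dataclass
class Record:
    value: int = 0
    version: int = 0  # bumped on every committed write

class Transaction:
    """Hypothetical OCC transaction: remembers the versions it read and
    validates them at commit time (no locking; single-threaded sketch)."""

    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version observed at read time
        self.writes = {}         # key -> buffered new value

    def read(self, key):
        record = self.store[key]
        self.read_versions[key] = record.version
        return record.value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validation: abort if any record read has changed since the read.
        for key, seen in self.read_versions.items():
            if self.store[key].version != seen:
                return False
        # Install buffered writes and bump versions.
        for key, value in self.writes.items():
            self.store[key].value = value
            self.store[key].version += 1
        return True

store = {"x": Record()}

slow = Transaction(store)   # stands in for a delayed, distributed transaction
slow.write("x", slow.read("x") + 1)

fast = Transaction(store)   # a local transaction that runs in the meantime
fast.write("x", fast.read("x") + 1)

print(fast.commit())  # True: nothing changed since its read
print(slow.commit())  # False: "x" was updated after slow read it
```

In this simplified model, the longer a transaction waits between its reads and its validation (for example, on a message round trip), the more opportunities concurrent writers have to invalidate its read set, which is one plausible reading of the rising abort rate.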

SLIDE 14

OLTP on a Cluster of Islands

  • Scale-up designs sensitive to scale-out delays
  • Islands-awareness required, but insufficient for optimal cluster deployments
  • Fast communication can improve throughput, but does not guarantee improvement

Thank you!
