ATraPos: Adaptive Transaction Processing on Hardware Islands (PowerPoint PPT Presentation)




SLIDE 1

ATraPos: Adaptive Transaction Processing on Hardware Islands

Danica Porobic, Erietta Liarou, Pınar Tözün, Anastasia Ailamaki

Data-Intensive Application and Systems Lab, EPFL

SLIDE 2

Scaling up OLTP on multisockets

[Chart: throughput vs. number of sockets (1-8)]

Multisocket servers are severely under-utilized

SLIDE 3

Multisocket multicores

Communication latencies vary by an order of magnitude

[Diagram: two sockets, each with cores (private L1/L2 caches), a shared L3, a memory controller, and inter-socket links. Communication costs: within a core <10 cycles; within a socket (shared L3) ~50 cycles; across sockets ~500 cycles. Each socket forms an Island.]
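The three latency tiers on this slide can be captured in a toy cost function. The 2-socket, 4-cores-per-socket topology and the exact cycle counts below are illustrative assumptions, not measurements from the talk:

```python
# Toy model of Island communication cost; topology and constants are
# illustrative assumptions (2 sockets x 4 cores, round cycle counts).
CORES_PER_SOCKET = 4

def socket_of(core: int) -> int:
    """Island (socket) that a core belongs to."""
    return core // CORES_PER_SOCKET

def comm_cost_cycles(core_a: int, core_b: int) -> int:
    """Approximate cost of communication between two cores, in cycles."""
    if core_a == core_b:
        return 10   # same core: private L1/L2, <10 cycles
    if socket_of(core_a) == socket_of(core_b):
        return 50   # same Island: shared L3, ~50 cycles
    return 500      # different Islands: inter-socket link, ~500 cycles
```

The order-of-magnitude jump at the Island boundary is what makes cross-socket synchronization so much more expensive than intra-socket synchronization.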

SLIDE 4

OLTP on Hardware Islands

Three configurations: shared-everything, shared-nothing, island shared-nothing


SLIDE 5

Scaling up on an 8-socket machine

[Chart: throughput (MTPS) vs. number of sockets (1-8) for shared-nothing, island shared-nothing, and shared-everything]

Islands significantly challenge scalability

Setup: 8 sockets x 10 cores; 800K-row dataset; probing one row

SLIDE 6

Physical partitioning for Islands

[Chart: throughput (KTps) vs. % of multisite transactions (20-100) for shared-nothing, island shared-nothing, and shared-everything]

No configuration is optimal for all environments

Setup: 4 sockets x 6 cores; 240K-row dataset; updating 10 rows

SLIDE 7

OLTP on Hardware Islands

  • Shared-everything: stable, but not optimal
  • Shared-nothing: fast, but sensitive to the workload
  • Island shared-nothing: robust middle ground

  • Challenges

– Optimal configuration depends on workload and hardware
– Expensive repartitioning due to physical data movement

ATraPos: hardware and workload-aware shared-everything adaptive system

SLIDE 8

ATraPos: Adaptive Transaction Processing


  • No unnecessary inter-socket synchronization
  • Workload & hardware-aware partitioning
  • Lightweight monitoring and repartitioning

ATraPos: hardware and workload-aware shared-everything adaptive system

SLIDE 9

Outline

  • Impact of Hardware Islands on OLTP
  • ATraPos

– Avoiding unnecessary synchronization
– Workload & hardware-aware partitioning & placement
– Lightweight monitoring & repartitioning

  • Summary


SLIDE 10

Critical path of transaction execution

Many accesses to shared data structures

[Diagram: worker threads on many cores, all accessing shared data and a centralized system state]

SLIDE 11

Perfectly partitionable workload

[Chart: throughput (MTPS) vs. number of sockets (1-8) for shared-nothing and centralized shared-everything]

Accessing centralized data structures limits scalability

Setup: 8 sockets x 10 cores; 800K-row dataset; probing one row

SLIDE 12

PLP: Physiologically partitioned SE*

System state is still shared

[Diagram: data partitioned per worker thread, but system state still centralized]

*I. Pandis et al.: PLP: Page Latch-free Shared-everything OLTP, VLDB 2011

SLIDE 13

Perfectly partitionable workload

[Chart: throughput (MTPS) vs. number of sockets (1-8) for shared-nothing, PLP, and centralized shared-everything]

Inter-socket accesses to system state are a bottleneck

Setup: 8 sockets x 10 cores; 800K-row dataset; probing one row

SLIDE 14

ATraPos: Island-aware SE

[Diagram: system state partitioned per Island; threads on each socket access only their Island's copy of the system state]
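The island-aware layout can be sketched as one system-state instance per socket, with each thread touching only its own Island's instance. The class and field names below are invented for illustration; this is not ATraPos code:

```python
import threading

CORES_PER_SOCKET = 4  # illustrative topology

class IslandState:
    """One copy of system state (e.g., a lock-table stub) per Island."""
    def __init__(self):
        self.latch = threading.Lock()  # contended only within one socket
        self.counters = {}

class IslandAwareState:
    def __init__(self, num_sockets: int):
        self.islands = [IslandState() for _ in range(num_sockets)]

    def state_for_core(self, core: int) -> IslandState:
        # A thread pinned to `core` synchronizes only on its Island's state,
        # keeping inter-socket cache-line transfers off the critical path.
        return self.islands[core // CORES_PER_SOCKET]
```

Threads on the same socket share one instance; threads on different sockets never contend on the same latch.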

SLIDE 15

Perfectly partitionable workload

[Chart: throughput (MTPS) vs. number of sockets (1-8) for shared-nothing, ATraPos, PLP, and centralized shared-everything]

Island awareness brings scalability

Setup: 8 sockets x 10 cores; 800K-row dataset; probing one row

SLIDE 16

Outline

  • Impact of Hardware Islands on OLTP
  • ATraPos

– Avoiding unnecessary synchronization
– Workload & hardware-aware partitioning & placement
– Lightweight monitoring & repartitioning

  • Summary


SLIDE 17

Naive partitioning and placement

[Bar chart: throughput (KTPS) of PLP and ATraPos variants (HW-aware, load-balanced) on a Probe A / Probe B workload; 1.9x speedup annotated]

Cores are overloaded with contending threads

Setup: 8 sockets x 10 cores; 800K rows per table; probing 1 row each from A and B

SLIDE 18

ATraPos partitioning and placement

[Bar chart: same experiment; 4.4x speedup annotated]

Ignoring Islands -> synchronization overhead

Setup: 8 sockets x 10 cores; 800K rows per table; probing 1 row each from A and B

SLIDE 19

ATraPos partitioning and placement

[Bar chart: same experiment; 4.8x speedup annotated]

ATraPos: balanced load + reduced synchronization

Setup: 8 sockets x 10 cores; 800K rows per table; probing 1 row each from A and B
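A minimal sketch of the intuition behind the two objectives on this slide: score a partition-to-Island assignment by load imbalance plus cross-Island synchronization traffic. The cost shape, inputs, and weight below are invented for illustration; this is not the actual ATraPos cost model:

```python
# Hypothetical cost function: lower is better. `assignment` maps partition ->
# Island, `part_load` maps partition -> load, and `cross_traffic` maps
# partition pairs -> how often they appear in the same transaction.
def placement_cost(assignment, part_load, cross_traffic, num_islands,
                   sync_weight=0.5):
    # Load-balance term: how far the busiest Island is above the average.
    island_load = [0.0] * num_islands
    for part, island in assignment.items():
        island_load[island] += part_load[part]
    imbalance = max(island_load) - sum(island_load) / num_islands
    # Synchronization term: traffic between partitions on different Islands.
    sync = sum(traffic for (a, b), traffic in cross_traffic.items()
               if assignment[a] != assignment[b])
    return imbalance + sync_weight * sync
```

Under this toy cost, co-locating partitions that appear together in transactions beats splitting them across Islands, while keeping each Island's total load near the average.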

SLIDE 20

Outline

  • Impact of Hardware Islands on OLTP
  • ATraPos

– Avoiding unnecessary synchronization
– Workload & hardware-aware partitioning & placement
– Lightweight monitoring & repartitioning

  • Summary


SLIDE 21

ATraPos monitoring

1. Initialize with a naive scheme
2. Monitor the workload (collect stats)
3. Evaluate the cost model: balance the load, minimize synchronization

SLIDE 22

ATraPos monitoring

1. Initialize with a naive scheme
2. Monitor the workload (collect stats)
3. Evaluate the cost model
4. Repartition when the workload changes (e.g., Probe B transactions appear)
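The loop on these two slides can be sketched as follows; the plan representation, the cost callbacks, and the 10% improvement threshold are assumptions for illustration, not ATraPos internals:

```python
# Illustrative monitor/evaluate/repartition loop.
def adapt(stats_stream, plan_cost, choose_plan, repartition, threshold=0.1):
    plan = "naive"                          # 1. initialize with a naive scheme
    for stats in stats_stream:              # 2. monitor: per-interval stats
        candidate = choose_plan(stats)      # 3. evaluate the cost model
        cur = plan_cost(plan, stats)
        new = plan_cost(candidate, stats)
        if new < (1.0 - threshold) * cur:   # 4. repartition only if it pays off
            repartition(plan, candidate)
            plan = candidate
    return plan
```

The threshold keeps the system from thrashing: it repartitions only when the candidate plan is clearly cheaper under the observed statistics.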

SLIDE 23

Repartitioning Multi-Rooted B-trees

Splitting and merging B-trees accesses only a few pages


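The reason this is cheap can be illustrated with key ranges: each root of a multi-rooted B-tree owns a key range, so splitting or merging adjusts range boundaries (and the few pages around them) instead of physically moving rows. The range bookkeeping below is a toy sketch, not the actual MRBTree code:

```python
# Toy key-range bookkeeping for a multi-rooted B-tree: repartitioning edits
# half-open range boundaries rather than relocating data.
def split_range(ranges, at):
    """Split the range containing key `at` into two adjacent sub-ranges."""
    out = []
    for lo, hi in ranges:
        if lo < at < hi:
            out += [(lo, at), (at, hi)]
        else:
            out.append((lo, hi))
    return out

def merge_ranges(ranges, i):
    """Merge adjacent ranges i and i+1 back into one."""
    lo, _ = ranges[i]
    _, hi = ranges[i + 1]
    return ranges[:i] + [(lo, hi)] + ranges[i + 2:]
```

A rearrange action is then just a split followed by a merge, which matches the three cost curves on the next slide.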

SLIDE 24

ATraPos repartitioning


[Chart: repartitioning cost (ms) vs. number of repartitioning actions (10-80) for merge, split, and rearrange (split+merge)]

Repartitioning of a table takes < 200ms

Setup: 8 sockets x 10 cores; 800K-row table

SLIDE 25

TATP: speedup over PLP

[Bar chart: throughput normalized to PLP for TATP transactions GetSubData, GetNewDest, UpdSubData, and Mix]

ATraPos improves performance of TATP by 3.1-6.7x

Setup: 8 sockets x 10 cores; 800K subscribers

SLIDE 26

Adapting to workload skew


[Chart: throughput (MTPS) over time (s) for Static vs. ATraPos; monitoring and repartitioning phases annotated]

Setup: 8 sockets x 10 cores; 800K subscribers; TATP GetSubData; 50% of requests target 20% of the data

ATraPos detects skew and quickly adapts

SLIDE 27

Adapting to changing workload type

[Chart: throughput (KTPS) over time (s) for Static vs. ATraPos as the workload shifts from UpdSubData to GetNewDest to Mix; monitoring and repartitioning phases annotated]

ATraPos gracefully adapts as the workload type changes

Setup: 8 sockets x 10 cores; 800K subscribers; TATP

SLIDE 28
  • Challenges

– Optimal configuration depends on workload and hardware
– Expensive repartitioning due to physical data movement

  • ATraPos

– Minimal inter-socket accesses in the critical path
– Workload & hardware-aware partitioning & placement
– Lightweight monitoring and repartitioning

ATraPos: Adaptive OLTP for Islands


Thank you!