Preemptable Ticket Spinlocks: Improving Consolidated Performance in the Cloud

SLIDE 1

Preemptable Ticket Spinlocks: Improving Consolidated Performance in the Cloud

Jiannan Ouyang, John Lange
Department of Computer Science, University of Pittsburgh
VEE ’13, 03/17/2013

SLIDE 2

Motivation

— VM interference in overcommitted environments

— OS synchronization overhead

— Lock holder preemption (LHP)

SLIDE 3

Contributions

— Lock Waiter Preemption

— significance analysis of lock waiter preemption

— Preemptable Ticket Spinlock

— implementation inside Linux

— Evaluation

— significant speedup over Linux

SLIDE 4

Spinlocks

— Basics

— lock() & unlock()
— busy-waiting lock
— generic spinlock: random order, unfair (starvation possible)
— ticket spinlock: FIFO order, fair

— Designed for fast mutual exclusion

— busy waiting vs. sleep/wakeup

— spinlocks for short & fast critical sections (~1 µs)

— OS assumptions

— use spinlocks for short critical sections only
— never preempt a thread holding or waiting on a kernel spinlock
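
The FIFO behaviour of a ticket spinlock can be sketched with C11 atomics. This is a minimal user-space illustration under stated assumptions, not the Linux kernel code: the type `ticket_lock` and all function names are chosen here for illustration, and `tail` holds the next ticket to hand out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative user-space ticket lock; names are hypothetical, not kernel API. */
typedef struct {
    atomic_uint head;  /* ticket currently being served */
    atomic_uint tail;  /* next ticket to hand out */
} ticket_lock;

void ticket_init(ticket_lock *l) {
    atomic_init(&l->head, 0);
    atomic_init(&l->tail, 0);
}

/* Locked (or contended) whenever some issued ticket has not been released. */
bool ticket_is_locked(ticket_lock *l) {
    return atomic_load(&l->head) != atomic_load(&l->tail);
}

/* Take a ticket, then busy-wait until that ticket is served (FIFO order). */
void ticket_acquire(ticket_lock *l) {
    unsigned ticket = atomic_fetch_add(&l->tail, 1);
    while (atomic_load(&l->head) != ticket) {
        /* spin */
    }
}

/* Pass the lock to the owner of the next ticket. */
void ticket_release(ticket_lock *l) {
    assert(ticket_is_locked(l));   /* releasing an unheld lock is a bug */
    atomic_fetch_add(&l->head, 1);
}
```

Because every waiter spins until `head` equals its own ticket, a waiter that is not running when its turn comes blocks everyone behind it. That strict ordering is what the following slides examine.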

SLIDE 5

Preemption in VMs

— Lock Holder Preemption (LHP)

— virtualization breaks the OS assumption
— a vCPU holding a lock is descheduled by the VMM
— preemption prolongs the critical section (~1 ms vs. ~1 µs)

— Proposed Solutions

— co-scheduling and variants
— hardware-assisted scheme (Pause Loop Exiting)
— paravirtual spinlocks

SLIDE 6

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 1, 2; head = 0, tail = 2. Legend: scheduled waiter, preempted waiter.]

SLIDE 7

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 1, 2; head = 0, tail = 2. Legend: scheduled waiter, preempted waiter.]

SLIDE 8

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 1, 2, 3; head = 0, tail = 3. Legend: scheduled waiter, preempted waiter.]

SLIDE 9

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 1, 2, 3, 4; head = 0, tail = 4. Legend: scheduled waiter, preempted waiter.]

SLIDE 10

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 1, 2, 3, 4; head = 1, tail = 4. Legend: scheduled waiter, preempted waiter.]

SLIDE 11

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 1, 2, 3, 4; head = 1, tail = 4. Legend: scheduled waiter, preempted waiter.]

Lock Holder Preemption!

SLIDE 12

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 1, 2, 3, 4; head = 1, tail = 4. Legend: scheduled waiter, preempted waiter.]

SLIDE 13

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 1, 2, 3, 4; head = 1, tail = 4. Legend: scheduled waiter, preempted waiter.]

SLIDE 14

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 2, 3, 4; head = 2, tail = 4. Legend: scheduled waiter, preempted waiter.]

SLIDE 15

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 3, 4; head = 3, tail = 4. Legend: scheduled waiter, preempted waiter.]

SLIDE 16

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 3, 4, 5; head = 3, tail = 5. Legend: scheduled waiter, preempted waiter.]

SLIDE 17

Preemption in Ticket Lock

[Figure: ticket lock queue showing tickets 3, 4, 5, 6; head = 3, tail = 6. Legend: scheduled waiter, preempted waiter.]

Lock Waiter Preemption: wait on available resource

SLIDE 18

Lock Waiter Preemption

— A lock waiter is preempted
— Later waiters wait on an available lock
— Possible to adapt to it, if we

— detect the preempted waiter
— acquire the lock out of order
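
The difference between the two preemption problems can be stated as a small classification rule. The sketch below is a toy model only: the `lock_snapshot` fields and the function `classify_stall` are hypothetical names introduced here, and a real guest kernel cannot observe these facts directly (which is why the talk goes on to use a timeout for detection).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy snapshot of one ticket lock; all fields are illustrative assumptions. */
typedef struct {
    bool     head_entered;   /* owner of the head ticket is inside the critical section */
    bool     head_running;   /* vCPU of the head ticket's owner is currently scheduled */
    unsigned later_waiters;  /* scheduled vCPUs spinning with later tickets */
} lock_snapshot;

typedef enum { NO_STALL, LOCK_HOLDER_PREEMPTION, LOCK_WAITER_PREEMPTION } stall_kind;

/* Classify why the later waiters are (or are not) stuck. */
stall_kind classify_stall(const lock_snapshot *s) {
    assert(s != NULL);
    if (s->head_running || s->later_waiters == 0)
        return NO_STALL;               /* progress is possible, or nobody is blocked */
    return s->head_entered
        ? LOCK_HOLDER_PREEMPTION       /* lock is held by a descheduled vCPU */
        : LOCK_WAITER_PREEMPTION;      /* lock is free, but the next in line is descheduled */
}
```

Only the second case leaves an available lock unused, which is the situation an out-of-order acquisition can exploit.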

How significant is it??

SLIDE 19

Waiter Preemption Dominates

Table 2: Lock Waiter Preemption Problem in the Linux Kernel

Benchmark      LHP + LWP   LWP     LWP / (LHP + LWP)
hackbench x1   1089        452     41.5%
hackbench x2   44342       39221   88.5%
ebizzy x1      294         166     56.5%
ebizzy x2      1017        980     96.4%

Lock waiter preemption dominates in overcommitted environments

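
The last column of Table 2 is the share of lock waiter preemption among all counted preemption problems. A small integer-only helper reproduces the percentages from the two count columns; the function name `lwp_share_permille` is made up for this illustration.

```c
#include <assert.h>

/* Share of LWP among (LHP + LWP), in tenths of a percent, rounded to nearest.
 * Example: a return value of 415 means 41.5%. */
unsigned lwp_share_permille(unsigned long long lwp, unsigned long long total) {
    assert(total > 0 && lwp <= total);
    return (unsigned)((lwp * 1000 + total / 2) / total);
}
```

For the table rows this gives 415, 885, 565 and 964, matching 41.5%, 88.5%, 56.5% and 96.4%.
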
SLIDE 20

Challenges & Approach

— How to identify a preempted waiter?

— timeout threshold

— How to violate order constraints?

— allow timed-out waiters to get the lock in random order
— ensure mutual exclusion between them

— How NOT to break the whole ordering mechanism?

— timeout threshold proportional to queue position

SLIDE 21

Queue Position Index

N = ticket - queue_head

— ticket: copy of the queue tail value upon enqueue
— N: number of earlier waiters

[Figure: queue n, n+1, n+2; head = n, tail = n+2. For the waiter with ticket = n+2, N = 2.]
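
In C the queue position can be computed with unsigned subtraction, which remains correct when the ticket counters wrap around. This is a sketch under assumptions: the function name `queue_position` and the sanity check against `tail` are additions made here, not taken from the slides.

```c
#include <assert.h>

/* N = ticket - head, computed modulo 2^k (k = width of unsigned),
 * so the result is still right after the counters wrap. */
unsigned queue_position(unsigned ticket, unsigned head, unsigned tail) {
    unsigned n = ticket - head;
    assert(n <= tail - head);   /* an issued ticket lies between head and tail */
    return n;
}
```

With head = 5 and ticket = tail = 7 (the slide's n, n+2 with n = 5), the result is 2.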

SLIDE 22

Proportional Timeout Threshold

T = N × t

— t is a constant parameter

— large enough to avoid false detection
— small enough to save waiting time

— Performance is NOT sensitive to the value of t

— most locks take ~1 µs, and most spinning time is wasted on locks that wait ~1 ms
— a larger t does no harm, and a smaller t does not gain much

[Figure: queue n, n+1, n+2; head = n, tail = n+2; timeout thresholds t for n+1 and 2t for n+2.]
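
The threshold rule is a one-line computation. In the sketch below, t is measured in spin-loop iterations, which is an assumption made here for simplicity; the function names are also illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* T = N x t, with t given in spin-loop iterations (an assumed unit). */
unsigned long timeout_threshold(unsigned n, unsigned long t) {
    assert(t > 0);
    return (unsigned long)n * t;
}

/* A waiter at position n may stop waiting in order once it has spun for T. */
bool timed_out(unsigned n, unsigned long t, unsigned long spins) {
    return spins >= timeout_threshold(n, t);
}
```

Because T grows with the queue position, waiters closer to the head time out first, so the original order is still the most likely outcome even after a timeout.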

SLIDE 23

Preemptable Ticket Spinlock

[Figure: ticket queue showing tickets 1, 2, 3, 4, 5; head = 0, tail = 5; timeout thresholds t, 2t, 3t, 4t, 5t. Legend: scheduled waiter, preempted waiter.]

SLIDE 24

Preemptable Ticket Spinlock

[Figure: ticket queue showing tickets 1, 2, 3, 4, 5; head = 1, tail = 5; timeout thresholds t, 2t, 3t, 4t. Legend: scheduled waiter, preempted waiter.]

SLIDE 25

Preemptable Ticket Spinlock

[Figure: ticket queue showing tickets 1, 2, 3, 4, 5; head = 1, tail = 5; timeout thresholds t, 2t, 3t, 4t. Legend: scheduled waiter, preempted waiter.]

SLIDE 26

Preemptable Ticket Spinlock

[Figure: ticket queue showing tickets 1, 3, 4, 5; head = 2, tail = 5; timeout thresholds t, 2t, 3t. Legend: scheduled waiter, preempted waiter.]

N = ticket - head

SLIDE 27

Preemptable Ticket Spinlock

[Figure: ticket queue showing tickets 1, 3, 4, 5; head = 2, tail = 5; timeout thresholds t, 2t, 3t. Legend: scheduled waiter, preempted waiter.]

SLIDE 28

Preemptable Ticket Spinlock

[Figure: ticket queue showing tickets 1, 3, 5; head = 3, tail = 5; timeout threshold 2t. Legend: scheduled waiter, preempted waiter.]

SLIDE 29

Preemptable Ticket Spinlock

[Figure: ticket queue showing tickets 1, 3, 5; head = 3, tail = 5; timeout threshold 2t. Legend: scheduled waiter, preempted waiter.]

SLIDE 30

Preemptable Ticket Spinlock

[Figure: ticket queue showing tickets 1, 3, 5; head = 3, tail = 5; timeout threshold 2t. Legend: scheduled waiter, preempted waiter.]

SLIDE 31

Summary

— Preemptable Ticket Lock adapts to preemption

— preserve order in the absence of preemption
— violate order upon preemption

— Preemptable Ticket Lock preserves fairness

— order violations are restricted

— priority is always given to timed-out waiters
— timed-out waiters are bounded by the number of vCPUs of a VM
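
The behaviour summarized above can be approximated in a few dozen lines of C11. The sketch below is written for this transcript and is not the authors' Linux implementation: the `pmt_lock` layout, the function names, the `held` flag used for mutual exclusion among contenders, and the value of `SPINS_PER_SLOT` (standing in for t) are all assumptions. It also does not implement the priority rule for timed-out waiters.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SPINS_PER_SLOT 1024ul   /* hypothetical value of t, in loop iterations */

typedef struct {
    atomic_uint head;   /* number of releases so far: the ticket served in order */
    atomic_uint tail;   /* next ticket to hand out */
    atomic_bool held;   /* mutual exclusion between in-order and timed-out entrants */
} pmt_lock;

void pmt_init(pmt_lock *l) {
    atomic_init(&l->head, 0);
    atomic_init(&l->tail, 0);
    atomic_init(&l->held, false);
}

unsigned pmt_take_ticket(pmt_lock *l) {
    return atomic_fetch_add(&l->tail, 1);
}

/* One attempt to enter. Allowed when it is this ticket's turn (N == 0),
 * when head has already moved past the ticket, or after spinning N * t. */
bool pmt_try_enter(pmt_lock *l, unsigned ticket, unsigned long spins) {
    unsigned n = ticket - atomic_load(&l->head);   /* modular N = ticket - head */
    bool skipped = n > UINT_MAX / 2;               /* head moved past this ticket */
    if (!skipped && spins < (unsigned long)n * SPINS_PER_SLOT)
        return false;                              /* keep waiting in order */
    return !atomic_exchange(&l->held, true);       /* only one contender wins */
}

void pmt_acquire(pmt_lock *l) {
    unsigned ticket = pmt_take_ticket(l);
    for (unsigned long spins = 0; !pmt_try_enter(l, ticket, spins); spins++) {
        /* spin */
    }
}

void pmt_release(pmt_lock *l) {
    assert(atomic_load(&l->held));    /* releasing an unheld lock is a bug */
    atomic_fetch_add(&l->head, 1);    /* shrink N for everyone still queued */
    atomic_store(&l->held, false);    /* then open the lock */
}
```

Without preemption, each waiter enters when `head` reaches its ticket, so FIFO order is preserved. If the waiter at the head is descheduled, a later waiter enters once its own N × t budget is used up, and the skipped waiter may enter immediately when it runs again.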

SLIDE 32

Implementation

— Drop-in replacement

— lock(), unlock(), is_locked(), trylock(), etc.

— Correct

— race condition free: atomic updates

— Fast

— performance is sensitive to lock efficiency

— ~60 lines of C/inline-assembly in Linux 3.5.0

SLIDE 33

Paravirtual Spinlocks

— Lock holder preemption is unaddressed

— semantic gap between guest and host
— paravirtualization: guest/host cooperation

— signal a long-waiting lock / put a vCPU to sleep
— notify to wake up a vCPU / wake up a vCPU

— Paravirtual preemptable ticket spinlock

— sleep when waiting too long after timing out
— wake up all sleeping waiters upon lock release
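
The waiting policy of the paravirtual variant can be pictured as three phases. The sketch below is a pure decision function written for illustration; the enum, the function name `pv_next_action`, and the `sleep_after` budget are hypothetical, and the actual guest/host calls that put a vCPU to sleep or wake it are deliberately left out.

```c
#include <assert.h>

typedef enum { WAIT_IN_ORDER, TRY_ACQUIRE, SLEEP_UNTIL_RELEASE } pv_action;

/* Decide what a waiter at queue position n should do after `spins` iterations.
 * t:           per-position timeout unit (T = N x t)
 * sleep_after: assumed extra spin budget after timing out, before sleeping */
pv_action pv_next_action(unsigned n, unsigned long spins,
                         unsigned long t, unsigned long sleep_after) {
    assert(t > 0 && sleep_after > 0);
    unsigned long timeout = (unsigned long)n * t;
    if (spins < timeout)
        return WAIT_IN_ORDER;          /* still within T: respect ticket order */
    if (spins - timeout < sleep_after)
        return TRY_ACQUIRE;            /* timed out: contend for the lock */
    return SLEEP_UNTIL_RELEASE;        /* a real guest would ask the host to sleep here */
}
```

On release, the slide's policy wakes all sleeping waiters, which then go back to contending for the lock.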

SLIDE 34

Evaluation

— Host

— 8-core 2.6 GHz Intel Core i7 CPU, 8 GB RAM, 1 Gbit NIC, Fedora 17 (Linux 3.5.0)

— Guest

— 8 cores, 1 GB RAM, Fedora 17 (Linux 3.5.0)

— Benchmarks

— hackbench, ebizzy, Dell DVD Store

— Lock implementations

— baseline: ticket lock, paravirtual ticket lock (pv-lock)
— preemptable ticket lock
— paravirtual (pv) preemptable ticket lock

SLIDE 35

Hackbench

— Average Speedup

— preemptable-lock vs. ticket lock: 4.82X
— pv-preemptable-lock vs. ticket lock: 7.08X
— pv-preemptable-lock vs. pv-lock: 1.03X

SLIDE 36

Ebizzy

Less variance than ticket lock and pv-lock

— in-VM preemption adaptivity
— less VM interference

[Figure annotations: variance 80.36 vs. 10.94; variance 131.62 vs. 16.09]

SLIDE 37

Dell DVD Store (apache/mysql)

— Average Speedup

— preemptable-lock vs. ticket lock: 11.68X
— pv-preemptable-lock vs. ticket lock: 19.52X
— pv-preemptable-lock vs. pv-lock: 1.11X

SLIDE 38

Evaluation Summary

— Preemptable Ticket Spinlocks speedup

— 5.32X over ticket lock

— Paravirtual Preemptable Ticket Spinlocks speedup

— 7.91X over ticket lock
— 1.08X over paravirtual ticket lock

Average speedup across cases for all benchmarks

SLIDE 39

Conclusion

— Lock Waiter Preemption

— most significant preemption problem in queue-based locks under overcommitted environments

— Preemptable Ticket Spinlock

— implementation with ~60 lines of code in Linux

— Better performance in overcommitted environments

— 5.32X average speedup over ticket lock without VMM support
— 1.08X average speedup over pv-lock, with less variance

SLIDE 40

Thank You

— Jiannan Ouyang

— ouyang@cs.pitt.edu
— http://www.cs.pitt.edu/~ouyang/

[Figure: Preemptable Ticket Spinlock; tickets 1 to 5 with timeout thresholds t, 2t, 3t, 4t]