slide-1
SLIDE 1

Scalability in the Clouds! A Myth or Reality?

Sanidhya Kashyap, Changwoo Min, Taesoo Kim

slide-3
SLIDE 3

Programmer's Paradise?

  • A programmer's day-to-day task: program compilation, such as compiling the Linux kernel.
  • Relies on Buildbot to complete the job ASAP!
  • Expects the job to complete sooner with increasing core count.

– With respect to vertical scalability, a parallel job with no sequential bottleneck should scale with increasing core count.

How about using Cloud providers for our fun and their profit?
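The scaling expectation on this slide can be made concrete with Amdahl's law, which bounds the speedup of a job on n cores when a fraction s of its work is sequential. The sketch below is illustrative only: the function name `amdahl_speedup` is made up for this example, and the talk itself does not present this formula.

```c
#include <assert.h>

/* Amdahl's law: ideal speedup on n cores when a fraction s (0..1)
 * of the work is sequential. With s == 0 the speedup equals n,
 * which is the "should scale with core count" expectation. */
double amdahl_speedup(double s, unsigned n)
{
    assert(s >= 0.0 && s <= 1.0 && n > 0);
    return 1.0 / (s + (1.0 - s) / (double)n);
}
```

For instance, `amdahl_speedup(0.0, 32)` gives 32, while even a 5% sequential part (`amdahl_speedup(0.05, 32)`) caps the speedup at roughly 12.5.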

slide-4
SLIDE 4

Clouds Trend

  • Trend is changing

Larger instances (up to 40 vCPUs) are available.

  • Will Buildbot really scale?

[Figure: vCPUs by year]

Year    2006  2007  2008  2009  2010  2011  2012  2013  2014  2015
vCPUs      1     4     8     8    16    32    32    32    32    40

slide-12
SLIDE 12

Scalability Behavior in the Clouds

[Figure: builds / hour vs. #vCPUs (4 to 32) for EC2, GCE, Azure, and a 16-core E5]

?

slide-15
SLIDE 15

Scalability Behavior in VMs with Higher-core count

[Figure: builds / hour vs. #vCPUs (up to 160) for Host and Guest, annotated with 6.7x]

slide-17
SLIDE 17
  • Performance degradation occurs due to a drastic increase in VMEXITs (halt exits).

Why?

[Figures: #halt exits (x 1000) vs. #vCPUs for Guest; builds / hour vs. #vCPUs for Guest]

Spinlock is sleeping!

slide-23
SLIDE 23

Spinlock Evolution in the Linux Kernel

  • Test-and-test-and-set spinlock
  • 2.6.25 (April 2008): ticket spinlock (fairness)
  • 3.11 (2013): paravirtual ticket spinlock
  • 3.15 (July 2014): qspinlock, a variant of the MCS lock, yet to be merged (shared cacheline contention)
  • 4.0 (May 2015): OTicket

slide-24
SLIDE 24

Ticket Spinlock

  • Guaranteed FIFO ordering.
  • Mitigates starvation with increasing core count.

/* Fetch-and-increment; executed atomically. */
F&I(*addr) {
    old = *addr;
    (*addr)++;
    return old;
}

[Diagram: ticket queue with tail and head]

#define SPIN_THRESHOLD (1 << 15)
int head = 0;
int tail = 0;
int threshold = SPIN_THRESHOLD;

void lock() {
    my_ticket = F&I(tail);
    for ( ; ; ) {
        int count = threshold;
        do {
            if (my_ticket == head)
                goto out;
        } while (--count);
    }
out: ;
}

void unlock() {
    head++;
}
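The slide's pseudo-code can be turned into compilable C with C11 atomics. This is a minimal user-space sketch, not the Linux kernel implementation: the type `ticket_lock_t` and the function names are chosen for this example, and the spin loop omits the threshold since nothing happens when it expires in the plain ticket lock.

```c
#include <stdatomic.h>

/* head: ticket currently being served; tail: next ticket to hand out. */
typedef struct {
    atomic_uint head;
    atomic_uint tail;
} ticket_lock_t;

void ticket_lock_init(ticket_lock_t *l)
{
    atomic_init(&l->head, 0);
    atomic_init(&l->tail, 0);
}

/* Take a ticket with an atomic fetch-and-increment (the slide's F&I),
 * then spin until that ticket is served. Returns the ticket so callers
 * can observe the FIFO order. */
unsigned ticket_lock_acquire(ticket_lock_t *l)
{
    unsigned my_ticket =
        atomic_fetch_add_explicit(&l->tail, 1, memory_order_relaxed);
    while (atomic_load_explicit(&l->head, memory_order_acquire) != my_ticket) {
        /* busy-wait */
    }
    return my_ticket;
}

/* Only the lock holder writes head, so a plain load plus a release
 * store is enough to pass the lock to the next ticket. */
void ticket_lock_release(ticket_lock_t *l)
{
    unsigned next = atomic_load_explicit(&l->head, memory_order_relaxed) + 1;
    atomic_store_explicit(&l->head, next, memory_order_release);
}
```

Because tickets are handed out in arrival order and served strictly in sequence, a waiter can never be overtaken, which is where the FIFO guarantee comes from.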

slide-32
SLIDE 32

Complexity of Ticket Spinlock in Virtualized Environment

[Diagram: Guest OS with vCPU1 and vCPU2, Hypervisor, CPU1]

  • vCPUs are scheduled by the host scheduler.
  • Semantic gap between the hypervisor and guest OS.

slide-40
SLIDE 40

Complexity of Ticket Spinlock in Virtualized Environment

  • Lock Holder Preemption: the vCPU holding the lock gets preempted.

[Diagram: vCPUs with tickets 1 and 2 marked Scheduled or Preempted; head = 1]

Lock Holder Preemption!

slide-45
SLIDE 45

Complexity of Ticket Spinlock in Virtualized Environment

  • Lock Waiter Preemption: the next waiter is preempted before acquiring the lock.

[Diagram: vCPUs with tickets 2 and 3 marked Scheduled or Preempted; tail = 4]

Lock Waiter Preemption!

slide-46
SLIDE 46

Current Solution to LHP and LWP

  • Handling lock requests depending on the lock state.

– Lock: yield if long wait.
– Unlock: wake up the preempted waiter.

  • A paravirtual interface to track state change.
slide-47
SLIDE 47

Paravirtual Ticket Spinlock

 #define SPIN_THRESHOLD (1 << 15)
 int head = 0;
 int tail = 0;
 int threshold = SPIN_THRESHOLD;

 void lock() {
     my_ticket = F&I(tail);
     for ( ; ; ) {
         int count = threshold;
         do {
             if (my_ticket == head)
                 goto out;
         } while (--count);
+        slowpath_spin(tail);
     }
 out: ;
 }

 void unlock() {
+    wakeup_cpu(head + 1);
     head++;
 }

  • Lock:

– Fast path: spin till a certain threshold value.
– Slow path: notify the hypervisor to de-schedule the thread.

  • Unlock:

– Wake-up procedure to re-schedule the next waiting thread.
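The fast-path/slow-path split described above can be sketched in compilable C. Everything hypervisor-related here is a stand-in: `pv_halt` and `pv_kick` are hypothetical hooks that only count calls, where a real guest would halt the vCPU and ask the hypervisor to reschedule a waiter. The type and function names are assumptions made for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SPIN_THRESHOLD (1u << 15)

typedef struct {
    atomic_uint head;           /* ticket currently being served */
    atomic_uint tail;           /* next ticket to hand out */
    atomic_uint halt_requests;  /* hypothetical: slow-path entries */
    atomic_uint kicks;          /* hypothetical: wake-up requests */
} pv_ticket_lock_t;

void pv_lock_init(pv_ticket_lock_t *l)
{
    atomic_init(&l->head, 0);
    atomic_init(&l->tail, 0);
    atomic_init(&l->halt_requests, 0);
    atomic_init(&l->kicks, 0);
}

/* Stand-in for the slow path that de-schedules the waiting vCPU. */
void pv_halt(pv_ticket_lock_t *l) { atomic_fetch_add(&l->halt_requests, 1); }

/* Stand-in for waking the vCPU that waits on `ticket`. */
void pv_kick(pv_ticket_lock_t *l, unsigned ticket)
{
    (void)ticket;
    atomic_fetch_add(&l->kicks, 1);
}

/* Fast path: spin at most `threshold` times; true once our ticket is served. */
bool pv_spin(pv_ticket_lock_t *l, unsigned my_ticket, unsigned threshold)
{
    for (unsigned count = threshold; count > 0; --count)
        if (atomic_load_explicit(&l->head, memory_order_acquire) == my_ticket)
            return true;
    return false;
}

void pv_lock(pv_ticket_lock_t *l)
{
    unsigned my_ticket = atomic_fetch_add(&l->tail, 1);
    while (!pv_spin(l, my_ticket, SPIN_THRESHOLD))
        pv_halt(l);             /* slow path: give up the CPU */
}

void pv_unlock(pv_ticket_lock_t *l)
{
    unsigned next = atomic_load(&l->head) + 1;
    /* Publish the new head before kicking, so a woken waiter sees its turn. */
    atomic_store_explicit(&l->head, next, memory_order_release);
    pv_kick(l, next);
}
```

In this sketch an uncontended `pv_lock` never reaches `pv_halt`; only a waiter that exhausts `SPIN_THRESHOLD` iterations takes the slow path, which is the event the talk counts as a halt exit.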

slide-58
SLIDE 58

Problem: The Mechanism to Annotate the Slow Behavior

slowpath_spin issues the hlt instruction → the hypervisor traps the instruction → the hypervisor then de-schedules the vCPU.

  • Probable cause of degradation:

– Most vCPUs trap to the hypervisor.
– The switching overhead between guest and host, plus the communication cost of waking up other vCPUs, increases.

slide-59
SLIDE 59

Key idea: Ordering

  • OTicket tries to exploit the ordering.
  • Lock:

– Lower ticket distance → longer spin.
– Allows more spinning to nearby waiters.

  • Unlock:

– Wake-up multiple waiters.
– Reduces latency for the upcoming waiters.

slide-60
SLIDE 60

OTicket: Opportunistic Spinning

+#define EAGER_WAITERS 4
+#define TICKET_QUEUE 18
+#define SPIN_MAX_THRESHOLD 34
 #define SPIN_THRESHOLD 15
 int head = 0;
 int tail = 0;
+u64 threshold = SPIN_THRESHOLD;

 void lock() {
     my_ticket = F&I(tail);
+    dist = my_ticket - head;    /* ticket distance to the lock holder */
+    if (dist < TICKET_QUEUE) {
+        threshold = SPIN_MAX_THRESHOLD >> (dist - 1);
+    }
     for ( ; ; ) {
         int count = threshold;
         do {
             if (my_ticket == head)
                 goto out;
         } while (--count);
         slowpath_spin(tail);
     }
 out: ;
 }

[Diagram: queued waiters labeled 1000, 100, 10]
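The distance-based threshold can be isolated into a small function. The sketch below rests on an assumption that is not stated on the slide: the constants 34 and 15 are read as shift exponents (2^34 and 2^15 spin iterations), which keeps the opportunistic threshold above the default one for every distance below `TICKET_QUEUE`. The names `SPIN_MAX_SHIFT`, `SPIN_SHIFT` and `oticket_spin_threshold` are made up for this example.

```c
#include <stdint.h>

#define TICKET_QUEUE   18
#define SPIN_MAX_SHIFT 34   /* assumption: slide's SPIN_MAX_THRESHOLD 34 means 1 << 34 */
#define SPIN_SHIFT     15   /* assumption: slide's SPIN_THRESHOLD 15 means 1 << 15 */

/* Spin budget for a waiter holding `my_ticket` while `head` is served.
 * Waiters close to the head get an exponentially larger budget; waiters
 * further away than TICKET_QUEUE keep the default budget. */
uint64_t oticket_spin_threshold(uint32_t my_ticket, uint32_t head)
{
    uint32_t dist = my_ticket - head;       /* unsigned math survives wrap-around */
    if (dist < TICKET_QUEUE) {
        uint32_t shift = dist ? dist - 1 : 0;   /* guard the dist == 0 case */
        return (UINT64_C(1) << SPIN_MAX_SHIFT) >> shift;
    }
    return UINT64_C(1) << SPIN_SHIFT;
}
```

Under this reading, the next waiter (distance 1) may spin 2^34 times, each further position halves the budget, and the waiter at distance 17 still gets 2^18 iterations before it falls back to the slow path.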

slide-75
SLIDE 75

OTicket: Opportunistic Wake-up

 void unlock() {
+    for (count = 1; count <= EAGER_WAITERS; ++count) {
+        wakeup_cpu(head + count);
+    }
     head++;
 }

[Diagram: on unlock, the holder of ticket t wakes up the sleeping waiters t+1, t+2, t+3]
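The effect of the eager wake-up loop can be illustrated with a single-threaded model. This is not a working lock: the struct `oticket_model_t`, the `sleeping` array and `MAX_VCPUS` are hypothetical bookkeeping invented for this example, standing in for vCPUs that have halted in the slow path.

```c
#include <stdbool.h>
#include <stddef.h>

#define EAGER_WAITERS 4
#define MAX_VCPUS     8     /* hypothetical size of the model */

typedef struct {
    unsigned head;                  /* ticket currently being served */
    bool sleeping[MAX_VCPUS];       /* sleeping[t % MAX_VCPUS]: waiter t is halted */
} oticket_model_t;

/* Model of the slide's unlock(): wake up to EAGER_WAITERS upcoming
 * waiters, then pass the lock on. Returns how many sleepers were woken. */
unsigned oticket_unlock(oticket_model_t *l)
{
    unsigned woken = 0;
    for (unsigned count = 1; count <= EAGER_WAITERS; ++count) {
        size_t slot = (l->head + count) % MAX_VCPUS;
        if (l->sleeping[slot]) {
            l->sleeping[slot] = false;  /* stands in for wakeup_cpu(head + count) */
            ++woken;
        }
    }
    l->head++;
    return woken;
}
```

Waking several upcoming waiters in one unlock means that a waiter is usually already running by the time its ticket comes up, instead of paying the wake-up latency on the critical path.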

slide-76
SLIDE 76

Outline

  • Scalability issue in the Clouds
  • Scalability issue in VMs with higher core count
  • OTicket design
  • Evaluation
  • Conclusion
slide-78
SLIDE 78

OTicket: Guest vs Host

  • Improves guest performance by almost 5x.
  • Reduces halt exits by 6x.

[Figures: builds / hour vs. #vCPUs (up to 160) for Host, Guest, and OTicket; #halt exits (x 1000) vs. #vCPUs for Guest and OTicket]

slide-81
SLIDE 81

OTicket Performance Breakdown

  • Opportunistic spinning prohibits sleeping.

[Figure: builds / hour vs. #vCPUs (up to 160) for OTicket, Opportunistic spinning, and Opportunistic wake-up]

slide-84
SLIDE 84

Importance of Wake-ups

  • Oversubscribed tenants.
  • OTicket performs better due to opportunistic wake-up.

[Figure: builds / hour vs. #vCPUs (up to 160) for Guest, OTicket, and Longer spinning]

slide-87
SLIDE 87

Other Spinlock Alternatives

  • Two spinlock implementations:

– Current ticket spinlock
– Fast-queue spinlock

Qspinlock has the same issue. Our design has already been acknowledged!

[Figure: builds / hour vs. #vCPUs (up to 160) for Guest, OTicket, and Qspin]

slide-88
SLIDE 88

Conclusion

  • Identified a new class of problem.

– Not cacheline contention.
– Sleepy spinlock anomaly.

  • Carefully utilizing the ordering property can scale the spinlock:

– Opportunistic spinning.
– Opportunistic wake-up.

slide-90
SLIDE 90

Future Work

  • Scalability of other synchronization primitives in virtualized environments?

[Figure: builds / hour vs. #vCPUs (10 to 80) for Ideal, Host, and OTicket]

?

slide-91
SLIDE 91

Sanidhya Kashyap sanidhya@gatech.edu Changwoo Min, Taesoo Kim

Thank you! Questions?

https://github.com/sslab-gatech/vbench