Scaling Guest OS Critical Sections with eCS
Sanidhya Kashyap, Changwoo Min, Taesoo Kim



  1. Scaling Guest OS Critical Sections with eCS
     Sanidhya Kashyap, Changwoo Min, Taesoo Kim

  2. The physical and virtual CPU abstraction
     ● Mismatch between CPU abstractions

  3. The physical and virtual CPU abstraction
     ● Mismatch between CPU abstractions
     [Figure: physical machine (host) with pCPU 1 to pCPU 4]

  4. The physical and virtual CPU abstraction
     ● Mismatch between CPU abstractions
     [Figure: physical machine (host) with pCPU 1 to pCPU 4 as the hardware abstraction]

  5. The physical and virtual CPU abstraction
     ● Mismatch between CPU abstractions
     [Figure: apps in a virtual machine run on vCPU 1 to vCPU 4; the hypervisor sits between the vCPUs and pCPU 1 to pCPU 4 (hardware abstraction) of the physical machine (host)]

  6. The physical and virtual CPU abstraction
     ● Mismatch between CPU abstractions
     [Figure: apps in a virtual machine run on vCPU 1 to vCPU 4 (software abstraction); the hypervisor sits between the vCPUs and pCPU 1 to pCPU 4 (hardware abstraction) of the physical machine (host)]

  7. The physical and virtual CPU abstraction
     ● Mismatch between CPU abstractions
     ● VM consolidation
       ○ Multiple vCPUs
       ○ Contention on pCPU
     [Figure: vCPU 1 to vCPU 4 (software abstraction) over the hypervisor and pCPU 1 to pCPU 4 (hardware abstraction); VM1 to VM4, each running apps, share pCPU 1 to pCPU 4 of one physical machine (host) through the hypervisor]

  8. The physical and virtual CPU abstraction
     ● Mismatch between CPU abstractions
       ○ A vCPU can be preempted without notification
     ● VM consolidation
       ○ Multiple vCPUs
       ○ Contention on pCPU
     [Figure: vCPU 1 to vCPU 4 (software abstraction) over the hypervisor and pCPU 1 to pCPU 4 (hardware abstraction); VM1 to VM4, each running apps, share pCPU 1 to pCPU 4 of one physical machine (host) through the hypervisor]

  9. The physical and virtual CPU abstraction
     ● Mismatch between CPU abstractions
       ○ A vCPU can be preempted without notification
     ● VM consolidation
       ○ Multiple vCPUs
       ○ Contention on vCPU
     ● Double scheduling issue
     [Figure: vCPU 1 to vCPU 4 (software abstraction) over the hypervisor and pCPU 1 to pCPU 4 (hardware abstraction); VM1 to VM4, each running apps, share pCPU 1 to pCPU 4 of one physical machine (host) through the hypervisor]

  10. Double scheduling: Lock holder preemption (LHP)
      ● A vCPU holding a lock is preempted
      ● Preemption hinders forward progress of the VM
      ● Can lead to an application slowdown of 20% to 130%
      [Figure: tasks A, B, and C in a VM run on vCPU 1 to vCPU 3 and access a file; legend: vCPU scheduled, vCPU preempted, running task in a VM]

  11. Efforts to mitigate preemption issues
      Research efforts
      ● Focused only on non-blocking locks
        ○ Acquire iff sufficient schedule time
      ● Hotplug vCPUs on the fly
        ○ May not scale to large vCPU VMs
      ● VM co-scheduling
        ○ Does not always alleviate the issue
      Current practice
      ● Mostly address other preemption problem
        ○ Blocking locks
        ○ Unfair non-blocking locks
      ● Hardware features to mitigate preemptions

  12. Efforts to mitigate preemption issues
      Research efforts
      ● Focused only on non-blocking locks
        ○ Acquire iff sufficient schedule time
      ● Hotplug vCPUs on the fly
        ○ May not scale to large vCPU VMs
      ● VM co-scheduling
        ○ Does not always alleviate the issue
      Current practice
      ● Mostly address other preemption problem
        ○ Blocking locks
        ○ Unfair non-blocking locks
      ● Hardware features to mitigate preemptions
      Prior approaches are mostly specialized

  13. Still the double scheduling is looming!
      ● LHP for blocking locks
        ○ mutex, rwsem
      ● Readers preemption (RP) in read-write locks
        ○ A reader is preempted while holding the lock
      ● Interrupt context preemption (ICP)
        ○ Preemption of a vCPU processing an interrupt
      ● Blocked-waiter wakeup (BWW)
        ○ Waking up a blocked thread on an idle vCPU is at least 10 times costlier

  14. Still the double scheduling is looming!
      ● LHP for blocking locks
        ○ mutex, rwsem
      ● Readers preemption (RP) in read-write locks
        ○ A reader is preempted while holding the lock
      ● Interrupt context preemption (ICP)
        ○ Preemption of a vCPU processing an interrupt
      ● Blocked-waiter wakeup (BWW)
        ○ Waking up a blocked thread on an idle vCPU is at least 10 times costlier
      Semantic gap between virtual and physical CPU

  15. Our approach to address the semantic gap
      Insight: A vCPU may be running a critical task!
      Approach: Avoid preempting a vCPU with a critical task
      Design: Identify and mark/unmark a critical task

  16. Identifying each critical section with eCS
      ● Synchronization primitives protect critical sections → ensure OS progress
      ● Mark and unmark critical sections before and after the critical section
      ● Conservative, but effective approach to address each preemption problem
        ○ 60 LoC annotates 85K lock invocations in 13M LoC in Linux
      [Figure: tasks A, B, and C in a VM run on vCPU 1 to vCPU 3 and access a file; legend: scheduled vCPU, preempted vCPU, running task in a VM]

  17. Identifying each critical section with eCS
      ● Synchronization primitives protect critical sections → ensure OS progress
      ● Mark and unmark critical sections before and after the critical section
      ● Conservative, but effective approach to address each preemption problem
        ○ 60 LoC annotates 85K lock invocations in 13M LoC in Linux
      [Figure: tasks A, B, and C in a VM run on vCPU 1 to vCPU 3 and access a file; legend: scheduled vCPU, preempted vCPU, enlightened vCPU, running task in a VM]

  18. Identifying each critical section with eCS
      ● Synchronization primitives protect critical sections → ensure OS progress
      ● Mark and unmark critical sections before and after the critical section
      ● Conservative, but effective approach to address each preemption problem
        ○ 60 LoC annotates 85K lock invocations in 13M LoC in Linux
      [Figure: tasks A, B, and C in a VM run on vCPU 1 to vCPU 3 and access a file; legend: scheduled vCPU, preempted vCPU, enlightened vCPU, running task in a VM]
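The mark/unmark idea can be sketched as a wrapper around a lock. This is only a toy user-space model, not the authors' kernel annotation: the name `enlightened_lock` and the dictionary `ecs_count` are hypothetical, and the choice to mark right after acquiring and unmark right after releasing is an assumption.

```python
import threading
from contextlib import contextmanager

# Hypothetical per-vCPU counter; it stands in for the state that a guest
# would share with the hypervisor.
ecs_count = {}

@contextmanager
def enlightened_lock(lock, cpu_id):
    """Hold `lock` and keep the vCPU marked as critical while it is held."""
    lock.acquire()
    # Mark: a counter (not a flag) so that nested critical sections work.
    ecs_count[cpu_id] = ecs_count.get(cpu_id, 0) + 1
    try:
        yield
    finally:
        lock.release()
        # Unmark: the vCPU is critical only while the count is non-zero.
        ecs_count[cpu_id] -= 1

file_lock = threading.Lock()
with enlightened_lock(file_lock, cpu_id=1):
    pass  # the critical section, e.g. accessing the shared file
```

Because the marking lives inside the lock wrapper, every call site that already uses the lock is annotated without being edited, which is presumably why a small annotation can cover a large number of lock invocations.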

  19. Sharing the state for efficient notification
      ● Each vCPU shares memory with the hypervisor
      ● vCPU updates information for critical sections
        ○ Notifies critical task to the hypervisor
      ● Hypervisor also updates scheduler context before/after scheduling out a vCPU
        ○ Enables vCPU to make efficient scheduling decisions
      [Figure: vCPU(A), vCPU(B), vCPU(C), ... in the VM each have eCS states that are mirrored in the per-vCPU state kept by the hypervisor; the eCS states are non_preemptable_ecs_count, preemptable_ecs_count, pcpu_overloaded (0/1), and vcpu_preempted (0/1)]

  20. Lightweight para-virtualized APIs to update states
      Hint API
      ● VM → Hypervisor
        ○ activate_non_preemptable_ecs(cpu_id)
        ○ deactivate_non_preemptable_ecs(cpu_id)
        ○ activate_preemptable_ecs(cpu_id)
        ○ deactivate_preemptable_ecs(cpu_id)
      ● Hypervisor → VM
        ○ is_vcpu_preempted(cpu_id)
        ○ is_pcpu_overloaded(cpu_id)
      eCS states
      ● non_preemptable_ecs_count, preemptable_ecs_count: updated by each vCPU; read by the hypervisor
      ● pcpu_overloaded (0/1), vcpu_preempted (0/1): updated by the hypervisor; read by a vCPU
      [Figure: vCPU(A), vCPU(B), vCPU(C), ... in the VM each have eCS states that are mirrored in the per-vCPU state kept by the hypervisor]
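The shared state and the six hint calls can be modelled in a few lines. The field and function names below come from the slides; everything else (a Python list standing in for shared memory, four vCPUs, the return types) is an assumption made for illustration.

```python
from dataclasses import dataclass

@dataclass
class ECSState:
    # Written by the vCPU, read by the hypervisor.
    non_preemptable_ecs_count: int = 0
    preemptable_ecs_count: int = 0
    # Written by the hypervisor, read by the vCPU.
    pcpu_overloaded: int = 0
    vcpu_preempted: int = 0

NUM_VCPUS = 4  # hypothetical VM size
shared = [ECSState() for _ in range(NUM_VCPUS)]  # one slot per vCPU

# VM -> hypervisor hints: plain counter updates, no trap into the hypervisor.
def activate_non_preemptable_ecs(cpu_id: int) -> None:
    shared[cpu_id].non_preemptable_ecs_count += 1

def deactivate_non_preemptable_ecs(cpu_id: int) -> None:
    shared[cpu_id].non_preemptable_ecs_count -= 1

def activate_preemptable_ecs(cpu_id: int) -> None:
    shared[cpu_id].preemptable_ecs_count += 1

def deactivate_preemptable_ecs(cpu_id: int) -> None:
    shared[cpu_id].preemptable_ecs_count -= 1

# Hypervisor -> VM hints: the guest only reads what the hypervisor wrote.
def is_vcpu_preempted(cpu_id: int) -> bool:
    return shared[cpu_id].vcpu_preempted == 1

def is_pcpu_overloaded(cpu_id: int) -> bool:
    return shared[cpu_id].pcpu_overloaded == 1
```

Each field has a single writer, so in this model neither side needs a lock to publish its hints.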

  21. Hypervisor checks eCS state before scheduling out a vCPU
      ➀ Running vCPU 1
      ➁ vCPU 1 acquires lock
      ➂ vCPU 1 updates eCS count
      ➃ Hypervisor checks states before vCPU 1 preemption
      ➄ Hypervisor lets vCPU 1 run for extra time
      ➅ vCPU 1 finishes and updates eCS count
      ➆ Hypervisor penalizes vCPU 1 later
      [Figure: tasks A, B, and C of VM1 on vCPU 1 to vCPU 3 access a file; per-vCPU eCS states show ecs_count values 0, 1, 0; a timeline shows pCPU 1 time-shared between vCPU 1 of VM1 and vCPU 1 of VM2, annotated with steps ➀ to ➆; legend: scheduled vCPU, preempted vCPU, enlightened vCPU, running task in a VM]

  22. Hypervisor checks eCS state before scheduling out a vCPU
      ➀ Running vCPU 1
      ➁ vCPU 1 acquires lock
      ➂ vCPU 1 updates eCS count
      ➃ Hypervisor checks states before vCPU 1 preemption
      ➄ Hypervisor lets vCPU 1 run for extra time
      ➅ vCPU 1 finishes and updates eCS count
      ➆ Hypervisor penalizes vCPU 1 later
      [Figure: tasks A, B, and C of VM1 on vCPU 1 to vCPU 3 access a file; per-vCPU eCS states show ecs_count values 0, 1, 0; a timeline shows pCPU 1 time-shared between vCPU 1 of VM1 and vCPU 1 of VM2, annotated with steps ➀ to ➆ and with the extended schedule and the penalized schedule; legend: scheduled vCPU, preempted vCPU, enlightened vCPU, running task in a VM]
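Steps ➀ to ➆ boil down to one check at the point where the hypervisor is about to schedule out a vCPU. The following is a minimal sketch of that check; the function name `schedule_out_decision` and the rule that a vCPU gets at most one extension per time slice are assumptions, not details taken from the slides.

```python
def schedule_out_decision(ecs_count: int, extension_used: bool) -> str:
    """Decide what happens when a vCPU's time slice expires."""
    # A non-zero count means the vCPU is inside a marked critical section.
    # Assumption: the extension is granted only once, so a guest cannot
    # hold the pCPU indefinitely by never unmarking.
    if ecs_count > 0 and not extension_used:
        return "extend"
    return "preempt"

ecs_count = 0                                                    # step 1: vCPU 1 is running
ecs_count += 1                                                   # steps 2-3: acquire lock, update eCS count
first = schedule_out_decision(ecs_count, extension_used=False)   # steps 4-5
ecs_count -= 1                                                   # step 6: finish, update eCS count
second = schedule_out_decision(ecs_count, extension_used=True)
print(first, second)  # prints "extend preempt"
```

Step ➆, the later penalty, is a matter of accounting rather than of this decision and is not modelled here.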

  23. The case for system eventual fairness
      ● Hypervisor accounts extra time and later penalizes the enlightened VM
        ○ Penalize the schedule of an enlightened VM
        ○ Extend the schedule of the very next VM
      ● Hypervisor optimistically extends time for an enlightened CS
        ○ Decision made just before scheduling out a vCPU
        ○ Extra time (schedule) to avoid preemption: 1 ms
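The accounting behind eventual fairness can be illustrated with a small ledger. The 1 ms extra time is from the slide; the 10 ms base slice, the function name `simulate`, and the rule that the whole debt is repaid at the VM's very next turn are hypothetical. The sketch tracks only the penalty on the enlightened VM and does not model extending the schedule of the very next VM.

```python
BASE_SLICE_MS = 10.0  # hypothetical base time slice
EXTRA_TIME_MS = 1.0   # extra time to avoid preemption (from the slide)

def simulate(turns):
    """Return the total CPU time each VM receives.

    `turns` is a list of (vm_name, in_critical_section_at_expiry) pairs
    in the order in which the hypervisor schedules the VMs.
    """
    debt = {}      # extra time a VM still owes from an earlier extension
    received = {}  # total time granted to each VM so far
    for vm, in_cs in turns:
        # Penalize: repay any earlier extension out of this slice.
        slice_ms = BASE_SLICE_MS - debt.pop(vm, 0.0)
        if in_cs:
            # Optimistically extend the slice and record the debt.
            slice_ms += EXTRA_TIME_MS
            debt[vm] = EXTRA_TIME_MS
        received[vm] = received.get(vm, 0.0) + slice_ms
    return received

totals = simulate([("VM1", True), ("VM2", False),
                   ("VM1", False), ("VM2", False)])
print(totals)  # prints {'VM1': 20.0, 'VM2': 20.0}
```

In this run VM1 receives 11 ms and then 9 ms, so over two rounds both VMs end up with the same total even though VM1 was allowed to finish its critical section.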
