RT-Xen: Real-Time Virtualization
Sisu Xi, Meng Xu, Chenyang Lu, Linh T.X. Phan, Christopher D. Gill, Insup Lee, Oleg Sokolsky
Real-Time Virtualization
- Cars: Consolidate ~100 ECUs -> ~10 multicore processors
Infotainment on Linux or Android
Safety-critical control on AUTOSAR
- Cloud Computing’s Killer App: Gaming [IEEE Spectrum]
Need to compute and stream 30 to 50 frames per second
Applications must meet real-time performance constraints on virtualized platforms!
RT-Xen: Real-Time Virtualization
- Real-time hypervisor scheduling framework in Xen
Implement a suite of real-time scheduling algorithms
- Based on compositional scheduling theory
VMs specify resource interfaces
Real-time guarantees to tasks in VMs
- Open source
RT-Xen patch submitted
https://sites.google.com/site/realtimexen/
Xen Virtualization Architecture
- Guest OS runs on VCPUs
- Hypervisor schedules VCPUs on PCPUs
- Credit scheduler
[Weight, Cap] per VM
Round robin
RT-Xen Interface
- VM resource interface
A set of VCPUs, each characterized by <period, budget>
Optional: use cpumask to specify VCPU affinity with PCPUs
Hide task-specific information
- Real-Time scheduling algorithms
Ordering of VCPUs? -> Priority scheme
Placement of VCPUs? -> Global vs. partitioned
Resource isolation? -> Server mechanisms
Real-Time Scheduling Policies
- Priority scheme
Static priority: Deadline Monotonic (DM)
Dynamic priority: Earliest Deadline First (EDF)
- Global scheduling
Schedule VCPUs based on global information
Allow VCPU migration across cores
Flexible use of multiple cores
Migration overhead and cache penalty
- Partitioned scheduling
Assign and bind VCPUs to PCPUs
Schedule VCPUs on each core independently
May underutilize PCPUs
No migration overhead or associated cache penalty
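To illustrate why partitioned scheduling may underutilize PCPUs, here is a minimal sketch of a first-fit assignment of VCPUs to PCPUs by utilization (budget / period). The slides do not say which assignment heuristic RT-Xen uses; the first-fit rule, the `first_fit_partition` name, and the per-core capacity test of 1 are assumptions made for illustration only.

```python
from fractions import Fraction

def first_fit_partition(vcpus, num_pcpus):
    """Bind each VCPU (name, period, budget) to the first PCPU with room.

    Hypothetical helper: a core is considered full once the summed
    utilization (budget / period) of its VCPUs would exceed 1.
    """
    loads = [Fraction(0)] * num_pcpus
    placement = [[] for _ in range(num_pcpus)]
    for name, period, budget in vcpus:
        util = Fraction(budget, period)
        for core in range(num_pcpus):
            if loads[core] + util <= 1:
                loads[core] += util
                placement[core].append(name)
                break
        else:
            # No single core has enough spare capacity for this VCPU.
            raise ValueError(f"{name} does not fit on any PCPU")
    return placement

# Three VCPUs of utilization 0.6 need 1.8 cores in total, yet no two of
# them fit on the same core, so two PCPUs are not enough when partitioned.
try:
    first_fit_partition([("v1", 10, 6), ("v2", 10, 6), ("v3", 10, 6)], 2)
except ValueError as err:
    print("partitioning failed:", err)
```

Under global scheduling the same VCPUs could migrate between the two cores, which is the flexibility the slide refers to, at the price of migration overhead and cache penalty.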
Scheduling a VCPU as a Deferrable Server
[Figure: timelines from 0 to 15 for tasks T1 (10, 3) and T2 (10, 3) and for the budget of a Deferrable Server (5, 3)]
- A VCPU receives budget µs of CPU time every period µs
Budget is replenished at the start of every period
VCPU consumes budget when running and is suspended when no budget is left
Budget is preserved when there is no task to run
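The rules above can be sketched as a small discrete-time model. This is an illustration under stated assumptions, not RT-Xen code: the `DeferrableServer` class, its `tick` method, and the one-unit time step are hypothetical, and the workload mirrors the slide's example of a (5, 3) server with tasks T1 (10, 3) and T2 (10, 3).

```python
class DeferrableServer:
    """Toy model of a VCPU scheduled as a deferrable server."""

    def __init__(self, period, budget):
        self.period = period
        self.budget = budget
        self.remaining = 0

    def tick(self, now, has_work):
        """Advance one time unit; return True if the VCPU runs during it."""
        if now % self.period == 0:
            self.remaining = self.budget   # replenish at every period start
        if has_work and self.remaining > 0:
            self.remaining -= 1            # budget is consumed only while running
            return True
        # No work: the budget is preserved. No budget: the VCPU is suspended.
        return False

server = DeferrableServer(period=5, budget=3)
backlog = 0
ran_at = []
for now in range(20):
    if now % 10 == 0:
        backlog += 6                       # T1 and T2 each release 3 units
    if server.tick(now, backlog > 0):
        backlog -= 1
        ran_at.append(now)
print(ran_at)  # [0, 1, 2, 5, 6, 7, 10, 11, 12, 15, 16, 17]
```

Each release of 6 units is served in two bursts of 3, one per server period, so both tasks finish within their period of 10.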
RT-Xen Investigation Roadmap
- Single-core
RT-Xen 1.0
- Single-core enhanced
RT-Xen 1.1
- Multi-core
RT-Xen 2.0
[Figure: RT-Xen scheduler design space. Global vs. partitioned scheduling and fixed priority (DM) vs. dynamic priority (EDF), giving gDM, gEDF, pDM, and pEDF. Server labels: Periodic, Deferrable, Sporadic, Polling, Work Conserving Periodic, Capacity Reclaiming Periodic.]
RT-Xen 2.0: Run Queues
- A run queue
holds VCPUs that are runnable (have a task to run)
has two parts: VCPUs with budget and VCPUs out of budget
is sorted by priority (DM or EDF) within each part
- rt-global: all cores share one run queue with a spinlock
- rt-partition: one run queue per core
- Patches for more efficient implementation on the way!
[Figure: RunQ layout. VCPUs with budget first, then VCPUs out of budget, each part sorted by priority.]
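The two-part ordering of the run queue can be sketched as follows. This is a minimal illustration, not the RT-Xen data structure: the `VCPU` fields, `edf_key`, `dm_key`, and `sort_runq` are hypothetical names, and the DM key assumes that a VCPU's relative deadline equals its period.

```python
from dataclasses import dataclass

@dataclass
class VCPU:
    name: str
    period: int
    budget: int
    cur_budget: int     # budget left in the current period
    cur_deadline: int   # absolute deadline of the current period

def edf_key(vcpu):
    return vcpu.cur_deadline   # earlier absolute deadline = higher priority

def dm_key(vcpu):
    return vcpu.period         # assumption: relative deadline == period

def sort_runq(vcpus, priority_key):
    """VCPUs with budget come first; each part is sorted by priority."""
    with_budget = sorted((v for v in vcpus if v.cur_budget > 0), key=priority_key)
    depleted = sorted((v for v in vcpus if v.cur_budget <= 0), key=priority_key)
    return with_budget + depleted

runq = [
    VCPU("a", period=5, budget=2, cur_budget=1, cur_deadline=20),
    VCPU("b", period=10, budget=4, cur_budget=0, cur_deadline=10),
    VCPU("c", period=20, budget=5, cur_budget=2, cur_deadline=15),
]
print([v.name for v in sort_runq(runq, edf_key)])  # ['c', 'a', 'b']
print([v.name for v in sort_runq(runq, dm_key)])   # ['a', 'c', 'b']
```

VCPU b has the earliest deadline but no budget left, so it sits behind every VCPU that still has budget under either priority scheme.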
Experimental Setup
- Hardware: Intel i7 processor, six cores running at 3.33 GHz
Dedicate one PCPU to domain 0
All guest VMs use the remaining cores
- Cache architecture
Each core has dedicated L1 cache (32 KB) and L2 cache (256 KB)
All six cores share L3 cache (12 MB)
Inclusive L3 cache: all data in L2 cache must also be in L3 cache
- Software
Xen 4.3 patched with RT-Xen
Guest OS: Linux patched with LITMUS^RT
RT-Xen 2.0: Scheduling Overhead
rt-global has extra overhead due to the global lock
credit has high maximum overhead due to load balancing
RT-Xen 2.0: Credit Scheduler
credit misses deadlines at 22% CPU capacity
RT-Xen delivers real-time performance up to 78%
RT-Xen 2.0: gEDF vs. pEDF
Global scheduling wins empirically!
gEDF + deferrable server -> best real-time performance
Demo
- YouTube: “RT-Xen Demonstration”
https://www.youtube.com/watch?v=wisxWn3mR5s
Patch Status
- Patch RFC v1 & v2
July 10th & July 29th
gEDF + deferrable server
cpupool support
- Patch RFC v3
Expected on Aug 24th
Scheduling trace support
Performance improvement (splitting RunQ)
- Patch RFC v4
Expected before Sep 10th
Performance improvement
- Timer-based budget replenishment
- Improve the timing resolution of budget
Conclusion
- Diverse applications demand real-time virtualization
Real-time virtualization in the embedded area
Cloud gaming
- RT-Xen provides real-time performance
Efficient implementation of diverse real-time scheduling policies
Leverage compositional scheduling theory -> analytical guarantee
gEDF + deferrable server wins empirically
Research Contributions
- RT-Xen 1.0: S. Xi, J. Wilson, C. Lu, and C.D. Gill,
RT-Xen: Towards Real-Time Hypervisor Scheduling in Xen, ACM International Conference on Embedded Software (EMSOFT) 2011
- RT-Xen 1.1: J. Lee, S. Xi, S. Chen, L.T.X. Phan, C. Gill, I. Lee, C. Lu and O. Sokolsky
Realizing Compositional Scheduling through Virtualization, IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS) 2012
- RT-Xen 2.0: S. Xi, M. Xu, C. Lu, L.T.X. Phan, C. Gill, I. Lee, and O. Sokolsky
Real-Time Multi-Core Virtual Machine Scheduling in Xen, ACM International Conference on Embedded Software (EMSOFT) 2014
- RT-Xen 2.1 + RT-OpenStack: S. Xi, C. Li, C. Lu, C. Gill, M. Xu, L.T.X. Phan, I. Lee, and O. Sokolsky
RT-OpenStack: Co-Hosting RT VM with non RT VMs, in submission
- RTCA: S. Xi, C. Li, C. Lu, and C. Gill
Prioritizing Local Inter-Domain Communication in Xen, ACM/IEEE International Symposium on Quality of Service (IWQoS) 2013
- RT-Xen patch (gEDF with deferrable server)
- RT-Xen: Real-Time Virtualization in Xen, Xen Blog, 2013
- RT-Xen: Real-Time Virtualization in Xen, Xen Developer Summit, 2014
Backup Slides
RT-Xen 1.0
Implementation – PCPU
Three queues within one physical core
VCPU params: (period, budget, priority)
[Figure: VCPU position in the queues (RunQ, RdyQ) depending on whether it has budget and a task to run; the example shows IDLE and periodic VCPUs]
Server Design – Deferrable & Polling
- Servers (Period, Budget, Priority)
S1 (5, 3) with two tasks: T1 (10, 3) and T2 (10, 3)
[Figure: timelines from 0 to 15 showing budget in S1 and actual execution for the Deferrable Server (annotated "back-to-back") and the Polling Server]
- 1. Replenish?
- 2. Budget but NO task?
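The second question above is where the two servers differ: a deferrable server keeps its budget while there is no task to run, whereas a polling server gives it up. The sketch below contrasts the two under that textbook definition; the `served_ticks` function, the `policy` strings, and the one-unit time step are assumptions for illustration and are not taken from the RT-Xen implementation.

```python
def served_ticks(policy, period, budget, arrivals, horizon):
    """Return the time units in which the server executes pending work.

    arrivals maps a time unit to the amount of work released at that time.
    """
    remaining, backlog, ticks = 0, 0, []
    for now in range(horizon):
        if now % period == 0:
            remaining = budget          # 1. replenish at every period start
        backlog += arrivals.get(now, 0)
        if backlog > 0 and remaining > 0:
            backlog -= 1
            remaining -= 1
            ticks.append(now)
        elif backlog == 0 and policy == "polling":
            remaining = 0               # 2. polling: idle budget is discarded
        # 2. deferrable: idle budget is kept until the period ends
    return ticks

# S1 (5, 3) with two units of work arriving at time 2.
arrivals = {2: 2}
print(served_ticks("deferrable", 5, 3, arrivals, 10))  # [2, 3]
print(served_ticks("polling", 5, 3, arrivals, 10))     # [5, 6]
```

The deferrable server serves the late arrival immediately, while the polling server makes it wait for the next replenishment.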
Server Design – Periodic & Sporadic
- Servers (Period, Budget, Priority)
S1 (5, 3) with two tasks: T1 (10, 3) and T2 (10, 3)
[Figure: timelines from 0 to 15 showing budget in S1 and actual execution for the Sporadic Server (replenishments marked "5 +3"; annotated "Overhead ++") and the Periodic Server (annotated "theory favored")]
- 1. Replenish?
- 2. Budget but NO task?
RT-Xen 2.0
RT-Xen 2.0: Workload
- Periodic task sets: [period, execution time, deadline]
- CPU-intensive, independent tasks
- Randomly generate tasks until the task set reaches a target total utilization, then distribute the tasks to four VMs and apply compositional scheduling theory to calculate each VM's resource interface
- 25 task sets per data point; measure the fraction of schedulable task sets
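The first two steps of this procedure (generating tasks up to a target utilization and spreading them over four VMs) can be sketched as below. The slides do not give the period range, the per-task utilization distribution, or the distribution rule, so the values and the round-robin rule here are placeholders, and `generate_task_set` and `distribute` are hypothetical names. Computing each VM's resource interface with compositional scheduling theory is not shown.

```python
import random

def generate_task_set(target_util, rng):
    """Add random tasks until the total utilization reaches target_util.

    Each task is a tuple (period, execution_time, deadline).
    """
    tasks, total = [], 0.0
    while total < target_util:
        period = rng.choice([10, 20, 40, 80])   # placeholder periods
        util = rng.uniform(0.05, 0.3)           # placeholder per-task utilization
        tasks.append((period, util * period, period))  # deadline == period
        total += util
    return tasks

def distribute(tasks, num_vms=4):
    """Round-robin assignment of tasks to VMs (an assumed rule)."""
    vms = [[] for _ in range(num_vms)]
    for i, task in enumerate(tasks):
        vms[i % num_vms].append(task)
    return vms

rng = random.Random(0)                          # fixed seed for repeatability
tasks = generate_task_set(2.0, rng)
vms = distribute(tasks)
for i, vm in enumerate(vms):
    print(f"VM{i}: {len(vm)} tasks, utilization {sum(c / p for p, c, _ in vm):.2f}")
```

Because the loop stops only after the target is reached, the generated set can overshoot the target by up to one task's utilization.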
RT-Xen 2.0: Context Saved
8/18/2014
Less than 1 µs of overhead for the spinlock
RT-Xen 2.0: Theory vs. Experiments
gEDF < pEDF theoretically due to pessimistic analysis
gEDF > pEDF empirically, thanks to global scheduling
RT-Xen 2.0: How about Cache?
Benefit of global scheduling dominates migration cost on a shared L3-cache platform.
RT-Xen 2.0: Context Switch