Xen and the Art of Virtualization
Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield. University of Cambridge Computer Laboratory, SOSP 2003
Presenter: Dhirendra Singh Kholia
x86-64, Itanium and PowerPC architectures. Xen can securely execute multiple virtual machines, each running its own OS, on a single physical system with close-to-native performance.
Ring 2, Ring 3). Ring 1 and Ring 2 are unused.
– The VMM needs to run at the highest privilege level (Ring 0) to provide isolation, resource scheduling and performance, BUT guest kernels are also designed to run in Ring 0.
Source: Ring Diagrams: http://duartes.org/gustavo/blog/post/cpu-rings-privilege-and-protection
non-virtualizable instructions) without sufficient permissions causes silent failures instead of generating a “convenient” trap (GPF) to the VMM. Thus, the VMM never gets an opportunity to simulate the effect of the instruction.
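The trap-versus-silent-failure problem can be sketched with a toy model. This is only an illustration, not how any real VMM is written: the function names and the two instruction sets are assumptions made for the example. HLT and LGDT fault when executed outside Ring 0, while POPF is the commonly cited x86 instruction that silently drops its privileged effect instead of trapping.

```python
class Trap(Exception):
    """Models a fault (GPF) delivered to the more privileged layer, the VMM."""


# Illustrative instruction classes (assumption: a tiny subset of x86).
PRIVILEGED_TRAPPING = {"HLT", "LGDT"}  # fault when executed outside Ring 0
SENSITIVE_SILENT = {"POPF"}            # privileged effect silently dropped


def execute(instr, ring):
    """Run one instruction at the given ring and report what happened."""
    if ring == 0:
        return "executed"
    if instr in PRIVILEGED_TRAPPING:
        raise Trap(instr)
    if instr in SENSITIVE_SILENT:
        return "silently ignored"
    return "executed"


def run_guest_kernel(instrs, ring=1):
    """Toy VMM loop: emulate whatever traps, record what slips through."""
    emulated, missed = [], []
    for instr in instrs:
        try:
            if execute(instr, ring) == "silently ignored":
                missed.append(instr)  # the VMM never sees this instruction
        except Trap:
            emulated.append(instr)    # the VMM can simulate the effect
    return emulated, missed


emulated, missed = run_guest_kernel(["HLT", "POPF", "ADD"])
print("emulated by VMM:", emulated)
print("missed by VMM:", missed)
```

Anything that lands in `missed` is exactly what binary translation or paravirtualization has to deal with, because trap-and-emulate alone never sees it.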
– Requires modifications to the guest OS kernel.
– Improved performance (due to exposure of real hardware,
– Exposing real time allows correct handling of time-critical tasks such as TCP timeouts and RTT estimates.
– Conceptually it can be understood as adding a Ring -1, more privileged than Ring 0, in which the hypervisor executes and can trap and emulate privileged instructions.
– Allows for a much cleaner implementation of full virtualization.
Figure: privilege rings (Ring 0 to Ring 3) under full virtualization (VMM with binary translation, guest OS, user applications) and under paravirtualization (Xen, Dom0 with control plane, guest OS, user apps).
Source: http://www.cs.uiuc.edu/homes/kingst/spring2007/cs598stk/slides/20070201-kelm-thompson-xen.ppt
table access)
etc)
code base size modified/added.
and Guest Applications run unmodified in Ring 3 (hence Guest OS remains protected)
hypercall ABI instead of executing privileged and sensitive instructions directly. A hypercall (int 0x82) is a software trap from a domain to the hypervisor, just as a syscall (int 0x80) is a software trap from user space to the kernel. E.g. when the system is idle, Linux issues the HLT instruction, which requires Ring 0 privilege to execute. In XenoLinux this is replaced by a hypercall, which transfers control from Ring 1 to Xen in Ring 0.
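The idle example can be sketched as a toy dispatch table. This is a simplified model under stated assumptions: the class, the hypercall name "block" and the return codes are made up for illustration and are not Xen's actual hypercall ABI.

```python
class ToyHypervisor:
    """Toy stand-in for the Ring 0 layer (names are hypothetical)."""

    def __init__(self):
        self.blocked = set()
        # Hypercall table: name -> handler, analogous to a syscall table.
        self.table = {"block": self._block}

    def hypercall(self, domain_id, name, *args):
        """Entry point a guest kernel uses instead of a privileged instruction."""
        handler = self.table.get(name)
        if handler is None:
            return -1  # unknown hypercall (illustrative error code)
        return handler(domain_id, *args)

    def _block(self, domain_id):
        # The hypervisor deschedules the domain until an event arrives.
        self.blocked.add(domain_id)
        return 0


def guest_idle(hv, domain_id):
    # A native kernel would execute HLT here, which needs Ring 0.
    # The paravirtualized kernel asks the hypervisor to block it instead.
    return hv.hypercall(domain_id, "block")


hv = ToyHypervisor()
print(guest_idle(hv, domain_id=1), hv.blocked)
```

The point of the table is that the guest names the service it wants, so the hypervisor does not have to decode and emulate an instruction after the fact.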
OS’s address space. This is done to save a TLB flush when going from Ring 1 to Ring 0 (VMM). Xen itself is protected by segmentation.
are registered with Xen for validation.
for system calls, allowing direct calls from an application into its guest OS and avoiding indirecting through Xen on every call.
Source: http://www.linuxjournal.com/article/8540
– E.g. data arrival on network; virtual disk transfer complete
– E.g. set of page table updates
Networking Example: A Domain (Request Producer) can supply buffers using “requests” and Xen (Response Producer) provides “responses” to signal the arrival of a packet into the buffers. To do this efficiently (avoiding a copy of packet data from Xen to Domain pages), Xen exchanges its packet buffer with an unused page frame, which has to be supplied by the Domain! This is a sort of message-passing abstraction built on top of Xen SHM IPC.
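A minimal sketch of such a ring, assuming one shared circular buffer with four free-running indices (request producer/consumer, response producer/consumer). The class and method names are hypothetical, and the real Xen rings live in shared memory pages rather than Python lists.

```python
class IORing:
    """Toy shared ring: the domain produces requests, Xen produces responses."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size
        self.req_prod = 0  # advanced by the domain
        self.req_cons = 0  # advanced by Xen
        self.rsp_prod = 0  # advanced by Xen
        self.rsp_cons = 0  # advanced by the domain

    def put_request(self, req):
        # A slot is free again only once its response has been consumed.
        if self.req_prod - self.rsp_cons >= self.size:
            return False
        self.slots[self.req_prod % self.size] = req
        self.req_prod += 1
        return True

    def take_request(self):
        if self.req_cons == self.req_prod:
            return None  # nothing pending
        req = self.slots[self.req_cons % self.size]
        self.req_cons += 1
        return req

    def put_response(self, rsp):
        # A response overwrites the slot of an already consumed request.
        if self.rsp_prod == self.req_cons:
            raise RuntimeError("no consumed request to respond to")
        self.slots[self.rsp_prod % self.size] = rsp
        self.rsp_prod += 1

    def take_response(self):
        if self.rsp_cons == self.rsp_prod:
            return None
        rsp = self.slots[self.rsp_cons % self.size]
        self.rsp_cons += 1
        return rsp


ring = IORing(size=2)
ring.put_request("empty page frame 0")   # domain supplies a receive buffer
buf = ring.take_request()                # Xen picks it up
ring.put_response("packet in " + buf)    # Xen signals packet arrival
print(ring.take_response())
```

Only descriptors travel through the ring; the packet data itself stays in the page frames that are exchanged, which is what avoids the copy.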
validating them and propagating changes to the MMU ‘shadow’ page table.
validates them
hypervisor space, or access to other VMs.
to VFR.
Transmit and Receive).
is avoided by using Gather DMA technique in NIC driver.
technique.
– Use the I/O ring
– Guest I/O scheduler reorders requests prior to enqueuing them on the ring
– Xen can also reorder requests to improve performance
Source: http://www.arunviswanathan.com/content/ppts/xen_virt.pdf
Source: http://www.linuxjournal.com/article/8909
Benchmarks (all taken from Ian’s presentation in 2006)
In short, Xen provides close to native performance!
hardware assisted virtualization (Intel VT, AMD-V)
and host. Runs FreeBSD, Windows (using HVM) as guest.
enhanced Power Management, XenCenter for management.
(KVM is gaining support!)
affected by a compromised Guest OS, running on top of Dom0? – Game over, protection of Domain 0 is critical!
can reduce the vulnerable surface of Xen (in one of their security presentations they admit they should minimize the TCB). What other implications might removing the Dom0 Guest OS have for the system? – Where will the management code go? Xen also relies on Dom0 drivers.
problems if we don't want to modify the operating system any more, by using Intel-VT? - With Intel-VT, Xen isn’t mapped into the Guest OS address space.
next step he can extract confidential information via a cross-VM
E.g. side channels: cross-VM information leakage due to the sharing of physical resources (e.g., the CPU’s data caches). In the multi-process environment, such attacks have been shown to enable extraction of RSA and AES secret keys. How can this problem be avoided in Xen? - ???
while all other domains see virtual abstractions of
domain run in the same address space, i.e. that of
DMA write to the memory of an arbitrary domain? – Drivers can be pushed out from Domain 0 (Ring 1) to “Driver Domains” (Ring 1). This makes the system more robust. However, the fundamental problem of unsafe DMA access is solved by IOMMU hardware.
which is often considered as a waste of the resources? - Yes, resource management is complicated. Xen can overcommit memory and then use ballooning for dynamic memory management. Parallax handles the space management problem (using COW?). Memory and disk are cheap these days, though; I would focus more on isolation, QoS and security problems.
balloon driver or modifying the kernel memory management routine to adjust the memory usage of a domain. Both these approaches seem to require modification of the OS. With hardware supported virtualization now allowing OSes to run unmodified, how is this problem solved? – The “balloon” driver works with HVM guests.
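The ballooning idea can be sketched as simple page accounting. This is a toy model, assuming page counts stand in for real page frames; the class and function names are hypothetical and not the Xen balloon driver interface.

```python
class Host:
    """Toy hypervisor-side accounting of unallocated pages."""

    def __init__(self, free_pages):
        self.free_pages = free_pages


class Guest:
    """Toy guest: pages it can use plus pages held by its balloon driver."""

    def __init__(self, usable_pages, balloon_pages=0):
        self.usable_pages = usable_pages
        self.balloon_pages = balloon_pages


def inflate(host, guest, pages):
    """Balloon grows: the guest gives pages back to the hypervisor."""
    pages = min(pages, guest.usable_pages)
    guest.usable_pages -= pages
    guest.balloon_pages += pages
    host.free_pages += pages
    return pages


def deflate(host, guest, pages):
    """Balloon shrinks: the guest reclaims pages, if the host has any free."""
    pages = min(pages, guest.balloon_pages, host.free_pages)
    guest.balloon_pages -= pages
    guest.usable_pages += pages
    host.free_pages -= pages
    return pages


host = Host(free_pages=0)
idle_guest = Guest(usable_pages=1000)
busy_guest = Guest(usable_pages=700, balloon_pages=300)

inflate(host, idle_guest, 300)   # idle guest releases memory
deflate(host, busy_guest, 300)   # busy guest picks it up
print(idle_guest.usable_pages, busy_guest.usable_pages, host.free_pages)
```

The driver runs inside the guest as an ordinary kernel module, which is why the same trick can work for an otherwise unmodified HVM guest.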
domains fairly (to balance the load for each domain)? What if some domains always have a heavier average load than other domains? – The new CREDIT scheduler assigns a “weight” and a “cap” to each domain. A domain with weight 2X gets twice as much CPU as a domain with weight X. The cap limits the maximum CPU a domain can consume, expressed as a percentage of one physical CPU (e.g. 100 is one CPU, 200 is two). You can always assign (even at runtime) a higher weight to a Domain which requires more CPU time.
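A rough sketch of how weights and caps translate into CPU shares when every domain is busy. This is a steady-state approximation, not the credit scheduler's actual algorithm: the function name is hypothetical, and it assumes the cap is given as a percentage of one physical CPU with 0 meaning "no cap".

```python
def cpu_shares(domains, total_cpu=100.0):
    """domains: {name: (weight, cap)}; total_cpu: 100.0 per physical CPU.

    Returns the CPU percentage each domain would get if all were always busy.
    """
    shares = {name: 0.0 for name in domains}
    active = set(domains)
    remaining = total_cpu
    while active:
        total_weight = sum(domains[n][0] for n in active)
        # Domains whose weighted offer reaches their cap are pinned at the cap.
        capped = {
            n for n in active
            if domains[n][1] and remaining * domains[n][0] / total_weight >= domains[n][1]
        }
        if not capped:
            # Nobody hits a cap: split what is left in proportion to weight.
            for n in active:
                shares[n] = remaining * domains[n][0] / total_weight
            break
        for n in capped:
            shares[n] = float(domains[n][1])
            remaining -= domains[n][1]
        active -= capped
    return shares


# Weight 512 vs 256, no caps: a 2:1 split of one CPU.
print(cpu_shares({"dom1": (512, 0), "dom2": (256, 0)}))
# Same weights, but dom1 capped at 50% of a CPU: the rest goes to dom2.
print(cpu_shares({"dom1": (512, 50), "dom2": (256, 0)}))
```

Weights only matter under contention; a cap applies even when the machine is otherwise idle.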
domain to Domain0 is better than building a domain entirely within
reduced? – By Xen the authors mean the VMM part running in Ring 0. Domain 0 runs in Ring 1. Management code has to be present and Domain 0 is the logical place to put it!
systems to share hardware in a safe and resource managed fashion, when the Xen prototype could only support the XenoLinux guest OS at the time this paper was written? – Xen today handles many different Guest OSes. Even in 2003 they had a working XP prototype (it could run Notepad and Minesweeper).
2 privilege levels in hardware? – Yes, I think so. With 2 privilege levels the Guest OS wouldn’t be able to protect itself from applications.
the whole architecture impair? I mean, then, how can the guest OS kernel and guest applications be separated safely? – 3 Rings are good, 2 are NOT!
the I/O system to directly transfer from/to the disk? It seems I/O performance could be improved in this way. Is it hard? - Xen already does Zero-Copy transfer (by using DMA) for Disk I/O. Did I understand the question correctly?
from Xen VMM, which will make a lot of overhead between
Transfers, Underlying IPC used (SHM) is fast, Batching Updates and Events, PCI Pass through.
space seems to be a great consumption if 100 OSes run on the VMM. Does this paper mean that Xen needs to use 64MB for each process run on each OS run on it? If that is the case, it seems to be a disaster. - NO! Xen is mapped into the top 64MB of every guest address space. It doesn’t physically consume 64MB of RAM for every Guest OS.
different kinds of operating systems running on the same machine, especially as applications nowadays are becoming more and more portable across different platforms? – To test the very same portable applications, Virtual Machines are an excellent solution! You can run Windows, Linux, OS X on the same box and test your applications.
to port a guest OS, the porting work of Windows XP was still incomplete in their experiments. So do you think it is really easy to achieve that? - It ran into licensing problems (M$!). With HVM, such a port is not required. I leave the answer to the last part to the audience.
Windows XP. A quick Web search reveals that licensing issues prevent this port from ever being published; thus, today, Windows XP can only be run under Xen using hardware-assisted virtualization (added in Xen 3). Why do the authors bother describing the paravirtualization of Windows XP, when no researcher can replicate their results and no user can take advantage of this port (due to unavailability of the code)? – Simply to illustrate that different OSes could potentially be ported to run on top of Xen with minimal changes, that would be my guess!
performance, so I'm wondering is there any scenario that we may prefer binary translation as VMware
hardware virtualization. However, BT is still used because it gives better performance than VT in some scenarios.
that every privileged instruction has to be validated by Xen? How does VMware handle such a problem? - ???
but to allow each OS to perform paging itself. They state that this decision was made to help achieve performance isolation, by preventing one domain from performing thrashing-inciting memory access patterns and thus reducing the performance of other domains. Is there any paging policy that would allow the VMM to perform paging, with all the attendant benefits (better resource sharing in asymmetric-load situations, etc), while not suffering substantially from a breakdown in performance isolation? - ???
referred in Section 1?
wall-clock time. The virtual time is used by the guest OS to make proper scheduling decisions but nowadays, Intel-VT enables us to use unmodified
virtual time, how can it make good scheduling decisions? By using Intel-VT, how could we provide the guest OS with the virtual time and, at the same time, give it the real time?
rings-privilege-and-protection
ability to support a secure virtual machine monitor
http://www.linuxjournal.com/article/8540