SLIDE 1

Fall 2014 :: CSE 506 :: Section 2 (PhD)

Virtual Machines

Heyi Li and Zhen Cao (Some of the figures are from the Internet)

SLIDE 2

Outline

  • Basic concepts
  • When virtual is better
  • Implementation
  • When virtual is harder

SLIDE 3

Basic Concepts

  • What is a virtual machine?

– An emulation of a particular computer system

  • System VM vs. Process VM

– System VM: supports the execution of a complete OS (Xen)
– Process VM: supports the execution of a single process (JVM)

  • Hypervisor (VMM)

– Computer software that creates and runs VMs

  • Type 1 and Type 2 hypervisors

– Type 1 (bare-metal): the hypervisor runs directly on the hardware and hosts the guest VMs (VMware ESX, Microsoft Hyper-V, Xen)
– Type 2 (hosted): the hypervisor runs as a process on a hosting OS and hosts the guest VMs (VMware Workstation, Microsoft Virtual PC, Sun VirtualBox, QEMU, KVM)

SLIDE 4

Applications and Benefits

  • Energy efficiency
  • Reducing maintenance costs
  • Rapid deployment
  • Security

[Figure: Server consolidation. Workloads (OS and App) from separate machines HW0 to HWn run as VM1 to VMn on a VMM over a single HW.]

[Figure: Test and development. Several VMs (OS and App) run on a VMM over a single HW.]

SLIDE 5

Virtualization Requirements

  • Fidelity

– Software on the VM executes identically to its execution on hardware, barring time effects

  • Performance

– Performance overhead must be small

  • Safety

– The VMM manages all hardware resources

SLIDE 6

Obstacles for X86

  • Trap-and-emulate

– Works only if all virtualization-sensitive instructions are also privileged instructions, so that they trap to the VMM when the guest runs unprivileged

  • The x86 architecture was once thought not to be fully virtualizable

– Certain sensitive instructions silently behave differently, rather than trapping, when run in unprivileged mode (POPF)
– Certain unprivileged instructions can access privileged state (SGDT)

  • Techniques to address inability to virtualize x86

– Full virtualization without hardware support: binary translation (VMware ESX)
– Paravirtualization (Xen)
– Hardware-assisted virtualization
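
The trap-and-emulate idea, and the way a POPF-style instruction defeats it, can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyCPU`, `ToyVMM`, and the instruction names `CLI` and `POPF_IF0` (a POPF that tries to clear the interrupt flag) are hypothetical and do not model real x86 encodings.

```python
class Trap(Exception):
    """Raised by the toy CPU when a privileged instruction runs unprivileged."""


class ToyCPU:
    """Hypothetical CPU with a single piece of privileged state."""

    def __init__(self):
        self.interrupts_enabled = True  # the real flag, owned by the VMM

    def execute(self, instr, privileged):
        if instr == "CLI":
            # Privileged instruction: traps when run unprivileged.
            if not privileged:
                raise Trap(instr)
            self.interrupts_enabled = False
        elif instr == "POPF_IF0":
            # Sensitive but not privileged: the flag update is silently
            # dropped in unprivileged mode, and no trap is raised.
            if privileged:
                self.interrupts_enabled = False
        # Any other instruction has no effect in this toy model.


class ToyVMM:
    """Runs the guest deprivileged and emulates whatever traps."""

    def __init__(self):
        self.cpu = ToyCPU()
        self.virtual_if = True  # the guest's view of the interrupt flag

    def run_guest(self, instrs):
        for instr in instrs:
            try:
                self.cpu.execute(instr, privileged=False)
            except Trap:
                self.emulate(instr)

    def emulate(self, instr):
        if instr == "CLI":
            self.virtual_if = False  # update virtual state, not the real flag
```

With `CLI` the VMM sees the trap and updates the guest's virtual flag. With `POPF_IF0` nothing traps, so the guest's intent is lost, which is the gap that binary translation, paravirtualization, and hardware support each close in a different way.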

SLIDE 7

Binary Translation

SLIDE 8

Binary Translation

  • Binary: input is binary x86 code, not source code
  • On-the-fly: dynamic and on demand
  • Only need to translate kernel mode code

– User mode: direct execution

  • Even for kernel mode, most instruction sequences don’t change
  • Instructions that do change:

– Indirect control flow: call/ret, jmp
– PC-relative addressing
– Privileged instructions

SLIDE 9

1. A translation unit (TU) stops at 12 instructions or a control-flow instruction
2. The TU is translated into a compiled code fragment (CCF) and cached
3. The translation cache is tracked with a hash table
4. The CCF is executed
5. Continuation (either fall-through or taken branch)

[Figure: the PC [x] selects a TU, the binary translator emits a CCF at [y] in the translation cache, the hash table records ([x], [y]), and the CCF is executed; labels 1 to 5 mark the steps above.]
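
Steps 1 to 3 can be sketched as follows. This is a minimal sketch, not VMware's translator: instructions are modeled as hypothetical `(opcode, operand)` tuples, the `SENSITIVE` and `CONTROL_FLOW` sets are illustrative, and "translation" just rewrites sensitive opcodes into calls to an emulation routine. Executing the CCF and chaining continuations (steps 4 and 5) are left out.

```python
MAX_TU = 12                                   # TU size limit from step 1
CONTROL_FLOW = {"jmp", "call", "ret", "jcc"}  # illustrative opcode names
SENSITIVE = {"cli", "sti", "popf"}            # illustrative opcode names


def fetch_tu(code, pc):
    """Collect instructions from pc until 12 instructions or a control-flow one."""
    tu = []
    while pc < len(code) and len(tu) < MAX_TU:
        tu.append(code[pc])
        pc += 1
        if tu[-1][0] in CONTROL_FLOW:
            break
    return tu


def translate(tu):
    """Copy most instructions unchanged; rewrite sensitive ones."""
    out = []
    for op, arg in tu:
        if op in SENSITIVE:
            out.append(("emulate_" + op, arg))
        else:
            out.append((op, arg))
    return tuple(out)


class Translator:
    def __init__(self, code):
        self.code = code
        self.cache = {}        # hash table: guest PC -> CCF
        self.translations = 0  # counts cache misses

    def lookup(self, pc):
        """Return the cached CCF for pc, translating on demand."""
        ccf = self.cache.get(pc)
        if ccf is None:
            ccf = translate(fetch_tu(self.code, pc))
            self.cache[pc] = ccf
            self.translations += 1
        return ccf
```

Because the cache is keyed by guest PC, a second `lookup` of the same PC returns the existing CCF without translating again, which is why most kernel code pays the translation cost only once.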

SLIDE 10

Memory

[Figure: three 4GB address spaces: guest virtual address (gVA), guest physical address (gPA), and host physical address (hPA). The guest page table (visible to the guest OS) maps gVA to gPA; the VMM PhysMap (Pmap, maintained by the VMM) maps gPA to hPA; the shadow page table (resides in hardware and maintained by the VMM) maps gVA directly to hPA.]

SLIDE 11

Shadow Page Tables

  • Translation from gVA to hPA directly by hardware
  • If not present, page fault generated by hardware
  • Hidden page fault: the mapping is present in the guest page table

– VMM walks the guest page table to determine the gPA backing that gVA
– VMM allocates a physical page and adds the mapping to Pmap
– VMM updates the shadow page table

  • True page fault: the mapping is not present in the guest page table

– VMM generates an exception on the virtual CPU
– Execution resumes at the first instruction of the guest exception handler
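
The hidden versus true page fault distinction can be sketched with three dictionaries standing in for the guest page table, Pmap, and shadow page table. This is a simplified model at page granularity; the class and attribute names (`ShadowMMU`, `hidden_faults`, and so on) are made up for illustration, and permissions and invalidation are ignored.

```python
class GuestPageFault(Exception):
    """True page fault: must be delivered to the guest's exception handler."""


class ShadowMMU:
    def __init__(self, guest_pt, free_host_pages):
        self.guest_pt = guest_pt           # gVA page -> gPA page (guest-visible)
        self.pmap = {}                     # gPA page -> hPA page (VMM-maintained)
        self.shadow = {}                   # gVA page -> hPA page (walked by hardware)
        self.free = list(free_host_pages)  # unallocated host pages
        self.hidden_faults = 0

    def translate(self, gva_page):
        """Model a hardware access: hit in the shadow table or fault to the VMM."""
        if gva_page in self.shadow:
            return self.shadow[gva_page]
        return self.handle_fault(gva_page)

    def handle_fault(self, gva_page):
        if gva_page not in self.guest_pt:
            # True page fault: the guest itself has no mapping.
            raise GuestPageFault(gva_page)
        # Hidden page fault: walk the guest table, back the gPA, fill the shadow.
        gpa_page = self.guest_pt[gva_page]
        if gpa_page not in self.pmap:
            self.pmap[gpa_page] = self.free.pop()
        self.shadow[gva_page] = self.pmap[gpa_page]
        self.hidden_faults += 1
        return self.shadow[gva_page]
```

After the first access fills the shadow entry, later accesses to the same page hit directly and the VMM is not involved.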

SLIDE 12

I/O Virtualization – Direct I/O Model

  • Places drivers for high-performance I/O devices directly into the hypervisor
  • Does not attempt to make the virtual hardware match the specific underlying hardware
  • Virtualizes selected, canonical I/O devices
  • Problems

– Larger hypervisor
– Need to protect the hypervisor from driver faults

[Figure: Full virtualization. VM0 to VMn (guest OS and apps) run above a hypervisor that contains the I/O services and device drivers for the shared devices.]

SLIDE 13

Paravirtualization

SLIDE 14

CPU Virtualization

  • Privilege levels in x86

– Ring 0: Xen
– Ring 1: guest OS
– Ring 3: user apps

  • Isolation

– Guest user mode and guest kernel mode: page table “supervisor” bit (PTE_U)
– Guest OS and VMM: segmentation (a problem with x86-64)

SLIDE 15

CPU Virtualization (cont.)

  • Privileged instructions

– Replaced with hypercalls
– Requires modifying the guest OS source code
– Validated and executed by Xen (e.g., installing a new PT)

  • Exceptions

– Handlers are registered with Xen once and accepted (validated) if they do not need to execute in ring 0
– Called directly without Xen intervention
– All syscalls from apps to the guest OS are handled this way (and executed in ring 1)

  • Page fault handlers are special

– The faulting address can be read only in ring 0
– Xen reads the faulting address and passes it via the stack to the OS handler in ring 1

SLIDE 16

Memory Virtualization

  • Physical memory

– At domain creation, hardware pages are “reserved”
– A domain can increase or decrease its quota
– Xen does not guarantee that the hardware pages are contiguous

  • Virtual memory

– Guest OS page tables are registered directly with the MMU
– The guest OS allocates and initializes a page from its own memory reservation and registers it with Xen

  • Every guest OS has its own address space
  • Xen occupies top 64MB of every address space.
  • To save switching costs between address spaces (hypervisor calls)

– Xen involved only in memory updates
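
The idea that the guest owns its page tables while the hypervisor validates every update can be sketched as below. This is a hedged illustration only: `ToyXen`, `create_domain`, and `hypercall_pt_update` are hypothetical names and do not reproduce Xen's actual hypercall interface. The single check shown (a domain may map only frames from its own reservation) stands in for the fuller validation a real hypervisor performs.

```python
class ToyXen:
    """Hypothetical hypervisor that validates guest page-table updates."""

    def __init__(self):
        self.owner = {}        # hardware frame -> owning domain id
        self.page_tables = {}  # domain id -> {virtual page: hardware frame}

    def create_domain(self, dom, frames):
        """Reserve hardware frames for a new domain (not necessarily contiguous)."""
        for frame in frames:
            self.owner[frame] = dom
        self.page_tables[dom] = {}

    def hypercall_pt_update(self, dom, vpage, frame):
        """Apply the update only if the frame belongs to the calling domain."""
        if self.owner.get(frame) != dom:
            return False  # rejected: would map another domain's memory
        self.page_tables[dom][vpage] = frame
        return True
```

Reads of the page table need no hypervisor involvement in this model; only updates go through the validating call.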

SLIDE 17

I/O Virtualization – Indirect I/O Model

  • Uses a privileged virtual machine (Domain0) for all device drivers
  • Simple interfaces for guest OSes
  • Pros

– Higher security

  • Cons

– Lower performance

[Figure: Paravirtualization. Guest VMs (VM0 to VMn, guest OS and apps) reach the shared devices through I/O services and device drivers in service VMs running above the hypervisor.]

SLIDE 18

Hardware-Assisted Virtualization (HVM)

SLIDE 19

Intel’s VT-x

  • More-privileged mode for the VMM
  • Less-privileged mode for the guest OS
  • Eliminates ring de-privileging for the guest OS

[Figure: VMs (apps in ring 3, OS in ring 0) run above the VM monitor (VMM) in VMX root mode; a VM exit transfers control to the VMM and a VM entry returns to the guest.]

SLIDE 20

VM Control Structure (VMCS)

  • Execution controls determine when exits occur

– Access to privileged state, occurrence of exceptions, etc.
– Flexibility provided to avoid unwanted exits

  • Guest-state area

– Processor state saved into the guest-state area on VM exits and loaded on VM entries

  • Host-state area

– Processor state loaded from the host-state area on VM exits

  • Other
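
The roles of the three VMCS areas can be sketched with a toy model. This is an assumption-laden sketch: `ToyVMCS`, `ToyCore`, the register dictionaries, and the event names are hypothetical, and a real VMCS holds far more state and is accessed through dedicated instructions rather than as a plain object.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVMCS:
    exit_on: set = field(default_factory=set)        # execution controls
    guest_state: dict = field(default_factory=dict)  # guest-state area
    host_state: dict = field(default_factory=dict)   # host-state area


class ToyCore:
    """Hypothetical core that switches state on VM entry and VM exit."""

    def __init__(self, vmcs, host_regs):
        self.vmcs = vmcs
        self.regs = dict(host_regs)
        self.in_guest = False
        vmcs.host_state = dict(host_regs)

    def vm_entry(self):
        # Processor state is loaded from the guest-state area.
        self.regs = dict(self.vmcs.guest_state)
        self.in_guest = True

    def guest_event(self, event):
        """Return True if the event caused a VM exit."""
        if event not in self.vmcs.exit_on:
            return False  # the guest handles the event itself
        # Save guest state, then load host state for the VMM.
        self.vmcs.guest_state = dict(self.regs)
        self.regs = dict(self.vmcs.host_state)
        self.in_guest = False
        return True
```

Leaving an event out of `exit_on` models the flexibility to avoid unwanted exits: the guest keeps running and the VMM never sees the event.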

SLIDE 21

Extended Page Table (EPT)

  • A new page-table structure, under the control of the VMM

– Defines the mapping between GPA and HPA
– EPT base pointer (new VMCS field) points to the EPT page tables
– EPT (optionally) activated on VM entry, deactivated on VM exit

  • Guest has full control over its own IA-32 page tables

– No VM exits due to guest page faults, INVLPG, or CR3 changes

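[Figure: CR3 points to the guest page tables, which map a guest linear address to a guest physical address; the EPT base pointer (EPTP) points to the extended page tables, which map the guest physical address to a host physical address.]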
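
The two-stage translation can be sketched as two dictionary lookups. This is a flattened model under stated assumptions: single-level tables at 4 KiB page granularity, with hypothetical names throughout. On real hardware the guest page-table walk itself touches guest physical addresses that also go through the EPT, which this sketch ignores.

```python
PAGE = 4096  # assumed page size


class GuestPageFault(Exception):
    """Missing guest mapping: handled by the guest, no VM exit."""


class EPTViolation(Exception):
    """Missing EPT mapping: causes a VM exit to the VMM."""


def translate(gva, guest_pt, ept):
    """Translate a guest linear address to a host physical address."""
    page, offset = divmod(gva, PAGE)
    if page not in guest_pt:
        raise GuestPageFault(hex(gva))
    gpa_page = guest_pt[page]      # stage 1: guest page tables (via CR3)
    if gpa_page not in ept:
        raise EPTViolation(hex(gpa_page * PAGE))
    hpa_page = ept[gpa_page]       # stage 2: extended page tables (via EPTP)
    return hpa_page * PAGE + offset
```

The guest can change `guest_pt` freely without the VMM noticing, which is the sense in which guest page faults and CR3 changes no longer require VM exits.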

SLIDE 22

I/O Virtualization

[Figure: three I/O models side by side.]

– Full virtualization: I/O services and device drivers for the shared devices live in the hypervisor; VM0 to VMn run guest OS and apps
– Paravirtualization: I/O services and device drivers for the shared devices live in service VMs above the hypervisor; guest VMs (VM0 to VMn) run guest OS and apps
– Pass-through model: devices are assigned to VMs; each of VM0 to VMn runs guest OS, apps, and its own device drivers above the hypervisor

SLIDE 23

IOMMU

  • Device pass-through

– Directly assign a physical device to a particular guest OS
– Address space translation handled transparently

  • Device isolation

– Safely map a device to a particular guest without risking the integrity of other guests

SLIDE 24

IOMMU

  • Translation Control Entry

– Translation from a DMA address to a host memory address
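
Translation and isolation together can be sketched with one table per device. This is a minimal sketch with hypothetical names (`ToyIOMMU`, `map`, `translate`) and an assumed 4 KiB page size; it is not the layout of any real IOMMU's translation entries.

```python
PAGE = 4096  # assumed page size


class IOMMUFault(Exception):
    """DMA to an address the device has no translation entry for."""


class ToyIOMMU:
    def __init__(self):
        self.tables = {}  # device id -> {DMA page: host page}

    def map(self, device, dma_page, host_page):
        """Install a translation entry for one device."""
        self.tables.setdefault(device, {})[dma_page] = host_page

    def translate(self, device, dma_addr):
        """Translate a device's DMA address to a host memory address."""
        page, offset = divmod(dma_addr, PAGE)
        try:
            host_page = self.tables[device][page]
        except KeyError:
            raise IOMMUFault(f"{device}: no mapping for {hex(dma_addr)}") from None
        return host_page * PAGE + offset
```

Because each device has its own table, a device assigned to one guest can reach only the host pages mapped for it, which is what makes pass-through safe for the other guests.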

SLIDE 25

Security Problems

  • Transience

– Large numbers of machines appear and disappear from the network sporadically

  • Diversity

– Long and painful upgrade cycles

  • Identity

– Difficult to establish who owns a VM running on a particular physical host

  • Mobility

– Can be easily copied over a network or carried on portable storage media

SLIDE 26

Discussion

SLIDE 27

Thanks!