VMBus (Hyper-V) devices in QEMU/KVM, Roman Kagan

SLIDE 1

VMBus (Hyper-V) devices in QEMU/KVM

Roman Kagan <rkagan@virtuozzo.com>

SLIDE 2

About me

  • with Virtuozzo (formerly Parallels, formerly SWSoft) since 2005
  • in different roles including
  • large-scale automated testing development for container and hypervisor
  • proprietary Parallels hypervisor development
  • now: open-source QEMU/KVM-based Virtuozzo hypervisor development

SLIDE 3

Disclaimers

➢ all trademarks are the property of their respective owners

➢ the only authoritative and up-to-date documentation is the code

SLIDE 4

Outline

  1. Motivation
     a. virtual h/w choice for Windows VM
  2. Hyper-V / VMBus emulation
     a. layers & components
     b. implementation details
     c. implementation status
  3. Summary & outlook

SLIDE 5

Motivation

wanted:

  • performance
  • easy to deploy
  • support

[Diagram: Windows (CPU, RAM, HDD, XXXX-XXXX-XXXX) → ? → QEMU/KVM]

SLIDE 6

Choice #1: h/w emulation

✔ easy to deploy
✔ support
✘ performance

[Diagram: VM with e1000 and IDE devices running Windows]

SLIDE 7

Virtual machine ≠ physical machine

physical machine:

  • all CPU and RAM are yours
  • timing is (somewhat) predictable

virtual machine:

  • can be preempted
  • can be swapped out
  • many things become expensive (APIC, I/O, MSRs, etc.)

answer: paravirtualization

SLIDE 8

Choice #2: VirtIO

WindowsGuestDrivers (aka virtio-win)

✔ performance
✘ easy to deploy
✘ support

[Diagram: VM with virtio-net and virtio-scsi devices running Windows with added drivers]

SLIDE 9

What’s wrong with virtio-win?

WHQL ⇒ SVVP ⇒ support

GPL, WHQL: in order to ship it, you need to own it

Certified? No…

SLIDE 10

Choice #3: Hyper-V emulation

✔ performance
✔ easy to deploy
✔ support

sounds like a plan!

[Diagram: VM with VMBus net and VMBus storage devices running Windows]

SLIDE 11

Hyper-V: how to?

  1. Microsoft docs on GitHub
  2. Linux guest code for Hyper-V (everything under CONFIG_HYPERV)
  3. trial & error
     • e.g. things work with Linux hyperv guest but break with Windows guest

SLIDE 12

Hyper-V paravirtualization

  • previously implemented enlightenments
  • management MSRs
  • synthetic interrupt controller
  • timers
  • hypercalls
  • VMBus
  • devices

SLIDE 13

Hyper-V preexisting enlightenments

  • management MSRs
  • GUEST_OS_ID
  • VP_INDEX
  • hypercall infrastructure
  • scheduler
  • NOTIFY_LONG_SPIN_WAIT hypercall
  • LAPIC
  • MSR access to EOI / ICR / TPR
  • APIC assist page (aka pvEOI)

SLIDE 14

Hyper-V management MSRs

  • reset
  • panic
  • CRASH_CTL, CRASH_P0…P3 — BSOD info
  • VP_RUNTIME

SLIDE 15

Hyper-V clocks

partition reference time: monotonic clock in 100ns ticks since boot

  • time reference counter: rdmsr HV_X64_MSR_TIME_REF_COUNT
  • 1 vmexit / clock read
  • no hardware requirements
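As a rough illustration of the reference counter semantics, the sketch below models partition reference time as a count of 100ns ticks since a recorded boot instant. The class name `RefCounter` and the injectable clock are hypothetical helpers for this example, not QEMU or KVM code.

```python
import time

TICK_NS = 100  # one partition reference time tick is 100ns


class RefCounter:
    """Toy model of a partition reference counter (hypothetical helper)."""

    def __init__(self, now_ns=time.monotonic_ns):
        self._now_ns = now_ns        # injectable monotonic clock, in ns
        self._boot_ns = now_ns()     # "boot" instant of the partition

    def read(self):
        """Return the number of 100ns ticks elapsed since boot."""
        return (self._now_ns() - self._boot_ns) // TICK_NS
```

In a real guest every such read is an MSR access and therefore a vmexit, which is the cost the TSC reference page avoids.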
SLIDE 16

Hyper-V clocks (cont’d)

  • TSC reference page: similar to kvm_clock

time = ((scale * tsc) >> 64) + offset

  • no vmexits
  • invariant TSC req’d
  • one per VM
  • read consistency via seqcount
  • seqcount == 0 ⇒ fall-back to time ref count
  • no seqlock semantics ⇒ use fall-back on updates ⇒ monotonicity with time ref count req’d
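The read procedure for the TSC reference page can be sketched as follows. This is a minimal model under stated assumptions: `TscPage`, `read_reference_time` and the three callables are hypothetical names, and calling `page_snapshot()` twice stands in for re-reading the sequence field of the shared page.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass
class TscPage:
    sequence: int   # 0 means "page unusable, take the fall-back clock"
    scale: int      # 64.64 fixed-point multiplier
    offset: int     # added after scaling, in 100ns ticks


def read_reference_time(page_snapshot, read_tsc, read_ref_counter):
    """Return partition reference time in 100ns ticks."""
    while True:
        before = page_snapshot()
        if before.sequence == 0:
            # Page is invalid or being updated: use the time reference counter.
            return read_ref_counter()
        tsc = read_tsc()
        ticks = (((before.scale * tsc) >> 64) + before.offset) & MASK64
        if page_snapshot().sequence == before.sequence:
            return ticks
        # Sequence changed while reading: values may be inconsistent, retry.
```

Because readers fall back to the reference counter whenever the sequence is 0, both clocks have to agree closely enough that switching between them never makes time go backwards.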

SLIDE 17

Hyper-V SynIC (synthetic interrupt controller)

  • LAPIC extension managed via MSRs
  • 16 SINT’s per vCPU
  • AutoEOI support
  • incompatible with APICv
  • KVM_IRQ_ROUTING_HV_SINT
  • GSI → vCPU#, SINT#
  • irqfd support
  • KVM_EXIT_HYPERV(SYNIC) on MSR access
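The GSI → (vCPU#, SINT#) mapping can be pictured as a small lookup table. The sketch below is only a model of that idea: `SintRouting` and its methods are hypothetical and do not use or mirror the actual KVM routing ioctl.

```python
class SintRouting:
    """Toy GSI -> (vCPU index, SINT number) table (hypothetical, not the KVM API)."""

    NUM_SINTS = 16   # 16 SINTs per vCPU

    def __init__(self):
        self._routes = {}

    def add_route(self, gsi, vcpu, sint):
        if not 0 <= sint < self.NUM_SINTS:
            raise ValueError("SINT number must be in 0..15")
        self._routes[gsi] = (vcpu, sint)

    def signal(self, gsi, deliver):
        """Look up the route for a GSI and deliver the SINT to its vCPU."""
        vcpu, sint = self._routes[gsi]
        deliver(vcpu, sint)
```

With such a route in place, anything that can raise a GSI (for example an eventfd bound through irqfd) can trigger a SINT without knowing the vCPU or SINT number.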
SLIDE 18

Hyper-V SynIC — message page

guest receive:

  • read payload
  • msg_type: atomic TYPE_NNN→TYPE_NONE
  • EOI or EOM ⇒ eventfd

hypervisor post:

  • msg_type: CAS TYPE_NONE→TYPE_NNN
  • write payload
  • deliver SINTx

[Diagram: 4096-byte message page with one 256-byte slot per SINT (SINT0 … SINT15); each slot holds a header (with msg_type) and a payload]
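The post/receive handshake on a message slot can be sketched as below. This is a simplified model, assuming one slot per SINT and using a lock in place of the real atomic operations; `MessagePage` and the callback parameters are hypothetical names.

```python
import threading

PAGE_SIZE = 4096
SLOT_SIZE = 256
NUM_SINTS = PAGE_SIZE // SLOT_SIZE   # 16 slots, one per SINT
TYPE_NONE = 0


class MessagePage:
    def __init__(self):
        self._lock = threading.Lock()    # stands in for hardware atomics
        self.msg_type = [TYPE_NONE] * NUM_SINTS
        self.payload = [b""] * NUM_SINTS

    def _cas_type(self, sint, expected, new):
        with self._lock:
            if self.msg_type[sint] != expected:
                return False
            self.msg_type[sint] = new
            return True

    def post(self, sint, msg_type, payload, deliver_sint):
        """Hypervisor side: claim the slot, fill it, then raise SINTx."""
        if not self._cas_type(sint, TYPE_NONE, msg_type):
            return False                 # slot busy: retry after EOI/EOM
        self.payload[sint] = payload
        deliver_sint(sint)
        return True

    def receive(self, sint, notify_host):
        """Guest side: read the payload, free the slot, then signal EOI/EOM."""
        with self._lock:
            msg_type = self.msg_type[sint]
        if msg_type == TYPE_NONE:
            return None
        payload = self.payload[sint]
        self._cas_type(sint, msg_type, TYPE_NONE)
        notify_host(sint)                # EOI or EOM, seen as an eventfd on the host
        return msg_type, payload
```

A failed `post` is the case where the host has to wait for the guest's EOI or EOM notification before trying again.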

SLIDE 19

Hyper-V SynIC — event flags page

[Diagram: 4096-byte event flags page with one 256-byte area (2048 event flag bits) per SINT (SINT0 … SINT15)]

hypervisor signal:

  • event flag: CAS 0→1
  • deliver SINTx

guest receive:

  • event flag: atomic 1→0
  • EOI or EOM ⇒ eventfd
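The signal/receive steps on the event flags page can be sketched as a bitmap with test-and-set and test-and-clear operations. The model below is not atomic, and it makes two assumptions not stated on the slide: bits are numbered from the least significant bit of each byte, and SINTx is delivered only when the flag was not already set. `EventFlagsPage` is a hypothetical name.

```python
PAGE_SIZE = 4096
AREA_SIZE = 256                   # bytes per SINT
FLAGS_PER_SINT = AREA_SIZE * 8    # 2048 event flags (bits)


class EventFlagsPage:
    def __init__(self):
        self.page = bytearray(PAGE_SIZE)

    def _locate(self, sint, flag):
        if not 0 <= sint < PAGE_SIZE // AREA_SIZE:
            raise ValueError("SINT out of range")
        if not 0 <= flag < FLAGS_PER_SINT:
            raise ValueError("event flag out of range")
        return sint * AREA_SIZE + flag // 8, 1 << (flag % 8)

    def signal(self, sint, flag, deliver_sint):
        """Hypervisor side: set the flag (0 -> 1) and raise SINTx if it was clear."""
        index, mask = self._locate(sint, flag)
        already_set = bool(self.page[index] & mask)
        self.page[index] |= mask
        if not already_set:
            deliver_sint(sint)
        return not already_set

    def receive(self, sint, notify_host):
        """Guest side: collect and clear (1 -> 0) all pending flags for SINTx."""
        pending = []
        for flag in range(FLAGS_PER_SINT):
            index, mask = self._locate(sint, flag)
            if self.page[index] & mask:
                self.page[index] &= ~mask & 0xFF
                pending.append(flag)
        notify_host(sint)             # EOI or EOM, seen as an eventfd on the host
        return pending
```

Unlike the message page, many signals can be pending at once here: each flag is one bit, so repeated signals for the same flag coalesce.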
SLIDE 20

Hyper-V timers

  • per vCPU: 4 timers × 2 MSRs (config, count)
  • in partition reference time
  • SynIC messages HVMSG_TIMER_EXPIRED
  • expiration time
  • delivery time
  • in KVM ⇒ first to take message slot
  • periodic / one-shot
  • lazy (= discard) / period modulation (= slew)
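A timer's behaviour can be sketched in terms of partition reference time. The model below is a simplification with hypothetical names (`SynTimer`, `TimerExpired`, `discard_missed`): "discard" is modelled as skipping missed periods, and the alternative simply delivers missed expirations back to back, which is only a crude stand-in for period modulation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimerExpired:
    timer_index: int
    expiration_time: int   # when the timer was due (100ns ticks)
    delivery_time: int     # when the message was posted (100ns ticks)


@dataclass
class SynTimer:
    index: int
    due: int                       # next expiration, in partition reference time
    period: Optional[int] = None   # None means one-shot
    discard_missed: bool = False   # this sketch's reading of "lazy (= discard)"
    enabled: bool = True

    def poll(self, now):
        """Return a TimerExpired message if the timer is due at `now`."""
        if not self.enabled or now < self.due:
            return None
        msg = TimerExpired(self.index, self.due, now)
        if self.period is None:
            self.enabled = False                       # one-shot: fire once
        elif self.discard_missed:
            missed = (now - self.due) // self.period
            self.due += (missed + 1) * self.period     # skip missed periods
        else:
            self.due += self.period                    # missed periods fire later
        return msg
```

Carrying both the expiration time and the delivery time in the message lets the guest see how late a timer fired.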
SLIDE 21

Hyper-V hypercalls

extend existing implementation in KVM:

  • new hypercalls
  • HVCALL_POST_MESSAGE
  • HVCALL_SIGNAL_EVENT
  • pass-through to userspace
  • KVM_EXIT_HYPERV(HCALL)
  • stub implementation in QEMU
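The pass-through to userspace amounts to a dispatch on the hypercall code. The sketch below is illustrative only: `HypercallDispatcher` is a hypothetical class, and the numeric constants are the values I believe the Hyper-V specification and the Linux headers use; check them against those sources before relying on them.

```python
# Assumed values, to be verified against the Hyper-V specification.
HVCALL_POST_MESSAGE = 0x005C
HVCALL_SIGNAL_EVENT = 0x005D
HV_STATUS_SUCCESS = 0
HV_STATUS_INVALID_HYPERCALL_CODE = 2


class HypercallDispatcher:
    """Toy userspace handler for hypercalls passed up by the kernel."""

    def __init__(self):
        self._handlers = {}

    def register(self, code, handler):
        self._handlers[code] = handler

    def handle_exit(self, hypercall_input, params):
        """Dispatch one hypercall exit and return the status for the guest."""
        code = hypercall_input & 0xFFFF    # low 16 bits carry the call code
        handler = self._handlers.get(code)
        if handler is None:
            return HV_STATUS_INVALID_HYPERCALL_CODE
        return handler(params)
```

A stub implementation in this model is just a handler that returns a status without doing any work; the VMBus layer later replaces it with real message and event handling.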
SLIDE 22

Hyper-V VMBus

  • announced via ACPI
  • host–guest messaging connection
  • host → guest: SINT & message page
  • guest → host: POST_MESSAGE hypercall
  • used to
  • negotiate version and parameters
  • discover & set up devices
  • set up channels
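The control-plane exchange (version negotiation, then device discovery) can be sketched as a small host-side state machine. Everything here is illustrative: the message names loosely follow the Linux guest code, while the dictionary message format, the version numbers and the `VMBusControl` class are assumptions of this sketch.

```python
from enum import Enum, auto


class State(Enum):
    DISCONNECTED = auto()
    CONNECTED = auto()
    OFFERS_SENT = auto()


class VMBusControl:
    """Toy host-side control plane; message names and versions are illustrative."""

    SUPPORTED_VERSIONS = {1, 2}   # hypothetical version numbers

    def __init__(self, devices, post_to_guest):
        self.devices = list(devices)
        self.post_to_guest = post_to_guest   # host -> guest: SINT & message page
        self.state = State.DISCONNECTED

    def on_guest_message(self, msg):
        """Handle one message that arrived via the POST_MESSAGE hypercall."""
        kind = msg["type"]
        if kind == "INITIATE_CONTACT":
            supported = msg["version"] in self.SUPPORTED_VERSIONS
            if supported:
                self.state = State.CONNECTED
            self.post_to_guest({"type": "VERSION_RESPONSE", "supported": supported})
        elif kind == "REQUEST_OFFERS" and self.state is State.CONNECTED:
            for device in self.devices:
                self.post_to_guest({"type": "OFFER_CHANNEL", "device": device})
            self.post_to_guest({"type": "ALL_OFFERS_DELIVERED"})
            self.state = State.OFFERS_SENT
        else:
            raise ValueError(f"unexpected message {kind!r} in state {self.state.name}")
```

Channel setup would continue from `OFFERS_SENT`, with the guest opening a channel for each offer it wants to use.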
SLIDE 23

Hyper-V VMBus channel

entity similar to VirtIO virtqueue

  • descriptor rings akin to VirtIO vrings
  • 1+ per device
  • signaling:
  • host → guest: SINT & event flags page
  • guest → host: SIGNAL_EVENT hypercall
  • used for data transfer
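The data path of a channel can be pictured as a ring with a read index and a write index, plus a signal when new data arrives. The sketch below is a generic byte ring, not the VMBus wire format; the class name `Ring` and the choice to signal only on the empty to non-empty transition are assumptions of this example.

```python
class Ring:
    """Byte ring with read/write indices (a simplification, not the wire format)."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.read_index = 0
        self.write_index = 0

    def _used(self):
        return (self.write_index - self.read_index) % len(self.buf)

    def free_space(self):
        # One byte stays unused so that "full" and "empty" can be told apart.
        return len(self.buf) - 1 - self._used()

    def write(self, data, signal):
        """Producer: copy data in, signal the peer if the ring was empty."""
        if len(data) > self.free_space():
            return False
        was_empty = self._used() == 0
        for byte in data:
            self.buf[self.write_index] = byte
            self.write_index = (self.write_index + 1) % len(self.buf)
        if was_empty:
            signal()          # e.g. SINT & event flag, or SIGNAL_EVENT hypercall
        return True

    def read(self):
        """Consumer: drain everything between read_index and write_index."""
        out = bytearray()
        while self.read_index != self.write_index:
            out.append(self.buf[self.read_index])
            self.read_index = (self.read_index + 1) % len(self.buf)
        return bytes(out)
```

A channel in this picture is a pair of such rings, one per direction, with the signalling mechanism differing by direction as listed above.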
SLIDE 24

Hyper-V VMBus devices

  • util (shutdown, heartbeat, timesync, VSS, etc)
  • storage
  • net
  • balloon

SLIDE 25

Firmware support

needed to boot off Hyper-V storage or network

  • SeaBIOS
  • OVMF

⇒ port over from kernel

SLIDE 26

Summary

  • Hyper-V / VMBus emulation is a viable solution to make Windows guests’ life on QEMU/KVM easier
  • we have the groundwork in KVM and QEMU mostly complete
  • the actual VMBus devices implementation is being worked on

SLIDE 27

Outlook

  • performance measurement & tuning
  • vhost integration
  • AF_VSOCK transport
  • event logging
  • debugging
  • more devices
  • input
  • video