Vhost and VIOMMU
Jason Wang <jasowang@redhat.com> (Wei Xu <wexu@redhat.com>) Peter Xu <peterx@redhat.com>
Agenda
– IOMMU & Qemu vIOMMU background
– Motivation of secure virtio
– DMAR (DMA Remapping)
08/18/16 VHOST AND VIOMMU 2
– Design Overview
– Implementation illustration
– Performance optimization
– vhost device iotlb
– A hardware component that provides two main functions: I/O translation and device isolation.
– DMA Remapping (DMAR): I/O-space addresses presented by devices are translated on the fly to physical addresses, together with an access-permission check, so devices can access only specific regions of memory.
– Interrupt Remapping (IR): some architectures also support interrupt remapping, in a manner similar to memory remapping.
– An emulated IOMMU that behaves like a real one.
– Its functionality is always a subset of that of a physical unit, depending on the implementation.
– Only Intel, ppc and sun4m IOMMUs are currently supported in QEMU.
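The DMA remapping bullet above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyIommu`, `Mapping`, `DmaFault` and the 4 KiB page size are hypothetical names and values chosen for illustration, and a real IOMMU walks multi-level tables per device rather than a flat dictionary.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumption: 4 KiB pages


class DmaFault(Exception):
    """Raised when a device access is unmapped or not permitted."""


@dataclass(frozen=True)
class Mapping:
    phys_page: int  # physical page frame number
    readable: bool
    writable: bool


class ToyIommu:
    """Hypothetical flat IOVA -> physical table (not a real IOMMU layout)."""

    def __init__(self):
        self._table = {}  # IOVA page number -> Mapping

    def map(self, iova, phys, readable=True, writable=False):
        self._table[iova // PAGE_SIZE] = Mapping(phys // PAGE_SIZE, readable, writable)

    def translate(self, iova, is_write):
        entry = self._table.get(iova // PAGE_SIZE)
        if entry is None:
            raise DmaFault(f"no mapping for iova {iova:#x}")
        # The permission check is what limits the device to specific regions.
        allowed = entry.writable if is_write else entry.readable
        if not allowed:
            kind = "write" if is_write else "read"
            raise DmaFault(f"{kind} not permitted at iova {iova:#x}")
        return entry.phys_page * PAGE_SIZE + iova % PAGE_SIZE


iommu = ToyIommu()
iommu.map(iova=0x1000, phys=0x7F000, readable=True, writable=False)
print(hex(iommu.translate(0x1234, is_write=False)))  # 0x7f234
```

A write to the same page, or any access to an unmapped IOVA, raises `DmaFault` in this model, which is the isolation half of the picture.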
[Figure: a VM with vCPU, vMMU, memory, emulated devices and a vIOMMU, running on a host with CPU, MMU, host memory, hardware devices and an IOMMU.]
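[Figure: virtio-net without vIOMMU. The guest vring carries guest physical addresses (gpa). Virtio-net in QEMU uses the memory API, and the virtio-net backend service (vhost-net, vhost-user or other virtio-net backends) handles tx/rx by translating gpa to hva to reach guest pages.]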
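[Figure: virtio-net with vIOMMU. The guest IOMMU driver maps buffers through the DMA API, so the vring carries I/O virtual addresses (iova). Virtio-net in QEMU resolves them through the vIOMMU by IOTLB entry lookup (IOTLB API) alongside the memory API, and the virtio-net backend service (vhost-net, vhost-user or other virtio-net backends) handles tx/rx by translating iova to hva to reach guest pages.]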
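[Figure: ATS. PCIe Device A and PCIe Device B send ATS requests to the Translation Agent (TA) in the Root Complex, receive ATS completions, and keep the results in a device IOTLB cache; the Root Complex connects to memory.]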
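[Figure: vhost device IOTLB. Vhost keeps device IOTLB cache entries in an interval tree, for example (a, size, ro), (b, size, wo), (c, size, rw) and (d, size, wo), and looks them up for vring and Tx/Rx accesses. Translating iova 'd' on a cache miss sends iotlb-miss 'd' through the Vhost IOTLB API; QEMU looks it up through the IOTLB API; a legal address range comes back as iotlb-update 'd' and is inserted as a new entry, while an illegal address range produces an error report. A guest unmap of 'c' is propagated as iotlb invalidate 'c'.]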
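The miss, update and invalidate flow in the figure above can be sketched as a small cache. This is an illustration only: `DeviceIotlb`, `translate` and the `ask_qemu` callback are hypothetical names, the addresses are made up, and a sorted list with `bisect` stands in for the interval tree named on the slide.

```python
import bisect

PERMS = {"ro": {"r"}, "wo": {"w"}, "rw": {"r", "w"}}


class DeviceIotlb:
    """Toy device IOTLB cache keyed by IOVA range (sorted list, not a real interval tree)."""

    def __init__(self):
        self._starts = []   # sorted start addresses of cached ranges
        self._entries = {}  # start -> (size, hva_base, perm)

    def invalidate(self, iova, size):
        # Drop every cached range that overlaps [iova, iova + size).
        end = iova + size
        keep = []
        for start in self._starts:
            entry_size = self._entries[start][0]
            if start < end and iova < start + entry_size:
                del self._entries[start]
            else:
                keep.append(start)
        self._starts = keep

    def update(self, iova, size, hva, perm):
        self.invalidate(iova, size)  # replace any stale overlapping entries
        bisect.insort(self._starts, iova)
        self._entries[iova] = (size, hva, perm)

    def lookup(self, iova, access):
        i = bisect.bisect_right(self._starts, iova) - 1
        if i < 0:
            return None
        start = self._starts[i]
        size, hva, perm = self._entries[start]
        if iova >= start + size:
            return None  # cache miss
        if access not in PERMS[perm]:
            raise PermissionError(f"{access!r} access denied at iova {iova:#x}")
        return hva + (iova - start)


def translate(cache, iova, access, ask_qemu):
    """Look up iova; on a miss, ask the (hypothetical) QEMU side for an update."""
    hva = cache.lookup(iova, access)
    if hva is None:
        reply = ask_qemu(iova)  # stands in for iotlb-miss / iotlb-update
        if reply is None:
            raise LookupError(f"illegal address {iova:#x}")  # error report
        cache.update(*reply)
        hva = cache.lookup(iova, access)
    return hva


HVA_BASE = 0x7F0000000000  # made-up host virtual base address
cache = DeviceIotlb()
cache.update(0x1000, 0x1000, HVA_BASE + 0x1000, "ro")  # a
cache.update(0x3000, 0x1000, HVA_BASE + 0x3000, "wo")  # b
cache.update(0x5000, 0x1000, HVA_BASE + 0x5000, "rw")  # c


def ask_qemu(iova):
    # Pretend only range 'd' is mapped on the QEMU side.
    if 0x8000 <= iova < 0x9000:
        return (0x8000, 0x1000, HVA_BASE + 0x8000, "wo")
    return None


hva_d = translate(cache, 0x8010, "w", ask_qemu)  # miss, then update 'd'
cache.invalidate(0x5000, 0x1000)                 # guest unmaps 'c'
```

After the last two lines, `d` is cached and `c` is gone, so a later access to `c` would go through the miss path again.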
[Figure: interrupt delivery topology. Three processors, each with a Local APIC, sit on the system bus; a bridge connects the PCI bus, which raises line-based interrupts through the IOAPIC and signal-based interrupts (MSI/MSI-X).]
Kinds of interrupts:
– Line-based (edge/level)
– Signal-based (MSI/MSI-X)
IRQ chips:
– IOAPIC
– Local APICs (LAPICs)
– MSI and IOAPIC interrupts
– How to define the interface between user and kernel space?
– How to enable the vhost fast irq path (irqfd)?
– Fill in the IOAPIC entry with interrupt information (trigger mode, destination ID, destination mode, etc.).
– When the line is triggered, the interrupt is sent to the CPU with the information stored in the IOAPIC entry.
– Fill in the IRTE with interrupt information (in system memory).
– Fill in the IOAPIC entry with the IRTE index.
– When the line is triggered, fetch the IRTE index from the IOAPIC entry and send the interrupt with the information stored in that IRTE.
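The two IOAPIC flows above (interrupt information stored directly in the entry, versus an entry that only holds an IRTE index) can be contrasted in a toy model. The class and field names below are hypothetical and do not reflect the real register or IRTE layout; the vector and destination values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InterruptInfo:
    vector: int
    dest_id: int
    dest_mode: str     # "physical" or "logical"
    trigger_mode: str  # "edge" or "level"


@dataclass(frozen=True)
class IoapicEntry:
    # Exactly one of the two fields is set in this sketch.
    info: Optional[InterruptInfo] = None  # without IR: data lives in the entry
    irte_index: Optional[int] = None      # with IR: entry only points at an IRTE


def line_triggered(entry, irt):
    """Return the interrupt information that would be sent to the CPU."""
    if entry.irte_index is not None:
        # With IR: fetch the IRTE (held in system memory) by index.
        return irt[entry.irte_index]
    # Without IR: use what is stored in the IOAPIC entry itself.
    return entry.info


# Interrupt remapping table, modelled as index -> InterruptInfo.
irt = {5: InterruptInfo(vector=0x41, dest_id=2, dest_mode="physical", trigger_mode="edge")}

direct = IoapicEntry(info=InterruptInfo(0x30, 0, "physical", "level"))
remapped = IoapicEntry(irte_index=5)

direct_result = line_triggered(direct, irt)
remapped_result = line_triggered(remapped, irt)
```

In the remapped case, changing the destination only requires rewriting the IRTE; the IOAPIC entry keeps pointing at the same index.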
[Figure: MSI delivery without IR versus with IR. Without IR, the interrupt request (MSI) is parsed and delivered directly. With IR, the interrupt request (MSI with IR) is used for indexing into the Interrupt Remapping Table, the Interrupt Remapping Table Entry (IRTE) is looked up, and the resulting interrupt is delivered.]
– Leverage the existing GSI routing table in KVM
– Instead of translating “on the fly”, translate during setup
– Easy to implement (no KVM change required)
– Little performance impact (slow setup, fast delivery)
– Only supports “split|off” kernel irqchip, not “on”
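The "translate during setup" idea above can be sketched as follows. This is a toy model, not the KVM interface: `ToyGsiRouting`, `RemappableMsi`, `MsiMessage` and the GSI number are hypothetical, and the `translations` counter exists only to show where the translation cost is paid.

```python
from typing import NamedTuple


class MsiMessage(NamedTuple):
    """Translated message, ready for delivery."""
    dest_id: int
    vector: int


class RemappableMsi(NamedTuple):
    """Guest-programmed message that only carries an IRTE index."""
    irte_index: int


class ToyGsiRouting:
    """Stand-in for a GSI routing table holding already-translated messages."""

    def __init__(self, irt):
        self._irt = irt        # interrupt remapping table: index -> MsiMessage
        self._routes = {}      # gsi -> translated MsiMessage
        self.translations = 0  # counts IRTE lookups, for illustration

    def _translate(self, guest_msg):
        self.translations += 1
        return self._irt[guest_msg.irte_index]

    def setup_route(self, gsi, guest_msg):
        # Slow path: translate once and store the result in the routing table.
        self._routes[gsi] = self._translate(guest_msg)

    def deliver(self, gsi):
        # Fast path: no translation, just a table lookup.
        return self._routes[gsi]


irt = {3: MsiMessage(dest_id=1, vector=0x41)}
routing = ToyGsiRouting(irt)
routing.setup_route(gsi=24, guest_msg=RemappableMsi(irte_index=3))

first = routing.deliver(24)
second = routing.deliver(24)
```

Both deliveries return the same translated message while `routing.translations` stays at 1. In this model a route has to be set up again whenever the guest changes the IRTE, which is where the "slow setup" cost would show up; that re-setup step is an assumption of the sketch.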
[Figure: irqfd path without IR. QEMU sets up the guest notifier event and the GSI routing table in KVM with MSI Message 1 to MSI Message 4; vhost signals the event and KVM performs the IRQ injection into the guest.]
[Figure: irqfd path with IR. QEMU sets up the guest notifier event and the GSI routing table in KVM with Translated MSI Message 1 to Translated MSI Message 4; vhost signals the event and KVM performs the IRQ injection into the guest.]
qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
– Performance dropped drastically
– TCP_STREAM: 24500 Mbps
– TCP_RR: 25000 trans/s
– Around 5% performance drop for throughput (pktgen)
– Still more work TBD...
Mode      IOAPIC         APIC
“ON”      In kernel      In kernel
“SPLIT”   In userspace   In kernel
“OFF”     In userspace   In userspace
qemu-system-x86_64 -M q35,kernel-irqchip={on|off|split}