Virtio 1: why do it? And are we there yet?



SLIDE 1

Virtio 1: why do it? And are we there yet?
2015, Michael S. Tsirkin, Red Hat

Uses material from https://lwn.net/Kernel/LDD3/, Gcompris, and tuxpaint. Distributed under the Creative Commons license.

SLIDE 2

Lots of work ...

[Chart: commits per year, 2006 to 2015, counted from Aug 6 to Aug 6 in each year; vertical axis from 50 to 300 commits]

SLIDE 3

Virtio 1: update

  • Documented assumptions
  • More Robust
  • More Extendable

SLIDE 4

Conformance statements

Virtio 0.9:

  • DRIVER_OK status bit is set.
  • The device can now be used.

drv->probe(dev); netif_carrier_on(dev); add_status(dev, DRIVER_OK);

Virtio 1.0:

The driver MUST NOT notify the device before setting DRIVER_OK.

drv->probe(dev); add_status(dev, DRIVER_OK); netif_carrier_on(dev);
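
The ordering rule on this slide can be illustrated with a small model. This is only a sketch: `OrderCheckedDevice` and `bring_up` are hypothetical names, and the idea that turning the carrier on may immediately trigger a queue notification is an assumption made for the example. Only the `DRIVER_OK` bit value (4) comes from the virtio specification.

```python
DRIVER_OK = 4  # status bit value from the virtio specification


class OrderCheckedDevice:
    """Hypothetical device model that rejects early notifications."""

    def __init__(self):
        self.status = 0
        self.notifications = []

    def add_status(self, bits):
        self.status |= bits

    def notify(self, queue_index):
        # Virtio 1.0 rule: no notification before DRIVER_OK is set.
        if not self.status & DRIVER_OK:
            raise RuntimeError("notify before DRIVER_OK")
        self.notifications.append(queue_index)


def bring_up(dev, order):
    """Run the named steps in the given order."""
    for step in order:
        if step == "driver_ok":
            dev.add_status(DRIVER_OK)
        elif step == "carrier_on":
            # Assumption: traffic, and so a notify, may start right away.
            dev.notify(0)
    return dev
```

With the 1.0 ordering (`["driver_ok", "carrier_on"]`) the notification is accepted; with the 0.9-style ordering the model raises an error.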

SLIDE 5

Virtio 0.9: inflate

[Diagram: a 32-bit field (bits 0 to 31) holding FF FF 00 00 is incremented (+1) while the DRIVER reads it; byte patterns shown: FF FF 01 00 and 00 00 01 00]

SLIDE 6

Virtio 1.0: inflate

[Diagram: a 32-bit field (bits 0 to 31) goes from FF FF 00 00 to 00 00 01 00 on +1; the DRIVER side shows FF FF 00 00 and 00 00 01 00]

SLIDE 7

Generation counter

[Diagram: a 64-bit field (bits 0 to 63) goes from FFFFFFFF 00000000 to 00000000 00000001 on +1, next to a generation counter of 1; the DRIVER side shows FFFFFFFF 00000001 and 00000000 00000001]
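
A generation counter lets a driver detect that a multi-part read was torn by a concurrent device update. The sketch below assumes a 64-bit field read as two 32-bit halves, using the values from the slide. `TornConfig` is a hypothetical device model that changes the value exactly once, in the middle of the first read.

```python
class TornConfig:
    """Hypothetical config space whose value changes once, mid-read."""

    def __init__(self):
        self.value = 0x00000000FFFFFFFF
        self.generation = 0
        self.pending_update = True

    def read_generation(self):
        return self.generation

    def read_low32(self):
        low = self.value & 0xFFFFFFFF
        if self.pending_update:
            # The device increments the field right after the low half is read.
            self.value += 1
            self.generation += 1
            self.pending_update = False
        return low

    def read_high32(self):
        return self.value >> 32


def read_config_u64(cfg):
    """Read both halves, retrying until the generation is unchanged."""
    while True:
        before = cfg.read_generation()
        low = cfg.read_low32()
        high = cfg.read_high32()
        if cfg.read_generation() == before:
            return (high << 32) | low
```

Without the retry, the first pass would combine the old low half with the new high half; the generation check makes the driver read again until both halves belong to the same update.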

SLIDE 8

Memory map

[Diagram: 0.9 has one region with COMMON (FEATURES, QUEUE, STATUS, ISR) followed by DEVICE SPECIFIC. 1.0 has a CAPABILITY LIST whose VIRTIO CAPABILITY #1 and VIRTIO CAPABILITY #2 entries refer to the IO BAR and MEMORY BAR, where DEVICE SPECIFIC and COMMON (FEATURES, QUEUE, STATUS, ISR, ...) live]
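
In virtio 1.0 over PCI, each structure is located through a vendor-specific PCI capability that names a BAR, an offset, and a length. The sketch below decodes one such capability from raw bytes, following the `virtio_pci_cap` field order in the virtio 1.0 specification. The function name and the returned dictionary shape are illustrative choices, not part of any API.

```python
import struct

PCI_CAP_ID_VNDR = 0x09  # vendor-specific PCI capability ID

# cfg_type values defined by the virtio 1.0 specification
CFG_TYPES = {1: "COMMON", 2: "NOTIFY", 3: "ISR", 4: "DEVICE", 5: "PCI"}

# cap_vndr, cap_next, cap_len, cfg_type, bar, 3 padding bytes, offset, length
VIRTIO_PCI_CAP = struct.Struct("<BBBBB3xII")


def parse_virtio_cap(raw):
    """Decode one virtio PCI capability; return None for other capabilities."""
    vndr, nxt, _cap_len, cfg_type, bar, offset, length = VIRTIO_PCI_CAP.unpack_from(raw)
    if vndr != PCI_CAP_ID_VNDR:
        return None
    return {
        "type": CFG_TYPES.get(cfg_type, "UNKNOWN"),
        "bar": bar,
        "offset": offset,
        "length": length,
        "next": nxt,
    }
```

A driver would walk the capability list using the `next` pointer and collect one entry per structure type.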

SLIDE 9

Virtio 0.9: Port IO vs Memory

[Comparison of Port IO and MM IO; entries: x86 decode: address; x86 decode: data; Fast on x86; 32/64 bit; Page tables; Required by PCI Express]

SLIDE 10

0.9

[Diagram: NOTIFY write at one ADDRESS; the VQ NUMBER is carried in DATA bits 0..15]

1.0

Fast MMIO: avoid the need to decode data

[Diagram: NOTIFY write; labels: ADDRESS, VQ NUMBER, DATA bits 0..15, bits 16..31 IGNORED]
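
One way to avoid decoding the written data is to give each virtqueue its own notify address, so the address alone identifies the queue. The virtio 1.0 specification places a queue's notify location at `cap.offset + queue_notify_off * notify_off_multiplier` within the BAR. The sketch below applies that formula; `bar_base` and the helper names are hypothetical.

```python
def notify_address(bar_base, cap_offset, queue_notify_off, notify_off_multiplier):
    """Address a driver writes to in order to notify one virtqueue."""
    return bar_base + cap_offset + queue_notify_off * notify_off_multiplier


def build_notify_map(bar_base, cap_offset, notify_off_multiplier, queue_notify_offs):
    """Hypervisor-side view: map each notify address back to a queue index."""
    return {
        notify_address(bar_base, cap_offset, off, notify_off_multiplier): vq
        for vq, off in enumerate(queue_notify_offs)
    }
```

With a multiplier of 4 and offsets 0, 1, 2, three queues get three distinct addresses, and a lookup by faulting address yields the queue number without looking at the data.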

SLIDE 11

Virtio 1: Access times on KVM x86: Cycles per access (lower is better)

[Bar chart: CPU cycles (axis 500 to 4000) for MMIO, Fast MMIO, and Port IO]

SLIDE 12

Virtio 1: Port IO vs Memory

[Comparison of Port IO and MM IO; entries: x86 decode: address; Fast on x86; 32/64 bit; Page tables; Required by PCI Express]

SLIDE 13

Memory Region Aliases

[Diagram labels: CAPABILITY LIST, IO BAR, MEMORY BAR, VIRTIO CONFIG CAPABILITY, VIRTIO CAPABILITY, Queue Notify (shown three times), VirtQueue]

SLIDE 14

soft mac

[Diagram: in 0.9 the DRIVER sets the Ethernet MAC 52 54 00 12 34 56 directly; in 1.0 the DRIVER sends the Ethernet MAC 52 54 00 12 34 56 through a VirtQueue]

SLIDE 15

Virtio feature negotiation

[Diagram: DEVICE FEATURES bitmap (bits 0, 1, 2, ...) read by the DRIVER, which writes back the DRIVER FEATURES bitmap]

Defaults must be maintained forever!

SLIDE 16

Virtio 1: Error handling

  • DRIVER: set features
  • DRIVER: set FEATURES_OK bit
  • DEVICE: check features
  • DEVICE: clear FEATURES_OK on error
  • DRIVER: check FEATURES_OK bit
  • DRIVER: fail gracefully if not set
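
The handshake listed above can be sketched as follows. The status bit values come from the virtio specification; `FeatureCheckingDevice`, its `required` mask, and `negotiate` are hypothetical and stand in for whatever checks a real device performs.

```python
ACKNOWLEDGE, DRIVER, DRIVER_OK, FEATURES_OK, FAILED = 1, 2, 4, 8, 128


class FeatureCheckingDevice:
    """Hypothetical device that validates features when FEATURES_OK is set."""

    def __init__(self, offered, required=0):
        self.offered = offered
        self.required = required  # assumption: bits this device insists on
        self.driver_features = 0
        self.status = 0

    def write_status(self, value):
        newly_set = value & FEATURES_OK and not self.status & FEATURES_OK
        if newly_set:
            subset = (self.driver_features & ~self.offered) == 0
            has_required = (self.driver_features & self.required) == self.required
            if not (subset and has_required):
                value &= ~FEATURES_OK  # DEVICE: clear FEATURES_OK on error
        self.status = value


def negotiate(dev, supported):
    dev.write_status(ACKNOWLEDGE | DRIVER)
    dev.driver_features = dev.offered & supported  # DRIVER: set features
    dev.write_status(dev.status | FEATURES_OK)     # DRIVER: set FEATURES_OK
    if not dev.status & FEATURES_OK:               # DRIVER: check FEATURES_OK
        dev.write_status(dev.status | FAILED)      # DRIVER: fail gracefully
        return False
    dev.write_status(dev.status | DRIVER_OK)
    return True
```

The point of the extra round trip is that the device gets a defined place to refuse a feature set, and the driver gets a defined place to notice the refusal.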


SLIDE 17

Error handling: Virtio 0.9

  • Can't recover from device errors
  • Not very useful?
  • Just stop guest.

SLIDE 18

Vhost-user

[Diagram: virtio-net in the GUEST; the VHOST USER CLIENT does SETUP and DMA on VM RAM]

Client crash or restart need not cause guest crash!

SLIDE 19

DEVICE_NEEDS_RESET

DRIVER:

  • Read STATUS; detect: NEEDS_RESET set
  • Write STATUS=0 (will reset device)
  • Reconfigure device
  • Write STATUS=DRIVER_OK
  • Restart operation
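
The recovery sequence on this slide can be sketched as a small function. The bit values for `DRIVER_OK` (4) and `DEVICE_NEEDS_RESET` (64) come from the virtio specification; `ResettableDevice` and the `reconfigure` callback are hypothetical placeholders for a real device and a real driver's setup path.

```python
DRIVER_OK = 4
DEVICE_NEEDS_RESET = 64


class ResettableDevice:
    """Hypothetical device that starts out in the failed state."""

    def __init__(self):
        self.status = DRIVER_OK | DEVICE_NEEDS_RESET
        self.resets = 0

    def read_status(self):
        return self.status

    def write_status(self, value):
        if value == 0:
            self.resets += 1  # writing 0 resets the device
        self.status = value


def recover_if_needed(dev, reconfigure):
    """Reset and re-initialize the device if it asked for a reset."""
    if not dev.read_status() & DEVICE_NEEDS_RESET:
        return False
    dev.write_status(0)                              # reset
    reconfigure(dev)                                 # features, queues, ...
    dev.write_status(dev.read_status() | DRIVER_OK)  # device is live again
    return True
```

A driver would then restart whatever operations were in flight; that part depends on the device type and is left out here.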

SLIDE 20

Compatibility

[Diagram: pairings of Legacy and Modern DRIVER interfaces with Legacy and Modern device interfaces; labels: Transitional Device & Driver, Legacy Driver, Legacy Device]

SLIDE 21

Are we there yet?

[Diagram labels: GUEST, VHOST USER, DMA, GUEST BIOS, VHOST]

SLIDE 22

What to expect?

  • Current: Virtio-v1.0-cs03
  • Next bugfix: Virtio-v1.0-cs04
    – Virtio-blk: writeback / writethrough control
    – More update guidance
  • Next feature: Virtio-v1.1-cs01
    – Virtio-input
    – Virtio-gpu
    – Virtio-vsock

SLIDE 23

TX: Interrupt avoidance

uplink

SLIDE 24

TX: Interrupt coalescing

uplink

SLIDE 25

Pass-through for nested virt

  • Memory mapped: use page tables
  • IOMMU: translate and protect guest memory

Virtio Net (on host)

SLIDE 26

Virtio as PCI Express device

  • Uses memory mapped IO support
  • Multi-root for NUMA
  • Native hotplug
  • Advanced Error Reporting

SLIDE 27

Summary

  • Why do it?
    – Improved robustness for virtual devices
  • Are we there yet?
    – Yes!
    – And there's more to come.

SLIDE 28

Thank you!

SLIDE 29

Virtio 0.9: Port IO versus memory

  • On KVM x86: cycles per access (lower is better)

[Bar chart: CPU cycles (axis 500 to 4000) for MMIO and Port IO]

SLIDE 30

Virtio 1.0

PCI CCW (PPC) MMIO (ARM)

OASIS Virtio TC

SLIDE 31

Virtio 1.0

  • Virtio PCI:
    – Replace Port IO with Memory mapped IO
    – PCI Express (hotplug, AER, multi-root, SR-IOV)
    – Infinite features
  • Reduced memory requirements
  • Fixed endianness
  • Compatibility

SLIDE 32

Port IO: outl

[Diagram labels: instruction byte EF (OUT), operands (%DX) and %EAX; VM Exit with REASON, QUALIFICATION, STATE; notify VQ#]

SLIDE 33

Memory mapped IO: writel

[Diagram labels: instruction bytes 89 ... 3E (MOV), operands (%EDI) and %RSI; VM Exit with REASON, GUEST ADDRESS, RIP; PTE VALID?]

SLIDE 34

Fast MMIO

[Diagram labels: MOV, operands (%EDI) and %RSI; VM Exit with REASON, GUEST ADDRESS; PTE VALID?; notify VQ#]

SLIDE 35

Multiple interfaces

[Diagram labels: CAPABILITY LIST, IO BAR, MEMORY BAR, VIRTIO CAPABILITY #1, VIRTIO CAPABILITY #2]

SLIDE 36

Memory requirements

[Diagram: the desc, avail and used parts of a VQ, laid out for 0.9 and for 1.0]
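
To put numbers on the memory requirements, the sketch below compares two layouts. It assumes the legacy layout is one contiguous block with the used ring padded to a 4096-byte boundary, as in the legacy interface section of the virtio specification, and that in 1.0 the three parts are placed separately with no page-sized padding. Element sizes follow the split virtqueue structures; the function names are hypothetical.

```python
def align_up(value, alignment):
    """Round value up to a multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def ring_part_sizes(queue_size):
    """Byte sizes of the descriptor table, available ring and used ring."""
    desc = 16 * queue_size          # 16-byte descriptors
    avail = 2 * (3 + queue_size)    # flags, idx, ring[], used_event (u16 each)
    used = 2 * 3 + 8 * queue_size   # flags, idx, avail_event + 8-byte elements
    return desc, avail, used


def legacy_queue_bytes(queue_size, queue_align=4096):
    """Contiguous legacy layout: desc + avail, padding, then used, padding."""
    desc, avail, used = ring_part_sizes(queue_size)
    return align_up(desc + avail, queue_align) + align_up(used, queue_align)


def split_queue_bytes(queue_size):
    """Three independently placed parts, counted without padding."""
    return sum(ring_part_sizes(queue_size))
```

Under these assumptions, a queue of 8 entries occupies 8192 bytes in the legacy layout but needs only 220 bytes of ring data when the parts are placed separately; for small queues the padding dominates.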

SLIDE 37

[Diagram: in 0.9, DEVICE FEATURES and DRIVER FEATURES are single 32-bit bitmaps (bits 0..31) exchanged with the DRIVER. In 1.0, a SEL field picks which window of the features bitmap is accessed (windows 1, 2, 3, 4, ...), and the DRIVER then sets STATUS = FEATURES_OK]
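
A select register turns a fixed 32-bit feature field into a window onto a larger bitmap. The sketch below reads a wide feature set 32 bits at a time. `WindowedFeatures` is a hypothetical device model; the bit number 32 for `VIRTIO_F_VERSION_1` comes from the virtio 1.0 specification.

```python
VIRTIO_F_VERSION_1 = 32  # feature bit number from the virtio 1.0 specification


class WindowedFeatures:
    """Hypothetical device exposing its features through a 32-bit window."""

    def __init__(self, features):
        self.features = features
        self.select = 0

    def write_select(self, index):
        self.select = index

    def read_window(self):
        return (self.features >> (32 * self.select)) & 0xFFFFFFFF


def read_device_features(dev, windows=2):
    """Assemble the feature bitmap from consecutive 32-bit windows."""
    features = 0
    for index in range(windows):
        dev.write_select(index)
        features |= dev.read_window() << (32 * index)
    return features
```

A driver that only reads window 0 behaves like a 0.9 driver and never sees bit 32; reading more windows is what allows the feature space to grow.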

SLIDE 38

Endianness

[Diagram: Virtio 0.9 pairs Virtio LE with an LE device (intel) and Virtio BE with a BE device (PPC); Virtio 1.0 uses Virtio LE for every device]
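
The difference can be shown with a single 32-bit field. Legacy virtio uses the guest's native byte order, while virtio 1.0 fixes the order to little-endian. The function names and the `guest_is_big_endian` flag below are hypothetical, chosen only for this sketch.

```python
import struct


def encode_u32_legacy(value, guest_is_big_endian):
    """Legacy (0.9) encoding: the byte order follows the guest."""
    return struct.pack(">I" if guest_is_big_endian else "<I", value)


def encode_u32_v1(value):
    """Virtio 1.0 encoding: always little-endian."""
    return struct.pack("<I", value)
```

With the legacy rule, a device has to know the guest's byte order to interpret a field; with the 1.0 rule, the bytes are the same for every guest.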

SLIDE 39

[Diagram: compatibility between 0.9 and 1.0 Drivers and Devices]

SLIDE 40

Packet layout

[Diagram: packet layout in Virtio 1.0 and Virtio 0.9; labels: INDIRECT, next, header, header]

SLIDE 41

Packet layout: transactions per sec (higher is better)

[Bar chart: transactions/sec (axis 500 to 3500) for virtio 0.9 and virtio 1.0]

SLIDE 42

More: virtio 1.0 versus 0.9.5

  • Virtio 9p
  • Virtio blk: WCE
  • Virtio-net Multiqueue
  • Virtio-net dynamic offloads
  • Already upstream (based on spec draft)

SLIDE 43

vhost updates

  • Vhost scsi
  • Vhost-net zero copy transmit
  • No need for driver changes

SLIDE 44

Kvm networking

  • Openvswitch – if time allows
  • Ethernet bridge

SLIDE 45

Bridge FDB

London Heathrow Paris CDG London Paris Paris Heathrow CDG London uplink

SLIDE 46

Flood: DOS potential

London Heathrow Paris CDG Bangkok Paris Heathrow CDG London uplink

SLIDE 47

Disable flood

London Heathrow Paris CDG London Paris Paris Heathrow CDG London Bangkok uplink
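
The three bridge slides can be read as a forwarding database (FDB) lookup with flooding for unknown destinations. The sketch below is one interpretation of the diagrams, not the Linux bridge implementation: it treats London and Paris as known destinations behind the Heathrow and CDG ports, Bangkok as an unknown destination, and flooding as a per-port setting. `SketchBridge` and its methods are hypothetical.

```python
class SketchBridge:
    """Hypothetical bridge with an FDB and a per-port flood flag."""

    def __init__(self):
        self.fdb = {}    # destination -> port
        self.flood = {}  # port -> is unknown traffic flooded to this port?

    def add_port(self, port, flood=True):
        self.flood[port] = flood

    def add_fdb(self, destination, port):
        self.fdb[destination] = port

    def forward(self, in_port, destination):
        """Return the list of ports the frame is sent to."""
        if destination in self.fdb:
            return [self.fdb[destination]]
        # Unknown destination: flood to every flood-enabled port but the source.
        return sorted(p for p, on in self.flood.items() if on and p != in_port)
```

With flooding enabled everywhere, a frame for an unknown destination reaches every other port, which is where the denial-of-service potential comes from. With flooding turned off on the VM-facing ports, the same frame goes only to the uplink.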

SLIDE 48

softmac

  • ifconfig eth0 hw ether 00:12:23:45:67:89

virtio-net MAC

00:12:23:45:67:89

NEW

SLIDE 49

Using softmac/non promiscuous

London Heathrow Paris CDG Paris Paris Heathrow CDG

NEW

uplink

SLIDE 50

Work in progress

  • ELVIS (vhost blk/vhost net)
  • Virgl
  • Vhost-net performance

SLIDE 51

RX latency

NIC HOST VHOST VM

SLIDE 52

Fast rx

NIC HOST VHOST VM

Current?

RAM?

SLIDE 53

Fast rx: transactions per sec (higher is better)

[Bar chart: transactions/sec (axis 1000 to 7000) for thread and irq]

Hit 331668 Miss 79

SLIDE 54

Vhost-net threading

tap VHOST VM

RX TX

NIC VHOST VM

SLIDE 55

Vhost-net thread pool

tap VM NIC VM WQ VHOST VHOST

SLIDE 56

threading: UDP RR transactions/sec (higher is better)

[Chart: transactions/sec (axis 2000 to 16000) for thread and wq; horizontal axis values from 256 to 16384]

SLIDE 57

threading: TCP STREAM transactions/sec (higher is better)

[Chart: transactions/sec (axis 2000 to 14000) for thread and wq; horizontal axis values from 256 to 16384]

SLIDE 58

summary

  • Performance
  • Manageability
  • Security

SLIDE 59

Questions?

SLIDE 60

OVS: flow match

PACKET FLOW 192.68.0.1 22 192.68.0.1 12865 22 VM 12865 OVS-VSWITCHD kernel userspace

SLIDE 61

OVS: wildcard match

PACKET FLOW 192.68.0.1

*

22 VM 12865 OVS-VSWITCHD kernel userspace
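
The difference between an exact flow match and a wildcard match can be sketched with a tiny flow table. This is an illustration of the idea only, not Open vSwitch code: the field names (`dst_ip`, `src_port`, `dst_port`), the sample source ports, and the `lookup` helper are hypothetical, and a wildcarded field is modelled by leaving it out of the match.

```python
UPCALL = "upcall to ovs-vswitchd"


def lookup(flow_table, packet):
    """Return the action of the first flow whose listed fields all match."""
    for match, action in flow_table:
        if all(packet.get(field) == value for field, value in match.items()):
            return action
    return UPCALL  # miss: the packet goes to userspace


# Exact match: every field, including the client's source port, is pinned.
exact_table = [
    ({"dst_ip": "192.68.0.1", "src_port": 40000, "dst_port": 22}, "VM"),
]

# Wildcard match: the source port is left out, so any client port matches.
wildcard_table = [
    ({"dst_ip": "192.68.0.1", "dst_port": 22}, "VM"),
]
```

A workload that opens a new connection for every transaction changes the source port each time, so under these assumptions every new connection misses the exact table and takes the slow path, while the wildcard entry keeps matching.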

SLIDE 62

Wildcard: netperf CRR (higher is better)

[Bar chart: bi-connections/sec (axis 500 to 2500) for match and wildcard]
