KVM MMU Virtualization (Xiao Guangrong)



SLIDE 1

KVM MMU Virtualization

Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>

SLIDE 2

Index

• What is MMU Virtualization?
• How to implement MMU Virtualization?
• How to optimize MMU Virtualization?
• What will I do?

SLIDE 3

What is MMU Virtualization

[Figure: without virtualization, the CPU's MMU translates a virtual address to a physical address in physical memory. With virtualization, the guest view translates a GVA to a GPA in guest physical memory, and MMU virtualization maps that GPA to host physical memory.]

GVA: guest virtual address
GPA: guest physical address

SLIDE 4

The functions of MMU Virtualization

• Translate a guest physical address to the specified host physical address
• Control memory access permissions
  – R/W, NX, U/S
• Track the Accessed/Dirty bits of the guest page table

SLIDE 5

GFN to PFN in KVM

• Use ioctl(fd, KVM_SET_USER_MEMORY_REGION, kvm_userspace_memory_region) to register guest physical memory
  – guest_phys_addr, memory_size, userspace_addr

[Figure: a region of memory_size bytes starting at guest_phys_addr in the guest's GPA space is backed by the range starting at userspace_addr in the VMM's HVA space, which the host maps to HPA.]

GFN: Guest Frame Number
PFN: Host Page Frame Number
GPA: Guest Physical Address
HVA: Host Virtual Address
HPA: Host Physical Address
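
Sketch (not from the original slides): the GPA to HVA step implied by the three registered fields can be written as a range check plus an offset. The struct below is a hypothetical mirror of just those three fields; the real struct kvm_userspace_memory_region in <linux/kvm.h> has more. The names mem_region, gpa_to_hva and gpa_to_gfn are made up for illustration, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical mirror of the three fields named on the slide. */
struct mem_region {
    uint64_t guest_phys_addr; /* start of the region in guest physical space */
    uint64_t memory_size;     /* length of the region in bytes */
    uint64_t userspace_addr;  /* start of the backing buffer in the VMM (HVA) */
};

/* Translate a GPA to an HVA.
 * Returns 0 on success, -1 if the GPA lies outside the region. */
int gpa_to_hva(const struct mem_region *r, uint64_t gpa, uint64_t *hva)
{
    assert(r != NULL && hva != NULL);
    if (gpa < r->guest_phys_addr ||
        gpa - r->guest_phys_addr >= r->memory_size)
        return -1;
    *hva = r->userspace_addr + (gpa - r->guest_phys_addr);
    return 0;
}

/* A GFN is the GPA with the page offset dropped. */
uint64_t gpa_to_gfn(uint64_t gpa)
{
    return gpa >> PAGE_SHIFT;
}
```

The remaining HVA to HPA (PFN) step is done through the host's own page tables and is not modelled here.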

SLIDE 6

How to implement MMU Virtualization?

• Soft MMU
  – Implemented by software
• Hard MMU
  – Supported by hardware
    • NPT on SVM from AMD
    • EPT on VMX from Intel

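SLIDE 7

Soft MMU: overview

[Figure: the guest CR3 points to the guest page tables (Guest L3, Guest L2, Guest L1), which map guest pages; the guest page tables are write-protected. The host keeps matching shadow page tables (Shadow L3, Shadow L2, Shadow L1), each described by (level, access, gfn......), uses GFN to PFN translation so that they map host pages, and loads the shadow root into CR3.]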

SLIDE 8

Soft MMU: how to implement (1)

• Lower the access permission to keep consistency:
  – Guest page tables are write-protected to keep the shadow page table consistent with the guest page table
  – Remove the writable bit of a shadow PTE if the Dirty bit of the guest PTE is not set
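
Sketch (not from the original slides): the two rules above can be expressed as a function that derives a last-level shadow PTE from a guest PTE. The bit positions are the standard x86 PTE bits; make_shadow_pte and the gfn_is_page_table flag are hypothetical names, and this is a simplified illustration rather than KVM's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P  (1ULL << 0)   /* present */
#define PTE_RW (1ULL << 1)   /* writable */
#define PTE_US (1ULL << 2)   /* user-accessible */
#define PTE_A  (1ULL << 5)   /* accessed */
#define PTE_D  (1ULL << 6)   /* dirty */
#define PTE_NX (1ULL << 63)  /* no-execute */
#define PTE_ADDR_MASK 0x000ffffffffff000ULL

/* Build a last-level shadow PTE that maps host frame 'pfn' with the
 * permissions of 'guest_pte', lowered where consistency requires it. */
uint64_t make_shadow_pte(uint64_t guest_pte, uint64_t pfn,
                         bool gfn_is_page_table)
{
    assert((guest_pte & PTE_P) != 0);

    uint64_t spte = (pfn << 12) & PTE_ADDR_MASK;
    spte |= PTE_P | (guest_pte & (PTE_RW | PTE_US | PTE_NX));

    /* Guest Dirty bit clear: the first write must fault so the host
     * can set the Dirty bit in the guest PTE. */
    if (!(guest_pte & PTE_D))
        spte &= ~PTE_RW;

    /* The target frame is itself a guest page table: keep it
     * write-protected so guest updates are noticed. */
    if (gfn_is_page_table)
        spte &= ~PTE_RW;

    return spte;
}
```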

SLIDE 9

Soft MMU: how to implement (2)

• The guest events we should intercept:
  – Page fault
    • If the guest page table is invalid, inject this page fault to the guest
    • Atomically set the A/D bits of the guest page table
    • Set up/fix the mapping of the shadow page
    • Sync the shadow page table and guest page table if it is a write page fault
  – Load CR3
    • Flush TLB
    • Load the new shadow page
  – INVLPG
    • Flush TLB

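
Sketch (not from the original slides): the first two page-fault steps, reduced to a single guest PTE. If the guest mapping does not allow the access, the fault is reported as one to inject into the guest; otherwise the A bit (and the D bit for writes) is set with a compare-and-swap loop so a concurrent guest update is not lost. handle_guest_pf and the enum are hypothetical names; real handling walks all levels and also considers U/S, NX and CR0.WP.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P  (1ULL << 0)
#define PTE_RW (1ULL << 1)
#define PTE_A  (1ULL << 5)
#define PTE_D  (1ULL << 6)

enum pf_result { PF_INJECT_TO_GUEST, PF_FIXED };

/* Handle one fault against a single guest PTE (one-level toy model). */
enum pf_result handle_guest_pf(_Atomic uint64_t *guest_pte, bool is_write)
{
    assert(guest_pte != 0);
    uint64_t old = atomic_load(guest_pte);

    for (;;) {
        /* Guest mapping is invalid for this access: the guest's own
         * page-fault handler has to deal with it. */
        if (!(old & PTE_P) || (is_write && !(old & PTE_RW)))
            return PF_INJECT_TO_GUEST;

        uint64_t updated = old | PTE_A | (is_write ? PTE_D : 0);

        /* On failure 'old' is reloaded with the current value and the
         * checks above are repeated. */
        if (atomic_compare_exchange_weak(guest_pte, &old, updated))
            return PF_FIXED; /* next: set up or fix the shadow mapping */
    }
}
```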
SLIDE 10

Hard MMU

• EPT/NPT functions:
  – A new layer that translates guest physical addresses to host physical addresses
  – EPT/NPT is used for all guest physical address accesses, including MMIO and guest page table walking
• Compared to Soft MMU:
  – It's simple and reduces lots of VM exits...

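SLIDE 11

Hard MMU: overview

[Figure: the GVA is split into a directory offset, a table offset, and an offset. gCR3 plus the directory offset selects an entry in the Guest Directory, that entry plus the table offset selects an entry in the Guest Table, and that entry plus the offset gives the address in the Guest Page. Every GPA produced along the way is translated to an HPA through EPT/NPT.]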

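SLIDE 12

Hard MMU: translate GPA to HPA

[Figure: the GPA is split into the bit fields 63 ~ 48, 47 ~ 39, 38 ~ 30, 29 ~ 21, 20 ~ 12, and 11 ~ 0. Starting from EPTP, each of the four 9-bit fields indexes one level of the EPT/NPT tables, and each step yields an HPA; the last entry plus bits 11 ~ 0 gives the HPA inside the host page.]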
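
Sketch (not from the original slides): the bit ranges in the figure correspond to four 9-bit table indices and a 12-bit page offset. The helper names ept_index and page_offset are made up for illustration; 4 KiB pages and a 4-level table are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Index into the table at 'level' for a given GPA.
 * Level 4 uses bits 47..39, level 3 bits 38..30,
 * level 2 bits 29..21, level 1 bits 20..12. */
unsigned ept_index(uint64_t gpa, int level)
{
    assert(level >= 1 && level <= 4);
    return (unsigned)((gpa >> (12 + 9 * (level - 1))) & 0x1ff);
}

/* Byte offset inside the final 4 KiB host page (bits 11..0). */
uint64_t page_offset(uint64_t gpa)
{
    return gpa & 0xfff;
}
```

A walk starts at the table referenced by EPTP, uses ept_index(gpa, 4) to pick the entry that points to the next table, and repeats down to level 1.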

SLIDE 13

How to optimize MMU Virtualization?

1. No-trap for non-present PTEs
2. Unsync shadow pages
3. PTE prefetch
4. KSM
5. THP

SLIDE 14

1. No-trap for non-present PTEs

• Objective
  – Reduce VM exits
• In soft MMU, if a page fault is caused by a non-present PTE (PFEC.P = 0), it is not intercepted by the host
• It only works on VMX
  – VMCS.PFEC_MASK = 1, VMCS.PFEC_MATCH = 1, VMCS.EXCEPTION_BITMAP.#PF = 1

PTE: Page Table Entry
PFEC: Page Fault Error Code
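
Sketch (not from the original slides): how the VMX page-fault filter decides whether a #PF exits to the host. On VMX, the fault's error code is masked and compared with the match value; if they are equal, the #PF bit of the exception bitmap decides, and if they differ, the opposite of that bit decides. The function name pf_causes_vmexit is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PFEC_P (1u << 0) /* 0: fault caused by a non-present entry */
#define PFEC_W (1u << 1) /* 1: fault caused by a write */

/* Returns true if a guest #PF with error code 'pfec' causes a VM exit. */
bool pf_causes_vmexit(uint32_t pfec, uint32_t mask, uint32_t match,
                      bool pf_bitmap_bit)
{
    bool matched = (pfec & mask) == match;
    return matched ? pf_bitmap_bit : !pf_bitmap_bit;
}
```

With mask = 1, match = 1 and the bitmap bit set, only faults with PFEC.P = 1 (permission faults on present entries) exit to the host; non-present faults go straight to the guest.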

SLIDE 15

2. Unsync shadow pages

• Objective
  – Reduce VM exits
• Background:
  – In soft MMU, in order to keep consistency, we need to write-protect the guest page table
• This mechanism lets the guest write these pages directly

SLIDE 16

2. Unsync shadow pages

• For performance reasons, we allow a guest page table to be writable if and only if the page is a last-level page structure (Level 1)
• Based on TLB rules
  – The guest needs to flush the TLB to ensure that translations use the modified page structures, so we can intercept the TLB flush operations and sync the shadow pages
  – Sometimes the TLB does not need to be flushed (old PTE.P = 0, PTE.A = 0, raising access permission); then the shadow page can be synced through a page fault
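
Sketch (not from the original slides): a toy model of the unsync idea that counts VM exits. The first guest write to a write-protected last-level page table traps once and marks the page unsync; later writes go through without trapping; an intercepted TLB flush copies the guest entries into the shadow copy and write-protects the page again. All names are hypothetical, and the shadow entries simply copy the guest entries (no GFN to PFN translation).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NPTES 4 /* toy size; a real page table has 512 entries */

struct shadow_page {
    uint64_t guest[NPTES];  /* the guest's last-level page table */
    uint64_t shadow[NPTES]; /* the host's shadow copy */
    bool unsync;            /* true: guest writes do not trap */
    unsigned vm_exits;      /* number of exits taken so far */
};

/* The guest stores 'val' into entry 'idx' of its page table. */
void guest_write_pte(struct shadow_page *sp, unsigned idx, uint64_t val)
{
    assert(idx < NPTES);
    if (!sp->unsync) {
        /* Page is write-protected: this write traps once, and the
         * host then lets the page go out of sync. */
        sp->vm_exits++;
        sp->unsync = true;
    }
    sp->guest[idx] = val;
}

/* The guest flushes the TLB (for example by loading CR3); the host
 * intercepts the flush and brings the shadow page back in sync. */
void guest_flush_tlb(struct shadow_page *sp)
{
    sp->vm_exits++;
    if (!sp->unsync)
        return;
    for (unsigned i = 0; i < NPTES; i++)
        sp->shadow[i] = sp->guest[i];
    sp->unsync = false; /* write-protected again */
}
```

In this model, three consecutive PTE writes followed by one flush cost two exits instead of four.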

SLIDE 17

2. Unsync shadow pages

[Figure: the guest writes its page table, which now holds BBB, while the host's shadow page table still holds A'A'A'. On a TLB flush or a page fault the host syncs, and the shadow page table then holds B'B'B'.]

SLIDE 18

3. PTE prefetch

• Objective
  – Reduce VM exits
• When a #PF occurs, we prefetch other invalid shadow PTEs, so that when these PTEs are accessed later, the #PF can be avoided
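
Sketch (not from the original slides): a toy version of prefetching. On a fault at one index, every entry in the aligned window around it is filled if the guest entry is present and the shadow entry is still empty. The window size of 4 and the names are assumptions made for illustration, and shadow entries simply copy guest entries.

```c
#include <assert.h>
#include <stdint.h>

#define NPTES    8  /* toy table size */
#define PREFETCH 4  /* hypothetical prefetch window, must divide NPTES */
#define PTE_P    1ULL

/* Fix the faulting shadow PTE and prefetch its neighbours.
 * Returns the number of shadow PTEs that were filled. */
unsigned fix_and_prefetch(const uint64_t guest[NPTES],
                          uint64_t shadow[NPTES], unsigned fault_idx)
{
    assert(fault_idx < NPTES);
    unsigned start = fault_idx & ~(PREFETCH - 1u); /* aligned window */
    unsigned filled = 0;

    for (unsigned i = start; i < start + PREFETCH; i++) {
        /* Skip entries already mapped, or not mapped by the guest. */
        if ((shadow[i] & PTE_P) || !(guest[i] & PTE_P))
            continue;
        shadow[i] = guest[i];
        filled++;
    }
    return filled;
}
```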

SLIDE 19

3. PTE prefetch

[Figure: the guest page table holds AAA, BBB, and CCC while the corresponding shadow PTEs are all non-present (.P=0). An access causes a page fault; the host then fills all three shadow PTEs (A'A'A', B'B'B', C'C'C') at once.]

SLIDE 20

4. KSM

• Objective
  – Save memory
• Kernel Samepage Merging (originally known as Kernel Shared Memory)
• It's a memory-saving de-duplication feature

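SLIDE 21

4. KSM

Original state:

[Figure: Guest 1 (VMM1) and Guest 2 (VMM2) each map a GFN through an HVA to their own host page. Both pages contain aaa and both mappings are writable. The two pages are candidates for a merge.]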

SLIDE 22

4. KSM

Merged state:

[Figure: Guest 1's original host page has been freed. Both guests' GFN/HVA now map the single host page aaa, and both mappings are read-only. If Guest 1 writes the page, COW is triggered.]

SLIDE 23

4. KSM

COW state:

[Figure: Guest 1 maps its own copy, now containing bbb, writable. Guest 2 still maps aaa, read-only.]
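
Sketch (not from the original slides): the three KSM states as a toy reference-counted model. Two mappings with identical contents are merged onto one read-only frame, and a later write through one mapping copies the page first (COW). The structs and function names are hypothetical; the real feature works on host page tables inside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE 4096

struct frame {               /* one host page */
    unsigned char data[PAGE_SIZE];
    unsigned refs;           /* number of mappings; 0 means free */
};

struct mapping {             /* one guest's view of a page */
    struct frame *frame;
    bool writable;
};

/* Merge 'dup' onto 'keep' if both pages hold identical bytes.
 * Returns true if the merge happened. */
bool ksm_try_merge(struct mapping *keep, struct mapping *dup)
{
    if (keep->frame == dup->frame)
        return false;
    if (memcmp(keep->frame->data, dup->frame->data, PAGE_SIZE) != 0)
        return false;
    dup->frame->refs--;      /* the duplicate frame becomes free */
    dup->frame = keep->frame;
    keep->frame->refs++;
    keep->writable = false;  /* shared pages are read-only */
    dup->writable = false;
    return true;
}

/* Write one byte through a mapping, breaking COW if the frame is
 * shared. 'spare' must be a free frame to copy into. */
void write_byte(struct mapping *m, struct frame *spare,
                size_t off, unsigned char v)
{
    assert(off < PAGE_SIZE);
    if (!m->writable) {
        if (m->frame->refs > 1) {
            assert(spare->refs == 0);
            memcpy(spare->data, m->frame->data, PAGE_SIZE);
            m->frame->refs--;
            spare->refs = 1;
            m->frame = spare;
        }
        m->writable = true;
    }
    m->frame->data[off] = v;
}
```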

SLIDE 24

5. THP

• Objective
  – Reduce memory accesses during guest and EPT/NPT page table walks
  – Improve TLB usage
• Transparent Hugepage
• Both host and guest can use huge pages automatically
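
Sketch (not from the original slides): a rough count of the memory references in one two-dimensional walk on a TLB miss, ignoring paging-structure caches. With n guest levels and m EPT/NPT levels, each of the n guest table pointers and the final GPA needs an m-step nested walk, plus n reads of the guest entries themselves, which gives (n + 1)(m + 1) - 1. The function name is hypothetical.

```c
#include <assert.h>

/* Worst-case memory references for one nested (guest + EPT/NPT)
 * page walk with the given number of levels on each side. */
unsigned nested_walk_refs(unsigned guest_levels, unsigned host_levels)
{
    assert(guest_levels >= 1 && host_levels >= 1);
    return (guest_levels + 1) * (host_levels + 1) - 1;
}
```

With 4 KiB pages on both sides (4 levels each) this gives 24 references; if both guest and host map the address with 2 MiB pages (3 levels each) it drops to 15.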

SLIDE 25

What will I do

• Lockless MMU
  – Feature:
    • Avoid a big lock for the whole MMU
      – Lock the shadow page instead of the whole MMU
    • Walk shadow pages locklessly
      – Use RCU to prevent a shadow page from being freed during the walk
    • Update shadow PTEs locklessly
      – Cmpxchg...
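
Sketch (not from the original slides): what a cmpxchg-based shadow PTE update could look like, using C11 atomics. The update only succeeds if the entry still holds the value the caller read earlier, so a concurrent change is detected without taking a lock. spte_try_set_writable is a hypothetical name, and the RCU protection of the walk is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SPTE_P (1ULL << 0) /* present */
#define SPTE_W (1ULL << 1) /* writable */

/* Try to set the writable bit of a shadow PTE without a lock.
 * 'snapshot' is the value read during the lockless walk.
 * Returns false if the entry changed since then, in which case the
 * caller would fall back to a locked slow path. */
bool spte_try_set_writable(_Atomic uint64_t *sptep, uint64_t snapshot)
{
    assert(sptep != 0);
    uint64_t expected = snapshot;
    return atomic_compare_exchange_strong(sptep, &expected,
                                          snapshot | SPTE_W);
}
```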

SLIDE 26

What will I do

• Lockless MMU
  – Advantage:
    • Allow VCPUs to run concurrently on the MMU path
    • Good preparation for KSM to track dirty pages that are mapped by the shadow page table
    • Good preparation for an LRU algorithm for MMU page eviction

SLIDE 27

Questions?

SLIDE 28

Thanks!