Improve ARM guest performance with 64KB pages
Julien Grall
julien.grall@citrix.com
Xen Developer Summit 2015
Kézaco | Why? | Constraints | Implementation | Improvements | Conclusion
Kézaco (What is it?)
◮ Page is 64KB
◮ Removes one level of page table compared to 4KB
◮ Faster TLB lookup
◮ Introduced for AArch64 in ARMv8
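The "one level fewer" claim can be checked with a little arithmetic. The sketch below assumes a 48-bit virtual address space and 8-byte table entries; neither figure is stated on the slide, and the function name is illustrative.

```python
PTE_SHIFT = 3  # assumption: 8-byte translation table entries (2**3 bytes)


def page_table_levels(va_bits: int, page_size: int) -> int:
    """Number of translation levels needed to resolve `va_bits` of address."""
    page_shift = page_size.bit_length() - 1     # 12 for 4KB, 16 for 64KB
    bits_per_level = page_shift - PTE_SHIFT     # 9 for 4KB, 13 for 64KB
    remaining = va_bits - page_shift            # bits left after the page offset
    return -(-remaining // bits_per_level)      # ceiling division


levels_4k = page_table_levels(48, 4 * 1024)
levels_64k = page_table_levels(48, 64 * 1024)
print(levels_4k, levels_64k)  # 4 3
```

Each table fills exactly one page, so a larger page both shortens the walk and widens every level.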
4KB page granularity
64KB page granularity
Why?
◮ The page granularity is chosen at config time in Linux
◮ Some major distributions will ship Linux only with 64KB pages
Xen and hypercalls
◮ Based on 4KB page granularity
◮ Must be able to run guests with different page granularities
◮ Modifying the interface too much might not be possible
PV drivers
◮ Grants are currently only 4KB
◮ Based on the hypercall page granularity
◮ Must be able to talk with the current backend/frontend
Goal
◮ First implementation
  ◮ Allow 64KB guests to run on current Xen
  ◮ No modification to the hypercalls or the PV protocols
  ◮ Get something upstreamed quickly
◮ Linux with 64KB pages is crashing at the moment
Changes in Xen
◮ Hypervisor: None
◮ Tools: 3 minor patches to use the correct size for the rings
  ◮ Present in Xen 4.6
  ◮ Backport requested for Xen 4.5
Changes in Linux
◮ Linux assumes that Xen uses the same page granularity
◮ Need to introduce XEN_PAGE_* helpers
◮ 1 foreign grant = 1 Linux page
  ◮ Easier implementation
  ◮ 60KB of memory wasted per grant
  ◮ Affects only the backend domain
◮ A Linux page may be split between multiple grants
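To make the relationship between the two granularities concrete, here is a minimal sketch of the constants and conversions that such helpers would need. It is written in Python for illustration only: apart from the XEN_PAGE_ prefix named on the slide, every name below is an assumption, and the real helpers live in the Linux kernel in C.

```python
PAGE_SHIFT = 16                      # assumption: Linux guest built with 64KB pages
PAGE_SIZE = 1 << PAGE_SHIFT

XEN_PAGE_SHIFT = 12                  # Xen interfaces are based on 4KB pages
XEN_PAGE_SIZE = 1 << XEN_PAGE_SHIFT
XEN_PFN_PER_PAGE = PAGE_SIZE // XEN_PAGE_SIZE   # 4KB frames in one Linux page


def linux_pfn_to_xen_pfn(pfn: int) -> int:
    """First 4KB Xen frame covered by a Linux page frame (illustrative name)."""
    return pfn * XEN_PFN_PER_PAGE


def xen_pfn_to_linux_pfn(xen_pfn: int) -> int:
    """Linux page frame that contains a given 4KB Xen frame (illustrative name)."""
    return xen_pfn // XEN_PFN_PER_PAGE


# With "1 foreign grant = 1 Linux page", only 4KB of each 64KB page is used.
wasted_per_grant = PAGE_SIZE - XEN_PAGE_SIZE
print(XEN_PFN_PER_PAGE, wasted_per_grant // 1024)  # 16 60
```

The 60KB figure on the slide falls out directly: a 64KB Linux page backing a single 4KB grant leaves 60KB unused.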
Example of handling request on 4K
Example of handling request on 64K
Changes in Linux - 2
◮ Introduce helpers to deal with the splitting
  ◮ Avoid exposing the page granularity to PV drivers
  ◮ Easier to spot changes which don't handle 64KB/4KB granularity
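A splitting helper of this kind could look roughly like the sketch below, which walks a byte range inside one Linux page and yields one piece per 4KB grant. The function name and return shape are hypothetical and are not taken from the patch series.

```python
XEN_PAGE_SIZE = 4096  # grant size used by the current Xen interface


def split_into_grants(offset: int, length: int):
    """Yield (grant_index, offset_in_grant, chunk_length) for the byte range
    [offset, offset + length) of a single Linux page."""
    while length > 0:
        grant_index = offset // XEN_PAGE_SIZE
        grant_offset = offset % XEN_PAGE_SIZE
        # A chunk never crosses a 4KB boundary.
        chunk = min(XEN_PAGE_SIZE - grant_offset, length)
        yield grant_index, grant_offset, chunk
        offset += chunk
        length -= chunk


chunks = list(split_into_grants(offset=3072, length=6144))
print(chunks)  # [(0, 3072, 1024), (1, 0, 4096), (2, 0, 1024)]
```

A PV driver that only consumes the yielded pieces never needs to know whether the Linux page is 4KB or 64KB, which is the point of hiding the granularity behind helpers.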
Improvement - 1
Support of 64KB grants - 1
Support of 64KB grants - 2
◮ PV drivers can take advantage of it
  ◮ No need to split pages
  ◮ Fewer grants to set up
◮ Need to find agreement on where the grant size is decided:
  ◮ during the protocol negotiation
  ◮ can change for each request
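As a rough illustration of "fewer grants to set up", the sketch below counts the grants needed to cover a buffer at each grant size. It assumes the buffer starts on a grant boundary, and the 64KB grant is the proposal discussed here, not an existing interface.

```python
def grants_needed(length: int, grant_size: int) -> int:
    """Grants required to cover `length` bytes starting on a grant boundary."""
    return -(-length // grant_size)  # ceiling division


request = 64 * 1024                            # one full 64KB Linux page
with_4k = grants_needed(request, 4 * 1024)     # current 4KB grants
with_64k = grants_needed(request, 64 * 1024)   # proposed 64KB grants
print(with_4k, with_64k)  # 16 1
```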
Improvement 2 - Memory Usage
◮ Sharing a Linux page between multiple foreign grants
◮ Need some care with swiotlb
◮ Make Xen drivers fully use the Linux page
  ◮ Event Channel
  ◮ PV Ring
  ◮ ...
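The potential saving from sharing can be estimated with the sketch below, which compares backend memory for a number of mapped foreign grants under both schemes. The figure of 32 grants is an arbitrary example, the function name is illustrative, and the swiotlb constraints mentioned above are ignored.

```python
PAGE_SIZE = 64 * 1024                           # assumed backend Linux page size
XEN_PAGE_SIZE = 4 * 1024                        # size of one grant
GRANTS_PER_PAGE = PAGE_SIZE // XEN_PAGE_SIZE    # 16


def backend_bytes(n_grants: int, grants_per_page: int) -> int:
    """Memory consumed when each Linux page holds `grants_per_page` grants."""
    pages = -(-n_grants // grants_per_page)     # ceiling division
    return pages * PAGE_SIZE


one_per_page = backend_bytes(32, 1)             # first implementation
shared = backend_bytes(32, GRANTS_PER_PAGE)     # proposed sharing
print(one_per_page // 1024, shared // 1024)  # 2048 128
```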
Status
◮ Where are we?
  ◮ First implementation done
  ◮ Only net and block PV drivers supported
  ◮ On the way to version 4
◮ Future
  ◮ Write design doc for grant improvement
  ◮ Fix memory usage with 64KB page granularity
  ◮ Convert the remaining PV drivers and QEMU
The End