io_uring in QEMU: high-performance disk IO for Linux (FOSDEM 2020)


  1. io_uring in QEMU: high-performance disk IO for Linux. FOSDEM 2020. Julia Suvorova, Red Hat Software Engineer

  2. Agenda: what we’ll discuss today
     ▸ io_uring API
     ▸ QEMU structure
     ▸ Features of io_uring and how they helped QEMU
     ▸ Benchmarks
     ▸ What’s left to do

  3. QEMU I/O path: I/O path in VM
     [Diagram; labels: VM (userspace, kernel), virtio-blk driver, vHW, QEMU, Host (userspace, kernel), HW]

  4. QEMU I/O path: existing solutions
     Async I/O:
     ▸ Linux AIO (aio=native)
     ▸ Thread pool (aio=threads)
     Other:
     ▸ NVMe passthrough (vfio)
     ▸ SPDK

  5. QEMU I/O path: I/O path in VM
     [Diagram; labels: VM (userspace, kernel), virtio-blk driver, vHW, QEMU, Host (userspace, kernel), HW]
     Annotation on the diagram: "This part we can improve"

  6. io_uring interface: io_uring, yet another kernel ring buffer
     ▸ New interface for truly asynchronous communication with the kernel: latest versions support network and some other syscalls
     ▸ Part of Linux 5.1

  7. io_uring interface: main features
     ▸ Unlike Linux AIO, separate queues for submission and completion (holding SQEs and CQEs)
     ▸ SQEs and CQEs are shared between userspace and kernel
     ▸ Async flush
     Submission: QEMU -> kernel -> hw
     Completion: QEMU <- kernel <- hw
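
The shared-queue idea on this slide can be illustrated with a toy ring buffer. This is a minimal sketch, not io_uring's actual data layout: `toy_ring`, `toy_ring_push` and `toy_ring_pop` are hypothetical names, and the sketch only assumes the general head/tail scheme (one side advances the tail, the other advances the head, indices are masked by a power-of-two size). A real shared ring would also need memory barriers between the two sides, which are omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u                 /* must be a power of two */
#define RING_MASK (RING_ENTRIES - 1u)

/* Hypothetical single-producer/single-consumer ring. In io_uring terms,
 * userspace is the producer of the submission queue and the kernel is
 * the producer of the completion queue. */
struct toy_ring {
    uint32_t head;                      /* advanced only by the consumer */
    uint32_t tail;                      /* advanced only by the producer */
    uint64_t slots[RING_ENTRIES];
};

/* Producer side: returns false when the ring is full. */
bool toy_ring_push(struct toy_ring *r, uint64_t value)
{
    if (r->tail - r->head == RING_ENTRIES) {
        return false;
    }
    r->slots[r->tail & RING_MASK] = value;
    r->tail++;                          /* a shared ring needs a release store here */
    return true;
}

/* Consumer side: returns false when the ring is empty. */
bool toy_ring_pop(struct toy_ring *r, uint64_t *value)
{
    if (r->head == r->tail) {
        return false;
    }
    *value = r->slots[r->head & RING_MASK];
    r->head++;                          /* a shared ring needs a release store here */
    return true;
}
```

Because each index has exactly one writer, neither side needs a lock, and the unsigned subtraction `tail - head` stays correct even after the counters wrap around.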

  8. io_uring interface: three new system calls
     io_uring_setup(u32 entries, struct io_uring_params *p)
     ▸ Can choose different modes
     io_uring_enter(unsigned int fd, unsigned int to_submit, unsigned int min_complete, unsigned int flags, sigset_t *sig)
     ▸ Submits requests and fetches completions within one syscall (not in Linux AIO!)
     io_uring_register(unsigned int fd, unsigned int opcode, void *arg, unsigned int nr_args)
     ▸ Register fds ahead of time: no need to do fget() and fput() on each submission and completion respectively
     ▸ Register buffers (struct iovec) ahead of time: saves get_user_pages() and put_pages()
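
A minimal sketch of calling two of these system calls directly, assuming a Linux host whose kernel headers provide `<linux/io_uring.h>` and the `__NR_io_uring_*` syscall numbers. The wrapper names `try_io_uring_setup` and `try_io_uring_enter` are hypothetical, and the sketch does not map the rings or queue any request. Note that the raw `io_uring_enter` syscall takes a sixth argument (the size of the signal mask) in addition to the five listed on the slide.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Creates a ring with at least `entries` submission slots.
 * Returns the ring fd, or a negative errno value (for example -ENOSYS
 * on kernels older than 5.1, or -EPERM when io_uring is disabled). */
int try_io_uring_setup(unsigned entries, struct io_uring_params *p)
{
    memset(p, 0, sizeof(*p));           /* all flags off: default mode */
    long fd = syscall(__NR_io_uring_setup, entries, p);
    return fd < 0 ? -errno : (int)fd;
}

/* Submits up to `to_submit` queued requests and, if the
 * IORING_ENTER_GETEVENTS flag is set, waits for `min_complete`
 * completions, all in one syscall. Returns the number of requests
 * submitted, or a negative errno value. */
int try_io_uring_enter(int ring_fd, unsigned to_submit,
                       unsigned min_complete, unsigned flags)
{
    long ret = syscall(__NR_io_uring_enter, ring_fd, to_submit,
                       min_complete, flags, NULL, 0);
    return ret < 0 ? -errno : (int)ret;
}
```

On success the kernel fills in `p->sq_entries`, `p->cq_entries` and the ring offsets, which a caller would then use to `mmap()` the queues. In practice liburing hides these details.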

  9. io_uring interface: how fast is it? Benchmarks on bare metal
     Test with fio 3.14: aio=libaio, operation=randread
     Hardware: NVMe SSD Intel Optane 320G; CPU Intel Xeon Silver 2.20GHz

  10. io_uring inside QEMU: integration into QEMU
      What’s done:
      ▸ Outreachy project idea
      ▸ Implemented by Aarushi Mehta
      ▸ Basic functionality is merged upstream (will be in QEMU 5.0)
      Known issues:
      ▸ Problems with file locking in fd registration
      ▸ IOPOLL is not implemented

  11. io_uring inside QEMU: integration into QEMU
      Reuse the Linux AIO approach:
      ▸ The QEMU event loop is based on AIO context (future improvement: can be switched to io_uring)
      ▸ Add aio context -> use epoll for completion check
      ▸ Now we submit requests with io_uring_enter() and check completions on irq
      liburing usage: easier to use, fewer mistakes

  12. io_uring inside QEMU: integration into QEMU, how to launch
      -drive file=test.img,format=raw,cache=none,aio=io_uring
      Works with both O_DIRECT and cached workloads

  13. io_uring inside QEMU: how fast has it got without extra features?
      Test with fio 3.14: aio=libaio, operation=randread
      Hardware: NVMe SSD Intel Optane 320G; CPU Intel Xeon Silver 2.20GHz

  14. Features and benchmarks: fd registration
      ▸ Register the set of fds on which I/O is performed with io_uring_register()
      ▸ Saves atomic fget() on submission path
      ▸ Saves atomic fput() on completion path
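
A minimal sketch of fd registration through the raw syscall, assuming a Linux host with `<linux/io_uring.h>`. The helper names `make_ring` and `register_file` are hypothetical, and this is not the code QEMU uses. After a successful registration, a request refers to the file by its index in the registered array and sets `IOSQE_FIXED_FILE` in the SQE flags, so the kernel can skip the per-request fget()/fput().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Creates a ring in default mode; returns its fd or a negative errno. */
int make_ring(unsigned entries)
{
    struct io_uring_params p;
    memset(&p, 0, sizeof(p));
    long fd = syscall(__NR_io_uring_setup, entries, &p);
    return fd < 0 ? -errno : (int)fd;
}

/* Registers a single file descriptor with the ring. It becomes index 0
 * of the ring's fixed-file table. Returns 0 or a negative errno. */
int register_file(int ring_fd, int file_fd)
{
    int32_t fds[1] = { file_fd };
    long ret = syscall(__NR_io_uring_register, ring_fd,
                       IORING_REGISTER_FILES, fds, 1u);
    return ret < 0 ? -errno : 0;
}
```

A caller would typically register once, right after opening the disk image, and then submit every request with index 0 and `IOSQE_FIXED_FILE`.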

  15. Features and benchmarks: does this help much? Not really by itself
      Test with fio 3.14: aio=libaio, operation=randread
      Hardware: NVMe SSD Intel Optane 320G; CPU Intel Xeon Silver 2.20GHz

  16. Features and benchmarks: submission polling
      ▸ Runs a kernel thread that waits for submissions; once idle, it needs to be woken up with a syscall
      ▸ io_uring_setup() with flag SQ_POLL
      ▸ Needs fd registration for effective usage
      ▸ Now we submit requests without a syscall and get completions on irq: a path without syscalls
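
A minimal sketch of requesting submission polling at setup time, assuming a Linux host with `<linux/io_uring.h>`. The slide's SQ_POLL corresponds to the `IORING_SETUP_SQPOLL` flag in that header. The helper names `setup_sqpoll_ring` and `sq_needs_wakeup` are hypothetical, and the sketch leaves out mapping the rings. Older kernels require elevated privileges for this mode, so the call may fail with a negative errno.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Creates a ring whose submission queue is polled by a kernel thread.
 * `idle_ms` is how long that thread keeps polling an empty queue
 * before it goes to sleep. Returns the ring fd or a negative errno. */
int setup_sqpoll_ring(unsigned entries, unsigned idle_ms)
{
    struct io_uring_params p;
    memset(&p, 0, sizeof(p));
    p.flags = IORING_SETUP_SQPOLL;
    p.sq_thread_idle = idle_ms;
    long fd = syscall(__NR_io_uring_setup, entries, &p);
    return fd < 0 ? -errno : (int)fd;
}

/* `sq_flags` is the value read from the flags word of the mapped
 * submission ring. When the kernel thread has gone to sleep it sets
 * IORING_SQ_NEED_WAKEUP, and the application must then call
 * io_uring_enter() with IORING_ENTER_SQ_WAKEUP. */
bool sq_needs_wakeup(unsigned sq_flags)
{
    return (sq_flags & IORING_SQ_NEED_WAKEUP) != 0;
}
```

As long as `sq_needs_wakeup()` stays false, new requests only need to be written to the submission ring; no syscall is involved.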

  17. Features and benchmarks: completion polling
      ▸ Poll completions with busy waiting on io_uring_enter()
      ▸ Set up via io_uring_setup()
      ▸ CPU-consuming, but no context switching
      ▸ In combination with SQ_POLL: the fastest way on heavy workloads

  18. Features and benchmarks: performance. Not implemented yet

  19. In someone’s todo: future improvements
      ▸ Merge SQ_POLL and fd registration
      ▸ File buffers registration and IO_POLL
      ▸ Switch to io_uring as default aio (if supported)
      Ideas:
      ▸ Switch main loop to io_uring

  20. Thank you. Questions?
      linkedin.com/company/red-hat
      youtube.com/user/RedHatVideos
      facebook.com/redhatinc
      twitter.com/RedHat