



  1. Fido: Fast Inter-Virtual-Machine Communication for Enterprise Appliances
     Anton Burtsev†, Kiran Srinivasan, Prashanth Radhakrishnan, Lakshmi N. Bairavasundaram, Kaladhar Voruganti, Garth R. Goodson
     NetApp, Inc.; †University of Utah, School of Computing

  2. Enterprise appliances
     Network-attached storage, routers, etc.
     • High performance
     • Scalable and highly available access

  3. Example Appliance
     • Monolithic kernel
     • Kernel components
     Problems:
     • Fault isolation
     • Performance isolation
     • Resource provisioning

  4. Split architecture

  5. Benefits of virtualization
     • High availability
     • Fault isolation
     • Micro-reboots
     • Partial functionality in case of failure
     • Performance isolation
     • Resource allocation
     • Consolidation and load balancing, VM migration
     • Non-disruptive updates
     • Hardware upgrades via VM migration
     • Software updates as micro-reboots
     • Computation-to-data migration

  6. Main Problem: Performance
     Is it possible to match the performance of a monolithic environment?
     • Large amount of data movement between components
     • Mostly cross-core
     • Connection-oriented (established once)
     • Throughput-optimized (asynchronous)
     • Coarse-grained (no one-word messages)
     • Multi-stage data processing
     • Main cost contributors
     • Transitions to the hypervisor
     • Memory map/copy operations
     • Not VM context switches (multi-core)
     • Not IPC marshaling

  7. Main Insight: Relaxed Trust Model
     • Appliance is built by a single organization
     • Components:
     • Pre-tested and qualified
     • Collaborative and non-malicious
     • Share memory read-only across VMs!
     • Fast inter-VM communication
     • Exchange only pointers to data
     • No hypervisor calls (only cross-core notification)
     • No memory map/copy operations
     • Zero-copy across the entire appliance
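
The idea of sharing memory read-only and exchanging only pointers can be pictured with a small single-process model. This is a sketch under stated assumptions, not Fido's interface: `producer_memory`, `consumer_view`, `produce`, and `consume` are hypothetical names, a Python `bytearray` stands in for memory owned by one VM, and a read-only `memoryview` stands in for the mapping another VM would see.

```python
from typing import Tuple

# Hypothetical toy model: memory owned by the producing VM.
producer_memory = bytearray(4096)

# The consumer sees the same bytes through a read-only view (no copy is made).
consumer_view = memoryview(producer_memory).toreadonly()


def produce(offset: int, payload: bytes) -> Tuple[int, int]:
    """Write payload into producer memory and return an (offset, length) 'pointer'."""
    producer_memory[offset:offset + len(payload)] = payload
    return offset, len(payload)


def consume(pointer: Tuple[int, int]) -> memoryview:
    """Resolve a pointer against the read-only view without copying the data."""
    offset, length = pointer
    return consumer_view[offset:offset + length]


# Only the small (offset, length) pair crosses the boundary, not the payload.
ptr = produce(128, b"block-0042")
data = consume(ptr)
```

In this model the consumer can read the payload in place, while an attempt to write through `data` raises `TypeError`, which loosely mirrors the read-only sharing on the slide.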

  8. Contributions
     • Fast inter-VM communication mechanism
     • Abstraction of a single address space for traditional systems
     • Case study
     • Realistic microkernelized network-attached storage

  9. Design

  10. Design Goals
      • Performance
      • High throughput
      • Practicality
      • Minimal guest system and hypervisor dependencies
      • No intrusive guest kernel changes
      • Generality
      • Support for different communication mechanisms in the guest system

  11. Transitive Zero Copy
      • Goal
      • Zero-copy across the entire appliance
      • No changes to the guest kernel
      • Observation
      • Multi-stage data processing

  12. Pseudo Global Virtual Address Space
      [Figure: virtual address range from 0 to 2^64]
      Insight:
      • CPUs support a 64-bit address space
      • Individual VMs do not need all of it

  13. Pseudo Global Virtual Address Space
      [Figure: virtual address range from 0 to 2^64]

  14. Pseudo Global Virtual Address Space
      [Figure: virtual address range from 0 to 2^64]
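
One way to picture a pseudo global virtual address space is to give each VM its own non-overlapping slice of the 64-bit range, so that an address identifies both the owning VM and a location inside it. The slides do not state the actual layout; the fixed 40-bit slice size and the names `to_global` and `from_global` below are assumptions made only for illustration.

```python
ADDRESS_BITS = 64     # from the slide: CPUs support a 64-bit address space
VM_REGION_BITS = 40   # assumption: each VM gets a 2**40-byte slice

REGION_SIZE = 1 << VM_REGION_BITS
MAX_VMS = 1 << (ADDRESS_BITS - VM_REGION_BITS)


def to_global(vm_id: int, local_addr: int) -> int:
    """Map a VM-local address into that VM's slice of the global range."""
    if not 0 <= vm_id < MAX_VMS:
        raise ValueError("vm_id out of range")
    if not 0 <= local_addr < REGION_SIZE:
        raise ValueError("local address does not fit in the VM's slice")
    return vm_id * REGION_SIZE + local_addr


def from_global(global_addr: int) -> tuple:
    """Recover (vm_id, local_addr) from a global address."""
    return divmod(global_addr, REGION_SIZE)


# The same local address in two VMs maps to two distinct global addresses.
a = to_global(1, 0x1000)
b = to_global(2, 0x1000)
```

Because the slices never overlap, a pointer handed from one VM to the next stays unambiguous along a multi-stage pipeline, which is the property that transitive zero-copy relies on in this model.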

  15. Transitive Zero Copy

  16. Fido: High-level View

  17. Fido: High-level View
      • “c”: connection management
      • “m”: memory mapping
      • “s”: cross-VM signaling

  18. IPC Organization
      • Shared memory ring
      • Pointers to data

  19. IPC Organization
      • Shared memory ring
      • Pointers to data
      • For complex data structures
      • Scatter-gather array

  20. IPC Organization
      • Shared memory ring
      • Pointers to data
      • For complex data structures
      • Scatter-gather array
      • Translate pointers

  21. IPC Organization
      • Shared memory ring
      • Pointers to data
      • For complex data structures
      • Scatter-gather array
      • Translate pointers
      • Signaling:
      • Cross-core interrupts (event channels)
      • Batching and in-ring polling
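
The ring organization above can be modeled in a few lines. This is a single-threaded sketch, not Fido's implementation: `PointerRing`, `Segment`, and the `consumer_polling` flag are hypothetical names, a counter stands in for cross-core interrupts, and the rule "notify only when the consumer is not already polling the ring" is an assumed reading of "batching and in-ring polling".

```python
from collections import namedtuple

# One scatter-gather element: a pointer to data plus its length.
Segment = namedtuple("Segment", ["addr", "length"])


class PointerRing:
    """Fixed-size ring whose slots hold scatter-gather arrays of pointers."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0                   # next slot the consumer reads
        self.tail = 0                   # next slot the producer writes
        self.consumer_polling = False   # flag kept alongside the ring
        self.interrupts_sent = 0        # stands in for event-channel signals

    def push(self, segments):
        """Enqueue one scatter-gather array; return False if the ring is full."""
        if self.tail - self.head == len(self.slots):
            return False
        self.slots[self.tail % len(self.slots)] = tuple(segments)
        self.tail += 1
        if not self.consumer_polling:
            # Wake the consumer once; it then polls the ring, so later
            # pushes in the same burst need no further interrupt.
            self.interrupts_sent += 1
            self.consumer_polling = True
        return True

    def drain(self):
        """Consume every pending entry as one batch, then stop polling."""
        batch = []
        while self.head != self.tail:
            batch.append(self.slots[self.head % len(self.slots)])
            self.head += 1
        self.consumer_polling = False
        return batch


ring = PointerRing(size=4)
ring.push([Segment(0x1000, 512), Segment(0x8000, 1536)])  # one request, two pieces
ring.push([Segment(0x2000, 4096)])
ring.push([Segment(0x3000, 4096)])
batch = ring.drain()
```

In this run three requests are delivered with a single simulated interrupt, and only small pointer records pass through the ring while the data itself stays where it was written.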

  22. Fast device-level communication
      • MMNet
      • Link-level
      • Standard network device interface
      • Supports full transitive zero-copy
      • MMBlk
      • Block-level
      • Standard block device interface
      • Zero-copy on write
      • Incurs one copy on read

  23. Evaluation

  24. MMNet Evaluation
      [Figure: configurations labeled Loop, NetFront, XenLoop, MMNet]
      • AMD Opteron with two 2.1 GHz 4-core CPUs (8 cores total)
      • 16 GB RAM
      • NVidia 1 Gbps NICs
      • 64-bit Xen (3.2), 64-bit Linux (2.6.18.8)
      • Netperf benchmark (2.4.4)

  25. MMNet: TCP Throughput
      [Figure: throughput (Mbps) vs. message size (KB; ticks from 0.5 to 256) for Monolithic, Netfront, XenLoop, and MMNet]

  26. MMBlk Evaluation
      [Figure: configurations labeled Monolithic, XenBlk, MMBlk]
      • Same hardware
      • AMD Opteron with two 2.1 GHz 4-core CPUs (8 cores total)
      • 16 GB RAM
      • NVidia 1 Gbps NICs
      • VMs are configured with 4 GB and 1 GB RAM
      • 3 GB in-memory file system (TMPFS)
      • IOZone benchmark

  27. MMBlk Sequential Writes
      [Figure: throughput (MB/s) vs. record size (KB; ticks from 4 to 4K) for Monolithic, XenBlk, and MMBlk]

  28. Case Study

  29. Network-attached Storage

  30. Network-attached Storage
      • RAM
      • VMs have 1 GB each, except the FS VM (4 GB)
      • Monolithic system has 7 GB RAM
      • Disks
      • RAID 5 over three 64 MB/s disks
      • Benchmark
      • IOZone reads/writes an 8 GB file over NFS (async)

  31. Sequential Writes
      [Figure: throughput (MB/s) vs. record size (KB; ticks from 4 to 4K) for Monolithic, Native-Xen, and MM-Xen]

  32. Sequential Reads
      [Figure: throughput (MB/s) vs. record size (KB; ticks from 4 to 4K) for Monolithic, Native-Xen, and MM-Xen]

  33. TPC-C (On-Line Transactional Processing)
      [Figure: transactions/minute (tpmC) for Monolithic, MMXen, and Native-Xen]

  34. Conclusions
      • We match monolithic performance
      • “Microkernelization” of traditional systems is possible!
      • Fast inter-VM communication
      • The search for VM communication mechanisms is not over
      • Important aspects of design
      • Trust model
      • VM as a library (for example, FSVA)
      • End-to-end zero copy
      • Pseudo Global Virtual Address Space
      • There are still problems to solve
      • Full end-to-end zero copy
      • Cross-VM memory management
      • Full utilization of pipelined parallelism

  35. Thank you. aburtsev@flux.utah.edu

  36. Backup Slides

  37. Related Work
      • Traditional microkernels [L4, EROS, Coyotos]
      • Synchronous (effectively thread migration)
      • Optimized for single-CPU, fast context switch, small messages (often in registers), efficient marshaling (IDL)
      • Buffer management [Fbufs, IO-Lite, Beltway Buffers]
      • Shared buffer is a unit of protection
      • Fast-forward: fast cache-to-cache data transfer
      • VMs [Xen split drivers, XWay, XenSocket, XenLoop]
      • Page flipping, later buffer sharing
      • IVC, VMCI
      • Language-based protection [Singularity]
      • Shared heap, zero-copy (only pointer transfer)
      • Hardware acceleration [Solarflare]
      • Multi-core OSes [Barrelfish, Corey, FOS]
