  1. Got Loss? Get zOVN!
Daniel Crisan, Robert Birke, Gilles Cressier, Cyriel Minkenberg, and Mitch Gusat
IBM Research – Zurich Research Laboratory
ACM SIGCOMM 2013, 12-16 August, Hong Kong, China

  2. Application Performance in Virtualized Datacenter Networks
[Diagram: N virtualized servers, each hosting VMs (VM 1 … VM K) that attach through vNICs to a virtual switch and a physical NIC; the servers are interconnected by the physical datacenter network (switches and routers) over short-and-fat links, while end-users access the datacenter services across the global Internet over long-and-fat links.]

  3. Physical Network: Lossless Links
• IBM has built flow-controlled links since the 80s
• The High Performance Computing community operates large-scale lossless distributed systems
• Flow control improves performance
• The HPC and datacenter communities remain disconnected
• Why do we disregard Ethernet flow control?
  • PAUSE is widely available, yet largely ignored
• Converged Enhanced Ethernet applies the HPC and storage lessons
  • Priority Flow Control (standardized 2011)
  • Constantly improved for 1T

  4. Virtual Networks are Different
[Table comparing physical and virtual networks on: packet forwarding, deterministic bandwidth and delay, link-level flow control, bandwidth allocation, and latency (µs in physical networks vs. ms in virtual networks).]
Virtual networks are still in an embryonic stage.

  5. Contributions
• Loss identification and characterization in virtual networks
• Dirty-slate approach for latency-sensitive applications
  • Exploit an L2 technique to the benefit of TCP and the application
• Introduce the zero-loss Overlay Virtual Network (zOVN)
  • Flow-controlled virtual switch
• Evaluation with Partition/Aggregate
  • Prototype implementation
  • Cross-layer simulation
• Flow control improves application performance

  6. Outline
• Introduction
• Losses in Virtual Networks
• zOVN Architecture
• Evaluation
• Conclusions

  7. Losses in Virtual Networks
[Diagram: a physical machine hosting VM 1 and VM 2 as traffic sources and VM 3 as sink; each VM's vNIC Tx/Rx connects to a vSwitch port (Port A Tx, Port B Tx, Port C Rx).]
• Packets traverse a series of queues
• Each queue is a producer/consumer problem
• The producer/consumer coupling is not implemented correctly on every queue, as the sketch below illustrates
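To make the failure mode concrete, here is a minimal, hypothetical sketch (not the actual vSwitch or vNIC code) of a bounded packet queue whose enqueue silently discards packets once the consumer falls behind; this drop-on-full behavior is what the measurements on the next slides expose.

```c
/* Hypothetical lossy bounded queue: enqueue silently drops when full,
 * modelling an uncoordinated producer/consumer pair along the path. */
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 256

struct pkt_queue {
    void  *slots[QUEUE_CAP];
    size_t head, tail, count;
    size_t drops;               /* packets lost because the consumer lagged */
};

/* Producer side: instead of back-pressuring the sender, a full queue
 * discards the packet -- this is exactly where loss happens. */
static bool enqueue_lossy(struct pkt_queue *q, void *pkt)
{
    if (q->count == QUEUE_CAP) {
        q->drops++;
        return false;           /* packet silently gone */
    }
    q->slots[q->tail] = pkt;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    return true;
}

/* Consumer side: drains whatever happens to be there. */
static void *dequeue(struct pkt_queue *q)
{
    if (q->count == 0)
        return NULL;
    void *pkt = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return pkt;
}
```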

  8. Losses in Virtual Networks (2)
[Diagram: same setup as before, with measurement points (1)–(3) along each source VM's transmit path and (4)–(6) along the receive path toward the sink VM.]
• Inject UDP packets at (1)
• Count how many still arrive at (6)
• Loss locations:
  • vSwitch – between (3) and (4)
  • Receive stack – between (5) and (6)
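The measurement itself needs nothing more elaborate than a UDP blaster and a counter. The sketch below is a stand-in for whatever tool the authors used (their tooling is not shown here): the receiver counts datagrams arriving at point (6) while the sender injects a known number at point (1), and the gap between the two is the loss along the virtualized path.

```c
/* Hypothetical UDP loss probe (not the paper's tool): inject a known
 * number of datagrams from a source VM and count arrivals in the sink VM.
 * Usage:
 *   udploss recv <port>                 (start in the sink VM)
 *   udploss send <dest_ip> <port> <n>   (run in a source VM)
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in a = { .sin_family = AF_INET };
    char buf[1024] = { 0 };

    if (argc >= 3 && strcmp(argv[1], "recv") == 0) {
        a.sin_addr.s_addr = htonl(INADDR_ANY);
        a.sin_port = htons(atoi(argv[2]));
        bind(s, (struct sockaddr *)&a, sizeof(a));
        long received = 0;
        for (;;)                          /* count arrivals at point (6) */
            if (recv(s, buf, sizeof(buf), 0) > 0 && ++received % 10000 == 0)
                printf("received %ld datagrams\n", received);
    } else if (argc >= 5 && strcmp(argv[1], "send") == 0) {
        inet_pton(AF_INET, argv[2], &a.sin_addr);
        a.sin_port = htons(atoi(argv[3]));
        long n = atol(argv[4]);
        for (long i = 0; i < n; i++)      /* inject at point (1) */
            sendto(s, buf, sizeof(buf), 0, (struct sockaddr *)&a, sizeof(a));
        printf("sent %ld datagrams\n", n);
    } else {
        fprintf(stderr, "usage: udploss recv <port> | send <ip> <port> <n>\n");
    }
    close(s);
    return 0;
}
```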

  9. Losses in Virtual Networks (3)
[Bar chart: injected traffic [MBps] (0–200) per configuration C1–C7, split into the received share, the share lost in the vSwitch, and the share lost in the receive stack.]

Configuration   Hypervisor   vNIC     vSwitch
C1              Qemu/KVM     Virtio   Linux Bridge
C2              Qemu/KVM     Virtio   Open vSwitch
C3              Qemu/KVM     Virtio   VALE
C4              H2           N2       S4
C5              H2           E1000    S4
C6              Qemu/KVM     E1000    Linux Bridge
C7              Qemu/KVM     E1000    Open vSwitch

  10. Outline
• Introduction
• Losses in Virtual Networks
• zOVN Architecture
• Evaluation
• Conclusions

  11. TX Path
[Diagram of the transmit path: the application writes into the socket Tx buffer in the guest kernel; skbs pass through the Qdisc to the vNIC Tx (start_xmit), whose return value can start/stop the queue; in the hypervisor, the zOVN bridge forwards the frame to the overlay encapsulation and the NIC Tx; the physical link uses PAUSE, and back-pressure propagates upward via return values, queue stop/start, and wake-ups.]
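The key contract on this path is the start/stop-queue handshake between the guest stack and the vNIC. The user-space sketch below illustrates that contract only; the names (vnic_start_xmit, tx_ring) are invented and the real mechanism lives in the guest driver and hypervisor backend. When the ring is full, the frame is refused and the queue is stopped, and the completion path restarts it, so frames wait instead of being dropped.

```c
/* Sketch of the start/stop-queue contract on the TX path.
 * Names (vnic_start_xmit, tx_ring, ...) are illustrative only. */
#include <stdbool.h>
#include <stddef.h>

enum tx_status { TX_OK, TX_BUSY };

struct tx_ring {
    void  *frames[64];
    size_t used;
    bool   queue_stopped;          /* mirrors a stop-queue / wake-queue flag */
};

/* Called for every frame the guest stack wants to send. */
static enum tx_status vnic_start_xmit(struct tx_ring *r, void *frame)
{
    if (r->used == 64) {
        r->queue_stopped = true;   /* tell the stack to hold further frames */
        return TX_BUSY;            /* frame is requeued by the caller, not lost */
    }
    r->frames[r->used++] = frame;
    if (r->used == 64)
        r->queue_stopped = true;   /* ring just became full: stop early */
    return TX_OK;
}

/* Called when the vSwitch has drained n frames from the ring. */
static void vnic_tx_completed(struct tx_ring *r, size_t n)
{
    r->used -= (n <= r->used) ? n : r->used;
    if (r->queue_stopped && r->used < 64) {
        r->queue_stopped = false;  /* wake the queue: the stack may resume
                                    * calling vnic_start_xmit()            */
    }
}
```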

  12. RX Path: Fix Stack Loss
[Diagram of the receive path: frames arrive over the physical link (with PAUSE) at the NIC Rx, are decapsulated from the overlay, and forwarded by the zOVN bridge to the vNIC Rx; in the guest, the NET RX softirq (netif_receive) delivers skbs to the socket Rx buffer, from which the application reads; pause/resume signals and return values propagate backwards along the path.]
• A setsockopt lets each socket select lossy or lossless (zOVN) delivery
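The slide mentions a per-socket setsockopt switch between lossy and lossless delivery. The snippet below is only a guess at how an application might use such a switch; the level SOL_SOCKET and the option name SO_ZOVN_LOSSLESS are invented for illustration, since the actual interface is not spelled out on this slide.

```c
/* Hypothetical use of the per-socket lossy/lossless switch.
 * SO_ZOVN_LOSSLESS is an invented name, not a real kernel option. */
#include <stdio.h>
#include <sys/socket.h>

#ifndef SO_ZOVN_LOSSLESS
#define SO_ZOVN_LOSSLESS 0x8000   /* placeholder value for this sketch */
#endif

int make_socket_lossless(int fd)
{
    int on = 1;
    /* Ask the modified guest stack to pause/resume delivery instead of
     * dropping when this socket's receive buffer is full. */
    if (setsockopt(fd, SOL_SOCKET, SO_ZOVN_LOSSLESS, &on, sizeof(on)) < 0) {
        perror("setsockopt(SO_ZOVN_LOSSLESS)");   /* kernel without the patch */
        return -1;
    }
    return 0;
}
```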

  13. Lossless Virtual Switch
Senders:
• Produce packets
• Start forwarder
• Sleep
Receivers:
• Consume packets
• Start forwarder
• Sleep
Forwarder:
• Move packets from Tx to Rx ports
• Pause Tx ports if an Rx port is full
• Wake up Tx ports when something is consumed
(see the sketch after this slide)
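A minimal user-space sketch of this forwarder loop, under the assumption of one mutex-protected pair of bounded rings per port (the structure and names are mine, not the zOVN code): the forwarder blocks rather than drops when a destination Rx ring is full, and the consumer wakes it when a slot frees up.

```c
/* Illustrative lossless forwarder: packets move from Tx to Rx rings;
 * a full Rx ring pauses forwarding instead of causing a drop.
 * Simplified to a single port pair; all names are invented. */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define QLEN 128

struct ring {
    void  *pkt[QLEN];
    size_t head, tail, count;
};

struct port {
    struct ring     tx;            /* filled by the sending VM    */
    struct ring     rx;            /* drained by the receiving VM */
    pthread_mutex_t lock;
    pthread_cond_t  rx_not_full;   /* signalled by the consumer   */
    pthread_cond_t  work;          /* "something changed" signal  */
};

static bool ring_full(const struct ring *r)  { return r->count == QLEN; }
static bool ring_empty(const struct ring *r) { return r->count == 0; }

static void ring_push(struct ring *r, void *p)
{
    r->pkt[r->tail] = p;
    r->tail = (r->tail + 1) % QLEN;
    r->count++;
}

static void *ring_pop(struct ring *r)
{
    void *p = r->pkt[r->head];
    r->head = (r->head + 1) % QLEN;
    r->count--;
    return p;
}

/* Sender side: flow-controlled (blocks) when its Tx ring is full. */
static void produce_one(struct port *p, void *pkt)
{
    pthread_mutex_lock(&p->lock);
    while (ring_full(&p->tx))
        pthread_cond_wait(&p->work, &p->lock);
    ring_push(&p->tx, pkt);
    pthread_cond_broadcast(&p->work);        /* wake the forwarder */
    pthread_mutex_unlock(&p->lock);
}

/* Forwarder: moves packets, pausing whenever the Rx ring is full. */
static void *forwarder(void *arg)
{
    struct port *p = arg;
    pthread_mutex_lock(&p->lock);
    for (;;) {
        while (ring_empty(&p->tx))           /* nothing to forward: sleep  */
            pthread_cond_wait(&p->work, &p->lock);
        while (ring_full(&p->rx))            /* Rx full: pause, don't drop */
            pthread_cond_wait(&p->rx_not_full, &p->lock);
        ring_push(&p->rx, ring_pop(&p->tx));
        pthread_cond_broadcast(&p->work);    /* wake sender and receiver   */
    }
    return NULL;                             /* not reached */
}

/* Receiver side: consuming a packet frees a slot and wakes the forwarder. */
static void *consume_one(struct port *p)
{
    pthread_mutex_lock(&p->lock);
    while (ring_empty(&p->rx))
        pthread_cond_wait(&p->work, &p->lock);
    void *pkt = ring_pop(&p->rx);
    pthread_cond_signal(&p->rx_not_full);
    pthread_mutex_unlock(&p->lock);
    return pkt;
}
```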

  14. Fully Lossless Path
[Diagram: same measurement setup as before, with points (1)–(3) on the send path and (4)–(6) on the receive path.]
• Both loss locations are now fixed:
  • vSwitch – between (3) and (4)
  • Receive stack – between (5) and (6)

  15. Outline
• Introduction
• Losses in Virtual Networks
• zOVN Architecture
• Evaluation
• Conclusions

  16. Partition/Aggregate Workload
[Diagram: a master receives a query (1), partitions it across four workers (2), collects their answers (3), and returns the aggregated result (4).]
• Problem: TCP incast
  • During the aggregate phase, buffers might overflow
  • For short flows TCP is ineffective: the ACK clock stalls, so the flows must rely on timeouts
• Partition/Aggregate traffic is datacenter-internal, hence open to optimizations
(a small sketch of the pattern follows below)
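For readers who have not seen the pattern, the sketch below shows partition/aggregate in its simplest in-process form, with threads standing in for worker servers and a made-up partial-result computation; in the real workload every worker answer is a short TCP flow converging on the master, which is exactly where incast losses hurt.

```c
/* Minimal partition/aggregate sketch: a master fans a query out to
 * workers and aggregates their partial answers. Threads stand in for
 * worker servers; all names and the computation are illustrative. */
#include <pthread.h>
#include <stdio.h>

#define WORKERS 4

struct task { int worker_id; int query; long partial; };

static void *worker(void *arg)
{
    struct task *t = arg;
    /* (3) each worker computes a partial answer for its shard */
    t->partial = (long)t->query * (t->worker_id + 1);
    return NULL;
}

int main(void)
{
    pthread_t th[WORKERS];
    struct task tasks[WORKERS];
    int query = 42;                       /* (1) query arrives at the master */
    long total = 0;

    for (int i = 0; i < WORKERS; i++) {   /* (2) partition across workers */
        tasks[i] = (struct task){ .worker_id = i, .query = query };
        pthread_create(&th[i], NULL, worker, &tasks[i]);
    }
    for (int i = 0; i < WORKERS; i++) {   /* aggregate: wait for all answers */
        pthread_join(th[i], NULL);
        total += tasks[i].partial;
    }
    printf("aggregated result: %ld\n", total);   /* (4) reply to the client */
    return 0;
}
```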

  17. Testbed Setup
[Diagram: four IBM x3550 M4 servers, each running 16 VMs behind a vSwitch, connected by 10G links to an IBM G8264 switch (data network) and by 1G links to an HP 1810-8G switch (control network).]
• 4 rack servers, each with 16 physical cores + HyperThreading
• Intel 10G adapters (ixgbe drivers)
• 16 VMs per server:
  • 8 VMs for Partition/Aggregate traffic (as in "DCTCP: Efficient Packet Transport for the Commoditized Data Center", SIGCOMM 2010)
  • 8 VMs produce background flows

  18. Testbed Results (CUBIC)
[Plot: mean completion time [ms] (1–1000, log scale) versus response size [packets] (1–10000) for the four configurations below.]

Config   Virtual Network Flow Control   Physical Network Flow Control
LL       No                             No
LZ       No                             Yes
ZL       Yes                            No
ZZ       Yes                            Yes

• Virtual-only flow control is better than physical-only: the vSwitch is the primary congestion point, and physical switch congestion is negligible
• No improvement for short/long flows: long transfers can remain on lossy priorities

  19. Simulation Setup
• Larger topology: 256 servers
• 4 VMs per server:
  • 3 VMs produce PA traffic
  • 1 VM produces background flows
• Assumption: infinite CPU

  20. Simulation Results (64 packets)
[Bar chart: mean completion time [ms] (0–45) for NewReno, Vegas, and CUBIC under the four flow-control configurations LL, LZ, ZL, and ZZ defined above.]
• Confirms the findings from the prototype experiments
• (LZ) Physical-only flow control merely shifts the drop point into the virtual network
• (ZZ) Both flow controls are required for better performance

  21. Faster CPUs or Faster Networks?
• Loss ratio is influenced by the CPU/network speed ratio
TX:
• A slow CPU coupled with a fast network is desirable
• e.g. Xeon + 1G network drops more than Core2 + 1G network
RX:
• A fast CPU coupled with a slow network is desirable
• e.g. Xeon + 10G network drops more than Xeon + 1G network
• Conflicting requirements: the problem cannot be solved by changing the hardware
• The only solution: add flow control!

  22. Conclusions
• Loss identification and characterization in OVNs
• First flow-controlled vSwitch for future Overlay Virtual Networks
• Dirty-slate approach for latency-sensitive applications
  • Un-tuned TCP
  • Commodity 1-10G Ethernet fabric
  • Result replication is trivial
  • Orthogonal to other proposals
• Lossless links: order-of-magnitude completion-time reduction in Partition/Aggregate

  23. Backup

  24. Encapsulation in Overlay Virtual Networks
[Diagram: source and destination servers, each with VMs attached to a vSwitch that keeps an address cache, connected over the physical network; a fabric controller answers address lookups; the inner frame (Payload | TCP | IP | Eth) is wrapped in outer headers (Encap | UDP | IP | Eth).]
Workflow:
1. The source VM sends a packet to its attached vSwitch.
2. The vSwitch queries the controller to find the address of the destination.
3. The controller answers; the information is cached by the vSwitch.
4. The packet is sent over the physical network, encapsulated with the new headers.
5. The packet is decapsulated at the destination vSwitch.
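A rough sketch of steps 2-4 of this workflow (the header layout, field choices, and lookup helpers are all invented for illustration; this is not the actual zOVN, VXLAN, or NVGRE format): the vSwitch resolves the destination through its local cache, falls back to the controller on a miss, and prepends the outer headers before handing the frame to the physical NIC.

```c
/* Illustrative encapsulation step for an overlay vSwitch.
 * Header layout, field values, and lookup helpers are invented. */
#include <stdint.h>
#include <string.h>

struct outer_hdr {            /* Eth | IP | UDP | Encap, flattened for brevity */
    uint8_t  dst_mac[6], src_mac[6];
    uint16_t ethertype;       /* 0x0800 = IPv4 (byte order ignored in sketch) */
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint32_t tenant_id;       /* identifies the overlay / virtual network */
} __attribute__((packed));

/* Step 2: local cache lookup (stubbed to always miss in this sketch). */
static int cache_lookup(uint32_t dst_vm, struct outer_hdr *out)
{
    (void)dst_vm; (void)out;
    return -1;
}

/* Step 3: controller query; the answer would then be cached. */
static int controller_query(uint32_t dst_vm, struct outer_hdr *out)
{
    memset(out, 0, sizeof(*out));
    out->ethertype = 0x0800;
    out->tenant_id = dst_vm;  /* placeholder mapping */
    return 0;
}

/* Step 4: wrap the inner VM frame with the outer headers. Returns the
 * encapsulated length, or -1 if the destination cannot be resolved. */
static int encapsulate(uint32_t dst_vm, const uint8_t *inner, int inner_len,
                       uint8_t *out, int out_cap)
{
    struct outer_hdr hdr;

    if (cache_lookup(dst_vm, &hdr) != 0 && controller_query(dst_vm, &hdr) != 0)
        return -1;                               /* unknown destination VM */
    if (out_cap < (int)sizeof(hdr) + inner_len)
        return -1;                               /* not enough room */

    memcpy(out, &hdr, sizeof(hdr));              /* outer Eth|IP|UDP|Encap */
    memcpy(out + sizeof(hdr), inner, inner_len); /* original VM frame */
    return (int)sizeof(hdr) + inner_len;
}
```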
