Got Loss? Get zOVN!
Daniel Crisan, Robert Birke, Gilles Cressier, Cyriel Minkenberg, and Mitch Gusat
ACM SIGCOMM 2013, 12-16 August, Hong Kong, China
Research – Zurich Research Laboratory
Application Performance in Virtualized Datacenter Networks
[Figure: N virtualized servers, each hosting VMs attached through vNICs to a virtual switch; the virtual switches connect through physical NICs to the physical datacenter network (switches and routers over short-and-fat links), and routers carry traffic from end-users accessing the datacenter services over the global Internet (long-and-fat links).]
Physical Network: Lossless Links
• IBM has built flow-controlled links since the '80s
• The High Performance Computing community runs large-scale lossless distributed systems
• Flow control improves performance
• Yet the HPC and datacenter communities remain disconnected
• Why do we disregard Ethernet flow control?
  • PAUSE is widely available, yet largely ignored
• Converged Enhanced Ethernet applies the HPC and storage lessons
  • Priority Flow Control (standardized 2011)
  • Constantly improved for 1T
Virtual Networks are Different
• Physical networks: packet forwarding with deterministic bandwidth and delay; link-level flow control; µs latency
• Virtual networks: bandwidth allocation; ms latency
• Virtual networks are still in an embryonic stage
Contributions
• Loss identification and characterization in virtual networks
• Dirty-slate approach for latency-sensitive applications
  • Exploit an L2 technique to the benefit of TCP and applications
• Introduce the zero-loss Overlay Virtual Network (zOVN)
  • Flow-controlled virtual switch
• Evaluation with Partition/Aggregate
  • Prototype implementation
  • Cross-layer simulation
• Flow control improves application performance
Outline
• Introduction
• Losses in Virtual Networks
• zOVN Architecture
• Evaluation
• Conclusions
Losses in Virtual Networks
[Figure: a physical machine in which VM 1 and VM 2 transmit through their vNIC Tx queues into vSwitch ports A and B; the vSwitch forwards through port C Rx into VM 3's vNIC Rx and on to the sink.]
• Packets traverse a series of queues
• Each queue is a producer/consumer problem
• This producer/consumer coupling is not implemented correctly at every queue
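The failure mode at a mis-implemented queue can be illustrated with a minimal sketch (hypothetical names, not the actual vSwitch code): a bounded drop-tail queue silently discards packets whenever the producer outruns the consumer, because no back-pressure reaches the producer.

```python
from collections import deque

class DropTailQueue:
    """Bounded queue that silently drops on overflow (lossy behavior)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.q = deque()
        self.dropped = 0

    def enqueue(self, pkt):
        if len(self.q) >= self.capacity:
            self.dropped += 1   # packet lost: no back-pressure to the producer
            return False
        self.q.append(pkt)
        return True

    def dequeue(self):
        return self.q.popleft() if self.q else None

# Producer bursts 10 packets into a 4-slot queue before the consumer runs.
q = DropTailQueue(capacity=4)
for i in range(10):
    q.enqueue(i)
print(q.dropped)  # 6 -- lost at this single queue
```

Every queue along the path (vNIC, vSwitch port, receive stack) behaves like this unless producer and consumer are explicitly coupled.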
Losses in Virtual Networks (2)
[Figure: the same machine with numbered measurement points (1)-(3) along the VM transmit paths and (4)-(6) along the receive path into VM 3.]
• Numbers mark measurement points
• Inject UDP packets at (1)
• Count how many still arrive at (6)
• Loss locations:
  • vSwitch: between (3) and (4)
  • Receive stack: between (5) and (6)
Losses in Virtual Networks (3)
[Figure: injected traffic of up to 200 MBps for configurations C1-C7, split into received traffic, vSwitch loss, and stack loss.]

Configuration  Hypervisor  vNIC    vSwitch
C1             Qemu/KVM    Virtio  Linux Bridge
C2             Qemu/KVM    Virtio  Open vSwitch
C3             Qemu/KVM    Virtio  VALE
C4             H2          N2      S4
C5             H2          E1000   S4
C6             Qemu/KVM    E1000   Linux Bridge
C7             Qemu/KVM    E1000   Open vSwitch
Outline
• Introduction
• Losses in Virtual Networks
• zOVN Architecture
• Evaluation
• Conclusions
TX Path
[Figure: the transmit path. In the VM guest kernel, the application's write enters the socket Tx skb and the Qdisc (enqueue, return value); the vNIC Tx start_xmit hands frames to vSwitch Port B Tx (receive, start/stop queue, return value); the zOVN bridge performs overlay encapsulation toward Port A Rx; the NIC Tx sends the frame (receive, wake-up); PAUSE flow control operates on the physical link.]
RX Path: Fix Stack Loss
[Figure: the receive path. The NIC Rx receives the frame from the physical link (with PAUSE flow control); the zOVN bridge performs overlay decapsulation via Port A Tx; vSwitch Port B Rx feeds the vNIC Rx; in the guest kernel the NET RX softirq (netif_receive skb) can pause/resume the queue before the socket Rx and the application's read; a setsockopt lets each socket select lossy or lossless delivery.]
Lossless Virtual Switch
[Figure: a vSwitch with ports 1..N, each with a Tx and an Rx queue.]
Senders:
• Produce packets
• Start the forwarder
• Sleep
Receivers:
• Consume packets
• Start the forwarder
• Sleep
Forwarder:
• Move packets from Tx to Rx
• Pause Tx ports if the Rx port is full
• Wake up Tx ports when something is consumed
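The forwarder discipline above can be sketched as follows (a simplified single-threaded model with hypothetical names, not the kernel implementation): instead of dropping when a destination Rx queue is full, the forwarder raises a pause flag toward the sender, and clears it when the receiver consumes a packet.

```python
from collections import deque

class Port:
    def __init__(self, capacity):
        self.tx = deque()       # packets produced by the sender VM
        self.rx = deque()       # packets awaiting the receiver VM
        self.capacity = capacity
        self.paused = False     # back-pressure signal toward the sender

class LosslessForwarder:
    """Move packets Tx -> Rx; pause senders instead of dropping."""
    def __init__(self, ports):
        self.ports = ports

    def forward(self, src, dst):
        moved = 0
        while self.ports[src].tx:
            if len(self.ports[dst].rx) >= self.ports[dst].capacity:
                self.ports[src].paused = True   # sender must stop and sleep
                break
            self.ports[dst].rx.append(self.ports[src].tx.popleft())
            moved += 1
        return moved

    def consume(self, dst, src):
        pkt = self.ports[dst].rx.popleft()
        self.ports[src].paused = False          # wake the paused sender
        return pkt

ports = {"A": Port(capacity=2), "C": Port(capacity=2)}
fwd = LosslessForwarder(ports)
ports["A"].tx.extend(range(5))
fwd.forward("A", "C")          # fills C's Rx queue, then pauses A
print(ports["A"].paused, len(ports["A"].tx))   # True 3 -- nothing dropped
```

The three excess packets stay queued at the paused sender; once the receiver consumes, the sender is woken and forwarding resumes, so no packet is ever discarded inside the vSwitch.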
Fully Lossless Path
[Figure: the same measurement setup as before, with numbered points (1)-(6) along the transmit and receive paths.]
• Fixed the vSwitch loss: between (3) and (4)
• Fixed the receive-stack loss: between (5) and (6)
Outline
• Introduction
• Losses in Virtual Networks
• zOVN Architecture
• Evaluation
• Conclusions
Partition/Aggregate Workload
[Figure: a master (1) partitions a query across four workers (2); the workers respond (3); the master aggregates the answers (4).]
• Problem: TCP incast
  • During the Aggregate phase, buffers might overflow
  • For short flows TCP is ineffective: the ACK clock stalls
  • Losses must be recovered by timeouts
• Partition/Aggregate traffic is datacenter-internal
  • Open to optimizations
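A back-of-the-envelope model (my own simplification, not the paper's simulator) shows why a single drop dominates short-flow completion time: responses that overflow the shared buffer can only be recovered after a retransmission timeout (RTO), which is orders of magnitude larger than the round-trip time.

```python
def aggregate_time_ms(workers, buffer_pkts, rtt_ms=0.1, rto_ms=300.0):
    """Toy incast model: each worker returns one response packet into a
    shared switch buffer; packets beyond the buffer are dropped and
    retransmitted only after an RTO (illustrative parameter values)."""
    t, outstanding = 0.0, workers
    while outstanding > 0:
        delivered = min(outstanding, buffer_pkts)
        outstanding -= delivered
        t += rtt_ms              # delivered packets complete in one RTT
        if outstanding:
            t += rto_ms          # survivors wait out a timeout, then retry
    return t

print(aggregate_time_ms(40, 16))   # lossy fabric: timeouts dominate
print(aggregate_time_ms(40, 40))   # lossless fabric: a single RTT
```

With 40 workers and room for only 16 responses, two timeout rounds push completion time from a fraction of a millisecond to over half a second, which is the effect flow control removes.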
Testbed Setup
[Figure: four IBM x3550 M4 servers, each running 16 VMs behind a vSwitch, connected by 10G links to an IBM G8264 switch (data network) and by 1G links to an HP 1810-8G switch (control network).]
• 4x rack servers
  • 16 physical cores + HyperThreading
  • Intel 10G adapters (ixgbe drivers)
• 16 VMs / server
  • 8 VMs for PA traffic*
  • 8 VMs produce background flows
* as in "DCTCP: Efficient Packet Transport for the Commoditized Data Center", SIGCOMM 2010
Testbed Results (CUBIC)
[Figure: mean completion time (ms, log scale) vs. response size (1 to 10000 packets) for the four flow-control configurations.]

Config  Virtual Network Flow Control  Physical Network Flow Control
LL      No                            No
LZ      No                            Yes
ZL      Yes                           No
ZZ      Yes                           Yes

• Virtual-only flow control beats physical-only: the vSwitch is the primary congestion point; physical-switch congestion is negligible
• No improvement for short/long flows: long transfers can remain on lossy priorities
Simulation Setup
• Larger topology: 256 servers
• 4 VMs / server
  • 3 VMs produce PA traffic
  • 1 VM produces background flows
• Assumption: infinite CPU
Simulation Results (64 packets)
[Figure: mean completion time (ms, 0-45) for NewReno, Vegas, and Cubic under the four flow-control configurations.]

Config  Virtual Network Flow Control  Physical Network Flow Control
LL      No                            No
LZ      No                            Yes
ZL      Yes                           No
ZZ      Yes                           Yes

• Confirms the findings from the prototype experiments
• (LZ) Physical-only flow control merely shifts the drop point into the virtual network
• (ZZ) Both flow controls are required for the best performance
Faster CPUs or faster networks?
• Loss ratio is influenced by the CPU/network speed ratio
TX:
• A slow CPU coupled with a fast network is desirable
• e.g. Xeon + 1G network drops more than Core2 + 1G network
RX:
• A fast CPU coupled with a slow network is desirable
• e.g. Xeon + 10G network drops more than Xeon + 1G network
• Conflicting requirements: the problem cannot be solved by changing hardware
• The only solution: add flow control!
Conclusions
• Loss identification and characterization in OVNs
• First flow-controlled vSwitch for future Overlay Virtual Networks
• Dirty-slate approach for latency-sensitive applications
  • Un-tuned TCP
  • Commodity 1-10G Ethernet fabric
  • Trivial result replication
  • Orthogonal to other proposals
• Lossless links: order-of-magnitude completion-time reduction in Partition/Aggregate
Backup
Encapsulation in Overlay Virtual Networks
[Figure: source and destination servers, each with VMs behind a vSwitch, a fabric controller, and the physical network; each vSwitch keeps an address cache; the inner frame Payload|TCP|IP|Eth is wrapped as Encap|UDP|IP|Eth.]
Workflow:
1. The source VM sends a packet to its attached vSwitch.
2. The vSwitch queries the controller to find the address of the destination.
3. The controller answers; the information is cached by the vSwitch.
4. The packet is sent over the physical network, encapsulated with the new headers.
5. The packet is decapsulated at the destination vSwitch.
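The five-step workflow can be sketched as follows (all names, addresses, and header fields are hypothetical illustrations, not the zOVN wire format): the vSwitch resolves a destination VM through the controller on a cache miss, caches the answer, and wraps the inner frame in outer headers for the physical network.

```python
class Controller:
    """Fabric controller: maps a destination VM to its physical host."""
    def __init__(self, placement):
        self.placement = placement
        self.queries = 0          # how often the controller was consulted

    def lookup(self, vm):
        self.queries += 1
        return self.placement[vm]

class VSwitch:
    def __init__(self, controller):
        self.controller = controller
        self.cache = {}           # step 3: controller answers are cached

    def encapsulate(self, inner_frame, dst_vm):
        if dst_vm not in self.cache:          # step 2: query on a miss only
            self.cache[dst_vm] = self.controller.lookup(dst_vm)
        outer = {"encap": "overlay", "proto": "udp",
                 "ip_dst": self.cache[dst_vm]}
        return {"outer": outer, "inner": inner_frame}   # step 4

    @staticmethod
    def decapsulate(frame):                   # step 5, at the destination
        return frame["inner"]

ctrl = Controller({"vm3": "10.0.0.2"})
vswitch = VSwitch(ctrl)
vswitch.encapsulate({"payload": "x"}, "vm3")  # miss: controller queried
vswitch.encapsulate({"payload": "y"}, "vm3")  # hit: served from the cache
print(ctrl.queries)  # 1
```

The cache is what keeps the controller off the data path: after the first packet to a destination, encapsulation is a purely local operation.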