 
Network Stack as a Service in the Cloud
Zhixiong Niu¹, Hong Xu¹, Dongsu Han², Peng Cheng³, Yongqiang Xiong³, Guo Chen³, Keith Winstein⁴
¹City University of Hong Kong, ²KAIST, ³Microsoft Research Asia, ⁴Stanford University

Imagine you’re a tenant. You want to deploy a new stack.
Motivation: Tenants
"I heard that BBR is great. Let’s deploy it to my VMs!"
[Diagram: a network stack inside a VM]
Problem: cannot deploy a stack across OSes

Motivation: Tenants
[Diagram: a network stack inside a VM]
[Chart: No. of commits of mTCP and F-Stack in 2017, by month (Aug to Nov); y-axis 0 to 30]
Problem: high deployment and maintenance cost

So your life as a tenant sucks. What about the cloud provider?
Motivation: Provider
"I know that BBR is great. Let me deploy it for my tenants!"
[Diagram: the VM and its stack belong to the tenant; the hypervisor belongs to the provider]
Problem: can’t touch the tenant stack

So what’s wrong here?
Current architecture
Network stack is coupled to the guest OS
[Diagram: the VM holds APP1, APP2, the networking API, and the network stack (tenant side); the vNIC sits below (provider side)]

Vision: Network Stack as a Service
[Diagram: the VM keeps APP1, APP2, and the networking API (tenant side); the network stack sits in a network stack module, or NSM (provider side)]
‣ Interface unchanged (BSD sockets, etc.)
‣ Packets handled in the NSM

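To make "interface unchanged" concrete, here is a minimal sketch of an application that touches only the BSD socket interface. Nothing in it names a particular stack implementation, which is the property the vision relies on. The function names `echo_once` and `roundtrip` are made up for this illustration, and the example runs over loopback on an ordinary guest stack; it does not use or emulate an NSM.

```python
import socket
import threading


def echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and echo back the first chunk received."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def roundtrip(message: bytes) -> bytes:
    """Send a message to a loopback echo server using plain socket calls."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]

        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()

        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        worker.join()
    return reply


if __name__ == "__main__":
    print(roundtrip(b"hello"))
```

The application sees only `bind`, `listen`, `accept`, `connect`, `send`, and `recv`. Under the proposed architecture the same calls would be served by whichever NSM the tenant selected.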
What’re the benefits?
Flexibility for Tenants
[Diagram: one VM attached to an mTCP NSM, another VM attached to a BBR NSM]
‣ Stack independent of the guest OS
‣ No deployment or maintenance cost

Efficiency for Provider
‣ Offer meaningful SLAs

NSM       Capacity   Price
mTCP      25Mpps     $2/hr
mTCP      50Mpps     $4/hr
F-Stack   20Mpps     $2/hr

‣ Optimize resource utilization
‣ Easier to assert coordination and control
[Diagram labels: BBR NSM; NUM, pHost, mon., Fabric]

Accelerate Innovation
[Diagram: VMs running mTCP across a timeline: Oct 2017, Nov 2017, Dec 2017, …]
‣ Allow stack to evolve independently of the guest OS (not possible in current architecture)
‣ Write once, run everywhere

NetKernel
[Architecture diagram. Labels: VM (APP1, APP2, Network API, Socket API, GuestLib); NSM (Network Stack, ServiceLib); Huge page (Data); Queues; vNIC; CoreEngine; Virtual Switch / Embedded Switch (SR-IOV); Hypervisor; pNICs (Physical NICs)]

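The diagram names the pieces (GuestLib, ServiceLib, queues, and data in a huge page) but not how they interact. The sketch below is one plausible reading, offered purely as an illustration: payload bytes are written into a shared buffer, and a queue carries small descriptors that point at them. Every class and method name here (`SharedRegion`, `GuestLibSketch`, `ServiceLibSketch`, `send`, `poll`) is hypothetical, and a `bytearray` and a `deque` stand in for the huge page and the queues.

```python
from collections import deque
from typing import Optional


class SharedRegion:
    """Stand-in for the shared huge page plus its descriptor queue."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)  # payload bytes live here
        self.queue: deque = deque()  # descriptors: (offset, length)
        self.write_pos = 0           # next free byte in the buffer


class GuestLibSketch:
    """Hypothetical guest side: copies payload in, then posts a descriptor."""

    def __init__(self, region: SharedRegion) -> None:
        self.region = region

    def send(self, payload: bytes) -> None:
        start = self.region.write_pos
        end = start + len(payload)
        if end > len(self.region.data):
            # This sketch has no wraparound or reclamation.
            raise BufferError("shared region full")
        self.region.data[start:end] = payload
        self.region.write_pos = end
        self.region.queue.append((start, len(payload)))


class ServiceLibSketch:
    """Hypothetical NSM side: takes a descriptor and reads the payload."""

    def __init__(self, region: SharedRegion) -> None:
        self.region = region

    def poll(self) -> Optional[bytes]:
        if not self.region.queue:
            return None
        offset, length = self.region.queue.popleft()
        return bytes(self.region.data[offset:offset + length])


if __name__ == "__main__":
    region = SharedRegion(4096)
    guest, service = GuestLibSketch(region), ServiceLibSketch(region)
    guest.send(b"GET / HTTP/1.1\r\n\r\n")
    print(service.poll())
```

Keeping descriptors separate from payload means the queue entries stay small and fixed-size while the data itself is copied once into shared memory. Whether NetKernel does exactly this is not stated on the slide.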
Microbenchmark
‣ 3000 lines of C code, in user space
‣ QEMU KVM 2.5.0, Linux Kernel 4.9
‣ Intel Xeon CPU E5-2618L v3 @ 2.30GHz x 2

Communication between ServiceLib and GuestLib (random read and copy)
Chunk size   64B   512B   1KB     2KB     4KB     8KB
Latency      8ns   64ns   117ns   214ns   425ns   809ns
Throughput callouts on the slide: 64Gbps, 81Gbps

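The two throughput callouts can be related to the latency row with simple arithmetic: bits moved per nanosecond is numerically the same as Gbit/s. The sketch below assumes 1KB means 1024 bytes, which the slide does not state. Under that assumption the 64B column works out to 64 Gbps and the 8KB column to roughly 81 Gbps, matching the two callouts.

```python
def throughput_gbps(chunk_bytes: int, latency_ns: float) -> float:
    """Bits per nanosecond, which is numerically equal to Gbit/s."""
    return chunk_bytes * 8 / latency_ns


# Chunk size in bytes -> latency in ns, copied from the slide's table.
# Assumption: KB is taken as 1024 bytes.
measurements = {64: 8, 512: 64, 1024: 117, 2048: 214, 4096: 425, 8192: 809}

for size, latency in measurements.items():
    print(f"{size:>5} B  {latency:>4} ns  {throughput_gbps(size, latency):5.1f} Gbps")
```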
Windows VM + BBR NSM
[Diagram: a VM with a BBR NSM in Beijing and a VM in California; 350ms RTT, 12Mbps uplink]
[Chart: Throughput (Mbps), y-axis 0 to 12, for Win + NSM BBR, Linux BBR, Windows CTCP, and Linux CUBIC]

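For a sense of scale, the path parameters on this slide imply a fairly large bandwidth-delay product, the amount of data a sender must keep in flight to fill the link. This is a standard back-of-the-envelope calculation and not something the slide itself reports; it also assumes the 12Mbps uplink is the bottleneck. The function name `bdp_bytes` is made up for the example.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> float:
    """Bandwidth-delay product in bytes: bits in flight over one RTT, divided by 8."""
    return bandwidth_bps * rtt_ms / 1000 / 8


# Figures from the slide: 12 Mbps uplink, 350 ms round-trip time.
print(f"{bdp_bytes(12_000_000, 350) / 1000:.0f} kB in flight to fill the link")
```

That comes to about 525 kB, so a congestion controller has to sustain a window of that size across a 350ms round trip to reach the link rate.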
Takeaway
‣ Vision: Network Stack as a Service
  ‣ Decouple the network stack from the guest OS
  ‣ Better flexibility and efficiency, and faster innovation
‣ NetKernel as a solution
  ‣ GuestLib, ServiceLib, CoreEngine

Research Agenda
‣ NSM form
  ‣ VM? unikernel-based VMs? containers? hypervisor modules?
‣ Support for containers
  ‣ Currently a container has to use the host stack
  ‣ Different containers on the same host use different stacks
  [Diagram: Spark paired with DCTCP, Nginx paired with BBR]
‣ Network stacks to NSMs
‣ …

Open Questions
‣ Any downsides?
‣ Other use cases in a production cloud?
‣ How about a private data center?
‣ What’s the right abstraction boundary of the network stack?