Network Stack as a Service in the Cloud



slide-1
SLIDE 1

Network Stack as a Service in the Cloud

Zhixiong Niu¹, Hong Xu¹, Dongsu Han², Peng Cheng³, Yongqiang Xiong³, Guo Chen³, Keith Winstein⁴

¹City University of Hong Kong, ²KAIST, ³Microsoft Research Asia, ⁴Stanford University

slide-2
SLIDE 2

Imagine you’re a tenant. You want to deploy a new stack.

2

slide-6
SLIDE 6

Motivation: Tenants

I heard that BBR is great. Let’s deploy it to my VMs!

Problem: cannot deploy a stack across OSes
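
To make the OS coupling concrete: on a Linux guest, an application can ask for a congestion control algorithm per socket through the `TCP_CONGESTION` socket option, but the request only succeeds if that guest kernel ships the algorithm. The sketch below assumes a Linux guest and Python 3.6 or later; the helper name `try_set_congestion_control` is ours, and on platforms where Python does not expose the option it simply returns `None`.

```python
import socket

def try_set_congestion_control(sock, name):
    """Request congestion control `name` on a TCP socket.

    Returns the algorithm reported by the kernel afterwards, or None
    when this platform does not expose TCP_CONGESTION at all.
    """
    opt = getattr(socket, "TCP_CONGESTION", None)
    if opt is None:
        return None  # no such option here: the stack cannot be chosen this way
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, name.encode())
    except OSError:
        pass  # algorithm not available in this kernel; the default stays
    try:
        raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        return None
    return raw.split(b"\x00", 1)[0].decode()

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    print(try_set_congestion_control(s, "bbr"))
```

Whether this prints `bbr`, the kernel default, or `None` depends entirely on the guest OS and kernel build, which is the dependency the slide is pointing at.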

[Diagram: a tenant VM with its network stack inside]

slide-10
SLIDE 10

Motivation: Tenants

  • No. of commits of mTCP and F-Stack in 2017

[Chart: mTCP and F-Stack commit counts for Aug, Sep, Oct, Nov; y-axis ticks 7.5, 15, 22.5, 30]

Problem: high deployment and maintenance cost


slide-12
SLIDE 12

So your life as a tenant sucks. What about the cloud provider?

5

slide-15
SLIDE 15

Motivation: Provider

I know that BBR is great. Let me deploy it for my tenants!

[Diagram labels: Tenant side holds the VM and its stack; Provider side holds the hypervisor]

Problem: can’t touch the tenant stack

6

slide-17
SLIDE 17

So what’s wrong here?

7

slide-19
SLIDE 19

Current architecture

[Diagram labels: Tenant, Provider; a VM containing APP1, APP2, the networking API, the network stack, and a vNIC]

Network stack is coupled to the guest OS

slide-23
SLIDE 23

Vision: Network Stack as a Service

[Diagram labels: Tenant, Provider; a VM containing APP1, APP2, and the networking API; the network stack placed in a separate network stack module (NSM)]

Interface unchanged (BSD sockets, etc.)

Packets handled in the NSM

slide-24
SLIDE 24

What are the benefits?

slide-27
SLIDE 27

Flexibility for Tenants

  • Stack independent of the guest OS
  • No deployment or maintenance cost

[Diagram: two VMs, one paired with an mTCP NSM and one with a BBR NSM]

slide-32
SLIDE 32

Efficiency for Provider

  • Offer meaningful SLAs

NSM      Capacity  Price
mTCP     25Mpps    $2/hr
mTCP     50Mpps    $4/hr
F-Stack  20Mpps    $2/hr

  • Easier to exert coordination and control
  • Optimize resource utilization
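
With per-NSM capacity figures like the ones in the table, choosing an offering for a given packet rate becomes a simple lookup. The sketch below reuses the slide's illustrative numbers; the `Offering` type and the `cheapest_offering` helper are hypothetical names for this example and not part of NetKernel.

```python
from typing import NamedTuple, Optional

class Offering(NamedTuple):
    nsm: str
    capacity_mpps: int
    price_per_hour: float

# Illustrative catalogue copied from the slide's table.
CATALOGUE = [
    Offering("mTCP", 25, 2.0),
    Offering("mTCP", 50, 4.0),
    Offering("F-Stack", 20, 2.0),
]

def cheapest_offering(required_mpps, catalogue=CATALOGUE) -> Optional[Offering]:
    """Return the lowest-priced offering whose capacity covers the demand."""
    candidates = [o for o in catalogue if o.capacity_mpps >= required_mpps]
    if not candidates:
        return None
    # On a price tie, prefer the offering with more capacity.
    return min(candidates, key=lambda o: (o.price_per_hour, -o.capacity_mpps))

print(cheapest_offering(30))
```

A tenant needing 30Mpps lands on the 50Mpps mTCP tier here, while a demand above 50Mpps finds no match in this small catalogue.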

[Diagram: a BBR NSM next to provider-side blocks labelled mon., pHost, NUM, and Fabric]

slide-35
SLIDE 35

Accelerate Innovation

  • Allow the stack to evolve independently of the guest OS
  • Write once, run everywhere

[Diagram: four VMs and three mTCP versions (Oct 2017, Nov 2017, Dec 2017)]

Not possible in current architecture

slide-40
SLIDE 40

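NetKernel

[Architecture diagram labels: a VM with APP1 and APP2 on top of the Network API, where GuestLib provides the socket API; an NSM containing the network stack, ServiceLib, and a vNIC; CoreEngine with queues; data exchanged through huge pages; in the hypervisor, a virtual switch / embedded switch (SR-IOV) in front of the physical NICs (pNICs)]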
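
One way to picture the data path suggested by these labels: GuestLib turns a socket call into a small descriptor on a queue, the payload goes into shared memory (the huge pages in the diagram), and ServiceLib in the NSM picks the descriptor up and hands the data to the actual stack. The sketch below models this with in-process Python objects. The descriptor fields, the single `deque` standing in for the queues next to CoreEngine, and the single-threaded flow are all assumptions made for illustration, not NetKernel's actual design.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Descriptor:
    op: str       # socket operation, e.g. "send"
    fd: int       # guest-side socket handle
    offset: int   # where the payload starts in shared memory
    length: int   # payload size in bytes

class SharedRegion:
    """Stands in for the huge-page region shared by guest and NSM."""
    def __init__(self, size):
        self.buf = bytearray(size)
        self.next_free = 0

    def write(self, data):
        if self.next_free + len(data) > len(self.buf):
            raise MemoryError("shared region full")
        start = self.next_free
        self.buf[start:start + len(data)] = data
        self.next_free += len(data)
        return start

    def read(self, offset, length):
        return bytes(self.buf[offset:offset + length])

class GuestLib:
    """Guest side: exposes a send() that looks like a socket call."""
    def __init__(self, region, queue):
        self.region, self.queue = region, queue

    def send(self, fd, data):
        offset = self.region.write(data)  # payload goes to shared memory
        self.queue.append(Descriptor("send", fd, offset, len(data)))
        return len(data)

class ServiceLib:
    """NSM side: drains descriptors and feeds the network stack."""
    def __init__(self, region, queue, stack_send):
        self.region, self.queue, self.stack_send = region, queue, stack_send

    def poll(self):
        handled = 0
        while self.queue:
            d = self.queue.popleft()
            if d.op == "send":
                self.stack_send(d.fd, self.region.read(d.offset, d.length))
            handled += 1
        return handled

# Usage: `wire` records what the (stand-in) network stack was asked to send.
region, queue, wire = SharedRegion(4096), deque(), []
guest = GuestLib(region, queue)
nsm = ServiceLib(region, queue, lambda fd, data: wire.append((fd, data)))
guest.send(3, b"hello")
nsm.poll()
print(wire)
```

The application-facing call (`guest.send`) never touches the stack itself, so in this model the stack behind `stack_send` can be swapped without changing the guest side.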

slide-42
SLIDE 42

Microbenchmark

  • 3000 lines of C code, in user space
  • QEMU KVM 2.5.0, Linux Kernel 4.9
  • Intel Xeon CPU E5-2618L v3 @ 2.30GHz x 2

Communication between ServiceLib and GuestLib (random read and copy)

Chunk size  64B  512B  1KB    2KB    4KB    8KB
Latency     8ns  64ns  117ns  214ns  425ns  809ns

Equivalent throughput: 64Gbps at 64B chunks, 81Gbps at 8KB chunks
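
The two throughput figures follow from the latency numbers: moving a chunk of S bytes in t nanoseconds corresponds to 8·S/t Gbps. The short check below recomputes this for every chunk size, assuming 1KB means 1024 bytes (the slide does not say); the function name is ours.

```python
# Latency per chunk size in bytes, copied from the microbenchmark numbers.
LATENCY_NS = {64: 8, 512: 64, 1024: 117, 2048: 214, 4096: 425, 8192: 809}

def throughput_gbps(chunk_bytes, latency_ns):
    """Bits moved per nanosecond is numerically equal to Gbps."""
    return chunk_bytes * 8 / latency_ns

for size, ns in LATENCY_NS.items():
    print(f"{size:>5} B  {ns:>4} ns  {throughput_gbps(size, ns):5.1f} Gbps")
```

The 64B and 8KB entries work out to 64 and roughly 81 Gbps, which matches the figures quoted on the slide.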

15

slide-43
SLIDE 43

Windows VM + BBR NSM

[Bar chart: throughput (Mbps), y-axis ticks 3, 6, 9, 12, for Win + NSM BBR, Linux BBR, Windows CTCP, and Linux CUBIC]

[Setup diagram: two VMs and a BBR NSM on a path between Beijing and California, 350ms RTT, 12Mbps uplink]

slide-44
SLIDE 44

Takeaway

  • Vision: Network Stack as a Service
      • Decouple the network stack from the guest OS
      • Better flexibility and efficiency, and faster innovation
  • NetKernel as a solution
      • GuestLib, ServiceLib, CoreEngine

17

slide-49
SLIDE 49

Research Agenda

  • NSM form
      • VM? unikernel-based VMs? containers? hypervisor modules?
  • Support for containers
      • Currently a container has to use the host stack
      • With NSMs, different containers on the same host could use different stacks

[Diagram: Spark paired with DCTCP, Nginx paired with BBR]

  • Network stacks to NSMs
slide-50
SLIDE 50

Open Questions

  • Any downsides?
  • Other use cases in a production cloud?
  • How about a private data center?
  • What’s the right abstraction boundary of the network stack?

19