SLIDE 1

NetKernel: Making Network Stack Part of the Virtualized Infrastructure

Zhixiong Niu, Hong Xu, Peng Cheng, Qiang Su, Yongqiang Xiong, Tao Wang, Dongsu Han, Keith Winstein

SLIDE 2

Current architecture in the cloud

[Diagram: two VMs, each running an APP on its Guest OS with its own network stack; beneath them, the hypervisor and the DCN form the operator's infrastructure.]

SLIDE 3

What are the fundamental limitations?

SLIDE 4

Motivation: Tenants

“I have to deal with the network stack all by myself.”

  • TCP parameters: initcwnd, initialRTO (ms), minRTO (ms), DelayedAckTimeout (ms)
  • Stacks and congestion control: BBR, CUBIC, MPTCP, PCC, CTCP, DCTCP, mTCP, StackMap, FastSocket, MegaPipe, FlexSC
  • Kernel buffers: net.ipv4.tcp_rmem, net.ipv4.tcp_wmem, net.core.rmem_max, net.core.wmem_max (a concrete sketch of this self-service tuning follows the list)
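To make the tenant's burden concrete, here is a minimal sketch of this self-service tuning with standard Linux socket options (nothing NetKernel-specific; the congestion-control choice and buffer size are arbitrary examples):

```c
/* Per-socket tuning a tenant does today, using real Linux knobs that
 * pair with the sysctls above (e.g. SO_RCVBUF is capped by
 * net.core.rmem_max). Values here are arbitrary examples. */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    /* Pick a congestion control module, e.g. BBR instead of CUBIC. */
    const char *cc = "bbr";
    if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, cc, strlen(cc)) < 0)
        perror("TCP_CONGESTION");   /* fails if the module isn't loaded */

    /* Enlarge the receive buffer (silently capped by net.core.rmem_max). */
    int rcvbuf = 4 * 1024 * 1024;
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));

    close(fd);
    return 0;
}
```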

SLIDE 5

Tenants are primarily concerned with performance and functionality, not implementation details.


SLIDE 6

Motivation: Operator

“I know everything here. I can really help my tenants (and make some money!)”

[Diagram: the resources the operator commands on the host: kernel stacks, DPDK, FPGA, RDMA NICs.]

SLIDE 7

Motivation: Operator

The provider/tenant boundary runs between the hypervisor and the VM, leaving the operator with zero visibility into, or control of, the guest network stack:

  • Can’t deploy new stacks (e.g., DCTCP)
  • Difficult to perform management tasks
  • Difficult to even define a performance SLA
  • Difficult to troubleshoot
SLIDE 8

Is there a better way?


SLIDE 9

Making Network Stack Part of the Virtualized Infrastructure

[Diagram: compared with the current architecture, the network stack moves out of each guest into a Network Stack Module (NSM) on the operator's side.]

  • Interface unchanged (BSD sockets, etc.)
  • Packets handled in the NSM

SLIDE 10

Benefits

  • Better management efficiency for the operator
    • Orchestrate resource provisioning strategies more flexibly
    • Implement management functions as part of the user’s network stack
  • Deployment and performance gains for users, without effort on their part
    • Apply various kernel stack optimizations
    • Adopt high-performance userspace stacks
    • Use advanced hardware

SLIDE 11

Design Challenges

  • How to transparently redirect socket API calls without changing applications?
  • How to transmit socket semantics between the VM and the NSM?
  • How to ensure high performance of the semantics transmission (e.g., at 100 Gbps)?

SLIDE 12

Transparent socket API redirection

  • A new sock type, SOCK_NETKERNEL
  • GuestLib: a complete implementation of the BSD socket APIs that maps socket(), socket_sendmsg(), … to nk_socket(), nk_socket_sendmsg(), …

[Diagram: inside the tenant VM, apps call the BSD socket API (socket(), send(), …); GuestLib translates each call into its nk_ counterpart (nk_socket(), nk_sendmsg(), …).]
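In NetKernel the redirection happens inside the guest kernel via the new SOCK_NETKERNEL type. Purely as a user-space illustration of the idea (this is not NetKernel's actual GuestLib, and the numeric value of SOCK_NETKERNEL is an assumption), an LD_PRELOAD shim can swap the socket type underneath an unmodified application:

```c
/* Illustration only: an LD_PRELOAD shim that swaps SOCK_STREAM for a
 * hypothetical SOCK_NETKERNEL type. The real GuestLib implements the
 * BSD socket APIs in the guest kernel. Build with:
 *   gcc -shared -fPIC shim.c -o shim.so -ldl
 * then run an app with LD_PRELOAD=./shim.so. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <sys/socket.h>

#define SOCK_NETKERNEL 7    /* assumed value for the new socket type */
#define SOCK_TYPE_MASK 0xf  /* low bits of 'type' hold the base type */

int socket(int domain, int type, int protocol)
{
    /* Resolve libc's socket() so every other socket passes through. */
    int (*real_socket)(int, int, int) =
        (int (*)(int, int, int))dlsym(RTLD_NEXT, "socket");

    if (domain == AF_INET && (type & SOCK_TYPE_MASK) == SOCK_STREAM)
        /* Keep flags such as SOCK_NONBLOCK, but request the NetKernel
         * type; with GuestLib loaded, the guest kernel would turn each
         * call on this socket into an NQE for the NSM. */
        return real_socket(domain,
                           (type & ~SOCK_TYPE_MASK) | SOCK_NETKERNEL,
                           protocol);

    return real_socket(domain, type, protocol);
}
```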

SLIDE 13
A lightweight semantics channel

  • NQE: NetKernel Queue Elements carry the socket semantics
  • NQE queues for semantics transmission, and hugepages for data transmission, in the NetKernel device
  • NQE layout (32B total): type (1B), VM ID (1B), queue set ID (1B), VM socket ID (4B), op_data (8B), data pointer (8B), size (4B), reserved (5B); a struct-level sketch follows

[Diagram: (1) the app opens a NetKernel socket through the BSD socket API (socket(), send(), …); (2) GuestLib translates the call into an NQE (nk_bind(), nk_sendmsg(), …) and places it in the NetKernel device's queues, with the payload in hugepages; (3) a response NQE comes back; (4) GuestLib returns the result to the app.]
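Written down as a C struct, the layout looks as follows (the field names, and the merge of the slide's two fragments into one 32-byte record, are assumptions; only the field sizes come from the slide):

```c
/* Sketch of an NQE as read off the slide: 32 bytes total. Field names
 * are assumptions for illustration; only the sizes are from the slide. */
#include <stdint.h>

struct nqe {
    uint8_t  type;          /* 1B: operation (socket, sendmsg, ...) */
    uint8_t  vm_id;         /* 1B: which VM (or NSM) the NQE belongs to */
    uint8_t  queue_set_id;  /* 1B: per-core queue set */
    uint32_t socket_id;     /* 4B: VM socket ID */
    uint64_t op_data;       /* 8B: operation-specific argument */
    uint64_t data_ptr;      /* 8B: pointer into the shared hugepages */
    uint32_t size;          /* 4B: length of the payload in hugepages */
    uint8_t  rsvd[5];       /* 5B: reserved */
} __attribute__((packed));

_Static_assert(sizeof(struct nqe) == 32, "NQE should be 32 bytes");
```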

SLIDE 14

Scalable lockless queues

  • Per-core queue sets, lockless queues
  • NQE switching via CoreEngine, using a connection table that maps <VM ID, queue set ID, socket ID> to <NSM ID, queue set ID, socket ID>, e.g., <01, 01, 2A3E97C3> ↔ <01, 01, C85D426F>; a sketch of the lookup follows

[Diagram: VM1's GuestLib and NSM 1's ServiceLib each own per-core queue sets in the NK device; CoreEngine switches NQEs between them via the connection table.]
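A minimal sketch of the connection table and the lookup CoreEngine performs when switching a VM-side NQE (struct and function names are assumptions; a real implementation would use a hash table rather than a linear scan):

```c
/* Sketch of CoreEngine's connection table: each entry pairs a VM-side
 * triple with its NSM-side triple, as on the slide. */
#include <stddef.h>
#include <stdint.h>

struct endpoint {
    uint8_t  id;            /* VM ID on one side, NSM ID on the other */
    uint8_t  queue_set_id;  /* which per-core queue set */
    uint32_t socket_id;     /* socket within that VM/NSM */
};

struct conn_entry {
    struct endpoint vm;     /* e.g. <01, 01, 2A3E97C3> */
    struct endpoint nsm;    /* e.g. <01, 01, C85D426F> */
};

/* Where should a VM-side NQE be switched to? Returns the NSM endpoint. */
static const struct endpoint *
lookup_nsm(const struct conn_entry *tbl, size_t n, struct endpoint vm)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].vm.id == vm.id &&
            tbl[i].vm.queue_set_id == vm.queue_set_id &&
            tbl[i].vm.socket_id == vm.socket_id)
            return &tbl[i].nsm;
    return NULL;  /* no mapping yet */
}
```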

SLIDE 15

VM-based NSM

  • Supports existing kernel and userspace stacks from various OSes
  • Provides good isolation to guarantee performance
  • Runs stacks independently of the hypervisor
SLIDE 16

NetKernel

[Architecture diagram: APP1 and APP2 in the tenant VM use the BSD socket API through GuestLib (NetKernel Socket); the NetKernel device's queues and hugepages are mmap'd shared memory regions (striped areas) visible to both the VM and the NSM; in the NSM, ServiceLib fronts the network stack; CoreEngine switches NQEs between the queue sets; the NSM reaches the pNICs via a vNIC on a virtual switch or an SR-IOV embedded switch.]

SLIDE 17

Implementation

  • QEMU/KVM 2.5.0, Linux kernel 4.9
  • 2x Intel Xeon 16-core CPUs @ 2.30 GHz
  • 256 GB DDR4 2133 MHz
  • Mellanox ConnectX-4 100G single-port NIC

SLIDE 18

Use Case #1: Multiplexing

Application Gateway (AG): L7 proxy and load-balancing service. In the baseline, each of AG1, AG2, and AG3 runs in its own 4-core VM, replaying a trace from a large cloud.

[Plot: normalized RPS performance over time (0–60 min) for AG1, AG2, AG3.]

SLIDE 19

Use Case #1: Multiplexing

With NetKernel, the three AGs share one NSM: AG1, AG2, and AG3 take 1 core each, the NSM 5 cores, and CoreEngine 1 core. NetKernel: 9 cores; baseline: 12 cores.

[Plot: normalized RPS per core over time (0–60 min), Baseline vs. NetKernel.]

Benefit: NetKernel helps the operator perform network management more efficiently.

SLIDE 20

Use Case #2: Deploying mTCP without API Change

  • mTCP doesn’t support Nginx yet
  • mTCP ported as an NSM; fixed a bug in the DPDK mlx5_core driver along the way
  • Unmodified Nginx runs on mTCP without any tenant effort

[Bar chart: Krps at 1, 2, and 4 vCPUs, Kernel Stack NSM vs. mTCP NSM.]

mTCP NSM brings a ~1.8x performance gain.

SLIDE 21

Use Case #3: Shared Memory Networking

  • The operator can easily detect on-host traffic with NetKernel
  • For on-host traffic, it can use a shared memory NSM to avoid TCP and bridge overhead (see the sketch at the end of this slide)

[Plot: Throughput (Gbps) vs. message size (64B–8192B), Baseline vs. NetKernel with shared memory NSM.]

The shared memory NSM achieves a >2x performance gain for on-host traffic.

Benefit: NetKernel helps users achieve deployment and performance gains.
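The sketch below shows only the core idea behind a shared-memory NSM: once both endpoints are known to be on the same host, payloads can move through a shared region with no TCP or bridge in the path. POSIX shm and the "/nk_demo" name are stand-ins for illustration; NetKernel's NSM uses its own hugepage channel:

```c
/* Illustration of the shared-memory idea for on-host traffic: the
 * "sender" writes its payload straight into a region the "receiver"
 * can map, bypassing TCP entirely. Link with -lrt on older glibc. */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* "/nk_demo" is a hypothetical name used only for this sketch. */
    int fd = shm_open("/nk_demo", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, 4096);

    char *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* Sender side: place the message directly into shared memory; a
     * peer mapping the same region reads it with no protocol overhead. */
    strcpy(buf, "on-host payload, no TCP or bridge in the path");

    munmap(buf, 4096);
    close(fd);
    shm_unlink("/nk_demo");
    return 0;
}
```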

SLIDE 22

Microbenchmarks: Throughput

  • Baseline (a VM) and NetKernel (a VM with a Linux kernel stack NSM) use the same settings
  • 8 TCP connections, 8KB messages

[Plots: send and receive throughput (Gbps) vs. # of vCPUs (1–8), Baseline vs. NetKernel.]

NetKernel can achieve 100 Gbps with 3 cores (send) and 8 cores (receive).

SLIDE 23

Microbenchmarks: RPS

  • Simple epoll server, short TCP connections
  • 64B request/response

[Plot: requests/sec (x10^3) vs. # of vCPUs (1–8): Baseline, NetKernel, NetKernel with mTCP NSM.]

mTCP NSM brings a 2x performance gain.

SLIDE 24

Discussion and future directions

  • How can I use Netfilter?
    • Hard to support in a multi-tenant NSM
  • What about troubleshooting performance issues?
    • The operator can monitor its NSMs by deploying additional mechanisms in them
  • Does NetKernel increase the attack surface?
    • The NK device has its own address spaces
    • The channel between each NSM and VM is isolated
  • Future directions
    • Performance isolation
    • Charging policies
    • FPGA/SoC

SLIDE 25

Recap

  • Designed and implemented NetKernel
    • Decouples the network stack from the guest
    • Makes it part of the virtualized infrastructure in the cloud
  • Enabled several new use cases
    • Multiplexing, mTCP NSM, shared memory NSM
  • Conducted a comprehensive testbed evaluation with commodity 100G NICs
  • Website: https://netkernel.net