Microboxes: High Performance NFV with Customizable, Asynchronous TCP Stacks and Dynamic Subscriptions (PowerPoint presentation transcript)


SLIDE 1

Microboxes: High Performance NFV with Customizable, Asynchronous TCP Stacks and Dynamic Subscriptions

Guyue (Grace) Liu, Yuxin Ren, Mykola Yurchenko, K.K. Ramakrishnan, Timothy Wood

SLIDE 2

Guyue Liu – George Washington University

Why Improve Existing NFV Frameworks?

  • Existing NFV frameworks focus on L2/L3 processing

OpenNetVM [HotMiddlebox’16], E2 [SOSP’15], ClickOS [NSDI’14], netmap [USENIX ATC ’12], PF_RING [SANE’04]

SLIDE 3

[Diagram: network layers L1–L5/7 (Physical, Data Link, Network, Transport, Application) with example protocols (Ethernet/PPP/LLDP; IP/ICMP/ARP; TCP/UDP; HTTP/DNS/FTP/SMTP/SSH/POP/Telnet) and the packet-level NFs L2 Fwd, L3 Fwd, Firewall, NAT, IPsec, Shaper]

Why Improve Existing NFV Frameworks?

  • Existing NFV frameworks are based on a packet-centric model

[Diagram: NFV IO layer between two NICs passing raw packets (P) to NF1, NF2, NF3]

SLIDE 4

[Diagram: same layer picture, but application-level NFs (Load Balancer, Web Proxy, IDS, Gateway, Transcoder) each embed their own stack and operate on data, so every NF in the chain repeats stack processing]

Why Improve Existing NFV Frameworks?

  • Existing NFV frameworks are based on a packet-centric model
  • Protocol processing becomes part of each NF, so stack processing is redundantly repeated within a chain (cf. mOS [NSDI’17])

SLIDE 5

[Graph: processing latency (µs) vs. chain length (1–8) for stack-8KB vs. fwd-8KB; overhead annotations: 169%, 79%]

Issue #1: Redundant Stack Processing

  • As the chain length increases, the overhead of going through stack processing multiple times grows significantly

SLIDE 6

Idea #1: Consolidate Stack Processing

  • How can we remove the redundancy within a chain?

[Diagram: each of NF1–NF3 contains its own stack and app; packets flow NIC → NFV IO → NFs → NIC]

SLIDE 7

Idea #1: Consolidate Stack Processing

  • How can we remove the redundancy within a chain?
  • Deploy all NFs and the stack as a single, monolithic process?

[Diagram: NF1, NF2, NF3 and a single stack merged into one monolithic process between the NICs]

SLIDE 8

Idea #1: Consolidate Stack Processing

  • How can we remove the redundancy within a chain?
  • Move stack processing from NF into NFV framework

[Diagram: the stack moves into the NFV framework (NFV IO); NF1–NF3 keep only their app logic]

SLIDE 9

Issue #2: A Monolithic Stack is Not Efficient

[Graph: throughput (Gbps) for Simple Fwd, Connection Splicing, TCP State Tracking, and TCP Bytestream Assembly]

  • The throughput drops as the stack processing grows in functionality

SLIDE 10

Idea #2: Customizable Stack Modules

  • How to avoid unnecessary processing in the stack?

[Diagram: one shared stack in NFV IO serving the apps of NF1–NF3]

SLIDE 11

Idea #2: Customizable Stack Modules

  • How to avoid unnecessary processing in the stack?
  • Split the stack into modules based on functionality and customize processing for each NF/chain

[Diagram: stack split into modules such as Bytestream Reconstruction and State Monitoring; each NF’s chain includes only the modules it needs]

SLIDE 12

Issue #3: Separate Stacks for NFs and Endpoint Applications

  • Middlebox NFs and endpoint applications use different underlying frameworks for protocol support

[Diagram: IDS, DPI, and MON NFs each use their own stack (Stack 1–3) inside NFV IO, while a Web Proxy endpoint runs on Linux TCP via the Socket API]

SLIDE 13

Issue #3: Separate Stacks for NFs and Endpoint Applications

  • How to transparently manage both middleboxes and endpoints?

[Diagram: repeats the previous slide’s picture of separate stacks for the NFs and the Linux TCP endpoint]

SLIDE 14

Idea #3: Event Communication Interface

  • How to transparently manage both middleboxes and endpoints?
  • A flexible event interface can represent packet, data, and legacy events for a variety of services

[Diagram: Stacks 1–4 in NFV IO exchange Events with the IDS, DPI, MON, and Web Proxy NFs]

SLIDE 15

Microboxes = µStack + µEvent

  • Idea #1: Consolidate Stack Processing
  • Idea #2: Customizable Stack Modules
  • Idea #3: Event Communication Interface

[Diagram: NFs connected to µStack modules through µEvent channels]

SLIDE 16

Outline

➢ Why Improve NFV Frameworks?
➢ Microboxes = µStack + µEvent
➢ µStack
  ▪ Customizable Modules
  ▪ Consistency Challenges
➢ µEvent
  ▪ Event Hierarchy
  ▪ Publish/Subscribe Interface

SLIDE 17

µStack Modules

  • We divide TCP processing into five basic µStacks that can be composed to support different NFs.

[Diagram: the five µStack modules — L2/3, TCP Monitor, TCP Splicer, TCP Endpoint, TCP Split Proxy — composed beneath NFs such as Web Proxy, HTTP LB, IDS, Firewall, Transcoder]

SLIDE 18

µStack Modules

  • L2/3: network-layer processing that determines which flow a packet is associated with and maintains minimal state such as flow statistics.

SLIDE 19

µStack Modules

  • TCP Monitor: tracks TCP state and reconstructs the bytestream of both the client and server sides of a connection.

SLIDE 20

µStack Modules

  • TCP Splicer: redirects a TCP connection after completing the handshake, without support for modifying the bytestream.

SLIDE 21

µStack Modules

  • TCP Endpoint: contains the full TCP logic and can terminate connections and respond to client requests directly.

SLIDE 22

µStack Modules

  • TCP Split Proxy: sets up two TCP connections, with the client and server respectively, and allows NFs to perform bytestream transformations.

SLIDE 23

Stack Consistency

  • The stack and NFs run on separate cores.
  • Both the stack and NFs need to access the stack state.

[Diagram: NF1–NF3 and the stack sharing Stack State while packets P1–P3 are in flight]

SLIDE 24

Stack Consistency

  • Stack state can become inconsistent if an NF reads it while the stack has already changed it based on new packet arrivals.

Stack Consistency: the protocol stack state associated with each packet must be consistent when each NF processes that packet.

SLIDE 25

Stack Consistency

  • Sequential processing achieves correctness but leads to an inefficient pipeline

[Diagram: timeline of Core 0 (Stack) through Core 3 (NF3) processing P1–P3 strictly one packet at a time]

SLIDE 26

Stack Consistency

  • Only one core does useful work at a time while the others sit idle

[Diagram: same timeline, with the idle slots on each core highlighted]

SLIDE 27

Stack Consistency: Stack Snapshots

  • Take a snapshot of the stack state for each packet to avoid the inconsistency problem

[Diagram: timeline with per-packet snapshots letting the stack and NFs run concurrently]

SLIDE 28

Stack Consistency: Track Bytestream

  • Store an offset instead of copying the whole bytestream
  • Allow stack and NF processing to proceed asynchronously

[Diagram: the snapshot records a bytestream offset; stack and NF cores process different packets concurrently]
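The offset idea can be sketched in a few lines of Python, assuming an append-only per-flow buffer (the `FlowBuffer` class and its methods are illustrative, not the actual Microboxes code):

```python
# Sketch (assumed design): the stack appends reassembled bytes to a per-flow
# buffer and hands NFs an (offset, length) window instead of a copy. Bytes
# already written are never mutated, so an NF's view stays consistent even
# while the stack keeps appending for later packets.
class FlowBuffer:
    def __init__(self):
        self.data = bytearray()  # append-only reassembled bytestream

    def append(self, payload: bytes):
        """Stack side: append new payload, return a snapshot descriptor."""
        start = len(self.data)
        self.data += payload
        return start, len(payload)

    def view(self, start: int, length: int) -> bytes:
        """NF side: read the immutable window the event described."""
        return bytes(self.data[start:start + length])

buf = FlowBuffer()
evt1 = buf.append(b"GET /index")      # stack processes packet 1
evt2 = buf.append(b".html HTTP/1.1")  # stack races ahead to packet 2
# NF1 still sees exactly the bytes packet 1 delivered:
assert buf.view(*evt1) == b"GET /index"
```

Because the snapshot is just two integers, the stack never blocks on a copy and NFs at different chain positions can read different windows of the same flow.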

SLIDE 29

NF Consistency: Parallel Processing

  • Parallel processing increases core utilization and can be used for NFs without dependencies (NFP [SIGCOMM’17], ParaBox [SOSR’17])

[Diagram: timeline with independent NFs processing the same packet in parallel on different cores]

SLIDE 30

Flow Consistency: Parallel Stacks

  • Run multiple copies of the same stack to maximize performance
  • Packets are distributed at flow level to keep flow consistency

[Diagram: NIC RSS distributes flows across multiple µStack replicas, each serving a subset of NF1–NF5]
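Flow-level dispatch amounts to hashing the flow 5-tuple so every packet of a flow lands on the same replica. An illustrative Python sketch (real NIC RSS uses a Toeplitz hash, not CRC32; `NUM_STACKS` is an assumed parameter):

```python
import zlib

NUM_STACKS = 4  # number of parallel uStack replicas (assumed)

def stack_for(src_ip, dst_ip, src_port, dst_port, proto="TCP"):
    """Pick a stack replica from the flow 5-tuple; deterministic per flow."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    return zlib.crc32(key) % NUM_STACKS

# Every packet of one flow maps to the same replica, so that replica owns
# the flow's TCP state and no cross-core synchronization is needed.
a = stack_for("10.0.0.1", "10.0.0.2", 12345, 80)
b = stack_for("10.0.0.1", "10.0.0.2", 12345, 80)
assert a == b
```

Keeping all of a flow's packets on one core is what preserves flow consistency while still scaling the stack across cores.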

SLIDE 31

µEvent

  • Why an event interface? NFs operating at the application level care about “data” rather than individual packets. An event can encapsulate the data and notify the subscribers.
  • Example event categories:
  • PKT: new packet arrival, TCP SYN packet
  • FLOW: end of flow, bytestream data
  • NF: malicious flow, SQL request
  • STACK: TCP packet, TCP flow

SLIDE 32

µEvent Hierarchy

  • Event types: PKT, FLOW, STACK, and NF events organized as a hierarchy.
  • Event extension X/Y: a type can be extended into a new type by adding extra fields on top of the parent’s fields.

[Diagram: event hierarchy — Event → PKT (PKT/TCP → PKT/TCP/SYN, PKT/TCP/FIN, PKT/TCP/FLOW_TYPE; PKT/UDP; PKT/QUIC), FLOW (FLOW/TCP → FLOW/TCP/TERMINATE, FLOW/TCP/DATA_RDY; FLOW/UDP; FLOW/QUIC), and ALERT (ALERT/SIG_MATCH)]
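Event extension X/Y can be sketched with inheritance, where each child type carries all parent fields plus its own. The field names below are assumptions for illustration, not the actual Microboxes event layout:

```python
from dataclasses import dataclass

@dataclass
class Event:                 # root of the hierarchy
    event_id: int
    flow_id: int

@dataclass
class PktEvent(Event):       # PKT: adds the raw packet
    pkt: bytes

@dataclass
class PktTcpEvent(PktEvent):  # PKT/TCP: adds TCP header fields
    seq: int
    ack: int

@dataclass
class PktTcpSynEvent(PktTcpEvent):  # PKT/TCP/SYN: adds handshake options
    mss: int

syn = PktTcpSynEvent(event_id=1, flow_id=7, pkt=b"...", seq=100, ack=0, mss=1460)
# A subscriber to the parent type can handle the child transparently:
assert isinstance(syn, PktEvent)
```

This is why extension simplifies type checking: code written against PKT/TCP works unchanged on PKT/TCP/SYN, because the child is a strict superset of the parent.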

SLIDE 33

µEvent Hierarchy

  • The event hierarchy simplifies type checking and filtering
  • Publish/subscribe system: an NF subscribes to a portion of the hierarchy and receives all of its subevents.

[Diagram: subscribing to PKT/TCP also delivers PKT/TCP/SYN, PKT/TCP/FIN, and PKT/TCP/FLOW_TYPE, but not the sibling types PKT/UDP or PKT/QUIC]
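The subscription semantics amount to a path-prefix match over event type names. A sketch of the assumed semantics (not the Microboxes implementation itself):

```python
def matches(subscription: str, event_type: str) -> bool:
    """A subscription to a node delivers that type and all of its subtypes."""
    sub = subscription.split("/")
    evt = event_type.split("/")
    return evt[:len(sub)] == sub  # subscription path is a prefix of the event path

assert matches("PKT/TCP", "PKT/TCP/SYN")   # subevent is delivered
assert matches("PKT/TCP", "PKT/TCP")       # exact type is delivered
assert not matches("PKT/TCP", "PKT/UDP")   # sibling subtree is filtered out
assert not matches("PKT/TCP", "FLOW/TCP")  # different branch of the hierarchy
```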

SLIDE 35

µEvent Subscription

  • The controller translates pub/sub subscriptions into flow-table service chain rules to allow fast event propagation
  • When an event is triggered, the NF looks up the next hop associated with that event port in the flow table

[Diagram: controller installs the service chain for flow1 — Mon stack (p2) -> DPI (p2), DPI (p3) -> SIG (p1), SIG (p2) -> Logger (p1)]

Pub/sub provides convenient, higher-level interfaces based on the flow of events rather than the flow of packets
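The lookup can be sketched as a table keyed by (publisher, output port), mirroring the slide's flow1 chain. The table structure and `publish` helper are my assumptions for illustration:

```python
# Controller-installed rules: (publisher NF, output port) -> (next NF, input port).
# Entries mirror the slide's example chain: Mon stack -> DPI -> SIG -> Logger.
FLOW_TABLE = {
    ("MonStack", "p2"): ("DPI", "p2"),
    ("DPI", "p3"): ("SIG", "p1"),
    ("SIG", "p2"): ("Logger", "p1"),
}

def publish(nf: str, port: str):
    """Follow the installed chain from a publishing NF/port to its end.

    Each hop is a single table lookup; the controller is only involved when
    rules are installed, never on the data path.
    """
    hops = []
    hop = FLOW_TABLE.get((nf, port))
    while hop is not None:
        hops.append(hop)
        # the receiving NF republishes on its own output port; here we just
        # scan for any rule whose publisher is the NF we delivered to
        nxt = [v for k, v in FLOW_TABLE.items() if k[0] == hop[0]]
        hop = nxt[0] if nxt else None
    return hops

assert publish("MonStack", "p2") == [("DPI", "p2"), ("SIG", "p1"), ("Logger", "p1")]
```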

SLIDE 36

Implementation

[Diagram: sample NFs (L4 Load Balancer, L7 Load Balancer, TCP Proxy, nDPI, SIG Match, Flow Stats, Logger) built on the µEvent Pub/Sub Interface, µStack Modules (mOS [NSDI’17], mTCP [NSDI’14]), Chain Management + NF Communication (OpenNetVM), and NFV IO (DPDK)]

SLIDE 37

Evaluation

  • Does Microboxes improve performance by removing redundancy?
  • Can Microboxes provide customized stacks based on application requirements?
  • Experiment setup:
  • CloudLab servers: Xeon E5-2660 v3 @ 2.60GHz CPUs (2 x 10 cores), 10Gb NIC, 160GB memory
  • Traffic generators: mTCP web server and client; Nginx 1.4.6 and Apache Bench 2.3

SLIDE 38

Evaluation: Remove Redundancy

[Graph: throughput (Gbps) vs. chain length (1–8), comparing mOS (a stack inside every NF, packet interface) against Microboxes (one shared stack, event interface); annotation: 2nd socket]

SLIDE 39

Evaluation: Remove Redundancy

[Graph: same throughput comparison of mOS vs. Microboxes across chain lengths]

Removing redundant stack processing can improve performance by ~2X or more

SLIDE 40

Evaluation: Customize Stack Modules

[Graph: latency (µs, 50–400) for DPDK L2 Fwd, HAProxy, L4 LB, L7 LB, L7 LB + 50% Cache, L7 LB + 100% Cache; setup: nginx web server(s), 8KB file, clients with 4K connections; this slide highlights the L2 Fwd and HAProxy baselines]

SLIDE 41

Evaluation: Customize Stack Modules

[Graph: same latency comparison; this configuration places an L4 NF on the L2/3 µStack between clients and web servers]

SLIDE 42

Evaluation: Customize Stack Modules

[Graph: same latency comparison; this configuration places an L7 NF on the Splicer µStack]

SLIDE 43

Evaluation: Customize Stack Modules

[Graph: same latency comparison; this configuration adds an Endpoint µStack cache serving 50% of the traffic alongside the Splicer µStack]

SLIDE 44

Evaluation: Customize Stack Modules

[Graph: same latency comparison; here the Endpoint µStack cache serves 100% of the traffic]

Microboxes seamlessly integrates middleboxes and endpoints to build complex network services!

SLIDE 45

Source Code

  • OpenNetVM now integrated with mOS/mTCP endpoint stack
  • Learn more at HPNFV tutorial (Friday 9:00AM – 5:20PM)
  • Microboxes with µStack and µEvents code coming soon

https://github.com/sdnfv/openNetVM

SLIDE 46

Conclusion

Issues addressed:
  • Redundant Stack Processing → ✓ Consolidated Stack Processing
  • A Monolithic Stack → ✓ Customizable Stack Modules
  • Separate Stacks/Interfaces → ✓ Unified Event Interface

Microboxes = µStack + µEvent = stack snapshots + parallel stacks + parallel events + event hierarchy + publish/subscribe interface