CoCo: Compact and Optimized Consolidation of Modularized Service Function Chains in NFV (PowerPoint presentation)

SLIDE 1

CoCo: Compact and Optimized Consolidation of Modularized Service Function Chains in NFV

Zili Meng, Jun Bi, Haiping Wang, Chen Sun, Hongxin Hu

SLIDE 2

NFV & Modularization


[Figure: dedicated hardware devices (VPN, Firewall, Monitor, Load Balancer) are replaced by NFV, running as VMs on commodity hardware; the service chain is further decomposed into elements (Classifier, Read, Alert, Output, Drop), forming a Modularized SFC (MSFC)]

SLIDE 3

NFV & Modularization

– Low cost
– Flexibility
– Scalability

SLIDE 4

However…

  • Two drawbacks:

– High latency
– Poor resource efficiency

SLIDE 5

However…


  • OpenBox [Sigcomm’16]

– Element reuse

  • NFVnice [Sigcomm’17]

– NF consolidation: containers in one VM (core).

Which elements to consolidate?

SLIDE 6

Key Observations

[Figure: two placements of elements E1–E7 across VM1, VM2, VM3, incurring different numbers of inter‐VM transfers]

Key observation: element placement affects MSFC performance by determining the number of inter‐VM transfers.

SLIDE 7

CoCo identifies inter‐VM transfers between elements, and

  • Optimizes the placement of elements on VMs
  • Optimizes the dynamic scaling mechanism
SLIDE 8

Challenges

  • Optimized Placement

– How to model the inter‐VM transfer?
– How to find optimal solutions efficiently?

  • Optimized Dynamic Scaling

–How to reduce inter‐VM transfers during scaling out?

SLIDE 9


CoCo's two components: the Optimized Placer and the Individual Scaler
SLIDE 10

Optimized Placer

  • Packet Transfer Cost:

– Four‐step transfer delay: t
– Service chain throughput: Θ
– Delayed Bytes: Θ ⋅ t

  • Resource Analysis:

– Observation: the CPU utilization of an element is linear in its processing speed

[Figure: the four‐step inter‐VM packet transfer between VM #1 and VM #n: ① source element → vNIC through the VM's scheduler and memory, ② vNIC → vSwitch, ③ vSwitch → destination vNIC, ④ vNIC → destination element]
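The cost model on this slide can be written out as follows (a sketch; the per-step delay names t_1 through t_4 are assumed, matching the four numbered steps in the figure):

```latex
% Total four-step inter-VM transfer delay (t_1..t_4 are assumed names
% for the four per-step delays shown in the figure).
t = t_1 + t_2 + t_3 + t_4

% Delayed Bytes: the traffic held up by one inter-VM transfer is the
% service chain throughput Theta times the transfer delay.
\mathrm{DB} = \Theta \cdot t
```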

SLIDE 12

Optimized Placer – 0‐1 Quadratic Programming

  • Intuition: consolidate adjacent elements together

– If two adjacent elements are placed on the same VM, there is no inter‐VM packet transfer between them.

[Figure: example placement: Header Classifier and Stateful Payload Analyzer on VM1 (intra‐VM edge), Logger and Alert on VM2; the edge crossing from VM1 to VM2 is inter‐VM]
SLIDE 13

Optimized Placer – 0‐1 Quadratic Programming

  • Variable x_{i,j} ∈ {0, 1}: indicates that element i is placed onto instance j

  • Challenge: how to express that two elements are placed together?

  • Indicator (quadratic): elements i and k are placed together on instance j iff x_{i,j} ⋅ x_{k,j} = 1, i.e., x_{i,j} = 1 and x_{k,j} = 1
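The quadratic indicator can be sketched in a few lines of Python (the names `x` and `same_vm` are illustrative, not from the paper):

```python
# x[i][j] = 1 iff element i is placed on VM j (one 1 per row).
x = [
    [1, 0],  # element 0 -> VM 0
    [1, 0],  # element 1 -> VM 0
    [0, 1],  # element 2 -> VM 1
]

def same_vm(i, k):
    # Quadratic indicator: sums x[i][j] * x[k][j] over VMs j,
    # which is 1 iff elements i and k share some VM.
    return sum(x[i][j] * x[k][j] for j in range(len(x[0])))

print(same_vm(0, 1))  # 1: both on VM 0, so no inter-VM transfer
print(same_vm(1, 2))  # 0: different VMs, so an inter-VM transfer
```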

SLIDE 14

Optimized Placer – 0‐1 Quadratic Programming

  • Objective

– The total inter‐VM Delayed Bytes.

  • Constraints

– The placement must not overload any instance.

  • For other mathematical details, please refer to our paper.
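For intuition, the objective and constraint can be checked by brute force on a toy instance (a sketch only: the chain, throughput, CPU demands, and capacities below are invented, and the paper solves a 0‐1 quadratic program rather than enumerating):

```python
from itertools import product

chain = [0, 1, 2, 3]            # elements connected in a sequence
theta = 10.0                    # service chain throughput (assumed units)
cpu = [0.4, 0.3, 0.5, 0.2]      # per-element CPU demand (assumed)
capacity = [1.0, 1.0]           # two VMs, normalized capacity

def delayed_bytes(place):
    # Objective: each adjacent pair split across VMs adds one
    # inter-VM transfer worth of Delayed Bytes.
    return sum(theta for a, b in zip(chain, chain[1:]) if place[a] != place[b])

def feasible(place):
    # Constraint: no VM may be overloaded by its assigned elements.
    load = [0.0] * len(capacity)
    for elem, vm in enumerate(place):
        load[vm] += cpu[elem]
    return all(l <= c for l, c in zip(load, capacity))

best = min((p for p in product(range(len(capacity)), repeat=len(chain))
            if feasible(p)), key=delayed_bytes)
print(best, delayed_bytes(best))
```

With these numbers the whole chain cannot fit on one VM, so the optimum keeps the two halves contiguous and pays for exactly one inter‐VM transfer.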

SLIDE 15

Optimized Individual Scaling

[Figure: MSFC before scaling: VM1 hosts Header Classifier and Stateful Payload Analyzer, VM2 hosts Logger and Alert. With the traditional method, scaling out launches a second Stateful Payload Analyzer instance on a new VM3, requiring state synchronization (~100 ms according to OpenNF [Sigcomm'14]) and introducing additional inter‐VM packet transfers]

SLIDE 16

Optimized Individual Scaling

  • Key novelty

Migrate the other elements consolidated on the same VM to release resources for the overloaded element.

SLIDE 17

Optimized Individual Scaling

[Figure: three cases compared: the MSFC before scaling; traditional scaling, which adds a second Stateful Payload Analyzer on VM3 at the cost of state synchronization and additional packet transfers; and CoCo, which migrates the colocated elements from VM1 to VM2 so the overloaded Stateful Payload Analyzer keeps VM1's resources, with no state synchronization]

SLIDE 18

Optimized Individual Scaling

  • Consistency guarantee mechanism

– The overload must be alleviated.
– The migration must not create new hotspots.

  • Advantages of the CoCo Individual Scaler

– No new hardware resources consumed
– Additional packet transfers avoided
– State synchronization avoided

  • Application scenario of the CoCo Individual Scaler

– Imbalance between VMs (OFM [IWQoS'18])
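The two consistency conditions can be sketched as a migration check (an illustrative sketch, not the paper's algorithm: the function name, the normalized capacity, and all loads below are assumptions):

```python
CAPACITY = 1.0  # normalized CPU capacity per VM (assumption)

def pick_migration(vm_load, colocated, needed, target_vm_load):
    """Return a colocated element to migrate away, or None.

    vm_load: current load of the overloaded VM
    colocated: {element: cpu_share} of elements sharing that VM
    needed: extra CPU the overloaded element requires
    target_vm_load: load of the candidate destination VM
    """
    # Try the cheapest-to-move element first.
    for elem, share in sorted(colocated.items(), key=lambda kv: kv[1]):
        alleviated = vm_load - share + needed <= CAPACITY      # condition 1
        no_new_hotspot = target_vm_load + share <= CAPACITY    # condition 2
        if alleviated and no_new_hotspot:
            return elem
    return None  # fall back to traditional scale-out

# VM1 at 0.95 load; moving Header Classifier (0.3) to a VM at 0.5 works.
print(pick_migration(0.95, {"HeaderClassifier": 0.3}, 0.2, 0.5))
```

If no colocated element satisfies both conditions, the scaler would fall back to launching a new instance, as in the traditional method.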

SLIDE 20

Implementation and Evaluation

  • Evaluation Setup

– Docker for consolidation, DPDK 16.11
– OpenNF [Sigcomm'14] and TFM [ICNP'16] as migration mechanisms
– MATLAB for solving the 0‐1 quadratic program
– Intel(R) Xeon(R) E5‐2690 v2 CPUs, 256 GB RAM, 2×10 Gbps NICs

  • Evaluation Goals

– Validate the linearity assumption
– Demonstrate the effectiveness of CoCo placement
– Demonstrate the performance of CoCo scaling

SLIDE 21
  • 1. Throughput vs. CPU Utilization
  • Measured on a single core
  • Sender

– Linearity of the fit: 0.9997

  • Classifier

– 100 rules on IP headers
– Linearity of the fit: 0.9999997

SLIDE 22
  • 2. Simulations on Placement
  • Baselines

– Random: select available VMs randomly
– Greedy: place elements in sequence, chain by chain

  • Traffic: flows randomly sampled from CAIDA traces
  • Two topologies:

[Figure: the two simulated topologies: Topo1 with two chains over elements E1–E6, and Topo2 with three chains over elements E1–E9]
SLIDE 23
  • 2. Simulations on Placement
  • Performance
  • Resource Utilization

[Figure: sum of Delayed Bytes (MB) and placement failure rate on Topo1 and Topo2 for CoCo, Greedy, and Random; CoCo reduces Delayed Bytes by up to 59% on Topo1 and 18% on Topo2, and achieves the lowest placement failure rate]

SLIDE 25
  • 3. Evaluation on Dynamic Scaling
  • Based on OpenNF [Sigcomm’14]
  • Per‐packet latency

[Figure: per‐packet latency (ms) versus packets processed for CoCo and the traditional method; as traffic increases, CoCo reduces per‐packet latency by 45%]

SLIDE 26

Conclusion

  • CoCo: Compact and Optimized Consolidation of MSFCs in NFV

– Optimized Placer – Individual Scaler

  • Significant Performance Improvement

– Up to 59% reduction in Delayed Bytes in initial placement
– 45% latency reduction during dynamic scaling

  • Future work

– Multi‐core placement – Intra‐core cache analysis

SLIDE 27

Thank you!

netarchlab.tsinghua.edu.cn mengzl15@mails.tsinghua.edu.cn