CoCo: Compact and Optimized Consolidation of Modularized Service Function Chains in NFV
Zili Meng, Jun Bi, Haiping Wang, Chen Sun, Hongxin Hu
NFV & Modularization
- Dedicated hardware devices: VPN, Firewall, Monitor, Load Balancer
- NFV: the same functions run in VMs on commodity hardware and form a service chain
– Low cost
– Flexibility
– Scalability
– ……
- Modularized SFC (MSFC): NFs are broken into elements such as Read, Classifier, Alert, Drop, and Output
However…
- Two drawbacks:
– High latency
– Poor resource efficiency
- OpenBox [Sigcomm’16]
– Element reuse
- NFVnice [Sigcomm’17]
– NF consolidation: containers in one VM (core).
Which elements to consolidate?
Key Observations
[Figure: two alternative placements of elements E1 to E7 on VM1, VM2, and VM3]
Placement affects MSFC performance by affecting inter‐VM transfers.
CoCo…
- Identifies inter‐VM transfers between elements
- Optimizes placement of elements on VMs
- Optimizes the dynamic scaling mechanism
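The observation that placement determines the number of inter‐VM transfers can be illustrated with a small sketch. The two placements below are hypothetical (the figure's actual assignments are not recoverable from the slide text), and `count_inter_vm_transfers` is a name introduced here for illustration only.

```python
def count_inter_vm_transfers(chain, placement):
    """Count hops in a chain whose two endpoints sit on different VMs."""
    return sum(
        1
        for src, dst in zip(chain, chain[1:])
        if placement[src] != placement[dst]
    )


chain = ["E1", "E2", "E3", "E4", "E5", "E6", "E7"]

# Hypothetical placement A: adjacent elements are scattered across VMs.
scattered = {"E1": "VM1", "E2": "VM2", "E3": "VM3", "E4": "VM1",
             "E5": "VM2", "E6": "VM3", "E7": "VM1"}

# Hypothetical placement B: adjacent elements are consolidated on the same VM.
consolidated = {"E1": "VM1", "E2": "VM1", "E3": "VM1", "E4": "VM2",
                "E5": "VM2", "E6": "VM3", "E7": "VM3"}

print(count_inter_vm_transfers(chain, scattered))     # 6
print(count_inter_vm_transfers(chain, consolidated))  # 2
```

The same seven elements on the same three VMs need six inter‐VM transfers in one placement and two in the other.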
Challenges
- Optimized Placement
– How to model the inter‐VM transfer?
– How to find optimal solutions efficiently?
- Optimized Dynamic Scaling
– How to reduce inter‐VM transfers during scaling out?
CoCo components: Optimized Placer, Individual Scaler
Optimized Placer
- Packet Transfer Cost:
– Four‐step transfer delay (steps ① to ④ in the figure)
– Service chain throughput: Θ
– Delayed Bytes: Θ ⋅ (transfer delay)
- Resource Analysis:
– Observation: the CPU utilization of an element is linear in its processing speed
[Figure: four‐step packet transfer (① to ④) between elements in VM #1 and VM #n, passing through VM memory, vNIC, and vSwitch; each VM runs a scheduler]
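The two quantities on the Optimized Placer slide can be sketched numerically. This is a minimal illustration, assuming that Delayed Bytes is the product of chain throughput Θ and the summed four‐step transfer delay, and that CPU utilization follows a straight line in processing speed. All numbers, units, and function names are hypothetical.

```python
def delayed_bytes(throughput_bytes_per_s, step_delays_us):
    """Delayed Bytes = throughput (Theta) x total inter-VM transfer delay.

    step_delays_us holds the delay of each of the four transfer steps,
    in microseconds (values here are made up for illustration).
    """
    total_delay_us = sum(step_delays_us)
    return throughput_bytes_per_s * total_delay_us / 1_000_000


def cpu_utilization(processing_speed, slope, intercept=0.0):
    """Linear resource model: utilization = slope * speed + intercept."""
    return slope * processing_speed + intercept


# Hypothetical chain at 1 Gbit/s (125 MB/s) with 10 us per transfer step.
print(delayed_bytes(125_000_000, [10, 10, 10, 10]))  # 5000.0

# Hypothetical element costing 0.5 % CPU per kpps, running at 100 kpps.
print(cpu_utilization(100, slope=0.5))  # 50.0
```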
Optimized Placer – 0‐1 Quadratic Programming
- Intuition: Consolidate adjacent elements together
– If we place two adjacent elements on one VM, there will be no inter‐VM packet transfer between them.
[Figure: example MSFC with Header Classifier and Stateful Payload Analyzer on VM1 and Logger and Alert on VM2; links marked as inter‐VM or intra‐VM]
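Optimized Placer – 0‐1 Quadratic Programming
- $x_{i,k} \in \{0, 1\}$: indicating element $i$ is placed onto instance $k$
- Challenge: How to express that two elements are placed together?
– Elements $i$ and $j$ are both on instance $k$ exactly when $x_{i,k} = 1$ and $x_{j,k} = 1$
– Indicator: $x_{i,k} \cdot x_{j,k} = 1$ (quadratic)
[Figure: example chain of elements 1 to 6 with the indicator evaluated for two placements]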
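The quadratic indicator can be checked on a tiny placement matrix. This is a sketch only: the name `x[i][k]` for the 0‐1 variable and the helper `placed_together` are chosen here for illustration, and the sketch assumes each element is placed on exactly one instance.

```python
def placed_together(x, i, j):
    """Return 1 if elements i and j share an instance, else 0.

    x[i][k] is 1 when element i is placed onto instance k (0-1 variable).
    The product x[i][k] * x[j][k] is 1 only when both sit on instance k,
    so summing over k yields the "placed together" indicator.
    """
    return sum(x[i][k] * x[j][k] for k in range(len(x[i])))


# Hypothetical placement of three elements on two instances.
x = [
    [1, 0],  # element 0 on instance 0
    [1, 0],  # element 1 on instance 0
    [0, 1],  # element 2 on instance 1
]

print(placed_together(x, 0, 1))  # 1: intra-VM, no transfer cost
print(placed_together(x, 1, 2))  # 0: inter-VM transfer needed
```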
Optimized Placer – 0‐1 Quadratic Programming
- Objective
– The total inter‐VM Delayed Bytes.
- Constraints
– The placement cannot lead to the overload of any instances.
- For other mathematical details, please refer to our paper.
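The objective and constraint can be made concrete with an exhaustive search over a toy instance. This is not the solver used in the talk (the evaluation slide names MATLAB); brute force is swapped in here only because it is short and easy to verify. The function name, demands, capacity, and per‐link Delayed Bytes are all hypothetical.

```python
from itertools import product


def solve_placement(chain_links, cpu_demand, capacity, num_instances):
    """Exhaustively minimise total inter-VM Delayed Bytes.

    chain_links: list of (i, j, delayed_bytes) for adjacent elements i -> j.
    cpu_demand:  CPU demand per element; capacity: limit per instance.
    Returns (best_cost, assignment) or None if no feasible placement exists.
    """
    best = None
    for assign in product(range(num_instances), repeat=len(cpu_demand)):
        # Constraint: the placement must not overload any instance.
        load = [0] * num_instances
        for elem, inst in enumerate(assign):
            load[inst] += cpu_demand[elem]
        if any(l > capacity for l in load):
            continue
        # Objective: a link costs its Delayed Bytes only when it crosses
        # instances, i.e. when the "placed together" indicator is 0.
        cost = sum(db for i, j, db in chain_links if assign[i] != assign[j])
        if best is None or cost < best[0]:
            best = (cost, assign)
    return best


# Hypothetical 4-element chain 0 -> 1 -> 2 -> 3 on two instances.
links = [(0, 1, 5), (1, 2, 1), (2, 3, 5)]
result = solve_placement(links, cpu_demand=[40, 40, 40, 40],
                         capacity=100, num_instances=2)
print(result)  # (1, (0, 0, 1, 1))
```

Each instance can hold at most two elements here, so the search cuts the chain at its cheapest link. Exhaustive search grows exponentially with the number of elements and is only meant to show what is being optimized.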
Optimized Individual Scaling
[Figure: MSFC before scaling (VM1: Header Classifier, Stateful Payload Analyzer; VM2: Logger, Alert) next to scaling with the traditional method, which starts a second Stateful Payload Analyzer on VM3. The traditional method incurs additional packet transfer and state synchronization (~100 ms according to OpenNF [Sigcomm’14]).]
Optimized Individual Scaling
- Key novelty
– Migrate the other elements consolidated on the same VM to release resources for the overloaded element.
Optimized Individual Scaling
[Figure: three panels showing the MSFC before scaling, scaling with the traditional method (a second Stateful Payload Analyzer on VM3, with state synchronization and additional packet transfer), and CoCo (elements are migrated between the existing VM1 and VM2 instead of adding a VM)]
Optimized Individual Scaling
- Consistency guarantee mechanism
– Overload should be alleviated.
– Migration will not lead to new hotspots.
- Advantages of CoCo Individual Scaler
– No new hardware resource consumed
– Additional packet transfer avoided
– State synchronization avoided
- Application scenario of CoCo Individual Scaler
– Imbalance between VMs (OFM [IWQoS’18])
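The two consistency checks can be sketched as a selection routine. This is a hedged illustration, not the scaler from the talk: the function name, the smallest‐element‐first candidate order, the capacity value, and the per‐element loads are all assumptions.

```python
def pick_migration(loads, placement, overloaded_vm, hot_element, capacity):
    """Pick a co-located element to migrate away from the overloaded VM.

    loads: element -> CPU demand; placement: element -> VM name.
    Returns (element, target_vm), or None if no migration passes both checks.
    Only VMs that already host elements are considered as targets, so no
    new hardware resource is consumed.
    """
    vm_load = {}
    for elem, vm in placement.items():
        vm_load[vm] = vm_load.get(vm, 0) + loads[elem]

    candidates = [e for e, vm in placement.items()
                  if vm == overloaded_vm and e != hot_element]
    for elem in sorted(candidates, key=lambda e: loads[e]):
        # Check 1: the overload on the source VM must be alleviated.
        if vm_load[overloaded_vm] - loads[elem] > capacity:
            continue
        for vm in sorted(vm_load):
            # Check 2: the target VM must not become a new hotspot.
            if vm != overloaded_vm and vm_load[vm] + loads[elem] <= capacity:
                return elem, vm
    return None


# Hypothetical loads (percent of one core) for the example MSFC.
loads = {"Header Classifier": 30, "Stateful Payload Analyzer": 80,
         "Logger": 20, "Alert": 20}
placement = {"Header Classifier": "VM1", "Stateful Payload Analyzer": "VM1",
             "Logger": "VM2", "Alert": "VM2"}

print(pick_migration(loads, placement, "VM1",
                     "Stateful Payload Analyzer", capacity=100))
# ('Header Classifier', 'VM2')
```

When no existing VM can absorb a co‐located element, the routine returns `None`, and a fallback such as traditional scale‐out would be needed.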
Implementation and Evaluation
- Evaluation Setup
– Docker for consolidation, DPDK version 16.11
– OpenNF [Sigcomm’14] and TFM [ICNP’16] for migration mechanisms
– MATLAB for solving the 0‐1 Quadratic Programming problem
– Intel(R) Xeon(R) E5‐2690 v2 CPUs, 256 GB RAM, 2×10G NICs
- Evaluation Goal
– Validate the assumption of linearity
– Demonstrate the effectiveness of CoCo placement
– Demonstrate the performance of CoCo scaling
1. Throughput‐CPU Utilization
- For one core only
- Sender
– Linearity (goodness of fit): 0.9997
- Classifier
– 100 rules on IP header
– Linearity (goodness of fit): 0.9999997
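A linearity check of this kind can be sketched as a least‐squares fit of CPU utilization against throughput. The slide does not say which statistic its values represent; the coefficient of determination R² is used here as one common choice, and that is an assumption. The measurements are made up.

```python
def linear_fit_r2(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical samples: throughput in kpps, CPU utilization in percent.
throughput = [100, 200, 300, 400, 500]
cpu = [11, 19, 31, 39, 50]

slope, intercept, r2 = linear_fit_r2(throughput, cpu)
print(f"slope={slope:.3f} intercept={intercept:.2f} R^2={r2:.4f}")
```

An R² close to 1 supports treating an element's CPU demand as a linear function of its processing speed.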
2. Simulations on Placement
- Evaluation Targets
– Random: select available VMs randomly
– Greedy: place elements in sequence chain‐by‐chain
- Traffic: Randomly pick flows from CAIDA traffic
- Two topologies:
[Figure: Topo1 with elements E1 to E6 and Chains 1 and 2; Topo2 with elements E1 to E9 and Chains 1 to 3]
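The two comparison strategies can be sketched as follows. The slide gives only one line per strategy, so the details are assumptions: Greedy here fills the current VM and moves to the next when it is full, Random picks uniformly among VMs with enough spare capacity, and a strategy reports failure by returning `None`. Chains, demands, and capacity are hypothetical.

```python
import random


def place_greedy(chains, cpu_demand, capacity, num_vms):
    """Walk chains in order; fill the current VM, move on when it is full."""
    placement, load, vm = {}, [0] * num_vms, 0
    for chain in chains:
        for elem in chain:
            if elem in placement:
                continue  # element shared with an earlier chain
            while vm < num_vms and load[vm] + cpu_demand[elem] > capacity:
                vm += 1
            if vm == num_vms:
                return None  # placement failure
            placement[elem] = vm
            load[vm] += cpu_demand[elem]
    return placement


def place_random(chains, cpu_demand, capacity, num_vms, rng):
    """Place each element on a randomly chosen VM that still has room."""
    placement, load = {}, [0] * num_vms
    for chain in chains:
        for elem in chain:
            if elem in placement:
                continue
            available = [v for v in range(num_vms)
                         if load[v] + cpu_demand[elem] <= capacity]
            if not available:
                return None  # placement failure
            v = rng.choice(available)
            placement[elem] = v
            load[v] += cpu_demand[elem]
    return placement


# Hypothetical setup: two 3-element chains, 30 % CPU each, 3 VMs.
chains = [["E1", "E2", "E3"], ["E4", "E5", "E6"]]
demand = {e: 30 for chain in chains for e in chain}

print(place_greedy(chains, demand, capacity=100, num_vms=3))
print(place_random(chains, demand, 100, 3, random.Random(0)))
```

Greedy keeps neighbours together as a side effect of sequential filling, while Random ignores adjacency entirely.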
2. Simulations on Placement
- Performance
– [Figure: Sum of DB (MB) on Topo1 and Topo2 for CoCo, Greedy, and Random, annotated with 59% and 18%]
- Resource Utilization
– [Figure: Placement Failure Rate on Topo1 and Topo2 for CoCo, Greedy, and Random]
3. Evaluation on Dynamic Scaling
- Based on OpenNF [Sigcomm’14]
- Per‐packet latency
[Figure: Latency (ms) against Packet # (kilo) for CoCo and Traditional scaling, annotated "traffic increases" and "by 45%"; insets show the CoCo and traditional placements of Header Classifier, Stateful Payload Analyzer, Logger, and Alert]
Conclusion
- CoCo: Compact and Optimized Consolidation of MSFCs in NFV
– Optimized Placer
– Individual Scaler
- Significant Performance Improvement
– Up to 59% Delayed Bytes reduction in initial placement
– 45% latency reduction during dynamic scaling
- Future work
– Multi‐core placement
– Intra‐core cache analysis